Difference between revisions of "GEPS 023: Storing data from large sources"

From Gramps
m (Bug tracker)
(Update to simplify interface by distinguishing Citation view and Source view, removal of deduction content (which should be a separate GEPS) and revisions to reflect more details of design.)
Line 77: Line 77:
 
A new object which we call here a Citation is introduced.
 
A new object which we call here a Citation is introduced.
  
A CitationRef would replace the current SourceRef. The CitationRef would not have any associated fields (just the reference to the Citation).
+
A CitationBase would replace the current SourceRef. The CitationBase would not have any associated fields (just the reference to the Citation). The CitationBase plus the Citation would be equivalent to the SourceRef.
  
 
The Source object is unchanged.
 
The Source object is unchanged.
Line 97: Line 97:
 
[[File:Source citation changes.jpg|500px]]
 
[[File:Source citation changes.jpg|500px]]
  
Note that there is no need for a CitationRef editor, because this does not have any associated properties.
+
Note that there is no need for a CitationBase editor, because this does not have any associated properties.
  
 
Recording the information about the people would proceed exactly as at present. The first source to be recorded would display an empty Citation editor, which would be completed in the normal way. Subsequent data would be linked to the same Citation object.
 
Recording the information about the people would proceed exactly as at present. The first source to be recorded would display an empty Citation editor, which would be completed in the normal way. Subsequent data would be linked to the same Citation object.
Line 202: Line 202:
 
== Design ==
 
== Design ==
  
The design is based on the first of the 'Other changes' not being carried out at this time, because the approach to providing the information for GEPS 018: Evidence style sources has not yet been decided.
+
The design is based on the 'Other changes' not being carried out at this time.
  
The design does apply the second of the 'Other changes' (i.e. deduction content). However, perhaps this should be a separate change.
+
The first is not applied because the approach to providing the information for GEPS 018: Evidence style sources has not yet been decided.
 +
 
 +
The second is not applied because it would be better as the subject of a separate GEPS.
 +
 
 +
The design is intended to have minimal change from the existing user interface. Aunt Martha should be able to continue to use Gramps just as at present. Only if a user wants to take advantage of the ability to share Citations should the user need to be aware of the changes.
  
 
=== Database changes ===
 
=== Database changes ===
  
A Global Confidence field is added to the Source object:
+
The Source PrimaryObject is unchanged.
 
 
Source
 
  ...no changes except add...
 
  1 Global Confidence
 
  
A new Citation object has the following content:
+
A new Citation PrimaryObject has the following content:
  
 
  Citation
 
  Citation
   1 SourceRef --> Source
+
   1 RefBase --> Source
 
   1 Gramps Id
 
   1 Gramps Id
 
   1 Confidence (5 values)
 
   1 Confidence (5 values)
Line 227: Line 227:
 
   1 Private
 
   1 Private
  
This should be a full primary object so that it has all the properties of a primary object.  
+
A Citation object '''always''' refers to one and only one Source object. Therefore when creating a new Citation object, one first chooses a Source object to refer to. When deleting a Source object, all Citations that refer to that Source object must first be deleted, and references to those Citation objects must be deleted before the Citation objects themselves.
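The deletion order described above can be sketched as follows. This is only an illustration using plain dictionaries; the function name `delete_source` and the three dictionaries are assumptions made for the example and are not the Gramps database API.

```python
from typing import Dict, List


def delete_source(sources: Dict[str, str],
                  citations: Dict[str, str],
                  referrers: Dict[str, List[str]],
                  source_handle: str) -> None:
    """Delete a Source and everything that depends on it, innermost first.

    sources:   source handle -> title (hypothetical stand-in for Source objects)
    citations: citation handle -> handle of the one Source it refers to
    referrers: object handle -> list of citation handles (its CitationBase)
    """
    doomed = [c for c, s in citations.items() if s == source_handle]
    # 1. Drop references to the doomed Citations from every referring object.
    for citation_list in referrers.values():
        citation_list[:] = [c for c in citation_list if c not in doomed]
    # 2. Delete the Citations themselves.
    for c in doomed:
        del citations[c]
    # 3. Only now is the Source itself deleted.
    del sources[source_handle]


# Example data (invented for illustration).
sources = {"S1": "Book", "S2": "Census"}
citations = {"C1": "S1", "C2": "S1", "C3": "S2"}
referrers = {"E1": ["C1", "C3"], "P1": ["C2"]}
delete_source(sources, citations, referrers, "S1")
```

After the call, only the Census source, its Citation and the reference from E1 to that Citation remain.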
 
 
A new CitationRef object replaces existing SourceRef objects. The object has the following content:
 
  
CitationRef
+
A new CitationBase object is simply a list of references to Citations, with no attributes. This object is analogous to the NoteBase and TagBase objects.
  1 Type: Transcript or Deduction
 
  1 Deduction Confidence (5 values)
 
  1 Argument (one line string)
 
  n Note
 
  
 
The proxies will also need to be updated.
 
The proxies will also need to be updated.
Line 241: Line 235:
 
=== Upgrade ===
 
=== Upgrade ===
  
When upgrading from an old database version all objects that have a SourceRef need to be changed to the new format.  The primary objects Person, Family, Event, Media and Place contain SourceRef objects.  These also contain secondary objects which have SourceRef objects.
+
When upgrading from an old database version all objects that have a SourceRef need to be changed to the new format.  The primary objects Person, Family, Event, Media and Place contain SourceRef objects.  These Primary objects also contain secondary objects which have SourceRef objects.
  
 
  Person
 
  Person
Line 271: Line 265:
 
   Attribute
 
   Attribute
  
Each old SourceRef object should be used to create a new Citation record.  The old SourceRef will be replaced by a new CitationRef.
+
Each old SourceRef object should be used to create a new Citation record.  The old SourceRef will be replaced by a new CitationBase.
  
 
Because upgrading an old database version is automatic, the program should not prejudge the user's intention for similar SourceRefs. Therefore, SourceRefs should only be merged if they have the same Volume/Page, Date, Confidence and source and all Notes refer to shared copies of the same Notes (this will be the case where a SourceRef had been created by dragging and dropping on the Clipboard). It would be convenient if there were a separate Gramplet to automatically merge Citation objects on less stringent and probably configurable criteria. Such a Gramplet would need to merge Notes into the merged Citation.
 
Because upgrading an old database version is automatic, the program should not prejudge the user's intention for similar SourceRefs. Therefore, SourceRefs should only be merged if they have the same Volume/Page, Date, Confidence and source and all Notes refer to shared copies of the same Notes (this will be the case where a SourceRef had been created by dragging and dropping on the Clipboard). It would be convenient if there were a separate Gramplet to automatically merge Citation objects on less stringent and probably configurable criteria. Such a Gramplet would need to merge Notes into the merged Citation.
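The strict merge rule for the upgrade can be sketched as a grouping key. This is a minimal sketch, assuming each old SourceRef can be reduced to the five properties named above; `OldSourceRef`, `merge_key`, `upgrade` and the generated handle format are hypothetical names for illustration, not existing Gramps code.

```python
from typing import Dict, List, NamedTuple, Tuple


class OldSourceRef(NamedTuple):
    """Hypothetical flattened view of an old SourceRef."""
    source_handle: str
    page: str                      # Volume/Page
    date: str
    confidence: int
    note_handles: Tuple[str, ...]  # handles of the attached Notes


def merge_key(ref: OldSourceRef):
    # Two SourceRefs merge only if every one of these matches, including
    # the identity (handles) of the shared Notes, regardless of their order.
    return (ref.source_handle, ref.page, ref.date, ref.confidence,
            frozenset(ref.note_handles))


def upgrade(sourcerefs: List[OldSourceRef]):
    """Return (citations by handle, citation handle assigned to each input ref)."""
    key_to_handle: Dict[tuple, str] = {}
    citations: Dict[str, OldSourceRef] = {}
    assigned: List[str] = []
    for ref in sourcerefs:
        key = merge_key(ref)
        if key not in key_to_handle:
            handle = "C%04d" % (len(citations) + 1)  # invented handle format
            key_to_handle[key] = handle
            citations[handle] = ref
        assigned.append(key_to_handle[key])
    return citations, assigned


refs = [
    OldSourceRef("S1", "page 7", "1990", 2, ("N1",)),
    OldSourceRef("S1", "page 7", "1990", 2, ("N1",)),    # clipboard copy: merged
    OldSourceRef("S1", "page 212", "1990", 2, ("N1",)),  # different page
    OldSourceRef("S1", "page 7", "1990", 2, ("N2",)),    # different Note
]
citations, assigned = upgrade(refs)
```

A Gramplet with looser, configurable criteria would amount to using a different `merge_key`, plus a step that combines the Notes of the merged references.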
Line 416: Line 410:
 
=== User Interface changes ===
 
=== User Interface changes ===
  
==== Source model ====
+
==== Source model and view ====
 +
 
 +
The existing source model and the source view in the navigator are retained unchanged. This minimises the user interface changes. Source objects can be created, edited and deleted as at present.
 +
 
 +
==== Citation model ====
  
Change the existing source model to introduce a SourceTreeModel. Change the model so that it encompasses both Source objects and Citation objects. The tree model should have lines for Sources with a disclosure triangle, and subsidiary lines for Citations. Add new fields Title/Page (actually Source:Title or Citation:Volume/Page) and Citation:Volume/Page, retaining Source:Title.  Id is changed to "Source:id or Citation:Id". Source only fields are blank for Citations, and Citation only fields are blank for Sources. Title/Page displays either Title or Volume/Page (alternative of displaying Title:Volume/Page rejected). Change ID to be either Source Id or Citation Id depending on the record. Change the model so that either a Source or a Citation can be selected, and the returned value indicates which.
+
New CitationTreeModel and CitationListModel are introduced. The models encompass both Source objects and Citation objects. The tree model should have lines for Sources with a disclosure triangle, and subsidiary lines for Citations. Fields for Source should include Title, Author, ID, Abbreviation, Publication information and date last changed. Fields for Citation should include Volume/Page, ID, Date, Confidence and date last changed. The models should allow either a Source or a Citation to be selected, and the returned value should indicate which.
  
 
This arrangement would allow sorting by Volume/Page (in case this were useful for some users - e.g. if Title is Birth certificate or Marriage certificate, and Volume/Page is the name of the individual).
 
This arrangement would allow sorting by Volume/Page (in case this were useful for some users - e.g. if Title is Birth certificate or Marriage certificate, and Volume/Page is the name of the individual).
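As a rough illustration of the tree arrangement, the sketch below builds Source rows with subsidiary Citation rows sorted by Volume/Page, and returns the kind of object that was selected. It uses plain Python structures rather than a GTK tree model; `build_tree`, `select` and the row layout are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, str]  # (kind, handle, display text)


def build_tree(sources: Dict[str, str],
               citations: Dict[str, Tuple[str, str]]
               ) -> List[Tuple[Row, List[Row]]]:
    """sources: handle -> title; citations: handle -> (source handle, Volume/Page)."""
    tree = []
    for s_handle, title in sorted(sources.items(), key=lambda kv: kv[1]):
        # Subsidiary lines: the Citations of this Source, sorted by Volume/Page.
        children = sorted(
            (("citation", c_handle, page)
             for c_handle, (src, page) in citations.items() if src == s_handle),
            key=lambda row: row[2])
        tree.append((("source", s_handle, title), children))
    return tree


def select(tree, path: Tuple[int, ...]) -> Tuple[str, str]:
    """path is (source index,) or (source index, citation index).

    The returned pair says which kind of object was picked, and its handle.
    """
    parent, children = tree[path[0]]
    kind, handle, _ = parent if len(path) == 1 else children[path[1]]
    return kind, handle


sources = {"S1": "Marriage certificate", "S2": "Birth certificate"}
citations = {
    "C1": ("S2", "Smith, John"),
    "C2": ("S2", "Brown, Ann"),
    "C3": ("S1", "Jones, Mary"),
}
tree = build_tree(sources, citations)
```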
  
Should the view show all Sources and on separate lines all Citations? Or should some views just show one or the other. Suggest that all views should show all Sources and on separate lines all Citations.
+
==== Citation Selector ====
 
 
==== Source Selector ====
 
  
This uses the modified source model.
+
This uses the new CitationTreeModel.
  
 
[[File:select source.jpg|200px]]
 
[[File:select source.jpg|200px]]
  
The display for the source selector comprises Title/Page and Id, using the SourceTreeModel.
+
The display for the source selector comprises Title/Page and Id.
  
==== Source View ====
+
==== Citation View ====
  
This uses the modified source model.
+
This uses the new citation models.
  
Add buttons to select Source view or Source Tree view. The default display for either source view comprises Title/Page, Id, Author, Publication information.  
+
Add buttons to select Source view or Source Tree view. The default display for either source view comprises Title/Page, Id, Date and Confidence.  
  
 
There are several possible objectives to add/edit:
 
There are several possible objectives to add/edit:
# Edit a Citation (select Citation, click Edit, - allows Citation and Source to be changed),
+
# Edit a Citation (Citation view: select Citation, click Edit, - allows Citation and Source to be changed),
# Edit a Source (select Source, click Edit - only allows Source to be changed),
+
# Edit a Source (Source view: select Source, click Edit - only allows Source to be changed),
# Add a new Source (selection is irrelevant, click Add - allows Citation and Source to be added - just input a source),
+
# Add a new Source (Source view: click Add - allows Source to be added),
# Add a new Citation to an existing Source (select the source, click Add new Citation - allows new Citation, and allows the Source to be changed).  
+
# Add a new Citation to an existing Source (Citation view: click Add; select the source - allows new Citation, and allows the Source to be changed).  
 +
# Add a new citation to a new source (Source view: add the Source; then Citation view: add the Citation)
  
Add button for "Add a new citation" (Suggest using GTK_STOCK_INDENT)
+
In the Citation view:
 
+
*If the "Add a new Citation" button is clicked: Bring up the Source selector. When a Source is chosen, bring up the Source-Citation editor with the specified Source populated. On clicking OK, store the new Citation and if changed, update the Source object.
If the "Add a new source" button is clicked: Bring up the Source-Citation editor. On clicking OK, if just Source information is entered, then just store a new Source object. Otherwise store both a Citation and a Source object.
+
*If the "Edit" button is clicked, and a Citation is highlighted, the Source-Citation editor should allow both the Source and Citation to be changed.
If the "Add a new Citation" button is clicked: Bring up the Source-Citation editor with the specified Source populated. On clicking OK, if just Source information is entered, then just update the Source object. Otherwise store the new Citation and if changed, update the Source object.
+
*If the "Edit" button is clicked, and a Source is highlighted: Do nothing
 
+
*If the "Remove" button is clicked, then the highlighted object and all objects that reference it should be removed.
Would it be better to avoid adding an extra button (with extra complication for the user) and just add a Citation if a Source row is highlighted and add a Source and Citation if not?
 
 
 
If the "Edit" button is clicked, and a Source is highlighted: Bring up the current Source editor. (i.e. this does not allow a Citation to be input or amended).
 
If the "Edit" button is clicked, and a Citation is highlighted, the editor should allow both the Source and Citation to be changed.
 
 
 
If the "Remove" button is clicked, then the highlighted object and all objects that reference it should be removed.
 
  
 
==== Editor Source tabs ====
 
==== Editor Source tabs ====
  
Should these be hierarchical or flat? Given the likely relatively small number of CitationRefs in a given editor, this should probably remain flat.
+
CitationEmbedList replaces all occurrences of SourceEmbedList. The default fields displayed and the buttons etc. remain unchanged, except that the ID field contains the Citation ID rather than the Source ID.
  
The possible scenarios are different here, because "add a new Source" does not apply, because a source is always linked to the current primary or secondary object through a Citation, so a Citation is always created. There is therefore no need for an additional button.
+
The same Source-Citation editor is used as for editing a Citation.
  
Because the CitationRef has data, the editor will need three areas, as follows:
+
If the "Create and add a new citation" button is clicked, then allow the creation of both Source and Citation objects. This is consistent with the current model, where "Create and add a new source" adds a new Source object and creates the current sourceref. It is distinct from the "Add an existing source" which allows either an existing Source or an existing Citation and associated Source to be selected. Bring up the Source-Citation editor with all fields empty. On clicking OK, save both a new Source and a new Citation. Error if the source is not entered. Link the CitationBase to the new Citation.
  
[[File:Combined source editor.jpg|center|200px]]
+
If the "Add an existing source" (Share) button is clicked, the citation selector is displayed. The user will either select a Source or a Citation. Pre-populate the editor according to what has been selected. On clicking OK, save either a new or an updated Citation; save the Source if it was updated (actually, the Source seems to be re-saved even if it was not changed, which may affect the last changed date incorrectly). Link the CitationBase to the new Citation.
 
 
If the "Create and add a new source" button is clicked, then allow the creation of both Source and Citation objects. This is consistent with the current model, where "Create and add a new source" adds a new Source object and create the current sourceref. It is distinct from the "Add an existing source" which allows either an existing Source or an existing Citation and associated Source to be selected. Bring up the Source-Citation editor with all fields empty. On clicking OK, save both a new Source and a new Citation. Link the CitationRef to the new Citation.
 
 
 
If the "Add an existing source" (Share) button is clicked, the user will either have selected a Source or a Citation. Pre-populate the editor according to what has been selected. On clicking OK, save either a new or an updated Citation; save the Source if it was updated (actually, the Source seems to be re-saved even if it was not changed, which may affect the last changed date incorrectly). Link the CitationRef to the new Citation.
 
 
   
 
   
If the "Remove" button is clicked, then the highlighted Citation should be removed.
+
If the "Remove" button is clicked, then the highlighted Citation should be removed, together with the links to the citation. The Source is not removed.
  
 
==== Editors ====
 
==== Editors ====
  
There are three separate 'source' editors:
+
There are two separate 'citation/source' editors:
  
 
{|cellspacing="0" border="1" align="center" width="95%"
 
{|cellspacing="0" border="1" align="center" width="95%"
Line 480: Line 467:
 
|style="padding: 0.3em;" |[[Image:Add_source_1.png‎|center|thumb|200px]] editsource
 
|style="padding: 0.3em;" |[[Image:Add_source_1.png‎|center|thumb|200px]] editsource
 
|style="padding: 0.3em;" |[[Image:Source citation changes.jpg|center|thumb|200px]] editcitation. As used with the Citation reference having data, then this would not have the warning signs in the top half.
 
|style="padding: 0.3em;" |[[Image:Source citation changes.jpg|center|thumb|200px]] editcitation. As used with the Citation reference having data, then this would not have the warning signs in the top half.
|style="padding: 0.3em;" |[[Image:Combined source editor.jpg|center|thumb|200px]] editcitationref
 
 
|}
 
|}
  
editsource is used from the Source view when selecting a source and clicking the "Edit" button. Existing editsource is unchanged (adding a new source is not done through this interface, so the code that deals with that case can be removed).
+
editsource is used from
 +
* Source view when selecting a source and clicking the "Edit" button,
 +
* Source view when clicking the "Add" button.
 +
Existing editsource is unchanged.
  
 
editcitation is used from
 
editcitation is used from
* Source view when selecting a Citation and clicking the "Edit" button,
+
* Citation view when selecting a Citation and clicking the "Edit" button,
* Source view when clicking the "Add a new source" button (irrespective of the selection),
+
* Citation view when clicking the "Add a new citation" button (following the display of the source selector, and selection of a source),
* Source view when selecting a source and clicking the "Add a new Citation".
+
* Editor Source tabs (CitationEmbedList) when clicking the 'Create and add a new citation' button.
 +
* Editor Source tabs (CitationEmbedList) when clicking the 'Edit the selected citation' button.
 +
* Editor Source tabs (CitationEmbedList) when clicking the 'Add an existing citation' button.  
  
 
This is similar to the current editsourceref. On clicking OK,
 
This is similar to the current editsourceref. On clicking OK,
Line 494: Line 485:
 
* if nothing was passed in, if the Citation is blank store the new Source else store a new Citation and Source,
 
* if nothing was passed in, if the Citation is blank store the new Source else store a new Citation and Source,
 
* if a Source is passed in, add a new Citation and update the Source.
 
* if a Source is passed in, add a new Citation and update the Source.
 
editcitationref is used from the editor source tabs when clicking the "Add an existing source" or the "Create and add a new source".
 
 
On clicking OK,
 
* if a Citation was passed in, update the source and Citation and link the CitationRef to the Citation,
 
* if a Source is passed in, store a new Citation (even if the fields are empty), update the Source and link the CitationRef to the Citation,
 
* if nothing is passed in, if the Source is blank, error, else, store a new Citation (even if the fields are empty), store a new Source and link the CitationRef to the Citation
 
  
 
==== Rules ====
 
==== Rules ====

Revision as of 14:48, 22 July 2011

Proposed changes for enhancing GRAMPS by enhancing the mechanism for storing data from ‘large’ sources.

See SVN and tarball.

User story (Problem that needs to be solved)

I have a book that details, on page 7:

“In the 1870s B moved to the town of BT. It was here that I's father K was born in 1860. By the time he was 30 he had married. His first child M was born there. Shortly afterwards his wife died and two years later he married G. M was 12 before her brother I appeared.”

So I wish to record B, K, and the fact that K was born in 1860, and married around 1890. K's children were M and I, M was born around 1890 and I was born around 1900. [Actually, from other sources he was born on 5 Dec 1902.] I need to record page 7 of this book as the source for all these pieces of information.

Some time later I decide I should record a transcript of the source text.

Some time later, I decide to scan that page of the book, and need to store the scan as the source.

Later still, I discover that page 212 of the same book details that I married W in 1946.

Now I wish to record W, and the marriage of W and I in 1946. The source for all this is page 212 of the book, and this time I record the scan against the source.

List various solutions

  • Record each page as a source reference with the book as a single source.
  • Record each page as a separate source (i.e. page 7 as one source and page 212 as a second source).
  • Modify Gramps to introduce Source Content that can be shared and can have media attachments and record each page as a Source Content.

Record each page as a source reference

The page number and the text from the page are stored in the Source Reference, while the Title and author of the book are stored in the shared Source.

This may be considered the natural approach, given the fact that the Source Reference headings include the page number, while the shared Source includes the Title and the Author etc.

This approach may be illustrated as follows (only some of the people and facts are shown here):

Separate source ref.gif

The problems with this approach are

  • The Source Reference does not allow the Media scan to be stored.
  • The Source Reference is not shared, there is a separate instance for each place where it occurs (e.g. each event).

The media scan can be stored with the source, as shown in the figure above. However, when one looks at the source, it will not be clear which media objects in the gallery relate to which source reference, except by some naming convention (the applicable media file may be obvious in this case, but not in others). Also, in the source reference editor, one cannot immediately see which scan relates to this particular page.

The fact that the source reference is not shared (and is not a separate object) means that any updates to the source reference will have to be done for each occurrence, rather than once. For example, when I want to add the transcript of the paragraph to the source reference, I need to find each source reference and update it individually. It is also difficult to find all the source references, because they are not listed in a separate tab.

Note that there is an argument that separate source references for the different events are preferable, because the exact text that relates to that particular event can be attached. For example, for the birth event for person K, one could attach: “…the town of BT. It was here that I's father K was born in 1860…”. There are two objections to this:

  • It is difficult to identify exactly which parts of the text are relevant to each event. Should I’s father be included in the source for K’s birth?
  • It is far too tedious and laborious to devise separate source texts for each event. Given that the original paragraph giving family history information (this is a genuine example) is quite short, it is much quicker and easier to include the whole paragraph in each reference.

When it comes to adding the scan, the only option I really have is to add it to the source itself, despite the fact that the scan only relates to one page.

An alternative way to record the media is that suggested in the tutorial “Recording UK Census data”. Here the media object has a source reference which contains the details of the Date; Volume/page and Confidence that the media is associated with. This has the advantage that the details are unambiguously associated with the media. However, the relevant media cannot be found simply from the events.

Separate source ref media.jpg

Record each page as a source

The page number and the text from the page, the book Title and the author of the book are all stored in the Source. The Source Reference does not hold any particular information.

This arrangement can be illustrated as here (again only some of the people and facts are shown here):

Separate source.gif

This arrangement has the advantage that the transcript and media scan can be directly attached to the relevant page.

However it has a number of problems:

  • the Volume/Page field in the source reference is not used for its standard purpose
  • the information about the book itself is duplicated many times in each source object
  • the approach rapidly becomes unmanageable if there are a large number of pages to be recorded for a single source (i.e. if the source is 'large')

Modify Gramps to introduce a Citation object that can be shared and can have media attachments and record each page as a Citation

A new object which we call here a Citation is introduced.

A CitationBase would replace the current SourceRef. The CitationBase would not have any associated fields (just the reference to the Citation). The CitationBase plus the Citation would be equivalent to the SourceRef.

The Source object is unchanged.
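The relationship between the three objects could be sketched roughly as below. The class and field names are chosen for illustration only (they are assumptions, not the actual Gramps classes), and the default confidence value is arbitrary. The point is that the referring object holds nothing but a list of Citation handles, so a transcript or scan added to the Citation is seen from every event that shares it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    """Stands in for the existing, unchanged Source object."""
    handle: str
    title: str
    author: str = ""


@dataclass
class Citation:
    """Shared object holding what used to live in each SourceRef."""
    handle: str
    source_handle: str           # exactly one Source per Citation
    page: str = ""               # Volume/Page
    confidence: int = 2          # one of 5 values; this default is an assumption
    note_handles: List[str] = field(default_factory=list)
    media_handles: List[str] = field(default_factory=list)


@dataclass
class CitationBase:
    """A bare list of Citation handles, with no other properties."""
    citation_list: List[str] = field(default_factory=list)

    def add_citation(self, handle: str) -> None:
        if handle not in self.citation_list:
            self.citation_list.append(handle)


@dataclass
class Event(CitationBase):
    """Hypothetical referring object for the example."""
    description: str = ""


book = Source("S1", "Family history book")
page7 = Citation("C1", book.handle, page="page 7")
birth_k = Event(description="Birth of K, 1860")
marriage_k = Event(description="Marriage of K, about 1890")
for event in (birth_k, marriage_k):
    event.add_citation(page7.handle)

# The transcript Note and the scan are attached once, to the shared Citation.
page7.note_handles.append("N1")
page7.media_handles.append("M1")
```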

Newproposal.jpg

The existing icon options in the Source tabs would be unchanged, but "Add an existing source" would bring up a Source-Citation treeview.

Add source citation.jpg

The treeview would be as follows:

Select source.jpg

The user would be able to select a Source as shown by the highlighting. In this case, the subsequent Citation dialogue would not be populated in the Citation area, only in the Source area. On the other hand, if a Citation were to be selected, then the subsequent Citation dialogue would be opened already populated with the shared Citation data.

The source reference editor would be changed to a Source-Citation editor as follows:

Source citation changes.jpg

Note that there is no need for a CitationBase editor, because this does not have any associated properties.

Recording the information about the people would proceed exactly as at present. The first source to be recorded would display an empty Citation editor, which would be completed in the normal way. Subsequent data would be linked to the same Citation object.

When it comes to recording the transcript of the source text, there would only be a single Citation object to which the transcript would be added as a Note.

The scan of the page could be added as a Gallery item in the Citation object.

A new display category would be needed for the Citation, and it would be important to implement the merge facility for this category so that existing separate Citations could be merged.

There is no particular need to provide a means to make an existing Citation object refer to a different Source object. At present, it is not possible to change a source reference so as to preserve the information that has been input (such as date, Volume/Page, confidence or the links to notes) but to make it point to a different source. The source reference must be deleted and a new one created. Similarly, if one wanted to change a Citation to refer to a different source, the Citation would have to be deleted.

Discussion

Either of the two ways of using the current Gramps features has difficulties. This is shown not only by the points made in the descriptions above, but also by the fact that there have been many discussions in the lists about how to use the features.

The new approach makes it simple to implement the given user story. The story focusses on recording information from a book, because that is a scenario which everyone can understand and relate to. However, it also applies equally to other scenarios, like recording census data.

The new approach is a very simple extension to the current features from the user's point of view. Gramps can be used exactly as at present with no additional inputs required from the user (the workflow is completely unchanged). If a particular Citation is to be shared, then again there is no need for any additional inputs, just use of the feature to add an existing Citation, instead of creating a new one. Apart from the Citation category display there are no additional screens.

The new approach is somewhat similar to the subsource approach in http://gramps.1791082.n4.nabble.com/sources-subsources-and-sourceref-td1794804.html. Subsource corresponds with Citation. However, in order to avoid any extra complexity in the approach, it is proposed that the notes are kept to the 'subsource'/Citation only, and not to the reference to them.

Using subsources can remove the above problems, as a subsource would be a nucleus information set: one line in a census, one marriage act, ...., and that connected to a source.

Why not use the sourceref?

The sourceref cannot be used for this! The text of an entire source, or an image of an entire source is something you need to share between objects, so it must be a source, not a sourceref. The note in the sourceref should only be used to explain how the information of the source was used for the object.

However, it is true that a subsource can have a date/page/volume/number within the larger source. Note that in the census example this is repeated on *all* sourceref objects. In the case of a subsource, it would be entered once in the subsource, and then not be repeated in the sourceref. I don't care too much for this as in the case where subsource is useful, there is no problem of mentioning this in the note, .... section.

Is sourceref useless then?

I don't think this. For sources of which little information is learnt, or real books (bibliography, diary, ...) the sourceref is ideal. However, for sources which contain many unrelated short subparts that each can lead to large amount of changes throughout your genealogical database, the sourceref is less useful, and a subsource is in order.

Note that adding a hierarchy of subsource objects would introduce much greater complexity, which is not warranted by the problems outlined.

Also, http://gramps.1791082.n4.nabble.com/Source-references-names-and-notes-td1813805.html#a1813818

The problem is still there when you discover a mistake in the source ref (page number), or want to add a note to these source references (eg transcript of the original latin text). Then you need to track down all source ref that were copies, and change them all. This is why in my workflow I keep most notes in the source object, where the media files (scans) also live. This is why I think about a subsource implementation, it does not hurt the people who want to keep working as they do now. The main thing holding such a thing back is GEDCOM though. The more we deviate of that in GRAMPS, the more difficult to map our internal data to something that can eg be uploaded to websites logically as you want it. It is really hard to keep having to drag an old and dead standard with us :-(

The proposed approach is directly compatible with GEDCOM; variants like hierarchical sources would not be.

References

There have been a large number of posts about this topic on the Gramps mailing lists. A selection is shown below.

  1. database issues sourceref references (2005) among other things suggests "We want to allow in the future that the sourceref can have a media coupled to it"
  2. sources subsources and sourceref (2007) proposes subsources between sources and objects
  3. GEDCOM and sourceref (2007) too much information in sourceref/citation or problem summarised as need to manually edit every independent source reference to change a page number
  4. Sources and sourceref (2007) second proposal is for "sourceref is made a primary object that an object can share, but is unique to a source source 1 -----> n sourceref n <----> n object (person, attribute, event, ...)"
  5. Media and attributes in data model for Gramps 3.0 (2007)
  6. local gallery tab in source reference (2007)
  7. nested sources - sourceref - one big source (2007)
  8. sources vs. repositories (2007) "Some genealogy apps allow to subdivide a source in pieces, eg divide a book in chapters. GRAMPS does _not_ have this at the moment"
  9. Sources, media and galleries: how to tie it all up "as intended" (2008)
  10. Medias, sources and sourcerefs (2009) concerned with relationship between media and sourceref
  11. Source references names and notes (2009) "have to stop using my sourceref approach and just put everything as a source, which I find clumsy and lacking in elegance. Since sourcerefs can't be shared, only copied, it is extremely difficult to keep track of them."
  12. Storing data from large sources (2010)
  13. Sharing sources (2010) actually he is talking about sharing source references (i.e. similar to the current proposal but with different terminology)

Other changes

Some other related changes have been suggested here:

  1. Use the Data key-value pairs field to store Publication data in the Source, and split the Volume/Page in the Citation, and add a couple of extra fields. This seems to be related to changing the data stored in a Citation as part of GEPS 018: Evidence style sources. It may be better to wait till GEPS 018 is resolved.
  2. Adding deduction content to the CitationRef, namely a type, confidence argument and set of notes, and a global confidence field to the Source. This is related to the BetterGEDCOM and methodology proposals.


Design

The design is based on the 'Other changes' not being carried out at this time.

The first is not applied because the approach to providing the information for GEPS 018: Evidence style sources has not yet been decided.

The second is not applied because it would be better as the subject of a separate GEPS.

The design is intended to have minimal change from the existing user interface. Aunt Martha should be able to continue to use Gramps just as at present. Only if a user wants to take advantage of the ability to share Citations should the user need to be aware of the changes.

Database changes

The Source PrimaryObject is unchanged.

A new Citation PrimaryObject has the following content:

Citation
  1 RefBase  --> Source
  1 Gramps Id
  1 Confidence (5 values)
  1 Volume/Page
  1 Log Date (The date that this data was entered into the original source document)
  n Information (key-value pairs, current Data)
  n NoteIds
  n MediaRef (Region, Src, attr, notes)  --> Media
  1 Private

A Citation object always refers to one and only one Source object. Therefore, when creating a new Citation object, one first chooses a Source object to refer to. When deleting a Source object, all Citations that refer to that Source must be deleted first, and for each of those Citations, all references to it must be deleted before the Citation object itself.

A new CitationBase object is simply a list of references to Citations, with no attributes. This object is analogous to the NoteBase and TagBase objects.
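
As a rough illustration of the data model and the deletion order described above, the following is a minimal Python sketch. It is not the Gramps implementation: the class layouts, the field names, the 0..4 confidence range and the remove_source helper are all assumptions made for the example, and MediaRef is reduced to a plain list of handles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Citation:
    """Hypothetical stand-in for the proposed Citation primary object."""
    handle: str
    source_handle: str                 # RefBase: exactly one Source
    gramps_id: str = ""
    confidence: int = 2                # 5 values; the 0..4 range is an assumption
    page: str = ""                     # Volume/Page
    date: Optional[str] = None         # Log Date
    data: Dict[str, str] = field(default_factory=dict)    # key-value pairs
    note_list: List[str] = field(default_factory=list)    # NoteIds
    media_list: List[str] = field(default_factory=list)   # MediaRefs, simplified
    private: bool = False


@dataclass
class CitationBase:
    """A bare list of Citation handles with no attributes of its own."""
    citation_list: List[str] = field(default_factory=list)

    def add_citation(self, handle: str) -> None:
        # Sharing a Citation never produces duplicate entries.
        if handle not in self.citation_list:
            self.citation_list.append(handle)

    def remove_citation(self, handle: str) -> None:
        self.citation_list = [h for h in self.citation_list if h != handle]


def remove_source(source_handle, sources, citations, referrers):
    """Delete a Source, removing references and Citations first."""
    doomed = [h for h, c in citations.items()
              if c.source_handle == source_handle]
    for handle in doomed:
        for obj in referrers:
            obj.remove_citation(handle)   # 1. references to the Citation
        del citations[handle]             # 2. the Citation itself
    del sources[source_handle]            # 3. finally the Source
```

Here referrers stands for every object that carries a CitationBase (Person, Event, Attribute and so on); in a real database these would be found through a back-reference lookup rather than passed in as a list.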

The proxies will also need to be updated.

Upgrade

When upgrading from an old database version all objects that have a SourceRef need to be changed to the new format. The primary objects Person, Family, Event, Media and Place contain SourceRef objects. These Primary objects also contain secondary objects which have SourceRef objects.

Person
 Name
 Address
 Attribute
 PersonRef
 MediaRef
  Attribute
 LdsOrd
Family
 Attribute
 ChildRef
 MediaRef
  Attribute
 LdsOrd
Event
 Attribute
 MediaRef
  Attribute
MediaObject
 Attribute
Place
 MediaRef
  Attribute

Each old SourceRef object should be used to create a new Citation record. The old SourceRef will be replaced by a new CitationBase.

Because upgrading an old database version is automatic, the program should not prejudge the user's intention for similar SourceRefs. Therefore, SourceRefs should only be merged if they have the same Volume/Page, Date, Confidence and source and all Notes refer to shared copies of the same Notes (this will be the case where a SourceRef had been created by dragging and dropping on the Clipboard). It would be convenient if there were a separate Gramplet to automatically merge Citation objects on less stringent and probably configurable criteria. Such a Gramplet would need to merge Notes into the merged Citation.

   Upgrade needs to process every SourceRef in primary objects and secondary objects.
   for each SourceRef:
       assemble Volume/Page, Date, Confidence, SourceId and all NoteIds
       for each existing Citation:
           if Volume/Page, Date, Confidence, SourceId and all NoteIds are the same:
               use the existing Citation
       if no match, then create a new Citation
       replace the SourceRef with a reference to that Citation in the object's CitationBase
Should the criteria for matching Citations be weaker, for example just the Volume/Page, Date, Confidence and source matching, with Notes from each SourceRef being added into the Citation?

Import/Export

The following formats will need to be updated: Gramps XML, GEDCOM, CSV, GeneWeb.

Gramps XML

This will need a new <Citations> section with <Citation> entries.

Gramps should be able to import both old and new versions of the Gramps XML. If a <sourceref> tag appears outside of a <Citation> entry then it will indicate an old version.
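
One possible way to apply this version test is sketched below with Python's standard xml.etree.ElementTree module. The function name is hypothetical, and the sketch assumes un-namespaced tags spelled exactly as in this section; a real Gramps XML file declares a namespace, so the tag comparison would need adjusting.

```python
import xml.etree.ElementTree as ET


def has_legacy_sourceref(xml_text):
    """Return True if any <sourceref> appears outside a <Citation> entry."""
    root = ET.fromstring(xml_text)

    def walk(elem, inside_citation):
        if elem.tag == "sourceref" and not inside_citation:
            return True
        # Everything below a <Citation> counts as being inside it.
        inside = inside_citation or elem.tag == "Citation"
        return any(walk(child, inside) for child in elem)

    return walk(root, False)
```

An importer could call this once per file and choose the old or new parsing path accordingly.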

GEDCOM

GEDCOM Gramps
SOURCE_RECORD:=
  n @<XREF:SOUR>@ SOUR {1:1}                              
    +1 DATA {0:1}                                         
      +2 EVEN <EVENTS_RECORDED> {0:M}                    
        +3 DATE <DATE_PERIOD> {0:1}                     
        +3 PLAC <SOURCE_JURISDICTION_PLACE> {0:1}       
      +2 AGNC <RESPONSIBLE_AGENCY> {0:1}                
      +2 <<NOTE_STRUCTURE>> {0:M}                       
Not supported. See feature request 1371. Discussion in Role and event tags...
    +1 AUTH <SOURCE_ORIGINATOR> {0:1}
      +2 [CONC|CONT] <SOURCE_ORIGINATOR> {0:M}
Source:Author
    +1 TITL <SOURCE_DESCRIPTIVE_TITLE> {0:1}
      +2 [CONC|CONT] <SOURCE_DESCRIPTIVE_TITLE> {0:M}
Source:Title
    +1 ABBR <SOURCE_FILED_BY_ENTRY> {0:1}
Source:Abbreviation
    +1 PUBL <SOURCE_PUBLICATION_FACTS> {0:1}
      +2 [CONC|CONT] <SOURCE_PUBLICATION_FACTS> {0:M}
If the publication data is changed to key-value pairs, then on import this GEDCOM field should be stored under a predefined key (e.g. PUBL), and on export all the values should probably be concatenated with comma separators.
    +1 TEXT <TEXT_FROM_SOURCE> {0:1}
      +2 [CONC|CONT] <TEXT_FROM_SOURCE> {0:M}
Not directly supported. Note that in Gramps, one would store the text from source in a note. On import, the 'text from source' should probably be stored as the contents of a Source:NoteId. On export this GEDCOM field would not be output.
    +1 <<SOURCE_REPOSITORY_CITATION>> {0:M}
Contents stored as the content of a Source:RepoRef
    +1 REFN <USER_REFERENCE_NUMBER> {0:M}
      +2 TYPE <USER_REFERENCE_TYPE> {0:1}
Not supported for data interchange (discussion in REFN strategy)
    +1 RIN <AUTOMATED_RECORD_ID> {0:1}
Not supported for data interchange. The Source:gramps_id can be considered to be the AUTOMATED_RECORD_ID
    +1 <<CHANGE_DATE>> {0:1}
Source:change (automatically maintained by Gramps)
    +1 <<NOTE_STRUCTURE>> {0:M}
Contents stored as the content of a Source:NoteId
    +1 <<MULTIMEDIA_LINK>> {0:M}
Contents stored as the contents of a Source:MediaRef
Source:Global Confidence: not supported in GEDCOM


GEDCOM Gramps
SOURCE_CITATION:=
  n SOUR @<XREF:SOUR>@ {1:1} p.27
    +1 PAGE <WHERE_WITHIN_SOURCE> {0:1} p.64
Citation:Volume/page
    +1 EVEN <EVENT_TYPE_CITED_FROM> {0:1} p.49
      +2 ROLE <ROLE_IN_EVENT> {0:1} p.61
Not supported. See feature requests 2918 and 2924 (which are mostly duplicates of each other).
    +1 DATA {0:1}
      +2 DATE <ENTRY_RECORDING_DATE> {0:1} p.48
Citation:Log Date
      +2 TEXT <TEXT_FROM_SOURCE> {0:M} p.63
        +3 [CONC|CONT] <TEXT_FROM_SOURCE> {0:M}
Not directly supported. Note that in Gramps, one would store the text from source in a note. On import, the 'text from source' should probably be stored as the contents of a Citation:NoteId. On export this GEDCOM field would not be output.
    +1 <<MULTIMEDIA_LINK>> {0:M} p.37, 26
Contents stored as the contents of a Citation:MediaRef
    +1 <<NOTE_STRUCTURE>> {0:M} p.37
Contents stored as the contents of a Citation:NoteId. On export all the Data:Value pairs should probably be concatenated with comma separators into a separate note. On export, the fields in the CitationRef should probably be output as notes.
    +1 QUAY <CERTAINTY_ASSESSMENT> {0:1} p.43
Citation:Confidence
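
To make the SOURCE_CITATION mapping concrete, here is a small Python sketch that emits the supported lines for one Citation. It is not the Gramps exporter: citation_to_gedcom is a hypothetical helper, and the table that maps the five Gramps confidence values onto GEDCOM's four QUAY values (0 to 3), leaving the middle value unwritten, is an assumption made for the example.

```python
# Assumed mapping from a 0..4 confidence scale to GEDCOM QUAY 0..3.
# The middle value (2) is treated as "no assessment" and is not exported.
QUAY_FOR_CONFIDENCE = {0: 0, 1: 1, 3: 2, 4: 3}


def citation_to_gedcom(level, source_xref, page="", log_date="",
                       confidence=None):
    """Return the GEDCOM lines for one source citation."""
    lines = ["%d SOUR @%s@" % (level, source_xref)]
    if page:
        # Citation:Volume/page
        lines.append("%d PAGE %s" % (level + 1, page))
    if log_date:
        # Citation:Log Date is nested under DATA
        lines.append("%d DATA" % (level + 1))
        lines.append("%d DATE %s" % (level + 2, log_date))
    quay = QUAY_FOR_CONFIDENCE.get(confidence)
    if quay is not None:
        # Citation:Confidence
        lines.append("%d QUAY %d" % (level + 1, quay))
    return lines
```

For example, citation_to_gedcom(2, "S1", page="p. 27", log_date="12 MAR 1881", confidence=4) yields a SOUR line at level 2 followed by PAGE, DATA/DATE and QUAY lines beneath it.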


User Interface changes

Source model and view

The existing source model and the source view in the navigator are retained unchanged. This minimises the user interface changes. Source objects can be created, edited and deleted as at present.

Citation model

New CitationTreeModel and CitationListModel are introduced. The models encompass both Source objects and Citation objects. The tree model should have lines for Sources with a disclosure triangle, and subsidiary lines for Citations. Fields for Source should include Title, Author, ID, Abbreviation, Publication information and date last changed. Fields for Citation should include Volume/Page, ID, Date, Confidence and date last changed. The models should allow either a Source or a Citation to be selected, with the returned value indicating which of the two was chosen.

This arrangement would allow sorting by Volume/Page (in case this were useful for some users - e.g. if Title is Birth certificate or Marriage certificate, and Volume/Page is the name of the individual).
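
The shape of the tree model can be illustrated without any GUI toolkit. The Python sketch below is an assumption-laden stand-in, not the proposed CitationTreeModel: build_citation_tree and selected_object are hypothetical names, Sources are reduced to a title, and Citations to small dictionaries. It shows Sources as top-level rows, Citations as child rows sorted by Volume/Page, and a selection that reports which kind of object was picked.

```python
def build_citation_tree(sources, citations, sort_key="page"):
    """Build rows of (node_id, title, children), ordered by Source title.

    sources:   {source_handle: title}
    citations: list of dicts with "handle", "source" and "page" keys
    """
    rows = []
    for s_handle, title in sorted(sources.items(), key=lambda kv: kv[1]):
        children = sorted(
            (c for c in citations if c["source"] == s_handle),
            key=lambda c: c[sort_key],
        )
        child_rows = [(("citation", c["handle"]), c["page"]) for c in children]
        rows.append((("source", s_handle), title, child_rows))
    return rows


def selected_object(rows, path):
    """Map a tree path to ("source", handle) or ("citation", handle).

    path is (i,) for the i-th Source row, or (i, j) for its j-th Citation.
    """
    node_id, _title, children = rows[path[0]]
    if len(path) == 1:
        return node_id
    return children[path[1]][0]
```

Returning a tagged pair lets the caller (a selector or a view) decide what to do with a Source as opposed to a Citation without inspecting the object itself.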

Citation Selector

This uses the new CitationTreeModel.

Select source.jpg

The display for the source selector comprises Title/Page and Id.

Citation View

This uses the new citation models.

Add buttons to select Source view or Source Tree view. The default display for either source view comprises Title/Page, Id, Date and Confidence.

There are several possible objectives to add/edit:

  1. Edit a Citation (Citation view: select Citation, click Edit - allows Citation and Source to be changed),
  2. Edit a Source (Source view: select Source, click Edit - only allows Source to be changed),
  3. Add a new Source (Source view: click Add - allows Source to be added),
  4. Add a new Citation to an existing Source (Citation view: click Add; select the Source - allows new Citation, and allows the Source to be changed),
  5. Add a new Citation to a new Source (Source view: add the Source; then Citation view: add the Citation).

In the Citation view:

  • If the "Add a new Citation" button is clicked: Bring up the Source selector. When a Source is chosen, bring up the Source-Citation editor with the specified Source populated. On clicking OK, store the new Citation and if changed, update the Source object.
  • If the "Edit" button is clicked, and a Citation is highlighted, the Source-Citation editor should allow both the Source and Citation to be changed.
  • If the "Edit" button is clicked, and a Source is highlighted: Do nothing
  • If the "Remove" button is clicked, then the highlighted object and all objects that reference it should be removed.

Editor Source tabs

CitationEmbedList replaces all occurrences of SourceEmbedList. The default fields displayed and the buttons etc. remain unchanged, except that the ID field contains the Citation ID rather than the Source ID.

The same Source-Citation editor is used as for editing a Citation.

If the "Create and add a new citation" button is clicked, then allow the creation of both Source and Citation objects. This is consistent with the current model, where "Create and add a new source" adds a new Source object and creates the current sourceref. It is distinct from the "Add an existing source" which allows either an existing Source or an existing Citation and associated Source to be selected. Bring up the Source-Citation editor with all fields empty. On clicking OK, save both a new Source and a new Citation. Error if the source is not entered. Link the CitationBase to the new Citation.

If the "Add an existing source" (Share) button is clicked, the citation selector is displayed. The user will either select a Source or a Citation. Pre-populate the editor according to what has been selected. On clicking OK, save either a new or an updated Citation; save the Source if it was updated (actually, the Source seems to be re-saved even if it was not changed, which may affect the last changed date incorrectly). Link the CitationBase to the new Citation.

If the "Remove" button is clicked, then the highlighted Citation should be removed, together with the links to the citation. The Source is not removed.

Editors

There are two separate 'citation/source' editors:

Add source 1.png
editsource
Source citation changes.jpg
editcitation. As used with the Citation reference having data, then this would not have the warning signs in the top half.

editsource is used from

  • Source view when selecting a source and clicking the "Edit" button,
  • Source view when clicking the "Add" button.

Existing editsource is unchanged.

editcitation is used from

  • Citation view when selecting a Citation and clicking the "Edit" button,
  • Citation view when clicking the "Add a new citation" button (following the display of the source selector, and selection of a source),
  • Editor Source tabs (CitationEmbedList) when clicking the 'Create and add a new citation' button.
  • Editor Source tabs (CitationEmbedList) when clicking the 'Edit the selected citation' button.
  • Editor Source tabs (CitationEmbedList) when clicking the 'Add an existing citation' button.

This is similar to the current editsourceref. On clicking OK,

  • if a Citation was passed in, update the Citation and the linked Source,
  • if nothing was passed in: if the Citation is blank, store only the new Source; otherwise store a new Citation and a new Source,
  • if a Source is passed in, add a new Citation and update the Source.
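
The three OK rules above can be written out as a short Python sketch. Everything here is illustrative: commit_edit and is_blank are hypothetical names, the database is a pair of plain dictionaries, and the test for a "blank" Citation (no page, date or notes) is an assumption, since this section does not define it.

```python
def is_blank(citation):
    """Assumption: a Citation is blank if none of these fields is set."""
    return not any(citation.get(key) for key in ("page", "date", "notes"))


def commit_edit(db, passed_in, source, citation):
    """Apply the OK rules; passed_in is "citation", "source" or None.

    Returns the list of actions carried out, for illustration.
    """
    if passed_in not in ("citation", "source", None):
        raise ValueError("unexpected passed_in: %r" % (passed_in,))

    actions = []
    store_citation = passed_in is not None or not is_blank(citation)
    if store_citation:
        # A Citation always points at exactly one Source.
        citation["source"] = source["handle"]
        db["citations"][citation["handle"]] = citation
        actions.append("update citation" if passed_in == "citation"
                       else "add citation")

    # The Source is stored in every case: new when nothing was passed in,
    # updated otherwise.
    db["sources"][source["handle"]] = source
    actions.append("add source" if passed_in is None else "update source")
    return actions
```

Keeping the decision in one function means the Citation view and the Editor Source tabs can share the same behaviour, differing only in what they pass in.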

Rules

Do we need extra rules to match a Citation?

Reports

Reports access Source References through the Bibliography and Endnotes functionality. This allows the changes to be made in a single place.

Some changes will be needed in the Narrative Web and Simple Database Access functionality.


Links

Background

Bug tracker

1022: Sources dialog hangs for 2 minutes after opening
2918: Add Gedcom Source Citation Fields
2924: Implement GEDCOM EVENT_TYPE_CITED_FROM and ROLE_IN_EVENT
4491 (Feature request): Matching source and quality level
4913 (Feature request): Additional Event Filters

Other interfaces