GEPS 024: Natural transcription of Records

From Gramps
Revision as of 08:47, 25 May 2013 by Bmcage (talk | contribs) (Needs)

This was formerly GEPS 024: Certificates.

Natural transcription of Records is a method for creating and storing genealogical information from a document. For example, one might create a Census Record that allows users to enter data straight from a census sheet, as the current Census Gramplet already does. In this GEP, the associated data would be stored in a well-defined location in the database and in exported file formats, so that the record not only leads to updates of the data in the family tree, but also allows retracing how data flowed from a source into the family tree.

Needs

  1. A way to create a record definition and to edit it over time
    1. See Gramps Census XML format
  2. A way to view the records
  3. A way to map record fields onto database items, with user intervention to link fields to existing objects or to skip adding data
  4. A way to recreate the document from the database (see Gramps Census Report)
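A record definition with a field mapping could be sketched as follows. This is only an illustration under assumptions: the element and attribute names (record, header, column, maps_to) and the classes FieldDef and RecordDefinition are hypothetical and are not taken from the actual Gramps Census XML format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical definition file; names are illustrative, not the real Gramps format.
DEFINITION_XML = """
<records>
  <record type="Census" layout="UK1871">
    <header id="place" label="Place"/>
    <header id="page" label="Page"/>
    <column id="name" label="Name" maps_to="person.name"/>
    <column id="age" label="Age" maps_to="event.age"/>
    <column id="occupation" label="Occupation" maps_to="person.occupation"/>
  </record>
</records>
"""


@dataclass
class FieldDef:
    id: str                        # stable identifier used when storing data
    label: str                     # text shown to the user
    maps_to: Optional[str] = None  # assumed name of the target database item


@dataclass
class RecordDefinition:
    record_type: str
    layout: str
    headers: list = field(default_factory=list)
    columns: list = field(default_factory=list)


def load_definitions(xml_text):
    """Parse all record definitions, keyed by (record type, layout)."""
    root = ET.fromstring(xml_text)
    definitions = {}
    for node in root.findall("record"):
        definition = RecordDefinition(node.get("type"), node.get("layout"))
        for child in node:
            item = FieldDef(child.get("id"), child.get("label"), child.get("maps_to"))
            if child.tag == "header":
                definition.headers.append(item)
            elif child.tag == "column":
                definition.columns.append(item)
        definitions[(definition.record_type, definition.layout)] = definition
    return definitions


definitions = load_definitions(DEFINITION_XML)
uk1871 = definitions[("Census", "UK1871")]
print([column.label for column in uk1871.columns])
```

Keeping the mapping target next to each column is one possible design; the mapping could equally live in a separate file so that layouts and mappings can be edited independently.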

Records themselves would not be part of GEDCOM export. The data derived from a record would be present in the family tree and would appear in GEDCOM in that form.

Example Workflows

Census

  1. Find a census sheet that you would like to enter data from
  2. In the Records transcription view, select Record Type 'Census' and a Layout, e.g. 'UK1871'. Record Types and Layouts are predefined in an XML file.
  3. Set the header data of the census
  4. Add the rows column by column, as they appear on the census sheet. This is literal transcription.
  5. An import function could be written for downloadable content, so that the rows are filled in automatically from the downloaded data
  6. Click the "Transcribe to Family Tree" button. This saves the Record to the database. Gramps calculates what would be added to an empty family tree from this data, the "Proposed Transcription". On the left an empty 'Before' state is shown, on the right the new data (Person Objects, Census Events, Source Objects, Citation Objects, ...).
  7. The empty left part shows drop-down boxes for selecting existing Persons that could be the people in the census, based on the name entered. The user can explicitly select an existing person. This updates the right part of the window. Check boxes in the right part allow further choices, e.g. 'Set name as alternate name', ....
  8. For every setting in the left part of this "Proposed Transcription", the user can give a "Reasoning", which is free text, a "Conclusion", which is also free text, and a Confidence Level.
  9. The user needs to approve the changes
  10. On approval, the data is stored in the family tree. The Confidence Level goes to the citations. The Reasoning and Conclusion are stored in a Note of type "Analysis Document" in the citation. The Citation holds a link to the Record ID.
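The "Proposed Transcription" step in the workflow above can be sketched as follows. This is a minimal sketch, not Gramps code: the classes Person and Decision, the functions candidates_for, propose and approve, the surname-only matching rule, the handle format and the confidence value "Normal" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    handle: str
    name: str


@dataclass
class Decision:
    """The user's choice for one transcribed row, with its justification."""
    row_name: str
    matched_handle: Optional[str]  # None means: create a new person
    reasoning: str = ""
    conclusion: str = ""
    confidence: str = "Normal"     # assumed label, would end up on the citation


def candidates_for(row_name, tree):
    """Existing persons sharing the surname (last word) of the transcribed name."""
    parts = row_name.split()
    if not parts:
        return []
    surname = parts[-1].casefold()
    return [p for p in tree if p.name.split() and p.name.split()[-1].casefold() == surname]


def propose(rows, tree):
    """Build the before/after proposal: candidates on the left, new data on the right."""
    return [
        {"row": row, "candidates": candidates_for(row["name"], tree), "new_person": row["name"]}
        for row in rows
    ]


def approve(decisions, tree):
    """Apply approved decisions: add a person only where no existing one was selected."""
    next_id = len(tree) + 1
    for decision in decisions:
        if decision.matched_handle is None:
            tree.append(Person(f"I{next_id:04d}", decision.row_name))
            next_id += 1
    return tree


tree = [Person("I0001", "John Smith"), Person("I0002", "Mary Jones")]
rows = [{"name": "Jon Smith"}, {"name": "Ann Brown"}]
proposal = propose(rows, tree)

decisions = [
    Decision("Jon Smith", "I0001", reasoning="Same street and age as in 1861",
             conclusion="Same person, spelling variant", confidence="High"),
    Decision("Ann Brown", None),
]
approve(decisions, tree)
print([p.name for p in tree])
```

A real implementation would need a better matching rule than a surname comparison, and would also create the events, sources and citations mentioned in the workflow.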

Baptism Record

Baptism records typically list the following information:

  1. The gender and name of the child(ren)
  2. The date of baptism, sometimes the date of birth as well
  3. The names of the parents of the child. Some lazy record keepers list only the father.
  4. Usually, the names of the witnesses. The baptism had to be witnessed by at least two people. In Catholic families, these were the godparents.
  5. Often the name of the priest

So, the workflow here would be:

  1. Find the baptism record you would like to enter data from
  2. In the Records transcription view, select Record Type 'Baptize Church Record' and Layout 'Generic'. Record Types and Layouts are predefined in an XML file.
  3. Set the header data of the record: this would be the page number or index number, the name of the church, and the source title. Publication info could be imported from e.g. BibTeX.
  4. As in Census, every 'entry' could be added column by column. As normally only one row is needed, it would be logical to default to an editor style of data entry. For a baptism record, natural language input may be possible, like a form that must be filled in (Think: On ______ the <boy/girl> was baptized, and given the name _______ ....)

This step is literal transcription. Note that twins need two entries, so 'multiple rows' are needed here too, as in the census.

  5. An import function could be written for downloadable content, so that the entries are filled in automatically from the downloaded data
  6. Click the "Transcribe to Family Tree" button. This saves the Record to the database. Gramps calculates what would be added to an empty family tree from this data, the "Proposed Transcription". On the left an empty 'Before' state is shown, on the right the new data (Person Objects, Baptism Events, Associations to witnesses, Source Objects, Citation Objects, ...).
  7. The empty left part shows drop-down boxes for selecting existing Persons that could be the people in the record, based on the name entered. The user can explicitly select an existing person. This updates the right part of the window. Check boxes in the right part allow further choices, e.g. 'Set name as alternate name', ....
  8. For every setting in the left part of this "Proposed Transcription", the user can give a "Reasoning", which is free text, a "Conclusion", which is also free text, and a Confidence Level.
  9. The user needs to approve the changes
  10. On approval, the data is stored in the family tree. The Confidence Level goes to the citations. The Reasoning and Conclusion are stored in a Note of type "Analysis Document" in the citation. The Citation holds a link to the Record ID.
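The fill-in-the-form style of entry suggested for baptism records could be driven by a sentence template with named blanks. The following is a minimal sketch; the template wording, the field names (date, gender, given_name) and the helper functions blanks and render are hypothetical.

```python
import string

# Hypothetical sentence template; each {name} is one blank the user fills in.
TEMPLATE = "On {date} the {gender} was baptized, and given the name {given_name}."


def blanks(template):
    """Return the names of the blanks in the order they appear in the template."""
    return [name for _, name, _, _ in string.Formatter().parse(template) if name]


def render(template, values):
    """Fill in the template, showing underscores for blanks not yet entered."""
    filled = {name: values.get(name) or "______" for name in blanks(template)}
    return template.format(**filled)


# Twins need one entry each, so the same template is simply used for two rows.
rows = [
    {"date": "3 May 1871", "gender": "girl", "given_name": "Anna"},
    {"date": "3 May 1871", "gender": "boy"},
]
for row in rows:
    print(render(TEMPLATE, row))
```

The list returned by blanks could double as the column list of the record layout, so that the form view and the row view stay in step.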

Retracing Steps

The normal Gramps data now holds the same information it would hold without natural transcription. However, in the normal data it is hard to see quickly which data comes from which source: notes and citation objects must be checked. In this proposal the Citation holds a link to the Record ID, so for all data entered in this way, the record the user entered it from can be shown again. Note that this information will also be present in the normal family tree, spread over a collection of sources, citations and objects. Seeing it in its original form is useful. (An alternative is to store this duplicate in all citation objects, or in the main source object, as a Note.)

As the Record is still present, a user could in theory ask to see what e.g. a person would look like if a certain record were not taken into account. This opens many possibilities.
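Retracing through the Citation's link to the Record ID could look roughly like this. It is a sketch under simplifying assumptions: the Citation and Fact classes and both functions are hypothetical, and each fact is assumed to have exactly one citation, which is not generally true in a real family tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    handle: str
    record_id: Optional[str]  # None for citations not created from a Record


@dataclass(frozen=True)
class Fact:
    person: str
    description: str
    citation: str  # handle of the single supporting citation (simplification)


def records_for(person, facts, citations):
    """Record IDs that contributed data to this person."""
    return {
        citations[f.citation].record_id
        for f in facts
        if f.person == person and citations[f.citation].record_id is not None
    }


def without_record(person, facts, citations, record_id):
    """The person's facts as they would be if one record were not taken into account."""
    return [
        f for f in facts
        if f.person == person and citations[f.citation].record_id != record_id
    ]


citations = {
    "C1": Citation("C1", "R0001"),  # from a census record
    "C2": Citation("C2", "R0002"),  # from a baptism record
    "C3": Citation("C3", None),     # entered by hand
}
facts = [
    Fact("I0001", "Occupation: weaver", "C1"),
    Fact("I0001", "Baptism: 3 May 1871", "C2"),
    Fact("I0001", "Nickname: Jack", "C3"),
]

print(sorted(records_for("I0001", facts, citations)))
print([f.description for f in without_record("I0001", facts, citations, "R0001")])
```

With several citations per fact, a fact would only drop out of the view when every citation supporting it points at the excluded record.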

Other

Current Census Limitations

Things needed to bring the Census work up to this level of integration:

  1. If you change the way a certificate is defined, we need a way to change the data. For example, changing a column name disassociates all of the information in the database.
  2. How to handle translations?
  3. Code should be able to add items such as sources to the database
  4. Items lose their ordering
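One possible way to address the renaming, translation and ordering limitations listed above is to store row data against stable column identifiers, in an ordered list, and to keep labels separate. This is a hypothetical sketch of that idea, not a description of how the Census Gramplet stores data; all names and the data layout are assumptions.

```python
# Column definitions: a stable id plus per-language labels (hypothetical layout).
columns = [
    {"id": "name", "labels": {"en": "Name", "nl": "Naam"}},
    {"id": "age", "labels": {"en": "Age", "nl": "Leeftijd"}},
]

# Stored row: an ordered list of (column id, value) pairs, so ordering survives.
row = [("name", "John Smith"), ("age", "34")]


def display(row, columns, lang="en"):
    """Pair each stored value with its label in the requested language."""
    labels = {c["id"]: c["labels"].get(lang, c["labels"]["en"]) for c in columns}
    return [(labels[column_id], value) for column_id, value in row]


def rename_label(columns, column_id, lang, new_label):
    """Change a displayed label without touching the stored data."""
    for column in columns:
        if column["id"] == column_id:
            column["labels"][lang] = new_label


print(display(row, columns, "nl"))
rename_label(columns, "name", "en", "Name and Surname")
print(display(row, columns, "en"))
```

Because the stored row refers only to the id, renaming or translating a column label leaves existing data associated with its column.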

Other Connections

  • Gramps' FamilySearch API will perhaps have the ability to connect and download an entire census sheet. Consider creating the Certificate from the downloaded definition.

See also