GEPS 015: Repository Research Support

From Gramps
Revision as of 19:37, 17 October 2011 by Gioto (Talk | contribs)

This is a work in progress; I hope to develop the content over the next couple of weeks. - rjt

User Stories

1. Plan a research visit to a Repository

Aunt Martha (AM) is planning a visit to the UK National Archives at Kew, London. She wants to produce a report from GRAMPS that lists all of the Sources held in the National Archives that might be of interest to her. She also wants the report to give her a list of all the people in her database that could potentially appear in those sources, without showing those that she already has the source attached to. She would also like the report to include a table with a row for each person and a column for each piece of information that the Source should contain, so that she can print it out and fill it in while she is at Kew.

2. Import Sources for a Repository

AM has discovered that Ancestry is a Repository that is a good place to find genealogy sources relevant to her database. She wants to import the information about the sources that are held by Ancestry so that she can start her research. She clicks on the 'Import Repository' button, gets a list of all of the Repositories that are in the online GRAMPS Repository database, and imports Ancestry. This populates the Sources that are contained in Ancestry and adds Ancestry to her list of Repositories.

3. Create Research Plan for a Person

AM sits down at her computer. She has an hour to spare and wants to progress her research. She selects her Grandfather, Frank, in the Person View. GRAMPS shows her a research plan for Frank that lists all the Sources that Frank might be found in and the type of information that can be found in each Source. It also shows her the Repositories that contain those sources so that she can immediately start to look for the information.

Reports

The first report is a research plan for a Repository. It shows all the sources that are held in that repository, provided there are people that might appear in them. For each source it shows the candidate people and provides a template for recording the information found in the source.

Research plan mock.jpg

The second report is an individual research plan. It shows all the sources that the person might be listed in. These would be filtered so that those already listed for the individual are excluded (or perhaps listed at the end).

Individual plan mock.jpg

Implementation Issues

Supporting queries

There are two primary queries:

* get_all_people_that_might_appear_in_source(source)
* get_all_sources_that_a_person_might_appear_in(person)

Both of these queries require some way of matching a Person to a Source. There are two types of source meta-data that could be used for doing this matching:

  1. dates - if a Source had a start and end date we could match this against the result of probably_alive
  2. places - places are trickier. What we want to ask is: is this person likely to have lived somewhere in the region covered by this source? An initial implementation might associate a Place object with a Source and match if the Person has any Place or Address references that match all of the fields that are set in the Source's Place. So for a Census the Source's Place would say England, and if any of the Addresses on the Person also had England it would be a match. A Source might need to have multiple Places, and the matching algorithm might need to be rather fuzzy. A 'default place' might be needed to cover all the People that have no Address or Place references.
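The initial place-matching rule described above could be sketched roughly as follows. This is only an illustration: places are modelled as plain dictionaries of field name to value, and the names `place_matches`, `person_matches_source` and `default_place` are hypothetical, not existing GRAMPS API. Fuzzy matching is not attempted; comparison is a case-insensitive equality test.

```python
from typing import Dict, Iterable, Optional

# Hypothetical representation: a place is a dict of field name -> value.
# Unset fields are absent or empty. This is not the GRAMPS Place class.
Place = Dict[str, str]


def place_matches(source_place: Place, person_place: Place) -> bool:
    """True if every field set on the Source's place equals the person's."""
    set_fields = {k: v for k, v in source_place.items() if v}
    return all(
        person_place.get(field, "").strip().lower() == value.strip().lower()
        for field, value in set_fields.items()
    )


def person_matches_source(
    source_places: Iterable[Place],
    person_places: Iterable[Place],
    default_place: Optional[Place] = None,
) -> bool:
    """Match if any of the person's places matches any Source place.

    If the person has no places at all, the optional default place
    (an assumption of this sketch) is used instead.
    """
    candidates = list(person_places)
    if not candidates and default_place is not None:
        candidates = [default_place]
    sources = list(source_places)
    return any(place_matches(s, p) for s in sources for p in candidates)
```

For a Census whose place only sets the country, `place_matches({"country": "England"}, {"city": "Leeds", "country": "England"})` is a match because the only field set on the Source side agrees.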

I think that the date matching is clearly simpler than the place matching and should be the initial target.
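As a rough sketch of how the two queries might look with date matching only, the example below uses small stand-in classes rather than the real GRAMPS objects. The classes, the `might_appear` helper and the extra `people` / `sources` arguments are assumptions made so that the example is self-contained; a real implementation would read these from the database and would use probably_alive rather than known birth and death dates.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Source:  # hypothetical stand-in, not the GRAMPS Source class
    title: str
    start: date
    end: date


@dataclass
class Person:  # hypothetical stand-in, not the GRAMPS Person class
    name: str
    birth: date
    death: date


def might_appear(person: Person, source: Source) -> bool:
    """True if the person's lifespan overlaps the period the source covers."""
    return person.birth <= source.end and person.death >= source.start


def get_all_people_that_might_appear_in_source(
    source: Source, people: List[Person]
) -> List[Person]:
    return [p for p in people if might_appear(p, source)]


def get_all_sources_that_a_person_might_appear_in(
    person: Person, sources: List[Source]
) -> List[Source]:
    return [s for s in sources if might_appear(person, s)]
```

Both queries share one predicate, so adding place matching later would only mean extending `might_appear`.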

A further aspect of the meta-data is to record what information a Source contains. This could be as simple as a list of titles (e.g. Birthday, Name, Sex etc.) or it could be more sophisticated. It could be a list of Event templates. The Event templates could then be used to check against the Person, so the Source only matches against the Person if they do not have an Event of that type that references that Source. It might even be possible to right-click the Source on a Probable Sources tab on the Person and select Populate Events to create the empty events on the Person, already set up to reference the Source.
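The Event template idea could work along these lines. In this sketch a person's events are reduced to (event type, source id) pairs and a Source's templates to a list of event type names; both are simplifications for illustration, and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical model: each event is an (event_type, source_id) pair.
Event = Tuple[str, str]


def missing_event_types(
    templates: Iterable[str], person_events: Iterable[Event], source_id: str
) -> List[str]:
    """Template event types the person lacks for this particular Source."""
    have = {etype for etype, sid in person_events if sid == source_id}
    return [t for t in templates if t not in have]


def source_is_candidate(
    templates: Iterable[str], person_events: Iterable[Event], source_id: str
) -> bool:
    """The Source only matches if at least one templated event is missing."""
    return bool(missing_event_types(templates, person_events, source_id))


def populate_events(
    templates: Iterable[str], person_events: Iterable[Event], source_id: str
) -> List[Event]:
    """Build the empty events that 'Populate Events' would create."""
    return [
        (etype, source_id)
        for etype in missing_event_types(templates, person_events, source_id)
    ]
```

Once every templated event exists for the person and references the Source, `source_is_candidate` returns False and the Source drops out of the person's research plan.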

To be able to produce these reports it is going to be necessary to record additional information in the database. Most of this additional information is recorded against Sources but some will also be needed against people and possibly against Repositories.

Important issues

  1. It must be possible to exclude Sources altogether. Many Sources are not related to documentary evidence and you would not want them cluttering up the reports.
  2. It must be possible to exclude Sources for a Person. If you know that you have checked a Source for an Individual you want to exclude that Source from showing up next time you run the reports.
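The two exclusion rules above could be applied as a simple filter before any matching is done. The sketch below assumes, purely for illustration, that the global flag is held as a mapping from source id to a boolean and that per-person exclusions are held as a set of (person id, source id) pairs. It also assumes that a Source with no flag is left out; the opposite default would be equally defensible.

```python
from typing import Dict, Iterable, List, Set, Tuple


def reportable_sources(
    person_id: str,
    source_ids: Iterable[str],
    include_flag: Dict[str, bool],
    already_checked: Set[Tuple[str, str]],
) -> List[str]:
    """Sources that may still be shown for this person in the reports."""
    result = []
    for sid in source_ids:
        # Rule 1: Sources not flagged for research are excluded altogether.
        # Treating a missing flag as "excluded" is an assumption of this sketch.
        if not include_flag.get(sid, False):
            continue
        # Rule 2: Sources already checked for this person are excluded.
        if (person_id, sid) in already_checked:
            continue
        result.append(sid)
    return result
```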


Implementation Plan

It should be possible to make a start on this by storing the Source meta-data in the key/value data of the Source. This will allow an initial proof-of-concept of the reports without touching the database schema.

For instance:

key              value
_r_:start_date 1841-06-06
_r_:end_date   1841-06-07
_r_:include    True
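A report would need to turn these string values into typed values. A minimal sketch, assuming the key/value data is available as a plain dictionary of strings and that dates are stored in ISO form as in the example above; `ResearchMeta` and `parse_research_meta` are hypothetical names:

```python
from datetime import date
from typing import Dict, NamedTuple, Optional

PREFIX = "_r_:"  # key prefix taken from the example keys


class ResearchMeta(NamedTuple):
    start_date: Optional[date]
    end_date: Optional[date]
    include: bool


def parse_research_meta(data: Dict[str, str]) -> ResearchMeta:
    """Read the research keys from a Source's key/value data."""

    def to_date(text: Optional[str]) -> Optional[date]:
        # Missing or empty values become None; malformed dates raise ValueError.
        return date.fromisoformat(text) if text else None

    return ResearchMeta(
        start_date=to_date(data.get(PREFIX + "start_date")),
        end_date=to_date(data.get(PREFIX + "end_date")),
        # Anything other than "True" (case-insensitive) counts as excluded.
        include=(data.get(PREFIX + "include") or "").strip().lower() == "true",
    )
```

Keeping the parsing in one place means the reports would not need to change if the meta-data later moves out of the key/value data and into the schema.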

A more general version of probably_alive would be needed that can take a range and decide if the Person might be alive during that period.


Testing

Future Possibilities

Related Work

Jerome's RepositoriesReport is an example of what can be done at the moment. This GEP seeks to develop this idea further.