Difference between revisions of "Addon:Place completion tool"

{{Third-party plugin}}
[[Category:Proposed Tool Specifications]]
[[File:Place completion tool example1.png|right|300px|Place completion tool - Example Results]]

== Place Completion Tool ==
A tool to bring the places in your Gramps database in accordance with the Gramps requirements: batch add country, county; look up latitude/longitude; set description (title); ...
  
The place completion tool grew out of a desire to output the places used in events to maps. For this to work, the place information in Gramps must be correct. However, many users neglect place entry: a short description and the city are added, and that's it.

This tool helps you fill in place attributes such as county and country, by letting you select the places you want to work on and apply changes to all of them with one button click.
 
  
 
The general aims are:
*Place/Location is a newer concept in Gramps. Many older databases only have a Place title field, which is a descriptive text containing city, state and country. This should be parsed to insert the values into the correct attribute fields.
*Latitude and longitude are important for showing data on a map. However, looking up this data on the internet by hand is slow and time-consuming. The tool can search the free resources available on the net.
*Setting an attribute on a set of places in one go.
*Conversion of latitude and longitude to a fixed data format. On import you might obtain latitude and longitude in several different formats. A conversion tool to store them all in the same format is useful.
*Construction of a uniform title/description field from the data in the place object.
  
Follow the installation details in the [[Place_completion_tool#Download|Download]] section for your version of Gramps.

== Usage Instructions ==
The place completion tool provides a lot of functionality. These usage instructions should help you understand how it works.
  
 
=== Download resources ===
{{stub}}<!--page needs to be reworked and maybe split-->
The place completion tool can look up latitude/longitude for you, add county information (USA), and so on. For some of this functionality, you must download data files for the countries you are interested in. Right now you have three options:

#Download GeoNames country files. You can do this [http://download.geonames.org/export/dump/ here freely]. {{man menu|GeoNames parses fastest, so it is the advised format to use.}}
#Download GeoNames USA state files. You can do this [http://geonames.usgs.gov/domestic/download_data.htm here freely]. This is advised for USA searches, as the USA country file contains many duplicates, which can be avoided by searching state by state. The state files also contain county information.
#Download GNS Geonet country files (not available for the USA). You can do this [ftp://ftp.nga.mil/pub2/gns_data here freely with ftp].

Watch out, some of these downloads are '''VERY''' large, especially USA data. Only download what you need! If the download is a compressed zip file, you will need to extract the data file before you can use it.

{{man note|Note:|The GeoNames data for popular places is in English, so for example municipalities in Italy will be found, but Roma will not, as this is Rome in English. To find such places you need to search in the localised variants of the name (see below).}}

{{man warn|Warning:|DO NOT BETA TEST WITH YOUR RESEARCH DATA. EXPORT DATA FIRST TO HAVE A BACKUP.}}
  
 
=== Starting the tool ===
You will find the plugin under {{man menu|Tools > Utilities > PlaceCompletion...}}

=== The dialog explained ===
[[File:Place completion tool empty.png|600px|The Place Completion Tool]]

The dialog consists of 4 parts:
====Part 1. Selection of places====
First you need to choose which places you want to work with. You can define your places in several ways:
# Use a place filter. Two preset filters are available: ''All places'', which returns all places, and ''No Latitude/Longitude given'', which returns all places for which the latitude or the longitude is not set. You can also create a custom place filter in the place view, test it with the filter sidebar, and then use it in this tool. All custom filters you have made will be available.
# To avoid having to make a filter for every city, state, and so on in your data, you can set the country, state, county, city or parish of the places you want to search on. This works just like the filter sidebar in the places view.
# Use a latitude/longitude rectangle. For example, suppose you have the latitude and longitude of all places in the UK, and now want to set the state attribute to ''Wales'' for all places in Wales. You can look at a map and note down the latitude and longitude of the centre of Wales, as well as roughly the width and height of a rectangle around it. This selects all places in Wales (and some in England), so you can set the state information much faster.
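
The rectangle selection can be pictured with a small Python sketch. This is not the tool's own code: the function name <code>in_rectangle</code>, the assumption that width and height are given in decimal degrees, and the rough coordinates used for Wales are all illustrative.

```python
def in_rectangle(lat, lon, centre_lat, centre_lon, width, height):
    """Return True if (lat, lon) lies inside the rectangle.

    Assumption of this sketch: width is the east-west extent and height
    the north-south extent, both in decimal degrees.
    """
    return (abs(lat - centre_lat) <= height / 2
            and abs(lon - centre_lon) <= width / 2)


# Rough, illustrative values for a rectangle around Wales.
WALES = dict(centre_lat=52.3, centre_lon=-3.7, width=3.0, height=2.5)

# Approximate coordinates, for illustration only.
places = {
    "Cardiff": (51.48, -3.18),
    "Bangor": (53.23, -4.13),
    "London": (51.51, -0.13),
}

selected = [name for name, (lat, lon) in places.items()
            if in_rectangle(lat, lon, **WALES)]
print(selected)
```

A plain rectangle like this also catches places just across the border, which is why the results should be checked before applying a change. The sketch ignores rectangles that cross the 180° meridian.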
  
====Part 2. Completion of places ====
#The first possibility is to look up the latitude and longitude of your places in a data file. For this you must have downloaded the necessary resources, see the section above. You select the file you want to search with a file dialog, and set how this data must be parsed. The following parsing options are available:
##''GeoNames country file, city search'': use the city attribute to look for lat/lon in a GeoNames country file. This is the fastest search.
##''GeoNames country file, city localized variants search'': use the city attribute to look for lat/lon in a GeoNames country file using the localised (non-English) known names in the GeoNames file. For example, Roma will be found with this option (as Roma is the Italian local variant of the English name Rome).
##''GeoNames country file, title begin, general search'': use the start of the title field to search in a GeoNames file. The start means everything before the first comma. This allows you to find landmarks, squares, and so on. For example, if the title of your place is ''Piazza Navona, Rome'', this search will find the latitude and longitude of this famous square in Rome.
##''GeoNames USA state file, city search'': looking for places in the USA country file is almost worthless: it takes a long time and every name exists several times. Hence, it is worthwhile to search state by state. If a USA state file is selected for a search, you '''must''' select this option. The city attribute is used for the search.
##''GNS Geonet country file, city search'': use the city attribute to search in a GNS file (slower than the GeoNames search!).
##''GNS Geonet country file, title begin search'': use the start of the title of a place to search in a GNS file. The start means everything before the first comma.
#A second option is to parse some existing data in your places.
##You can parse the title attribute to extract information from it. For example, a title like ''Albany, NY'' can be used to set the city attribute to ''Albany'' and the state attribute to ''NY''.
##You can set the title of all the selected places in a uniform way. This is useful if imports have left you with different styles for the title field, which can be annoying in reports. At the moment there are two options:
###Set title field to ''City[, State]'': the title of your places will contain the city and, if the state field is present, the state appended after a comma.
###Set title field to ''Titlestart[, City][, State]'': the present start of your title will be kept. If this start is not the city, then the city will be appended. If the state is present, the state will also be appended. An example: suppose your title is ''Piazza Navona, Italy'', the city is ''Rome'' and the state is ''Lazio''. Using this option would change the title attribute into ''Piazza Navona, Rome, Lazio''.
##Convert latitude and longitude to a uniform format. Again, due to import or copy/paste, you might have latitude and longitude entered in different formats, which is annoying in reports. This option allows you to set the lat/lon of all selected places to one form. The conversion also catches mistyped lat/lon values and gives a warning. The options are:
###All in degree notation: use the classical degree notation with degrees, minutes and seconds.
###All in decimal notation: use the decimal system to denote lat/lon.
###Correct -50° in 50°S: a common error is to use a minus sign in the classical degree notation, which is wrong and which Gramps cannot interpret. This option finds and corrects this error.
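
The latitude/longitude conversions listed above can be illustrated with a short Python sketch. It is not the tool's implementation: the function names and the exact output format (for example <code>50°30'00"S</code>) are assumptions made for this example.

```python
def fix_signed_degrees(value, is_latitude):
    """Rewrite a signed degree value such as '-50°' as '50°S'.

    Values without a degree sign, or that already end in N, S, E or W,
    are returned unchanged.
    """
    value = value.strip()
    if "°" not in value or value[-1] in "NSEW":
        return value
    negative = value.startswith("-")
    body = value.lstrip("+-")
    if is_latitude:
        suffix = "S" if negative else "N"
    else:
        suffix = "W" if negative else "E"
    return body + suffix


def decimal_to_dms(value, is_latitude):
    """Convert decimal degrees to degrees, minutes and whole seconds."""
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{degrees}°{minutes:02d}'{seconds:02d}\"{hemisphere}"


print(fix_signed_degrees("-50°", is_latitude=True))
print(decimal_to_dms(-50.5, is_latitude=True))
```

Rounding to whole seconds is a simplification; a real conversion would probably keep fractional seconds.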
  
====Part 3. Overview of the results ====
After entering all data in Parts 1 and 2, click {{man button|Find}} to let Gramps search for all changes that would occur. This part of the dialog shows those changes.

[[Image:place_completion_tool_results.png]]

All selected places are shown. If changes will be made to a place, every change is listed as a subentry of that place.

If the change will '''overwrite''' an existing entry, the subentry is '''shown in orange'''.

{{man warn|Warning|TO AVOID PROBLEMS, GO OVER ALL CHANGES QUICKLY, AND CHECK ALL ENTRIES IN ORANGE!}}

The following actions are possible in the result screen:
# Press delete to delete the entry, making sure that this change will not occur. You can delete the place entry to delete all its changes, or select one subentry to delete only that specific change.
# Double-click on an entry to open the place dialog. If you double-click on the place entry, '''all changes will be preentered'''. If you double-click on a subentry, only this specific change will be preentered in the place dialog.
# Press tab to open Google Maps in a browser window. Pressing tab on a subentry showing a '''new''' lat/lon entry will open Google Maps at this new lat/lon position. Pressing tab on the top place entry will open Google Maps at the old lat/lon position or, if that is not known, search with the title/city field.

====Part 4. Actions ====
After you have checked the changes in '''Part 3''', you can apply them by clicking the {{man button|Apply}} button. Check for duplicates and overwrites before clicking it: the changes cannot be undone. Take a backup if unsure!

Selecting {{man button|Help}} will bring you to this page, clicking {{man button|Close}} will close the ''Place completion tool'' window, and clicking {{man button|Google Maps}} when an entry is selected in the results field has the same effect as pressing tab on an entry (see above).

== Example ==
Open the example file from the examples where latitude and longitude are empty: [http://svn.code.sf.net/p/gramps/code/branches/maintenance/gramps34/example/gramps/example.gramps example.gramps].

We will now show how the places in this file can be completed. The best thing to do is to create a new Family Tree, give it a name, and import the example.gramps file. This file has 852 places, which would mean a lot of manual edits if you did not use this tool!

Now, open the place view. You will see all places are of the form:
:Aberdeen, WA
This value is the <code>Place Name</code> attribute (the title or description of the place).

=== Step 1: City and State data ===
Our first step will be to split this field into a <code>City</code> value (here Aberdeen) and a <code>State</code> value (here WA).

We open the ''Place completion tool'':
[[File:Place completion tool example1.png|500px|left|Parse the Place Name Field]]
Here we have selected {{man label|All Places}}, and set '''Change the title into''' to {{man label|City [,State]}}. Click on {{man button|Find}}, quickly scan the data to check that all looks OK, and then click on {{man button|Apply}}. You are notified that 443 place records were modified. This is one less than the number of places. Indeed, one place has a different type of title: ''Puerto Rico'' has no state information.

{{-}}

=== Step 2: Look-up latitude and longitude ===
We have downloaded the GeoNames data files for the USA states, and will now use them to complete the latitude and longitude of the places. At the same time, this will fill in the county field.

[[Image:place_completion_tool_example2.png|500px|left|Look up lat/lon for Alaska]]

In the screenshot, you see we have selected All Places with State=AK. In the second part of the window we specify that we want to search in the AK_DECI.txt file downloaded from GeoNames, using the parsing method ''GeoNames USA state file, city search''.
{{-}}
Note that if you want to change AK into Alaska, this is possible: just set state=Alaska in the set attributes section of the window.

Do this now for all the states. Always check for duplicates. For example, for state ''AL'', going over the changes, we encounter:

[[Image:place_completion_tool_example3.png|Double in lat/lon, city Enterprise exists in two counties]]

We see that the first time 'Enterprise' is found, it is in county ''Coffee'' at lat/lon 31.31/-85.85. The second hit is for county ''Chilton'' at lat/lon 32.73/-86.62.

You can now use the Google Maps button (or press the TAB key) while the lat/lon subentry is selected to see where this city is in both cases. From this it will be clear, for example, that the second is a hamlet, not really a city, while the first is a real city. So now, select the second lat/lon entry, and delete it by pressing the DEL key. Do the same for the second county entry.

In case Google Maps did not allow you to determine which is the correct city, you can double-click on the city to open the Place Dialog ('''Warning: this will preenter the data of the Place Completion tool. So hit cancel here if you want to exit without applying these changes'''). In this dialog the references tab allows you to navigate to all events coupled to this place. This gives you extra information you can use to decide which of the two found places is the correct one.

=== Step 3: Problem entries ===
While updating all places in step 2, you will have noticed some errors in the state information: some places have a dubious state, for example OH-AL.

You can find these places by choosing ''All Places'' and setting the state search box to '''-'''. Clicking Find will give you all these problem places. You can use Google Maps or the place dialog to sort them out. You can also use the USA country GeoNames file to search for these places in the entire USA.

{{man note|Note:|You will need sufficient memory for this, or you will obtain a MemoryError (see [[Place_completion_tool#Memory_Error|below]])!}}

=== Step 4: Lat/Lon not found ===
After the above, some 45 places still have no latitude/longitude. You can now select these places by setting the Place filter to 'No Latitude/Longitude', which finds all places without coordinates.

It will be clear that many of those can be quickly corrected:
* abbreviations: for example, the city field contains ''St.George'', which should be ''Saint George'';
* double names: for example, ''Waterloo-Cedar Falls, IA'' means Waterloo near Cedar Falls. Changing the city to Waterloo and redoing the search, with the help of Google Maps, quickly shows which coordinates for Waterloo are needed.

== Advanced usage ==
This is for advanced users who know how to use [http://en.wikipedia.org/wiki/Regular_expression regular expressions].

The parsing options have entry fields in which you can give your own parsing rule. Parsing uses regular expressions. You can use this to parse your title, and to parse a lat/lon file in your own way. For reference, here is an overview of the parsing codes used for the predefined parses.

=== Parse title details ===
In 'Parsing and Conversion of existing title or position', 'Parse title' and 'Change title into' provide some predefined options for extracting location values from a Place Title. Otherwise regex parsing is needed.

Write your regex in 'Parse title:'. Click on 'Find', which shows the proposed changes. Then click 'Apply'.

====Example 1====
For France, some [http://www.geneawiki.com/index.php/Informatique_-_saisie_des_lieux practical rules] can be useful for entering places. A place entry needs:
* the city name, optionally followed by the [http://en.wikipedia.org/wiki/INSEE_code#Geographical_codes INSEE code]. This code is unique and identifies a commune (municipality) with certainty, along with its county, district and township. It still identifies the commune reliably even if the commune has changed its name. This code is used in archives. ''Using the postal code is not advisable ...''
* a subdivision: identifies a parish or a named locality within a municipality
* the state (optional), ''or the county, but that is already given by the [http://en.wikipedia.org/wiki/INSEE_code#Geographical_codes INSEE code]''
* the country (optional). Ideally the country should always be entered, although this is understandably tedious. ''You might leave out the country if the genealogy is mostly within one country, and enter it only for events outside that main country.''

e.g. Avignon,84000,Vaucluse,Provence-Cote-d'Azur,FRANCE,

where some fields may be missing:

e.g. Woerth,,,Alsace,FRANCE,

The regex:

<code>\s*(?P<city>[^,]+)[,]\s*(?P<zip>\d*)[,](?P<county>[^,]*)[,](?P<state>[^,]*)[,](?P<country>[^,]*)[,]*$</code>

treats the comma character as an end-of-string delimiter and will distribute Avignon to City, 84000 to ZIP, Vaucluse to County, Provence-Cote-d'Azur to State and FRANCE to Country.

In the case of missing fields, as long as the first is not empty, it will distribute Woerth to City, Alsace to State and FRANCE to Country.

It allows initial whitespace and an optional comma after the Country.
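
To try the regex outside Gramps, a minimal Python sketch follows. The helper <code>parse_title</code> is hypothetical, and matching from the start of the title with <code>re.match</code> is an assumption about how the tool applies the expression.

```python
import re

# The Example 1 regex, split over two lines for readability.
FRENCH_TITLE = re.compile(
    r"\s*(?P<city>[^,]+)[,]\s*(?P<zip>\d*)[,](?P<county>[^,]*)[,]"
    r"(?P<state>[^,]*)[,](?P<country>[^,]*)[,]*$"
)


def parse_title(title):
    """Return the non-empty named groups, or None if the title does not match."""
    match = FRENCH_TITLE.match(title)
    if match is None:
        return None
    return {name: value for name, value in match.groupdict().items() if value}


full = parse_title("Avignon,84000,Vaucluse,Provence-Cote-d'Azur,FRANCE,")
sparse = parse_title("Woerth,,,Alsace,FRANCE,")
print(full)
print(sparse)
```

Note that only the ZIP group is preceded by <code>\s*</code>, so a title written with a space after every comma would keep those leading spaces in the county, state and country values.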

====Example 2====
In many 16th and 17th century English IGI records the situation is more complicated.

The Place Title takes the form of 3 strings (Town, County, Country) or 4 strings (Parish, District, County, Country), for example:

(a) Chester le Street, Durham, England - 3-string

(b) Of Middleton-in-Teesdale, Durham, England - 3-string

(c) Bishoply,Stanhope, Durham, England - 4-string

(d) Of St. Margaret's, Stanhope, Durham, England - 4-string

For 3-string records the following regex, treating the comma character as an end-of-string delimiter, will distribute the 3 strings correctly to the City, County and Country locations, leaving 4-string records untouched.

Regex A:

<code>\s*(Of[,]*\s*)*(?P<city>[^,]+?)[,]\s*(?P<county>[^,]+?)[,]\s*((?P<country>[^,]+?)){1,1}$</code>

For 4-string records the following regex will distribute the 4 strings correctly to the Parish, City, County and Country locations, leaving 3-string records untouched.

Regex B:

<code>\s*(Of[,]*\s*)*(?P<parish>[^,]+?)[,]\s*(?P<city>[^,]+?)[,]\s*(?P<county>[^,]+?)[,]\s*((?P<country>[^,]+?)){1,1}$</code>

"Bogus" 4-string records: a not uncommon error in these old records is for 'Of' to be followed by a comma, e.g.

(e) Of, Houghton-le-Spring, Durham, England

Regex A will parse 3-string records correctly and Regex B 4-string records. However, Regex B will treat a 3-string record starting with 'Of,' as if it were a 4-string record. This would give the 3-string record a non-existent Parish called "Of"!

To avoid this, when using Regex B click on 'Find' to display the records proposed for change and delete all the 3-string 'Of,' records before clicking 'Apply'.
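
The behaviour of both expressions on the records above can be checked with a small Python sketch. The helper <code>parse</code> is hypothetical, and anchoring the match at the start of the title with <code>re.match</code> is an assumption; with an unanchored search, Regex A would also match the tail of a 4-string record.

```python
import re

REGEX_A = re.compile(
    r"\s*(Of[,]*\s*)*(?P<city>[^,]+?)[,]\s*(?P<county>[^,]+?)[,]\s*"
    r"((?P<country>[^,]+?)){1,1}$"
)
REGEX_B = re.compile(
    r"\s*(Of[,]*\s*)*(?P<parish>[^,]+?)[,]\s*(?P<city>[^,]+?)[,]\s*"
    r"(?P<county>[^,]+?)[,]\s*((?P<country>[^,]+?)){1,1}$"
)


def parse(regex, title):
    """Return the named groups as a dict, or None when there is no match."""
    match = regex.match(title)
    return match.groupdict() if match else None


titles = [
    "Chester le Street, Durham, England",            # (a) 3-string
    "Of Middleton-in-Teesdale, Durham, England",     # (b) 3-string
    "Bishoply,Stanhope, Durham, England",            # (c) 4-string
    "Of St. Margaret's, Stanhope, Durham, England",  # (d) 4-string
    "Of, Houghton-le-Spring, Durham, England",       # (e) bogus 4-string
]

for title in titles:
    print(title)
    print("  A:", parse(REGEX_A, title))
    print("  B:", parse(REGEX_B, title))
```

For record (e), Regex A yields the city Houghton-le-Spring, while Regex B falls back to reading "Of" as the parish, which is exactly the case to delete from the results before applying.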

{{man note|Note:|Take care when pasting a regex (for example, after testing it in a regex editor) into the Parse title field. Pasting can add redundant spaces before or after the regex, which will prevent it from working properly in the Place Completion Tool.}}

The predefined regular expressions are as follows, where for brevity we use some variables defined below.

{{man note|Regex Help|For those new to Python and Regex please review the HOWTO here:<br /> http://docs.python.org/dev/howto/regex.html}}

#"City [,|.] State" is parsed by : <code>r'\s*(?P<'+city_translated +r'>.+?)\s*[.,]\s*(?P<'+state_translated +r'>.+?)\s*$'</code>
#"City [,|.] Country" is parsed by : <code>r'\s*(?P<'+city_translated +r'>.+?)\s*[.,]\s*(?P<'+country_translated +r'>.+?)\s*$'</code>
#"City (Country)" is parsed by : <code>r'\s*(?P<'+city_translated +r'>.*?)\s*\(\s*(?P<'+country_translated +r'>[^\)]+)\s*\)\s*$'</code>
#"City" is parsed by : <code>r'\s*(?P<'+city_translated +r'>.*?)\s*$'</code>

Here the variables used are:
<pre>
lat_translated = _('lat')
lon_translated = _('lon')
city_translated = _('city')
county_translated = _('county')
state_translated = _('state')
country_translated = _('country')
</pre>

You can use one of these variables as a group name; the tool will recognise it and use the matched value for the corresponding place attribute.
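
As a rough illustration of how these predefined expressions behave, the sketch below builds two of them in Python. It assumes an English locale, so that the translated group names are simply <code>city</code>, <code>state</code> and <code>country</code>; the helper <code>parse</code> is hypothetical.

```python
import re

# Assumption: English locale, so _('city') and friends return the plain words.
city_translated = "city"
state_translated = "state"
country_translated = "country"

CITY_STATE = (r'\s*(?P<' + city_translated + r'>.+?)\s*[.,]\s*(?P<'
              + state_translated + r'>.+?)\s*$')
CITY_COUNTRY_PARENS = (r'\s*(?P<' + city_translated + r'>.*?)\s*\(\s*(?P<'
                       + country_translated + r'>[^\)]+)\s*\)\s*$')


def parse(pattern, title):
    """Return the named groups as a dict, or None when there is no match."""
    match = re.match(pattern, title)
    return match.groupdict() if match else None


print(parse(CITY_STATE, "Albany, NY"))
print(parse(CITY_COUNTRY_PARENS, "Rome (Italy)"))
print(parse(CITY_STATE, "St. George, UT"))
```

The last call shows a pitfall of the "City [,|.] State" pattern: because a period also counts as a separator, ''St. George, UT'' is split at the first period, giving the city ''St'' and the state ''George, UT''. Titles with abbreviations are worth checking in the results before applying.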

=== Lat/Lon lookup parsing ===
In the regex for a lat/lon lookup, you need to indicate which parts must be replaced with existing place attributes for the search, and which regex groups must be extracted.

#"GeoNames country file, city search" is parsed with: <code>r'\t'+CITY_transl +r'\t[^\t]*\t[^\t]*\t' +latgr +  r'[\d+-][^\t]*)\t' +  longr + r'[\d+-][^\t]*)\tP'</code>
#"GeoNames country file, city localized variants search" is parsed with: <code>r'[\t,]'+CITY_transl+r'[,\t][^\t\d]*\t?' +latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)\tP'</code>
#"GeoNames country file, title begin, general search" is parsed with: <code>r'\t'+TITLEBEGIN_transl +r'\t[^\t]*\t[^\t]*\t' +latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)\t[PSTV]'</code>
#"GeoNames USA state file, city search" is parsed with: <code>r'\t'+CITY_transl+r'\tPopulated Place\t[^\t]*\t[^\t]*\t' + countygr + r'[^\t]*)' + r'\t[^\t]*\t[^\t]*\t[^\t]*\t' +latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)'</code>
#"GNS Geonet country file, city search" is parsed with: <code>r'\t'+latgr+r'[\d+-][^\t]*)\t'+longr+r'[\d+-][^\t]*)' + r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\tP\t[^\t]*\t[^\t]*' + r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*' r'\t[^\t]*\t[^\t]*\t[^\t]*' + r'\t'+CITY_transl+r'\t[^\t]*\t[^\t\n]+$'</code>
#"GNS Geonet country file, title begin search" is parsed with: <code>r'\t'+latgr+r'[\d+-][^\t]*)\t'+longr+r'[\d+-][^\t]*)'+ r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[PLSTV]\t[^\t]*\t[^\t]*'+ r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*' + r'\t[^\t]*\t[^\t]*\t[^\t]*' + r'\t'+TITLEBEGIN_transl+r'\t[^\t]*\t[^\t\n]+$'</code>
#<i>Read of mediawiki CSV dump. This reads the files on [http://tools.wikimedia.de/~kolossos/wp-world/pub_CSV_test3.csv.gz] (link gone) (for more information, see http://meta.wikimedia.org/wiki/WikiProjects_Geographical_coordinates) (''Contribution by nomeata'')</i>

For extraction of data you can use the same group names as in title parsing, so for example latgr in the above should read: <code>r'(?P<'+lat_translated +r'>'</code>.

The syntax for the values that are used for searching in the file, for example CITY_transl, is given by: _('CITY'). You can use the following substitution values:
_('CITY'), _('TITLE'), _('TITLEBEGIN'), _('STATE'), _('PARISH').

The tool will read in the given regex, replace the substitution strings with the values in the place object, do the search, and extract the given regex groups from the result.
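
The substitute-then-search procedure can be sketched in Python as follows. This is only an approximation of what the tool does: the <code>lookup</code> helper, the use of <code>re.escape</code> on the city name, the English substitution string <code>CITY</code> and the sample data line are all assumptions made for illustration.

```python
import re

# Template in the style of the "GeoNames country file, city search" pattern.
# CITY is the substitution string that is replaced by the place's city.
TEMPLATE = (r'\tCITY\t[^\t]*\t[^\t]*\t'
            r'(?P<lat>[\d+-][^\t]*)\t(?P<lon>[\d+-][^\t]*)\tP')


def lookup(city, data):
    """Substitute the city into the template and extract lat/lon from data."""
    pattern = TEMPLATE.replace('CITY', re.escape(city))
    match = re.search(pattern, data)
    return (match.group('lat'), match.group('lon')) if match else None


# Illustrative tab-separated line in the GeoNames column order:
# id, name, ASCII name, alternate names, latitude, longitude, feature class, ...
SAMPLE = "0000001\tRome\tRome\tRoma,Rom\t41.89\t12.51\tP\tPPLC\tIT\n"

print(lookup("Rome", SAMPLE))
print(lookup("Roma", SAMPLE))
```

The second call returns <code>None</code>, because ''Roma'' only appears in the alternate names column; this is the case that the localized variants search is meant to cover.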

Resource: [https://en.wikipedia.org/wiki/GEOnet_Names_Server GEOnet Names Server]

== Troubleshooting ==
=== Non UTF-8 latitude/longitude file ===
The place completion tool expects the input files for location lookup to be in Unicode (UTF-8). When this is not the case, you will get the error:

<pre>File "/home/benny/programms/gramps/gramps2/src/plugins/PlaceCompletion.py", line 851, in load_latlon_file
    self.latlonfile_datastr = infile.read()
  File "/usr/lib/python2.4/codecs.py", line 481, in read
    return self.reader.read(size)
  File "/usr/lib/python2.4/codecs.py", line 293, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1610092-1610094: invalid data</pre>

Note that the Place Completion tool catches these errors and shows you an information box. After this, the tool will attempt to read the file as UTF-8 while ignoring errors. This might give good results, but will of course fail to produce results on files in a non-Unicode encoding.

In the above example the problem is confined to a few bytes, so you can correct it manually: open the file with a binary editor such as <code>KHexEdit Binary Editor</code>, go to the specified position (offset 1610092), and replace the offending bytes with spaces.

If the file is completely non-Unicode, you will have to convert it to Unicode with a tool before using it in the place completion tool.
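
One possible way to do such a conversion is a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the source encoding (here ISO-8859-1, <code>latin-1</code>) is an assumption that you must replace with the real encoding of your file.

```python
from pathlib import Path


def convert_to_utf8(source, target, source_encoding="latin-1"):
    """Re-encode a text file to UTF-8.

    source_encoding is an assumption: set it to the encoding the data
    file was really written in.
    """
    text = Path(source).read_text(encoding=source_encoding)
    Path(target).write_text(text, encoding="utf-8")
```

For example, <code>convert_to_utf8("IT_latin1.txt", "IT.txt")</code> would write a UTF-8 copy that can then be selected in the tool; both file names are placeholders. The sketch reads the whole file into memory, so very large files may need to be processed in pieces.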

=== Memory Error ===
The tool might fail with the error:

<nowiki>self.latlonfile_datastr = infile.read()
  File "/usr/lib/python2.4/codecs.py", line 481, in read
    return self.reader.read(size)
  File "/usr/lib/python2.4/codecs.py", line 293, in read
    newchars, decodedbytes = self.decode(data, self.errors)
MemoryError</nowiki>

The tool has to load the data file for latitude/longitude searching into memory. For large files like USA.txt, this might be impossible if you have limited memory. You can try closing as many of the other programs running alongside Gramps as possible, and then try the tool again.
 +
 
 +
==See also==
 +
 
 +
*[[GeoCodes]]
 +
 
 +
=== Design specification ===
 +
*See [[Place completion tool specification]]
 +
 
 +
Place/Location is a newer concept in Gramps. Many older databases only have a Place title field which is a descriptive text containing city, state, country. To distribute these values into the correct attribute fields, see Parse Title Details below
  
 
== Download ==
 
== Download ==
You can download the developers version. You find it at [http://cage.ugent.be/~bm/varia/placetool0_9.tar.gz placetool0_9.tar.gz].  Put the .glade and .py file both in the plugins directory of GRAMPS 2.2.x.
 
  
'''DO NOT BETA TEST WITH YOUR RESEARCH DATA. EXPORT DATA FIRST TO HAVE A BACKUP''
+
If you use Gramps 4.2 , then use the [[4.2_Addons|automatic installation]].
 +
 
 +
If you use Gramps 4.1 , then use the [[4.1_Addons|automatic installation]].
 +
 
 +
If you use Gramps 3.4 , then use the [[3.4_Addons|automatic installation]].
 +
 
 +
If you use Gramps 3.3 , then use the [[3.3_Addons|automatic installation]].
 +
 
 +
If you use Gramps 3.2 , then use the [[3.2_Addons|automatic installation]].
 +
 
 +
=== Manual install ===
 +
 
 +
For the following older versions of Gramps you will need to do a Manual install of the files.
 +
 
 +
Extract the three files that are in the download. Put the .glade and .py files in the plugins directory. For linux:
 +
 
 +
* local install: place in <code>~/.gramps/plugins</code>
 +
 
 +
If you still use Gramps 3.1.x, then you will need version 1.2 of the Place Completion Tool. You can find it at [http://cage.ugent.be/~bm/varia/placecompletion_1_2.tar.gz placecompletion_1_2.tar.gz].
 +
 
 +
If you still use Gramps 3.0.x, then you will need version 1.1 of the Place Completion Tool. You find it at [http://cage.ugent.be/~bm/varia/placecompletion_1_1.tar.gz placecompletion_1_1.tar.gz].
 +
 
 +
If you still use Gramps 2.2.5+, then you will need version 1.0 of the Place Completion Tool. You find it at [http://cage.ugent.be/~bm/varia/placecompletion_1_0.tar.gz placecompletion_1_0.tar.gz].
 +
 
 +
[[Category:Plugins]]
 +
[[Category:Developers/General]]
 +
[[Category:Tools]]

Revision as of 07:52, 5 January 2017


Please use carefully on data that is backed up, and help make it better by reporting any comments or problems to the author, or issues to the bug tracker
Unless otherwise stated on this page, you can download this addon by following these instructions.
Please note that some Addons have prerequisites that need to be installed before they can be used.
This Addon/Plugin system is controlled by the Plugin Manager.


Place completion tool - Example Results

A tool to bring the places in your Gramps database in accordance with the Gramps requirements: batch add country, county; look-up latitude-longitude; set description (title); ...

This tool helps you fill in the place attributes like county, country, ..., by allowing you to select the places you work on, and do changes on all these places with one button click.

The general aims are:

  • Place/Location is a newer concept in Gramps. Many older databases only have a Place title field which is a descriptive text containing city, state, country. This should be parsed to insert the values in the correct attribute fields.
  • Latitude and longitude are important to show data on a map. However, looking up this data on the internet by hand is slow and time consuming. The tool lets you search freely available data files from the net instead.
  • Setting of an attribute of a set of places in one go.
  • Conversion of latitude and longitude to a fixed data format. On import one might obtain latitude and longitude in several different formats. A conversion tool to store them all in the same format is useful.
  • Construction of a uniform title/description field, from the data in the place object.

Follow the installation details in the Download section for your version of Gramps.

Usage Instructions

The place completion tool provides a lot of functionality. These usage instructions should help you to understand how it works.

Download resources


This article's content is incomplete or a placeholder stub.
Please update or expand this section.


The place completion tool can look up latitude/longitude for you, add county information (USA), and more. For some of this functionality you must download data files for the countries you are interested in. Right now you have three options:

  1. Download GeoNames country files. You can do this here freely. GeoNames files parse fastest, so this is the recommended format.
  2. Download GeoNames USA state files. You can do this here freely. This is advised for USA searches, as the USA country file contains many duplicate names, which can be avoided by searching state by state. The state files also contain county information.
  3. Download GNS Geonet country files (not available for the USA). You can do this here freely with FTP.

Watch out, some of these downloads are VERY large, especially USA data. Only download what you need! If the download is a compressed zip file, you will need to extract the data file before you can use it.

Note:

The GeoNames data for popular places is in English. Municipalities in Italy will be found, for example, but Roma will not, as it is listed as Rome in English. To find such places you need to search in the localised variants of the name (see below).

Warning:

DO NOT BETA TEST WITH YOUR RESEARCH DATA. EXPORT DATA FIRST TO HAVE A BACKUP.

Starting the tool

You will find the plugin under Tools > Utilities > PlaceCompletion...

The dialog explained

The Place Completion Tool

The Dialog consists of 4 parts:

Part 1. Selection of places

First you need to choose with which places you want to work. You can use several methods to define your places:

  1. Use a place filter. You can use two preset filters: All places, which returns all places, and No Latitude/Longitude given, which returns all places for which the latitude or the longitude is not set. You can also create a custom place filter in the place view, test it with the filter sidebar, and then use it in this tool. All custom filters you have made will be available.
  2. To avoid having to make a filter for every city, state, and so on in your data, you can set the country, state, county, city or parish of the places you want to search on. This works just like the filter sidebar in the places view.
  3. Use a latitude/longitude rectangle. Eg, suppose you have the latitude and longitude of all places in the UK, and now want to set the state attribute to Wales for all places in Wales. You can look on a map and note down the centre of Wales in latitude and longitude, as well as roughly the width and height of a rectangle around it. This will allow you to obtain all places in Wales (and some in England), so that you can set the state information much faster.
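
The rectangle selection in option 3 can be pictured with the following minimal sketch. It is not the tool's own code: the function name, the sample places and the choice to express width and height in degrees are all assumptions made for illustration.

```python
def in_rectangle(lat, lon, centre_lat, centre_lon, width, height):
    """Return True if (lat, lon) lies inside the rectangle.

    Assumption for this sketch: width is given in degrees of longitude
    and height in degrees of latitude.
    """
    return (abs(lat - centre_lat) <= height / 2
            and abs(lon - centre_lon) <= width / 2)


# Hypothetical places as (name, latitude, longitude) in decimal degrees.
places = [
    ("Place A", 52.4, -3.6),
    ("Place B", 51.5, -0.1),
    ("Place C", 52.9, -4.1),
]

# A rectangle centred on (52.5, -3.5), 3 degrees wide and 2 degrees high.
selected = [name for name, lat, lon in places
            if in_rectangle(lat, lon, 52.5, -3.5, width=3.0, height=2.0)]
print(selected)
```

A rectangle is a coarse selection, which is why the text above warns that some places just outside the target region will also be returned.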

Part 2. Completion of places

  1. The first possibility is to look up the latitude and longitude of your places in a datafile. For this you must have downloaded the necessary resources, see the section above. You select the file you want to search with a file dialog, and set how this data must be parsed. The following parsing options are available:
    1. GeoNames country file, city search: use the city attribute to look for lat/lon in a GeoNames country file. This is the fastest search.
    2. GeoNames country file, city localized variants search: use the city attribute to look for lat/lon in a GeoNames country file using the localised (non-English) known names in the GeoNames file. Eg, Roma will be found with this option (as Roma is the Italian local variant of the English name Rome).
    3. GeoNames country file, title begin, general search: use the start of the title field to search in a GeoNames file. The start means everything before the first comma. This allows you to find landmarks, squares, and so on. Eg, if the title of your place is Piazza Navona, Rome, this search will find the latitude and longitude of this famous square in Rome.
    4. GeoNames USA state file, city search: searching the USA country file is of little use, as it takes a long time and every name exists several times. Hence it is worthwhile to search state by state. If a USA state file is selected for the search, you must select this option. The city attribute is used for the search.
    5. GNS Geonet country file, city search: use the city attribute to search in a GNS file (slower than GeoNames search!).
    6. GNS Geonet country file, title begin search: use the start of the title of a place to search in a GNS file. The start means everything before the first comma.
  2. A second option is to parse some existing data in your places.
    1. You can parse the title attribute to extract information from it. Eg a title like Albany, NY can be used to set the city attribute to Albany and the state attribute to NY.
    2. You can set the title of all the selected places to a uniform format. This is useful if, due to imports, you have different styles for the title field, which can be annoying in reports. At the moment there are two options:
      1. Set title field to City[, State]: the title of your places will contain the city and, if the state field is present, the state appended after a comma.
      2. Set title field to Titlestart[, City][, State]: the present start of your title will be kept. If this start is not the city, then the city will be appended. If the state is present, the state will be appended as well. An example: suppose your title is Piazza Navona, Italy, the city is Rome and the state is Lazio. Using this option would change the title attribute into Piazza Navona, Rome, Lazio.
    3. Convert latitude and longitude to a uniform format. Again due to import or copy/paste, you might have latitude and longitude entered in different formats. This is annoying in reports. This option allows you to set the lat/lon of all selected places to one form. The options are:
      1. All in degree notation: use the classical degree notation with degrees, minutes and seconds.
      2. All in decimal notation: use the decimal system to denote lat/lon.
      3. Correct -50° in 50°S: a common error is to use - in the classical degree notation, which is wrong, and which Gramps will not be able to interpret. With this option this error is looked for and corrected.
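
Two of the conversions in the list above, building a title as Titlestart[, City][, State] and rewriting latitude/longitude, can be sketched as follows. This is an illustration only, not the tool's implementation: the function names are hypothetical, and the exact degree string format that Gramps stores is an assumption here.

```python
def build_title(title, city, state):
    """Build 'Titlestart[, City][, State]' from the current place data."""
    title_start = title.split(",")[0].strip()
    parts = [title_start]
    if city and city != title_start:
        parts.append(city)
    if state:
        parts.append(state)
    return ", ".join(parts)


def decimal_to_dms(value, is_latitude):
    """Convert decimal degrees to degree/minute/second notation.

    The hemisphere letter replaces the sign (assumed output format).
    """
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{}°{:02d}'{:02d}\"{}".format(degrees, minutes, seconds, hemisphere)


def fix_negative_degrees(text, is_latitude):
    """Turn a leading minus in degree notation into a hemisphere letter."""
    if text.startswith("-"):
        return text[1:] + ("S" if is_latitude else "W")
    return text


print(build_title("Piazza Navona, Italy", "Rome", "Lazio"))
print(decimal_to_dms(-50.5, True))
print(fix_negative_degrees("-50°30'00\"", True))
```

The first call reproduces the Piazza Navona example from the list; the other two show the same position written once from a decimal value and once from a wrongly signed degree string.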

Part 3. Overview of the results

After having entered all data in Parts 1 and 2, click Find to have Gramps search for all changes that will occur. This part of the dialog shows those changes.


All selected places are shown. Each proposed change is listed as a separate subentry under its place.

If the change will overwrite an existing entry, the subentry is shown in orange.

Warning

TO AVOID PROBLEMS, GO OVER ALL CHANGES QUICKLY, AND CHECK ALL ENTRIES IN ORANGE!

The following actions are possible in the result screen:

  1. Press Delete to delete an entry, making sure that this change will not occur. You can delete the place entry to delete all its changes, or select one subentry to delete only that specific change.
  2. Double-click on an entry to open the place dialog. If you double-click on the place entry, all changes will be pre-entered. If you double-click on a subentry, only this specific change will be pre-entered in the place dialog.
  3. Press Tab to open Google Maps in a browser window. Pressing Tab on a subentry showing a new lat/lon entry will open Google Maps on this new lat/lon position. Pressing Tab on the top place entry will open Google Maps with the old lat/lon position or, if that is not known, use the title/city field for the search.

Part 4. Actions

After you have checked the changes in Part 3, you can apply them by clicking the Apply button.

Selecting Help will bring you to this page, clicking Close will close the Place completion tool window, and clicking Google Maps when an entry is selected in the results field has the same effect as pressing Tab on that entry (see above).

Example

Open the example file from the examples where latitude and longitude are empty: example.gramps.

We will now show how the places in this file can be completed. The best thing to do is to create a new Family Tree, give it a name, and import the example.gramps file. This file has 852 places, which would mean a lot of manual edits if you do not use this tool!

Now, open the place view. You will see all places are of the form:

Aberdeen, WA

This value is the Place Name attribute (the title or description of the place).

Step 1: City and State data

Our first step will be to split this field into a City value (here Aberdeen), and a State value (here WA).

We open the Place completion tool:

Parse the Place Name Field

Here we have selected All Places, and we parse the title as City [,State]. Click on Find, quickly scan the data to check that all looks ok, and then click on Apply. You are notified that 443 place records were modified. This is one less than the number of places. Indeed, one place has a different type of title: Puerto Rico has no state information.


Step 2: Look-up latitude and longitude

We have downloaded the GeoNames datafiles for the USA states, and will now use that to complete the latitude and longitude of the data. At the same time, this will fill up the county field.

Look up lat/lon for Alaska

In the screenshot, you see we have selected All Places with State=AK. In the second part of the window we give that we want to search in the AK_DECI.txt file downloaded from GeoNames, using the parsing method: GeoNames USA state file, city search.
Note that if you want to change AK into Alaska, this would be possible. Just set state=Alaska in the set attributes section of the window.

Do this now for all the states. Always check for duplicates. Eg, for state AL, going over the changes, we encounter:


Double in lat/lon, city Enterprise exists in two counties


We see that the first time 'Enterprise' is found, it is in county Coffee at lat/lon 31.31/-85.85. The second hit is for county Chilton at lat/lon 32.73/-86.62.

You can now use the Google Maps button (or press the TAB key) while a lat/lon subentry is selected to see where the city is in each case. This may show, for example, that the second hit is a hamlet rather than a real city, while the first is a real city. In that case, select the second lat/lon entry and delete it by pressing the DEL key. Do the same for the second county entry.

In case Google Maps did not allow you to determine which is the correct city, you can double-click on the city to open the Place dialog. (Warning: this will pre-enter the data from the Place Completion tool, so click Cancel if you want to exit without applying these changes.) In this dialog the References tab lets you navigate to all events linked to this place, which gives you extra information to decide which of the two found places is correct.

Step 3: Problem entries

While updating all places in step 2, you will have noticed some errors in the state information: Some places have a dubious state: eg OH-AL

You can obtain these states by choosing All Places and setting the state search box to -. Clicking Find will give you all these problem places. You can use Google Maps or the place dialog to sort them out. You can also use the USA country GeoNames file to search for these places in the entire USA.

Note:

You will need sufficient memory for this, or you will obtain a MemoryError (see below)!

Step 4: Lat/Lon not found

After the above, still some 45 places have no latitude/longitude found. You can now select these places by setting the Place filter to 'No Latitude/Longitude', which will find you all places with no coordinates.

Many of these can be corrected quickly. Some are abbreviations: eg the city field contains St.George, which should be Saint George. Others are double names: eg Waterloo-Cedar Falls, IA means Waterloo near Cedar Falls. Changing the city to Waterloo, redoing the search, and checking with Google Maps lets you quickly find which coordinates for Waterloo are needed.

Advanced usage

This is for advanced users who know how to use regular expressions.

The parsing options have entry fields that let you supply your own parsing rules, written as regular expressions. You can use this to parse your title, and to parse a lat/lon file in your own way. For reference, here is an overview of the parsing codes used for the predefined parses:

Parse title details

In 'Parsing and Conversion of existing title or position', 'Parse title' and 'Change title into' provide some pre-defined options for extracting location values from a Place Title. Otherwise regex parsing is needed.

Write your regex in 'Parse title:'. Click on 'Find', which shows the proposed changes. Then click 'Apply'.

Example 1

For France, some practical rules are useful when entering places. A place needs:

  • the city name plus the INSEE code (optional). This code is unique and identifies a commune with certainty (along with its county, district, township and municipality), even if the commune has changed its name. This code is used in archives. Using the postal code is not advisable.
  • a subdivision: identifies a parish or a named locality within a municipality
  • the state (optional) or county, although this is already implied by the INSEE code
  • the country (optional). Ideally the country should always be entered, but this is understandably tedious. One approach is to omit the country when the genealogy is mostly in one country, and to enter it only for events outside that main country.
e.g. Avignon,84000,Vaucluse,Provence-Cote-d'Azur,FRANCE,

where some fields may be missing:

e.g. Woerth,,,Alsace,FRANCE,

the regex:

\s*(?P<city>[^,]+)[,]\s*(?P<zip>\d*)[,](?P<county>[^,]*)[,](?P<state>[^,]*)[,](?P<country>[^,]*)[,]*$

Treating the comma character as a field delimiter, this regex distributes Avignon to City, 84000 to ZIP, Vaucluse to County, Provence-Cote-d'Azur to State and FRANCE to Country.

In the case of missing fields, as long as the first field is not empty, it distributes Woerth to City, Alsace to State and FRANCE to Country.

It allows initial whitespace and an optional comma after the country.
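
The behaviour described for Example 1 can be checked outside Gramps with a few lines of Python. This is a sketch under the assumption that the pattern is applied from the start of the title (as `re.match` does); the helper name `parse_title` is hypothetical and not part of the tool.

```python
import re

# The Example 1 regex, copied unchanged and split over two lines.
TITLE_RE = re.compile(
    r"\s*(?P<city>[^,]+)[,]\s*(?P<zip>\d*)[,](?P<county>[^,]*)[,]"
    r"(?P<state>[^,]*)[,](?P<country>[^,]*)[,]*$"
)


def parse_title(title):
    """Return the named groups as a dict, or None if the title does not fit."""
    match = TITLE_RE.match(title)
    return match.groupdict() if match else None


print(parse_title("Avignon,84000,Vaucluse,Provence-Cote-d'Azur,FRANCE,"))
print(parse_title("Woerth,,,Alsace,FRANCE,"))
```

Missing fields come back as empty strings, so only the non-empty groups carry a value for the corresponding place attribute.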

Example 2

In many 16th and 17th century English IGI records the situation is more complicated.

The Place Title takes the form of 3 strings (Town, County, Country) or 4 strings (Parish, District, County, Country) for example:

(a) Chester le Street, Durham, England - 3-string

(b) Of Middleton-in-Teesdale, Durham, England - 3-string

(c) Bishoply,Stanhope, Durham, England - 4-string

(d) Of St. Margaret's, Stanhope, Durham, England - 4-string


For the 3-string record the following regex, treating the comma character as an end-of-string delimiter, will distribute the 3 strings correctly to City, County, Country locations, leaving 4-string records untouched.

Regex A:

\s*(Of[,]*\s*)*(?P<city>[^,]+?)[,]\s*(?P<county>[^,]+?)[,]\s*((?P<country>[^,]+?)){1,1}$


For 4-string records the following regex will distribute the 4 strings correctly to Parish, City, County, Country locations, leaving 3-string records untouched.

Regex B:

\s*(Of[,]*\s*)*(?P<parish>[^,]+?)[,]\s*(?P<city>[^,]+?)[,]\s*(?P<county>[^,]+?)[,]\s*((?P<country>[^,]+?)){1,1}$


"Bogus" 4-string records: a not uncommon error in these old records is for 'Of' to be followed by a comma, e.g.

(e) Of, Houghton-le-Spring, Durham, England

Regex A will parse 3-string records correctly and Regex B 4-string records. However Regex B will attempt to treat a 3-string record with 'Of,' as if it were a 4-string record. This would give the 3-string record a non-existent Parish called "Of" !

To avoid this, when using Regex B click on 'Find' to display the records proposed for change and delete all the 3-string 'Of,' records before clicking 'Apply'.
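
The interplay of Regex A and Regex B, including the bogus 'Of,' case, can be reproduced with this short sketch. It assumes the pattern is matched from the start of the title (`re.match`); whether the tool applies it exactly this way is not stated on this page.

```python
import re

# Regex A and Regex B, copied unchanged from the text above.
REGEX_A = re.compile(
    r"\s*(Of[,]*\s*)*(?P<city>[^,]+?)[,]\s*(?P<county>[^,]+?)[,]\s*"
    r"((?P<country>[^,]+?)){1,1}$"
)
REGEX_B = re.compile(
    r"\s*(Of[,]*\s*)*(?P<parish>[^,]+?)[,]\s*(?P<city>[^,]+?)[,]\s*"
    r"(?P<county>[^,]+?)[,]\s*((?P<country>[^,]+?)){1,1}$"
)


def parse(regex, title):
    """Return the named groups as a dict, or None when the title is left untouched."""
    match = regex.match(title)
    return match.groupdict() if match else None


three = "Of Middleton-in-Teesdale, Durham, England"
four = "Of St. Margaret's, Stanhope, Durham, England"
bogus = "Of, Houghton-le-Spring, Durham, England"

print(parse(REGEX_A, three))   # city, county, country
print(parse(REGEX_A, four))    # None: 4-string record untouched
print(parse(REGEX_B, four))    # parish, city, county, country
print(parse(REGEX_B, bogus))   # parish becomes "Of"
```

The last call shows why the 'Of,' records have to be deleted from the Find results before applying Regex B: the only way the pattern can find four fields is to treat "Of" as the parish.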

Note: take care when pasting a regex (for example, after testing it in a regex editor) into the Parse title field. Pasting can add redundant spaces before or after the regex, which will prevent it from working properly in the Place Completion Tool.

The predefined regular expressions are as follows; for brevity they use some variables defined below.

Regex Help

For those new to Python and Regex please review the HOWTO here:
http://docs.python.org/dev/howto/regex.html

  1. "City [,|.] State" is parsed by : r'\s*(?P<'+city_translated +r'>.+?)\s*[.,]\s*(?P<'+state_translated +r'>.+?)\s*$'
  2. "City [,|.] Country" is parsed by : r'\s*(?P<'+city_translated +r'>.+?)\s*[.,]\s*(?P<'+country_translated +r'>.+?)\s*$'
  3. "City (Country)" is parsed by : r'\s*(?P<'+city_translated +r'>.*?)\s*\(\s*(?P<'+country_translated +r'>[^\)]+)\s*\)\s*$'
  4. "City" is parsed by : r'\s*(?P<'+city_translated +r'>.*?)\s*$'

Here the variables used are:

lat_translated = _('lat')
lon_translated = _('lon')
city_translated = _('city')
county_translated = _('county')
state_translated = _('state')
country_translated = _('country')

You can use one of these variables as a group, and the tool will recognise them, and use as values for the corresponding place attributes.
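
To see how the translated variable names become regex group names, the following sketch assembles two of the predefined patterns and applies them. It assumes an untranslated (English) session, so that `_('city')` is simply `'city'`; the helper `parse` is hypothetical and only illustrates the idea.

```python
import re

# Assumption: English locale, so the translated names equal the originals.
city_translated = "city"
state_translated = "state"
country_translated = "country"

# "City [,|.] State", assembled exactly as in the list above.
CITY_STATE = (r'\s*(?P<' + city_translated + r'>.+?)\s*[.,]\s*(?P<'
              + state_translated + r'>.+?)\s*$')

# "City (Country)", assembled exactly as in the list above.
CITY_COUNTRY = (r'\s*(?P<' + city_translated + r'>.*?)\s*\(\s*(?P<'
                + country_translated + r'>[^\)]+)\s*\)\s*$')


def parse(pattern, title):
    """Return {attribute name: value} for the groups found, or {} if no match."""
    match = re.match(pattern, title)
    return match.groupdict() if match else {}


print(parse(CITY_STATE, "Aberdeen, WA"))
print(parse(CITY_COUNTRY, "Paris (France)"))
print(parse(CITY_STATE, "Puerto Rico"))
```

The last call returns an empty dict because the title contains neither a comma nor a period, which is the situation described for Puerto Rico in Step 1 of the example.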

Lat/Lon lookup parsing

For the regex of lat/lon lookup, you need to indicate which data must be replaced with existing place attributes for the search, as well as indicate which regex groups must be extracted.

  1. "GeoNames country file, city search" is parsed with: r'\t'+CITY_transl +r'\t[^\t]*\t[^\t]*\t' +latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)\tP'
  2. "GeoNames country file, city localized variants search" is parsed with: r'[\t,]'+CITY_transl+r'[,\t][^\t\d]*\t?' +latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)\tP'
  3. "GeoNames country file, title begin, general search" is parsed with: r'\t'+TITLEBEGIN_transl +r'\t[^\t]*\t[^\t]*\t' +latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)\t[PSTV]'
  4. "GeoNames USA state file, city search" is parsed with: r'\t'+CITY_transl+r'\tPopulated Place\t[^\t]*\t[^\t]*\t' + countygr + r'[^\t]*)' + r'\t[^\t]*\t[^\t]*\t[^\t]*\t' +latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)'
  5. "GNS Geonet country file, city search" is parsed with: r'\t'+latgr+r'[\d+-][^\t]*)\t'+longr+r'[\d+-][^\t]*)' + r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\tP\t[^\t]*\t[^\t]*' + r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*' r'\t[^\t]*\t[^\t]*\t[^\t]*' + r'\t'+CITY_transl+r'\t[^\t]*\t[^\t\n]+$'
  6. "GNS Geonet country file, title begin search" is parsed with: r'\t'+latgr+r'[\d+-][^\t]*)\t'+longr+r'[\d+-][^\t]*)'+ r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[PLSTV]\t[^\t]*\t[^\t]*'+ r'\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*' + r'\t[^\t]*\t[^\t]*\t[^\t]*' + r'\t'+TITLEBEGIN_transl+r'\t[^\t]*\t[^\t\n]+$'
  7. Read of mediawiki CSV dump. This reads the files on [1](link gone) (for more information, see http://meta.wikimedia.org/wiki/WikiProjects_Geographical_coordinates) (Contribution by nomeata)

For extracting data you can use the same group names as in title parsing. For example, latgr in the patterns above stands for: r'(?P<'+lat_translated +r'>' .

The values substituted into the search, eg CITY_transl, are written as _('CITY'). The available substitution values are: _('CITY'), _('TITLE'), _('TITLEBEGIN'), _('STATE'), _('PARISH').

The tool will read in the given regex, replace the substitution strings by the values in the place object, do the search, and extract the regex groups given from the result.
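
The read, substitute, search and extract cycle described above can be sketched as follows. This is not the tool's code: the function `lookup`, the use of `re.escape` on the substituted value, and the single made-up data line (laid out like a GeoNames country file record) are all assumptions for illustration.

```python
import re

latgr = r'(?P<lat>'
longr = r'(?P<lon>'

# "GeoNames country file, city search", with CITY as the substitution string
# (assumption: English locale, so _('CITY') is the literal text CITY).
TEMPLATE = (r'\t' + 'CITY' + r'\t[^\t]*\t[^\t]*\t'
            + latgr + r'[\d+-][^\t]*)\t' + longr + r'[\d+-][^\t]*)\tP')


def lookup(template, substitutions, data):
    """Fill in the place values, search the data, return the named groups."""
    pattern = template
    for key, value in substitutions.items():
        # Escaping the value is a choice made for this sketch.
        pattern = pattern.replace(key, re.escape(value))
    match = re.search(pattern, data)
    return match.groupdict() if match else None


# Hypothetical record: id, name, ascii name, alternate names, lat, lon,
# feature class, feature code, country code.
data = "1\tExampleville\tExampleville\tExempelstad\t12.5\t-45.25\tP\tPPL\tXX\n"

print(lookup(TEMPLATE, {"CITY": "Exampleville"}, data))
print(lookup(TEMPLATE, {"CITY": "Nowhere"}, data))
```

The trailing `\tP` in the pattern restricts hits to records whose feature class starts with P, which is why the city search ignores landmarks that the title begin search (with `[PSTV]`) would accept.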

Resource: GEOnet Names Server (https://en.wikipedia.org/wiki/GEOnet_Names_Server)

Troubleshooting

Non UTF-8 latitude/longitude file

The place completion tool expects the input files for location lookup to be in Unicode (UTF-8). If this is not the case, you will get an error such as:

File "/home/benny/programms/gramps/gramps2/src/plugins/PlaceCompletion.py", line 851, in load_latlon_file
    self.latlonfile_datastr = infile.read()
  File "/usr/lib/python2.4/codecs.py", line 481, in read
    return self.reader.read(size)
  File "/usr/lib/python2.4/codecs.py", line 293, in read
    newchars, decodedbytes = self.decode(data, self.errors)
 UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1610092-1610094: invalid data

Note that the Place Completion tool catches these errors and shows you an information box. After this, the tool will attempt to read the file as UTF-8 again, this time ignoring errors. This might give good results, but will of course fail to produce results on files that are not UTF-8 encoded at all.

In the above example the problem is confined to a few bytes, so you can correct it manually: open the file with eg KHexEdit Binary Editor, go to the specified position (offset 1610092), and replace the offending bytes with spaces.

If the file is entirely in a non-Unicode encoding, you will have to convert it to UTF-8 with a conversion tool before using it in the place completion tool.
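
If no dedicated conversion tool is at hand, a few lines of Python can do the conversion. This is a minimal sketch: the function names are hypothetical, and you have to know (or guess) the source encoding yourself, for example `latin-1`. The second function mimics the fallback described above, reading as UTF-8 while ignoring undecodable bytes.

```python
def convert_to_utf8(src_path, dst_path, source_encoding):
    """Re-encode a text file from source_encoding to UTF-8, line by line."""
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


def read_utf8_ignoring_errors(path):
    """Read a file as UTF-8 and silently drop bytes that cannot be decoded."""
    with open(path, "r", encoding="utf-8", errors="ignore") as infile:
        return infile.read()
```

Usage would be along the lines of `convert_to_utf8("FR.txt", "FR_utf8.txt", "latin-1")`, where both file names are placeholders. Ignoring errors keeps the file readable but drops the affected characters, so names containing them will no longer match your city names.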

Memory Error

The tool might fail with the error:

self.latlonfile_datastr = infile.read()
  File "/usr/lib/python2.4/codecs.py", line 481, in read
    return self.reader.read(size)
  File "/usr/lib/python2.4/codecs.py", line 293, in read
    newchars, decodedbytes = self.decode(data, self.errors)
 MemoryError

The tool has to load the datafile for latitude/longitude searching into memory. For large files like USA.txt, this might be impossible if you have limited memory. You can try closing as many other programs as possible that are running alongside Gramps, and then try the tool again.
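
Another possible workaround, which is not a feature of the tool, is to cut the datafile down before loading it: read it line by line and keep only the records that mention the names you are looking for. The sketch below assumes a tab-separated file with one record per line; the function name and the example needle are hypothetical.

```python
def extract_matching_lines(src_path, dst_path, needles):
    """Copy only the lines containing one of the needles; return how many were kept.

    The file is streamed line by line, so it never has to fit in memory.
    """
    kept = 0
    with open(src_path, "r", encoding="utf-8", errors="ignore") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if any(needle in line for needle in needles):
                dst.write(line)
                kept += 1
    return kept
```

For example, `extract_matching_lines("USA.txt", "USA_small.txt", ["\tEnterprise\t"])` would keep only records with a tab-delimited field equal to Enterprise. Because lines are copied unchanged, the smaller file keeps the original column layout and can be selected in the tool with the same parsing option.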

See also

  • GeoCodes

Design specification

  • See Place completion tool specification

Place/Location is a newer concept in Gramps. Many older databases only have a Place title field which is a descriptive text containing city, state, country. To distribute these values into the correct attribute fields, see the Parse title details section.

Download

If you use Gramps 3.2, 3.3, 3.4, 4.1 or 4.2, use the automatic installation for your version.

Manual install

For the following older versions of Gramps you will need to do a Manual install of the files.

Extract the three files that are in the download. Put the .glade and .py files in the plugins directory. On Linux:

  • local install: place in ~/.gramps/plugins

If you still use Gramps 3.1.x, you will need version 1.2 of the Place Completion Tool. You can find it at placecompletion_1_2.tar.gz (http://cage.ugent.be/~bm/varia/placecompletion_1_2.tar.gz).

If you still use Gramps 3.0.x, you will need version 1.1 of the Place Completion Tool. You can find it at placecompletion_1_1.tar.gz (http://cage.ugent.be/~bm/varia/placecompletion_1_1.tar.gz).

If you still use Gramps 2.2.5+, you will need version 1.0 of the Place Completion Tool. You can find it at placecompletion_1_0.tar.gz (http://cage.ugent.be/~bm/varia/placecompletion_1_0.tar.gz).