NoteCleanupTool

From Gramps

Please use carefully on data that is backed up, and help make it better by reporting any comments or problems to the author, or issues to the bug tracker.
Unless otherwise stated on this page, you can download this plugin by following these instructions.
Please note that some Addons have prerequisites that need to be installed before they can be used.
This Addon/Plugin system is controlled by the Plugin Manager.


Note Cleanup Tool - example window

What is it For?

The Note Cleanup Tool searches your database for notes that contain "HTML tags" and cleans them up. "HTML tags" have meaning to web browsers and other applications, but look like garbage to most people.

The cleanup consists of converting the "HTML tags" to what the Gramps folks call "Styled text" which displays nicely within Gramps and in the various reports that Gramps can generate.

The tool also searches for "links" to web sites and sets them to "Styled Text Links", so that they work properly in reports such as the Gramps Narrative Web Site.
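To give a rough idea of what searching for "links" involves, here is a minimal Python sketch that finds bare web addresses in a note and reports where they sit in the text. It is an illustration only, not the tool's actual code: the pattern `URL_PATTERN` and the function `find_links` are hypothetical names, the pattern is deliberately simple, and turning the ranges into "Styled Text Links" inside Gramps is not shown.

```python
import re

# Assumption: a deliberately simple pattern for http/https addresses.
# Real-world URL matching has many more edge cases than this covers.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def find_links(note_text):
    """Return a list of (start, end, url) for each bare web address found."""
    links = []
    for match in URL_PATTERN.finditer(note_text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.group().rstrip(".,;:)")
        links.append((match.start(), match.start() + len(url), url))
    return links
```

Each returned range marks the characters that a styled link would cover; the note text itself is left unchanged, which matches the idea of a note that only has its links converted.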

Examples of notes that need cleaning

 <i>Selected U.S. Naturalization Records</i>. 
 Washington D.C.: National Archives and Records Administration. 
 <p><a href="/search/dbextra.aspx?dbid=1629">
 View Full Source Citations</a>.</p>

 United States of America, Bureau of the Census. <i>Fifteenth Census of the United States, 1930</i>. 
 Washington, D.C.: National Archives and Records Administration, 1930. T626, 2,667 rolls.

The first note contains some text with italics as well as a link to a web site. The cleaned note is shown in the lower right pane of the image of the cleanup tool above. The second dirty note is much simpler: it only contains a bit of italic text. The tool will remove the <i> and </i> and convert the text in between to italics.
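The italics conversion described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the tool's implementation: the names `ItalicExtractor` and `strip_italics` are hypothetical, the result is a plain string plus a list of character ranges instead of real Gramps "Styled text", and tags other than <i> (including links) are simply dropped here.

```python
from html.parser import HTMLParser


class ItalicExtractor(HTMLParser):
    """Collect plain text and the character ranges that were inside <i> tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []            # plain-text fragments seen so far
        self.length = 0            # combined length of those fragments
        self.italic_start = None   # offset where the open <i> began, if any
        self.italic_ranges = []    # (start, end) offsets into the plain text

    def handle_starttag(self, tag, attrs):
        # Assumption: only <i> is treated as a style; other tags are ignored.
        if tag == "i" and self.italic_start is None:
            self.italic_start = self.length

    def handle_endtag(self, tag):
        if tag == "i" and self.italic_start is not None:
            self.italic_ranges.append((self.italic_start, self.length))
            self.italic_start = None

    def handle_data(self, data):
        self.parts.append(data)
        self.length += len(data)


def strip_italics(note_text):
    """Return (plain_text, italic_ranges) for a note containing <i> tags."""
    parser = ItalicExtractor()
    parser.feed(note_text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts), parser.italic_ranges
```

For example, `strip_italics("a <i>b</i> c")` returns `("a b c", [(2, 3)])`: the tags are gone and the range records which characters should be shown in italics.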

Where do the "dirty" notes come from?

The author of this plugin has a lot of data that was imported from http://ancestry.com via GEDCOM. It seems that the ancestry.com software doesn't always do a great job of converting its web pages to notes for GEDCOM export. The popular Family Tree Maker software, in conjunction with ancestry.com, also seems to produce a lot of these "dirty" notes in its GEDCOM exports. There are probably other ways that dirty notes get into our data, but this tool can help clean them up.

Usage

  • Once this plugin has been installed, select Tools -> Utilities -> Note Cleanup... from the menu.
Warning

Proceeding with this tool may make unexpected changes to your data. While it is possible to Undo the changes made by this tool, it may be easier to recover from a backup.

  • The dialog that pops up when you start the tool provides a warning and a short review of the tool usage. Clicking Close will dismiss the dialog but does nothing else.
  • At this point the tool window with three main panes and some buttons should be visible.
  • If you want to perform a quick test to see what this tool does, you can press the Generate Test Notes button. This is best performed on a test database, since it actually adds some test notes to the active database.
  • Otherwise you can perform a scan of your data by pressing the Search button. This does not make any changes to your data; it only performs the search and fills the panes if any dirty notes are found.
    • The left pane will contain lists of the results of the search. If nothing appears here, then none of your data is dirty, or any dirty notes have already been edited and styled. There may be up to three lists in the left pane. A list will only be present if notes are found in that category.
      • The Cleaned Notes list contains notes that have been edited in some way, and may have styles applied. This includes converting any web links to a styled link.
      • The Links Only list contains notes that have not been edited (no textual changes at all) but have had links converted to styled.
      • The Issues list contains notes that may have been edited in some way, but contain HTML tags that the tool does not recognize.
    • The top right pane will show the original unedited text of a note that is selected in the left pane. This cannot be changed.
    • The lower right pane will show the proposed changes to the note. If you want to make additional changes, you can make them directly in the pane. The editing toolbar directly above the lower right pane can be used to make or correct style changes as well. This toolbar is an abbreviated version of the normal Gramps Note editor toolbar and each icon works the same. Any changes made here are preserved within the tool, but will not affect your data until the Save All button is pressed (see the next step).
  • When you have reviewed all the changes made, either by the tool, or manually, you can save your work using the Save All button. Your data is not changed until you press this button, but when you do, all the changes are committed to your database.
  • If you do not want any changes to be made to your data, or you are done with the tool, you can simply press the Close button.
  • If you want a record of notes that have been found in the search, you can use the Export button to save a listing to a text file.
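The way search results are sorted into the three lists in the left pane can be pictured with a short Python sketch. This is a reading of the list descriptions above, not the tool's code: the function `classify_note` and its parameters are hypothetical, and giving Issues priority over the other two lists is an assumption.

```python
def classify_note(original_text, cleaned_text, link_count, unknown_tags):
    """Return the name of the result list a note belongs in, or None.

    original_text -- the note text before cleanup
    cleaned_text  -- the proposed text after cleanup
    link_count    -- number of links converted to styled links
    unknown_tags  -- HTML tags found that were not recognized
    """
    # Assumption: unrecognized tags win, whether or not the text was edited.
    if unknown_tags:
        return "Issues"
    # Any textual change puts the note in the Cleaned Notes list.
    if cleaned_text != original_text:
        return "Cleaned Notes"
    # Text untouched, but at least one link was converted to a styled link.
    if link_count > 0:
        return "Links Only"
    # Nothing to do: the note is not dirty and appears in no list.
    return None
```

A note that needs no changes returns `None`, which corresponds to the case where nothing appears in the left pane.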

Some Notes

  • If any Issues are found, the unrecognized "HTML tag" will be marked up with a yellow highlight. The user may want to manually edit each of these in some way. When done, the highlight can be removed by selecting the highlighted region and pressing the "Clear Markup" toolbar button (last button on the right).
  • Pressing the Search button at any time will scan your data again, discarding any proposed automatic or manual changes and producing new lists.
  • Double-clicking on a note in a left pane list will bring up the standard Gramps Note Editor. Any changes made in that editor, followed by pressing the OK button, will update your database immediately. However, if you later press the Save All button in the still-open Note Cleanup Tool, that data will be overwritten by the Note Cleanup Tool.

Issues