Difference between revisions of "Recover corrupted family tree"

From Gramps
m (Grammar and Typo)
(26 intermediate revisions by 9 users not shown)
Line 1: Line 1:
{{languages}}
+
{{grampsmanualcopyright}}
  
 +
{{languages|Recover_corrupted_family_tree}}
  
An attempt to explain GRDB corruption, how to recover from it, and how to avoid it in the future.
+
Explanation of '''family tree''' and '''GRDB corruption''', how to recover from it, and how to avoid it in the future.
[[Category:How do I...]]
+
 
 +
== Family Tree corruption ==
 +
=== What causes this corruption? ===
 +
Not really known. Database corruption with family trees is however far less likely than with the previous format of storing your family tree Gramps version 2.2.x uses
 +
 
 +
=== How do you know about it? ===
 +
 
 +
Gramps might give you on startup that recovery is needed via a dialog box:
 +
 
 +
Gramps has detected a problem in the underlying Berkeley database.
 +
This can be repaired by from the Family Tree Manager.
 +
Select the database and click on the Repair button
 +
 
 +
But it might happen no {{man button|Repair}} button is present, or you obtain the error (visible in terminal)
 +
 
 +
(-30975, 'DB_RUNRECOVERY: Fatal error, run database recovery -- PANIC: Invalid argument').
 +
 
 +
=== What to do now? ===
 +
 
 +
It is advisable not to click the repair button right away. It should work, but GRAMPS might believe an error is present while this is in reality not true. Repairing your tree then will lead to loss of the last typed changes.
 +
 
 +
Instead, take a backup of the family tree that is given problems. In a terminal do:
 +
 
 +
gramps -l
 +
 
 +
This will give you a list with all family trees and the directory where they are stored, normally somewhere in the directory  ~/.gramps/grampsdb. Copy the directory of the tree with problems so as to have a backup:
 +
 +
cp -a <target directory> <backup directory>
 +
 
 +
If the recover button was present on the Gramps family tree, click it. All should work again. If you notice you lost information, or the repair button does not work, then do the following.
 +
If recovery worked, but you do not like the result, backup this data and place your backup taken above back in its original position. You now have again the bad family tree to work on. Next, obtain the bsddb recovery tools, see your distributions package search page. The program is called db4.8_recover, where 4.8 might be an older or newer version number.
 +
 
 +
Run this tool as follows:
 +
 
 +
cd /home/<user>/.gramps/grampsdb/<target directory>
 +
db4.8_recover -c
 +
 
 +
That should do the trick, and allow GRAMPS to load the family tree. If not, then start a ticket on the gramps bug tracker.
 +
 
 +
=== I have backup gpkg files ===
 +
If you have a backup, you can try to recover the backup gpkg files. Do the following steps:
 +
The procedure to recover your data from gbkp files is:
 +
# Copy the gbkp files to a new directory in your database directory, eg directory ''a1111''
 +
# Copy name.txt, open it in the new directory and set the content to a unique name.
 +
# Create a file with name '''need_recover'''. Mind the underscore and the lack of an extension. The content of that file is unimportant.
 +
# Start Gramps, click on the family tree with the name you adjusted in step 2. There should be a red stop sign with that filename. Click on the Recover button. The red stop sign should disappear and you should be able to load that family tree.
 +
 
 +
=== Implement more security ===
 +
Your genealogy data contains a lot of work and man hours. So '''work out a backup scheme'''
 +
 
 +
If you work on Gramps regularly: backup the directory holding the family tree databases. These are very large files however.
 +
 
 +
If you know you work on GRAMPS sporadically only, or have no space to backup your trees regularly, then do [[How_to_make_a_backup|backup]] in XML format (the .gramps format). Do not forget to disable privacy filters...
 +
The XML format will open up just fine over 5 years on another computer with another OS. This will probably '''not''' be the case for the databases a family tree is stored in. XML is machine- and human-readable. It is completely self-sufficient. It is also small. The following are good practices of [[How_to_make_a_backup|backups]] :
 +
 
 +
  1. Export to XML from time to time, especially after large edits.
 +
  2. Export to XML before making big changes, such as importing new data into an existing database from e.g. GEDCOM, merging records, running tools that may heavily modify the data, etc.
 +
  3. Export to XML before upgrading GRAMPS to a newer version. Apparently, export to XML with old version before you install the new one!
 +
  4. Export to XML before upgrading your OS.
 +
 
 +
Also, use XML format for any data migration. Moving to another machine, sending data to grandma, copying to another user on the same machine -- all of these cases should use XML, as there is no binary specific data.
 +
 
 +
Note that XML does not contain your media files. The gpkg output format contains XML and your media files, with the disadvantage of this being very large. If you already have a backup scheme for your media files, there is no need to also backup gpkg files.
 +
 
 +
=== ACI not ACID, upgrade, downgrade ===
 +
Gramps protects your data using an ACI database. This means the last commit can be lost on an error, but not more than that. You should before an upgrade make sure Gramps closed your family tree correctly however.
 +
 
 +
There should be no error in opening a family tree with a newer version. See the long research in {{bug|3975}}, which does indicate version 4.7.25 of Bsddb contains a bug that can give a strange error message.
 +
 
 +
Trying to open a family tree after a downgrade is not supported. You will obtain an error that the database is created with a newer version.
  
==Why this corruption?==
+
== Version 2.2.x: GRDB corruption ==
By far, the leading cause of grdb corruption is moving the grdb file from its original location. Whether you move the file to another directory, rename it, copy into another file, transfer to another machine, or another user account -- all of those will "corrupt" the file.
+
===What causes this corruption?===
 +
The leading cause of grdb corruption is moving the grdb file from its original location. Whether you move the file to another directory, rename it, copy into another file, transfer to another machine, or another user account -- all of those will "corrupt" the file.
  
What happens is that the grdb file needs its database environment -- a directory with log files, lock files, temp files, etc. The current stable gramps releases store the environment for each file, under a tree in a <code>~/.gramps/env</code> directory. If your grdb file is <code>/home/user/genealogy/MyData.grdb</code> then its environment is in the <code>/home/user/.gramps/env/home/user/genealogy/MyData.grdb</code> directory.
+
What happens is that the grdb file needs its database environment -- a directory with log files, lock files, temp files, etc. The 2.2.x gramps releases uses grdb files and stores the environment for each file, under a tree in a <code>~/.gramps/env</code> directory. If your grdb file is <code>/home/user/genealogy/MyData.grdb</code> then its environment is in the <code>/home/user/.gramps/env/home/user/genealogy/MyData.grdb</code> directory.
  
 
So moving, copying, or renaming the file will copy the file's bytes, but not its environment. This is why the moved file appears corrupted.
 
So moving, copying, or renaming the file will copy the file's bytes, but not its environment. This is why the moved file appears corrupted.
  
==What do I do now?==
+
Another cause can be an upgrade or downgrade of your operating system to a bsddb database backend that does not support fully the previous form of the database (eg, changed hash versions). This will also seem like a corruption in GRAMPS, but actually means the bsddb tools must be used to convert to data to a new version.
 +
 
 +
Not being able to open a /tmp/... file in GRAMPS 3.0.x on opening grdb files indicates database corruption. This is because the grdb file you want to open is copied to the /tmp dir, and then opened. All failure results in the '/tmp/tmpxxxxx could not be opened'
 +
 
 +
===What do I do now?===
 
The answer depends on whether or not you have the environment for that database. If you just copied one file into another then the environment may still work. If you modified the original database since then, the original environment has changed and there's no good environment for the new file. If you removed your <code>.gramps</code> directory (why oh why?) then all environments are lost. So act depending on the situation, as explained below.
 
The answer depends on whether or not you have the environment for that database. If you just copied one file into another then the environment may still work. If you modified the original database since then, the original environment has changed and there's no good environment for the new file. If you removed your <code>.gramps</code> directory (why oh why?) then all environments are lost. So act depending on the situation, as explained below.
  
===The environment still exists===
+
====The environment still exists====
 
If you have environment directory for that file, copy it under the above gudelines.
 
If you have environment directory for that file, copy it under the above gudelines.
 
;Example: You copied <code>/home/user/genealogy/MyData.grdb</code> to <code>/home/user/genealogy/backup/BackupData.grdb</code> and the new file is not working.
 
;Example: You copied <code>/home/user/genealogy/MyData.grdb</code> to <code>/home/user/genealogy/backup/BackupData.grdb</code> and the new file is not working.
 
;Solution: Copy <code>/home/user/.gramps/env/home/user/genealogy/MyData.grdb</code> directory into <code>/home/user/.gramps/env/home/user/genealogy/backup/BackupData.grdb</code> and this should fix the problem.
 
;Solution: Copy <code>/home/user/.gramps/env/home/user/genealogy/MyData.grdb</code> directory into <code>/home/user/.gramps/env/home/user/genealogy/backup/BackupData.grdb</code> and this should fix the problem.
  
===The environment is lost===
+
====The environment is lost====
If you don't have the original environment for that file, you may try dumping and loading your data using Berkeley DB tools. Depending on your system, they may be called <code>db_dump</code> and <code>db_load</code>, <code>db41_dump</code> and <code>db41_load</code>, <code>db4.4_dump</code> and <code>db4.4_load</code>, or some such. Whatever they are called, there should be be a dump tool and a load tool, and they should be version 4 or later.
+
If you don't have the original environment for that file, you may try dumping and loading your data using Berkeley DB tools. Depending on your system, they may be called <code>db_dump</code> and <code>db_load</code>, <code>db41_dump</code> and <code>db41_load</code>, <code>db4.4_dump</code> and <code>db4.4_load</code>, ... In Ubuntu you find them in the package <code>db4.8-util</code>. You might need more recent versions depending on the version your distribution uses in its python package. So for eg Ubuntu Hardy created files, you will need <code>db4.8-util</code>. Whatever they are called, there should be a dump tool and a load tool, and they should be version 4 or later. For Fedora 17 this is 'db4-utils-4.8.30-10.fc17'.
  
 
Basically, you just dump the grdb into a text file, then create a new grdb from that text file:
 
Basically, you just dump the grdb into a text file, then create a new grdb from that text file:
     $ db4.4_dump BackupData.grdb > somefile.txt
+
     $ db4.8dump BackupData.grdb > somefile.txt
     $ db4.4_load newfile.grdb < somefile.txt
+
     $ db4.8_load newfile.grdb < somefile.txt
and then cross your heart and hope that <code>newfile.grdb</code> will open in gramps.
+
and then cross your heart and hope that <code>newfile.grdb</code> will open in Gramps.
 +
If you obtain the error:
 +
 
 +
db4.4_dump: eidtrans: unsupported hash version: 9
 +
 
 +
this is an indication you need a more recent version. So use db4.8 tools:
 +
    $ db4.8_dump BackupData.grdb > somefile.txt
 +
    $ db4.8_load newfile.grdb < somefile.txt
 +
 
 +
Note: If you downgrade your distribution, it might be needed to do dump with 4.6 tools, and load with 4.4 or 4.5 tools.
  
==How to prevent corruption?==
+
===How to prevent corruption?===
 
While moving the file is the leading cause of corruption, apparently there are other less frequent causes that we don't fully know. So preventing corruption is not always possible.
 
While moving the file is the leading cause of corruption, apparently there are other less frequent causes that we don't fully know. So preventing corruption is not always possible.
  
What is possible though is to backup the data regularly. The backups should be in XML format (the <code>.gramps</code> format). XML is machine- and human-readable. It is completely self-sufficient. It is also smalll. Following are the good practices of backups:
+
What is possible though is to [[How_to_make_a_backup|backup]] the data regularly. The [[How_to_make_a_backup|backups]] should be in XML format (the <code>.gramps</code> format). XML is machine- and human-readable. It is completely self-sufficient. It is also small. The following are good practices of backups:
 
# Export to XML from time to time, especially after large edits.
 
# Export to XML from time to time, especially after large edits.
# Export to XML before making big changes, such as importing new data into an existing database from e.g. GEDCOM; merging records; running tools that may heavily modify the data etc.
+
# Export to XML before making big changes, such as importing new data into an existing database from e.g. GEDCOM, merging records, running tools that may heavily modify the data, etc.
# Export to XML before upgrading gramps to a newer version. Apparently, export to XML with old version before you install the new one!
+
# Export to XML before upgrading GRAMPS to a newer version. Apparently, export to XML with old version before you install the new one!
 
# Export to XML before upgrading your OS.
 
# Export to XML before upgrading your OS.
  
 
Also, use XML format for any data migration. Moving to another machine, sending data to grandma, copying to another user on the same machine -- all of these cases should use XML.
 
Also, use XML format for any data migration. Moving to another machine, sending data to grandma, copying to another user on the same machine -- all of these cases should use XML.
  
==Can you guys not solve this ? ==
+
{{languages|Recover_corrupted_family_tree}}
Yes we can! In the next version (GRAMPS 3.0/4.0) this part will be completely reworked, see [[Database Formats#The Future - GRAMPS_3.0|here]].
+
 
 +
[[Category:How do I...]]

Revision as of 21:51, 25 November 2012

Special copyright notice: All edits to this page need to be under two different copyright licenses:

These licenses allow the Gramps project to maximally use this wiki manual as free content in future Gramps versions. If you do not agree with this dual license, then do not edit this page. You may only link to other pages within the wiki which fall only under the GFDL license via external links (using the syntax: [https://www.gramps-project.org/...]), not via internal links.
Also, only use the known Typographical conventions


Explanation of family tree and GRDB corruption, how to recover from it, and how to avoid it in the future.

Family Tree corruption

What causes this corruption?

The cause is not really known. However, database corruption is far less likely with family trees than with the GRDB format that Gramps version 2.2.x used to store your data.

How do you know about it?

On startup, Gramps might tell you via a dialog box that recovery is needed:

Gramps has detected a problem in the underlying Berkeley database.
This can be repaired by from the Family Tree Manager.
Select the database and click on the Repair button

It might happen, however, that no Repair button is present, or that you get the following error (visible in the terminal):

(-30975, 'DB_RUNRECOVERY: Fatal error, run database recovery -- PANIC: Invalid argument').

What to do now?

It is advisable not to click the Repair button right away. The repair should work, but Gramps might believe an error is present when in reality there is none. Repairing your tree in that case will lose the changes you typed last.

Instead, take a backup of the family tree that is giving problems. In a terminal, run:

gramps -l 

This will give you a list with all family trees and the directory where they are stored, normally somewhere in the directory ~/.gramps/grampsdb. Copy the directory of the tree with problems so as to have a backup:

cp -a <target directory> <backup directory>
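
As a minimal sketch of the copy step above, the following shell function makes a timestamped copy of one tree directory. The function name backup_tree and the backup location are hypothetical; the tree directory is the one reported by gramps -l.

```shell
#!/bin/sh
# Hypothetical helper: copy a family tree directory to a timestamped backup.
# Usage: backup_tree <tree directory> <backup root>
backup_tree() {
    src=$1
    dest_root=$2
    # Refuse to continue if the source is not a directory.
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_root/$(basename "$src")-$stamp"
    mkdir -p "$dest_root" || return 1
    # -a preserves permissions and timestamps, as in the command above.
    cp -a "$src" "$dest" || return 1
    # Print the backup location so the caller can record it.
    echo "$dest"
}
```

Keeping the timestamp in the directory name means repeated backups never overwrite each other, which matters if you take one copy before and one after a repair attempt.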

If the Repair button was present for the family tree in Gramps, click it now. All should work again. If you notice that you lost information, or the Repair button does not work, then do the following. If the repair worked but you do not like the result, back up this data as well and put the backup taken above back in its original position, so that you again have the bad family tree to work on. Next, obtain the bsddb recovery tools; see your distribution's package search page. The program is called db4.8_recover, where 4.8 might be an older or newer version number.

Run this tool as follows:

cd /home/<user>/.gramps/grampsdb/<target directory>
db4.8_recover -c

That should do the trick and allow Gramps to load the family tree. If not, open a ticket on the Gramps bug tracker.

I have backup gbkp files

If you have backup gbkp files, you can try to recover your data from them. The procedure is:

  1. Copy the gbkp files to a new directory in your database directory, e.g. a directory named a1111.
  2. Copy name.txt into the new directory, open it there, and set its content to a unique name.
  3. Create a file named need_recover in the new directory. Mind the underscore and the lack of an extension. The content of that file is unimportant.
  4. Start Gramps, click on the family tree with the name you adjusted in step 2. There should be a red stop sign with that filename. Click on the Recover button. The red stop sign should disappear and you should be able to load that family tree.
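
Steps 1 to 3 above could be scripted roughly as follows. This is only a sketch: the function name prepare_recovery and its arguments are hypothetical, the .gbkp extension is taken from the file name used in the text, and name.txt is written fresh rather than copied and edited. Step 4 still has to be done in Gramps itself.

```shell
#!/bin/sh
# Hypothetical helper covering steps 1 to 3 of the procedure above.
# Usage: prepare_recovery <dir with gbkp files> <database dir> <new dir name> <unique tree name>
prepare_recovery() {
    backup_dir=$1
    db_root=$2
    new_name=$3
    tree_name=$4
    new_dir="$db_root/$new_name"
    # Do not overwrite an existing tree directory.
    [ -e "$new_dir" ] && { echo "already exists: $new_dir" >&2; return 1; }
    mkdir -p "$new_dir" || return 1
    # Step 1: copy the gbkp files into the new directory.
    cp "$backup_dir"/*.gbkp "$new_dir"/ || return 1
    # Step 2: name.txt holds the unique name of the tree.
    printf '%s' "$tree_name" > "$new_dir/name.txt"
    # Step 3: an empty need_recover file; its content is unimportant.
    : > "$new_dir/need_recover"
    echo "$new_dir"
}
```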

Implement more security

Your genealogy data represents a lot of work and many hours, so work out a backup scheme.

If you work on Gramps regularly, back up the directory holding the family tree databases. Be aware, however, that these are very large files.

If you know you work on Gramps only sporadically, or have no space to back up your trees regularly, then make a backup in XML format (the .gramps format). Do not forget to disable privacy filters when exporting. The XML format will still open fine in 5 years on another computer with another OS. This will probably not be the case for the databases a family tree is stored in. XML is machine- and human-readable. It is completely self-sufficient. It is also small. The following are good backup practices:

  1. Export to XML from time to time, especially after large edits.
  2. Export to XML before making big changes, such as importing new data into an existing database from e.g. GEDCOM, merging records, running tools that may heavily modify the data, etc.
  3. Export to XML before upgrading Gramps to a newer version. Naturally, export to XML with the old version before you install the new one!
  4. Export to XML before upgrading your OS. 

Also, use XML format for any data migration. Moving to another machine, sending data to grandma, copying to another user on the same machine -- all of these cases should use XML, as it contains no machine-specific binary data.
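
A regular XML export is easier to keep up if each export gets a dated file name. The sketch below only builds such a name; the function name xml_backup_name is hypothetical, and the commented export command assumes the gramps command line options -O, -e and -f (open, export, format) and a tree called "Family Tree 1".

```shell
#!/bin/sh
# Hypothetical helper: build a dated file name for an XML backup.
# Usage: xml_backup_name <tree name> <backup directory>
xml_backup_name() {
    # Replace every character other than letters, digits, dot,
    # underscore and hyphen with an underscore.
    safe=$(printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_')
    printf '%s/%s-%s.gramps\n' "$2" "$safe" "$(date +%Y-%m-%d)"
}

# Example use (commented out; the tree name and backup directory are assumptions):
# out=$(xml_backup_name "Family Tree 1" "$HOME/backups")
# gramps -O "Family Tree 1" -e "$out" -f gramps
```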

Note that XML does not contain your media files. The gpkg output format contains both the XML and your media files, with the disadvantage that the result is very large. If you already have a backup scheme for your media files, there is no need to also back up gpkg files.

ACI not ACID, upgrade, downgrade

Gramps protects your data using an ACI database, that is, one offering atomicity, consistency, and isolation but not durability. This means the last commit can be lost on an error, but not more than that. Before an upgrade, however, you should make sure Gramps closed your family tree correctly.

There should be no error in opening a family tree with a newer version. See the long investigation in bug 3975, which indicates that version 4.7.25 of Bsddb contains a bug that can give a strange error message.

Trying to open a family tree after a downgrade is not supported. You will obtain an error saying that the database was created with a newer version.

Version 2.2.x: GRDB corruption

What causes this corruption?

The leading cause of grdb corruption is moving the grdb file from its original location. Whether you move the file to another directory, rename it, copy into another file, transfer to another machine, or another user account -- all of those will "corrupt" the file.

What happens is that the grdb file needs its database environment -- a directory with log files, lock files, temp files, etc. The 2.2.x Gramps releases use grdb files and store the environment for each file under a tree in a ~/.gramps/env directory. If your grdb file is /home/user/genealogy/MyData.grdb then its environment is in the /home/user/.gramps/env/home/user/genealogy/MyData.grdb directory.

So moving, copying, or renaming the file will copy the file's bytes, but not its environment. This is why the moved file appears corrupted.

Another cause can be an upgrade or downgrade of your operating system to a bsddb database backend that does not fully support the previous form of the database (e.g. changed hash versions). This will also look like corruption in Gramps, but actually means the bsddb tools must be used to convert the data to a new version.

Not being able to open a /tmp/... file in Gramps 3.0.x when opening grdb files indicates database corruption. This is because the grdb file you want to open is copied to the /tmp directory and then opened. Any failure results in the error '/tmp/tmpxxxxx could not be opened'.

What do I do now?

The answer depends on whether or not you have the environment for that database. If you just copied one file into another then the environment may still work. If you modified the original database since then, the original environment has changed and there's no good environment for the new file. If you removed your .gramps directory (why oh why?) then all environments are lost. So act depending on the situation, as explained below.

The environment still exists

If you have the environment directory for that file, copy it following the path convention described above.

Example
You copied /home/user/genealogy/MyData.grdb to /home/user/genealogy/backup/BackupData.grdb and the new file is not working.
Solution
Copy the /home/user/.gramps/env/home/user/genealogy/MyData.grdb directory to /home/user/.gramps/env/home/user/genealogy/backup/BackupData.grdb; this should fix the problem.
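
The path convention in this example can be expressed as a small shell sketch. Both function names, env_dir_for and copy_env, are hypothetical; the only thing taken from the text is that the environment lives under ~/.gramps/env followed by the absolute path of the grdb file.

```shell
#!/bin/sh
# Hypothetical helper: print the environment directory for a grdb file,
# following the ~/.gramps/env/<absolute path of the file> layout described above.
env_dir_for() {
    case $1 in
        /*) printf '%s\n' "$HOME/.gramps/env$1" ;;
        *)  echo "need an absolute path: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: give a copied grdb file a copy of the original's environment.
# Usage: copy_env <original grdb> <copied grdb>
copy_env() {
    src_env=$(env_dir_for "$1") || return 1
    dst_env=$(env_dir_for "$2") || return 1
    [ -d "$src_env" ] || { echo "no environment: $src_env" >&2; return 1; }
    # Leave an existing environment for the copy untouched.
    [ -e "$dst_env" ] && { echo "already exists: $dst_env" >&2; return 1; }
    mkdir -p "$(dirname "$dst_env")" || return 1
    cp -a "$src_env" "$dst_env"
}
```

For the example above, the call would be copy_env /home/user/genealogy/MyData.grdb /home/user/genealogy/backup/BackupData.grdb.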

The environment is lost

If you don't have the original environment for that file, you may try dumping and loading your data using the Berkeley DB tools. Depending on your system, they may be called db_dump and db_load, db41_dump and db41_load, db4.4_dump and db4.4_load, and so on. Whatever they are called, there should be a dump tool and a load tool, and they should be version 4 or later. You might need a more recent version, depending on the version your distribution uses in its Python package. On Ubuntu you find the tools in the package db4.8-util, which you will need, for example, for files created on Ubuntu Hardy. On Fedora 17 the package is 'db4-utils-4.8.30-10.fc17'.

Basically, you just dump the grdb into a text file, then create a new grdb from that text file:

   $ db4.8_dump BackupData.grdb > somefile.txt
   $ db4.8_load newfile.grdb < somefile.txt

and then cross your heart and hope that newfile.grdb will open in Gramps. If an older tool reports an error such as:

db4.4_dump: eidtrans: unsupported hash version: 9

this indicates that you need a more recent version of the tools. In that case use the db4.8 tools:

   $ db4.8_dump BackupData.grdb > somefile.txt
   $ db4.8_load newfile.grdb < somefile.txt

Note: If you downgrade your distribution, you might need to dump with the 4.6 tools and load with the 4.4 or 4.5 tools.
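
Because the tool names differ between systems, it can help to probe for whichever name is installed before dumping. The following is a sketch only: first_available is a hypothetical helper, and the candidate names in the commented example are the ones mentioned above, newest first.

```shell
#!/bin/sh
# Hypothetical helper: print the first command name from the arguments
# that exists on this system, or fail if none does.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Example use (commented out; file names as in the commands above):
# dump=$(first_available db4.8_dump db4.4_dump db41_dump db_dump) || exit 1
# load=$(first_available db4.8_load db4.4_load db41_load db_load) || exit 1
# "$dump" BackupData.grdb > somefile.txt
# "$load" newfile.grdb < somefile.txt
```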

How to prevent corruption?

While moving the file is the leading cause of corruption, there are apparently other, less frequent causes that we don't fully understand. So preventing corruption is not always possible.

What is possible, though, is to back up the data regularly. The backups should be in XML format (the .gramps format). XML is machine- and human-readable. It is completely self-sufficient. It is also small. The following are good backup practices:

  1. Export to XML from time to time, especially after large edits.
  2. Export to XML before making big changes, such as importing new data into an existing database from e.g. GEDCOM, merging records, running tools that may heavily modify the data, etc.
  3. Export to XML before upgrading Gramps to a newer version. Naturally, export to XML with the old version before you install the new one!
  4. Export to XML before upgrading your OS.

Also, use XML format for any data migration. Moving to another machine, sending data to grandma, copying to another user on the same machine -- all of these cases should use XML.