Comparison of performance on large datasets between different GRAMPS versions
It is important that GRAMPS performs well on datasets of 10,000 to 30,000 people. A good benchmark is to test GRAMPS on a dataset of around 100,000 people, and to keep track of performance with every new version.
Furthermore, this page can serve as proof to users that the present version of GRAMPS is not slow. From version 2.2.5 onwards, special attention will be given to performance, so that it does not deteriorate due to changes.
If you want to work with a large database, read Tips for large databases.
To be fair, comparisons should use equal hardware and the same datasets. Each program may use its optimal representation, so GRAMPS is tested in its native database format, GRDB.
Anybody who wants to publish results for commercial software under Windows may do so, provided the comparison is fair: use the same hardware and dataset (so test on a dual-boot machine), and use the internal format of the program.
The first table lists the datasets. Pay attention to the copyright.
The second table lists hardware configurations. Add your machine to this list if you run tests and want to add them to this article.
The third table gives the test results, which are subjective. Please do not run other software while doing the tests.
The Test Results
- My computer hangs on open, eating memory? These are LARGE datasets, so do NOT open them directly. For GRAMPS, open them as follows: create a new GRDB file, then in the empty file go to the File menu, choose Import, and import the dataset.
- What is tar.bz? This is a compression format. You must uncompress the file before importing it.
- Can you provide the GEDCOM? No. Offering GEDCOM has the danger of attracting too much traffic to this site. If you need GEDCOM, install GRAMPS, import the dataset, and then choose "Export to GEDCOM".
- What is in these files? See summary at the bottom of this page.
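The uncompress step can also be scripted. The following is a minimal sketch using Python's standard tarfile module. It assumes the download is a bzip2-compressed tar archive; the function name and the file and directory names in the usage comment are hypothetical, not the actual download names.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path, dest_dir):
    """Unpack a bzip2-compressed tar archive and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:bz2" tells tarfile to expect bzip2 compression.
    with tarfile.open(archive_path, "r:bz2") as archive:
        names = archive.getnames()
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(dest, **kwargs)
    return names


# Hypothetical usage (adjust the names to the file you downloaded):
# extract_dataset("testdb.tar.bz2", "unpacked")
```

After extraction, import the unpacked file into an empty GRDB file as described above rather than opening it directly.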
|d01||Doug's test GEDCOM||-||100993||32MB||Private|
|d02||testdb80000||11.2 MB||82688||70MB||Testing only, no sharing, no publication *** NOTE: THIS FILE IS MISSING. IF ANYONE HAS A COPY, PLEASE CONTACT [email protected] ***|
|d03||testdb120000||18.5MB||124032||105MB||Testing only, no sharing, no publication|
|H01||Pentium 4||2.66 GHz||512 MB||Linux||?|
|H02||?||1.7 GHz||512 MB||Linux||?|
|H03||AMD Athlon64 X2||2x2.1 GHz||1 GB||Kubuntu 6.06||?|
|H04||Intel Centrino Duo||2x1.66 GHz||2 GB||Ubuntu 9.04||User:Duncan|
|H05||Intel Centrino Duo||2x1.66 GHz||2 GB||Ubuntu 8.10||User:Duncan|
|T01||Time to import GEDCOM/GRAMPS in empty native file format (GRDB)|
|T02||Size native file format (GRDB)|
|T03||Time to open native file format (GRDB) for clean/nonclean start on people view (*)|
|T04||Time to open edit person dialog|
|T05||Time to delete/undelete person|
|T06||Open event view clean/after T03 (*)|
|T07||Sort on date in event view|
|T08||Overall editing responsiveness|
(*) A clean start means the computer has been restarted, so the Python methods and modules must also be loaded and started. A nonclean start means you have opened GRAMPS with the .grdb file before and open it again. Parts of the file will still be in memory, as will Python, so access will be faster.
General remark: tests are done with transactions enabled in the GRAMPS preferences, unless indicated otherwise with notrans. Enabling transactions gives a performance boost. For safety, only change this setting on an empty database. You are warned!
|H03||d01||2.2.4 notrans||T01 = 2h||T02 = 542.6MB (v11)|
|H03||d03||2.2.6||T03 = ?/17s||T04 = 1s||T05 = 20s/18s||T06 = ?/9s||T07 = 21s||T08 = Excellent|
|H03||d03||2.2.4||T03 = 2m37s/4m3s||T04 = 3s||T05 = 43s/23s||T06 = 1m23s/12s||T07 = 20s||T08 = very bad|
|H03||d02||2.2.6||T03 = ?/24s||T04 = 1s||T05 = 17s/13s||T06 = ?/11s||T07 = 17s||T08 = Excellent|
|H03||d01||2.2.4||T03 = 2m22s/2m||T04 = 3s||T05 = 33s||T06 = 1m9s/10s||T07 = 18s||T08 = very bad|
|H02||d01||2.2.5||T03 = 12s|
|H02||d01||2.2.4||T03 = 4m17s|
|H05||d03||2.2.10||T03 = 1m15s/16s||T04 = 1s||T05 = 16s/13s||T06 = 11s/1s||T07 = 26s||T08 = good after loading each view once|
|?||db||version||T03 = ?/?||T04 = ?||T05 = ?/?||T06 = ?||T07 = ?||T08 = description|
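To compare versions, it helps to convert the recorded timings to seconds. The following is a minimal sketch, not part of GRAMPS: the helper names are hypothetical, and it assumes durations are written as in the table above (for example 2m37s, 17s, 2h, or ? for a value that was not measured).

```python
import re

# Optional hours, minutes and seconds parts, e.g. "2h", "2m37s", "17s".
_DURATION = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def to_seconds(text):
    """Convert a duration string to seconds; return None if not measured."""
    text = text.strip()
    if text in ("", "?"):
        return None
    match = _DURATION.match(text)
    if not match:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def parse_pair(cell):
    """Split a 'clean/nonclean' cell into two values in seconds."""
    clean, _, nonclean = cell.partition("/")
    return to_seconds(clean), to_seconds(nonclean)


# T03 on H02 with dataset d01, copied from the results table.
old = to_seconds("4m17s")  # GRAMPS 2.2.4
new = to_seconds("12s")    # GRAMPS 2.2.5
print(f"T03: {old}s -> {new}s ({old / new:.1f}x faster)")
# prints "T03: 257s -> 12s (21.4x faster)"

# T03 on H03 with dataset d03, GRAMPS 2.2.4 (clean/nonclean).
print(parse_pair("2m37s/4m3s"))  # prints "(157, 243)"
```

Unmeasured values come back as None, so a comparison should skip them instead of treating them as zero.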
For every test dataset, create a summary with the report "Summary of the database".
Summary of database test d01:
- Number of individuals: 100993
- Males: 53046
- Females: 47947
- Individuals with incomplete names: 324
- Individuals missing birth dates: 42726
- Disconnected individuals: 19
- Number of families: 36554
- Unique surnames: 15308
Summary of database test d02:
- Number of individuals: 82688
- Males: 44736
- Females: 37952
- Individuals with incomplete names: 17120
- Individuals missing birth dates: 31528
- Disconnected individuals: 880
- Number of families: 32256
- Unique surnames: 13957
Summary of database test d03:
- Number of individuals: 124032
- Males: 67104
- Females: 56928
- Individuals with incomplete names: 25680
- Individuals missing birth dates: 47292
- Disconnected individuals: 1320
- Number of families: 48384
- Unique surnames: 20695
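A quick sanity check on these summaries is to confirm that males plus females add up to the number of individuals, and to see what share of people lack a birth date. The following is a small sketch with the figures copied from the summaries above; the dictionary layout and function names are only illustrative.

```python
# Figures copied from the database summaries on this page.
summaries = {
    "d01": {"individuals": 100993, "males": 53046, "females": 47947,
            "missing_birth": 42726},
    "d02": {"individuals": 82688, "males": 44736, "females": 37952,
            "missing_birth": 31528},
    "d03": {"individuals": 124032, "males": 67104, "females": 56928,
            "missing_birth": 47292},
}


def unknown_gender(summary):
    """Individuals counted neither as male nor as female."""
    return summary["individuals"] - summary["males"] - summary["females"]


def missing_birth_share(summary):
    """Percentage of individuals without a birth date."""
    return 100.0 * summary["missing_birth"] / summary["individuals"]


for name, summary in summaries.items():
    print(f"{name}: unknown gender = {unknown_gender(summary)}, "
          f"missing birth dates = {missing_birth_share(summary):.1f}%")
```

For all three datasets the male and female counts sum exactly to the number of individuals.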
Possible Future Optimizations
Some things can be fine-tuned to obtain better results. An overview: