Talk:Addon:DNASegmentMapGramplet


Please use carefully on data that is backed up, and help make it better by reporting any comments or problems to the author, or issues to the bug tracker
Unless otherwise stated on this page, you can download this addon by following these instructions.
Please note that some Addons have prerequisites that need to be installed before they can be used.
This Addon/Plugin system is controlled by the Plugin Manager.
An Addons Offline Manual is available for review.

Undocked DNA Segment Map Gramplet with the HighContrast theme

The DNA Segment Map Gramplet shows a graph.

Usage

The purpose of this gramplet is to view DNA segment data for the active person and a set of associated people. Once the user has taken an autosomal DNA test and uploaded the data to one of the vendors (GEDmatch or FamilyTreeDNA, for example), the vendor can calculate the DNA segments shared with others in its system. These lists of people and their shared segments are the input to this gramplet for visualization.

Each person with shared segments will be a separate Association. The Notes in the Association should contain the shared segment info as calculated by the vendor.

There are two views for this gramplet. The default view is all of the chromosomes with all of the segments, painted in the order of the Association. The second view is a detailed view of a single chromosome, with each associated person having a separate row. The user clicks on the chromosome label (y-axis) to switch to the detailed view and clicks on the background to return to the default view.

You can install the DNA Segment Map Gramplet on the bottombar of one of the people or relationship category list views.

Create a DNA Association

To specify shared DNA segments between two people:

  • Create an Association for one person (Person A) with another Person (Person B) of the type DNA.
  • Add a Citation to the Association to reference the vendor that provided the segment info. The gramplet does not use the Citation, but it is useful because different vendors produce slightly different segment info, and some associated persons might have data from multiple vendors. In such cases, a second Association is useful.
  • Create a Note in the Association or attached to a Citation in the Association with the shared DNA segment data.
    • The format of the Note is a comma-separated or tab-separated list in this order: Chromosome Number, Start Segment, End Segment, shared length in centiMorgans (cM), SNPs (optional), and M, P, or U (unknown) to override the Maternal or Paternal chromosome side as determined by the closest genetic connection in the tree (optional).
    • For example, 3,56950055,64247327,10.9,1404 means Chromosome Number: 3, Start Segment: 56950055, End Segment: 64247327, shared length in cM: 10.9, matching SNPs: 1404.
    • Valid entries for each are:
      Chromosome Number: a number from 1 to 22, or X.
      Start Segment: the starting number for the segment location.
      End Segment: the ending number for the segment location.
      Shared length in cMs: the Genetic Distance (otherwise known as the number of centiMorgans) in the segment.
      SNP: optional field of the matching SNPs (Single Nucleotide Polymorphisms) in the segment.
      M/P flag: optional field to override the Maternal or Paternal or Unknown chromosome. Valid entries are M or P or U. Any other data is ignored.
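
The row format above can be illustrated with a small parser. This is a minimal sketch in Python, not the gramplet's actual code: the names Segment and parse_segment_line are hypothetical, and it assumes positions are plain integers without thousands separators.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str        # "1".."22" or "X"
    start: int
    end: int
    cm: float
    snps: Optional[int]    # None when the SNP field is absent or not numeric
    side: Optional[str]    # "M", "P", "U", or None when no override is given


VALID_CHROMOSOMES = {str(n) for n in range(1, 23)} | {"X"}


def parse_segment_line(line: str) -> Optional[Segment]:
    """Parse one Note row; return None for headers or unusable rows."""
    # Assumption: a row containing a tab is tab-separated, otherwise comma-separated.
    delimiter = "\t" if "\t" in line else ","
    fields = [f.strip() for f in line.strip().split(delimiter)]
    if len(fields) < 4:
        return None
    chromosome = fields[0].upper()
    if chromosome not in VALID_CHROMOSOMES:
        return None  # header rows and out-of-range chromosomes are skipped
    try:
        start, end, cm = int(fields[1]), int(fields[2]), float(fields[3])
    except ValueError:
        return None
    snps = int(fields[4]) if len(fields) > 4 and fields[4].isdigit() else None
    flag = fields[5].upper() if len(fields) > 5 else ""
    side = flag if flag in {"M", "P", "U"} else None  # any other data is ignored
    return Segment(chromosome, start, end, cm, snps, side)
```

Calling parse_segment_line("3,56950055,64247327,10.9,1404") yields a Segment for chromosome 3 with no side override, while a header row returns None because its first field is not a valid chromosome.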


Getting the chromosome data

Sites like GEDmatch make this shared chromosome data available. Direct copy from the GEDmatch results page (with header and tab separators) will work. There can be additional Associations between Person A and Person C (et cetera) as known.

DNApainter provides a description of how to get the chromosome data from many of the common sites.

Add the kit number or other unique info as an Attribute to the Associated Person. For example, create an Attribute 'GEDmatch kit' to hold the GEDmatch kit number, or 'DNAkit' for the FTDNA kit number. The gramplet does not use these Attributes, but they are good for reference.

The sites that have DNA segment data and formats are covered below.

FamilyTreeDNA

FamilyTreeDNA provides a downloadable CSV file. Starting at the second field (that is, omitting the Match Name), the rows are in the format this gramplet expects.

Match Name,Chromosome,Start Location,End Location,Centimorgans,Matching SNPs
MATCH NAME, 1,191870504,201977600,9.505878,2391

GEDmatch

GEDmatch provides a cut-paste option for data. The format is set for copy into this gramplet. The gramplet will accept either comma or period as the thousands separator.

Chr B37 Start Pos'n B37 End Pos'n Centimorgans (cM) SNPs Segment threshold Bunch limit SNP Density Ratio
6 67,249,077 91,039,808 15.6 978 191 114 0.1

MyHeritage

MyHeritage provides a download CSV. The fields need to be adjusted (remove the 1st, 2nd, 6th and 7th fields).

Name,Match Name,Chromosome,Start Location,End Location,Start RSID,End RSID,Centimorgans,SNPs
User Name,Match Name,2,172775482,208500311,rs116868713,rs9288384,29.8,16512

Geneanet

Geneanet provides a downloadable CSV that needs to be adjusted: fields 4 (Number of SNPs) and 5 (Length in centimorgan) need to be switched, field 6 removed, and the semicolon delimiter replaced with a comma.

Chromosome;Start of segment;Length of segment;Number of SNPs;Length in centimorgan (cM);Type of segment
9;14037831;73101159;6804;38.64;half-identical
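
The per-vendor adjustments described above can be sketched in Python as follows. This is an illustration only, not part of the gramplet: the function names are hypothetical, each function converts a single data row, and GEDmatch is omitted because its output is described as usable without changes.

```python
import csv


def _fields(row: str, delimiter: str) -> list:
    """Split one row with the csv module, trimming whitespace and any leading BOM."""
    parsed = next(csv.reader([row.lstrip("\ufeff")], delimiter=delimiter))
    return [field.strip() for field in parsed]


def from_familytreedna(row: str) -> str:
    """Drop the leading Match Name field."""
    return ",".join(_fields(row, ",")[1:])


def from_myheritage(row: str) -> str:
    """Remove the 1st, 2nd, 6th and 7th fields (names and RSIDs)."""
    fields = _fields(row, ",")
    kept = [f for i, f in enumerate(fields, start=1) if i not in (1, 2, 6, 7)]
    return ",".join(kept)


def from_geneanet(row: str) -> str:
    """Switch fields 4 and 5, drop field 6, and replace semicolons with commas."""
    # Raises ValueError if the row has fewer than five fields.
    chromosome, start, third, snps, cm = _fields(row, ";")[:5]
    return ",".join([chromosome, start, third, cm, snps])


print(from_familytreedna("MATCH NAME, 1,191870504,201977600,9.505878,2391"))
print(from_myheritage("User Name,Match Name,2,172775482,208500311,rs116868713,rs9288384,29.8,16512"))
print(from_geneanet("9;14037831;73101159;6804;38.64;half-identical"))
```

Each call prints a row with five comma-separated fields, matching the Note format in "Create a DNA Association".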

Legend

Legend with rollover tooltip
  • For each Chromosome: the top portion is the Paternal side and the bottom portion is the Maternal side.
  • The chromosome segment side (Paternal or Maternal) is determined from the Most Recent Common Ancestor. If there is no common ancestor, both sides are used.
  • The color code for each associated person in the DNA segment map is consistent but not user-specified. The first Association will always be the same color.
  • To change the location of the Legend, edit the config file parameters:
legend-single-chromosome-y-offset=0
legend-swatch-offset-y=0

Navigation

  • The Legend on the right side lists each associated person who has a mapped segment. Hovering over the legend items will show a tooltip for possible action. Primary button click will change the active person. Secondary button click will open the Person Editor for the associated person.
  • Hovering over the Y-axis chromosome labels will show a tooltip for possible action. Primary button click will switch view to single chromosome of the label clicked. To return to full view, click on the background in the single chromosome view.
  • Hovering over a segment provides detail on the segments at that location. If there are multiple segments overlapping, all will have details.
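
The tooltip behaviour for overlapping segments amounts to listing every segment that covers the hovered position. A minimal Python sketch, assuming a hypothetical in-memory row layout (associate name, chromosome, start, end, cM) and using two rows from the example data further down this page:

```python
from typing import List, Tuple

# Hypothetical layout: (associate name, chromosome, start, end, cM)
SegmentRow = Tuple[str, str, int, int, float]

ROWS: List[SegmentRow] = [
    ("Robert F. Garner", "6", 6179882, 15400114, 18.5),
    ("Maude Garner", "6", 4737179, 9181572, 10.3),
]


def segments_at(rows: List[SegmentRow], chromosome: str, position: int) -> List[SegmentRow]:
    """Return every segment on the chromosome whose range includes the position."""
    return [
        row for row in rows
        if row[1] == chromosome and row[2] <= position <= row[3]
    ]


# Both associates share DNA at this position, so both rows are returned.
for name, _, start, end, cm in segments_at(ROWS, "6", 7000000):
    print(f"{name}: {start}-{end} ({cm} cM)")
```

How the gramplet maps a cursor location to a chromosome position is not shown here; the sketch starts from a position that is already known.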

Configuration

The config file for this gramplet has the following options. Remove the comment marker (double semicolon) and edit as needed. Changes take effect the next time Gramps is started.

[map]
;;chromosome-build=37
;;chromosome-x-scale=1.4
;;chromosome-y-scale=1
;;include-citation-notes=0
;;legend-char-height=12
;;legend-single-chromosome-y-offset=0
;;legend-swatch-offset-y=0
;;maternal-background=(0.996, 0.8, 0.941, 1.0)
;;paternal-background=(0.722, 0.808, 0.902, 1.0)
;;show-centromere=1
;;show_associate_id=0

  • Chromosome Build: choose the specific build for the chromosomes. Options are 36, 37, 38.
  • Chromosome X Scale: multiplier for width of drawing area. Increasing will shrink the size of the chromosome bars.
  • Chromosome Y Scale: multiplier for height of drawing area. Increasing will lower the bottom area of the drawing area.
  • Include Citation Notes: whether to also read Citation Notes for DNA data. The default is to read only Association Notes.
  • Legend Char Height: height of lines in legend.
  • Legend Single Chromosome Y offset: Adjust the height of the legend for the single chromosome view. This may be needed if more than 12 people share the chromosome.
  • Legend Swatch Offset: Offset for the color swatch in the legend. Should be 5 for Windows systems; 0 works for Linux and Mac.
  • Maternal Background Color: RGBA values for the background of the Maternal Chromosome.
  • Paternal Background Color: RGBA values for the background of the Paternal Chromosome.
  • Show Centromere: If 1, do not paint the background area for the centromere. This area is dependent on the Build.
  • Show Associate ID: To remove the ID on the legend and tooltip, set to 0. Otherwise leave as 1 to print the Associate ID.
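
To show how the double-semicolon comments and the option values behave, here is a minimal sketch that reads a fragment of the [map] section. It uses Python's standard configparser as a stand-in, which is an assumption: Gramps has its own configuration handling and the gramplet may read the file differently. The fallback values are the defaults from the listing above.

```python
import ast
import configparser

# A fragment of the config file: one option still commented out, three edited.
SAMPLE = """\
[map]
;;chromosome-build=37
chromosome-x-scale=1.6
paternal-background=(0.722, 0.808, 0.902, 1.0)
show-centromere=1
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
section = parser["map"]

# Lines starting with ';' are comments, so the fallback is used for the build.
build = section.getint("chromosome-build", fallback=37)
x_scale = section.getfloat("chromosome-x-scale", fallback=1.4)
# The color is stored as a tuple literal; literal_eval turns it into a tuple of floats.
paternal = ast.literal_eval(
    section.get("paternal-background", fallback="(0.722, 0.808, 0.902, 1.0)")
)
show_centromere = section.getboolean("show-centromere", fallback=True)

print(build, x_scale, paternal, show_centromere)
```

An option takes effect only after its leading double semicolon is removed; until then the parser treats the line as a comment.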

Single Chromosome View

To view just a single chromosome, click on the chromosome number. This view has each association as a different row, helping to see overlapping areas. To return to the standard view, click on the background area.

DNA Example data

To reproduce the illustration with the Example.gramps dataset, create two records in the Associations tab in the Person Editor for Luther Robinson (I0656).

The first record is DNA type, adding an association with Robert F. Garner (I1123).
The Note under this association contains the following text. The last two fields with the names are ignored.

Chromosome,Start Location,End Location,Centimorgans,Matching SNPs,Name,Match Name
3,56950055,64247327,10.9,375,Luther Robinson, Robert F. Garner
11,25878681,35508918,9.9,396
12,129481599,133491098,12.4,304
15,35444614,64710827,33.3,1212
1,48053426,68837810,24.6,3413
1,72956037,87857969,13.4,2035
3,69656569,74563488,9.0,974
6,6179882,15400114,18.5,1994  


The second record is also DNA type, adding an association with Maude Garner (I0651).
The Note under this association contains the following text:

1,30578594,38686908,11.2,334
1,236520701,249210707,29.7,685
3,14446545,24339734,12.3,458
3,128688499,140766208,11.4,447
4,76585823,114118650,33.7,1317
4,163973796,190915650,49.1,1422
6,4737179,9181572,10.3,279
6,39128976,49586285,15.8,510
6,150564916,156389148,10.2,415
7,18915133,37547290,25.8,1038
7,93557588,116296896,20.2,821
7,141636563,156148608,30.9,787
8,2808265,6919748,10.7,436
8,12568161,42652859,38.7,1556
8,49039681,71454529,20.3,742
8,71990280,99554231,21.9,917
9,78958599,122204804,55.7,2014
10,5608202,10769007,10.4,333
10,19365648,38434090,19.6,775
11,26722523,37020611,11.8,447
12,66412457,94422155,24.4,1035
13,19234747,23899627,8.7,270
13,74422984,91183468,14.2,506
14,23902753,33048583,15.5,392
14,88816167,106020366,37.8,947
15,23727655,27246462,8.2,229
16,22836249,32137965,12.6,413
16,46644903,54620503,11.7,360
17,13905,6613192,18.9,419
17,25567080,44187492,19.7,694
17,44790203,72115774,39.7,1223
18,18714991,47726830,27.9,1071
18,69454453,77894844,23.5,481
19,1993444,11174625,28.1,567
19,54545531,59087479,12.7,335
20,9879166,26225145,24.3,788
20,30221104,43975451,13.5,489
21,14670124,18743733,10.8,201

DNA match row data must be in a specific CSV order

The data may include a header but the Gramplet does not require one. Headers are to make the rows more readable by humans. The rows do not need to be sequentially ordered by the chromosome identifier (1 to 22 and X).


Addon-DNA-GEDmatch.png

Sample GEDmatch output (see screenshot). The fields are the same: Chromosome, Start, End, cM, and SNPs. The output can be cut and pasted directly from GEDmatch into the Note for the Association; the header line will be ignored.

Example

Addon-DNA-Note-Example.png

Create an Association of type DNA for Person A, as described in the Association page. Add a Note with the DNA shared segment data. Set the Note private if you do not want the data printed in reports.

Addon-DNA-Association-Example.png

Save the Association.

Addon-DNA-Associations-Example.png

Add more associations as known. Each would be associated to a different person and have a different Note. Since the Associations are drawn in order, it is generally better to have them in order of closest relative to furthest relative to avoid obscuring a distant relative (smaller segment) by a close relative (larger segment). Use the up-arrow and down-arrow to change the order of the Association.
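
One rough way to decide the order is to compare the total shared cM per associated person, on the assumption (not stated by the gramplet) that a larger total indicates a closer relative. The sketch below is hypothetical helper code, not a Gramps feature: it sums the fourth field of each comma-separated row and lists the names from largest to smallest total. The sample Notes are the first two rows of each Note in the example data above.

```python
from typing import Dict, List


def total_cm(note_text: str) -> float:
    """Sum the cM field (4th column) of every data row in a comma-separated Note."""
    total = 0.0
    for line in note_text.splitlines():
        fields = line.split(",")
        try:
            total += float(fields[3])
        except (IndexError, ValueError):
            continue  # header or malformed row
    return round(total, 1)


def suggested_order(notes: Dict[str, str]) -> List[str]:
    """Names sorted by total shared cM, largest first."""
    return sorted(notes, key=lambda name: total_cm(notes[name]), reverse=True)


NOTES = {
    "Robert F. Garner": "3,56950055,64247327,10.9,375\n11,25878681,35508918,9.9,396",
    "Maude Garner": "1,30578594,38686908,11.2,334\n1,236520701,249210707,29.7,685",
}

print(suggested_order(NOTES))
```

The order still has to be applied by hand with the up-arrow and down-arrow in the Associations tab.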

Addon-DNA-SegmentMap2.png

Add the DNA gramplet to the Person view. Select the DNA tab. The segment map will be color coded by associated person. For each Chromosome the top portion is the P (Paternal) side and the M (Maternal) side is the bottom portion. If the chromosome segment side (Paternal or Maternal) is unknown, the segment will cover both the top and bottom portions of the chromosome and be 50% transparent.

Addon-DNA-SegmentMap-with-Tooltip2.png

Hovering the cursor over a known segment will pop up the name of the associated person and the length (in cMs) of the shared segment.

Addon-DNA-SegmentMap-Single.png

Clicking on the Y-axis label for the 6th chromosome shows a detailed view of that chromosome. Click on the background to return to the complete view.

Reference Info

Untested Areas

There are areas in the DNA test that are generally not tested as they are not reliable indicators of a match. If you want to visualize these areas, create a dummy person and add an association with the following DNA segments.

13,1,19020094,0,0
14,1,19067948,0,0
15,1,20004965,0,0
21,1,9922017,0,0
22,1,16055121,0,0

Centromere Areas

Machines have difficulty reading the area around the centromere. There are fewer SNPs to read, so the probability of false positive matches is higher. For example, a match located exactly around the centromere is most probably a false positive. To visualize these areas, edit the config file as described above and set show-centromere to 1. The area of the centromere is dependent on the Build (36, 37, or 38). For example, the list for Build 37 (the most common) is:

1	121500000	128900000
2	90500000	96800000
3	87900000	93900000
4	48200000	52700000
5	46100000	50700000
6	58700000	63300000
7	58000000	61700000
8	43100000	48100000
9	47300000	50700000
10	38000000	42300000
11	51600000	55700000
12	33300000	38200000
13	16300000	19500000
14	16100000	19100000
15	15800000	20700000
16	34600000	38600000
17	22200000	25800000
18	15400000	19000000
19	24400000	28600000
20	25600000	29400000
21	10900000	14300000
22	12200000	17900000
X	58100000	63000000
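
The Build 37 list above can also be used to flag segments that touch a centromere and therefore deserve extra scrutiny. A minimal Python sketch, with the hypothetical names CENTROMERES_B37 and overlaps_centromere; the gramplet itself only paints these areas and does not flag segments this way.

```python
# Build 37 centromere ranges, copied from the list above.
CENTROMERES_B37 = {
    "1": (121500000, 128900000), "2": (90500000, 96800000),
    "3": (87900000, 93900000), "4": (48200000, 52700000),
    "5": (46100000, 50700000), "6": (58700000, 63300000),
    "7": (58000000, 61700000), "8": (43100000, 48100000),
    "9": (47300000, 50700000), "10": (38000000, 42300000),
    "11": (51600000, 55700000), "12": (33300000, 38200000),
    "13": (16300000, 19500000), "14": (16100000, 19100000),
    "15": (15800000, 20700000), "16": (34600000, 38600000),
    "17": (22200000, 25800000), "18": (15400000, 19000000),
    "19": (24400000, 28600000), "20": (25600000, 29400000),
    "21": (10900000, 14300000), "22": (12200000, 17900000),
    "X": (58100000, 63000000),
}


def overlaps_centromere(chromosome: str, start: int, end: int) -> bool:
    """True when the segment [start, end] intersects the Build 37 centromere range."""
    region = CENTROMERES_B37.get(chromosome.upper())
    if region is None:
        return False  # unknown chromosome label
    centromere_start, centromere_end = region
    return start <= centromere_end and end >= centromere_start


# Two segments from the example data: one ends before the centromere, one reaches into it.
print(overlaps_centromere("8", 12568161, 42652859))
print(overlaps_centromere("10", 19365648, 38434090))
```

The ranges apply to Build 37 only; segments reported against Build 36 or 38 would need the matching list.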

To hide the centromere for all chromosomes, edit the config file to change this option:

show-centromere=0

To Do

Issues

  • If the Chromosome Number is not in the range (1, 2, ..., 22, X) it is ignored.
  • If there are multiple paths to a common ancestor, the closest found is used.
  • To create a segment map for Person A, you need to add associations to Person A. There is no reciprocal relationship for Person B - that is, there is no segment map for Person B, only for Person A. You can execute the Addon:SyncAssociation addon to create any missing reciprocal relationships.
  • Color code for each associated person in the map is consistent but not user-specified. The first Association will always be the same color.
  • If there are overlapping segments within a maternal/paternal view of a chromosome, only the front (last drawn) will be pickable. The tooltip will still provide the details of the hidden segments. Changing the order of Associations (using the up-arrow and down-arrow) to have the closer relatives before further relatives will fix this. To see overlapping segments, select a single chromosome.

See also