Bioinformatics Advance Access originally published online on February 17, 2006
Bioinformatics 2006 22(8):1013-1014; doi:10.1093/bioinformatics/btl058
© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Pedigree-drawing with R and graphviz

Jing Hua Zhao

MRC Epidemiology Unit, Strangeways Research Laboratory, Cambridge CB1 8RN, UK


    ABSTRACT

Summary: Two functions for pedigree drawing available in R (http://www.r-project.org) are described: plot.pedigree in package kinship and pedtodot in package gap. The latter requires graphviz (http://www.graphviz.org). Both can quickly write many pedigree diagrams to a single file, serving as alternatives to programs that only offer interactive use.

Availability: Packages kinship and gap are available from http://cran.r-project.org.

Contact: jinghua.zhao{at}mrc-epid.cam.ac.uk

Graphical display of pedigree data is of interest in family studies, and many computer programs are available [see Dudbridge et al. (2004) and http://linkage.rockefeller.edu]. For example, Madeline (http://eyegene.ophthy.med.umich.edu) is a flexible yet sophisticated package released under the GNU General Public License (http://www.gnu.org/copyleft/gpl.html). In view of the rising interest in R (Zhao and Tan, 2006), two R functions for drawing pedigree diagrams are introduced here. The first, plot.pedigree, is in package kinship, originally written in S-Plus by Terry Therneau and Beth Atkinson; the second, pedtodot, is in package gap and was motivated by a gawk script by David Duffy.

Before they are described in full, it is necessary to know how pedigree information is organized and represented. It is also helpful to have some understanding of the algorithmic aspects (Tores and Barillot, 2001).

For example, information for pedigree numbered 10081 from Genetic Analysis Workshop 14 (http://www.gaworkshop.org) is shown as follows.


id  fid  mid  sex  aff  GABRB1  D4S1645
 1    2    3    2    2  7/7     7/10
 2    0    0    1    1  —/—     —/—
 3    0    0    2    2  7/9     3/10
 4    2    3    2    2  7/9     3/7
 5    2    3    2    1  7/7     7/10
 6    2    3    1    1  7/7     7/10
 7    2    3    2    1  7/7     7/10
 8    0    0    1    1  —/—     —/—
 9    8    4    1    1  7/9     3/10
10    0    0    2    1  —/—     —/—
11    2   10    2    1  7/7     7/7
12    2   10    2    2  6/7     7/7
13    0    0    1    1  —/—     —/—
14   13   11    1    1  7/8     7/8
15    0    0    1    1  —/—     —/—
16   15   12    2    1  6/6     7/7

The first three columns hold the individual, father and mother IDs, changed to integers for clarity. A pedigree ID, not shown here, allows multiple pedigrees to be maintained in a single database. The individual's sex (e.g. 1 = male, 2 = female) is included as auxiliary information, and the variable aff indicates whether an individual is alcoholic (1 = no, 2 = yes). Parent IDs are set to zero for individuals whose parents are not in the pedigree. The last two columns are the genotypes at markers GABRB1 and D4S1645.
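The coding conventions just described can be checked with a few lines of base R. The sketch below is illustrative only: it re-enters the first five columns of the table (genotypes are left out for brevity), and the object names ped_text, founders, missing_parents, fathers_ok, mothers_ok and n_affected are chosen here rather than taken from kinship or gap.

```r
# Pedigree 10081 as tabulated above, without the two genotype columns.
ped_text <- "
id fid mid sex aff
1 2 3 2 2
2 0 0 1 1
3 0 0 2 2
4 2 3 2 2
5 2 3 2 1
6 2 3 1 1
7 2 3 2 1
8 0 0 1 1
9 8 4 1 1
10 0 0 2 1
11 2 10 2 1
12 2 10 2 2
13 0 0 1 1
14 13 11 1 1
15 0 0 1 1
16 15 12 2 1
"
pre <- read.table(text = ped_text, header = TRUE)

# Founders: individuals whose parents are both coded as zero.
founders <- pre$id[pre$fid == 0 & pre$mid == 0]

# Every nonzero parent ID should refer to a record in the pedigree.
missing_parents <- setdiff(setdiff(c(pre$fid, pre$mid), 0), pre$id)

# Fathers should be coded male (1) and mothers female (2).
fathers_ok <- all(pre$sex[match(setdiff(pre$fid, 0), pre$id)] == 1)
mothers_ok <- all(pre$sex[match(setdiff(pre$mid, 0), pre$id)] == 2)

# Number of affected individuals (aff == 2).
n_affected <- sum(pre$aff == 2)

print(founders)
print(n_affected)
```

Checks of this kind are cheap to run before drawing, since an undefined parent ID or a miscoded sex typically shows up later as a confusing diagram.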

In a typical pedigree diagram, males and females are shown as squares and circles, respectively. Spouses form marriage nodes, from which nodes for their children are derived. It is also customary to draw pedigree diagrams top down, so that children at a given generation can have children of their own in the next generation.

This implies that a conceptually simple algorithm for pedigree drawing would involve sorting members of a pedigree by generation, then aligning members of the same generation horizontally and those of different generations vertically. In other words, the family is drawn as a graph with members as nodes, ordered by their generation numbers. The algorithm becomes more involved if there are marriage loops in the family, i.e. overlapping generations, or if the pedigree is too large to fit on a single page. Pedigree information is therefore maintained in a database such that each record corresponds to a node in the pedigree graph.
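As a rough illustration of the generation-sorting idea, the base R sketch below assigns generation numbers to pedigree 10081. It is not the algorithm used by plot.pedigree or pedtodot; the function name assign_generations is hypothetical, and the code assumes a pedigree without marriage loops in which each individual has either both parents recorded or neither.

```r
# Individual, father and mother IDs of pedigree 10081 (0 = parent not in pedigree).
pre <- data.frame(
  id  = 1:16,
  fid = c(2, 0, 0, 2, 2, 2, 2, 0, 8, 0, 2, 2, 0, 13, 0, 15),
  mid = c(3, 0, 0, 3, 3, 3, 3, 0, 4, 0, 10, 10, 0, 11, 0, 12)
)

# Hypothetical helper: generation 0 is the top row of the diagram.
assign_generations <- function(ped) {
  fi <- match(ped$fid, ped$id)   # row index of each father (NA if absent)
  mi <- match(ped$mid, ped$id)   # row index of each mother (NA if absent)
  is_founder <- ped$fid == 0 & ped$mid == 0

  gen <- rep(NA_integer_, nrow(ped))
  gen[is_founder] <- 0L

  # Place each child one row below the lower of its two parents,
  # repeating until no further individual can be placed.
  repeat {
    ready <- is.na(gen) & !is.na(gen[fi]) & !is.na(gen[mi])
    if (!any(ready)) break
    gen[ready] <- pmax(gen[fi[ready]], gen[mi[ready]]) + 1L
  }

  # Move married-in founders down to their spouse's row so that
  # each couple is aligned horizontally.
  for (k in which(!is.na(fi) & !is.na(mi))) {
    pair <- c(fi[k], mi[k])
    gen[pair[is_founder[pair]]] <- max(gen[pair])
  }
  gen
}

gen <- assign_generations(pre)
print(split(pre$id, gen))   # members grouped by row of the diagram
```

Here individuals 8, 13 and 15 are founders who marry into the second row, which is why the final loop is needed; without it they would be drawn on the top row, away from their spouses.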

Now suppose the example pedigree above is kept in an ASCII text file called 10081.pre. We use pre <- read.table("10081.pre", header=T) to read it into object pre, and can then call plot.pedigree as follows.

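library(kinship)
attach(pre)                              # make the columns of pre visible by name
ped <- pedigree(id, fid, mid, sex, aff)  # build the pedigree object
par(xpd = TRUE)                          # do not clip labels to the plot region
plot.pedigree(ped)
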
To attach genotypes to the diagram, we specify strid<-id; strid<-paste("\n",id,GABRB1,D4S1645,sep="\n"); par(xpd=T); plot(ped,id=strid). This gives Figure 1. As function pedigree produces an R object of class 'pedigree', one can simply use plot instead of plot.pedigree. One can also use option mar to control the size of the diagram, and R devices such as postscript to save the diagram to an external file, e.g. postscript("10081.ps"); plot(ped); dev.off(). The output file thus specified can hold many pedigree diagrams, generated by looping over pedigrees.

The diagram is a still image, so nodes in the pedigree graph cannot be pulled and dragged. Such features might not be trivial to implement but are available in graphviz. Graphviz interprets the dot language, which is in ASCII format, so one can translate pedigree information into the dot language and feed it to the graphviz programs dot, dotty and neato. The R function for this is called pedtodot and can be used as follows: library(gap); pre<-cbind(pid=10081,pre); pedtodot(pre). This generates 10081.dot, which can be viewed with dotty or converted to a postscript file via "dot -Tps 10081.dot -o 10081.ps".

When the dir="forward" option of pedtodot is specified, neato can be used to give a more liberal graph. Its unusual look makes it appealing for illustrating the peeling algorithm in the likelihood calculation of pedigrees. A single file containing all pedigree diagrams can be generated via the sink command when multiple pedigrees are drawn to the screen using the sink=F option of pedtodot. Unlike plot.pedigree, pedtodot requires the pedigree ID to be specified. The diagrams from pedtodot can be edited with dotty and printed over multiple pages when a pedigree diagram is too big to fit on a single one.
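To make the translation from pedigree records to the dot language concrete, here is a minimal base R sketch. It is not the implementation of pedtodot and does not reproduce its output; the function ped_to_dot_sketch, the marriage-node naming scheme ("2x3" for the couple 2 and 3) and the grey fill for affected individuals are all assumptions made for illustration.

```r
# Pedigree 10081: IDs, parents, sex (1 = male, 2 = female), aff (2 = affected).
pre <- data.frame(
  id  = 1:16,
  fid = c(2, 0, 0, 2, 2, 2, 2, 0, 8, 0, 2, 2, 0, 13, 0, 15),
  mid = c(3, 0, 0, 3, 3, 3, 3, 0, 4, 0, 10, 10, 0, 11, 0, 12),
  sex = c(2, 1, 2, 2, 2, 1, 2, 1, 1, 2, 2, 2, 1, 1, 1, 2),
  aff = c(2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1)
)

# Hypothetical helper returning the lines of a dot file as a character vector.
ped_to_dot_sketch <- function(ped, name = "pedigree") {
  # One node per individual: squares for males, circles for females.
  shape <- ifelse(ped$sex == 1, "box", "circle")
  fill  <- ifelse(ped$aff == 2, "grey", "white")
  nodes <- sprintf('  "%s" [shape=%s, style=filled, fillcolor=%s];',
                   ped$id, shape, fill)

  # One marriage node per distinct couple with children.
  kids   <- ped[ped$fid != 0 & ped$mid != 0, ]
  couple <- paste(kids$fid, kids$mid, sep = "x")
  marriages <- unique(data.frame(node = couple, fid = kids$fid, mid = kids$mid))
  m_nodes <- sprintf('  "%s" [shape=point];', marriages$node)

  # Undirected edges from each spouse to the marriage node,
  # then directed edges from the marriage node to each child.
  f_edges <- sprintf('  "%s" -> "%s" [dir=none];', marriages$fid, marriages$node)
  s_edges <- sprintf('  "%s" -> "%s" [dir=none];', marriages$mid, marriages$node)
  c_edges <- sprintf('  "%s" -> "%s";', couple, kids$id)

  c(sprintf("digraph %s {", name), nodes, m_nodes, f_edges, s_edges, c_edges, "}")
}

dot_lines <- ped_to_dot_sketch(pre, "ped10081")
dot_file  <- file.path(tempdir(), "10081-sketch.dot")
writeLines(dot_lines, dot_file)
cat(head(dot_lines, 4), sep = "\n")
```

The resulting file is plain text and can be passed to the graphviz programs in the same way as the file written by pedtodot, for example with dot -Tps and the path held in dot_file.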


Fig. 1 Pedigree 10081 by kinship.

 
Package kinship was developed for linear mixed and mixed-effects Cox models of family data (Zhao, 2005), and package gap for population and family-based genetic analyses in general. Their documentation can be obtained within R through library(help=kinship) and ?plot.pedigree, or library(help=gap) and ?pedtodot, while help.start() presents this information in a web browser. Furthermore, path.diagram in the R package sem can generate a dot file to be used by graphviz, as can the Bioconductor (http://www.bioconductor.org) package Rgraphviz. It is therefore desirable that graphviz will eventually be part of the R system.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Martin Bishop

Received on December 13, 2005; revised on February 13, 2006; accepted on February 13, 2006

    REFERENCES

    Dudbridge, F., et al. (2004) Pelican: pedigree editor for linkage computer analysis. Bioinformatics, 20, 2327–2328.

    Tores, F. and Barillot, E. (2001) The art of pedigree drawing: algorithmic aspects. Bioinformatics, 17, 174–179.

    Zhao, J.H. (2005) Mixed-effects Cox models of alcohol dependence in extended pedigrees. BMC Genetics, 6, Suppl 1, S127.

    Zhao, J.H. and Tan, Q. (2006) Integrated analysis of genetic data with R. Hum. Genomics, 2, 258–265.



