Bioinformatics Advance Access originally published online on May 5, 2007
Bioinformatics 2007 23(13):1705-1707; doi:10.1093/bioinformatics/btm132

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

PQuad—a visual analysis platform for proteomic data exploration of microbial organisms

Bobbie-Jo M. Webb-Robertson 1,*, Elena S. Peterson 1, Mudita Singhal 1, Kyle R. Klicker 1, Christopher S. Oehmen 1, Joshua N. Adkins 2 and Susan L. Havre 1

1Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA and 2Biological Sciences, Pacific Northwest National Laboratory, Richland, WA

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 VISUAL CAPABILITIES
 3 DESIGN AND IMPLEMENTATION
 4 USE CASE SCENARIO
 5 CONCLUSIONS
 ACKNOWLEDGEMENTS
 REFERENCES
 

Summary: The visual Platform for Proteomics Peptide and Protein data exploration (PQuad) is a multi-resolution environment that visually integrates genomic and proteomic data for prokaryotic systems, overlays categorical annotation and compares differential expression experiments. PQuad requires Java 1.5 and has been tested to run across different operating systems.

Availability: http://ncrr.pnl.gov/software

Contact: bobbie-jo.webb-robertson{at}pnl.gov


    1 INTRODUCTION
Technological advances have been fueling a revolution in biology, enabling analyses of entire systems at a global scale (e.g. whole cells, tumors, or environmental communities). The application of high-throughput (HTP) experimental methodologies to global profiling of proteins is providing an essential component to the challenge of understanding biology at a systems level. Given that approaches such as mass spectrometry (MS) can generate over 400 000 spectra per day, the size and inherent noise of the resulting data sets make data mining challenging, especially in the traditional spreadsheet type of view.

PQuad is a software platform that enables visual exploration of large and complex proteomic datasets of microbial organisms (Havre et al., 2005) in a genomic context. Linked multi-resolution visualizations offer views of the data from the entire chromosome or plasmid down to the individual nucleotide and amino acid sequences.


    2 VISUAL CAPABILITIES
PQuad offers three key levels of resolution: (1) Genome View, (2) ORF View, and (3) Sequence View. Figure 1 displays these views on a data set for Salmonella typhimurium distributed with the software and other resources, available at http://www.proteomicsresource.org. The Genome View on the far left displays the complete DNA sequence of the chromosome. The DNA sequence is depicted as a single continuous gray line that wraps to fill the display area, with the defined ORFs highlighted in yellow and peptide identifications mapped onto the ORFs in blue. The Sequence View on the far right gives residue-specific information for a selected ORF and the possible six-frame translation. The center figure is the ORF View. The ORF View depicts the double-stranded DNA as two black lines, and proteins are represented as bars positioned with respect to the six-frame translation. This view is typically the most interesting to biological users, as observing expression in relation to neighboring proteins in microbial organisms provides biologically relevant information, such as operon position. All three views are linked so that what is selected in one view is automatically propagated to the others.
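
The six-frame translation that the ORF and Sequence Views rely on can be sketched as follows. This is a minimal illustration, not PQuad's own code: the class and method names (SixFrameTranslator, reverseComplement, translate, sixFrames) are hypothetical, and the codon table is the standard genetic code, whose codon-to-residue assignments are assumed to be adequate for a bacterial genome.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of PQuad's published API.
class SixFrameTranslator {
    // Standard genetic code, indexed as 16*b1 + 4*b2 + b3 with bases ordered T, C, A, G.
    private static final String BASES = "TCAG";
    private static final String AMINO_ACIDS =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Reverse complement of a DNA string; unknown characters become 'N'.
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (Character.toUpperCase(dna.charAt(i))) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('N');
            }
        }
        return sb.toString();
    }

    // Translate one reading frame starting at the given offset (0, 1 or 2).
    // Stop codons are written as '*', codons with unknown bases as 'X'.
    static String translate(String dna, int offset) {
        String s = dna.toUpperCase();
        StringBuilder protein = new StringBuilder();
        for (int i = offset; i + 3 <= s.length(); i += 3) {
            int a = BASES.indexOf(s.charAt(i));
            int b = BASES.indexOf(s.charAt(i + 1));
            int c = BASES.indexOf(s.charAt(i + 2));
            if (a < 0 || b < 0 || c < 0) {
                protein.append('X');
            } else {
                protein.append(AMINO_ACIDS.charAt(16 * a + 4 * b + c));
            }
        }
        return protein.toString();
    }

    // Frames 0-2 are read from the forward strand, frames 3-5 from the reverse strand.
    static List<String> sixFrames(String dna) {
        List<String> frames = new ArrayList<String>();
        String rc = reverseComplement(dna);
        for (int k = 0; k < 3; k++) frames.add(translate(dna, k));
        for (int k = 0; k < 3; k++) frames.add(translate(rc, k));
        return frames;
    }
}
```

For the toy sequence ATGGCCTAA, the first forward frame reads Met-Ala-stop, so `SixFrameTranslator.sixFrames("ATGGCCTAA").get(0)` yields `MA*`.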


Fig. 1. PQuad offers three multi-resolution views to browse proteomic data sets. The Genome View on the far left displays an entire source of genome information, such as a chromosome, where the ORFs are colored yellow and peptides are colored blue. The Sequence View on the far right displays the sequence information for a single ORF. The ORF View in the center displays the DNA sequence as double-stranded black lines with the associated proteins displayed in their six-frame translation. The ORF View gives an example of a comparative proteomics data set where the peptides and proteins are colored based on expression, virulent (cyan/blue) and non-virulent (white/green). A detailed legend (bottom right) defines the coloring scheme for the user.

 
Many sources of supplemental information that facilitate biological interpretation come in categorical form, such as sample condition, protein function or microarray expression (up- or down-regulated). PQuad offers the capability to map colors based on category definitions into the Genome and ORF Views. In the ORF View, for example, this allows users to quickly identify whether genes that are up-regulated in a microarray experiment are also expressed in the corresponding proteomic experiment. One of the most useful applications of categorical data integration is comparative proteomics, which evaluates peptide/protein identification information from two or more different experimental conditions. Color is used to differentiate both peptide and protein expression across two experimental conditions.


    3 DESIGN AND IMPLEMENTATION
PQuad is built in Java and has been tested to run on several operating systems, including Windows, Mac OS X, Linux and Unix. It includes a data set creation module and standard warning and error messages associated with loading data. For example, when a data set is loaded into PQuad, a message immediately relays how many peptides matched defined ORFs. Additionally, to enable follow-up analysis, peptides can be selected for export to a file.

3.1 Data requirements
The software requires three simple input files: (1) the chromosome or DNA sequence, (2) the ORF location information, and (3) the peptide identification information (sequence and ORF membership). Categorical data are user-defined and are appended to the relevant file. Peptide categories are limited to two conditions, which are added to the peptide file. The ORF categories form a column in the ORF location file and are limited to 12, a typical number of colors that a human can differentiate.
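
As an illustration of how these inputs relate to one another, the sketch below models ORF and peptide records in memory, counts the peptides whose ORF membership matches a defined ORF (the figure reported when a data set is loaded), and checks the 12-category limit. The record fields, class names and method names are assumptions made for this example; the actual PQuad file layouts are not described here, so no file parsing is attempted.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of the three inputs; field names are illustrative only.
class DataSetCheck {
    static final int MAX_ORF_CATEGORIES = 12; // limit stated in the text

    static final class Orf {
        final String id;
        final int start;       // position on the DNA sequence
        final int end;
        final String category; // optional user-defined category, may be null
        Orf(String id, int start, int end, String category) {
            this.id = id; this.start = start; this.end = end; this.category = category;
        }
    }

    static final class Peptide {
        final String sequence;
        final String orfId;         // ORF membership
        final boolean inConditionA; // at most two peptide conditions are supported
        final boolean inConditionB;
        Peptide(String sequence, String orfId, boolean inConditionA, boolean inConditionB) {
            this.sequence = sequence; this.orfId = orfId;
            this.inConditionA = inConditionA; this.inConditionB = inConditionB;
        }
    }

    // Number of peptides whose ORF membership refers to a defined ORF.
    static int countMatchedPeptides(List<Orf> orfs, List<Peptide> peptides) {
        Set<String> ids = new HashSet<String>();
        for (Orf o : orfs) ids.add(o.id);
        int matched = 0;
        for (Peptide p : peptides) {
            if (ids.contains(p.orfId)) matched++;
        }
        return matched;
    }

    // True when the number of distinct ORF categories does not exceed the limit.
    static boolean categoriesWithinLimit(List<Orf> orfs) {
        Set<String> categories = new HashSet<String>();
        for (Orf o : orfs) {
            if (o.category != null) categories.add(o.category);
        }
        return categories.size() <= MAX_ORF_CATEGORIES;
    }
}
```

Keeping ORF membership as an explicit identifier on each peptide mirrors the description of the peptide file and makes the match count a simple set lookup rather than a sequence search.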

On a standard desktop PC (2 GB RAM, 3.2 GHz), PQuad can easily load a large microbial genome, such as E. coli (~5 Mb, ~4500 genes), together with 100 000 peptides in ~30 s. Although PQuad has the computational capability to load larger chromosomes or multiple concatenated chromosomes, splice joints would not be apparent, and thus the visualization would be less straightforward to interpret.

3.2 Legends and menus
Each view offers pull-down menus that allow the user to tailor the view to their needs. The pull-down menus offer two key capabilities: changing the resolution and changing the contrast and coloring schemes. The user can modify the Genome and ORF Views to show different resolutions, such as one base-pair per pixel or 10 base-pairs per pixel. Menus also offer the capability to change the color scheme for comparative studies by selecting whether to show peptides in the default coloring or by condition. The contrast between colors can be modified with a slider. Tabs give pedigree information associated with the loaded genome, such as size and resolution. Additionally, a legend that defines the color schemes is provided for the Genome and ORF Views (bottom right of Figure 1).
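
The effect of the resolution setting on a wrapped genome line can be illustrated with a small calculation. The sketch below is an assumption about how such a layout might be computed, not a description of PQuad's rendering code: the class GenomeLayout, its methods and the 500-pixel display width used in the example are all hypothetical.

```java
// Hypothetical layout helper for a genome drawn as one line that wraps across rows.
class GenomeLayout {
    // Map a 0-based base position to {row, column} for a given resolution
    // (base pairs per pixel) and display width (pixels per row).
    static int[] toRowCol(long position, int bpPerPixel, int widthPixels) {
        if (position < 0 || bpPerPixel <= 0 || widthPixels <= 0) {
            throw new IllegalArgumentException("position must be >= 0; resolution and width must be > 0");
        }
        long pixelIndex = position / bpPerPixel; // which pixel along the unwrapped line
        return new int[] { (int) (pixelIndex / widthPixels), (int) (pixelIndex % widthPixels) };
    }

    // Number of rows needed to draw the whole sequence (ceiling division twice).
    static long rowsNeeded(long genomeLength, int bpPerPixel, int widthPixels) {
        long pixels = (genomeLength + bpPerPixel - 1) / bpPerPixel;
        return (pixels + widthPixels - 1) / widthPixels;
    }
}
```

Under these assumptions, a 5 Mb genome drawn 500 pixels wide needs 10 000 rows at one base-pair per pixel but only 1000 rows at 10 base-pairs per pixel, which is why a coarser resolution is useful for a whole-chromosome overview.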


    4 USE CASE SCENARIO
We obtained experimental observations for S. typhimurium growing under standard laboratory (rich media) and virulence-inducing (acidic, magnesium-depleted minimal media) conditions (Adkins et al., 2006). Figure 1 illustrates a view of this data set, allowing the user to quickly detect proteins expressed in virulent, non-virulent, or both conditions. The peptides associated with the standard condition are colored in light blue and the peptides associated with virulence in white. Peptides that are expressed in both conditions are colored red. The underlying proteins are then colored for easy identification of proteins expressed in only one condition or the other, blue for standard and green for virulence conditions, circled in white.
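
The condition-based coloring described in this paragraph amounts to a small classification rule, sketched below. The enum, class and method names are hypothetical, the color names are taken from the paragraph above, and the rule that a protein inherits its condition from the union of its peptides is an assumption made for illustration rather than a documented PQuad behavior.

```java
import java.util.List;

// Hypothetical sketch of two-condition coloring; not PQuad's own implementation.
class ConditionColoring {
    enum Expression { NONE, STANDARD_ONLY, VIRULENCE_ONLY, BOTH }

    // Classify an observation by the conditions in which it was identified.
    static Expression classify(boolean inStandard, boolean inVirulence) {
        if (inStandard && inVirulence) return Expression.BOTH;
        if (inStandard) return Expression.STANDARD_ONLY;
        if (inVirulence) return Expression.VIRULENCE_ONLY;
        return Expression.NONE;
    }

    // Peptide colors as named in the use case text.
    static String peptideColor(Expression e) {
        switch (e) {
            case STANDARD_ONLY:  return "light blue";
            case VIRULENCE_ONLY: return "white";
            case BOTH:           return "red";
            default:             return "default";
        }
    }

    // Assumption: a protein counts as expressed in a condition if any of its peptides is.
    static Expression proteinExpression(List<Expression> peptideExpressions) {
        boolean standard = false;
        boolean virulence = false;
        for (Expression e : peptideExpressions) {
            if (e == Expression.STANDARD_ONLY || e == Expression.BOTH) standard = true;
            if (e == Expression.VIRULENCE_ONLY || e == Expression.BOTH) virulence = true;
        }
        return classify(standard, virulence);
    }
}
```

A protein whose peptides are all classified VIRULENCE_ONLY would then be flagged as virulence-specific, which is the kind of case highlighted in Figure 1.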

Using the documented S. typhimurium genome, we identified a set of proteins present only in the virulent condition; these mapped back to pathogenicity island 2 (PI2), which has previously been linked to virulence in S. typhimurium (Unsworth and Holden, 2000). The increased presence of peptides from the virulence-mimicking preparation is readily evident from the PQuad visualization, especially for three proteins linked to type III secretion, a key process in the ability of S. typhimurium to survive in hostile environments.


    5 CONCLUSIONS
PQuad is a new visual analysis tool for proteomics that facilitates analysis of complex mixtures of proteins in multiple conditions for prokaryotic systems. Additionally, PQuad offers basic data integration capabilities by mapping categorical information onto the peptide and protein expression data. Development as an object-oriented application allows new visualizations to be added relatively easily.


    ACKNOWLEDGEMENTS
We would like to thank the laboratory of Richard Smith at the Pacific Northwest National Laboratory (PNNL) who provided the dataset herein generated through interagency agreement Y1-AI-4894-01 from the National Institute of Allergy and Infectious Diseases (NIH/DHHS). This work was supported through Laboratory Directed Research and Development at PNNL. PNNL is a multi-program national laboratory operated by Battelle for the U.S. Department of Energy under contract DE-AC05-76L01830.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Thomas Lengauer

Received on November 16, 2006; revised on March 29, 2007; accepted on March 30, 2007

    REFERENCES
    Adkins JN, et al. (2006) Analysis of the Salmonella typhimurium proteome through environmental response toward infectious conditions. Mol. Cell Proteomics, 5, 1450–1461.

    Havre SL, et al. (2005) Enabling proteomics discovery through visual analysis. IEEE Eng. Med. Biol. Mag., 24, 50–57.

    Unsworth K, Holden D. (2000) Identification and analysis of bacterial virulence genes in vivo. Philos. Trans. R. Soc. Lond. B Biol. Sci., 355, 613–622.

