<?xml version="1.0" encoding="ISO-8859-1"?>

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
 xmlns:prism="http://purl.org/rss/1.0/modules/prism/"
 xmlns:admin="http://webns.net/mvcb/"
>

<channel rdf:about="http://bioinformatics.oxfordjournals.org">
<title>Bioinformatics - Advance Access</title>
<link>http://bioinformatics.oxfordjournals.org</link>
<description>Bioinformatics - RSS feed of articles</description>
<prism:eIssn>1460-2059</prism:eIssn>
<prism:publicationName>Bioinformatics</prism:publicationName>
<prism:issn>1367-4803</prism:issn>
<items>
 <rdf:Seq>
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp415v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp406v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp403v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp402v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp414v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp413v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp412v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp411v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp410v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp407v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp404v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp391v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp399v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp398v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp393v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp397v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp395v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp394v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp392v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp390v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp385v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp384v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp382v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp388v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp386v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp383v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp380v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp381v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp379v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp378v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp377v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp376v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp375v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp373v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp372v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp364v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp374v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp371v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp370v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp368v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp366v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp362v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp361v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp360v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp369v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp367v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp365v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp363v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp357v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp335v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp359v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp356v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp355v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp341v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp358v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp354v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp353v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp352v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp349v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp348v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp343v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp342v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp340v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp350v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp347v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp346v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp345v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp344v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp338v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp336v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp339v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp334v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp333v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp332v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp331v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp329v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp330v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp325v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp294v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp318v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp316v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp313v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp311v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp299v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp306v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp307v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp303v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp301v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp268v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp300v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp291v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp289v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp287v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp276v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp279v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp250v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp278v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp266v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp256v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp144v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp143v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp064v2?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp065v1?rss=1" />
  <rdf:li rdf:resource="http://bioinformatics.oxfordjournals.org/cgi/content/short/btm094v2?rss=1" />
 </rdf:Seq>
</items>
</channel>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp415v1?rss=1">
<title><![CDATA[Statistical lower bounds on protein copy number from fluorescence expression images]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp415v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Fluorescence imaging has become commonplace for quantitatively measuring mRNA or protein expression in cells and tissues. However, such expression data is usually relative&mdash;absolute concentrations or molecular copy numbers are typically not known. While this is satisfactory for many applications, for certain kinds of quantitative network modeling and analysis of expression noise, absolute measures of expression are necessary.</p>
<p><b>Results:</b> We propose two methods for estimating molecular copy numbers from single uncalibrated expression images of tissues. These methods rely on expression variability between cells, due either to steady state fluctuations or unequal distribution of molecules during cell division, to make their estimates. We apply these methods to 152 protein fluorescence expression images of <I>Drosophila melanogaster</I> embryos during early development, generating copy number estimates for 14 genes in the segmentation network. We also analyze the effects of noise on our estimators and compare with empirical findings. Finally, we confirm an observation of Bar-Even et al., made in the much different setting of <I>Saccharomyces cerevisiae</I>, that steady state expression variance tends to scale with mean expression.</p>
<p><b>Availability:</b> The data is all drawn from FlyEx (explained within), and is available at <inter-ref locator="http://flyex.ams.sunysb.edu/FlyEx/" locator-type="url">http://flyex.ams.sunysb.edu/FlyEx/</inter-ref>. MATLAB codes for all algorithms described in this paper are available at <inter-ref locator="http://www.cs.mcgill.ca/~perkins/CopyNumber.html" locator-type="url">http://www.cs.mcgill.ca/~perkins/CopyNumber.html</inter-ref>.</p>
<p><b>Contact: </b><inter-ref locator="tperkins@ohri.ca" locator-type="email">tperkins@ohri.ca</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Zamparo, L., Perkins, T.]]></dc:creator>
<dc:date>2009-07-02</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp415</dc:identifier>
<dc:title><![CDATA[Statistical lower bounds on protein copy number from fluorescence expression images]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-02</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp406v1?rss=1">
<title><![CDATA[Unite and conquer: univariate and multivariate approaches for finding differentially expressed gene sets]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp406v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Recently, many univariate and several multivariate approaches have been suggested for testing differential expression of gene sets between different phenotypes. However, despite a wealth of literature studying their performance on simulated and real biological data, there is still a need to quantify their relative performance when they are testing different null hypotheses.</p>
<p><b>Results:</b> In this paper we compare the performance of univariate and multivariate tests on both simulated and biological data. In the simulation study we demonstrate that high correlations equally affect the power of both univariate and multivariate tests. In addition, for most of them the power is similarly affected by the dimensionality of the gene set and by the percentage of genes in the set for which expression is changing between two phenotypes. The application of different test statistics to biological data reveals that three statistics (sum of squared t-tests, Hotelling's T<sup>2</sup>, N-statistic), testing different null hypotheses, find some common but also some complementing differentially expressed gene sets under specific settings. This demonstrates that due to complementing null hypotheses each test projects on different aspects of the data and for the analysis of biological data it is beneficial to use all three tests simultaneously instead of focusing exclusively on just one.</p>
]]></description>
<dc:creator><![CDATA[Glazko, G. V., Emmert-Streib, F.]]></dc:creator>
<dc:date>2009-07-02</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp406</dc:identifier>
<dc:title><![CDATA[Unite and conquer: univariate and multivariate approaches for finding differentially expressed gene sets]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-02</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp403v1?rss=1">
<title><![CDATA[SNP-o-matic]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp403v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> High throughput sequencing technologies generate large amounts of short reads. Mapping these to a reference sequence consumes large amounts of processing time and memory, and read mapping errors can lead to noisy or incorrect alignments. SNP-o-matic is a fast, memory-efficient, and stringent read mapping tool offering a variety of analytical output functions, with an emphasis on genotyping.</p>
<p><b>Availability:</b> <inter-ref locator="http://snpomatic.sourceforge.net" locator-type="url">http://snpomatic.sourceforge.net</inter-ref></p>
<p><b>Contact: </b><inter-ref locator="mm6@sanger.ac.uk" locator-type="email">mm6@sanger.ac.uk</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Manske, H. M., Kwiatkowski, D. P.]]></dc:creator>
<dc:date>2009-07-02</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp403</dc:identifier>
<dc:title><![CDATA[SNP-o-matic]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-02</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp402v1?rss=1">
<title><![CDATA[TTA Lynx: a web-based service for analysis of actinomycete genes containing rare TTA codon]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp402v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> TTA Lynx is a web-based service for finding, assessing and comparing coding sequences that contain TTA codons. This codon is most notable for being a regulatory switch that governs different aspects of the physiology of several GC-rich, Gram-positive bacteria belonging to genus <I>Streptomyces</I>, prolific producers of clinically important natural products. The ever-increasing pace of genome sequencing is creating a huge volume of data that could be utilized to improve our understanding of rare codons in actinomycete biology (and other biological systems). The service described here is designed to facilitate analysis of TTA-containing genes and to assess the importance of TTA-mediated regulation in an organism of interest. This service and its database of organisms with well-known or hypothetical TTA-based regulation provide an opportunity for the identification of such regulation on a genome-wide scale.</p>
<p><b>Availability:</b> <inter-ref locator="http://ttalynx.bio.lnu.edu.ua" locator-type="url">http://ttalynx.bio.lnu.edu.ua</inter-ref></p>
<p><b>Contact: </b><inter-ref locator="ttas@franko.lviv.ua" locator-type="email">ttas@franko.lviv.ua</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Zaburannyy, N., Ostash, B., Fedorenko, V.]]></dc:creator>
<dc:date>2009-07-02</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp402</dc:identifier>
<dc:title><![CDATA[TTA Lynx: a web-based service for analysis of actinomycete genes containing rare TTA codon]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-02</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp414v1?rss=1">
<title><![CDATA[Affy Exon Tissues: Exon Levels in Normal Tissues in Human, Mouse, and Rat]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp414v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> Most genes in human, mouse, and rat produce more than one transcript isoform. The Affymetrix Exon Array is a tool for studying the many processes that regulate RNA production, with separate probesets measuring RNA levels at known and putative exons. For insights on how exon levels vary between normal tissues, we constructed the Affy Exon Tissues track from tissue data published by Affymetrix. This track reports exon probeset intensities as log ratios relative to median values across the dataset and renders them as colored heat maps, to yield quick visual identification of exons with intensities that vary between normal tissues.</p>
<p><b>Availability:</b> Affy Exon Tissues track is freely available under the UCSC Genome Browser (<inter-ref locator="http://genome.ucsc.edu/" locator-type="url">http://genome.ucsc.edu/</inter-ref>) for human (hg18), mouse (mm8 and mm9), and rat (rn4).</p>
<p><b>Contact: </b><inter-ref locator="cline@biology.ucsc.edu" locator-type="email">cline@biology.ucsc.edu</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Pohl, A. A., Sugnet, C. W., Clark, T. A., Smith, K., Fujita, P. A., Cline, M. S.]]></dc:creator>
<dc:date>2009-07-01</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp414</dc:identifier>
<dc:title><![CDATA[Affy Exon Tissues: Exon Levels in Normal Tissues in Human, Mouse, and Rat]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-01</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp413v1?rss=1">
<title><![CDATA[Genome-wide maps of mono- and di-nucleosomes of Aspergillus fumigatus]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp413v1?rss=1</link>
<description><![CDATA[
<p>We identified 6 499 428 mono- and 7 545 410 di-nucleosome positions of the fungus <I>Aspergillus fumigatus</I>, which were detected at high resolution based on the DNA sequence data obtained from both mono- and di-nucleosomal DNA fragments. We show that the distribution of lengths of the mono-nucleosomal DNA fragments has two peaks at 134 nt and 149 nt, whereas the distribution of di-nucleosomal DNA fragment lengths has a single peak at 285 nt. Although the gene bodies of the active and inactive genes and the inactive gene promoters had the two peaks of the mono-nucleosomal DNA fragment lengths, the active gene promoter lost the longer peak at 149 nt. Our findings strongly suggest that the nucleosomes protect longer DNA fragments against MNase at the promoters, thereby inhibiting high gene expression.</p>
<p><b>Contact:</b> <inter-ref locator="hnishida@iu.a.u-tokyo.ac.jp" locator-type="email">hnishida@iu.a.u-tokyo.ac.jp</inter-ref></p>
<p><b>Supplementary information: </b>Supplementary data are available at <I>Bioinformatics online</I>.</p>
]]></description>
<dc:creator><![CDATA[Nishida, H., Motoyama, T., Yamamoto, S., Aburatani, H., Osada, H.]]></dc:creator>
<dc:date>2009-07-01</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp413</dc:identifier>
<dc:title><![CDATA[Genome-wide maps of mono- and di-nucleosomes of Aspergillus fumigatus]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-01</prism:publicationDate>
<prism:section>DISCOVERY NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp412v1?rss=1">
<title><![CDATA[HI: Haplotype Improver using paired-end short reads]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp412v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> We present a program to improve haplotype reconstruction by incorporating information from paired-end reads, and demonstrate its utility on simulated data. We find that given a fixed coverage, longer reads (implying fewer of them) are preferable.</p>
<p><b>Availability:</b> The executable and user manual can be freely downloaded from <inter-ref locator="ftp://ftp.sanger.ac.uk/pub/zn1/HI" locator-type="url">ftp://ftp.sanger.ac.uk/pub/zn1/HI</inter-ref>.</p>
<p><b>Contact: </b><inter-ref locator="ql2@sanger.ac.uk" locator-type="email">ql2@sanger.ac.uk</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Long, Q., MacArthur, D., Ning, Z., Tyler-Smith, C.]]></dc:creator>
<dc:date>2009-07-01</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp412</dc:identifier>
<dc:title><![CDATA[HI: Haplotype Improver using paired-end short reads]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-01</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp411v1?rss=1">
<title><![CDATA[Determining noisy attractors of delayed stochastic Gene Regulatory Networks from multiple data sources]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp411v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Gene regulatory networks (GRN) are stochastic and thus do not have attractors, but can remain in confined regions of the state space, the &lsquo;noisy attractors&rsquo;, which define the cell type and phenotype.</p>
<p><b>Results:</b> We propose a gamma-Bernoulli mixture model clustering algorithm (BMM), tailored for quantizing states from gamma- and Bernoulli-distributed data, to determine the noisy attractors of stochastic GRN. BMM uses multiple data sources, naturally selects the number of states and can be extended to other parametric distributions according to the number and type of data sources available. We apply it to protein and RNA levels, and promoter occupancy state of a toggle switch and show that it can be bistable, tristable, or monostable depending on its internal noise level. We show that these results are in agreement with the patterns of differentiation of model cells whose pathway choice is driven by the switch. We further apply BMM to a model of the MeKS module of <I>Bacillus subtilis</I>, and the results match experimental data, demonstrating the usability of BMM.</p>
<p><b>Availability:</b> Implementation software is available upon request.</p>
<p><b>Contact: </b>andre.sanchesribeiro@tut.fi and xiaofeng.dai@tut.fi</p>
]]></description>
<dc:creator><![CDATA[Dai, X., Yli-Harja, O., Ribeiro, A. S.]]></dc:creator>
<dc:date>2009-07-01</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp411</dc:identifier>
<dc:title><![CDATA[Determining noisy attractors of delayed stochastic Gene Regulatory Networks from multiple data sources]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-01</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp410v1?rss=1">
<title><![CDATA[k-link EST Clustering: evaluating error introduced by chimeric sequences under different degrees of linkage]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp410v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> The clustering of expressed sequence tags (ESTs) is a crucial step in many sequence analysis studies that require a high level of redundancy. Chimeric sequences, while uncommon, can make achieving the optimal EST clustering a challenge. Single-linkage algorithms are particularly vulnerable to the effects of chimeras. To avoid chimera-facilitated erroneous merges, researchers using single-linkage algorithms are forced to use stringent sequence-similarity thresholds. Such thresholds reduce the sensitivity of the clustering algorithm.</p>
<p><b>Results:</b> We introduce the concept of <I>k</I>-link clustering for EST data. We evaluate how clustering error rates vary over a range of linkage thresholds. Using <I>k</I>-link, we show that type II error decreases in response to increasing the number of shared ESTs (i.e. links) required. We observe a base level of type II error likely caused by the presence of unmasked low-complexity or repetitive sequence. We find that type I error increases gradually with increased linkage. To minimise the type I error introduced by increased linkage requirements, we propose an extension to <I>k</I>-link which modifies the required number of links with respect to the size of clusters being compared.</p>
<p><b>Availability:</b> The implementation of k-link is available under the terms of the GPL from <inter-ref locator="http://www.bioinformatics.csiro.au/products.shtml" locator-type="url">http://www.bioinformatics.csiro.au/products.shtml</inter-ref></p>
<p><b>Contact: </b><inter-ref locator="lauren.bragg@csiro.au" locator-type="email">lauren.bragg@csiro.au</inter-ref></p>
<p><b>Supplementary Information: </b>Supplementary data are available at Bioinformatics online.</p>
]]></description>
<dc:creator><![CDATA[Bragg, L. M., Stone, G.]]></dc:creator>
<dc:date>2009-07-01</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp410</dc:identifier>
<dc:title><![CDATA[k-link EST Clustering: evaluating error introduced by chimeric sequences under different degrees of linkage]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-01</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp407v1?rss=1">
<title><![CDATA[Robustness Considerations in Selecting Efficient Two-Color Microarray Designs]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp407v1?rss=1</link>
<description><![CDATA[
<p>The main goal of microarray experiments is to select a small subset of genes that are differentially expressed among competing <I>mRNA</I> samples. For a given set of such <I>mRNA</I> samples, it is possible to consider a number of two-color <I>cDNA</I> microarray designs with a fixed number of arrays. Appropriate criteria can be used to select an efficient design from such a set of alternative experimental designs. In practice, however, microarray expression data often contain missing observations and the most efficient design (with complete observations) for a specific setup may not be efficient in the presence of missing observations. In this paper we propose two criteria to address the robustness of microarray designs against missing observations. We demonstrate the simultaneous use of efficiency and robustness criteria to select good microarray designs for both one-factor and multi-factor experiments.</p>
<p><b>Contact: </b><inter-ref locator="mlatif@isrt.ac.bd" locator-type="email">mlatif@isrt.ac.bd</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Latif, A. H. M. M., Bretz, F., Brunner, E.]]></dc:creator>
<dc:date>2009-07-01</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp407</dc:identifier>
<dc:title><![CDATA[Robustness Considerations in Selecting Efficient Two-Color Microarray Designs]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-01</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp404v1?rss=1">
<title><![CDATA[High throughput minor histocompatibility antigen prediction]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp404v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Minor histocompatibility antigens (mHags) are a diverse collection of MHC-bound peptides that have immunological implications in the context of allogeneic transplantation because of their differential presence in donor and host, and thus play a critical role in the induction of the detrimental graft-versus-host disease (GvHD) or in the development of the beneficial graft-versus-leukemia (GvL) effect. Therefore, the search for mHags has implications not only for preventing GvHD, but also for therapeutic applications involving leukemia-specific T cells. We have created a web-based system, named PeptideCheck, which aims to augment the experimental discovery of mHags using bioinformatic means. Analyzing peptide elution data to search for mHags and predicting mHags from polymorphism and protein databases are core features.</p>
<p><b>Results:</b> Comparison with known mHag data reveals that some but not all of the previously known mHags can be reproduced. By applying a system of filtering and ranking, we were able to produce an ordered list of potential mHag candidates in which HA-1, HA-3, and HA-8 occur in the best 0.25 per cent. By combining SNP, protein, tissue expression, and genotypic frequency data, together with antigen presentation prediction algorithms, we propose a list of the best peptide candidates which could potentially induce the graft versus leukemia effect without causing graft versus host disease.</p>
<p><b>Availability:</b> <inter-ref locator="http://www.peptidecheck.org" locator-type="url">http://www.peptidecheck.org</inter-ref></p>
<p><b>Contact: </b><inter-ref locator="Blasczyk.Rainer@mh-hannover.de" locator-type="email">Blasczyk.Rainer@mh-hannover.de</inter-ref></p>
]]></description>
<dc:creator><![CDATA[DeLuca, D. S., Eiz-Vesper, B., Ladas, N., Khattab, B. A.-M., Blasczyk, R.]]></dc:creator>
<dc:date>2009-07-01</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp404</dc:identifier>
<dc:title><![CDATA[High throughput minor histocompatibility antigen prediction]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-07-01</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp391v1?rss=1">
<title><![CDATA[Biophysical annotation and representation of CellML models]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp391v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> CellML is an implementation-independent model description language for specifying and exchanging biological processes. The focus of CellML is the representation of mathematical formulations of biological processes. The language captures the mathematical and model building constructs well but does not lend itself to capturing the biology these models represent.</p>
<p><b>Results:</b> This paper describes the development of an ontological framework for annotating CellML models with biophysical concepts. We demonstrate that, by using these ontological mappings, in combination with a set of graph reduction rules, it is possible to represent the underlying biological process described in a CellML model.</p>
]]></description>
<dc:creator><![CDATA[Wimalaratne, S. M., Halstead, M. D. B., Lloyd, C. M., Crampin, E. J., Nielsen, P. F.]]></dc:creator>
<dc:date>2009-06-29</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp391</dc:identifier>
<dc:title><![CDATA[Biophysical annotation and representation of CellML models]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-29</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp399v1?rss=1">
<title><![CDATA[HAPLOWSER: a whole-genome haplotype browser for personal genome and metagenome]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp399v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> Haplotype assembly is becoming a very important tool in genome sequencing of human and other organisms. Although haplotypes were previously inferred from genome assemblies, there has never been a comparative haplotype browser that depicts a global picture of whole-genome alignments among haplotypes of different organisms. We introduce a whole-genome HAPLotype brOWSER (HAPLOWSER), providing evolutionary perspectives from multiple aligned haplotypes and functional annotations. Haplowser enables the comparison of haplotypes from metagenomes, and associates conserved regions or the bases at the conserved regions with functional annotations and custom tracks. The associations are quantified for further analysis and presented as pie charts. Functional annotations and custom tracks that are projected onto haplotypes are saved as multiple files in FASTA format. Haplowser provides a user-friendly interface, and can display alignments of haplotypes with functional annotations at any resolution.</p>
<p><b>Availability:</b> Haplowser, written in Java, supports multiple platforms including Windows and Linux. Haplowser is publicly available at <inter-ref locator="http://embio.yonsei.ac.kr/haplowser" locator-type="url">http://embio.yonsei.ac.kr/haplowser</inter-ref></p>
<p><b>Contact: </b><inter-ref locator="sanghyun@cs.yonsei.ac.kr" locator-type="email">sanghyun@cs.yonsei.ac.kr</inter-ref>; <inter-ref locator="lilei@usc.edu" locator-type="email">lilei@usc.edu</inter-ref></p>
<p><b>Supplemental information: </b>Supplemental data are available at <inter-ref locator="http://embio.yonsei.ac.kr/haplowser" locator-type="url">http://embio.yonsei.ac.kr/haplowser</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Kim, J. H., Kim, W.-C., Waterman, M. S., Park, S., Li, L. M.]]></dc:creator>
<dc:date>2009-06-27</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp399</dc:identifier>
<dc:title><![CDATA[HAPLOWSER: a whole-genome haplotype browser for personal genome and metagenome]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-27</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp398v1?rss=1">
<title><![CDATA[ITM Probe: analyzing information flow in protein networks]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp398v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> Founded upon diffusion with damping, <I>ITM Probe</I> is an application for modelling information flow in protein interaction networks without prior restriction to the sub-network of interest. Given a context consisting of desired origins and destinations of information, <I>ITM Probe</I> returns the set of most relevant proteins with weights and a graphical representation of the corresponding sub-network. With a click, the user may send the resulting protein list for enrichment analysis to facilitate hypothesis formation or confirmation.</p>
<p><b>Availability:</b> ITM Probe web service and documentation can be found at <inter-ref locator="www.ncbi.nlm.nih.gov/CBBresearch/qmbp/mn/itm_probe" locator-type="url">www.ncbi.nlm.nih.gov/CBBresearch/qmbp/mn/itm_probe</inter-ref></p>
<p><b>Contact: </b><inter-ref locator="yyu@ncbi.nlm.nih.gov" locator-type="email">yyu@ncbi.nlm.nih.gov</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Stojmirovic, A., Yu, Y.-K.]]></dc:creator>
<dc:date>2009-06-27</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp398</dc:identifier>
<dc:title><![CDATA[ITM Probe: analyzing information flow in protein networks]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-27</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp393v1?rss=1">
<title><![CDATA[IMG ER: A System for Microbial Genome Annotation Expert Review and Curation]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp393v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> A rapidly increasing number of microbial genomes are sequenced by organizations worldwide and are eventually included into various public genome data resources. The quality of the annotations depends largely on the original dataset providers, with erroneous or incomplete annotations often carried over into the public resources and difficult to correct.</p>
<p><b>Results:</b> We have developed an Expert Review (ER) version of the Integrated Microbial Genomes (IMG) system, with the goal of supporting systematic and efficient revision of microbial genome annotations. IMG ER provides tools for the review and curation of annotations of both new and publicly available microbial genomes within IMG's rich integrated genome framework. New genome datasets are included into IMG ER prior to their public release either with their native annotations or with annotations generated by IMG ER's annotation pipeline. IMG ER tools allow addressing annotation problems detected with IMG's comparative analysis tools, such as genes missed by gene prediction pipelines or genes without an associated function. Over the past year, IMG ER was used for improving the annotations of about 150 microbial genomes.</p>
<p><b>Contact: </b><inter-ref locator="vmmarkowitz@lbl.gov" locator-type="email">vmmarkowitz@lbl.gov</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Markowitz, V. M., Mavromatis, K., Ivanova, N. N., Chen, I-M. A., Chu, K.]]></dc:creator>
<dc:date>2009-06-27</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp393</dc:identifier>
<dc:title><![CDATA[IMG ER: A System for Microbial Genome Annotation Expert Review and Curation]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-27</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp397v1?rss=1">
<title><![CDATA[The impact of incomplete knowledge on evaluation: an experimental benchmark for protein function prediction]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp397v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Rapidly expanding repositories of highly informative genomic data have generated increasing interest in methods for protein function prediction and inference of biological networks. The successful application of supervised machine learning to these tasks requires a gold standard for protein function: a trusted set of correct examples, which can be used to assess performance through cross-validation or other statistical approaches. Since gene annotation is incomplete for even the best-studied model organisms, the biological reliability of such evaluations may be called into question.</p>
<p><b>Results:</b> We address this concern by constructing and analyzing an experimentally-based gold standard through comprehensive validation of protein function predictions for mitochondrion biogenesis in <I>Saccharomyces cerevisiae</I>. Specifically, we determine that A) current machine learning approaches are able to generalize and predict novel biology from an incomplete gold standard and B) incomplete functional annotations adversely affect the evaluation of machine learning performance. While computational approaches performed better than predicted in the face of incomplete data, relative comparison of competing approaches - even those employing the same training data - is problematic with a sparse gold standard. Incomplete knowledge causes individual methods' performances to be differentially underestimated, resulting in misleading performance evaluations. We provide a benchmark gold standard for yeast mitochondria to complement current databases and an analysis of our experimental results in the hopes of mitigating these effects in future comparative evaluations.</p>
<p><b>Availability:</b> The mitochondrial benchmark gold standard, as well as experimental results and additional data, is available at <inter-ref locator="http://function.princeton.edu/mitochondria" locator-type="url">http://function.princeton.edu/mitochondria</inter-ref>.</p>
<p><b>Contact: </b><inter-ref locator="ogt@cs.princeton.edu" locator-type="email">ogt@cs.princeton.edu</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Huttenhower, C., Hibbs, M. A., Myers, C. L., Caudy, A. A., Hess, D. C., Troyanskaya, O. G.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp397</dc:identifier>
<dc:title><![CDATA[The impact of incomplete knowledge on evaluation: an experimental benchmark for protein function prediction]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp395v1?rss=1">
<title><![CDATA[Transcriptional landscape estimation from tiling array data using a model of signal shift and drift]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp395v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> High-density oligonucleotide tiling array technology holds the promise of a better description of the complexity and the dynamics of transcriptional landscapes. In organisms such as bacteria and yeasts, transcription can be measured on a genome-wide scale with a resolution higher than 25 bp. The statistical models currently used to handle these data remain however very simple, the most popular being the piecewise constant Gaussian model with a fixed number of breakpoints.</p>
<p><b>Results:</b> This paper describes a new methodology based on a hidden Markov model that embeds the segmentation of a continuous-valued signal in a probabilistic setting. For a computationally affordable cost, this framework (i) alleviates the difficulty of choosing a fixed number of breakpoints, and (ii) permits retrieving more information than a unique segmentation by giving access to the whole probability distribution of the transcription profile. Importantly, the model is also enriched and accounts for subtle effects such as signal "drift" and covariates. Relevance of this framework is demonstrated on a <I>Bacillus subtilis</I> data-set.</p>
<p><b>Availability:</b> The software is distributed under the GPL.</p>
<p><b>Contact:</b> <inter-ref locator="pierre.nicolas@jouy.inra.fr" locator-type="email">pierre.nicolas@jouy.inra.fr</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Nicolas, P., Leduc, A., Robin, S., Rasmussen, S., Jarmer, H., Bessieres, P.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp395</dc:identifier>
<dc:title><![CDATA[Transcriptional landscape estimation from tiling array data using a model of signal shift and drift]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp394v1?rss=1">
<title><![CDATA[Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp394v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> There is a strong demand in the genomic community to develop effective algorithms to reliably identify genomic variants. Indel detection using next-gen data is difficult and identification of long structural variations is extremely challenging.</p>
<p><b>Results:</b> We present Pindel, a pattern growth approach, to detect breakpoints of large deletions and medium sized insertions from paired-end short reads. We use both simulated reads and real data to demonstrate the efficiency of the computer program and accuracy of the results.</p>
<p><b>Availability:</b> The binary code and a short user manual can be freely downloaded from <inter-ref locator="http://www.ebi.ac.uk/~kye/pindel/" locator-type="url">http://www.ebi.ac.uk/~kye/pindel/</inter-ref>.</p>
]]></description>
<dc:creator><![CDATA[Ye, K., Schulz, M. H., Long, Q., Apweiler, R., Ning, Z.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp394</dc:identifier>
<dc:title><![CDATA[Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp392v1?rss=1">
<title><![CDATA[libAnnotationSBML: a library for exploiting SBML annotations]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp392v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> The Systems Biology Markup Language (SBML) is an established community XML format for the markup of biochemical models (Hucka <I>et al.</I>, 2003). With the introduction of SBML level 2 version 3, specific model entities, such as species or reactions, can now be annotated using ontological terms. These annotations, which are encoded using the resource description framework (RDF), provide the facility to specify definite terms to individual components, allowing software to unambiguously identify such components and thus link the models to existing data resources (Kell &amp; Mendes, 2008).</p>
<p>libSBML (Bornstein <I>et al.</I>, 2008) is an application programming interface library for the manipulation of SBML files. While libSBML provides the facilities for reading and writing such annotations from and to models, it is beyond the scope of libSBML to provide interpretation of these terms. The libAnnotationSBML library introduced here acts as a layer on top of libSBML linking SBML annotations to the web services that describe these ontological terms. Two applications that use this library are described: SbmlSynonymExtractor finds name synonyms of SBML model entities, and SbmlReactionBalancer checks SBML files to determine whether specified reactions are elementally balanced.</p>
<p><b>Availability:</b> <inter-ref locator="http://mcisb.sourceforge.net/" locator-type="url">http://mcisb.sourceforge.net/</inter-ref></p>
<p><b>Contact:</b> <inter-ref locator="neil.swainston@manchester.ac.uk" locator-type="email">neil.swainston@manchester.ac.uk</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Swainston, N., Mendes, P.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp392</dc:identifier>
<dc:title><![CDATA[libAnnotationSBML: a library for exploiting SBML annotations]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp390v1?rss=1">
<title><![CDATA[A Genetic Programming Approach for Burkholderia Pseudomallei Diagnostic Pattern Discovery]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp390v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Finding diagnostic patterns for fighting diseases like <I>Burkholderia pseudomallei</I> using biomarkers involves two key issues. First, exhausting all subsets of testable biomarkers (antigens in this context) to find a best one is computationally infeasible. Therefore, a proper optimization approach like evolutionary computation should be investigated. Second, a properly selected function of the antigens as the diagnostic pattern which is commonly unknown is a key to the diagnostic accuracy and the diagnostic effectiveness in clinical use.</p>
<p><b>Results:</b> A conversion function is proposed to convert serum tests of antigens on patients to binary values, based on which Boolean functions are developed as the diagnostic patterns. A genetic programming approach is designed for optimising the diagnostic patterns in terms of their accuracy and effectiveness. During optimisation, the aim is to maximise the coverage (the rate of positive response to antigens) in the infected patients and minimise the coverage in the non-infected patients while keeping the number of testable antigens used in the Boolean functions as small as possible. The final coverage in the infected patients is 96.55% using 17 of 215 (7.4%) antigens with zero coverage in the non-infected patients. Among these 17 antigens, BPSL2697 is the most frequently selected one for the diagnosis of <I>Burkholderia pseudomallei</I>. The approach has been evaluated using both the cross-validation and the Jack-knife simulation methods, with prediction accuracies of 93% and 92%, respectively. A novel approach is also proposed in this study to evaluate a model with binary data using ROC analysis.</p>
]]></description>
<dc:creator><![CDATA[Yang, Z. R., Lertmemongkolchai, G., Tan, G., Felgner, P. L., Titball, R.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp390</dc:identifier>
<dc:title><![CDATA[A Genetic Programming Approach for Burkholderia Pseudomallei Diagnostic Pattern Discovery]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp385v1?rss=1">
<title><![CDATA[Error control variability in pathway-based microarray analysis.]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp385v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> The decision to commit some or many false positives in practice rests with the investigator. Unfortunately, not all error control procedures perform the same. Our problem is to choose an error control procedure to determine a p-value threshold for identifying differentially expressed pathways in high-throughput gene expression studies. Pathway analysis involves fewer tests than differential gene expression analysis, on the order of a few hundred. We discuss and compare methods for error control for pathway analysis with gene expression data.</p>
<p><b>Results:</b> In consideration of the variability in test results, we find that the widely used Benjamini and Hochberg's (BH) false discovery rate (FDR) analysis is less robust than alternative procedures. BH's error control requires a large number of hypothesis tests, a reasonable assumption for differential gene expression analysis, though not the case with pathway-based analysis. Therefore, we advocate, through a series of simulations and applications to real gene expression data, that researchers control the number of false positives rather than the FDR.</p>
<p><b>Availability:</b> Our R package, EPath.omg, is available at <inter-ref locator="http://sphhp.buffalo.edu/biostat/research/software" locator-type="url">http://sphhp.buffalo.edu/biostat/research/software</inter-ref>.</p>
<p><b>Contact:</b> <inter-ref locator="dlgold@buffalo.edu" locator-type="email">dlgold@buffalo.edu</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Gold, D. L., Miecznikowski, J. C., Liu, S.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp385</dc:identifier>
<dc:title><![CDATA[Error control variability in pathway-based microarray analysis.]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp384v1?rss=1">
<title><![CDATA[Comparative Study on ChIP-seq Data: Normalization and Binding Pattern Characterization]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp384v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Antibody-based Chromatin Immunoprecipitation assay followed by high-throughput sequencing technology (ChIP-seq) is a relatively new method to study the binding patterns of specific protein molecules over the entire genome. ChIP-seq technology allows scientists to obtain more comprehensive results in a shorter time. Here we present a nonlinear normalization algorithm and a mixture modeling method for comparing ChIP-seq data from multiple samples and characterizing genes based on their RNA polymerase II (Pol II) binding patterns.</p>
<p><b>Results:</b> We apply a two-step nonlinear normalization method based on a locally weighted regression (LOESS) approach to compare ChIP-seq data across multiple samples and model the difference using an Exponential-Normal<sup>K</sup> mixture model. The fitted model is used to identify genes associated with differential binding sites based on local false discovery rate (<I>fdr</I>). These genes are then standardized and hierarchically clustered to characterize their Pol II binding patterns. As a case study, we apply the analysis procedure to compare a normal breast cancer (MCF7) cell line with a tamoxifen-resistant (OHT) cell line. We find enriched regions that are associated with cancer (p-value &lt; 0.0001). Our findings also imply that there may be a dysregulation of cell cycle and gene expression control pathways in the tamoxifen-resistant cells. These results show that the nonlinear normalization method can be used to analyze ChIP-seq data.</p>
<p><b>Availability:</b> Data is available at <ty><inter-ref locator="http://www.bmi.osu.edu/~khuang/Data/ChIP/RNAPII/" locator-type="url">http://www.bmi.osu.edu/~khuang/Data/ChIP/RNAPII/</inter-ref></ty></p>
<p><b>Contact:</b> <inter-ref locator="cenny.taslim@osumc.edu" locator-type="email">cenny.taslim@osumc.edu</inter-ref>; <inter-ref locator="khuang@bmi.osu.edu" locator-type="email">khuang@bmi.osu.edu</inter-ref></p>
<p><b>Supplementary info:</b> Supplementary figures and tables are available at <I>Bioinformatics</I> online.</p>
]]></description>
<dc:creator><![CDATA[Taslim, C., Wu, J., Yan, P., Singer, G., Parvin, J., Huang, T., Lin, S., Huang, K.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp384</dc:identifier>
<dc:title><![CDATA[Comparative Study on ChIP-seq Data: Normalization and Binding Pattern Characterization]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp382v1?rss=1">
<title><![CDATA[A pipeline for the quantitative analysis of CG dinucleotide methylation using mass spectrometry]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp382v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> DNA cytosine methylation is an important epigenetic regulator, critical for mammalian development and the control of gene expression. Numerous techniques using either restriction enzyme or affinity-based approaches have been developed to interrogate cytosine methylation status genome-wide; however, these assays must be validated by a more quantitative approach, such as MALDI-TOF mass spectrometry of bisulphite-converted DNA (commercialized as Sequenom's EpiTYPER assay using the MassArray system). Here we present an R package ("MassArray") that assists in assay design and uses the standard Sequenom output file as the input to a pipeline of analyses not available as part of the commercial software. The tools in this package include bisulphite conversion efficiency calculation, sequence polymorphism flagging and visualization tools that combine multiple experimental replicates and create tracks for genome browser viewing.</p>
]]></description>
<dc:creator><![CDATA[Thompson, R. F., Suzuki, M., Lau, K. W., Greally, J. M.]]></dc:creator>
<dc:date>2009-06-26</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp382</dc:identifier>
<dc:title><![CDATA[A pipeline for the quantitative analysis of CG dinucleotide methylation using mass spectrometry]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-26</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp388v1?rss=1">
<title><![CDATA[Prestige Centrality-Based Functional Outlier Detection in Gene Expression Analysis]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp388v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Traditional gene expression analysis techniques capture an average gene expression state across sample replicates. However, the average signal across replicates will not capture activated gene networks in different states across replicates. For example, if a particular gene expression network is activated within a subset or all sample replicates, yet the activation state across the sample replicates differs by the specific genes activated in each replicate, the activation of this network will be washed out by averaging across replicates. This situation is likely to occur in single cell gene expression experiments or in noisy experimental settings where a small subpopulation of cells contributes to the gene expression signature of interest.</p>
<p><b>Results and Implementation:</b> In this light, we developed a novel network-based approach which considers gene expression within each replicate across its entire gene expression profile, and identifies outliers across replicates. The power of this method is demonstrated by its ability to enrich for distant metastasis related genes derived from noisy expression data of CD44+CD24-/low tumor initiating cells.</p>
]]></description>
<dc:creator><![CDATA[Torkamani, A., Schork, N. J.]]></dc:creator>
<dc:date>2009-06-23</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp388</dc:identifier>
<dc:title><![CDATA[Prestige Centrality-Based Functional Outlier Detection in Gene Expression Analysis]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-23</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp386v1?rss=1">
<title><![CDATA[SOLpro: accurate sequence-based prediction of protein solubility]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp386v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Protein insolubility is a major obstacle for many experimental studies. A sequence-based prediction method able to accurately predict the propensity of a protein to be soluble on overexpression could be used, for instance, to prioritize targets in large-scale proteomics projects and to identify mutations likely to increase the solubility of insoluble proteins.</p>
<p><b>Results:</b> Here we first curate a large, non-redundant, and balanced training set of more than 17,000 proteins. Next, we extract and study twenty three groups of features computed directly or predicted (e.g. secondary structure) from the primary sequence. The data and the features are used to train a two-stage SVM architecture. The resulting predictor, SOLpro, is compared directly to existing methods and shows significant improvement according to standard evaluation metrics, with an overall accuracy of over 74% estimated using multiple runs of ten-fold cross-validation.</p>
<p><b>Availability:</b> SOLpro is integrated in the SCRATCH suite of predictors and is available for download as a stand-alone application and as a web server at: <inter-ref locator="http://scratch.proteomics.ics.uci.edu" locator-type="url">http://scratch.proteomics.ics.uci.edu</inter-ref>.</p>
<p><b>Contact:</b> <inter-ref locator="pfbaldi@ics.uci.edu" locator-type="email">pfbaldi@ics.uci.edu</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Magnan, C. N., Randall, A., Baldi, P.]]></dc:creator>
<dc:date>2009-06-23</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp386</dc:identifier>
<dc:title><![CDATA[SOLpro: accurate sequence-based prediction of protein solubility]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-23</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp383v1?rss=1">
<title><![CDATA[Swift: Primary Data Analysis for the Illumina Solexa Sequencing Platform]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp383v1?rss=1</link>
<description><![CDATA[
<p>Motivation: Primary data analysis methods are of critical importance in second generation DNA sequencing. Improved methods have the potential to increase yield and reduce the error rates. [Openly documented analysis tools enable the user to understand the primary data; this is important for the optimisation and validity of their scientific work.]</p>
<p>Results: In this paper we describe Swift, a new tool for performing primary data analysis on the Illumina Solexa Sequencing Platform. Swift is the first tool, outside of the vendor's own software, that completes the full analysis process, from raw images through to base-calls. As such it provides an alternative to, and independent validation of, the vendor-supplied tool. Our results show that Swift is able to increase yield by 13.8%, at a comparable error rate.</p>
<p>Availability and Implementation: Swift is implemented in C++ and supported under Linux. It is supplied under an open source license (LGPL3), allowing researchers to build upon the platform. Swift is available from <inter-ref locator="http://swiftng.sourceforge.net" locator-type="url">http://swiftng.sourceforge.net</inter-ref>.</p>
<p>Contact: <inter-ref locator="new@sgenomics.org" locator-type="email">new@sgenomics.org</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Whiteford, N., Skelly, T., Curtis, C., Ritchie, M. E., Lohr, A., Zaranek, A. W., Abnizova, I., Brown, C.]]></dc:creator>
<dc:date>2009-06-23</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp383</dc:identifier>
<dc:title><![CDATA[Swift: Primary Data Analysis for the Illumina Solexa Sequencing Platform]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-23</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp380v1?rss=1">
<title><![CDATA[A survey of across-target bioactivity results of small molecules in PubChem]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp380v1?rss=1</link>
<description><![CDATA[
<p>This work provides an analysis of across-target bioactivity results in the screening data deposited in PubChem. Two alternative approaches for grouping related targets are used to examine a compound's across-target bioactivity. This analysis identifies compounds that are selectively active against groups of protein targets that are identical or similar in sequence. This analysis also identifies compounds that are bioactive across unrelated targets. Statistical distributions of compounds' across-target selectivity provide a survey to evaluate target specificity of compounds by deriving and analyzing bioactivity profiles across a wide range of biological targets for tested small molecules in PubChem. This work enables one to select target-specific inhibitors, identify promiscuous compounds, and better understand the biological mechanisms of target-small molecule interactions.</p>
]]></description>
<dc:creator><![CDATA[Han, L., Wang, Y., Bryant, S. H.]]></dc:creator>
<dc:date>2009-06-23</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp380</dc:identifier>
<dc:title><![CDATA[A survey of across-target bioactivity results of small molecules in PubChem]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-23</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp381v1?rss=1">
<title><![CDATA[FancyGene: dynamic visualization of gene structures and protein domain architectures on genomic loci.]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp381v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> FancyGene is a fast and user-friendly web-based tool for producing images of one or more genes directly on the corresponding genomic locus. Starting from a variety of input formats, FancyGene rebuilds the basic components of a gene (UTRs, introns, exons). Once the initial representation is obtained, the user can superimpose additional features - such as protein domains and/or a variety of biological markers - in specific positions. FancyGene is extremely flexible, allowing the user to change the resulting image dynamically, modifying colors and shapes and adding and/or removing objects. The output images are generated in either PNG or PDF format and can be used for scientific presentations as well as for publications. The PDF format preserves editing capabilities, allowing picture modification using any vector graphics editor.</p>
<p><b>Availability:</b> <inter-ref locator="http://bio.ifom-ieo-campus.it/fancygene" locator-type="url">http://bio.ifom-ieo-campus.it/fancygene</inter-ref></p>
<p><b>Contact:</b> <inter-ref locator="francesca.ciccarelli@ifom-ieo-campus.it" locator-type="email">francesca.ciccarelli@ifom-ieo-campus.it</inter-ref></p>
<p><b>Supplementary Information:</b> Details, examples and tutorials can be found at <inter-ref locator="http://bio.ifom-ieo-campus.it/fancygene/tutorial.html" locator-type="url">http://bio.ifom-ieo-campus.it/fancygene/tutorial.html</inter-ref> and <inter-ref locator="http://bio.ifom-ieo-campus.it/fancygene/help.html" locator-type="url">http://bio.ifom-ieo-campus.it/fancygene/help.html</inter-ref> </p>
]]></description>
<dc:creator><![CDATA[Rambaldi, D., Ciccarelli, F. D.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp381</dc:identifier>
<dc:title><![CDATA[FancyGene: dynamic visualization of gene structures and protein domain architectures on genomic loci.]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp379v1?rss=1">
<title><![CDATA[SHREC: A short-read error correction method]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp379v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Second-generation sequencing technologies produce a massive amount of short reads in a single experiment. However, sequencing errors can cause major problems when using this approach for de novo sequencing applications. Moreover, existing error correction methods have been designed and optimized for shotgun sequencing. Therefore, there is an urgent need for the design of fast and accurate computational methods and tools for error correction of large amounts of short read data.</p>
<p><b>Results:</b> We present SHREC, a new algorithm for correcting errors in short-read data that uses a generalized suffix trie on the read data as the underlying data structure. Our results show that the method can identify erroneous reads with sensitivity and specificity of over 99% and 96% for simulated data with error rates of up to 3% as well as for real data. Furthermore, it achieves an error correction accuracy of over 80% for simulated data and over 88% for real data. These results are clearly superior to previously published approaches. SHREC is available as an efficient open-source Java implementation that allows processing of ten million short reads on a standard workstation.</p>
<p><b>Availability:</b> SHREC source code in JAVA is freely available at <inter-ref locator="http://www.informatik.uni-kiel.de/~jasc/Shrec/" locator-type="url">http://www.informatik.uni-kiel.de/~jasc/Shrec/</inter-ref> </p>
<p><b>Contact: </b> <inter-ref locator="jasc@informatik.uni-kiel.de" locator-type="email">jasc@informatik.uni-kiel.de</inter-ref> </p>
]]></description>
<dc:creator><![CDATA[Schroder, J., Schroder, H., Puglisi, S. J., Sinja, R., Schmidt, B.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp379</dc:identifier>
<dc:title><![CDATA[SHREC: A short-read error correction method]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp378v1?rss=1">
<title><![CDATA[ISOLATE: A computational strategy for identifying the primary origin of cancers using high throughput sequencing]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp378v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> One of the most deadly cancer diagnoses is the carcinoma of unknown primary origin. Without knowledge of the site of origin, treatment regimens are limited in their specificity and result in high mortality rates. Though supervised classification methods have been developed to predict the site of origin based on gene expression data, they require large numbers of previously classified tumors for training, in part because they do not account for sample heterogeneity, which limits their application to well studied cancers.</p>
<p><b>Results:</b> We present ISOLATE, a new statistical method that simultaneously predicts the primary site of origin of cancers and addresses sample heterogeneity, while taking advantage of new high throughput sequencing technology that promises to bring higher accuracy and reproducibility to gene expression profiling experiments. ISOLATE makes predictions <I>de novo</I>, without having seen any training expression profiles of cancers with identified origin. Compared to previous methods, ISOLATE is able to predict the primary site of origin, de-convolve and remove the effect of sample heterogeneity, and identify differentially expressed genes with higher accuracy, across both synthetic and clinical datasets. Methods such as ISOLATE are invaluable tools for clinicians faced with carcinomas of unknown primary origin.</p>
<p><b>Availability:</b> ISOLATE is available for download at: <inter-ref locator="http://morrislab.med.utoronto.ca/software" locator-type="url">http://morrislab.med.utoronto.ca/software</inter-ref></p>
<p><b>Contact:</b> {gerald.quon, quaid.morris}@utoronto.ca</p>
]]></description>
<dc:creator><![CDATA[Quon, G., Morris, Q.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp378</dc:identifier>
<dc:title><![CDATA[ISOLATE: A computational strategy for identifying the primary origin of cancers using high throughput sequencing]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp377v1?rss=1">
<title><![CDATA[Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp377v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Most microbial species cannot be cultured in the lab. Metagenomic sequencing may still yield a complete genome if the sequenced community is enriched and the sequencing coverage is high. However, the complexity in a natural population may cause the enrichment culture to contain multiple related strains. This diversity can confound existing strict assembly programs and lead to a fragmented assembly, which is unnecessary if we have a related reference genome available that can function as a scaffold.</p>
<p><b>Results:</b> Here, we map short metagenomic sequencing reads from a population of strains to a related reference genome, and compose a genome that captures the consensus of the population's sequences. We show that by iteration of the mapping and assembly procedure, the coverage increases while the similarity with the reference genome decreases. This indicates that the assembly becomes less dependent on the reference genome and approaches the consensus genome of the multi-strain population.</p>
<p><b>Contact:</b> <inter-ref locator="dutilh@cmbi.ru.nl" locator-type="email">dutilh@cmbi.ru.nl</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Dutilh, B. E., Huynen, M. A., Strous, M.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp377</dc:identifier>
<dc:title><![CDATA[Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp376v1?rss=1">
<title><![CDATA[Reconstruct Modular Phenotype-Specific Gene Networks by Knowledge-Driven Matrix Factorization]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp376v1?rss=1</link>
<description><![CDATA[
<p>Motivation: Reconstructing gene networks from microarray data has provided mechanistic information on cellular processes. A popular structure learning method, Bayesian network inference, has been used to determine network topology despite its shortcomings, i.e., the high computational cost when analyzing a large number of genes and the inefficiency in exploiting prior knowledge, such as the coregulation information of the genes. To address these limitations, we introduce an alternative method, the knowledge-driven matrix factorization (KMF) framework, to reconstruct phenotype-specific modular gene networks.</p>
<p>Results: Considering the reconstruction of gene network as a matrix factorization problem, we first use the gene expression data to estimate a correlation matrix, and then factorize the correlation matrix to recover the gene modules and the interactions between them. Prior knowledge from Gene Ontology is integrated into the matrix factorization. We applied this KMF algorithm to hepatocellular carcinoma (HepG2) cells treated with free fatty acids (FFAs). By comparing the module networks for the different conditions, we identified the specific modules that are involved in conferring the cytotoxic phenotype induced by palmitate. Further analysis of the gene modules of the different conditions suggested individual genes that play important roles in palmitate-induced cytotoxicity. In summary, KMF can efficiently integrate gene expression data with prior knowledge, thereby providing a powerful method of reconstructing phenotype-specific gene networks and valuable insights into the mechanisms that govern the phenotype.</p>
<p>Contact: <inter-ref locator="krischan@msu.edu" locator-type="email">krischan@msu.edu</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Yang, X., Zhou, Y., Jin, R., Chan, C.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp376</dc:identifier>
<dc:title><![CDATA[Reconstruct Modular Phenotype-Specific Gene Networks by Knowledge-Driven Matrix Factorization]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp375v1?rss=1">
<title><![CDATA[Reconstructing Signaling Pathways from RNAi Data using Probabilistic Boolean Threshold Networks]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp375v1?rss=1</link>
<description><![CDATA[
<p>Motivation: The reconstruction of signaling pathways from gene knockdown data is a novel research field enabled by developments in RNAi screening technology. However, while RNA interference is a powerful technique to identify genes related to a phenotype of interest, their placement in the corresponding pathways remains a challenging problem. Difficulties are aggravated if not all pathway components can be observed after each knockdown, but readouts are only available for a small subset. We are then facing the problem of reconstructing a network from incomplete data.</p>
<p>Results: We infer pathway topologies from gene knockdown data using Bayesian networks with probabilistic Boolean threshold functions. To deal with the problem of under-determined network parameters, we employ a Bayesian learning approach, in which we can integrate arbitrary prior information on the network under consideration. Missing observations are integrated out. We compute the exact likelihood function for smaller networks, and use an approximation to evaluate the likelihood for larger networks. The posterior distribution is evaluated using mode-hopping Markov chain Monte Carlo; distributions over topologies and parameters can then be used to design additional experiments. We evaluate our approach on a small artificial dataset, and present inference results on RNAi data from the Jak/Stat pathway in a human hepatoma cell line.</p>
<p>Availability: Software is available on request.</p>
<p>Contact: <inter-ref locator="lars.kaderali@bioquant.uni-heidelberg.de" locator-type="email">lars.kaderali@bioquant.uni-heidelberg.de</inter-ref></p>
<p>Supplementary Information: Available at Bioinformatics online.</p>
]]></description>
<dc:creator><![CDATA[Kaderali, L., Dazert, E., Zeuge, U., Frese, M., Bartenschlager, R.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp375</dc:identifier>
<dc:title><![CDATA[Reconstructing Signaling Pathways from RNAi Data using Probabilistic Boolean Threshold Networks]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp373v1?rss=1">
<title><![CDATA[VarScan: Variant detection in massively parallel sequencing of individual and pooled samples]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp373v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> Massively parallel sequencing technologies hold incredible promise for the study of DNA sequence variation, particularly the identification of variants affecting human disease. The unprecedented throughput and relatively short read lengths of Roche/454, Illumina/Solexa, and other platforms have spurred development of a new generation of sequence alignment algorithms. Yet detection of sequence variants based on short read alignments remains challenging, and most currently available tools are limited to a single platform or aligner type. We present VarScan, an open source tool for variant detection that is compatible with several short read aligners. We demonstrate VarScan's ability to detect SNPs and indels with high sensitivity and specificity, in both Roche/454 sequencing of individuals and deep Illumina/Solexa sequencing of pooled samples.</p>
<p><b>Availability and Implementation:</b> Source code and documentation freely available at <inter-ref locator="http://genome.wustl.edu/tools/cancer-genomics" locator-type="url">http://genome.wustl.edu/tools/cancer-genomics</inter-ref>, implemented as a Perl package and supported on Linux/UNIX, MS Windows, and Mac OSX.</p>
<p><b>Contact:</b> <inter-ref locator="dkoboldt@genome.wustl.edu" locator-type="email">dkoboldt@genome.wustl.edu</inter-ref></p>
<p><b>Supplementary Information:</b> Supplementary data are available at <I>Bioinformatics</I> online.</p>
]]></description>
<dc:creator><![CDATA[Koboldt, D. C., Chen, K., Wylie, T., Larson, D. E., McLellan, M. D., Mardis, E. R., Weinstock, G. M., Wilson, R. K., Ding, L.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp373</dc:identifier>
<dc:title><![CDATA[VarScan: Variant detection in massively parallel sequencing of individual and pooled samples]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp372v1?rss=1">
<title><![CDATA[GR-Aligner: an algorithm for aligning pairwise genomic sequences containing rearrangement events]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp372v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Homologous genomic sequences between species usually contain different rearrangement events. Whether some specific patterns existed in the breakpoint regions that caused such events to occur is still unclear. To resolve this question, it is necessary to determine the location of breakpoints at the nucleotide level. The availability of sequences near breakpoints would further facilitate the related studies. We thus need a tool that can identify breakpoints and align the neighboring sequences. Although local alignment tools can detect rearrangement events, they only report a set of discontinuous alignments, where the detailed alignments in the breakpoint regions are usually missing. Global alignment tools are even less appropriate for these tasks since most of them are designed to align the conserved regions between sequences in a consistent order, i.e., they do not consider rearrangement events.</p>
<p><b>Results:</b> We propose an effective and efficient pairwise sequence alignment algorithm, called GR-Aligner, which can find breakpoints of rearrangement events by integrating the forward and reverse alignments of the breakpoint regions flanked by homologously rearranged sequences. In addition, GR-Aligner also provides an option to view the alignments of sequences extended to the breakpoints. These outputs provide materials for studying possible evolutionary mechanisms and biological functionalities of the rearrangement.</p>
<p><b>Availability:</b> <inter-ref locator="http://biocomp.iis.sinica.edu.tw/new/GR_Aligner.htm" locator-type="url">http://biocomp.iis.sinica.edu.tw/new/GR_Aligner.htm</inter-ref></p>
<p><b>Contact:</b> <inter-ref locator="arthur@iis.sinica.edu.tw" locator-type="email">arthur@iis.sinica.edu.tw</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Chu, T.-C., Liu, T., Lee, D.T., Lee, G. C., Shih, A. C.-C.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp372</dc:identifier>
<dc:title><![CDATA[GR-Aligner: an algorithm for aligning pairwise genomic sequences containing rearrangement events]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp364v1?rss=1">
<title><![CDATA[Copy number variation has little impact on bead-array-based measures of DNA methylation]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp364v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Integration of various genome-scale measures of molecular alterations is of great interest to researchers aiming to better define disease processes or identify novel targets with clinical utility. Particularly important in cancer are measures of gene copy number and DNA methylation. However, copy number variation may bias the measurement of DNA methylation. To investigate possible bias, we analyzed integrated data obtained from 19 head and neck squamous cell carcinoma (HNSCC) tumors and 23 mesothelioma tumors.</p>
<p><b>Results:</b> Statistical analysis of observational data produced results consistent with those anticipated from theoretical mathematical properties. Average beta value reported by Illumina GoldenGate (a bead-array platform) was significantly smaller than a similar measure constructed from the ratio of average dye intensities. Among CpGs that had only small variations in measured methylation across tumors (filtering out clearly biological methylation signatures), there were no systematic copy number effects on methylation for 3 and 4+ copies; however, 1 copy led to small systematic negative effects, and 0 copies led to substantial significant negative effects.</p>
<p><b>Conclusions:</b> Since mathematical considerations suggest little bias in methylation assayed using bead-arrays, the consistency of observational data with anticipated properties suggests little bias. However, further analysis of systematic copy number effects across CpGs suggests that though there may be little bias when there are copy number gains, small biases may result when 1 allele is lost, and substantial biases when both alleles are lost. These results suggest that further integration of these measures can be useful for characterizing the biological relationships between these somatic events.</p>
]]></description>
<dc:creator><![CDATA[Houseman, E. A., Christensen, B. C., Karagas, M. R., Wrensch, M. R., Nelson, H. H., Wiemels, J. L., Zheng, S., Wiencke, J. K., Kelsey, K. T., Marsit, C. J.]]></dc:creator>
<dc:date>2009-06-19</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp364</dc:identifier>
<dc:title><![CDATA[Copy number variation has little impact on bead-array-based measures of DNA methylation]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-19</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp374v1?rss=1">
<title><![CDATA[A Fast Hybrid Short Read Fragment Assembly Algorithm]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp374v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> The shorter and vastly more numerous reads produced by second-generation sequencing technologies require new tools that can assemble massive numbers of reads in reasonable time. Existing short-read assembly tools can be classified into two categories: <I>greedy extension-based</I> and <I>graph-based</I>. While the graph-based approaches are generally superior in terms of assembly quality, the computer resources required for building and storing a huge graph are very high. In this paper, we present Taipan, an assembly algorithm which can be viewed as a hybrid of these two approaches. Taipan uses greedy extensions for contig construction but at each step realizes enough of the corresponding read graph to make better decisions as to how assembly should continue. We show that this approach can achieve an assembly quality at least as good as the graph-based approaches used in the popular Edena and Velvet assembly tools using a moderate amount of computing resources.</p>
<p><b>Availability and Implementation:</b> Source code in C running on Linux is freely available at <inter-ref locator="http://taipan.sourceforge.net" locator-type="url">http://taipan.sourceforge.net</inter-ref> </p>
<p><b>Contact:</b> <inter-ref locator="asbschmidt@ntu.edu.sg" locator-type="email">asbschmidt@ntu.edu.sg</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Schmidt, B., Sinha, R., Beresford-Smith, B., Puglisi, S. J.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp374</dc:identifier>
<dc:title><![CDATA[A Fast Hybrid Short Read Fragment Assembly Algorithm]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp371v1?rss=1">
<title><![CDATA[A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp371v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation</b>: High-resolution copy-number (CN) analysis has in recent years gained much attention, not only for the purpose of identifying CN aberrations associated with a certain phenotype but also for identifying CN polymorphisms. In order for such studies to be successful and cost effective, the statistical methods have to be optimized. We propose a single-array preprocessing method for estimating full-resolution total CNs. It is applicable to all Affymetrix genotyping arrays, including the recent ones that also contain nonpolymorphic probes. A reference signal is only needed at the last step when calculating relative CNs.</p>
<p><b>Results</b>: As with our method for earlier generations of arrays, this one controls for allelic crosstalk, probe affinities and PCR fragment-length effects. Additionally, it corrects for probe-sequence effects and for the co-hybridization of fragments digested by multiple enzymes that takes place on the latest chips. We compare our method with Affymetrix' CN5 method and the dChip method by assessing how well they differentiate between various CN states at the full resolution and various amounts of smoothing. Although CRMA v2 is a single-array method, we observe that it performs as well as or better than alternative methods that use data from all arrays for their preprocessing. This shows that it is possible to do online analysis in large-scale projects where additional arrays are introduced over time.</p>
<p><b>Availability</b>: A bounded-memory implementation that can process any number of arrays is available in the open-source R package <I>aroma.affymetrix</I>.</p>
<p><b>Contact</b>: <inter-ref locator="hb@stat.berkeley.edu" locator-type="email">hb@stat.berkeley.edu</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Bengtsson, H., Wirapati, P., Speed, T. P.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp371</dc:identifier>
<dc:title><![CDATA[A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp370v1?rss=1">
<title><![CDATA[Reliable Prediction of Protein Thermostability Change upon Double Mutation from Amino Acid Sequence]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp370v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> The accurate prediction of protein stability change upon mutation is one of the important issues for protein design. In this work, we have focused on the stability change of double mutations and systematically analyzed the wild-type and mutant residues, patterns in amino acid sequence and locations of mutants. Based on the sequence information of wild-type, mutant and three neighboring residues, we have presented a weighted decision table method (WET) for predicting the stability changes of 180 double mutants obtained from thermal (G) denaturation. Using a 10-fold cross-validation test, our method showed a correlation of 0.75 between experimental and predicted values of stability changes, and an accuracy of 82.2% for discriminating the stabilizing and destabilizing mutants.</p>
<p><b>Availability:</b> <inter-ref locator="http://bioinformatics.myweb.hinet.net/wetstab.htm" locator-type="url">http://bioinformatics.myweb.hinet.net/wetstab.htm</inter-ref>.</p>
<p><b>Contact:</b> <inter-ref locator="michael-gromiha@aist.go.jp" locator-type="email">michael-gromiha@aist.go.jp</inter-ref></p>
<p><b>Supplementary information:</b> Supplementary data are available at <I>Bioinformatics</I> online.</p>
]]></description>
<dc:creator><![CDATA[Huang, L.-T., Michael Gromiha, M.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp370</dc:identifier>
<dc:title><![CDATA[Reliable Prediction of Protein Thermostability Change upon Double Mutation from Amino Acid Sequence]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp368v1?rss=1">
<title><![CDATA[PhyloBayes 3. A Bayesian software package for phylogenetic reconstruction and molecular dating]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp368v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> A variety of probabilistic models describing the evolution of DNA or protein sequences have been proposed for phylogenetic reconstruction or for molecular dating. However, a common implementation that allows one to freely combine these independent features, so as to test their ability to jointly improve phylogenetic and dating accuracy, is still lacking.</p>
<p><b>Results:</b> We propose a software package, PhyloBayes 3, that can be used for conducting Bayesian phylogenetic reconstruction and molecular dating analyses, using a large variety of amino-acid replacement and nucleotide substitution models, including empirical mixtures or non-parametric models, as well as alternative clock relaxation processes.</p>
<p><b>Availability:</b> PhyloBayes is freely available from our website <inter-ref locator="http://www.phylobayes.org" locator-type="url">http://www.phylobayes.org</inter-ref>. It works under Linux, Mac OS X and Windows operating systems.</p>
<p><b>Contact:</b> <inter-ref locator="nicolas.lartillot@umontreal.ca" locator-type="email">nicolas.lartillot@umontreal.ca</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Lartillot, N., Lepage, T., Blanquart, S.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp368</dc:identifier>
<dc:title><![CDATA[PhyloBayes 3. A Bayesian software package for phylogenetic reconstruction and molecular dating]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp366v1?rss=1">
<title><![CDATA[Automated protein (re)sequencing with MS/MS and a homologous database yields almost full coverage and accuracy]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp366v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> The bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics nowadays for identifying proteins from a sequence database. <I>De novo</I> sequencing software is also available for sequencing novel peptides with relatively short sequence lengths. However, automated sequencing of novel proteins from MS/MS remains a challenging problem.</p>
<p><b>Results:</b> Very often, although the target protein is novel, it has a homologous protein included in a known database. When this happens, we propose a novel algorithm and automated software tool, named Champs, for sequencing the complete protein from MS/MS data of a few enzymatic digestions of the purified protein. Validation with two standard proteins showed that our automated method yields greater than 99% sequence coverage and 100% sequence accuracy on these two proteins. Our method is useful for sequencing novel proteins or "re-sequencing" a protein that has mutations compared with the database protein sequence.</p>
<p><b>Availability:</b> The software is freely available at <inter-ref locator="http://monod.uwaterloo.ca/champs/" locator-type="url">http://monod.uwaterloo.ca/champs/</inter-ref>.</p>
]]></description>
<dc:creator><![CDATA[Liu, X., Han, Y., Yuen, D., Ma, B.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp366</dc:identifier>
<dc:title><![CDATA[Automated protein (re)sequencing with MS/MS and a homologous database yields almost full coverage and accuracy]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp362v1?rss=1">
<title><![CDATA[A Statistical Framework for Protein Quantitation in Bottom-Up MS-based Proteomics]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp362v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Quantitative mass spectrometry-based proteomics requires protein-level estimates and associated confidence measures. Challenges include the presence of low-quality or incorrectly identified peptides and informative missingness. Furthermore, models are required for rolling peptide-level information up to the protein level.</p>
<p><b>Results:</b> We present a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference. The model is applicable to both label-based and label-free quantitation experiments. We also provide automated, model-based, algorithms for filtering of proteins and peptides as well as imputation of missing values. Two LC-MS datasets are used to illustrate the methods. In simulation studies, our methods are shown to achieve substantially more discoveries than standard alternatives.</p>
<p><b>Availability:</b> The software has been made available in the open-source proteomics platform DAnTE (Polpitiya et al. (2008)) (<inter-ref locator="http://omics.pnl.gov/software/" locator-type="url">http://omics.pnl.gov/software/</inter-ref>).</p>
<p><b>Contact:</b> <inter-ref locator="adabney@stat.tamu.edu" locator-type="email">adabney@stat.tamu.edu</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Karpievitch, Y., Stanley, J., Taverner, T., Huang, J., Adkins, J. N., Ansong, C., Heffron, F., Metz, T. O., Qian, W.-J., Yoon, H., Smith, R. D., Dabney, A. R.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp362</dc:identifier>
<dc:title><![CDATA[A Statistical Framework for Protein Quantitation in Bottom-Up MS-based Proteomics]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp361v1?rss=1">
<title><![CDATA[Modeling Multi-Cellular Behavior in Epidermal Tissue Homeostasis via Finite State Machines in Multi-Agent Systems]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp361v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> For the efficient application of multi-agent systems to spatial and functional modeling of tissues, flexible and intuitive modeling tools are needed that allow the graphical specification of cellular behavior in a tissue context without presuming specialized programming skills.</p>
<p><b>Results:</b> We developed a graphical modeling system for multi-agent based simulation of tissue homeostasis. An editor allows the intuitive and hierarchically structured specification of cellular behavior. The models are then automatically compiled into highly efficient source code and dynamically linked to an interactive graphical simulation environment. The system allows the quantitative analysis of the morphological and functional tissue properties emerging from the cell behavioral model. We demonstrate the relevance of the approach using a recently published model of epidermal homeostasis as well as a series of cell cycle models.</p>
<p><b>Availability:</b> The complete software is available in binary executables for MS-Windows and Linux at tiga.uni-hd.de</p>
<p><b>Contact:</b> <inter-ref locator="niels.grabe@bioquant.uni-heidelberg.de" locator-type="email">niels.grabe@bioquant.uni-heidelberg.de</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Sutterlin, T., Huber, S., Dickhaus, H., Grabe, N.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp361</dc:identifier>
<dc:title><![CDATA[Modeling Multi-Cellular Behavior in Epidermal Tissue Homeostasis via Finite State Machines in Multi-Agent Systems]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp360v1?rss=1">
<title><![CDATA[An investigation into the population abundance distribution of mRNAs, proteins, and metabolites in biological systems]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp360v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Distribution analysis is one of the most basic forms of statistical analysis. Thanks to improved analytical methods, accurate and extensive quantitative measurements can now be made of the mRNA, protein, and metabolites species from biological systems. Here we report a large-scale analysis of the population abundance distributions of the transcriptomes, proteomes, and metabolomes from varied biological systems.</p>
<p><b>Results:</b> We compared the observed empirical distributions with a number of distributions: power law, lognormal, loglogistic, loggamma, right Pareto-lognormal, and double Pareto-lognormal. The best-fit for mRNA, protein, and metabolite population abundance distributions was found to be the double Pareto-lognormal. This distribution behaves like a lognormal distribution around the centre, and like a power law distribution in the tails. To better understand the cause of this observed distribution we explored a simple stochastic model based on geometric Brownian motion. The distribution indicates that multiplicative effects are causally dominant in biological systems. We speculate that these effects arise from chemical reactions: the central-limit theorem then explains the central lognormal, and a number of possible mechanisms could explain the long tails - positive feedback effects, network topology, etc. Many of the components in the central lognormal parts of the empirical distributions are unidentified and/or have unknown function. This indicates that much more biology awaits discovery.</p>
<p><b>Contact:</b> <inter-ref locator="rdk@aber.ac.uk" locator-type="email">rdk@aber.ac.uk</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Lu, C., King, R. D.]]></dc:creator>
<dc:date>2009-06-17</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp360</dc:identifier>
<dc:title><![CDATA[An investigation into the population abundance distribution of mRNAs, proteins, and metabolites in biological systems]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-17</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp369v1?rss=1">
<title><![CDATA[A CitationRank Algorithm Inheriting Google Technology Designed to Highlight Genes Responsible for Serious Adverse Drug Reaction]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp369v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Serious adverse drug reaction (SADR) is an urgent, world-wide problem. In the absence of any well organized gene-oriented SADR information pool, a database should be constructed. Since the importance of a gene to a particular SADR cannot simply be defined in terms of how frequently the two are cited together in the literature, an algorithm should be devised to sort genes according to their relevance to the SADR topics.</p>
<p><b>Results:</b> The SADR-Gengle database, which is made up of gene-SADR relationships extracted from PubMed, has been constructed, covering six major SADRs, namely cholestasis, deafness, muscle toxicity, QT prolongation, Stevens-Johnson syndrome and torsades de pointes. The CitationRank algorithm, which inherits the principle of the Google PageRank algorithm that a gene should be highly ranked when biologically related to other highly ranked genes, is devised. The algorithm performs robustly in recovering SADR related genes in the presence of extraneous noise, and the use of the algorithm has been extended to sorting genes in our database. Users can browse genes in a Google-type system where genes are ordered according to their descending relevance to the SADR topic selected by the user. The database also provides users with visualized gene-gene knowledge chain networks, helping them to systematize their gene-oriented knowledge chain whilst navigating these networks.</p>
<p><b>Availability:</b> The SADR-Gengle is freely available at <inter-ref locator="http://Gengle.Bio-X.cn/SADR/" locator-type="url">http://Gengle.Bio-X.cn/SADR/</inter-ref>.</p>
<p><b>Contact:</b> Lin He helinhelin@gmail.com or Lun Yang Lun.Yang@gmail.com</p>
]]></description>
<dc:creator><![CDATA[Yang, L., Xu, L., He, L.]]></dc:creator>
<dc:date>2009-06-15</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp369</dc:identifier>
<dc:title><![CDATA[A CitationRank Algorithm Inheriting Google Technology Designed to Highlight Genes Responsible for Serious Adverse Drug Reaction]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-15</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp367v1?rss=1">
<title><![CDATA[De novo Transcriptome Assembly with ABySS]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp367v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Whole transcriptome shotgun sequencing data from non-normalized samples offer unique opportunities to study the metabolic states of organisms. One can deduce gene expression levels using sequence coverage as a surrogate, identify coding changes or discover novel isoforms or transcripts. Especially for discovery of novel events, <I>de novo</I> assembly of transcriptomes is desirable.</p>
<p><b>Results:</b> Transcriptome from tumor tissue of a patient with follicular lymphoma was sequenced with 36 base-pair (bp) single- and paired-end reads on the Illumina Genome Analyzer II platform. We assembled approximately 194 million reads using ABySS into 66,921 contigs 100bp or longer, with a maximum contig length of 10,951bp, representing over 30 million base pairs of unique transcriptome sequence, or roughly 1% of the genome.</p>
<p><b>Availability and Implementation:</b> Source code and binaries of ABySS are freely available for download at <inter-ref locator="http://www.bcgsc.ca/platform/bioinfo/software/abyss" locator-type="url">http://www.bcgsc.ca/platform/bioinfo/software/abyss</inter-ref>. The assembler tool is implemented in C++. The parallel version uses Open MPI. The explorer tool is implemented in Java using the Java universal network/graph framework.</p>
<p><b>Contact:</b> Software help: abyss@bcgsc.ca, authors {ibirol, sjackman, cydneyn, jqian, sjones}@bcgsc.ca</p>
]]></description>
<dc:creator><![CDATA[Birol, I., Jackman, S. D, Nielsen, C., Qian, J. Q, Varhol, R., Stazyk, G., Morin, R. D, Zhao, Y., Hirst, M., Schein, J. E, Horsman, D. E, Connors, J. M, Gascoyne, R. D, Marra, M. A, Jones, S. J.]]></dc:creator>
<dc:date>2009-06-15</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp367</dc:identifier>
<dc:title><![CDATA[De novo Transcriptome Assembly with ABySS]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-15</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp365v1?rss=1">
<title><![CDATA[Inferring Progression Models for CGH data]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp365v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> One of the mutational processes that has been monitored genome-wide is the occurrence of regional DNA <I>Copy Number Alterations (CNAs)</I>, which may lead to deletion or over-expression of tumor suppressors or oncogenes, respectively. Understanding the relationship between CNAs and different cancer types is a fundamental problem in cancer studies.</p>
<p><b>Results:</b> This paper develops an efficient method that can accurately model the progression of the cancer markers and reconstruct evolutionary relationship between multiple types of cancers using Comparative Genomic Hybridization (CGH) data. Such modeling can lead to better understanding of the commonalities and differences between multiple cancer types and potential therapies. We have developed an automatic method to infer a graph model for the markers of multiple cancers from a large population of CGH data. Our method identifies highly related markers across different cancer types. It then builds a directed acyclic graph that shows the evolutionary history of these markers based on how common each marker is in different cancer types. We demonstrated the use of this model in determining the importance of markers in cancer evolution. We have also developed a new method to measure the evolutionary distance between different cancers based on their markers. This method employs the graph model we developed for the individual markers to measure the distance between pairs of cancers. We used this measure to create an evolutionary tree for multiple cancers.</p>
<p>Our experiments on the Progenetix database show that our markers are largely consistent with the reported hot-spot imbalances and most frequent imbalances. The results show that our distance measure can accurately reconstruct the evolutionary relationship between multiple cancer types.</p>
<p><b>Availability:</b> All the code developed in this paper is available at <inter-ref locator="http://bioinformatics.cise.ufl.edu/phylogeny.html" locator-type="url">http://bioinformatics.cise.ufl.edu/phylogeny.html</inter-ref>.</p>
<p><b>Contact:</b> nirmalya@cise.ufl.edu</p>
]]></description>
<dc:creator><![CDATA[Liu, J., Bandyopadhyay, N., Ranka, S., Baudis, M, Kahveci, T.]]></dc:creator>
<dc:date>2009-06-15</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp365</dc:identifier>
<dc:title><![CDATA[Inferring Progression Models for CGH data]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-15</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp363v1?rss=1">
<title><![CDATA[Comments on the Analysis of Unbalanced Microarray Data]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp363v1?rss=1</link>
<description><![CDATA[
<p><b>Motivation:</b> Permutation testing is very popular for analyzing microarray data to identify differentially expressed genes; estimating false discovery rates is a very popular way to address the inherent multiple testing problem. However, combining these approaches may be problematic when sample sizes are unequal.</p>
<p><b>Results:</b> With unbalanced data, permutation tests may not be suitable because they do not test the hypothesis of interest. In addition, permutation tests can be biased. Using biased p-values to estimate the false discovery rate can produce unacceptable bias in those estimates. Results also show that the approach of pooling permutation null distributions across genes can produce invalid p-values, since even non-differentially-expressed genes can have different permutation null distributions. We encourage researchers to use statistics that have been shown to reliably discriminate differentially-expressed genes, but caution that associated p-values may be either invalid, or a less effective metric for discriminating differentially-expressed genes.</p>
]]></description>
<dc:creator><![CDATA[Kerr, K. F.]]></dc:creator>
<dc:date>2009-06-15</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp363</dc:identifier>
<dc:title><![CDATA[Comments on the Analysis of Unbalanced Microarray Data]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-15</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp357v1?rss=1">
<title><![CDATA[PROMISE: A Tool to Identify Genomic Features with a Specific Biologically Interesting Pattern of Associations with Multiple Endpoint Variables]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp357v1?rss=1</link>
<description><![CDATA[
<p>Projection onto the most interesting statistical evidence (PROMISE) is proposed as a general procedure to identify genomic variables that exhibit a specific biologically interesting pattern of association with multiple endpoint variables. Biological knowledge of the endpoint variables is used to define a vector that represents the biologically most interesting values for statistics that characterize the associations of the endpoint variables with a genomic variable. A test statistic is defined as the dot-product of the vector of the observed association statistics and the vector of most interesting values of the association statistics. By definition, this test statistic is proportional to the length of the projection of the observed vector of correlations onto the vector of most interesting associations. Statistical significance is determined via permutation. In simulation studies and an example application, PROMISE shows greater statistical power to identify genes with the interesting pattern of associations than classical multivariate procedures, individual endpoint analyses, or listing genes that have the pattern of interest and are significant in more than one individual endpoint analysis. Documented R routines to implement PROMISE are freely available from <inter-ref locator="www.stjuderesearch.org/depts/biostats" locator-type="url">www.stjuderesearch.org/depts/biostats</inter-ref> and will soon be available as a Bioconductor package from <inter-ref locator="www.bioconductor.org" locator-type="url">www.bioconductor.org</inter-ref>.</p>
]]></description>
<dc:creator><![CDATA[Pounds, S., Cheng, C., Cao, X., Crews, K. R., Plunkett, W., Gandhi, V., Rubnitz, J., Ribeiro, R. C., Downing, J. R., Lamba, J.]]></dc:creator>
<dc:date>2009-06-15</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp357</dc:identifier>
<dc:title><![CDATA[PROMISE: A Tool to Identify Genomic Features with a Specific Biologically Interesting Pattern of Associations with Multiple Endpoint Variables]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-15</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp335v1?rss=1">
<title><![CDATA[InterMap3D: predicting and visualizing co-evolving protein residues]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp335v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> InterMap3D predicts co-evolving protein residues and plots them on the 3D protein structure. Starting with a single protein sequence, InterMap3D automatically finds a set of homologous sequences, generates an alignment and fetches the most similar 3D structure from the Protein Data Bank (PDB). It can also accept a user-generated alignment. Based on the alignment, co-evolving residues are then predicted using three different methods: RCW Mutual Information, Mutual Information/Entropy and Dependency. Finally, InterMap3D generates high-quality images of the protein with the predicted co-evolving residues highlighted.</p>
<p><b>Availability:</b> <inter-ref locator="http://www.cbs.dtu.dk/services/InterMap3D/" locator-type="url">http://www.cbs.dtu.dk/services/InterMap3D/</inter-ref></p>
<p><b>Contact:</b> <inter-ref locator="gorm@cbs.dtu.dk" locator-type="email">gorm@cbs.dtu.dk</inter-ref>.</p>
]]></description>
<dc:creator><![CDATA[Gouveia-Oliveira, R., Roque, F. S, Wernersson, R., Sicheritz-Ponten, T., Sackett, P. W, Molgaard, A., Pedersen, A. G]]></dc:creator>
<dc:date>2009-06-15</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp335</dc:identifier>
<dc:title><![CDATA[InterMap3D: predicting and visualizing co-evolving protein residues]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-15</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp359v1?rss=1">
<title><![CDATA[A pattern-based nearest neighbor search approach for promoter prediction using DNA structural profiles]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp359v1?rss=1</link>
<description><![CDATA[
<p>Motivation: Identification of core promoters is a key clue in understanding gene regulations. However, due to the diverse nature of promoter sequences, the accuracy of existing prediction approaches for non-CpG island (simply CGI)-related promoters is not as high as that for CGI-related promoters. This consequently leads to a low genome-wide promoter prediction accuracy.</p>
<p>Results: In this paper, we first systematically analyze the similarities and differences between the two types of promoters (CGI-related and non-CGI-related) from a novel structural perspective, and then devise a unified framework, called PNNP (Pattern-based Nearest Neighbor search for Promoter), to predict both CGI-related and non-CGI-related promoters based on their structural features. Our comparative analysis of the structural characteristics of promoters reveals two interesting facts: 1) the structural values of CGI-related and non-CGI-related promoters are quite different, but they exhibit nearly similar structural patterns; 2) the structural patterns of promoters are obviously different from those of non-promoter sequences, though the sequences have almost similar structural values. Extensive experiments demonstrate that the proposed PNNP approach is effective in capturing the structural patterns of promoters, and can significantly improve genome-wide performance of promoter prediction, especially non-CGI-related promoter prediction.</p>
<p>Availability: The implementation of the program PNNP is available at <inter-ref locator="http://admis.tongji.edu.cn/Projects/pnnp.aspx" locator-type="url">http://admis.tongji.edu.cn/Projects/pnnp.aspx</inter-ref>.</p>
<p>Contact: <inter-ref locator="jhguan@tongji.edu.cn" locator-type="email">jhguan@tongji.edu.cn</inter-ref>; <inter-ref locator="sgzhou@fudan.edu.cn" locator-type="email">sgzhou@fudan.edu.cn</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Gan, Y., Guan, J., Zhou, S.]]></dc:creator>
<dc:date>2009-06-10</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp359</dc:identifier>
<dc:title><![CDATA[A pattern-based nearest neighbor search approach for promoter prediction using DNA structural profiles]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-10</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp356v1?rss=1">
<title><![CDATA[Reordering contigs of draft genomes using the Mauve Aligner]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp356v1?rss=1</link>
<description><![CDATA[
<p><b>Summary:</b> Mauve Contig Mover provides a new method for proposing the relative order of contigs that make up a draft genome based on comparison to a complete or draft reference genome. A novel application of the Mauve aligner and viewer provides an automated reordering algorithm coupled with a powerful drill-down display allowing detailed exploration of results.</p>
<p><b>Availability:</b> The software is available for download at <inter-ref locator="http://gel.ahabs.wisc.edu/mauve" locator-type="url">http://gel.ahabs.wisc.edu/mauve</inter-ref>.</p>
<p><b>Contact:</b> <inter-ref locator="rissman@wisc.edu" locator-type="email">rissman@wisc.edu</inter-ref>.</p>
<p><b>Supplemental information:</b> Supplemental data are available from Bioinformatics online and <inter-ref locator="http://gel.ahabs.wisc.edu" locator-type="url">http://gel.ahabs.wisc.edu</inter-ref>.</p>
]]></description>
<dc:creator><![CDATA[Rissman, A. I, Mau, B., Biehl, B. S., Darling, A. E., Glasner, J. D., Perna, N. T.]]></dc:creator>
<dc:date>2009-06-10</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp356</dc:identifier>
<dc:title><![CDATA[Reordering contigs of draft genomes using the Mauve Aligner]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-10</prism:publicationDate>
<prism:section>APPLICATIONS NOTE</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp355v1?rss=1">
<title><![CDATA[Executing Multicellular Differentiation: Quantitative Predictive Modelling of C. elegans Vulval Development]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp355v1?rss=1</link>
<description><![CDATA[
<p>Motivation: Understanding the processes involved in multi-cellular pattern formation is a central problem of developmental biology, hopefully leading to many new insights, e.g., in the treatment of various diseases. Defining suitable computational techniques for development modelling, able to perform in silico simulation experiments, is an open and challenging problem.</p>
<p>Results: Previously, we proposed a coarse-grained, quantitative approach based on the basic Petri net formalism, to mimic the behaviour of the biological processes during multicellular differentiation. Here we apply our modelling approach to the well-studied process of C. elegans vulval development. We show that our model correctly reproduces a large set of in vivo experiments with statistical accuracy. It also generates gene expression time series in accordance with recent biological evidence. Finally, we modelled the role of microRNA mir-61 during vulval development and predict its contribution in stabilising cell pattern formation.</p>
<p>Contact: <inter-ref locator="feenstra@few.vu.nl" locator-type="email">feenstra@few.vu.nl</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Bonzanni, N., Krepska, E., Feenstra, K. A., Fokkink, W., Kielmann, T., Bal, H., Heringa, J.]]></dc:creator>
<dc:date>2009-06-10</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp355</dc:identifier>
<dc:title><![CDATA[Executing Multicellular Differentiation: Quantitative Predictive Modelling of C. elegans Vulval Development]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-10</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp341v1?rss=1">
<title><![CDATA[Visual and Statistical Comparison of Metagenomes]]></title>
<link>http://bioinformatics.oxfordjournals.org/cgi/content/short/btp341v1?rss=1</link>
<description><![CDATA[
<p><b>Background:</b> Metagenomics is the study of the genomic content of an environmental sample of microbes. Advances in the throughput and cost-efficiency of sequencing technology are fueling a rapid increase in the number and size of metagenomic datasets being generated. Bioinformatics is faced with the problem of how to handle and analyze these datasets in an efficient and useful way. One goal of these metagenomic studies is to get a basic understanding of the microbial world both surrounding us and within us. One major challenge is how to compare multiple datasets. Furthermore, there is a need for bioinformatics tools that can process many large datasets and are easy to use.</p>
<p><b>Results:</b> This paper describes two new and helpful techniques for comparing multiple metagenomic datasets. The first is a visualization technique for multiple datasets and the second is a new statistical method for highlighting the differences in a pairwise comparison. We have developed implementations of both methods that are suitable for very large datasets and provide these in Version 3 of our stand-alone metagenome analysis tool MEGAN.</p>
<p><b>Conclusion:</b> These new methods are suitable for the visual comparison of many large metagenomes and the statistical comparison of two metagenomes at a time. Nevertheless, more work needs to be done to support the comparative analysis of multiple metagenome datasets.</p>
<p><b>Availability:</b> Version 3 of MEGAN, which implements all ideas presented in this paper, can be obtained from our website at: <inter-ref locator="www-ab.informatik.uni-tuebingen.de/software/megan" locator-type="url">www-ab.informatik.uni-tuebingen.de/software/megan</inter-ref>.</p>
<p><b>Contact:</b> <inter-ref locator="mitra@informatik.uni-tuebingen.de" locator-type="email">mitra@informatik.uni-tuebingen.de</inter-ref></p>
]]></description>
<dc:creator><![CDATA[Mitra, S., Klar, B., Huson, D. H.]]></dc:creator>
<dc:date>2009-06-10</dc:date>
<dc:identifier>info:doi/10.1093/bioinformatics/btp341</dc:identifier>
<dc:title><![CDATA[Visual and Statistical Comparison of Metagenomes]]></dc:title>
<dc:publisher>Oxford University Press</dc:publisher>
<prism:publicationDate>2009-06-10</prism:publicationDate>
<prism:section>ORIGINAL PAPER</prism:section>
</item>

<item rdf:about="http://bioinformatics.oxfordjournals.org/cgi/content/short/btp358v1?rss=1">
<title><![CDATA[Structural and practical identifiability analysis of partially obser