Hopper: software for automating data tracking and flow in DNA sequencing
Department of Molecular Biotechnology, Box 357730, University of Washington, Seattle, WA 98195, USA
1 To whom reprint requests should be addressed
MOTIVATION: Genome-scale DNA sequencing is a multistep process in which large numbers of small template clones are propagated, purified, sequenced and analyzed on acrylamide gels. A significant challenge for these projects is the scale at which data handling must be done. Hence, large-scale sequencing facilities will benefit from tracking template DNA information (purification methods, reaction and electrophoresis conditions) in a systematic fashion. The lack of software tools that support automated sample entry and automatic data storage, retrieval and analysis is a major hindrance to recording and using laboratory workflow information to monitor the overall quality of data production.
RESULTS: The UNIX file system has been used to prototype automation of the flow of data from the ABI sequencer to a data repository. Data are automatically processed by a central Perl program, Hopper, which runs a series of programs that analyze data quality (read length estimate, fraction of indeterminate bases, and number of contaminating and repetitive sequences), assemble shotgun sequence data, and generate simple reports describing the results.
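One of the quality measures named above, the fraction of indeterminate bases, can be illustrated with a minimal sketch. This is not Hopper's code (Hopper is written in Perl); it is a Python illustration that assumes any base call other than A, C, G or T counts as indeterminate. The function names, the tab-separated report layout and the 0.05 cutoff are all hypothetical.

```python
def indeterminate_fraction(read: str) -> float:
    """Return the fraction of base calls in `read` that are not A, C, G or T.

    Assumption: any other symbol (for example N) is an indeterminate call.
    Whitespace is ignored; an empty read yields 0.0.
    """
    bases = "".join(read.split()).upper()
    if not bases:
        return 0.0
    ambiguous = sum(1 for base in bases if base not in "ACGT")
    return ambiguous / len(bases)


def summarize(reads: dict[str, str], max_fraction: float = 0.05) -> list[str]:
    """Build one tab-separated report line per read, sorted by read name.

    `max_fraction` is a hypothetical cutoff, not a value taken from the paper.
    """
    lines = []
    for name, seq in sorted(reads.items()):
        bases = "".join(seq.split())
        frac = indeterminate_fraction(bases)
        status = "ok" if frac <= max_fraction else "check"
        lines.append(f"{name}\t{len(bases)}\t{frac:.3f}\t{status}")
    return lines


if __name__ == "__main__":
    # Toy reads for illustration only; these are not real sequencing data.
    for line in summarize({"read1": "ACGTACGT", "read2": "ACNNGTNN"}):
        print(line)
```

Each report line lists the read name, its length, the indeterminate fraction and a pass/check flag, which is roughly the kind of simple per-read summary the abstract describes.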
AVAILABILITY: This software is freely available on the WWW (http://www.genome.washington.edu/docs/hopper).
CONTACT: E-mail: T.M.Smith, soundbat{at}u.washington.edu
Received on August 26, 1996; accepted on November 20, 1996