A picture is worth a thousand base pairs

Anna Nowogrodzki in Nature:

When Adam Siepel was building algorithms for evolutionary genomics as part of his PhD, he wasn’t thinking about visualization. But, as a graduate student in the laboratory of computational biologist David Haussler, at the University of California, Santa Cruz (UCSC), he happened to sit next to the software engineers who were building and maintaining a tool called the UCSC Genome Browser. These engineers helped Siepel to make his algorithms publicly available as a track, or data overlay, that anyone could explore. Genome browsers are graphical tools that display the genome sequence, usually as a horizontal line. Other sequence-associated data are aligned and stacked above and below that line in ‘tracks’, for instance to illustrate the relationship between gene expression, DNA modification and protein-binding sites.

Siepel’s track identifies sequences that have been retained over evolutionary time; when a user applies it while viewing the alignment of genomic data from two or more species, the track highlights regions that are evolutionarily conserved. Allowing others to use the algorithm to highlight regions of interest in their own data was “probably the single most important thing I did during my PhD”, says Siepel, who is now a computational biologist at Cold Spring Harbor Laboratory in New York. Other researchers have used it, for instance, to find mutations associated with diseases and to pinpoint functionally important regions of noncoding RNA molecules.

More here.