An industry/academic collaboration led by COV-IRT co-founders Krista Ternus and Todd Treangen has released SeqScreen, an open-source workflow for detecting and characterizing DNA sequences of interest.
Uncovering the taxonomic origin and functional profile of gene sequences is of high interest to biosecurity professionals and fundamental researchers, but the task is challenging and nuanced. SeqScreen combines multiple open-source tools to predict, for a given query sequence, the most probable source organism, the diversity of possible taxonomic lineages, functional gene annotations, and custom biological processes of interest. SeqScreen is useful for annotating full-length gene sequences in assembled genomes, as well as for extracting as much information as possible from short gene fragments in metagenomes, metatranscriptomes, or individual sequences (>50 bp).
SeqScreen runs the following workflows in sequence:
- Initialization: Preprocessing and input validation
- SeqMapper: Rapid alignment of the input queries against a custom database
- Taxonomic Identification: Sensitive taxonomic classification of the query sequences
- Functional Annotation: Biological process and molecular function GO term predictions
- Final Report Generation with Biological Process of Interest (BPoI) Assignments: Summary of findings in an easy-to-parse report
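To give a feel for what the Initialization step's preprocessing and input validation might involve, here is a minimal Python sketch. It is not SeqScreen's code: the function names `parse_fasta` and `validate_queries` are hypothetical, and the sketch assumes that queries arrive as FASTA text, that only the characters A, C, G, T, and N are accepted, and that the ">50 bp" guidance above is applied as a strict length cutoff.

```python
from typing import Iterator, List, Tuple

MIN_LENGTH_BP = 50            # assumption: taken from the ">50 bp" guidance
ALLOWED_BASES = set("ACGTN")  # assumption: plain DNA alphabet plus N


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""  # ID is the first token
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first FASTA header")
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def validate_queries(text: str) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split queries into valid records and human-readable problem messages."""
    valid: List[Tuple[str, str]] = []
    problems: List[str] = []
    for seq_id, seq in parse_fasta(text):
        if not set(seq) <= ALLOWED_BASES:
            problems.append(f"{seq_id}: contains characters outside ACGTN")
        elif len(seq) <= MIN_LENGTH_BP:
            problems.append(f"{seq_id}: {len(seq)} bp is not longer than {MIN_LENGTH_BP} bp")
        else:
            valid.append((seq_id, seq))
    return valid, problems
```

Calling `validate_queries` on a FASTA string returns the records that pass both checks, plus a list of messages for the rest, so problem queries can be reported before any alignment work starts.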
The SeqScreen workflow is available on GitLab.