Wednesday, October 27 • 3:00pm - 3:15pm
SeqScreen: Accurate and Sensitive Functional Screening of Pathogenic Sequences via Ensemble Learning - Auditorium

Modern benchtop DNA synthesis techniques and increased concern about emerging pathogens have elevated the importance of screening oligonucleotides for pathogens of concern. However, accurate and sensitive characterization of oligonucleotides remains an open challenge for many current techniques and ontology-based tools. To address this gap, we have developed a novel software tool, SeqScreen, that accurately and sensitively characterizes short DNA sequences using a set of curated Functions of Sequences of Concern (FunSoCs): novel functional labels, specific to microbial pathogenesis, that describe the pathogenic potential of individual proteins. SeqScreen uses ensemble machine learning models, encompassing multi-stage neural networks and support vector classifiers, to label query sequences with FunSoCs with high accuracy; the labeling is framed as an imbalanced multi-class, multi-label classification task. In summary, SeqScreen represents a first step towards a novel paradigm of functionally informed pathogen characterization from genomic and metagenomic datasets. SeqScreen is open-source and freely available for download at: www.gitlab.com/treangenlab/seqscreen
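
To make the idea of ensemble, multi-label classification concrete, here is a minimal sketch in plain Python. It is not SeqScreen's implementation: the averaging ("soft voting") rule, the 0.5 threshold, the function name `ensemble_multilabel`, the placeholder labels `funsoc_a` to `funsoc_c`, and the scores are all assumptions chosen for illustration. It only shows how per-label scores from two models could be combined so that one query sequence receives zero, one, or several labels.

```python
from typing import Dict, List


def ensemble_multilabel(score_sets: List[Dict[str, float]],
                        threshold: float = 0.5) -> List[str]:
    """Average per-label scores across models and keep every label
    whose mean score meets the threshold (multi-label output)."""
    if not score_sets:
        raise ValueError("at least one model's scores are required")
    # Union of all labels reported by any model, in a stable order.
    labels = sorted({label for scores in score_sets for label in scores})
    selected = []
    for label in labels:
        # A model that does not report a label contributes 0.0 for it.
        mean = sum(s.get(label, 0.0) for s in score_sets) / len(score_sets)
        if mean >= threshold:
            selected.append(label)
    return selected


# Hypothetical per-label scores for one query sequence from two models
# (standing in for a neural network and a support vector classifier).
nn_scores = {"funsoc_a": 0.9, "funsoc_b": 0.4, "funsoc_c": 0.1}
svc_scores = {"funsoc_a": 0.7, "funsoc_b": 0.8, "funsoc_c": 0.2}

predicted = ensemble_multilabel([nn_scores, svc_scores])
print(predicted)
```

Because each label is thresholded independently, a sequence can carry more than one label at once. Class imbalance, which the abstract names as part of the task, is not addressed by this sketch.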

Authors: Advait Balaji, Bryce Kille, Anthony Kappell, Gene Godbold, Madeline Diep, R. A. Leo Elworth, Zhiqin Qian, Dreycey Albin, Daniel Nasko, Nidhi Shah, Mihai Pop, Santiago Segarra, Krista Ternus, and Todd Treangen


Todd Treangen

Rice University

Wednesday October 27, 2021 3:00pm - 3:15pm CDT