R44GM145195

Project Grant

Overview

Grant Description
MetaboQuest: A Suite of Tools for Metabolite Annotation

Project Summary

Metabolomics aims at high-throughput detection, quantification, and identification of metabolites in biological samples. Liquid chromatography coupled with mass spectrometry (LC-MS) has risen in prominence in metabolomics because it can analyze a large number of metabolites from a limited amount of biological material. However, in a typical untargeted metabolomics analysis of human samples by LC-MS, about 70% of the detected peaks represent unknown analytes, mainly because existing mass spectral libraries cover only a small fraction of known compounds, but also because of uncertainty in peak picking, peak alignment, and recognition of isotopic peaks and adduct forms.

These challenges have slowed the development of data analytics pipelines for metabolomics and its integration with other omics studies. The goal of this Phase II SBIR proposal is to put metabolomics studies on a par with other omics studies such as genomics, transcriptomics, and proteomics, for which well-established pipelines are available. By doing so, we will accelerate the role of metabolomics in systems biology approaches for various applications including biomarker and drug discovery.

To achieve this goal, we propose to develop a cloud-based platform that allows customers to build pipelines for analysis of LC-MS-based untargeted metabolomics data, starting from peak detection to metabolite annotation. This will be accomplished by implementing a suite of innovative tools that can be assembled into customized pipelines and by enhancing metabolite annotation accuracy through integration of information derived from multiple resources including compound databases, pathways, biochemical networks, and mass spectral libraries.

Aim 1 of this proposal will focus on developing a suite of tools to enable:

(1) Peak detection, alignment, and quality assessment;
(2) Adduct and isotopic peak recognition;
(3) Mass-based search against multiple compound databases;
(4) Expert-based evaluation of putative IDs;
(5) Isotopic pattern analysis;
(6) Network-based evaluation of putative IDs;
(7) Spectral matching of MS/MS data against experimental and in-silico fragmentation patterns;
(8) Deep learning-based prediction of compound fingerprints; and
(9) Integrative assessment of putative metabolite IDs via a probabilistic model.
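As an illustration of items (2) and (3), the following is a minimal sketch of a mass-based search that accounts for adduct forms. It is not MetaboQuest's implementation: the function names, the three-compound toy database, the short adduct list, and the 10 ppm tolerance are all assumptions chosen for the example.

```python
# Mass shifts (Da) for a few common positive-mode adducts.
# The adduct list is an illustrative assumption, not the grant's list.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+NH4]+": 18.033823,
}

# Toy compound database: name -> neutral monoisotopic mass (Da).
TOY_DB = {
    "glucose": 180.06339,
    "caffeine": 194.08038,
    "tryptophan": 204.08988,
}


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_search(observed_mz, database, tol_ppm=10.0):
    """Return (compound, adduct, ppm error) for every match within tolerance."""
    hits = []
    for name, neutral_mass in database.items():
        for adduct, shift in ADDUCT_SHIFTS.items():
            err = ppm_error(observed_mz, neutral_mass + shift)
            if abs(err) <= tol_ppm:
                hits.append((name, adduct, round(err, 2)))
    # Smallest absolute error first.
    return sorted(hits, key=lambda hit: abs(hit[2]))


if __name__ == "__main__":
    for mz in (195.0877, 203.0526):
        print(mz, mass_search(mz, TOY_DB))
```

A real search would return many candidates per peak, which is why the later items in the list (isotopic patterns, networks, MS/MS matching) are needed to rank putative IDs.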

Aim 2 will assemble the tools developed in Aim 1 into a cloud-based platform, MetaboQuest, which provides users with interactive visualization of peaks, isotopic patterns, networks, and mass spectra. Furthermore, Aim 2 will focus on integrating into MetaboQuest a pipeline builder that allows users to create pipelines by linking modules and run them remotely through a modular interactive web interface.
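The idea of creating pipelines by linking modules can be sketched as simple function composition. This is only a hypothetical illustration: `build_pipeline`, `detect_peaks`, `count_peaks`, and the dictionary keys are invented for the example and do not describe the platform's actual interface.

```python
from typing import Any, Callable, Dict, List

# A module takes the shared data dictionary and returns an updated copy.
Module = Callable[[Dict[str, Any]], Dict[str, Any]]


def build_pipeline(modules: List[Module]) -> Module:
    """Link modules so that each one receives the previous module's output."""
    def run(data: Dict[str, Any]) -> Dict[str, Any]:
        for module in modules:
            data = module(data)
        return data
    return run


def detect_peaks(data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical peak detection: keep signals at or above a threshold."""
    threshold = data.get("min_intensity", 1000.0)
    peaks = [(mz, inten) for mz, inten in data["raw_signals"] if inten >= threshold]
    return {**data, "peaks": peaks}


def count_peaks(data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical quality step: record how many peaks survived."""
    return {**data, "n_peaks": len(data["peaks"])}


pipeline = build_pipeline([detect_peaks, count_peaks])
result = pipeline(
    {"raw_signals": [(195.0877, 5200.0), (203.0526, 800.0), (205.0972, 1500.0)]}
)
print(result["n_peaks"])  # 2
```

Because every module shares the same input and output shape, modules can be reordered or swapped without changing the code that runs them.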

Aim 3 will perform a comprehensive evaluation of MetaboQuest in terms of metabolite annotation accuracy, number of annotated metabolites, and computational efficiency compared with existing tools. Accuracy in metabolite annotation will be evaluated via experimental methods in which MS/MS data from unknown analytes and reference compounds are compared, and by using LC-MS/MS data from multiple metabolomics studies that include ground-truth information.
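One common way to compare MS/MS data from an unknown analyte with a reference compound is cosine similarity between binned spectra. The sketch below assumes that scoring method, a 0.01 Da bin width, and made-up peak lists. The grant text does not state which similarity measure will be used.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks that fall in the same m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(spec_a, spec_b, bin_width=0.01):
    """Cosine similarity between two spectra, from 0 (no overlap) to 1 (identical)."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up fragment lists, for illustration only.
unknown = [(110.071, 40.0), (138.066, 100.0), (195.088, 60.0)]
reference = [(110.071, 35.0), (138.066, 100.0), (195.088, 70.0)]
unrelated = [(85.029, 100.0), (127.039, 50.0)]

print(round(cosine_similarity(unknown, reference), 3))
print(cosine_similarity(unknown, unrelated))
```

A score near 1 supports the putative ID, while a score near 0 argues against it. In practice a threshold would have to be tuned against data with known identities.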

Successful implementation and validation of MetaboQuest will contribute to addressing the major bottleneck in metabolomics, namely metabolite identification, thereby eliminating the need for manual verification of putative metabolite IDs and enhancing the contribution of metabolomics studies, specifically in disease biomarker and drug discovery.
Awardee
Place of Performance
District Of Columbia United States
Geographic Scope
State-Wide
Analysis Notes
Amendment: Since the initial award, total obligations have increased 100%, from $997,931 to $1,995,862.
Omicscraft was awarded Project Grant R44GM145195, worth $1,995,862, by the National Institute of General Medical Sciences in February 2022, with work to be completed primarily in the District of Columbia, United States. The grant has a duration of 2 years and was awarded through assistance program 93.859 Biomedical Research and Research Training. The Project Grant was awarded through grant opportunity PHS 2020-2 Omnibus Solicitation of the NIH, CDC and FDA for Small Business Innovation Research Grant Applications (Parent SBIR [R43/R44] Clinical Trial Not Allowed).

SBIR Details

Research Type
SBIR Phase II
Title
MetaboQuest: A Suite of Tools for Metabolite Annotation
Topic Code
400
Solicitation Number
PA20-260

Status
(Complete)

Last Modified 2/7/23

Period of Performance
Start Date
2/11/22
End Date
1/31/24
100% Complete

Funding Split
Federal Obligation
$2.0M
Non-Federal Obligation
$0.0
Total Obligated
$2.0M
100.0% Federal Funding
0.0% Non-Federal Funding

Additional Detail

Award ID FAIN
R44GM145195
SAI Number
R44GM145195-2091014262
Award ID URI
SAI UNAVAILABLE
Awardee Classifications
Small Business
Awarding Office
75NS00 NIH NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
Funding Office
75NS00 NIH NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
Awardee UEI
YA6XJ6MLG1J8
Awardee CAGE
7KMJ4
Performance District
98

Budget Funding

Federal Account
National Institute of General Medical Sciences, National Institutes of Health, Health and Human Services (075-0851)
Budget Subfunction
Health research and training
Object Class
Grants, subsidies, and contributions (41.0)
Total
$1,995,862
Percentage
100%
Modified: 2/7/23