R44GM145195
Project Grant
Overview
Grant Description
MetaboQuest: A Suite of Tools for Metabolite Annotation
Project Summary
Metabolomics aims at high-throughput detection, quantification, and identification of metabolites in biological samples. Liquid chromatography coupled with mass spectrometry (LC-MS) has risen in prominence in metabolomics because it can analyze a large number of metabolites from a limited amount of biological material. However, in a typical untargeted metabolomics analysis of human samples by LC-MS, about 70% of the detected peaks represent unknown analytes, mainly because existing mass spectral libraries cover only a small fraction of known compounds, but also because of uncertainty in peak picking, peak alignment, and recognition of isotopic peaks and adduct forms.
These challenges have slowed the development of data analytics pipelines for metabolomics and its integration with other omics studies. The goal of this Phase II SBIR proposal is to bring metabolomics studies on a par with other omics studies such as genomics, transcriptomics, and proteomics, for which well-established pipelines are available. By doing so, we will accelerate the role of metabolomics in systems biology approaches for various applications including biomarker and drug discovery.
To achieve this goal, we propose to develop a cloud-based platform that allows customers to build pipelines for analysis of LC-MS-based untargeted metabolomics data, from peak detection through metabolite annotation. This will be accomplished by implementing a suite of innovative tools that can be assembled into customized pipelines and by enhancing metabolite annotation accuracy through integration of information derived from multiple resources including compound databases, pathways, biochemical networks, and mass spectral libraries.
Aim 1 of this proposal will focus on developing a suite of tools to enable:
(1) Peak detection, alignment, and quality assessment;
(2) Adduct and isotopic peak recognition;
(3) Mass-based search against multiple compound databases;
(4) Expert-based evaluation of putative IDs;
(5) Isotopic pattern analysis;
(6) Network-based evaluation of putative IDs;
(7) Spectral matching of MS/MS data against experimental and in-silico fragmentation patterns;
(8) Deep learning-based prediction of compound fingerprints; and
(9) Integrative assessment of putative metabolite IDs via a probabilistic model.
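As an illustration of items (2) and (3) above, adduct-aware mass-based search can be sketched as follows. This is a minimal sketch and not MetaboQuest's implementation: the two-compound table, the function name `mass_search`, the `Candidate` type, and the 10 ppm default tolerance are all assumptions made for the example. The adduct shifts and monoisotopic masses are standard published values.

```python
from dataclasses import dataclass
from typing import List

# Mass shifts (Da) added to the neutral monoisotopic mass for common
# positive-mode adducts.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
}

# Hypothetical stand-in for a compound database: name -> monoisotopic mass (Da).
COMPOUNDS = {
    "glucose": 180.063388,
    "caffeine": 194.080376,
}


@dataclass(frozen=True)
class Candidate:
    name: str
    adduct: str
    ppm_error: float


def mass_search(observed_mz: float, tolerance_ppm: float = 10.0) -> List[Candidate]:
    """Return compound/adduct pairs whose theoretical m/z lies within tolerance."""
    hits = []
    for adduct, shift in ADDUCT_SHIFTS.items():
        for name, mono_mass in COMPOUNDS.items():
            theoretical_mz = mono_mass + shift
            ppm = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
            if abs(ppm) <= tolerance_ppm:
                hits.append(Candidate(name, adduct, ppm))
    # Smallest absolute mass error first.
    return sorted(hits, key=lambda c: abs(c.ppm_error))


if __name__ == "__main__":
    for hit in mass_search(203.0526):
        print(hit.name, hit.adduct, round(hit.ppm_error, 2))
```

A mass match alone yields only a putative ID, which is why the list above follows it with isotopic, network, and MS/MS evidence.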
Aim 2 will assemble the tools developed in Aim 1 into a cloud-based platform, MetaboQuest, which provides users with interactive visualization of peaks, isotopic patterns, networks, and mass spectra. Furthermore, Aim 2 will focus on integrating into MetaboQuest a pipeline builder that allows users to create pipelines by linking modules and run them remotely through a modular interactive web interface.
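The idea of building a pipeline by linking modules can be sketched as a chain of functions in which each module's output feeds the next. This is an illustrative sketch only: the `Pipeline` class, its `link` and `run` methods, and the two toy modules are hypothetical names and do not describe MetaboQuest's actual interface.

```python
from typing import Callable, Dict, List, Tuple

Peak = Dict[str, float]
Module = Callable[[List[Peak]], List[Peak]]


class Pipeline:
    """Ordered chain of modules; each module's output feeds the next one."""

    def __init__(self) -> None:
        self._modules: List[Tuple[str, Module]] = []

    def link(self, name: str, module: Module) -> "Pipeline":
        # Returning self lets calls be chained: Pipeline().link(...).link(...)
        self._modules.append((name, module))
        return self

    def run(self, peaks: List[Peak]) -> List[Peak]:
        for _name, module in self._modules:
            peaks = module(peaks)
        return peaks


def drop_low_intensity(peaks: List[Peak]) -> List[Peak]:
    """Toy quality filter: keep peaks with intensity of at least 1000."""
    return [p for p in peaks if p["intensity"] >= 1000.0]


def sort_by_mz(peaks: List[Peak]) -> List[Peak]:
    """Toy ordering step: sort peaks by ascending m/z."""
    return sorted(peaks, key=lambda p: p["mz"])


pipeline = Pipeline().link("filter", drop_low_intensity).link("sort", sort_by_mz)
```

Because every module shares the same input and output type, modules can be reordered or swapped without changing the runner.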
Aim 3 will perform a comprehensive evaluation of MetaboQuest in terms of metabolite annotation accuracy, number of annotated metabolites, and computational efficiency compared with existing tools. Accuracy in metabolite annotation will be evaluated via experimental methods in which MS/MS data from unknown analytes and reference compounds are compared, and by using LC-MS/MS data from multiple metabolomics studies that include ground-truth information.
Successful implementation and validation of MetaboQuest will help address the major bottleneck in metabolomics, namely metabolite identification, thereby eliminating the need for manual verification of putative metabolite IDs and enhancing the contribution of metabolomics studies, specifically in disease biomarker and drug discovery.
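Comparing MS/MS data from an unknown analyte with a reference compound is commonly done with a cosine similarity score over matched fragment peaks. The sketch below shows one simple variant under stated assumptions: spectra are lists of `(mz, intensity)` pairs, peaks are paired greedily within a fixed m/z tolerance, and the function name `cosine_score` and the 0.01 Da default are illustrative choices. The proposal does not specify which scoring method is used.

```python
import math
from typing import List, Tuple

Spectrum = List[Tuple[float, float]]  # (m/z, intensity) pairs


def cosine_score(query: Spectrum, reference: Spectrum, tol: float = 0.01) -> float:
    """Cosine similarity over fragment peaks matched within `tol` Da.

    Each reference peak is paired with at most one query peak; pairing is
    greedy (closest unused reference peak), a simplification of the optimal
    assignment used by some tools.
    """
    used = set()
    dot = 0.0
    for q_mz, q_int in query:
        best = None
        for i, (r_mz, _r_int) in enumerate(reference):
            if i in used or abs(q_mz - r_mz) > tol:
                continue
            if best is None or abs(q_mz - r_mz) < abs(q_mz - reference[best][0]):
                best = i
        if best is not None:
            used.add(best)
            dot += q_int * reference[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)
```

A score near 1 indicates closely matching fragmentation patterns, and a score of 0 means no fragment peaks were matched within the tolerance.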
Awardee
Grant Program (CFDA)
Awarding / Funding Agency
Place of Performance
District Of Columbia
United States
Geographic Scope
State-Wide
Related Opportunity
Analysis Notes
Amendment: Since the initial award, total obligations have increased 100%, from $997,931 to $1,995,862.
Omicscraft was awarded Project Grant R44GM145195, worth $1,995,862, from the National Institute of General Medical Sciences in February 2022, with work to be completed primarily in the District of Columbia, United States. The grant has a duration of 2 years and was awarded through assistance program 93.859 Biomedical Research and Research Training.
The Project Grant was awarded through grant opportunity PHS 2020-2 Omnibus Solicitation of the NIH, CDC and FDA for Small Business Innovation Research Grant Applications (Parent SBIR [R43/R44] Clinical Trial Not Allowed).
SBIR Details
Research Type
SBIR Phase II
Title
MetaboQuest: A Suite of Tools for Metabolite Annotation
Topic Code
400
Solicitation Number
PA20-260
Status
(Complete)
Last Modified 2/7/23
Period of Performance
Start Date
2/11/22
End Date
1/31/24
Funding Split
Federal Obligation
$2.0M
Non-Federal Obligation
$0.0
Total Obligated
$2.0M
Additional Detail
Award ID FAIN
R44GM145195
SAI Number
R44GM145195-2091014262
Award ID URI
SAI UNAVAILABLE
Awardee Classifications
Small Business
Awarding Office
75NS00 NIH NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
Funding Office
75NS00 NIH NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
Awardee UEI
YA6XJ6MLG1J8
Awardee CAGE
7KMJ4
Performance District
98
Budget Funding
Federal Account | Budget Subfunction | Object Class | Total | Percentage |
---|---|---|---|---|
National Institute of General Medical Sciences, National Institutes of Health, Health and Human Services (075-0851) | Health research and training | Grants, subsidies, and contributions (41.0) | $1,995,862 | 100% |