ARPA-H Biomedical Data Fabric Toolbox

The Big Question

What if new data integration tools made it possible to extract more value out of data? 

The Problem

Each time a health research study is conducted, data is collected and analyzed to find ways to improve health. Those datasets, while powerful individually, could be far more useful when pooled with research from across diseases and diverse populations. However, building a large, common data pool in which datasets can be reasonably compared is difficult: datasets are stored on different, siloed platforms, access to them is limited, and methods for sharing data while preserving privacy are lacking. These barriers stymie researchers trying to analyze critical biomedical data and make it difficult to leverage data from thousands of labs, hospitals, and centers, because each entity tends to organize and manage data using incompatible biomedical dialects.

The Current State

Today, many data science efforts seek to leverage established technologies and operationalize data infrastructure platforms to make data findable, accessible, interoperable, and reusable (“FAIR”). However, these technologies often fail to improve the quality, standardization, and timely availability of data collected across thousands of labs and hospitals.

Current software for experimental research does not consistently capture data provenance, calibration information, and protocol specifications at the fidelity needed to reliably test experimental reproducibility across different labs. Established technologies are limited in their ability to integrate data from multiple sources and to support intuitive multi-source exploration or data analysis by a range of human users, including analysis that uses artificial intelligence/machine learning (AI/ML).

The Challenge

The ARPA-H Biomedical Data Fabric (BDF) Toolbox seeks to make it easier to connect biomedical research data from thousands of sources and overcome barriers caused by incompatible data dialects. The BDF Toolbox effort will seek to advance capabilities in five areas: (1) lowering barriers to high-fidelity, timely data collection in computer-readable forms, (2) preparing for multi-source data analysis at scale, (3) enabling advanced and intuitive data exploration, (4) improving stakeholder access while maintaining privacy and security measures, and (5) generalizing biomedical data fabric tools across disease types. Together, these novel data fabric capabilities will lower the barriers associated with data collection, reduce the time needed to integrate new data sources, and improve data usability by community members across disciplines and biomedical literacy levels.

The Solution

The ARPA-H BDF Toolbox Combined Module Announcement called for innovative proposals for research and development (R&D) in data integration and usability technologies. Proposed R&D will investigate innovative software approaches that enable revolutionary advances in the collection and usability of biomedical datasets that originate from thousands of different research labs, clinical care centers, and other sources of data to accelerate technical innovation across the health ecosystem.

ARPA-H has partnered with several institutes and centers at the National Institutes of Health to tackle this problem, with funding available through ARPA-H and the National Cancer Institute in partnership with Frederick National Lab (FNL). Proposers with cancer research data expertise and an interest in developing an integrated capability in partnership with FNL are encouraged to apply through the FNL solicitation, which can be found on the Frederick National Laboratory Business Opportunities page. Additional project-related questions specific to the ARPA-H BDF Toolbox should be directed to

Module Announcement

ARPA-H BDF Toolbox Module Announcement

Proposal due date: Closed

Virtual Proposers’ Day

Proposers’ Day: September 27, 2023, 2:00 - 4:00 PM EDT

Proposers' Day video

Frequently Asked Questions

ARPA-H anticipates that teaming will be necessary to achieve the goals of the Biomedical Data Fabric Toolbox. Prospective performers are encouraged to form teams with varied technical expertise to submit a research proposal. To facilitate this process, we have created a teaming page where prospective performers can share their profiles and learn more about other interested parties.

