NFDI4DS – Infrastructure for research data in data science

Jan. 01, 2021 to Dec. 31, 2026

Problem / Task

A paradigm shift has taken place over the last few years: computational methods increasingly rely on data-driven and often deep-learning-based approaches. As a result, data science has established itself as a discipline driven by advances in the field of computer science.

Because data science methods are complex and often combine code, models and data, transparency, reproducibility and fairness have become key challenges for data science and artificial intelligence. NFDI4DS promotes FAIR and open research data infrastructures that support all digital artefacts involved, such as code, models, data and publications, through an integrated approach.

Solution / Implementation

NFDI4DataScience (NFDI4DS) is an interdisciplinary consortium of the National Research Data Infrastructure (NFDI). The overarching objective of NFDI4DS is to develop, construct and establish a national research data infrastructure for the data science and artificial intelligence research community. The resulting data management solutions will also be of benefit to a wider community – even beyond the NFDI.

The core idea of NFDI4DataScience is to improve the transparency, reproducibility and fairness of projects in the fields of data science and artificial intelligence by providing access to all digital artefacts, linking them together and offering innovative tools and services. Re-using these digital artefacts will make innovative research possible.

In the initial phase, NFDI4DS will focus on four application areas: Language Technology, Life Sciences, Information Sciences and Social Sciences.

Sonja Schimmler, research group leader at Fraunhofer FOKUS and visiting professor at TU Berlin, is the consortium's spokesperson. Fraunhofer FOKUS is, in particular, responsible for implementing a metaportal that serves as an entry point for researchers.

NFDI: National Research Data Infrastructure

NFDI4DataScience is part of the National Research Data Infrastructure (NFDI). Within the NFDI, valuable data sets from science and research are systematically catalogued, networked and made usable in a sustainable, high-quality manner for the entire German science system. Up until now, these data sets have mostly been available only on a decentralised, project-related or temporary basis. The NFDI is intended to create a permanent digital knowledge repository as an indispensable prerequisite for new research questions, findings and innovations.

NFDI consortia are associations of different institutions within a research field that work together on an interdisciplinary basis to realise this objective. The non-profit organisation National Research Data Infrastructure (NFDI) e.V., based in Karlsruhe, was founded to coordinate the activities involved in establishing a National Research Data Infrastructure. Together, the association and the NFDI consortia will shape the future of research data management in Germany.