Expected Outcome: Proposal results are expected to contribute to the following outcome:
• provide new secure and energy-efficient data management tools that improve the usability and discoverability of data in different contexts, covering data provenance, synthetic data generation, data quality management (such as data cleaning, validation, enrichment, co-creation, and identification of bias and correlations), data interoperability, metadata management (automated labelling and description of data, data linkage), and data security, privacy and integrity, especially in the context of data spaces.
Scope: The actions under this topic are expected to provide practical, robust and scalable tools to improve the interoperability, quality, and integrity of data and metadata, in the context of other topics of the heading “Data sharing in the common European data space”. The data management tools and systems should support a holistic approach to the data life cycle and comply with the principles of accountability, fairness and confidentiality, as well as the FAIR principles (Findable, Accessible, Interoperable, Reusable) for data and metadata management. Building on the results of relevant past and current initiatives, data management tools, systems and processes are expected to enable, support and/or automate the creation and maintenance of common ontologies, vocabularies and data models and/or the structured, standardised and automated authoring, co-creation, curation, annotation and labelling of data, in view of the different later uses of the data (especially AI). The actions are expected to create links with relevant initiatives collecting/using heterogeneous/linguistic data, including AI initiatives (such as AI4EU, European Language Grid, or the projects from the H2020 topic ICT-48), and to liaise with standardisation bodies, where appropriate.
Actions are expected to address gaps and needs identified in real-world data space management and real-world data heterogeneity challenges (encoding formats, multiple languages, collection mechanisms, access methods, etc.), supporting, where necessary, hybrid/adaptive approaches and models, leading to robust, reliable and automated annotation of unstructured data sources. The tools should help minimise the energy footprint, be adaptable to different user needs, and support and encourage new business models and (where appropriate) citizen involvement and social innovation. The tools should be demonstrated in diverse use cases. Provision of open source tools is encouraged to contribute to outreach and impact.
In this topic, the integration of the gender dimension (sex and gender analysis) in research and innovation content is not a mandatory requirement.
This topic implements the co-programmed European Partnership on Artificial Intelligence, Data and Robotics.