OPENDATA4ICTS Project



| Reference: | RED2022-134332-I |
| Area: | Physical Sciences |
| Subarea: | Particle and Nuclear Physics |
| Title: | Datos en Abierto para Instalaciones Científico Técnicas Singulares Basadas en Aceleradores (Open Data for Accelerator-Based Unique Scientific and Technical Infrastructures) |
| Type: | ICTS Networks |
The optimal use of research infrastructures requires that the wider scientific community has adequate access to the data produced. This requires that data are produced and stored in a way that makes them easy to access. For that purpose, the FAIR principles (Findability, Accessibility, Interoperability, Reusability) have been stated and developed.
The accelerator-based ICTS facilities CNA and CMAM, which form the distributed network IABA, produce a substantial set of data in each accelerator experiment. Typically, these experiments generate counts in the different detectors of a detector array, which should be complemented with data from the accelerator diagnostics. This wealth of data is often lost after each analysis: users usually report only a count rate obtained after background subtraction and fitting. In many other cases, data are stored in specific formats that make any reuse initiative impractical. Proper storage of the raw data obtained from accelerator-based experiments, along with the relevant metadata, would be highly beneficial. It would allow systematic analysis of data from different experiments, detector arrays, and facilities. This, in turn, would improve data analysis procedures by leveraging historical data from previous experiments.
To address these challenges, the project has been structured into several objectives, as outlined below:
Objective 1: Define a Common Data Management Plan
The document "Data Management Policy IABA-ICTS", jointly developed for CNA and CMAM, has been drafted and reviewed by researchers from CNA, CMAM, and external users. It outlines the guidelines for data collection, storage, and sharing within the IABA-ICTS network. The document can be accessed at the following link:
Preliminary IABA-ICTS Data Policy
Objective 2: Define a Standard Format Containing the Relevant Metadata
This objective required collaboration with researchers to define all the necessary parameters to contextualize experiments and interpret the measurement files of the selected techniques for the pilot implementation. Templates were used in which researchers specified the parameter name, an example value (used to determine the data type: integer, string, float, etc.), measurement units, whether the parameter is mandatory or optional, the instrument associated with the parameter, and a clear definition.
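As an illustration only, one row of such a template could be represented as follows. This is a minimal sketch: the class name, field names, and the example parameter are hypothetical and are not taken from the project's actual templates. It shows how the data type can be inferred from the example value, as described above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ParameterDefinition:
    """One row of a (hypothetical) parameter template."""
    name: str
    example_value: Any
    units: Optional[str]
    mandatory: bool
    instrument: str
    definition: str

    @property
    def data_type(self) -> str:
        """Infer the data type from the example value."""
        # bool is checked first because bool is a subclass of int in Python.
        if isinstance(self.example_value, bool):
            return "boolean"
        if isinstance(self.example_value, int):
            return "integer"
        if isinstance(self.example_value, float):
            return "float"
        return "string"


# Hypothetical example entry; the real templates may use other names.
beam_energy = ParameterDefinition(
    name="beam_energy",
    example_value=2.0,
    units="MeV",
    mandatory=True,
    instrument="accelerator",
    definition="Kinetic energy of the ion beam at the target.",
)
print(beam_energy.data_type)  # float
```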
As part of this process, a distinction was made between two categories of metadata: general metadata and technical metadata.
- General metadata refers to all the information that improves the discoverability, accessibility, and reuse of experimental datasets. This includes elements such as the proposal title, authors, keywords, experiment dates, and other descriptive fields. These metadata are collected through a centralized proposal management platform, where every researcher must submit a proposal to request beamtime at the facilities of the IABA-ICTS node, comprising the Centro Nacional de Aceleradores (CNA) and the Centro de Micro-Análisis de Materiales (CMAM). The proposal submission portal can be accessed at: https://beamtime.cmam.uam.es/
- Technical metadata, in contrast, are essential for the correct interpretation and reusability of measurement data. These include parameters such as beam characteristics, detector configurations, and experimental geometry. They are critical for understanding the experimental conditions under which the data were acquired.
Furthermore, both CNA and CMAM are actively contributing to the NAPMIX project (Nuclear, Astro, and Particle Metadata Integration for eXperiments), which seeks to develop a common metadata schema for a wide range of experimental techniques used across international research infrastructures. NAPMIX also aims to create the digital tools needed to implement and manage these standards effectively. The work conducted within the OPENDATA4ICTS project is directly supporting this effort, providing a foundation of structured metadata and practical implementation insights.
More information about the NAPMIX project is available at:
https://oscars-project.eu/projects/napmix-nuclear-astro-and-particle-metadata-integration-experiments
The metadata description for each of the pilot techniques can be accessed at the following links:
Objective 3: Develop Specific Software
Once the necessary parameters were defined, the next objective was to develop software tools that enable the creation of standardized metadata files for ion beam analysis experiments. The developed tools guide users through the completion of required metadata fields and automatically generate structured output files suitable for dataset publication and long-term reuse.
Legacy standalone applications (currently available)
As an initial implementation, two standalone desktop applications were developed in Java and distributed as Windows executable installers. These tools are functional and can be used to generate metadata files for the IBA and TOF-ERDA techniques.
Both applications generate metadata files in JSON format, selected due to its human- and machine-readability and its straightforward conversion to other structured formats when needed.
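To make this concrete, the sketch below builds a small metadata record, serializes it to JSON, and flattens it into dotted keys as one possible conversion to a tabular format. The section names ("general", "technical") and all field names and values are assumptions chosen for illustration; they do not reproduce the schema used by the applications.

```python
import json

# Hypothetical record; keys and values are illustrative only.
record = {
    "general": {
        "proposal_title": "Example proposal",
        "keywords": ["RBS", "thin films"],
    },
    "technical": {
        "beam": {
            "ion": "He",
            "energy": {"value": 2.0, "unit": "MeV"},
        },
    },
}


def flatten(data: dict, prefix: str = "") -> dict:
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Human-readable JSON text, as it might be written to a .json file.
text = json.dumps(record, indent=2)

# Machine-readable: the same text parses back into an identical structure.
restored = json.loads(text)

# One row of a flat table (e.g. for CSV export), keyed by dotted paths.
flat = flatten(restored)
print(flat["technical.beam.energy.value"], flat["technical.beam.energy.unit"])
```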
The applications are distributed as Windows executable installers (.exe) and can be downloaded from the following links:
Since these executables are not digitally signed, Windows may display security warnings during download or installation (e.g., Microsoft Defender SmartScreen). These warnings are expected for unsigned applications distributed outside official software stores. The installers are provided for testing and operational use within the project context, and the applications can be safely installed.
Main functionality
The workflow of the standalone applications is as follows:
- Each tab features a green "Save" button to store the entered data before generating the .json file. Even if fields are filled in, pressing the button is necessary to save them in the generated file. If not pressed, the values will be lost when switching tabs.
- If mandatory fields are left blank, an error message appears, highlighting the missing fields in red. The red border will not disappear until the "Save" button is pressed again.
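The save-and-validate behaviour described above could be modelled roughly as follows. This is a sketch, not the Java implementation: the function names and field names are hypothetical, and it assumes that a tab is stored only when all of its mandatory fields are filled in.

```python
def missing_mandatory(fields: dict, mandatory: set) -> list:
    """Return the names of mandatory fields that are absent or blank."""
    missing = []
    for name in sorted(mandatory):
        value = fields.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def save_tab(saved: dict, tab: str, fields: dict, mandatory: set) -> list:
    """Store a tab's values if complete; return the fields to highlight."""
    missing = missing_mandatory(fields, mandatory)
    if not missing:
        # Assumption: only a complete tab is kept for the generated file.
        saved[tab] = dict(fields)
    return missing


saved_tabs = {}
required = {"ion", "energy"}

# First attempt: "energy" is blank, so it would be highlighted in red.
first = save_tab(saved_tabs, "beam", {"ion": "He", "energy": ""}, required)

# Second attempt after filling the field: nothing is missing any more.
second = save_tab(saved_tabs, "beam", {"ion": "He", "energy": "2.0"}, required)
print(first, second)
```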
Additionally, three utility buttons are available:
- Save JSON: Generates the .json file with all saved data, including date and time. All forms must be completed and saved before generating the file; otherwise, a message will indicate the missing fields, highlighting them in red.
- Load JSON: Loads data from an existing .json file of the same structure.
- Clean JSON: Clears all saved or loaded data.
The application accepts both dot (.) and comma (,) as decimal separators, although numerical values are always exported using a dot (.). To ensure correct parsing, thousands separators must not be used. All numerical fields associated with units are exported with both value and unit.
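The number handling described here can be sketched as below. The function names and the "value"/"unit" key names are assumptions made for illustration; the actual export layout of the applications may differ. Note that an input with a single separator, such as "1,000", cannot be distinguished from a decimal number and is parsed as 1.0, which is why thousands separators must be avoided.

```python
import json


def parse_decimal(text: str) -> float:
    """Parse a number that uses either '.' or ',' as the decimal separator."""
    cleaned = text.strip()
    # More than one separator suggests a thousands separator: reject it.
    if cleaned.count(".") + cleaned.count(",") > 1:
        raise ValueError(f"ambiguous number (thousands separator?): {text!r}")
    return float(cleaned.replace(",", "."))


def export_quantity(text: str, unit: str) -> dict:
    """Return a value/unit pair; the value is serialized with a dot."""
    return {"value": parse_decimal(text), "unit": unit}


print(json.dumps(export_quantity("2,5", "MeV")))  # {"value": 2.5, "unit": "MeV"}
```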
IABA-ICTS Metadata Portal (under development)
While the standalone Java applications provide functional solutions for specific techniques, maintaining separate tools for each experiment type limits scalability and increases long-term maintenance effort. To address this limitation, a unified platform is currently being developed. The IABA-ICTS Metadata Portal is a software platform designed to facilitate the creation, management, and reuse of experimental metadata produced at the Centro Nacional de Aceleradores (CNA) and the Centro de Micro-Análisis de Materiales (CMAM). The platform provides a user-friendly interface through which researchers can enter general and technical metadata using guided forms, without requiring prior knowledge of data management or metadata standards. It is structured into two main components: an administration interface, used to configure experimental resources and manage access, and a user-oriented interface, focused on the visualization, validation, and export of metadata.
Techniques currently supported in the portal
At the time of writing, the portal includes metadata generators for the following techniques and experimental configurations:
- CMAM: TOF-ERDA
- CNA (standard beamline): ERDA, NRA, PIGE, PIXE and RBS.
- CNA (microbeam chamber): IBIC, PIXE and TRIBIC.
The portal is implemented as a Django + React web application. A key improvement introduced by this platform is its integration with the internal MySQL database of the IABA-ICTS node’s proposal management system. This connection allows the portal to automatically retrieve the required general metadata associated with the experiment proposal submitted to the CNA and CMAM. For security reasons, database access is restricted to authorized CNA and CMAM public IP addresses.
The source code of the platform is hosted in the following GitHub repository: IABA-ICTS Metadata Portal Repository.
Note: This repository is private. Only users who have been granted access will be able to view its contents. If access is required, please contact [email address hidden by spam protection]. Access must be explicitly authorized; attempting to open the link without permission will result in a GitHub 404 page.
Examples and documentation templates
Examples of the structure of the measurement files can be found at the following links:
All datasets will be accompanied by a README file, which describes both the measurement files and all the metadata file parameters that can be found within it. These files can be viewed at the following links:
Objective 4: Determine the Optimal Storage Policy
Once the datasets, necessary metadata, and tools for generating these files were defined, the next step was to determine the most suitable repository for storing them. Three repositories have been evaluated:
- idUS (https://idus.us.es/), the institutional repository of the University of Seville, enables researchers to publish their datasets with DOI assignment and optional embargo periods.
- Madroño (https://edatos.consorciomadrono.es/), developed by a consortium of Madrid-based universities, provides similar functionalities for controlled access and persistent identification.
- Zenodo (https://zenodo.org/), maintained by CERN, also allows users to deposit datasets with free DOI generation and support for embargoed access.
All repositories align with the developed data policy. Additionally, Zenodo’s API (https://developers.zenodo.org/) is being explored to develop an automated tool for uploading datasets to the repository. This tool will extract the necessary data for filling out the upload form using the pre-generated metadata files.
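As a rough idea of how such a tool might map a pre-generated metadata file onto the upload form, the sketch below builds the JSON body for a new Zenodo deposition. The input keys (proposal_title, authors, keywords, abstract) are hypothetical; the output keys follow the deposition metadata fields documented for Zenodo's REST API, which may require further fields (for example a license) that are omitted here. No network request is made.

```python
import json
from typing import Optional


def build_zenodo_deposition(general: dict,
                            embargo_date: Optional[str] = None) -> dict:
    """Map general metadata (hypothetical keys) to a Zenodo deposition body."""
    metadata = {
        "upload_type": "dataset",
        "title": general["proposal_title"],
        "description": general.get("abstract", general["proposal_title"]),
        "creators": [{"name": name} for name in general["authors"]],
        "keywords": list(general.get("keywords", [])),
        "access_right": "open",
    }
    if embargo_date is not None:
        # Embargoed records need the date on which they become open.
        metadata["access_right"] = "embargoed"
        metadata["embargo_date"] = embargo_date  # ISO date, e.g. "2026-01-01"
    return {"metadata": metadata}


# Illustrative input; in practice this would be read from the metadata file.
general_metadata = {
    "proposal_title": "Example proposal",
    "authors": ["Doe, Jane"],
    "keywords": ["RBS"],
}
body = build_zenodo_deposition(general_metadata, embargo_date="2026-01-01")
print(json.dumps(body, indent=2))
```

In a real tool, this body would be sent to the deposition endpoint with an access token, followed by separate requests to upload the measurement and README files.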




