The Pierre Auger Observatory Open Data

The challenge of making cosmic-ray data open and FAIR

Data from the Pierre Auger Observatory come from a variety of instruments and take many forms: raw experimental or simulated data, reconstructed data, higher-level data generated by analysis workflows, and finally the data presented in scientific publications. These data result from an enormous, long-term human and financial investment by the international community.

The Collaboration is committed to releasing its data publicly, together with accompanying software tools, so that a broader community, including professional and citizen scientists, has a unique opportunity to explore and analyse the data at various levels of complexity. This effort is inspired by the FAIR (Findable, Accessible, Interoperable, and Reusable) principles.

This paper describes the Collaboration’s approach to open data access and the implementation of this complex and continuing effort, which is based on providing the user with:

  • support and facilitation: a detailed explanation of detection techniques, data reconstruction and selection
  • a portable and flexible file format: the use of JSON (JavaScript Object Notation) and CSV (comma-separated values)
  • analysis code and tutorials: Jupyter Notebooks in Python for easy data manipulation.
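
As a minimal sketch of why these two formats are convenient, the snippet below reads a small CSV summary table and a JSON event record using only the Python standard library. The column names (`id`, `energy_EeV`, `zenith_deg`), the JSON layout, and most of the values are hypothetical placeholders chosen for illustration; the actual field names are those documented on the Portal.

```python
import csv
import io
import json

# Hypothetical CSV summary table: one row per event (illustrative values only).
CSV_TEXT = """id,energy_EeV,zenith_deg
1,3.2,41.0
2,12.5,18.3
3,82.0,54.0
"""

# Hypothetical JSON record for a single event (the layout is an assumption).
JSON_TEXT = (
    '{"id": 3, "stations": ['
    '{"name": "A", "signal": 120.5}, {"name": "B", "signal": 48.0}]}'
)


def load_summary(text):
    """Parse CSV text into a list of dicts with numeric values."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "id": int(row["id"]),
            "energy_EeV": float(row["energy_EeV"]),
            "zenith_deg": float(row["zenith_deg"]),
        })
    return rows


def load_event(text):
    """Parse one JSON event record into plain Python objects."""
    return json.loads(text)


summary = load_summary(CSV_TEXT)
event = load_event(JSON_TEXT)
print(len(summary), "events in the summary table")
print("event", event["id"], "has", len(event["stations"]), "stations")
```

Both formats are plain text and can be read without any experiment-specific software, which is what makes them portable across platforms and programming languages.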

Following the Auger Collaboration Open Data Policy, the Open Data Portal https://opendata.auger.org/ contains the public release of 10% of the Pierre Auger Observatory cosmic-ray data published in scientific papers and at international conferences. The cosmic-ray dataset consists of more than 80 000 showers measured with the surface detector (SD) and more than 3 000 hybrid events, i.e. events recorded simultaneously with the surface detector and the fluorescence detector (FD). All (100%) of the atmospheric and space-weather data are also available. Data are released under the CC BY-SA 4.0 International License at the Zenodo DOI: https://doi.org/10.5281/zenodo.4487612

Brief overviews of the Pierre Auger Observatory, an online event display for exploring the released cosmic-ray events, and example analysis codes are provided. An outreach section dedicated to the general public is also available; it is built in the same spirit as the research part and uses the same data in a simplified format.

A user-friendly interface is available for selecting and browsing each public event by specifying an event ID or a range of reconstructed variables, such as the energy or the zenith angle. The browser contains an immersive 3D animation that follows the cosmic ray from its arrival direction to the detection of the resulting extensive air shower by the instruments of the Observatory (see figure 1):

 

Figure 1: The highest-energy multi-eye hybrid event in the UHECR catalog, PAO100815 (id 102266222400): the reconstructed zenith angle is 54° and the energy is 82 EeV. It triggered 22 stations of the surface detector and four fluorescence detectors.
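
The kind of selection described above, by event ID or by ranges of reconstructed variables, can be pictured with a short Python sketch. This is not the Portal's implementation: the `Event` class, its field names, and the `select_events` helper are hypothetical, and only the first entry uses values quoted for the event of figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Minimal stand-in for one reconstructed event (field names are assumptions)."""
    event_id: int
    energy_eev: float   # reconstructed energy in EeV
    zenith_deg: float   # reconstructed zenith angle in degrees


def in_range(value, bounds):
    """Return True if bounds is None or low <= value <= high."""
    if bounds is None:
        return True
    low, high = bounds
    return low <= value <= high


def select_events(events, event_id=None, energy_range=None, zenith_range=None):
    """Return the events matching an ID, or those inside the given ranges."""
    if event_id is not None:
        return [e for e in events if e.event_id == event_id]
    return [
        e for e in events
        if in_range(e.energy_eev, energy_range)
        and in_range(e.zenith_deg, zenith_range)
    ]


# The first entry uses the values quoted for the event of figure 1;
# the other two are invented purely for illustration.
EVENTS = [
    Event(102266222400, 82.0, 54.0),
    Event(1, 5.0, 20.0),
    Event(2, 40.0, 65.0),
]

high_energy = select_events(EVENTS, energy_range=(30.0, 200.0))
inclined = select_events(
    EVENTS, energy_range=(60.0, 200.0), zenith_range=(50.0, 90.0)
)
by_id = select_events(EVENTS, event_id=102266222400)
print([e.event_id for e in high_energy])
print([e.event_id for e in inclined])
```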

The UHECR catalog
The UHECR catalog section of the Portal makes available for inspection and download the events published in the catalog of the 100 highest-energy cosmic-ray events, with reconstructed energies between 76 EeV and 166 EeV. These events were collected during Phase I of the Observatory’s data taking (between 2004 and 2021) and were used in the study of the arrival directions of events above 32 EeV. The nine highest-energy hybrid events used for their calibration are also included.

For these events, all details of the reconstruction are available in the event summary card, such as the Coordinated Universal Time (UTC), the energy, the zenith and azimuth angles, the declination and right ascension, the multiplicity of triggered stations, and, for hybrid events, the quantities measured with the fluorescence detector, such as energy and depth of shower maximum. Additional features can be viewed: the footprint can be displayed both on the ground and in the shower plane, and, besides the lateral distribution of the shower particles, the user can see the time delays of the signals with respect to a plane shower front. The associated JSON files contain the calibrated traces for each photomultiplier tube in the triggered stations. Displays for the highest-energy hybrid event in the UHECR catalog are shown in figure 2:

 

Figure 2: The highest-energy multi-eye hybrid event in the UHECR catalog, PAO100815 (id 102266222400): footprint with respect to the shower plane (top left panel); lateral distribution of the signals as a function of the distance from the shower axis (top right panel); time delays of the signals with respect to a fit with a plane shower front (bottom left panel); and reconstructed energy deposited in the atmosphere (bottom right panel).
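
To illustrate what a time delay with respect to a plane shower front means, the sketch below computes, for each station, the difference between a measured signal time and the time at which an idealised plane front would reach that station. It is a simplified illustration, not the Collaboration's reconstruction: the shower direction is taken as given rather than fitted, the coordinate convention (azimuth measured from the x-axis, axis vector pointing back toward the arrival direction) is an assumption, and the station positions and times are invented.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond


def axis_unit_vector(zenith_deg, azimuth_deg):
    """Unit vector pointing from the core back toward the arrival direction."""
    theta = math.radians(zenith_deg)
    phi = math.radians(azimuth_deg)
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )


def plane_front_time(position, core, t_core_ns, axis):
    """Time at which a plane front moving at c reaches the given position.

    Stations displaced toward the arrival direction (positive projection
    onto the axis) are reached earlier than the core.
    """
    projection = sum(a * (p - c) for a, p, c in zip(axis, position, core))
    return t_core_ns - projection / C_M_PER_NS


def time_delays(stations, core, t_core_ns, zenith_deg, azimuth_deg):
    """Measured time minus plane-front time for each station, in ns."""
    axis = axis_unit_vector(zenith_deg, azimuth_deg)
    return {
        name: t_measured - plane_front_time(position, core, t_core_ns, axis)
        for name, (position, t_measured) in stations.items()
    }


# Invented example: a vertical shower with its core at the origin.
# Each entry is (position in metres, measured time in ns).
STATIONS = {
    "near": ((500.0, 0.0, 0.0), 10.0),
    "far": ((1500.0, 0.0, 0.0), 90.0),
}
delays = time_delays(STATIONS, core=(0.0, 0.0, 0.0), t_core_ns=0.0,
                     zenith_deg=0.0, azimuth_deg=0.0)
print(delays)
```

In this invented example the delay grows with the distance from the core, which is the qualitative behaviour expected when the real shower front is curved rather than flat.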

 

The catalog demonstrates the quality of the data that lie behind the measurements of the energy spectrum, the distribution of arrival directions, and the mass of the highest-energy cosmic rays that have been reported by the Pierre Auger Collaboration in recent publications. The full publication of the top 100 events is in line with the Collaboration’s commitment to sharing its data and results with the scientific community and to promoting the exchange of knowledge between experiments.

The Open Data can be analyzed using the provided Python Jupyter Notebooks (www.jupyter.org). Tutorial examples provided in the Portal introduce the Python programming language and its use with the Open Data. The more advanced analysis codes are simplified re-implementations of parts of analyses published by the Collaboration. All the Notebooks can be run online in a web browser via the Kaggle (www.kaggle.org) platform or downloaded together with the datasets. The graphical output of example notebooks on energy calibration, spectrum, mass composition, and arrival directions is shown in figure 3:

Figure 3: Graphical output of the analysis notebooks: energy calibration and spectrum (top), mass composition and arrival directions (bottom).

 

Impact and Perspectives
The Auger Open Data have been used in a variety of scientific publications, both in refereed journals and on arXiv. They have also been used in worldwide outreach events for high-school and higher-level students that focus on learning physics and enjoying programming and data analysis, such as the International Cosmic Day and the IPPOG International Masterclasses program.

Since the Portal was first published in 2021, it has received more than 60 000 visits from all around the world, and cosmic-ray data samples have been downloaded more than 4 000 times.

In June 2023, the Pierre Auger Collaboration Board approved an increase in the fraction of released cosmic-ray data to 30%. The Collaboration members are convinced that this will further boost interest in using the Observatory data. The Observatory has completed its first phase of data taking. It has recently been upgraded with additional detectors, such as surface detector scintillators, underground muon detectors, and radio antennas, and with upgraded electronics added to each surface detector station. Future data will be easily integrated into this framework to produce Phase II open data, and the Auger Collaboration will maintain its dedication and commitment to releasing them.

 

Related papers:

The Pierre Auger Observatory Open Data
The Pierre Auger Collaboration, Eur. Phys. J. C 85 (2025) 70
[https://arxiv.org/abs/2309.16294] [doi: 10.1140/epjc/s10052-024-13560-5]

A Catalog of the Highest-Energy Cosmic Rays recorded during Phase I of Operation of the Pierre Auger Observatory
The Pierre Auger Collaboration, Astrophys. J., Suppl. Ser. 264 (2023) 50
[https://arxiv.org/abs/2211.16020] [doi: 10.3847/1538-4365/aca537]

 

Observatorio Pierre Auger
Av. San Martín Norte 304
Malargüe, Mendoza, Argentina
https://visitantes.auger.org.ar/

These contents are released under the CC BY-SA 4.0 International License, unless explicitly stated differently.

© 2024 Pierre Auger Observatory

communications(∂)auger.org
webmaster(∂)auger.org