Automating Cross-modal Processing for Accessible Expert-in-the-loop Scientific Workflows

Introduction

Data analysis is the basis for progress in science (Batista et al., 2021). Frontier scientific research operates in an environment defined by unprecedented volumes of data and accelerating growth in the number of measurement types (Abdalla et al., 2022; National Academies of Sciences et al., 2023; ESA, 2023). In the European single market, missions and experiments in astronomy and particle physics run by large-scale scientific projects are producing a deluge of data releases (Cuillandre et al., 2024; Dubath et al., 2016; The ATLAS Collaboration, 2023; The CMS Collaboration, 2019; Alexandrov et al., 2024). These projects challenge organisations and academic institutions to identify data formats that shorten the time to scientific insight and open access to researchers with a range of cognitive and perceptual abilities. In addition, implementing systems that identify suitable formats and generate outputs comes with its own technical and economic challenges.
In this research, we assess the opportunity that the emergence of AI systems designed for multi-step workflows presents for delivering data in the forms required by scientists and researchers in astronomy and particle physics. This assessment forms the basis for identifying system designs to serve scientific communities. Two principles inform the design of systems that automate efficient data processing and generate outputs that enable scientific discovery:

The first principle is innovation in the preprocessing and generation of data in different modalities (Angelidakis et al., 2024; Team et al., 2023). Presenting information in multiple modalities broadens the analytical options available to researchers (Pérez-Montero et al., 2022b).
The second principle is to deliver the performance needed to run multi-step workflows. Such systems are enabled by combining Large Language Models (LLMs) and Large Multimodal Models (LMMs) with development frameworks (Wu et al., 2023; Liu et al., 2023).

AI in the form of LLMs and LMMs is displaying early abilities to perform complex tasks (Huang et al., 2024; Carrasco et al., 2025). The release of these models coincides with the development of frameworks for running workflows that consist of several steps or subtasks (Wu et al., 2023; Liu et al., 2023). The research questions defining the scope of this research follow.

Research Questions

Which formats improve scientific workflows across the community of investigators (with specific consideration of persons with disabilities as investigators)?
Which systems are best adapted to generating accessible and multimodal formats for data outputs?
How can framework design be specified to improve identification of formats that accelerate discovery and insight?
What are the most efficient methods for using machine reasoning for format identification and generation?
Which framework designs are most suited to enable efficient discovery, interpretation, and communication of insights from scientific data?
