"The modern development approach of SIDESTREAM made me curious. I was convinced by the confident communication with the many stakeholders in the research center and the flexible handling of varying requirements during the course of the project. SIDESTREAM's software meets our scientific standards and makes our everyday research work much easier."
Digitalization is also making its way into research. Our customer, ZEA-3, is a facility of Forschungszentrum Jülich (Research Center Jülich) with a focus on analysis services in the field of compositional analysis.
Using a range of methods, procedures and technologies, ZEA-3 analyzes physical samples from all over the world. The results of these laboratory processes serve as the basis for scientific studies and thus form an important part of research. External customers also depend on the data to develop innovations and new products.
The challenge: Complex processes and lots of data
These scientific laboratory processes are elaborate and complex. Before samples can be analyzed, they must first be prepared for the procedures used. In addition, each manufacturer of laboratory devices uses its own data format. The result is a large number of results in many different formats.
To give the data a uniform structure, it is therefore transferred to Excel tables. This presentation of the data is all the more important because different research areas need access to the results.
Error-free assignment of each result to its sample is therefore crucial. This requires a clear structure, since several hundred samples are often examined in parallel. The individual procedures and results build on each other and form a coherent laboratory process.
The requirement: step by step to a fully autonomous laboratory
Processing and preparing the measurement results is time-consuming and error-prone. ZEA-3 therefore turned to SIDESTREAM to develop an automated solution. The goal was comprehensive software that would, in the long term, allow the entire examination process to run fully autonomously. It is particularly important that no errors occur, since the measurement results must comply with scientific standards and the institute's customers rely on the accuracy of the data. Accordingly, the efficiency and user-friendliness of the software must not be achieved at the expense of data governance.
Complex processes, simple solution
Our approach was to analyze the overall process and identify the key underlying process steps. Close communication with the scientists and high test coverage of the end product were essential for this.
Our work focused on data quality and the FAIR principles. The acronym FAIR stands for Findable, Accessible, Interoperable and Reusable.
The aim was:
- to output the individual formats of the analysis technologies and procedures without errors,
- to calculate measured values correctly,
- and to comply with scientific standards.
We succeeded in presenting the various formats and results clearly in a web application. The starting point was the result files of the individual procedures. From these, we reverse-engineered the input, the output and the process function of each step. Based on our findings, we programmed a first version for one analysis procedure. Test runs with the scientists were an important part of the project. In addition to automating data processing, the application presents the results and formats in a clear and structured way so that the institute's staff can use the data efficiently. In the future, the entire analysis process is to be fully automated so that the scientists can concentrate on interpreting the data.
Technology Deep Dive
In addition to the large number of steps and data sets, loops make the entire analysis process very complex. The process steps do not run in a strictly straight line: certain measurement results from the laboratory instruments can trigger a repetition of the respective analysis. These verification loops must be covered in the software to guarantee the accuracy of the results. For this purpose, we modeled the entire process as a finite state machine (FSM). This gives full control over the process, because an FSM cannot be bypassed: the process only moves to the next state once the required conditions have been fully met. Especially in data-centric processes, this approach is essential for success.
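The idea of a process that cannot be bypassed, including a verification loop, can be illustrated with a minimal Python sketch. The state names, event names and the `SampleProcess` class are hypothetical and chosen only for illustration; the actual states and transitions of the ZEA-3 process are not described here.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states, not the real ZEA-3 process steps.
    PREPARED = auto()
    MEASURED = auto()
    VERIFIED = auto()
    RELEASED = auto()


# Allowed transitions: (current state, event) -> next state.
# Anything not listed here is rejected.
TRANSITIONS = {
    (State.PREPARED, "measure"): State.MEASURED,
    (State.MEASURED, "accept"): State.VERIFIED,
    # Verification loop: a rejected measurement sends the sample back.
    (State.MEASURED, "reject"): State.PREPARED,
    (State.VERIFIED, "release"): State.RELEASED,
}


class InvalidTransition(Exception):
    """Raised when an event is not allowed in the current state."""


class SampleProcess:
    def __init__(self):
        self.state = State.PREPARED
        self.history = [self.state]

    def fire(self, event):
        """Apply an event; refuse it unless a transition is defined."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise InvalidTransition(f"{event!r} not allowed in {self.state.name}")
        self.state = TRANSITIONS[key]
        self.history.append(self.state)
        return self.state
```

In this sketch, calling `fire("release")` on a sample that has not been measured and accepted raises `InvalidTransition`, while `fire("reject")` after a measurement returns the sample to `PREPARED` so the analysis can be repeated.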
This FSM-based solution is a perfect example of a human-in-the-loop process. The scientists perform only a few expert steps and feed the final results back into the system. The system consists of several Docker microservices, which were deployed locally at the research center. The data is managed in a PostgreSQL database. The majority of the code forms a Python backend, while the user application is a modern Vue.js web application. All code is covered by automated tests, and a test coverage of at least 90 percent is required throughout the code base. Otherwise, the automatic test and merge process blocks the introduction of new features ("merge into master"). This eliminates potential errors at an early stage and enables an application that meets scientific standards.
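How such a coverage gate might decide whether a merge is allowed can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the per-file granularity and the input mapping are hypothetical, and the tooling actually used in the project's test and merge process is not described here.

```python
def files_below_threshold(coverage_by_file, threshold=90.0):
    """Return the files whose coverage percentage is below the threshold.

    coverage_by_file is an assumed mapping of file path -> coverage in percent.
    """
    return sorted(
        path for path, percent in coverage_by_file.items() if percent < threshold
    )


def merge_allowed(coverage_by_file, threshold=90.0):
    """A merge is allowed only if no file falls below the threshold."""
    return not files_below_threshold(coverage_by_file, threshold)
```

Checking each file separately, rather than a single overall figure, is one way to read the requirement that the threshold holds at every point in the code: a well-tested module cannot then compensate for an untested one.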