Big Sleep Data: LUMI Supercomputer Trains Neural Networks for Sleep Research

Two graduate students at Aarhus University in Denmark have contributed to international research by developing tools for training neural networks on LUMI.

Words: Marie Charlotte Søbye, DeiC

Andreas Larsen Engholm and Jesper Strøm can’t help smiling when the interviewer points out that, as students, they have already made a significant contribution to research. Yet it is true: the back-end tools the two have developed for training selected sleep scoring models on the LUMI supercomputer will be freely available on GitLab in a user-friendly form. These tools will make it easier for future researchers to load more sleep data. More data leads to better sleep scoring models and more accurate interpretations of sleep data, because the neural networks become increasingly skilled at identifying sleep stages automatically.

Machine Learning with Big Data in Sleep Research

In the field of sleep scoring, a ‘gold standard’ exists: a sleep expert, following a manual, determines the sleep stage of the sleeper for every 30-second interval of a night’s recording. This task calls heavily for automation. Can we create a model that replicates what a sleep expert would have answered? The students’ task was to train neural networks to perform sleep scoring on 20,000 PSG (polysomnography) recordings and to see what impact such a large dataset has on how well the networks can be trained.
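To make the 30-second scoring convention concrete, here is a minimal Python sketch. It is an illustration only, not the project's code: the function names, the 8-hour recording length, and the example stage labels are all assumptions chosen for this example. It counts the scoring intervals (epochs) in a recording and measures how often a model's labels match an expert's.

```python
EPOCH_SECONDS = 30  # scoring interval described in the article


def count_epochs(recording_seconds: float, epoch_seconds: int = EPOCH_SECONDS) -> int:
    """Return the number of complete epochs; a trailing partial epoch is dropped."""
    return int(recording_seconds // epoch_seconds)


def agreement(expert: list[str], model: list[str]) -> float:
    """Fraction of epochs where the model's stage matches the expert's."""
    if len(expert) != len(model):
        raise ValueError("both scorings must cover the same epochs")
    if not expert:
        raise ValueError("no epochs to compare")
    matches = sum(e == m for e, m in zip(expert, model))
    return matches / len(expert)


# Assumed example: an 8-hour recording (duration chosen for illustration).
print(count_epochs(8 * 60 * 60))  # 960

# Made-up labels for four epochs, purely illustrative.
expert_scoring = ["W", "N1", "N2", "N2"]
model_scoring = ["W", "N2", "N2", "N2"]
print(agreement(expert_scoring, model_scoring))  # 0.75
```

Under these assumptions a single night already yields close to a thousand labels to assign by hand, which gives a sense of why automating the task across 20,000 recordings is attractive.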

The Work Begins: Normalising 21 Datasets Takes Time

A significant part of the work in this project involved programming the back end to be able to load 20,000 nights’ worth of data (which was the sum of the 21 datasets) in a sensible way.

“The major work of normalising all the data for our models was actually what took the longest time, and that pre-processing pipeline is now accessible to other researchers and students, making it much easier to load dataset number 22. We emphasised finding a sustainable, scalable solution that could be used by others in the future,” Andreas Engholm says.
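The article does not describe how the pipeline works internally. As a rough illustration of one kind of step that bringing differently formatted datasets into a common form might involve, the Python sketch below maps dataset-specific sleep stage labels onto one shared label set. The dataset names, label codes, and the `harmonise` function are hypothetical and are not taken from the published GitLab repository.

```python
# Shared label set used by this sketch (an assumption for illustration).
COMMON_STAGES = ("W", "N1", "N2", "N3", "REM")

# Hypothetical per-dataset mappings; real datasets use their own codes.
LABEL_MAPS = {
    "dataset_a": {"Wake": "W", "S1": "N1", "S2": "N2",
                  "S3": "N3", "S4": "N3", "REM": "REM"},
    "dataset_b": {0: "W", 1: "N1", 2: "N2", 3: "N3", 5: "REM"},
}


def harmonise(dataset: str, labels: list) -> list[str]:
    """Translate one dataset's stage labels into the shared label set."""
    mapping = LABEL_MAPS[dataset]
    unknown = sorted({str(label) for label in labels if label not in mapping})
    if unknown:
        # Fail loudly rather than silently dropping unrecognised labels.
        raise ValueError(f"unmapped labels in {dataset}: {unknown}")
    return [mapping[label] for label in labels]


print(harmonise("dataset_a", ["Wake", "S3", "S4"]))  # ['W', 'N3', 'N3']
print(harmonise("dataset_b", [0, 2, 5]))             # ['W', 'N2', 'REM']
```

In a design like this, adding a further dataset would mostly mean adding one more entry to the mapping table, which is the sort of scalability the quote points to.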

Without LUMI, we probably would have abandoned the project

The project used a total of 3500 GPU hours. On a single GPU running around the clock, that work would have taken about 146 days (3500 / 24), longer than the entire 4-month thesis period.
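The arithmetic behind that comparison can be checked with a short Python sketch. The 3500 GPU hours come from the article; the function name, the assumption of ideal linear scaling, and the example of 8 GPUs are illustrative assumptions, not figures from the project.

```python
GPU_HOURS = 3500     # total reported in the article
HOURS_PER_DAY = 24


def wall_clock_days(gpu_hours: float, n_gpus: int = 1) -> float:
    """Days of wall-clock time, assuming ideal linear scaling over n_gpus."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return gpu_hours / n_gpus / HOURS_PER_DAY


# One GPU running continuously.
print(round(wall_clock_days(GPU_HOURS), 1))     # 145.8

# Hypothetical: 8 GPUs in parallel with perfect scaling.
print(round(wall_clock_days(GPU_HOURS, 8), 1))  # 18.2
```

Real training jobs rarely scale perfectly, so the parallel figure is a lower bound on wall-clock time rather than a prediction.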

“In reality, we probably would have abandoned the project if we didn’t have access to LUMI. We would have had to move data back and forth because there wasn’t enough space, making it very inconvenient,” Jesper Strøm explains.

Fact Box

When: April to June 2023
Allocation: 5000 Terabyte hours, 3500 GPU hours on LUMI via DeiC’s “Sandbox”
Solution: Software designed to run on parallel GPU nodes and temporary storage of up to 50 TB of data
Student: Andreas Larsen Engholm, M.Sc. Computer Engineering, AU
Student: Jesper Strøm, M.Sc. Computer Engineering, AU
Advisor: Kaare Mikkelsen, Assistant Professor, Biomedical Technology, Department of Electrical and Computer Engineering, AU

Resources

LUMI supercomputer: https://www.lumi-supercomputer.eu
Apply for resources on LUMI: https://lumi-supercomputer.eu/
HPC/LUMI Sandbox: https://www.deic.dk/en/Supercomputing/Instructions-and-Guides/Access-to-HPC-Sandbox
SLURM Learning: https://www.deic.dk/en/news/2022-11-21/virtual-slurm-learning-environment-ready
Cotainr for LUMI: https://www.deic.dk/en/news/2023-9-20/cotainr-tool-should-make-it
GitLab tools developed for pre-processing of sleep data on LUMI: https://gitlab.au.dk/tech_ear-eeg/common-sleep-data-pipeline

This article is featured on CONNECT 44, the latest issue of the GÉANT CONNECT Magazine!


About the author

Silvia Fiore
