Protons are tiny, yet they carry a lot of heft. They inhabit the center of every atom in the universe and play a critical role in one of the strongest forces in nature.

And yet, protons have a down-to-earth side, too.  

Like most particles, protons have spin, which makes them act like tiny magnets. Flipping a proton’s spin, or polarity, may sound like science fiction, but it is the basis of technological breakthroughs that have become essential to our daily lives, such as magnetic resonance imaging (MRI), the invaluable medical diagnostics tool.

Despite such advancements, the proton’s inner workings remain a mystery.  

“Basically everything around you exists because of protons – and yet we still don’t understand everything about them. One huge puzzle that physicists want to solve is the proton’s spin,” said Ben Nachman, a physicist who leads the Machine Learning Group in the Physics Division at the Department of Energy’s Lawrence Berkeley National Laboratory (Berkeley Lab). 

Understanding how and why protons spin could lead to technological advancements we can’t even imagine today, and help us understand the strong force, a fundamental force that gives all protons, and therefore atoms, mass.

But it’s not such an easy problem to solve. For one, you can’t exactly pick up a proton and place it in a petri dish: Protons are unfathomably small – their radius is a hair shy of one quadrillionth of a meter, and visible light passes right through them. What’s more, you can’t observe their insides even with the world’s most powerful electron microscopes.

Recent work by Nachman and his team could bring us closer to solving this perplexing proton puzzle.   

As a member of the H1 Collaboration – an international group that now includes 150 scientists from 50 institutes and 15 countries, and is based at the DESY national research center in Germany – Nachman has been developing new machine learning algorithms to accelerate the analysis of data collected decades ago by HERA, the world’s most powerful electron-proton collider, which ran at DESY from 1992 to 2007.

HERA – a ring 4 miles in circumference – worked like a giant microscope that accelerated both electrons and protons to nearly the speed of light. The particles were collided head-on, which could scatter a proton into its constituent parts: quarks and gluons. 

Scientists at HERA took measurements of the particle debris cascading from these electron-proton collisions, what physicists call “deep inelastic scattering,” through sophisticated cameras called particle detectors, one of which was the H1 detector.

Unfolding secrets of the strong force

The H1 detector stopped collecting data in 2007, the year HERA was decommissioned. Today, the H1 Collaboration is still analyzing the data and publishing results in scientific journals.

It can take a year or more when using conventional computational techniques to measure quantities related to proton structure and the strong force, such as how many particles are produced when a proton collides with an electron.  

The HERA electron-proton collider accelerated both electrons and protons to nearly the speed of light. The particles were collided head-on, which could scatter a proton into its constituent parts: quarks (shown as green and purple balls in the illustration above) and gluons (illustrated as black coils). (Credit: DESY)

And if a researcher wants to examine a different quantity, such as how fast particles are flying in the wake of a quark-gluon jet stream, they would have to start the long computational process all over again, and wait yet another year. 

A new machine learning tool called OmniFold – which Nachman co-developed – can measure many quantities simultaneously, thereby reducing the time needed to run an analysis from years to minutes.

OmniFold does this by using neural networks to combine computer simulations with data all at once. (A neural network is a machine learning tool that processes complex data that would be impossible for scientists to handle manually.)
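To make the idea of combining simulation with data more concrete, here is a minimal sketch of classifier-based reweighting, the kind of step a tool like OmniFold builds on. It is an illustration under stated assumptions, not the H1 analysis: a one-variable logistic regression stands in for the neural network, the "simulation" and "data" are toy Gaussian samples, and every name in it (train_logistic, reweight, and so on) is hypothetical.

```python
import math
import random


def train_logistic(xs, ys, steps=300, lr=1.0):
    """Fit p(data | x) = sigmoid(a*x + b) by plain gradient descent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a * x + b)))
            grad_a += (p - y) * x
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def reweight(sim, data):
    """Return one weight per simulated event so that the weighted
    simulation resembles the data (toy stand-in for a neural network)."""
    xs = sim + data
    ys = [0.0] * len(sim) + [1.0] * len(data)  # 0 = simulation, 1 = data
    a, b = train_logistic(xs, ys)
    # For a classifier output p, the weight p / (1 - p) estimates the
    # data-to-simulation density ratio; for this model it is exp(a*x + b).
    return [math.exp(a * x + b) for x in sim]


# Toy samples (assumed for illustration): the simulation is slightly
# wrong about where the measured quantity peaks.
rng = random.Random(0)
sim = [rng.gauss(0.0, 1.0) for _ in range(2000)]
data = [rng.gauss(0.5, 1.0) for _ in range(2000)]

weights = reweight(sim, data)
sim_mean = sum(sim) / len(sim)
data_mean = sum(data) / len(data)
weighted_mean = sum(w * x for w, x in zip(weights, sim)) / sum(weights)

print(f"simulation mean:            {sim_mean:.3f}")
print(f"data mean:                  {data_mean:.3f}")
print(f"reweighted simulation mean: {weighted_mean:.3f}")
```

Because the weights are attached to whole simulated events rather than to one histogram, the same weights can be reused to examine any other quantity of those events, which is the intuition behind measuring many quantities at once.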

Nachman and his team reported the first application of OmniFold to H1 experimental data in a June issue of the journal Physical Review Letters, and presented it more recently at the 2022 Deep Inelastic Scattering (DIS) Conference.

To develop OmniFold and test its robustness against H1 data, Nachman and Vinicius Mikuni, a postdoctoral researcher in the Data and Analytics Services (DAS) group at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC) and a NERSC Exascale Science Applications Program for Learning fellow, needed a supercomputer with a lot of powerful GPUs (graphics processing units), Nachman said.

Coincidentally, Perlmutter, a new supercomputer designed to support simulation, data analytics, and artificial intelligence experiments requiring multiple GPUs at a time, had just opened up in the summer of 2021 for an “early science phase,” allowing scientists to test the system on real data. (The Perlmutter supercomputer is named for the Berkeley Lab cosmologist and Nobel laureate Saul Perlmutter.)

“Because the Perlmutter supercomputer allowed us to use 128 GPUs simultaneously, we were able to run all the steps of the analysis, from data processing to the derivation of the results, in less than a week instead of months. This improvement allows us to quickly optimize the neural networks we trained and to achieve a more precise result for the observables we measured,” said Mikuni, who is also a member of the H1 Collaboration.

A central task in these measurements is accounting for detector distortions. The H1 detector, like a watchful guard standing sentry at the entrance of a sold-out concert arena, monitors particles as they fly through it. One source of measurement error, for example, arises when particles fly around the detector rather than through it, sort of like a ticketless concertgoer jumping over an unmonitored fence rather than entering through the ticketed security gate.

Correcting for all distortions simultaneously had not been possible because of the limited computational methods available at the time. “Our understanding of subatomic physics and data analysis techniques have advanced significantly since 2007, and so today, scientists can use new insights to analyze the H1 data,” Nachman said.
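Correcting for detector distortions is what physicists call “unfolding.” The sketch below shows a classical binned version, iterative Bayesian unfolding, as a simpler swapped-in technique; it is not OmniFold, which uses neural networks rather than histogram bins. The response matrix, the bin counts, and the function names are all made up for illustration. Each column of the matrix sums to 0.9, a hypothetical way of modeling the 10% of particles that slip past the detector unseen.

```python
def fold(response, truth):
    """Apply the detector response: expected measured counts per bin."""
    return [sum(r * t for r, t in zip(row, truth)) for row in response]


def iterative_unfold(response, measured, iterations=200):
    """Iterative Bayesian unfolding of a binned measurement.

    response[j][i] is the probability that an event from true bin i is
    recorded in measured bin j. Columns may sum to less than 1 to model
    events the detector misses entirely.
    """
    n_true = len(response[0])
    n_meas = len(measured)
    efficiency = [sum(row[i] for row in response) for i in range(n_true)]
    estimate = [sum(measured) / n_true] * n_true  # flat starting guess
    for _ in range(iterations):
        folded = fold(response, estimate)
        estimate = [
            estimate[i] / efficiency[i]
            * sum(response[j][i] * measured[j] / folded[j] for j in range(n_meas))
            for i in range(n_true)
        ]
    return estimate


# Hypothetical 3-bin detector: events migrate between neighboring bins,
# and 10% of events in every true bin are never recorded.
RESPONSE = [
    [0.7, 0.1, 0.0],
    [0.2, 0.7, 0.2],
    [0.0, 0.1, 0.7],
]
TRUTH = [300.0, 100.0, 50.0]      # what nature produced (unknown in practice)
measured = fold(RESPONSE, TRUTH)  # what the detector would report on average

unfolded = iterative_unfold(RESPONSE, measured)
print("measured:", [round(m, 1) for m in measured])
print("unfolded:", [round(u, 1) for u in unfolded])
```

This toy input has no statistical noise, so many iterations are harmless. With real, fluctuating counts, analysts typically stop after only a few iterations, because further iterations amplify the noise.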

Scientists today have a renewed interest in HERA’s particle experiments, as they hope to use the data – and more precise computer simulations informed by tools like OmniFold – to aid in the analysis of results from future electron-proton experiments, such as at the Department of Energy’s next-generation Electron-Ion Collider (EIC). The EIC – to be built at Brookhaven National Laboratory in partnership with Thomas Jefferson National Accelerator Facility – will be a powerful and versatile new machine capable of colliding high-energy beams of polarized electrons with a wide range of ions (or charged atoms) across many energies, including polarized protons and some polarized ions.

“It’s exciting to think that our method could one day help scientists answer questions that still remain about the strong force,” Nachman said. “Even though this work might not lead to practical applications in the near term, understanding the building blocks of nature is why we’re here – to seek the ultimate truth. These are steps to understanding at the most basic level what everything is made of. That is what drives me. If we don’t do the research now, we will never know what exciting new technological advances we’ll get to benefit future societies.”

In addition to Nachman and Mikuni, H1 team members at Berkeley Lab include Fernando Torales Acosta and Yao Xu of the Physics Division, and Peter Jacobs of the Nuclear Sciences Division.

NERSC is a DOE Office of Science user facility located at Berkeley Lab.

The work was supported by the NERSC Exascale Science Applications Program, the Berkeley Lab Laboratory Directed Research and Development program, and the DOE Office of Science.


Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 16 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.

DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.
