Materials Project Director Kristin Persson (Credit: Roy Kaltschmidt/Berkeley Lab)

Kristin Persson is at the helm of a materials revolution.

Since 2011, she has led the Materials Project, an open-access online database that virtually delivers the largest collection of materials properties to scientists from every corner of the globe who are searching for the next big thing in batteries, solar cells, and computer chips.

Harnessing the power of supercomputers at Berkeley Lab’s National Energy Research Scientific Computing Center (NERSC) and customized machine-learning algorithms based on state-of-the-art quantum mechanical theory, Persson developed the Materials Project with open-access service, accuracy, speed, and user-friendliness in mind.

Scientists seeking to design a better battery electrode, for example, only need to log into their free Materials Project user account. A few keystrokes here, a mouse click there, and users enter the online database’s vast, virtual catalog of most known inorganic materials and thousands more that may exist. The Materials Project narrows its 124,000 inorganic compounds and some 35,000 molecules down to the best candidate. Without the Materials Project, that search would take months.

“The Materials Project is unique in its ability to calculate a multitude of properties using high-quality first-principles calculations for materials research. With our data we can serve everyone – industry, academia, the whole world – without having to compete for profit in the private sector,” said Persson, a computational materials scientist who holds the titles of senior faculty scientist in the Energy Storage & Distributed Resources Division in Berkeley Lab’s Energy Technologies Area and professor of materials science and engineering at UC Berkeley.

“And as somebody who passionately cares about the environment, I just want to come up with the next clean-energy solution as fast as possible,” she said.

In the Q&A below, Persson shares what inspired her to launch the Materials Project, her thoughts on the future of materials research and machine learning, and how she found her own way into a STEM (science, technology, engineering, and math) career.

Q: What inspired you to launch the Materials Project database?

Scientists seeking to design a better battery electrode only need to log into their free Materials Project user account to take advantage of the online database’s vast, virtual catalog of most known inorganic materials and thousands more that may exist. (Credit: Materials Project/Berkeley Lab)

Persson: When I was a postdoc at MIT, I was working on what’s known as density functional theory (DFT), a technique for modeling the electronic structure of materials in their ground state, or lowest energy state. At the time, DFT was still fairly new, and the group I was in had just started to explore how it could be used in high-throughput computing, an approach that automatically runs the same analytical process simultaneously on multiple computer systems.

Word had gotten around about our work. And in 2004, a U.S.-based battery manufacturer asked us if we could use our high-throughput computing technique – which uses multiple computers to automatically run the same process over thousands of compounds – to search for a better material for its battery’s electrode chemistry.

In addition to funding the project, our industry partner gave us free time on their supercomputer. Having access to that much computing power really opened up a new world for me. I was comfortable with using computational DFT techniques to understand how individual materials work, but the idea of turning it around and using it on a supercomputer as an automated screening vehicle was game-changing. Suddenly you can screen hundreds of materials per day for a specific property, learn about chemistry and structural trends, and become smarter about where to look. Without a supercomputer, screening those same materials would take a team of researchers months to complete.

The data from that project laid the foundation for the Materials Project. And when I was hired by Berkeley Lab in 2008, I brought that vision with me. During my second year here, I got funding from the Laboratory Directed Research and Development program to develop the nascent Materials Project’s capabilities and make it open access so it could serve a diverse community of materials scientists – like battery researchers, photovoltaics researchers, and researchers who specialize in data storage materials. In 2011, we launched the Project to the public, and we have since continuously improved it with more materials, better search capabilities, and even more importantly, more diverse coverage of properties and analysis algorithms. Recently, thanks to our broad and comprehensive datasets, we have been adding state-of-the-art machine-learning algorithms to help researchers understand and identify functional materials.

Simulation of terbium fluoride (Credit: Materials Project/Berkeley Lab)

Today, the Materials Project is the largest materials data provider in the world, serving data more than a million times a day to more than 120,000 users all over the world, and it’s been cited by thousands of papers.

Nobody has ever had this kind of data at their fingertips before. It’s a complete paradigm change in that sense. It’s exciting to know that researchers all over the world are publishing papers that used data from the Materials Project.

Many of them are energy-related researchers, spanning batteries, catalysis, photovoltaics, thermoelectrics, et cetera, but I’ve been pleasantly surprised to see it used in other fields, like alloy design, scintillators, high-pressure and magnetic materials, and even astrophysics. It is extremely rewarding when people call you up and say, “Hey, a paper published in this journal said they used the Materials Project to understand the formation of concrete in space!”

The Materials Project wouldn’t have been able to generate all that data without the support of the Basic Energy Sciences program within the Department of Energy’s Office of Science and Berkeley Lab’s supercomputers at NERSC. Similarly, many of the crucial, early software and architecture choices were made together with experts in the Computational Research Division. The interdisciplinary nature of the Project – combining domain knowledge, high-performance computing, and modern data infrastructure and dissemination – is really perfectly suited for a national lab, where you can build collaborative, long-term teams with permanent staff.

Q: How can the Materials Project help to accelerate technological advancements for clean energy? 

Persson: The loop of materials design, synthesis, and characterization is traditionally intensely time-consuming. We hope that data-driven approaches fueled by computations can accelerate each aspect of that loop, enabling new materials for powerful rechargeable batteries for electric cars, or semiconductors that could make artificial photosynthesis a reality. With the Materials Project, clean-energy researchers can virtually test hundreds to thousands of components and then focus on the most promising candidates, use simulations and associated machine learning to accelerate the identification of new materials, and use computational insights and guidelines for optimal synthesis conditions.

As our data grows, we are building machine-learning tools and curated datasets into the database, which saves researchers time and money so they can focus on their important work to help the world. And because we cast the data in terms that any materials scientist can understand, such as phase diagrams, bandgaps, and electronic conductivity, I can see the Materials Project becoming a cornerstone of every materials scientist’s portfolio, because they don’t have to become computational experts to use this data. However, as with all data, they do need to understand its limitations and level of accuracy.

Clip art showing rechargeable batteries in the shape of a car (Credit: JanWillemKunnen/iStock)

Q: What’s your dream machine-learning materials app?

Persson: Harnessing both experimental and computed data with on-the-fly machine learning for rapid iterations and insights. With machine learning, the fuel is the data. And researchers from both industry and academia agree that if we want to take advantage of what machine learning has to offer, we still need high-quality, diverse, curated data.

As someone whose role is to provide that data, I’m very interested in what robotics can do for the experimental side of materials science. Robotically automated materials synthesis could help us gather high-quality, robust data by making sure that an experiment is done exactly the same way every time it’s performed. And that’s very hard to do with humans, because people are different and will perform the same task in slightly different ways.

I am often asked if robots will replace scientists. Robots, just like the supercomputers at NERSC, are extremely powerful tools to produce data faster and more robustly. However, robots will not replace humans. They will just broaden our experience; enable us to make better, informed decisions; and help us focus on what we do best – use our amazing and creative human brain to solve the scientific and engineering problems of the day.

Q: What’s next for the Materials Project?

Persson: I’d like to do more industry outreach and make the Materials Project an integrated part of both the academic and the industrial science process. When I was a graduate student, density functional theory was a fairly young technique, so if you’re a manager at a semiconductor company and you haven’t hired anybody who completed their Ph.D. in the last 15 years, you probably don’t even know that materials databases like the Materials Project exist.

In 2009, computational materials scientist Kristin Persson used funding from Berkeley Lab’s Laboratory Directed Research and Development program to develop the nascent Materials Project’s capabilities and make it open access to serve a diverse community of materials scientists. (Credit: Roy Kaltschmidt/Berkeley Lab)

I’d also like to collaborate with our partners across the national laboratory system. I see the Materials Project growing into a data institute, harnessing both computed and standardized experimental datasets, where we not only provide large sets of machine-learning data to other labs and industry researchers but also work directly with them so they know how to use all of the machine-learning features and simulations that the Materials Project has to offer.

Q: When you were a child, did you dream of becoming a scientist?

Persson: No, not really. Actually, when I was very young, I wanted to be an opera singer. I loved singing – I still do – and when I was little, opera seemed like the perfect environment for that. Then I considered becoming an archaeologist. I was drawn to archaeology because I love history, and I was always fascinated by the idea of unearthing stories of people from ancient eras: what they thought, what they believed in, and how they lived day to day.

Q: Were you always good at math?

Persson: It depends on how far back you are asking. Between the ages of 7 and 11, I had pretty mediocre grades across the board.

I remember a particular standardized math test, at the age of 10, that I didn’t do well on. Feeling very disappointed and honestly nervous about my future, I started doing an hour of math a day by myself, without a tutor. I did basic math – I learned by redoing all sorts of problems wherever I could find them in textbooks, just to make sure I understood what was going on.

It wasn’t easy because no one was directing me. Instead, it was my own growing ambition and determination that drove me. By the time I was 12, I was at the head of my class in every single subject.

Q: What led you to computational materials science?

Persson: When I was in college I initially wanted to study medicine, but I ended up studying engineering physics, which is very broad and fast-paced. And it was during that time that I fell in love with quantum mechanics. I thought it was the most beautiful thing ever – physics suddenly made sense together with the math, and it was gorgeous.

When I completed my master’s degree – my thesis was on neutrino oscillations, which is essentially theoretical particle physics – I was awarded a doctoral fellowship that would allow me to go wherever I wanted to go.

After interviewing four different professors in four very different fields, I ended up choosing the computational materials group in the Theoretical Physics Department at the Royal Institute of Technology in Stockholm, Sweden, because I liked their methodology. They used simulations together with theoretical frameworks to figure out how materials work on the fundamental level of electrons and atoms.

And that’s why I tell my graduate students, “Don’t expect that by the age of 25 you will know exactly what you want to do in life. There are so many interesting topics when you dig deeper.” And for me, it was important that I was happy with the methodology and the everyday tasks, and that I got along with the people I worked with.

###

The Materials Project is supported by the DOE Office of Science.

NERSC is a DOE Office of Science User Facility located at Berkeley Lab.

Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 13 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.

DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.