A More Reliable Algorithm

Petrik receives CAREER award to improve reinforcement learning

Monday, June 27, 2022

In the ever-expanding world of machine learning, there have been breakthroughs in reinforcement learning, a method that trains software by rewarding desired behaviors and penalizing undesired ones.

However, challenges remain in applying reinforcement learning to important real-world scenarios in which catastrophic failure is unacceptable.

To address those challenges, Marek Petrik, an assistant professor of computer science, has received a prestigious Faculty Early Career Development Program (CAREER) award from the National Science Foundation. Petrik will use the five-year, $575,866 award to conduct theoretical and algorithmic research with his UNH students and collaborators from academic institutions and industrial research labs.

“I am very excited that I will be able to work on this project because I am passionate about this research,” says Petrik, who has worked on reinforcement learning for many years. “This project takes a new, and promising in my opinion, direction.”

Reinforcement learning algorithms are designed to perceive and interpret their environments, take actions and learn through trial and error. Petrik says that in some domains, such as board games, video games and, more recently, chip layout design, reinforcement learning has notched impressive achievements. However, those successes have been hard to translate to many physical domains, such as health care or agriculture. One of the main reasons is that existing algorithms are fragile and can fail catastrophically without warning.

“This work will lead to algorithms that can avoid or at least detect that the learned policy, or actions, are likely to fail,” says Petrik. “This project develops new reinforcement learning algorithms that achieve reliability by carefully balancing the expected quality of recommended decisions with their risk of failure.”

Petrik says improvements in sensors, data collection and computational power have driven the desire to harness data to improve decision-making in various domains, such as precision agriculture or medicine. He expects the new, reliable algorithms will help bring data-driven decision-making to these domains. He also plans to integrate the research with educational activities to provide graduate and undergraduate students with training opportunities and new study materials, including a textbook.