
CMU Team Rises to Amazon Nova AI Challenge
Team Pr1smCode, composed of students from the Language Technologies Institute, Software and Societal Systems Department, and Electrical and Computer Engineering Department, represents CMU in the global competition
For the second year in a row, a team from Carnegie Mellon University will compete in the Amazon Nova AI Challenge, a global academic competition in which 10 university teams face off in a tournament-style format that tests their abilities in AI-based software development.
The CMU team, Pr1smCode, includes students from multiple programs within the Language Technologies Institute (LTI), as well as students from the School of Computer Science’s Software and Societal Systems Department (S3D) and the College of Engineering’s Electrical and Computer Engineering Department (ECE).
The LTI students represent two different research groups, led respectively by the team’s two advisers, Kavčić-Moura Professor Carolyn Rosé and Associate Professor Lei Li.
“It’s been a great opportunity for cross-departmental collaboration,” Rosé explained. “CMU is the kind of place that really encourages boundary-spanning collaboration. We can quickly mobilize groups of interestingly different expertise because that’s the kind of culture we have.”
Shubham Gandhi, an LTI Ph.D. student who was a member of last year’s team and is a lead for Pr1smCode, echoed Rosé’s sentiment.
“We have folks with different realms of expertise,” he said. “This year’s challenge has a good amount of engineering workload. We have a nice mix of Ph.D. and master’s students, and the master’s students help a lot with their knowledge of Amazon Web Services (AWS) and with the engineering workflow that we’ve had to go through.”
Li added that CMU’s culture and curriculum make it the ideal place to foster the qualities that make good competitors in such a challenge.
“On the research side, CMU and LTI both invest heavily in large language models and agents, and of course we have CyLab and we invest heavily in security research,” Li said. “In terms of education, we offer a few foundational courses that really prepare our students for this, including Advanced NLP, Large Language Models and a new course on agents.”
This year’s Nova AI Challenge focuses on trusted software agents. It emphasizes multistep, agentic application development: building AI agents that can independently plan and execute complex processes to complete software tasks end to end while remaining safe and secure.
As in last year’s competition, the 10 teams are split into two groups: the Model Developer teams, who work to build the agents that will carry out the project tasks; and the Red teams, who build “cybersecurity agents” that test the software development agents via techniques that probe for weaknesses, vulnerabilities and potential exploits. Last year’s LTI team was on the Model Developer side, but Pr1smCode assumes the Red team role this year.
“I hadn’t worked on the cybersecurity agent side before, and it helps open up a really nice perspective on agents,” Gandhi said. “You get to be the devil’s advocate. You learn to anticipate what the other side might do.”
Rosé agreed, adding that teams on both sides devote time to working from the other side’s perspective.
“There’s a lot of value in exploring the tension between those two things,” she said, noting that this year’s emphasis on the interaction between software agent and user fosters the “both sides” approach.
Nachiket Kotalwar, another LTI Ph.D. student and team lead, explained that testing real-world user interactions is part of the Red team’s work.
“As a cybersecurity agent team, we’re also working on a benign user simulator,” he said. “But how a user would behave depends a lot on how the coding agent behaves. You can’t make a user simulator in isolation.”
The safety and security aspects of the challenge also dovetail with a point of emphasis found throughout the LTI: responsible and ethical AI development.
“Safety is something that emerges through the interaction between the AI systems and the users,” Rosé said. “There are multiple stakeholders who are responsible, and we can’t just say ‘I’m creating the safe thing and then I’m going to be safe.’ We need a whole ecosystem that’s safe. This competition gives us the opportunity to really think about it in those terms, and at CMU we have a lot of different flavors of work on AI safety that runs the whole gamut.”
For Li, the competition is a chance to show the real-world stakes of ensuring safety in AI systems and to germinate research that advances the techniques that protect that safety.
“Often in AI research, we just test functionality: whether an agent can perform certain tasks,” he said. “But an equally important question is ‘Can we trust them?’ A piece of AI-generated software might leak a password. It might develop a website with vulnerabilities to DDoS attack or SQL injection, so a malicious user can pull out internal data or bypass authentication. A competition like this, in a simulated environment, allows us to safely examine and mitigate these types of vulnerabilities. It really benefits the research.”
After the competition, each team publishes research based on its work.
Pr1smCode’s student team leads said they’re grateful for the chance to represent CMU on a global stage, but the competition’s greatest benefit may be the opportunity for collaboration.
“One thing I really appreciate about this challenge is getting to work with people from so many different areas of expertise,” Kotalwar said. “In my own Ph.D. work, I might have focused narrowly on one benchmark or one problem. Here, the tasks are inherently collaborative, and you’re competing against other universities in real time. Everyone keeps improving, and that pushes you to improve, too.”
Gandhi echoed that appreciation, pointing to the collaborative atmosphere the competition fosters as something that stands in contrast to the increasingly solo nature of industry research.
“Nowadays a lot of research labs are moving toward a more lone-wolf style,” he said. “This competition brings people together to work on problems they wouldn’t have naturally tackled on their own. You sit in a room with people from so many different backgrounds, and what felt like one huge problem becomes hundreds of smaller ones that everyone can contribute to.”
The team also expressed gratitude for the resources Amazon makes available to participants, most notably substantial monthly AWS compute credits that allow experiments at a scale simply not possible on an academic budget.
“It gives us the ability to do experiments that go well beyond what we would normally be able to do, and I think that kind of investment in academic research is something worth acknowledging and appreciating,” Rosé said.
The Nova AI Challenge began in February 2026 and continues through December 2026. Finalists will compete remotely in a closed-door evaluation in November. Winners will be announced at the Amazon Nova AI Challenge Summit in December, where teams will also present their research findings.
Learn more about the competition on the Nova AI Challenge website.
