The use of an artificial intelligence (AI) foundation model like ChatGPT may be free for the end user, but it comes with real economic and environmental costs. A single run of an AI model can cost up to $500,000 on the cloud and produce emissions equivalent to burning more than 11,400 pounds of coal.

This demand for computational resources can make AI difficult to access for individuals and small businesses, particularly in low- and middle-income countries.

Professor of Computer Engineering Jun Wang plans to address the efficiency and scalability of foundation models with a $600,000 grant from the U.S. National Science Foundation.

“Foundation models are transforming industries from healthcare to education to creative works,” Wang says. “But the problem is, they are extremely expensive to run. You need a cluster of very powerful graphics processing units (GPUs) and a lot of electricity. That creates [challenges] for small businesses, nonprofits or universities.”

Over the next three years, Wang and his team will tackle several obstacles that reduce the efficiency and speed of AI. First, they will identify informational bottlenecks that cause running AI models to shuttle data they don't need, wasting time and energy. They will also investigate scheduling inefficiencies: AI models use speculative execution, meaning they compute ahead to save time, but they don't always organize those tasks well or prioritize them correctly.
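To make the speculative-execution idea concrete, here is a toy sketch of speculative decoding, one common form of "thinking ahead" in language models: a fast draft model proposes several tokens, and the expensive target model verifies them in a batch, keeping the longest accepted prefix. The draft and target functions below are hypothetical stand-ins, not part of Wang's actual system.

```python
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "on", "mat"]

def draft_next(context):
    # Hypothetical small "draft" model: fast but approximate guess.
    return VOCAB[len(context) % len(VOCAB)]

def target_accepts(context, token):
    # Hypothetical large "target" model: verifies a proposed token.
    # Here it simply accepts the draft's guess most of the time.
    return random.random() < 0.8

def speculative_decode(prompt, n_tokens, draft_len=4):
    """The draft model proposes draft_len tokens ahead; the target
    model verifies them, keeping the longest accepted prefix and
    substituting its own token at the first rejection."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # Draft phase: cheaply propose a run of tokens.
        proposals, ctx = [], list(out)
        for _ in range(draft_len):
            tok = draft_next(ctx)
            proposals.append(tok)
            ctx.append(tok)
        # Verify phase: accept until the first rejection.
        for tok in proposals:
            if target_accepts(out, tok):
                out.append(tok)
            else:
                out.append(random.choice(VOCAB))  # target's own choice
                break
            if len(out) - len(prompt) >= n_tokens:
                break
    return out[len(prompt):]

print(speculative_decode(["the"], 8))
```

When the draft model guesses well, several tokens are accepted per expensive verification step, which is exactly the kind of coordination-between-components win the research aims to systematize; poor scheduling or prioritization of these speculative tasks wastes the speedup.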

Finally, Wang and his team will also strive to speed up start times and communication with the GPUs that help AI models run. With this three-pronged approach to their research, Wang says his team can improve the effectiveness of AI.

“To reduce the cost and make AI run efficiently and effectively, think of a running AI model like a factory,” Wang says. “You have different teams, machines, workers and logistics, and all of them have to be coordinated. So in AI, those teams are the software algorithm and the hardware components like the GPUs and the operating systems. It just means coordinating across all of them and designing the algorithm and the system together so they work in harmony from the start.”

With the support of NSF, Wang and his team will work with the University of Illinois Urbana-Champaign to build a large GPU cluster and access an AI supercomputer for their experiments.

Wang says the team is also working with industry giants like Microsoft, Google, Oracle and Amazon to integrate its proposed AI solutions into those companies' products.

“There’s a lot of exciting things to do in terms of how to make good use of AI, but it’s still expensive for everybody to use and not everybody can afford AI’s computational resources,” Wang says. “By processing the AI models to be faster and more efficient, we reduce the cost and the energy and ultimately allow more people to benefit from the use of AI. And it’s not just about performance and program execution time, it is actually about access.”

If you’re a graduate student interested in working on Wang’s team, contact him at jun.wang@ucf.edu.