Artificial intelligence is widely used today to combine and interpret different types of information, including text, images, audio and video. Among the challenges is the need to decide which algorithm is best suited for a particular task.
Physicists from Emory University have proposed a clearer, more systematic approach, xrust reports. In a paper published in The Journal of Machine Learning Research, they describe a new mathematical framework that codifies artificial intelligence techniques and guides the development of algorithms for specific problems.
“We found that many of today's most successful artificial intelligence techniques boil down to one simple idea—compressing different types of data just enough to retain only the bits that actually predict what you want,” says Ilya Nemenman, a professor of physics at Emory University and senior author of the study. “This gives us a sort of ‘periodic table’ of AI techniques. Different methods fall into different cells depending on what information the method’s loss function retains or discards.”
A loss function is a mathematical formula that measures how much an AI model's predictions deviate from the correct answer. During the training process, the system is constantly adjusted to reduce this error. The lower the loss function, the better the model performs.
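A minimal sketch of that idea (all names and numbers here are illustrative, not from the paper): a mean-squared-error loss that measures deviation from the correct answers, and a single gradient-descent adjustment that lowers it.

```python
# Illustrative sketch: mean-squared-error loss and one training adjustment.

def mse_loss(predictions, targets):
    """Average squared deviation between predictions and correct answers."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# A one-parameter model: prediction = w * x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # the true relationship is y = 2x

def predict(w):
    return [w * x for x in xs]

w = 0.5
before = mse_loss(predict(w), ys)

# One gradient-descent step: d(loss)/dw = mean(2 * (w*x - y) * x)
grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
w -= 0.1 * grad
after = mse_loss(predict(w), ys)

print(after < before)  # the adjustment reduced the error
```

Training repeats adjustments like this many times; "the lower the loss, the better the model performs" is exactly the `after < before` comparison at the end.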
“Hundreds of different loss functions have been developed for multimodal AI systems, and some may be better than others depending on the context,” says Nemenman. “We wondered if there was an easier way than starting from scratch every time you run into a problem in multimodal AI.”
A variational multivariate information bottleneck framework
To answer this question, the team developed a general mathematical framework for constructing task-specific loss functions. Their method focuses on determining which information should be retained and which can be discarded. They call it a variational multivariate information bottleneck framework.
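The paper's actual losses are more general, but a hedged illustration of the retain-versus-discard idea is the classic variational information-bottleneck objective: a prediction term plus a coefficient beta times a compression (KL) term, shown here in its simplest Gaussian form. Every name and number below is an assumption for illustration.

```python
import math

# Sketch of an information-bottleneck-style loss (illustrative only):
# total loss = prediction error + beta * compression cost.

def gaussian_kl(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ): the 'compression' term, roughly
    how much information the latent code retains about the input."""
    return 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - 2.0 * math.log(sigma))

def ib_loss(prediction_error, mu, sigma, beta):
    """beta dials the compression: small beta keeps nearly everything,
    large beta discards nearly everything."""
    return prediction_error + beta * gaussian_kl(mu, sigma)

# Two encoders with the same predictive fit: one stores a 'verbose'
# code far from the prior, the other a compressed code close to it.
verbose = ib_loss(prediction_error=0.10, mu=2.0, sigma=0.5, beta=1.0)
compressed = ib_loss(prediction_error=0.10, mu=0.2, sigma=0.9, beta=1.0)
print(compressed < verbose)  # the loss prefers the code that stores fewer bits
```

In the authors' framing, different choices of which such terms to keep in the loss function place a method in a different "cell" of their periodic table.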
“Our technique is essentially like a regulator,” says co-author Michael Martini, who worked on the project as a postdoctoral fellow in Nemenman's group at Emory University. “You can tune that regulator to determine what information should be stored to solve a particular problem.”
“Our approach is general and principled,” adds Eslam Abdelalim, the first author of the paper. Abdelalim began the work as a physics graduate student at Emory University, but graduated in May and moved to Georgia Tech as a postdoctoral fellow.
“Our goal is to help people create artificial intelligence models that are tailored to the problem they solve,” he says, “and to enable them to understand how and why each part of the model works.”
Using this framework, AI developers can propose new algorithms, predict which ones are most likely to succeed, estimate the amount of training data needed, and anticipate possible points of failure.
“Equally important,” says Nemenman, “this could allow us to develop new artificial intelligence techniques that are more accurate, efficient and trustworthy.”
Physical approach to machine learning
The researchers approached AI development differently than many others in the machine learning community.
“The machine learning community is focused on achieving accuracy in a system without necessarily understanding why that system works,” Abdelalim explains. “However, as physicists, we want to understand how and why things work. So we focused on finding fundamental, unifying principles that would allow us to link different artificial intelligence methods together.”
Abdelalim and Martini began by working through equations by hand, trying to find the core idea behind the complexity of modern artificial intelligence methods.
“We spent a lot of time sitting in my office and writing things on the board,” Martini says. “Sometimes I would write on a piece of paper and Eslam would look over my shoulder.”
This work lasted several years. They developed the mathematical foundations, discussed them with Nemenman, tested the ideas on computers, and were often forced back to the beginning when an approach turned out to be ineffective.
“It was a long process of trial and error, having to keep coming back to the board,” Martini says.
A moment of insight and a surprise in the form of a smart watch
Their breakthrough came when they identified a single principle that describes the balance between data compression and recovery. This idea reflected the trade-off at the heart of many artificial intelligence techniques.
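That trade-off can be shown with a toy objective (an assumption for illustration, not the paper's actual principle): reconstruction error rises as noise is added to a code, the information retained falls, and a single knob beta sets the exchange rate between the two.

```python
import math

# Toy compression/recovery balance: beta trades reconstruction
# quality against the information the code retains.

def loss(sigma, beta):
    distortion = sigma ** 2   # recovery gets worse as code noise grows
    rate = -math.log(sigma)   # information kept shrinks as noise grows
    return distortion + beta * rate

def best_sigma(beta):
    """Brute-force the noise level that minimizes the combined loss."""
    grid = [i / 1000 for i in range(1, 3000)]
    return min(grid, key=lambda s: loss(s, beta))

low = best_sigma(0.1)   # cheap storage: keep a precise, low-noise code
high = best_sigma(2.0)  # expensive storage: compress much harder
print(low < high)  # raising beta pushes the optimum toward more compression
```

Sweeping beta from small to large traces out exactly the balance the team identified: at one end the model keeps everything, at the other it discards all but the most predictive bits.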
“We tested our model on two test datasets and showed that it automatically identified common, important features between them,” says Martini. “It made me happy.”
After the intense work leading up to this discovery, Abdelalim checked his Samsung Galaxy smartwatch as he left campus. The device uses artificial intelligence to monitor health indicators such as heart rate, but that day it misread his excitement.
“My watch showed that I had been riding my bike for three hours,” says Abdelalim. “That's how it interpreted my excitement. I thought, 'Wow, this is really something! Apparently science can work like that.'”
Testing the framework and looking to the future
To evaluate their approach, the team applied the developed framework to dozens of existing artificial intelligence methods.
“We have run computer demonstrations showing that our general framework performs well on benchmark problems and datasets,” says Nemenman. “We can derive loss functions more easily, and they can solve the problems that matter to us with less training data.”
Since the framework helps eliminate unnecessary features, it can also reduce the computational cost of artificial intelligence systems.
“By helping you choose the best approach to artificial intelligence, this framework lets you avoid encoding unimportant features,” says Nemenman. “The less data a system requires, the less computing power is needed to run it, making it less harmful to the environment. It could also open the door to cutting-edge experiments on problems that we currently cannot solve for lack of data.”
The researchers hope that others can use this technique to develop algorithms tailored to specific scientific problems.
They also continue to expand their work. One of their areas of interest is biology, including attempts to identify patterns associated with cognitive function.
“I want to understand how the brain simultaneously compresses and processes multiple sources of information,” says Abdelalim. “Can we develop a method that lets us see similarities between a machine learning model and the human brain? That may help us better understand both systems.”