A new way to build neural networks could make AI more understandable

5 months ago admin

A tweak to the way artificial neurons work in neural networks could make AIs easier to decipher. Artificial neurons, the fundamental building blocks of deep neural networks, have survived almost unchanged for decades. While these networks give modern artificial intelligence its power, they are also inscrutable. Existing artificial neurons, used in large language models like GPT-4, work…

The simplification, studied in detail by a group led by researchers at MIT, could make it easier to understand why neural networks produce certain outputs, help verify their decisions, and even probe for bias. Preliminary evidence also suggests that as networks built from the simplified neurons, known as Kolmogorov-Arnold networks (KANs), are made bigger, their accuracy increases faster than that of networks built of traditional neurons.

“It’s interesting work,” says Andrew Wilson, who studies the foundations of machine learning at New York University. “It’s nice that people are trying to fundamentally rethink the design of these [networks].”

The basic elements of KANs were actually proposed in the 1990s, and researchers kept building simple versions of such networks. But the MIT-led team has taken the idea further, showing how to build and train bigger KANs, performing empirical tests on them, and analyzing some KANs to demonstrate how their problem-solving ability could be interpreted by humans. “We revitalized this idea,” said team member Ziming Liu, a PhD student in Max Tegmark’s lab at MIT. “And, hopefully, with the interpretability… we [may] no longer [have to] think neural networks are black boxes.”

While it’s still early days, the team’s work on KANs is attracting attention. GitHub pages have sprung up that show how to use KANs for myriad applications, such as image recognition and solving fluid dynamics problems.

Finding the formula

The current advance came when Liu and colleagues at MIT, Caltech, and other institutes were trying to understand the inner workings of standard artificial neural networks.

Today, almost all types of AI, including those used to build large language models and image recognition systems, include sub-networks known as multilayer perceptrons (MLPs). In an MLP, artificial neurons are arranged in dense, interconnected “layers.” Each neuron has within it something called an “activation function”: a mathematical operation that takes in a bunch of inputs and transforms them in some pre-specified manner into an output.
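To make the description of an MLP concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from the MIT-led team: the function names, the choice of ReLU as the fixed activation function, and the hand-picked weights are all assumptions (a real network learns its weights during training).

```python
def relu(x):
    """A fixed, pre-specified activation: keep positive values, zero out negatives."""
    return max(0.0, x)

def neuron(inputs, weights, bias, activation=relu):
    """One artificial neuron: weighted sum of its inputs, then the activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(total)

def dense_layer(inputs, weight_rows, biases, activation=relu):
    """A dense layer: every neuron in the layer sees every input."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]

def mlp(inputs, layers):
    """Feed the inputs through each layer in turn."""
    values = inputs
    for weight_rows, biases in layers:
        values = dense_layer(values, weight_rows, biases)
    return values

# Hypothetical weights chosen by hand for illustration.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # 2 inputs -> 2 hidden neurons
    ([[1.0, 2.0]], [0.5]),                     # 2 hidden neurons -> 1 output
]
output = mlp([2.0, 1.0], layers)
print(output)
```

In this sketch the activation function is the same for every neuron and never changes; only the weights and biases would be adjusted during training.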
