Transformers Mimic Brain Functionality and Outperform 42 Models


This article is reprinted from Quantum Bit.

Pine from Aofeisi | Quantum Bit | Official Account QbitAI

When talking about today's AI application models, one architecture is impossible to avoid:

Transformer.

It abandons the traditional CNN and RNN structures and is built entirely on the attention mechanism.

Transformers not only give various AI application models the ability to write essays and compose poetry, but also shine in multimodal tasks.

Especially after the emergence of ViT (Vision Transformer), the model barriers between CV and NLP have been broken, allowing a single Transformer model to handle multimodal tasks.

(Who wouldn’t marvel at its power after reading this?)

Although Transformers were initially designed for language tasks, they also have great potential in mimicking the brain.

Recently, a science writer wrote a blog post about how Transformers model the brain.


Let’s take a look at what he said.

Transformers: Doing What the Brain Does

First, a quick outline of how this line of work evolved.

The Transformer architecture first appeared five years ago, and its powerful performance is largely attributed to its self-attention mechanism.
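For reference, self-attention can be written in a few lines. Below is a minimal NumPy sketch of single-head scaled dot-product attention; the weight matrices are random stand-ins rather than a trained model.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise token similarities
    return softmax(scores) @ V                # each token blends all the others

rng = np.random.default_rng(0)
n, d = 4, 8                                   # 4 tokens, 8 dimensions
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (4, 8)
```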

As for how Transformers mimic the brain, continue reading.

In 2020, the research team of computer scientist Sepp Hochreiter at Johannes Kepler University Linz in Austria reorganized the Hopfield neural network (HNN, a memory-retrieval model) using Transformers.

In fact, the Hopfield neural network was proposed some 40 years ago, by physicist John Hopfield in 1982. The research team chose to reorganize this decades-old model for two reasons:

First, this network follows a universal rule: neurons that are active at the same time build strong connections with each other, known as Hebbian learning (see the toy implementation after these two points).

Second, during memory retrieval, the Hopfield neural network behaves much like the Transformer's self-attention mechanism.
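To make the first point concrete, here is a toy classic Hopfield network in Python: the Hebbian rule stores patterns as pairwise connection strengths, and retrieval repeatedly flips each binary neuron to agree with its weighted input. All data here is illustrative.

```python
import numpy as np

def store(patterns):
    """Hebbian storage: simultaneously active neurons strengthen their connection."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(W, 0)                      # no self-connections
    return W

def retrieve(W, probe, steps=10):
    """Discrete (binary) dynamics: each neuron aligns with its weighted input."""
    s = probe.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)
    return s

patterns = np.array([[1, -1, 1, -1, 1, -1],     # two stored "memories"
                     [1, 1, -1, -1, 1, 1]])
W = store(patterns)
noisy = np.array([1, -1, 1, -1, -1, -1])        # memory 0 with one bit flipped
print(retrieve(W, noisy))                       # converges back to memory 0
```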

Thus, the research team reorganized HNN to establish better connections between neurons for storing and retrieving more memories.

Simply put, the reorganization integrates the Transformer's attention mechanism into the HNN, turning the originally discrete (binary-state) network into a continuous one.
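Here is a minimal sketch of that continuous version, assuming the update rule from Hochreiter's team's paper ("Hopfield Networks Is All You Need"): new state = memoriesᵀ · softmax(β · memories · state). The binary sign flip is replaced by a softmax over similarities, which is exactly the shape of the attention computation; β and the random data below are illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def continuous_retrieve(memories, state, beta=8.0):
    """One continuous Hopfield update: a softmax-weighted blend of stored
    patterns -- the same computation as attention, with the state as the
    query and the memories as keys and values."""
    return memories.T @ softmax(beta * memories @ state)

rng = np.random.default_rng(1)
memories = rng.standard_normal((5, 16))           # 5 stored patterns, dim 16
noisy = memories[2] + 0.3 * rng.standard_normal(16)
out = continuous_retrieve(memories, noisy)
cos = out @ memories[2] / (np.linalg.norm(out) * np.linalg.norm(memories[2]))
print(round(cos, 4))                              # ~1.0: one-step recall
```

Because the softmax is sharply peaked for large β, a single update is usually enough to snap a noisy probe onto the nearest stored pattern, which is what lets the continuous network store and retrieve far more memories than the binary version.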

[Figure: a Hopfield network. Image source: Wikipedia]

The reorganized Hopfield network can be integrated into deep learning architectures, allowing for the storage and access of original input data, intermediate results, and more.

Therefore, both John Hopfield himself and Dmitry Krotov of the MIT-IBM Watson AI Lab stated:

The Hopfield Neural Network based on Transformers is biologically plausible.

Although this somewhat resembles how the brain works, it still falls short in some respects.

As a result, computational neuroscientists James Whittington and Tim Behrens adjusted Hochreiter's method, making corrections to the reorganized Hopfield network to further improve the model's performance on neuroscience tasks (replicating the neural firing patterns in the brain).

[Photo: Tim Behrens (left) and James Whittington (right). Image source: Quanta Magazine]

In simple terms, during encoding and decoding, the model no longer encodes memories as a linear sequence, but as coordinates in a high-dimensional space.

Specifically, a TEM (Tolman-Eichenbaum Machine) was introduced into the model. TEM is an associative memory system built to mimic the spatial navigation function of the hippocampus.

It can generalize spatial and non-spatial structural knowledge, predict the neural representations observed in spatial and associative-memory tasks, and explain the remapping phenomena in the hippocampus and entorhinal cortex.

Combining the TEM with Transformers forms the TEM-transformer (TEM-t).

Then, the TEM-t model is trained in multiple different spatial environments, as shown in the structure below.

[Figure: the TEM-t model structure]

TEM-t retains the Transformer's self-attention mechanism, which allows what the model learns to transfer to new environments and be used to predict new spatial structures.

Research also shows that compared to TEM, TEM-t is more efficient in performing neuroscience tasks, and it can handle more problems with fewer learning samples.


Transformers are thus going ever deeper in mimicking brain patterns; in other words, the development of Transformer models keeps pushing forward our understanding of how brain functions operate.

Moreover, in some aspects, Transformers can also enhance our understanding of other brain functions.

Transformers Help Us Understand the Brain

For instance, last year computational neuroscientist Martin Schrimpf analyzed 43 different neural network models to see how well they predicted measurements of human neural activity, as reported by functional magnetic resonance imaging (fMRI) and electrocorticography (ECoG).

Among them, the Transformer models predicted nearly all of the variation found in the imaging data, outperforming the other models analyzed.
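For the curious, analyses of this kind typically follow the standard encoding-model recipe: fit a linear readout from a network's internal activations to recorded brain responses, then score the predictions on held-out data. The sketch below uses random stand-in arrays; all shapes and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
acts = rng.standard_normal((200, 768))    # model activations for 200 sentences
voxels = rng.standard_normal((200, 100))  # brain responses in 100 voxels

# Fit a linear readout on a training split...
train, test = slice(0, 160), slice(160, 200)
W, *_ = np.linalg.lstsq(acts[train], voxels[train], rcond=None)

# ...and score how well it predicts held-out brain responses.
pred = acts[test] @ W
r = [np.corrcoef(pred[:, v], voxels[test][:, v])[0, 1] for v in range(100)]
print(f"mean voxel correlation: {np.mean(r):.3f}")  # ~0 here: random data
```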

Looking at it the other way, perhaps we can infer how the corresponding brain functions operate from the Transformer model itself.

Additionally, computer scientists Yujin Tang and David Ha recently designed a model that deliberately feeds large amounts of data through a Transformer in a random, unordered way, simulating how the human body transmits sensory observations to the brain.

Like the human brain, this Transformer can successfully process an unordered stream of information.
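Tang and Ha's actual model (their "sensory neuron as a Transformer" work) is more elaborate, but the core property fits in a few lines: attention with no positional encoding is permutation invariant, so shuffling the "sensory" inputs leaves the pooled output unchanged. The sketch below is a toy illustration of that property, not their code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(X, q, Wk, Wv):
    """Pool a set of observations X (n, d) into one vector with a single
    query q. No positional encoding is used, so order cannot matter."""
    K, V = X @ Wk, X @ Wv
    w = softmax(q @ K.T / np.sqrt(K.shape[-1]))
    return w @ V

rng = np.random.default_rng(2)
n, d = 6, 8                                    # 6 "sensory" observations
X = rng.standard_normal((n, d))
q = rng.standard_normal(d)
Wk, Wv = rng.standard_normal((d, d)), rng.standard_normal((d, d))

out_ordered = attention_pool(X, q, Wk, Wv)
out_shuffled = attention_pool(X[rng.permutation(n)], q, Wk, Wv)
print(np.allclose(out_ordered, out_shuffled))  # True: order doesn't matter
```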

Although the Transformer model is continuously improving, it is still just a small step toward an accurate brain model, and reaching the finish line will require deeper research.

If you want to learn more about how Transformers mimic the human brain, check out the links below.

Reference links:
[1] https://www.quantamagazine.org/how-ai-transformers-mimic-parts-of-the-brain-20220912/
[2] https://www.pnas.org/doi/10.1073/pnas.2105646118
[3] https://openreview.net/forum?id=B8DVo9B1YE0
