Can NLP Work Like the Human Brain? Insights from CMU and MIT

Analyst Network of Machine Heart

Analyst: Wu Jiying

Editor: Joni Zhong

As an important research topic in computer science and artificial intelligence, Natural Language Processing (NLP) has been extensively studied and discussed across various domains. As research has deepened, some scholars have begun to explore the connections between natural language processing in machines and in the brain, extending to the intersection of neuroscience and NLP pre-training methods such as BERT. This article selects three papers, two from the Wehbe research group at CMU and one from Professor Roger P. Levy’s group at MIT, and provides a detailed analysis and discussion of this topic.


Since Google AI introduced BERT (Bidirectional Encoder Representations from Transformers), it has achieved excellent results across Natural Language Processing (NLP), becoming one of the most significant recent advances in the field. BERT is a bidirectional Transformer encoder that is pre-trained by jointly conditioning on both left and right context in all layers of the model. It can then be fine-tuned with just one additional output layer, allowing it to adapt quickly to a wide range of NLP tasks, such as language inference and question answering, without substantial changes to its original architecture.
Unlike the many studies that use BERT to boost language-model performance, CMU’s Wehbe research group has recently focused on a very interesting question: understanding the relationship between natural language processing in machines and in the brain, which can also be viewed as an interdisciplinary study of language models (NLP) and neuroscience. When various AI models are applied to natural language processing tasks, there is a common expectation that they can reach human-level performance on text comprehension. So can these models form representations similar to those found in the human brain? The Wehbe group is particularly interested in using brain activity recordings to interpret the representations of one such model, BERT, in finding heuristics to improve it, and even in altering the weights learned by the network so that it functions more like the brain.
Regarding the relationship between natural language processing in machines and in the brain, the Wehbe group has published two papers at NeurIPS 2019:
  • A method that uses brain activity recordings from subjects reading natural text to compare and interpret the representations generated by different neural networks (“Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)”). This paper aligns four NLP models, ELMo, BERT, USE, and T-XL, with two methods of recording brain activity, functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG), demonstrating what brain-relevant information is contained in the representations extracted by different NLP models, such as contextual information and part-of-speech information.

  • Encoding target information from prediction tasks into model parameters to improve BERT’s ability to predict brain activity related to language processing (“Inducing brain-relevant bias in natural language processing models”). While brain activity is recorded with neuroimaging devices (fMRI, MEG), subjects are presented with language stimuli (e.g., reading a chapter of a book word by word or listening to a story), and the representations that NLP models extract from the presented text are used to model the recorded brain activity. By fine-tuning BERT to find representations that generalize well across subjects and recording modalities, improvements to BERT are achieved.

In addition to the work of the Wehbe research group, we also analyze other research in this field, such as the article from MIT Professor Roger P. Levy’s group, “Linking artificial and human neural representations of language,” which has a different focus. Its primary goal is to determine what information involved in sentence comprehension is strongly expressed in the human brain, that is, where the most meaningful intersection between neuroscience and BERT lies. The authors fine-tune BERT on different Natural Language Understanding (NLU) tasks, aiming to improve brain decoding performance and identify this key information.
1. Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)
Paper link: https://arxiv.org/pdf/1905.11833.pdf
First, let’s take a look at how to interpret the relationship between natural language processing in machines and in the brain. Deep neural networks applied to NLP tasks have achieved remarkable results, and these deep neural network models seem to capture certain characteristics of human language. So, what are the characteristics they capture or the features extracted by the models? Earlier, some researchers conducted work on sequential models like LSTMs and RNNs to assess how neural networks propagate information, explore what information word embeddings represent, and investigate representations in network layers through NLP tasks that detect specific linguistic information. Similar research on non-sequential models like transformers is relatively scarce.
In this paper, the authors propose a new method for interpreting neural networks: using the only processing system we know that understands language, the brain, to explain them. According to neuroscience research, the brain represents complex linguistic information when processing language. The paper therefore uses brain activity recordings as a reference for interpreting network representations. By aligning neural network representations with brain activity, the network layers whose representations are effective for completing NLP tasks can be identified.
Specifically, to align the representation of a given network layer with brain activity, the paper proposes learning a model that predicts the fMRI or MEG activity of each brain region, as shown in Figure 1, which displays a view of the brain relative to the head. Using the methods in [4], prior knowledge is incorporated: regions in Group 1 (white) process information related to isolated words as well as word sequences, while regions in Group 2 (red) process only information related to word sequences. V denotes the visual cortex.
The model’s predictions of brain activity are evaluated with classification tasks and significance tests: if a layer’s representation can accurately predict the activity of a brain region, that layer can be said to share information with the region, and prior knowledge about the region can then be used to infer what the layer represents.


Figure 1. Schematic diagram of the proposed method in this paper.
1. Experimental analysis of aligning neural networks and brain activity
To align neural networks and brain activity, the authors selected four neural network models, ELMo, BERT, USE, and T-XL, along with two methods of recording brain activity, functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG), for extensive experimental analysis.
fMRI is very sensitive to changes in blood oxygenation caused by neural activity; it has high spatial resolution (2-3 mm) but low temporal resolution (several seconds). MEG measures changes in the magnetic field outside the skull caused by neural activity; it has low spatial resolution (several centimeters) but high temporal resolution (up to 1 kHz).
The task of aligning neural networks and brain activity is set up as follows: given the representation x_l,k produced by layer l of a network after reading k words, an encoding model takes this representation as input and predicts the brain activity recorded while the subject read the same k words. That is, for a function f, f(x_l,k) = y, where y is the brain activity recording (fMRI or MEG). The authors define f as a linear function with ridge regularization to extract the relationship between x_l and y. The model is trained with four-fold cross-validation, and the regularization parameter is selected by nested cross-validation.
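To make the setup concrete, here is a minimal sketch of such a linear encoding model with ridge regularization and four-fold cross-validation. The file names, array shapes, and the use of scikit-learn's RidgeCV in place of the paper's exact nested cross-validation are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

# X: network representations x_l,k for each stimulus, shape (n_samples, n_features)
# Y: recorded brain activity for the same stimuli, shape (n_samples, n_voxels_or_sensors)
X = np.load("layer_representations.npy")   # hypothetical file names
Y = np.load("brain_recordings.npy")

predictions = np.zeros(Y.shape)
for train_idx, test_idx in KFold(n_splits=4).split(X):
    # RidgeCV picks the regularization strength on the training fold only,
    # standing in for the paper's nested cross-validation procedure
    f = RidgeCV(alphas=np.logspace(-3, 3, 7)).fit(X[train_idx], Y[train_idx])
    predictions[test_idx] = f.predict(X[test_idx])
```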
In the four-fold cross-validation setup, the authors evaluate predictions by using each encoding model in the classification task on reserved data. The classification task is to predict which group of words is being read based on the feature representations of two groups of words. This task is completed using 20 sets of consecutive words sampled from fMRI (considering the slowness of hemodynamic response) and 20 sets of random words sampled from MEG. The authors performed multiple classifications for each voxel in fMRI and each sensor/time point in MEG, obtaining final average classification accuracy assessments. The authors refer to the accuracy of matching the predictions of the encoding models to the correct brain recordings as “prediction accuracy.”
A voxel (volume pixel) is a volume element in three-dimensional space, representing the smallest unit of digital data in three-dimensional space (volume and pixel). In fMRI, the MRI pixel intensity is proportional to the signal intensity of the corresponding voxel.
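The 20 vs. 20 evaluation described above can be sketched as follows. This is a simplified illustration: it scores whole prediction vectors rather than individual voxels or sensor/time points, and all names and shapes are assumptions.

```python
import numpy as np

def twenty_vs_twenty_accuracy(pred, true, n_trials=1000, group_size=20, seed=0):
    """pred, true: (n_samples, n_voxels) predicted and recorded brain activity.
    For two random groups of words, check whether matching predictions to the
    correct recordings gives a smaller distance than the swapped matching."""
    rng = np.random.default_rng(seed)
    correct = 0
    for _ in range(n_trials):
        idx = rng.choice(len(true), size=2 * group_size, replace=False)
        i, j = idx[:group_size], idx[group_size:]
        right = np.linalg.norm(pred[i] - true[i]) + np.linalg.norm(pred[j] - true[j])
        wrong = np.linalg.norm(pred[i] - true[j]) + np.linalg.norm(pred[j] - true[i])
        correct += int(right < wrong)
    return correct / n_trials
```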
This experiment also supports other viewpoints related to brain science and NLP. Since MEG signals are faster than the presentation speed of word representations, MEG recordings are better suited for studying the composition of word embeddings compared to the slower fMRI, which cannot correspond to individual words. The part-of-speech of a word and its ELMo embedding can predict shared brain activity approximately 200 milliseconds after the word appears. In fact, from electrophysiological studies, it is known that certain speech stimuli only elicit responses about 200 milliseconds after the appearance of a word in the frontal lobe.
Interpreting Long-Context Representations
One question of concern in NLP is whether the model can integrate long contextual representations into its representations. The authors investigate whether the four NLP models considered can create integrated representations of text sequences by comparing the performance of encoding models trained with two types of representations: Task (1) corresponding to the latest single-word token shown to the subject (word-embedding); Task (2) representing the recent 10 words (10-word representation), thus being context-related. Figure 2 presents a qualitative comparison of the four models, where each of the eight subjects only includes the most significantly predictive voxels, controlling the false discovery rate at the 0.05 level.
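For intuition, here is a rough sketch of how these two feature types could be extracted from a pretrained BERT with the Hugging Face library. The stimulus words, layer choices, and subword handling are simplifying assumptions, not the paper's exact pipeline.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

story_words = ["the", "boy", "who", "lived", "had", "never", "even", "seen", "a", "broomstick"]  # hypothetical stimulus

def representation(words, layer):
    """Hidden state of the last word given the preceding words, taken from one layer."""
    inputs = tokenizer(" ".join(words), return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer]   # (1, seq_len, hidden_size)
    return hidden[0, -2]                                 # last real token, before [SEP]

word_embedding = representation(story_words[-1:], layer=0)    # latest word only, no context
ten_word_repr  = representation(story_words[-10:], layer=6)   # latest word with 10-word context
```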


Figure 2. Performance comparison of the two types of network representations across models.
Figure 3 gives a quantitative analysis of the prediction differences between models for the Group 1b and Group 2 regions. The word embeddings of all models perform similarly: they predict brain activity in the left and right Group 1b regions and, to some extent, in Group 1a. The long-context representations of ELMo, BERT, and T-XL predict subsets of the Group 1 and Group 2 regions. These long-context representations (with almost no blue voxels) also predict most of what the word embeddings predict, so the authors conclude that long-context representations likely contain both long-range contextual information and recent word-embedding information. In addition, the experiments in Figure 2 indicate that USE’s long-context representation (upper right) predicts activity in only a small subset of the Group 2 regions. The low performance of USE may be caused by its coarse averaging operation over words: its long-context representation retains only long-range information.


Figure 3. Number of Group 1b and Group 2 regions that each network’s representations predict well.
Relationship between Neural Network Layer Depth and Context Length
Figure 4 shows how ELMo, BERT, and T-XL change across layers. In all networks, the intermediate layers perform best for contexts longer than 15 words, while the deepest layers show a sharp increase in performance for shorter contexts (fewer than 10 words). The exception is T-XL, the only model whose performance continues to improve as context length grows, which is consistent with previous findings that Transformer-XL represents long text better than standard Transformers.


Figure 4. Performance comparison of encoding models for ELMo, BERT, and T-XL as the amount of context provided to the network increases.
From the experiments in Figure 4, the authors find that the first layer of BERT behaves differently from the first layers of the other two networks. Figure 5 therefore shows, for BERT, how the performance of the encoding models changes from the first layer to the other layers, i.e., each layer's performance relative to the first layer. BERT’s overall pattern of variation is consistent with that of T-XL in Figure 4, but the performance of BERT’s first layer barely changes as context length varies. This suggests that the first layer combines token-level embedding information in a way that limits how much long-range contextual information the layer retains.


Figure 5. Performance changes of the encoding models from the first layer to other layers in BERT.
Impact of Attention Mechanism on Layer Representations
The authors further analyze the effect of the attention mechanism in different layers of the model. Instead of the learned attention, they apply uniform attention over the representations of the previous layer. For BERT, the attention equation is:

Attention(Q, K, V) = softmax(QK^T / √d_k) V,  with Q = XW_i^Q, K = XW_i^K, V = XW_i^V for attention head i

The pretrained projections W_i^Q, W_i^K, and W_i^V are replaced with uniform attention Attn(Q, K, V), which assigns equal probability to every row of the value matrix V. Only one layer is changed at a time, with all other parameters of the pretrained BERT left unchanged. The experiments in Figure 6 show that changing the attention mechanism affects the performance of the deeper layers (but not the output layer). Surprisingly, however, the improvement in shallow-layer performance from uniform attention is only observed for contexts of about 25 words.
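To illustrate the manipulation, here is a minimal single-head sketch in PyTorch. The dimensions and the omission of masking and multiple heads are simplifications; this is not the authors' exact implementation.

```python
import math
import torch

def self_attention(x, Wq, Wk, Wv, uniform=False):
    """x: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_head) projection matrices.
    With uniform=True, the learned softmax(QK^T / sqrt(d)) weights are replaced
    by equal weights 1/seq_len, so every token attends equally to all tokens."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    if uniform:
        n = x.shape[0]
        attn = torch.full((n, n), 1.0 / n)   # uniform attention weights
    else:
        attn = torch.softmax(Q @ K.T / math.sqrt(K.shape[-1]), dim=-1)
    return attn @ V
```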


Figure 6. Changes in the performance of the encoding models in the first layer of BERT when uniform attention is used.
2. Improving NLP from the Perspective of Brain Interpretation
The alignment experiments above showed that applying uniform attention improves the ability of the first half of the layers of BERT-base to predict brain activity. Next, the authors test how this change affects BERT’s performance on a syntactic NLP task.
The task is set up as follows: a complete sentence is given to BERT with the focus verb masked (e.g., [CLS] the game that the guard [MASK] bad.). The pre-trained language-model head then predicts the masked position, and accuracy is computed by checking whether the correct verb form (e.g., is) is preferred over the incorrect one (e.g., are). Uniform attention is applied to layers 1 through 6 of BERT-base, changing one layer at a time, with all other parameters the same as in the previous experiment, and the models are evaluated on 13 conditions (listed in Table 1). Table 1 reports the results of modifying layers 1, 2, and 6: in 8 of the 13 conditions, the modified model significantly outperforms the pre-trained model (“base”), and in 4 of the remaining 5 it performs similarly.
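This kind of evaluation is easy to reproduce in spirit with the Hugging Face masked-LM head. The snippet below is a hedged sketch using the example sentence from the text; the model choice and tokenization details are assumptions.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

sentence = "the game that the guard [MASK] bad ."
inputs = tokenizer(sentence, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# The trial counts as correct if the grammatical verb form outscores the ungrammatical one
is_id, are_id = tokenizer.convert_tokens_to_ids("is"), tokenizer.convert_tokens_to_ids("are")
print("correct:", bool(logits[is_id] > logits[are_id]))
```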


Table 1. Performance of modified models in subject-verb agreement across different sentence structures.
3. Discussion of the Article
This article proposes a method that utilizes brain activity records from subjects reading natural text to compare and explain the representations generated by different neural network models, including:
  • Using MEG data to show that the context-independent word embeddings from ELMo contain information about word length and part of speech;

  • Using fMRI data to demonstrate that the representations obtained from different models (ELMo, USE, BERT, T-XL) contain information related to language processing encoded at different contextual lengths;

  • USE’s long-context representations differ from those of other models, as they do not include any short-range contextual information;

  • Transformer-based models (BERT and T-XL) capture context information most relevant to the brain in their intermediate layers;

  • T-XL combines recurrence with the Transformer architecture, unlike purely recurrent models (e.g., ELMo) or pure Transformers (e.g., BERT), and maintains its performance even for long contexts.

The experimental results also show that, compared with learned attention, uniform attention improves brain-prediction performance in the shallow layers (1-6). Based on this result, the article tests how the modified BERT representations affect performance on syntactic NLP tasks, and the modified BERT performs better on most of them. These experiments point to a possibility: modifying NLP models so that they align better with brain recordings of language processing may also help the models understand language better.
2. Inducing brain-relevant bias in natural language processing models
Paper link: https://arxiv.org/pdf/1911.03268.pdf
This paper studies the setting in which researchers present language stimuli to subjects while recording their brain activity with neuroimaging devices (fMRI, MEG, or EEG), and then model the recorded brain activity with representations that NLP models extract from the presented text. If an NLP model can be explicitly trained to predict language-induced brain recordings, brain-relevant language representations can be induced in the model, potentially improving it further. The paper builds on BERT and fine-tunes it across multiple subjects and recording modalities to find representations that generalize well across brains and recording types.
Before this paper, some researchers had conducted studies on the relationship between language-related brain activity and NLP models. The primary direction of research was to utilize NLP models to extract vector (embedding) representations of words, sentences, or texts, and then correlate these vectors (embeddings) with brain activity fMRI or MEG recordings. However, few researchers have attempted to use brain activity to modify the representations extracted by NLP models.
The results show that fine-tuned BERT is better at predicting brain activity. Furthermore, representations learned from both MEG and fMRI predict fMRI better than representations learned from fMRI alone, indicating that they capture information genuinely related to brain activity rather than artifacts of a single modality, which makes them more meaningful for neuroimaging research.
1. BERT Model Framework Used in This Paper
In the experiments of this paper, the authors used MEG and fMRI data recorded while subjects read a chapter from Harry Potter and the Philosopher’s Stone. In both experiments, the content of this chapter was presented word by word, with each word appearing on the screen for 0.5 seconds, totaling 5176 words.
The authors used the BERT architecture of Devlin et al. (2018) [1], where each layer first applies a self-attention mechanism (which combines the embeddings that are most similar to each other along several latent dimensions) to transform its input embeddings. These combined embeddings are then further transformed to produce new features for the next layer. The authors used the pre-trained BERT provided by Hugging Face in PyTorch. This model has 12 layers and is trained on BooksCorpus and Wikipedia to predict masked words in text and to classify whether two word sequences are contiguous in the text.
In the BERT framework, each input sequence is appended with two special tokens, where [SEP] marks the end of the sequence, and [CLS] is trained as the sequence-level representation of the input for classification tasks. A simple linear layer is added to fine-tune BERT, mapping the output embeddings from the base architecture to the desired prediction task. By adding this linear layer, the model achieves end-to-end fine-tuning, meaning all parameters of the model change during fine-tuning. In the experiments of this paper, the authors also connected the word length and context-independent log probabilities of each word to this output layer, as shown in Figure 7.


Figure 7. Framework for fine-tuning BERT using fMRI and/or MEG data.
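As a rough sketch of the architecture in Figure 7: the layer names, extra-feature dimension, voxel count, and the use of the [CLS] output are assumptions based on the description above, not the authors' released code.

```python
import torch
from transformers import BertModel

class BrainEncodingBert(torch.nn.Module):
    """BERT plus a linear head that maps the [CLS] embedding, concatenated with
    extra word features (e.g., word length and log probability), to n_voxels."""
    def __init__(self, n_voxels, n_extra_features=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        hidden_size = self.bert.config.hidden_size
        self.head = torch.nn.Linear(hidden_size + n_extra_features, n_voxels)

    def forward(self, input_ids, attention_mask, extra_features):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]                       # [CLS] embedding
        return self.head(torch.cat([cls, extra_features], dim=-1))

model = BrainEncodingBert(n_voxels=25000)   # the voxel count is dataset-dependent (assumption)
loss_fn = torch.nn.MSELoss()                # trained end to end against recorded fMRI responses
```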
First, the pre-trained BERT model is modified to better capture language information relevant to the brain. The experiment trains the model to predict fMRI and MEG data while subjects read the same chapter of a novel, recording each data point (at different times and from different subjects). fMRI records blood-oxygen-level-dependent (BOLD) responses, which represent the relative oxygen content of a specific brain region, a function of the activity level of neurons in that region. However, the BOLD response peaks 5 to 8 seconds after neuronal activation in a region. Due to this delay, the model predicting brain activity must access words before the fMRI images are captured. Therefore, 20 words (covering a 10-second time span) are used as input for the model, disregarding sentence boundaries.
Compared with fMRI, MEG recordings have much higher temporal resolution: for each word, data are collected from 306 sensors at 20 time points, so in the MEG experiments the model predicts 6120 (306 × 20) values per word. The experiments train and evaluate models only on content words (adjectives, adverbs, auxiliary verbs, nouns, pronouns, proper nouns, or verbs). If the BERT tokenizer splits a word into multiple tokens, the MEG data are attached to the first token of that word. The MEG data are aligned with all content words in the fMRI examples (i.e., the content words among the 20 words preceding each fMRI image).
The authors recorded four fMRI runs for each subject, and four MEG runs using the same chapter divisions as the fMRI runs. Cross-validation is performed over the fMRI runs: for each run, the examples from the other three runs are used to train the model, and the held-out run is used for evaluation.
Finally, the fMRI and MEG data also require preprocessing. The first 20 and last 15 fMRI images from each run are removed to avoid warm-up and boundary effects. Additionally, the words associated with this excluded data are not used for MEG predictions. The fMRI data from the runs are linearly detrended and standardized so that the variance of each voxel is 1 and the mean is 0. MEG data are also detrended and standardized in the fMRI runs (i.e., in the cross-validation folds) so that each sensor component has a mean of 0 and variance of 1 across all content words during the run.
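A compact sketch of this preprocessing for one fMRI run; the array shape and the exact detrending routine are assumptions.

```python
import numpy as np
from scipy.signal import detrend

def preprocess_fmri_run(run_data, n_skip_start=20, n_skip_end=15):
    """run_data: (n_images, n_voxels). Drop warm-up/boundary images,
    linearly detrend each voxel over time, then z-score to mean 0, variance 1."""
    data = run_data[n_skip_start : len(run_data) - n_skip_end]
    data = detrend(data, axis=0)
    return (data - data.mean(axis=0)) / (data.std(axis=0) + 1e-8)
```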
2. Analysis of Experimental Conditions
Model
To thoroughly validate the content of this study, the authors used multiple fine-tuned BERT models:
  • [Vanilla Model] The baseline BERT model: a linear layer is added to the pre-trained BERT model for each subject and trained to map the [CLS] embedding to that subject’s fMRI data. During training, the parameters of the pre-trained model are frozen so that the embeddings do not change. Depending on which model it is compared against, the Vanilla model is trained for 10, 20, or 30 epochs.

  • [Participant-transfer Model] This model tests whether the relationship between text and brain activity learned by fine-tuning BERT transfers across subjects. The model is first fine-tuned on the subject whose brain activity is most predictable: only the linear layer is trained for 2 epochs, then the entire model is trained for 18 epochs, after which all parameters are frozen. For each of the other subjects, only a new linear layer is trained on top of this transferred model, for 10 epochs, so that the results can be compared with the Vanilla model trained for 10 epochs.

  • [Fine-tuned Model] This model checks whether a fine-tuned model can predict each subject’s data. Starting from the Vanilla model’s linear mapping, the model is fine-tuned per subject: only the linear layer is trained for 10 epochs, followed by 20 epochs of training the entire model.

  • [MEG-transfer Model] This model investigates whether the relationship between text and brain activity learned from MEG data transfers to fMRI data. BERT is first fine-tuned to predict data from all 8 MEG subjects jointly: only the linear output layer is trained for 10 epochs, followed by 20 epochs of full-model training. The MEG-fine-tuned model is then applied to each fMRI subject, again with 10 epochs of linear-output-layer training followed by 20 epochs of complete fine-tuning.

  • [Fully Joint Model] This model is trained to predict data from all MEG subjects and all fMRI subjects simultaneously. Only the linear output layer is trained for 10 epochs, followed by 50 epochs of full-model training.
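All of the variants above share the same two-stage recipe: train only the linear output layer first, then unfreeze and fine-tune the whole model. A minimal sketch of that recipe follows, reusing the BrainEncodingBert sketch above; the optimizer, learning rate, and data-loader format are assumptions.

```python
import torch

def two_stage_finetune(model, train_loader, head_epochs, full_epochs, lr=1e-5):
    loss_fn = torch.nn.MSELoss()

    def run(params, n_epochs):
        opt = torch.optim.Adam(params, lr=lr)
        for _ in range(n_epochs):
            for input_ids, attention_mask, extra, target in train_loader:
                loss = loss_fn(model(input_ids, attention_mask, extra), target)
                opt.zero_grad()
                loss.backward()
                opt.step()

    # Stage 1: freeze the pretrained encoder, train only the linear head
    for p in model.bert.parameters():
        p.requires_grad = False
    run(model.head.parameters(), head_epochs)

    # Stage 2: unfreeze everything and fine-tune end to end
    for p in model.bert.parameters():
        p.requires_grad = True
    run(model.parameters(), full_epochs)
```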

Experimental Results
Figure 8 presents experimental comparisons of the different fine-tuned models, where each subplot compares two models. The x-axis indexes voxels, sorted in descending order of the maximum accuracy achieved by either model in the 20 vs. 20 test. The colored lines (one per subject) show the difference in average accuracy between the two models, where the average is taken over all voxels to the left of each x-coordinate. Figures 8(a)-(c) show that, for many voxels, the fine-tuned models predict voxel activity more accurately than the standard Vanilla model. In Figure 8(c), the accuracy of the MEG-transfer model is roughly the same as that of the model fine-tuned solely on fMRI data.
From the experiments in Figure 8, we can draw the following conclusions: (a) Compared to the Vanilla model, using the fine-tuned language model can better predict brain activity; (b) For some subjects, incorporating MEG recordings can slightly improve prediction accuracy compared to cases without MEG recordings, while for other subjects, training based on MEG recordings may yield worse results or show no significant changes; (c) Compared to the Vanilla model, the Participant-transfer model can more accurately predict voxels, indicating that the Participant-transfer model indeed benefits from transfer learning; (d) When appropriate hyperparameters are selected, the Fully Joint model can perform as well as or better than the baseline Vanilla model.


Figure 8. Comparison of experimental results using different models.
Next, the authors ran two models (the MEG-transfer model and the Fully Joint model) on the GLUE benchmark and compared them with standard BERT, as shown in Table 2. Fine-tuning on brain data sometimes helps the NLP tasks and sometimes has no effect, but it does not degrade performance on the original NLP tasks.


Table 2. GLUE benchmark experimental results.
To understand how BERT’s representations change when fine-tuned to predict brain activity, the authors finally study how various features are distributed across the examples. The change in prediction accuracy for each example after fine-tuning is measured as the percentage change in Euclidean distance between predictions and targets, computed over a spatially selected set of voxels likely to be related to language, shown in Figure 9. From left to right: lateral views of the inflated left and right hemispheres, followed by medial views of the inflated left and right hemispheres.


Figure 9. Voxels used to calculate accuracy change in feature distribution analysis between fine-tuned and Vanilla models.
The authors evaluate all available features in the experiment but only present the results for motion labels, emotion labels, and some part-of-speech labels, as other features are either too sparse to evaluate or show no changes in distribution, as seen in Figure 10. The experiment finds that among the samples where accuracy changed during fine-tuning, those containing verbs describing actions and imperative language are more prevalent.


Figure 10. Prevalence of motion (Motion), emotion (Emotion), and part-of-speech (Part of speech) labels among the examples with the greatest and least changes in prediction accuracy in language regions.

Figure 11. Comparison of accuracy on the voxel-level 20 vs. 20 classification task between the fine-tuned and standard Vanilla models.
3. Discussion of the Article
Fine-tuning NLP models to predict brain activity is a new research direction for learning human language processing. The technology studied in this paper encodes target information from prediction tasks into model parameters to improve NLP models, making the improved NLP models applicable to prediction tasks of varying sizes and different spatiotemporal resolutions. Additionally, this technology can effectively leverage large-scale datasets (fMRI, MEG) to assist in learning human language processing. Of course, there is still a long way to go in research on this issue, as it is currently not possible to accurately grasp how to optimize models to effectively utilize various information sources related to language in the brain and how to effectively train brain activity data with low signal-to-noise ratios. Nevertheless, this research demonstrates the feasibility of learning the relationship between text and brain activity by fine-tuning language models. The authors believe this provides a new, interesting, and exciting direction for researchers interested in human language processing.
3. Linking artificial and human neural representations of language
Paper link: https://arxiv.org/pdf/1910.01244.pdf
Finally, let’s look at the work from Professor Roger P. Levy’s group, which explores what information obtained from BERT is most strongly expressed in the human brain. This article broadens research on neural decoding with neural network representations, investigates models optimized for a wide range of different tasks, and examines in depth how each model’s representational content relates to its neural decoding performance.
The article evaluates the connections between human brain activity and neural network models fine-tuned for different Natural Language Understanding (NLU) tasks. The authors find that models designed for different NLU tasks match human brain activation patterns to different degrees. Furthermore, by analyzing the changes in the models’ representational content, they find that the granularity of a model’s syntactic representations can at least partially account for its differences in brain decoding performance.
1. Introduction to the Method
Figure 12 gives an overview of the experimental setup, which attempts to match human neuroimaging data with different candidate model representations of the same sentences. A regression model mapping human brain activity, recorded while subjects read sentences, to the representations generated by different natural language understanding models is learned.


Figure 12. Brain decoding method.
Consider a neural network classifier for a task T that maps an input sentence x to a category output y. This classifier is decomposed into two operations: a representation function r(x) and a mapping operator A:

y = A r(x)

where r(x) produces representations from which the distinctions required by task T are linearly separable. In language neuroscience, it is currently unknown whether such distinctions are linearly decodable from fMRI data. One possibility is that the distinctions necessary to describe language comprehension behavior can indeed be decoded from fMRI data. If so, brain decoding performance can be used to measure the similarity between the mental representations underlying human language comprehension and the representations deployed in artificial neural network models.
The other possibility is that the representations supporting language comprehension in the brain are not linearly decodable from fMRI; in that case, differences between specific sentence-representation models would not change brain decoding performance. The brain decoding framework explored in this paper can therefore (1) assess how well different NLU tasks’ representations match human language comprehension, and (2) expose the potential limitations of fMRI imaging and linear decoding methods.
Model Construction
Similar to the previous two papers, subjects read sentences while their brain activity is recorded with fMRI. For each subject and each sentence, the fMRI image is a vector of roughly 200,000 dimensions, describing the approximate neural activity within small three-dimensional regions of the brain known as voxels. These vectors are collected into a single matrix and compressed to 256 dimensions using PCA. BERT is used to extract sentence representations: a series of multi-head self-attention operations computes a context-sensitive representation for each token in the input sentence.
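For example, the dimensionality reduction step might look like this; the file name and array shape are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

brain_images = np.load("subject_sentence_images.npy")   # hypothetical: (n_sentences, ~200000 voxels)
brain_256 = PCA(n_components=256).fit_transform(brain_images)
print(brain_256.shape)                                   # (n_sentences, 256)
```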
This model is pre-trained on two tasks: (1) a cloze (masked) language modeling task, where the model is given a sentence with several masked words and must predict them, and (2) a next-sentence prediction task, where the model predicts whether two sentences are adjacent in the original text. The BERT architecture used in this paper is again that of [1], followed by a series of fine-tuning variants.
  • Custom fine-tuning tasks: Each task is a modified version of the standard cloze language modeling task, highlighting a specific aspect of language representation through these modifications.

Table 3. NLU tasks for fine-tuning BERT


  • Scrambled language modeling: Two language modeling tasks target the fine-grained syntactic representation of the input by scrambling the words of the language-modeling samples. The first task, LM-scrambled, scrambles words within sentences; the second, LM-scrambled-para, scrambles words within paragraphs. By scrambling the input, the cloze task effectively becomes a bag-of-words language processing task (a sketch of how such an example could be constructed follows this list).

  • Part-of-speech language modeling: LM-pos targets the fine-grained semantic representation of input by requiring a model to predict only the part of speech of a masked word instead of the word itself.

  • Language modeling control: As a control, the original BERT language modeling objective is trained further on text from the Books Corpus.
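As referenced above, here is a hypothetical sketch of how a single LM-scrambled training example could be constructed; the word-level shuffling and single-mask cloze format are assumptions about the data construction, not the authors' exact code.

```python
import random

def make_scrambled_lm_example(sentence, rng=random.Random(0)):
    """Shuffle the words of a sentence, then mask one word for the model to
    predict, turning the cloze task into a bag-of-words prediction problem."""
    words = sentence.split()
    rng.shuffle(words)
    target_idx = rng.randrange(len(words))
    target = words[target_idx]
    words[target_idx] = "[MASK]"
    return " ".join(words), target

print(make_scrambled_lm_example("the guard thought the game was bad"))
```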

This paper also selects GloVe as a baseline for comparison with BERT. Unlike BERT, the word vectors extracted by GloVe are not sensitive to sentence context.
Brain Decoding
The brain decoder maps descriptions of human brain activity to the model’s activations in response to sentences. Let B be the subjects’ brain responses to the 384 sentences in the evaluation set, G the mapping to be learned, C the output representations of the fine-tuned model, and β the regularization parameter. The brain decoder minimizes the regularized regression loss between the two spaces:

G* = argmin_G ‖BG − C‖² + β‖G‖²

For each subject’s set of brain images and each target model representation, the above regression model is trained and evaluated using nested 8-fold cross-validation.
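A minimal sketch of such a decoder with 8-fold cross-validation is shown below; using scikit-learn's RidgeCV for the inner regularization search is an assumption, and the shapes follow the description above.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

def brain_decoding_mse(B, C, alphas=(1e-2, 1e-1, 1.0, 10.0, 100.0)):
    """B: (n_sentences, 256) PCA-reduced brain images; C: (n_sentences, d) model
    representations. Returns the mean held-out squared error of the decoder G."""
    errors = []
    for train_idx, test_idx in KFold(n_splits=8, shuffle=True, random_state=0).split(B):
        G = RidgeCV(alphas=alphas).fit(B[train_idx], C[train_idx])
        pred = G.predict(B[test_idx])
        errors.append(((pred - C[test_idx]) ** 2).mean())
    return float(np.mean(errors))
```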
2. Analysis of Experimental Conditions
First, Figure 13 shows the performance of the BERT and GloVe models tested. Relative to the BERT baseline, most fine-tuned models increase the error on the brain decoding task under both evaluation metrics, while fine-tuning on the LM-scrambled-para task reduces brain decoding error. Fine-tuning on the control language modeling task and on the LM-pos task gives inconsistent results across the two metrics: MSE decreases, but AR shows no significant change.


Figure 13. Brain decoding performance of fine-tuned BERT models and GloVe baselines.
Then, the authors perform a coarse-grained similarity analysis of the representations, in the spirit of representational similarity analysis [3]. The degree to which the content of two models aligns is determined by measuring pairwise distances. For each fine-tuning run l of each task j, the pairwise cosine similarities between the rows of C_jl are computed to form a vector D_jl. The Spearman correlation ρ(D_jl, D_j’l’) then measures the similarity between the representations from run (j, l) and run (j’, l’). Figure 14 presents a heatmap of these similarity values, where each cell is the average over different runs of the two corresponding models.
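A sketch of this pairwise comparison for two representation matrices; row alignment across models and the choice of SciPy routines are assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def representation_similarity(C_a, C_b):
    """C_a, C_b: (n_sentences, dim) sentence representations from two runs/models,
    with rows aligned. Compare their pairwise cosine-similarity structure."""
    D_a = 1.0 - pdist(C_a, metric="cosine")   # vector of pairwise cosine similarities
    D_b = 1.0 - pdist(C_b, metric="cosine")
    rho, _ = spearmanr(D_a, D_b)
    return rho
```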
The authors draw three conclusions from this experiment: (1) the language modeling fine-tuning runs (especially the two scrambled-LM variants) are the only models that show reliably high correlations with each other; (2) for the same task, the representations produced by multiple runs of language modeling fine-tuning yield similar predictions of sentence-to-sentence distances, while other models show lower coherence across runs (see the diagonal of the matrix); (3) scrambled-LM fine-tuning produces stable sentence representations across runs and improves brain decoding performance.


Figure 14. Similarity of sentence encodings generated by each model (between -1 and 1; the higher the value, the more similar).
Finally, the article presents syntactic probing experiments that measure the extent to which word representations can reproduce a sentence’s syntactic parse. Figure 15 shows the results for the different fine-tuned models and the GloVe baseline. Models optimized for LM-scrambled and LM-scrambled-para (which improve brain decoding performance) gradually degrade on the syntactic probe during fine-tuning, although their performance still far exceeds the GloVe baseline. Figure 16 shows a representative example sentence with syntactic parses derived from the LM-scrambled representations (after 250 fine-tuning steps) and the GloVe baseline, where solid lines indicate correctly predicted dependencies and dashed lines indicate incorrect predictions.


Figure 15. Evaluation of syntactic probing across fine-tuning time.


Figure 16. Undirected syntactic probing of a representative example sentence, generated from the LM-scrambled (blue) and GloVe (red) representations; solid lines indicate correctly predicted dependencies and dashed lines indicate incorrect predictions.
3. Discussion of the Article
The authors find that the sentence encoding tasks tested in the experiments did not significantly enhance brain decoding performance. Through further task design and representation analysis, they show that tasks yielding syntactically light representations (removing some syntax-related information from the baseline BERT representations) can significantly improve brain decoding performance. These results suggest that the space of NLU models can help explain the human neural representation of language, but they also point to the limited ability of fMRI neuroimaging to resolve fine-grained syntactic information.
4. Summary of This Article
This article focuses on a fundamental research question: Understanding the relationship between natural language processing in machines and in the brain. Due to the spatial and temporal nature of brain neural imaging, NLP models have been applied to analyze and predict brain activity. On the other hand, in the field of machine learning theory, there are currently numerous neural network structure designs, parameter adjustments, or attempts to improve natural language processing through fine-tuning BERT, introducing contextual information, or training models using specialized databases. This foundational research question aims to: 1. Understand the principles inside the black box of NLP models, i.e., whether some representations can be extracted or utilized from NLP models that possess characteristics similar to those of human brain work and analysis records, thereby enabling NLP models to genuinely work like the human brain? 2. Conversely, can some parameters or even structures of NLP be modified to enhance their ability to predict brain activity (or even natural language predictions)?
The three papers analyze this question from different angles and reach different conclusions. The second and third papers share a similar perspective but differ in their final prediction targets. From their conclusions, it is clear that progress has been made in matching NLP models with human brain recordings, making effective use of fMRI, MEG, and other techniques to help improve BERT’s ability to predict brain activity, and showing that existing NLP and NLU models can partially explain the human neural representation of language: the first and second papers report relatively positive results, while the third reaches a more negative conclusion. This indicates that research on this question still has a long way to go. The experiments in these papers are relatively simple and intuitive, and it is not yet clear how to optimize models to exploit the various language-related information sources in the brain, or how to train effectively on brain activity data with low signal-to-noise ratios.
The authors do not provide a fundamental theoretical analysis of the relationship between natural language processing in machines and in the brain; they only demonstrate possible connections between the two experimentally. Even so, this foundational research shows the possibility of improving networks and systems under sound theoretical guidance, since relying solely on stacking computers will never be more effective than stacking human brains.
Analyst Introduction: Wu Jiying, PhD in Engineering, graduated from Beijing Jiaotong University. She previously served as an assistant researcher and research assistant at The Chinese University of Hong Kong and The Hong Kong University of Science and Technology, and is currently engaged in research on new technologies in the field of e-government. Her main research interests are pattern recognition and computer vision; she is passionate about scientific research and hopes to keep learning and making progress.
References cited in this article:
[1] Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[2] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal statistical society: series B (Methodological), 57(1), 289–300.
[3] Kriegeskorte, N., Mur, M., and Bandettini, P. A. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2:4.
[4] Lerner, Y., Honey, C. J., Silbert, L. J., and Hasson, U. (2011). Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. The Journal of Neuroscience, 31(8), 2906–2915.
About the Synced Global Analyst Network
The Synced Global Analyst Network is a global knowledge-sharing network for artificial intelligence initiated by Synced. Over the past four years, hundreds of AI professionals, students, and scholars from around the world have shared their research ideas, engineering experiences, and industry insights with the global AI community through online sharing, column interpretation, knowledge base construction, report publication, evaluation, and project consulting, all while gaining personal growth, experience accumulation, and career development.
