
Abstract: The transition from general large models to educational large models is an important trend in the deepening development of artificial intelligence large model technology. Based on an analysis of the current status, typical cases, and potential challenges of educational large models, this article posits that educational large models are artificial intelligence models for educational scenarios that have ultra-large-scale parameters and are trained on both general and specialized knowledge. They represent a synthesis of large model technology, knowledge base technology, and various intelligent educational technologies, and can promote the bidirectional construction of human learning and machine learning. The article further proposes an innovative architecture that is application-driven and based on co-construction and sharing, along with “learner-centered” future application scenarios. The aim is to establish open interfaces between artificial intelligence large models and various digital educational applications; to continuously train and refine educational scenario models that better address professional educational problems; and to form a cluster of open intelligent educational models and knowledge bases that a wide range of teachers and students can use routinely. This approach seeks to extract and distill deep educational knowledge while addressing the risks and challenges of applying artificial intelligence in education.
Keywords: Educational large models; Generative artificial intelligence; Intelligent education; Educational big data
Currently, artificial intelligence large model technologies represented by ChatGPT, Gemini, Wenxin Yiyan, and iFlytek Spark are developing rapidly and have attracted widespread attention globally. With powerful natural language processing capabilities, large models can complete complex tasks such as answering questions, content creation, and code generation, demonstrating tremendous potential to liberate social productivity and profoundly impacting human information acquisition, knowledge structure, and educational models. However, these general large models are not adept at solving specialized educational problems, making the transition from general large models to specialized large models in the educational field an inevitable trend in the deepening development of artificial intelligence large model technology. Educational large models are not merely fine-tuned and optimized versions of general large models; instead, they represent a systemic transformation aimed at reconstructing the future educational landscape, based on an open algorithm model architecture and centered around innovative educational application scenarios. Clarifying the conceptual connotations of educational large models and designing the system architecture based on the essence of technology to further create new educational application scenarios has become a pressing issue for the digital transformation and intelligent upgrading of education.
// I. Current Status of Educational Large Models
1. Definition and Distinction of Related Core Concepts
As an emerging research field, large models have generated many related concepts in academia, such as AIGC, generative artificial intelligence, and large models. Clarifying the connotations of these concepts is crucial for deepening the understanding of educational large models and constructing high-quality educational large models.
① AIGC (Artificial Intelligence Generated Content) is a term proposed in contrast to professionally generated content (PGC) and user-generated content (UGC). AIGC draws on intelligent technologies such as supervised learning, reinforcement learning, pre-trained models, and natural language processing, and learns from existing data to automatically generate content in various forms, such as text, images, audio, video, and 3D interactive content[1].
② Generative artificial intelligence is an artificial intelligence technology that automatically generates response content based on natural language dialogue prompts (Prompt)[2]. The technical implementation process of generative artificial intelligence typically consists of two steps: first, training or learning based on existing data (pre-training); second, when new instructions or commands are input, automatically generating new content based on the learned intent.
③ Large models refer to artificial intelligence models with tens of billions to hundreds of billions or more trainable parameters. They are the product of the joint development of deep learning, GPU hardware, and large-scale datasets. The powerful capabilities exhibited by large models are essentially the result of “quantitative change leading to qualitative change” in artificial intelligence algorithms, a phenomenon vividly described as “emergent intelligence”: the ability to automatically learn and discover new, higher-level features and patterns from the original training data[3]. These abilities show most clearly in strong, general understanding of user intent, continuous contextual dialogue, intelligent interactive correction, and the generation of new content.
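The two-step process described under ② (learn from existing data, then generate in response to a prompt) can be illustrated with a deliberately tiny sketch. This is an assumption-laden toy, not how any production large model works: it swaps the neural network for a bigram count table, and the function names `pretrain` and `generate` are hypothetical.

```python
from collections import Counter, defaultdict


def pretrain(corpus: list[str]) -> dict[str, Counter]:
    """Step 1 (pre-training): count which word follows which in existing text."""
    table: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            table[current][following] += 1
    return table


def generate(table: dict[str, Counter], prompt: str, max_new_words: int = 5) -> str:
    """Step 2 (generation): extend the prompt with the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        candidates = table.get(words[-1])
        if not candidates:
            break  # nothing was learned about what follows this word
        # Pick the highest count; break ties alphabetically so output is deterministic.
        next_word = min(candidates, key=lambda w: (-candidates[w], w))
        words.append(next_word)
    return " ".join(words)


if __name__ == "__main__":
    corpus = [
        "large models generate text",
        "large models generate images",
        "large models generate text quickly",
    ]
    print(generate(pretrain(corpus), "large", max_new_words=3))
```

The point of the sketch is only the separation of concerns: everything the generator can say was fixed during the first step, which is why the quality and coverage of training data matter so much for educational use.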
In summary, AIGC, generative artificial intelligence, and large models are closely related concepts, all emphasizing a leap from passive design to active production in the new generation of artificial intelligence technology, representing the evolutionary trend of a new round of technological revolution. However, the three concepts emphasize different technical characteristics: AIGC emphasizes the diversity of generated content types, generative artificial intelligence emphasizes autonomous generation and creativity, and large models emphasize the parameter characteristics of algorithm models.
On this basis, this study posits that educational large models are artificial intelligence models for educational scenarios that have ultra-large-scale parameters and are trained on both general and specialized knowledge. They are a synthesis of large model technology, knowledge base technology, and various intelligent educational technologies, and can promote the bidirectional construction of human learning and machine learning. They not only encompass the educational knowledge necessary for the inheritance of human civilization but also distill educational experiences and methods that previously existed only in the minds of teachers. They guide learners to think deeply through human-machine dialogue, support learners’ autonomous exploration, and are continuously updated and upgraded in the process to reach higher professional levels.
2. Research Dynamics of Educational Large Models
Educational large models have immense transformative potential, not only provoking new understandings and reflections on education and teaching but also profoundly altering educational concepts, content, and models, potentially reshaping school education forms[4]. Some scholars believe that generative artificial intelligence can promote high-quality educational development through human-machine co-teaching, inclusive intelligence, and interactive evaluation[5], spawning “more open and inclusive, interdisciplinary fusion” educational teaching concepts, forming a diversified AI teaching system that is “human-centered and education-assisted”[6]; other scholars argue that educational large models will reconstruct the structure of school education, gradually shifting standardized evaluation methods towards personalized evaluation standards, forming a lifelong “credit bank”[7].
At the same time, educational large models may also bring potential risks such as intellectual property disputes, data usage biases, and algorithm abuse. The quality and safety of generated content still need improvement, as there are widespread issues of lack of coherence and logic, making it unsuitable for all educational scenarios[8]. Some scholars point out that educational large models may imbalance human-machine relationships, leading to technological dependence, data security, and other ethical risks[9][10].
In conclusion, the rapid development of artificial intelligence presents significant opportunities for education while also posing a series of unknown risks. How to make educational large models effectively support educational reform and innovation, and how to develop intelligent education that remains humane and caring, has become a common challenge for education worldwide.
3. Current Applications of Educational Large Models
Currently, countries around the world are placing great importance on exploring the applications of educational large models. The United States released “Artificial Intelligence and the Future of Teaching and Learning: Insights and Recommendations,” summarizing the opportunities and risks of artificial intelligence in teaching, learning, assessment, and research, and proposing seven action recommendations for the application of the next generation of artificial intelligence in teaching and learning[11]. The UK published “Generative Artificial Intelligence (AI) in Education,” suggesting that the education sector should make full use of new technologies to provide learners with high-quality education and equip them with the capabilities to adapt to societal development[12]. Meanwhile, schools have also begun corresponding practical explorations. In September 2023, Hong Kong, China, developed an artificial intelligence curriculum for junior high school students, requiring public schools to offer 10-14 hours of AI courses covering topics such as ChatGPT, artificial intelligence ethics, and the social impact of AI. In October 2023, Japan’s Ministry of Education announced that 53 primary and secondary schools would serve as pilot schools for generative artificial intelligence, aiming to enhance the efficiency of educational activities and school management through the use of new technologies[13]. Australia announced that starting in 2024, artificial intelligence, including ChatGPT, will be permitted in all schools[14]. These practical explorations highlight the significant role of the next generation of artificial intelligence technology in education and underscore the unstoppable trend of digital transformation and intelligent upgrading in education.
4. Analysis of Typical Cases of Educational Large Models
Globally, extensive and in-depth explorations and developments of educational large models are underway, with solutions already formed in areas such as speaking practice, mathematics learning, emotional analysis, and personalized recommendations. This study has summarized five typical application cases of educational large models (as shown in Table 1), analyzing their application scenarios, technological progress, and existing shortcomings.
Table 1 Typical Application Cases of Educational Large Models

From the perspective of application scenarios, Spark Voice Partner is mainly used for language learning, supporting real-time translation of multilingual text, speech, and images, and can correct grammatical errors and provide speaking practice. EmoGPT is used for psychological counseling, capable of recognizing and responding to user emotions, providing ongoing psychological support. MathGPT caters to global mathematics enthusiasts and research institutions, offering algorithms for problem-solving and explanations, supporting users in mathematical problem-solving and practice. Zhihai-Sanle is used for AI knowledge learning, providing search engines, computing engines, and local knowledge bases, supporting intelligent Q&A and test question generation. Khanmigo offers personalized learning plans for learners through a conversational AI chatbot, covering multiple disciplines such as mathematics and science.
From a technological progress perspective, educational large models exhibit significant advantages in model performance, application scenarios, and technical characteristics, covering most subject content and primarily focusing on autonomous learning scenarios, including knowledge Q&A, language learning, learning guidance, and teaching assistance. In terms of technical routes, the “general + fine-tuning” approach has proven effective, with many technical solutions based on general large models achieving effective responses to specific subject knowledge through instruction fine-tuning.
Regarding existing shortcomings, current educational large models face limitations in accuracy, diversity of teaching content, support for core educational scenarios, and inclusivity of learner diversity, with high error rates and a lack of empathetic understanding capabilities. They primarily focus on subject knowledge teaching and examination-oriented educational contexts, lacking in cross-disciplinary learning and the cultivation of learners’ comprehensive abilities and higher-order thinking. They mainly concentrate on supporting autonomous learning, with insufficient exploration of how to fully leverage large models in real classroom, peer collaboration, and blended teaching scenarios.
In summary, the application of large models in the field of education has made significant progress but still faces real issues that need to be addressed, requiring further enhancement of the quality and scale of training data, especially embedding advanced educational concepts, deep educational knowledge, and the genuine needs of core educational scenarios into technical design, and iterating through user feedback to form smarter and more flexible educational large models.
// II. Major Challenges Faced by Educational Large Models
The continuous updates and upgrades of artificial intelligence technology are driving the large-scale application of models. The deployment of vertical domain large models has already shown great success in scenarios such as intelligent customer service, digital assistants, and multimodal retrieval, and in the future, they will be deeply integrated into various fields and aspects of economic and social development, empowering intelligent upgrades across numerous industries and driving a leap in social productivity. However, achieving true automation and intelligence in education often faces higher demands compared to other fields, as most educational tasks are “non-programmable,” making automation more challenging[15], and there are a series of severe challenges in terms of capabilities, values, data, and algorithms.
In terms of capabilities, educational large models possess strong content generation and creative abilities, capable of directly providing answers to questions. Over-reliance on large models may lead to cognitive laziness among teachers and students, weakening their problem-solving abilities and further exacerbating the passivity, superficiality, and fragmentation of knowledge acquisition. Over time, this can result in a decline in human cognitive abilities. In reality, educational large models merely simulate human cognitive capabilities and do not possess true wisdom, nor do they have the ability to “solve the unknown” or “innovate knowledge.” It is crucial to guide teachers and students to develop the ability and literacy to effectively utilize educational large models.
In terms of values, the value biases brought by educational large models may lead to the occurrence of “hallucinations,” generating erroneous or non-existent content. If the training data carries a particular value perspective, texts that align with that perspective will be repeatedly output, potentially being recognized as key texts and standard answers. Some large models based on Western-language corpora may covertly project “value and cultural standard answers” through content products such as texts, images, and videos, subtly penetrating the value systems of adolescents and exacerbating the “digital colonialism” of marginalized groups[16]. Therefore, it is essential to strengthen educational objectives and value guidance, focusing on “value alignment,” and establishing corresponding risk prevention and intervention mechanisms.
In terms of data, educational large models require massive training data, further expanding risks related to data security and privacy protection, making student and teacher privacy issues an unprecedented challenge. We must strengthen the security of educational data while ensuring the quality of content generated by large models, encrypting and decrypting raw data to prevent privacy data leaks while establishing effective mechanisms for co-construction and sharing of educational data, expanding high-quality public training data resources to promote the healthy and sustainable development of educational large models.
In terms of algorithms, educational large models based on deep learning are often seen as “black boxes,” making it difficult to explain their decision-making processes, which may lead to educational behaviors that are hard to understand and accept, potentially undermining the learner’s subject position. From a certain perspective, personalized learning algorithms seem to improve the accuracy of information delivery, but they may also trap learners in an “information cocoon,” seeing only information that aligns with their existing viewpoints, narrowing their horizons and further affecting their comprehensive development.
Currently, educational large models are at a critical stage of research and development application, requiring further enhancement of the quality and scale of training data, particularly embedding advanced educational concepts, deep educational knowledge, and the genuine needs of core educational scenarios into the underlying architecture of algorithm models, and iterating in conjunction with learner needs to realize the practical implementation of educational large models.
// III. Innovative Architecture of Educational Large Models
Currently, the research and development of educational large models mainly follows two technical routes: one directly invokes general large models and gives them certain professional capabilities through fine-tuning or prompt learning; the other uses specialized data from the education field to train large models designed specifically for educational tasks. Although both routes have made some progress, their effectiveness still needs improvement. The problems stem from insufficient specialized training data and inadequate deep knowledge of the education field, so current large models are not intelligent enough and struggle to handle complex and variable educational tasks flexibly. The key to developing educational large models lies in integrating the two routes. This is not simple addition but an application-driven method of co-construction and sharing: data are continuously obtained from routine educational applications through open data interfaces, organically combining “large models” with “small models” and “big data” with “small data” to meet the actual needs of teachers and students in daily teaching and to break down “data silos”[17]. At the same time, expert knowledge bases supplement large models[18], educational knowledge and pedagogical methods are deliberately taught to large models, and various intelligent educational technologies are integrated, forming specialized large models that can flexibly handle complex educational tasks.
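One common way to let a knowledge base supplement a large model is to retrieve relevant entries and place them in the prompt before the learner's question. The sketch below illustrates that idea only; the article does not specify a mechanism, so the keyword-overlap retrieval, the `KnowledgeEntry` type, and the prompt layout are all assumptions made for illustration, and the call to an actual model is left out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeEntry:
    """Hypothetical record in an expert knowledge base."""
    topic: str
    content: str


def _words(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, knowledge_base: list[KnowledgeEntry], top_k: int = 2) -> list[KnowledgeEntry]:
    """Return up to top_k entries sharing at least one word with the question."""
    question_words = _words(question)
    scored = [
        (len(question_words & _words(entry.topic + " " + entry.content)), entry)
        for entry in knowledge_base
    ]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep input order
    return [entry for _, entry in relevant[:top_k]]


def build_prompt(question: str, knowledge_base: list[KnowledgeEntry]) -> str:
    """Assemble the text that would be sent to a general large model."""
    entries = retrieve(question, knowledge_base)
    context = "\n".join(f"- {e.topic}: {e.content}" for e in entries) or "- (no matching entry)"
    return f"Expert knowledge:\n{context}\n\nLearner question: {question}"
```

A real system would likely replace the word-overlap score with semantic search, but the division of labor stays the same: the knowledge base supplies vetted pedagogical content, and the model supplies language ability.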
1. Underlying Logic
The core competitiveness of educational large models lies not in technology or data, but in a deep understanding of education. The concept of “learner-centered” should be regarded as the underlying logic for developing educational large models, integrated into the entire process of algorithm model architecture design and prototype development. “Learner-centered” means configuring educational resources around learners’ needs, interests, and abilities, aiming for the proactive and creative development of learners, designing learning activities, and planning growth paths, thereby achieving large-scale personalized education. Guided by this concept, educational large models are no longer cold machines or tools but important assistants and collaborative entities that promote learning, optimize learning, and stimulate learning, helping learners transition from passive recipients of knowledge to active seekers, explorers, and collaborators. However, this “learner-centered” approach does not equate to “precise question answering” and should not fall into the trap of “efficient exam preparation.” It must adhere to the principle that “every learner is a person of comprehensive development and multidimensionality,” utilizing educational large models to understand learners’ growth states to provide personalized, adaptive, and warm learning support and teaching guidance services, promoting learners’ comprehensive and individualized development.
2. Open Innovative Architecture
Educational large models are based on general large models, continuously training educational scenario models by connecting various digital educational applications, thereby continuously enhancing their capabilities to solve professional educational tasks. The open innovative architecture of educational large models consists of three layers: the foundational capability layer (L0), the professional capability layer (L1), and the application service layer (L2), as illustrated in Figure 1.

Figure 1 Open Innovative Architecture of Educational Large Models
L0: Foundational Capability Layer. This layer includes large language models, video analysis models, subject-specific large models, and emotional computing models. Among them, large language models are responsible for processing text data; video analysis models handle video data, such as classroom recordings; subject-specific large models manage subject-specific tasks; emotional computing models address indicators related to physical and mental health, involving tasks like monitoring emotional states during learning processes and interpersonal emotional analysis. In the task completion process, multiple large models work collaboratively and support each other, with the task center integrating and processing the results output by different models.
L1: Professional Capability Layer. This layer consists of two parts: ① Educational Scenario Model Library. The educational scenario model library primarily includes models for learning behavior analysis, classroom interaction analysis, competency assessment, academic prediction, emotional computing, and decision support, with a portion of commonly used models pre-configured and continuously optimized and expanded during application. ② Expert Knowledge Base. The expert knowledge base comprises two types of knowledge: subject content knowledge and subject teaching knowledge. These two types of knowledge are integrated and stored and presented in the form of multi-dimensional dynamic knowledge graphs. As the teaching process evolves, both teachers and students become users of the knowledge graph while also serving as co-editors and creators, ultimately forming individual knowledge graphs for learners as well as shared knowledge graphs at the class, school, and regional levels.
L2: Application Service Layer. The most important innovative concept of educational large models is “application-driven,” meaning various digital educational applications are integrated into the large model, which empowers applications while continuously feeding application data back into the large model to enhance its educational professional capabilities. These applications encompass various educational scenarios related to teaching, learning, assessment, and management, forming a unified standard of high-quality training data through open data interfaces. Simultaneously, teacher and student users can issue task instructions through a unified portal, with the large model automatically invoking the corresponding functional modules based on the nature of the tasks, resulting in a learner-centered application model that allows seamless usage of the large model even without any knowledge of artificial intelligence.
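The interplay of the three layers can be sketched in a few lines: a unified portal receives a task, a task center invokes the registered models, and the interaction is logged so that application data can later flow back into training. Everything here is a hypothetical illustration; the class name `TaskCenter`, the task types, and the stub models are assumptions, not components named by the architecture in Figure 1.

```python
from dataclasses import dataclass, field
from typing import Callable

Module = Callable[[str], str]  # a model or functional module: payload in, result out


@dataclass
class TaskCenter:
    """Hypothetical task center behind the unified portal."""
    modules: dict[str, list[Module]] = field(default_factory=dict)
    feedback_log: list[dict] = field(default_factory=list)

    def register(self, task_type: str, module: Module) -> None:
        """Attach a module (for example an L0 or L1 model) to a task type."""
        self.modules.setdefault(task_type, []).append(module)

    def submit(self, task_type: str, payload: str) -> list[str]:
        """Invoke every module registered for the task and collect their outputs."""
        handlers = self.modules.get(task_type)
        if not handlers:
            raise ValueError(f"no module registered for task type {task_type!r}")
        results = [handler(payload) for handler in handlers]
        # Record the interaction as candidate training data for later refinement.
        self.feedback_log.append(
            {"task_type": task_type, "payload": payload, "results": results}
        )
        return results


def language_model_stub(payload: str) -> str:
    """Stand-in for a large language model call."""
    return f"text analysis of {payload}"


def emotion_model_stub(payload: str) -> str:
    """Stand-in for an emotional computing model call."""
    return f"emotion estimate for {payload}"
```

For example, after registering both stubs under a task type such as `"classroom_review"`, a single `submit` call returns both outputs and leaves one entry in `feedback_log`. The user never chooses a model, which is the sense in which the portal hides the underlying artificial intelligence.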
3. Construction and Deployment Ideas
Educational large models are not a single, closed model but a process in which developers and users participate collaboratively and continuously improve. Professional teams develop the foundational architecture and core components of the model, while various users contribute to its optimization during application. A broad spectrum of teachers and students, as well as developers of various digital educational applications, are both users and contributors to educational large models, thereby forming a co-constructed and shared intelligent educational innovation ecosystem. This process will undergo two important phases: ① In the foundational construction phase, based on the dual-driven artificial intelligence technology route of “data + knowledge,” various educational artificial intelligence technologies are integrated to establish model systems, application systems, and data systems, forming a continuous training mechanism for models. From a technical implementation perspective, this phase includes seven steps: large-scale diverse educational data collection, data preprocessing, feature engineering, model design, model pre-training, fine-tuning and transfer learning, and model evaluation and optimization. ② In the application improvement phase, the educational large model continuously innovates algorithms and models, innovates data applications, and develops applications.
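The seven steps of the foundational construction phase form an ordered pipeline in which each stage consumes the output of the previous one. The sketch below shows only that ordering and hand-off; the stage identifiers are paraphrased from the text, and the `run_pipeline` helper and the dictionary-based state are assumptions made for illustration.

```python
from typing import Callable

Stage = Callable[[dict], dict]  # each stage takes the pipeline state and returns a new one

# Stage order as listed in the text for the foundational construction phase.
STAGE_NAMES = [
    "data_collection",
    "data_preprocessing",
    "feature_engineering",
    "model_design",
    "pre_training",
    "fine_tuning_and_transfer_learning",
    "evaluation_and_optimization",
]


def run_pipeline(stages: dict[str, Stage], state: dict) -> dict:
    """Run all seven stages in order, recording which ones have completed."""
    for name in STAGE_NAMES:
        if name not in stages:
            raise KeyError(f"missing stage: {name}")
        state = stages[name](state)
        # Build a new state dict so the caller's input is never mutated.
        state = {**state, "completed": [*state.get("completed", []), name]}
    return state


def placeholder_stage(state: dict) -> dict:
    """Stand-in for real work such as cleaning data or training a model."""
    return dict(state)
```

In the application improvement phase, the same pipeline could be rerun whenever new application data arrives, which is one way to read the "continuous training mechanism" the text describes.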
// IV. Application Prospects of Educational Large Models
Educational large models will promote the digital transformation and intelligent upgrading of education from three aspects: learning spaces, learning resources, and the roles of teachers, forming a new ecosystem of education characterized by human-machine collaboration and coexistence.
1. Interactive Generation of Learning Spaces
With the support of educational large models, learners obtain learning support and create learning outcomes through human-machine interaction, constructing personal and collective learning spaces and forming learning scenarios that integrate physical and online spaces, so that all learners can access any information they need at any time and place[19]. On the one hand, learners use tools such as knowledge graphs and digital textbooks to compile collections of learning outcomes, establish personal and team knowledge bases, and collaboratively write “digital learning cases.” This forms a learning community based on the routine co-construction and sharing of learning resources, develops a user evaluation and labeling mechanism for learning resources, and shifts learning from knowledge consumption to knowledge generation. On the other hand, intelligent algorithms distill learners’ experiences from learning behavior data and summarize them into new learning methods, continuously optimizing and refining educational teaching strategy models and knowledge bases and achieving bidirectional empowerment of human learning and machine learning.
2. On-Demand Supply of Learning Resources
Leveraging the learning analysis capabilities of educational large models, the gap between the demand and supply sides of educational resources can be narrowed, providing personalized learning resources for learners and addressing the issue of matching quality educational resource supply with learning demands. On one hand, a classification system for learning demands in the resource application process is established. Based on known resource classifications, learners label resource types, continuously expanding and enriching the resource classification framework to make resource labels more aligned with real learning needs. From a technical perspective, this process is essentially the “alignment” of the educational large model with human values, ensuring that the model adheres to human values, preferences, and ethical principles while providing learners with vast and appropriate learning resource support. On the other hand, a user-centered evaluation mechanism for educational resources is established, promoting the survival of the fittest among educational resources based on feedback from teachers and students, stimulating the enthusiasm of teachers and students to utilize resources, and discovering and cultivating a group of high-quality educational resource builders, advancing the transition from “educational-specific resources” to “educational big resources.”
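The two mechanisms described above, learner-supplied labels that extend the classification framework and feedback-driven ranking of resources, can be sketched as follows. The article does not prescribe a scoring rule; the smoothed average used here (which pulls sparsely rated resources toward a neutral prior so that one enthusiastic rating does not outrank many good ones) is an assumption, as are the `Resource` type and the parameter defaults.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical learning resource with learner labels and 1-5 ratings."""
    title: str
    labels: set[str] = field(default_factory=set)
    ratings: list[int] = field(default_factory=list)


def add_label(resource: Resource, label: str, taxonomy: set[str]) -> None:
    """Attach a learner-supplied label; unseen labels extend the shared taxonomy."""
    normalized = label.strip().lower()
    resource.labels.add(normalized)
    taxonomy.add(normalized)


def smoothed_score(resource: Resource, prior_mean: float = 3.0, prior_weight: int = 5) -> float:
    """Average rating blended with a neutral prior worth prior_weight ratings."""
    total = sum(resource.ratings) + prior_mean * prior_weight
    return total / (len(resource.ratings) + prior_weight)


def rank(resources: list[Resource]) -> list[Resource]:
    """Order resources from highest to lowest smoothed score."""
    return sorted(resources, key=smoothed_score, reverse=True)
```

With these defaults, a resource rated 4 by twenty users ranks above one rated 5 by a single user, so sustained feedback from many teachers and students decides which resources surface.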
3. Transformation and Upgrade of Teacher Roles
Educational large models will gradually replace repetitive and inefficient educational labor, enhancing the scientific and creative aspects of educational work, prompting teachers to transition from “experts in teaching” to “experts in learning”[20], providing personalized support for each learner through creative teaching design. On one hand, relying on intelligent knowledge bases and application sets, large-scale and precise knowledge transmission can be conducted, freeing teachers’ time, energy, and creativity, allowing them to focus on organizing and guiding learning activities, fostering deeper teacher-student dialogues. On the other hand, teachers will gradually become learning guides and teaching researchers, likely gaining insights into learners’ thought processes through human-machine collaborative learning data analysis and diagnostics, and designing targeted educational activities based on learners’ actual learning states, moving experiential teaching towards evidence-based professional practice. In the future, teachers will be both experienced and wise practitioners and researchers adept at utilizing big data analysis and intelligent teaching research tools, further exploring new educational laws and teaching methods under the conditions of artificial intelligence, ultimately forming a new knowledge system for intelligent education.
// V. Conclusion
Currently, there is still a gap between China’s educational large model technology and the leading international level, but development momentum is good: a certain technological strength has been accumulated, and there is a significant advantage in massive data. By relying on large-scale, multimodal, long-cycle data from the education field, it becomes possible to accurately capture and deeply understand the learning process, further clarify the underlying mechanisms of teaching and learning, promote rapid iteration of educational algorithms, and establish more targeted, professional, and accurate large language models, achieving leapfrog development. At the same time, educational large models will bring a series of new challenges and unknown risks. Their development principles and scope of use must be clarified promptly, ethical risk assessment and review must be strengthened, and targeted guidelines must be formulated for teachers and students, so that educational equity, inclusiveness, and sustainable development are upheld throughout the lifecycle of educational large model development and application.
References
[1] Jiang Sha, Zhao Mingfeng, Zhang Gaoyi. An Analysis of the Application Progress of Generative Artificial Intelligence (AIGC) [J]. Mobile Communications, 2023, (12): 71-78.
[2] Miao F C, Holmes W. Guidance for Generative AI in Education and Research [OL].
[3] Wei J, Tay Y, Bommasani R, et al. Emergent Abilities of Large Language Models [OL].
[4] Li Yongzhi. Opening New Tracks for Educational Development with Digitalization [N]. People’s Daily, 2023-10-13 (9).
[5] Li Yanyan, Zheng Yafeng. Educational Applications of Generative Artificial Intelligence [J]. People’s Forum, 2023, (23): 69-72.
[6] Chen Xiaohong, Yang Ningyi, Zhou Yanju, et al. A Research Review on the Impact of AIGC Technology on Education and the Labor Market in the Digital Economy Era: Taking ChatGPT as an Example [J]. Systems Engineering Theory and Practice, 2024, (1): 1-13.
[7] Zhu Yongxin, Yang Fan. ChatGPT/Generative Artificial Intelligence and Educational Innovation: Opportunities, Challenges, and the Future [J]. Journal of East China Normal University (Education Science Edition), 2023, (7): 1-14.
[8] Lu Yu, Yu Jinglei, Chen Penghe, et al. Educational Applications and Prospects of Generative Artificial Intelligence: Taking the ChatGPT System as an Example [J]. China Distance Education, 2023, (4): 24-31.
[9] Yang Junfeng, Shen Zhongqi, Chen Ruining. An Exploration of the Educational Applications and Ethical Risks of Generative Artificial Intelligence [J]. Journal of Huzhou Normal University, 2024, (1): 1-8.
[10] Chen Qianqian, Zhang Lixin. Ethical Reflections on Educational Artificial Intelligence: Phenomenon Analysis and Vision Construction—Based on the Perspective of “Human-Machine Collaboration” [J]. Journal of Distance Education, 2023, (3): 104-112.
[11] US Department of Education. Artificial Intelligence and the Future of Teaching and Learning: Insights and Recommendations [OL].
[12] UK Government. Generative Artificial Intelligence (AI) in Education [OL].
[13] Jia Yun. Japan Establishes Pilot Schools for Generative Artificial Intelligence [J]. International Education Exchange, 2023, (6): 78-79.
[14] Australian Government Department of Education. Australian Framework for Generative Artificial Intelligence (AI) in Schools [OL].
[15] Autor D H, Levy F, Murnane R J. The Skill Content of Recent Technological Change: An Empirical Exploration [J]. The Quarterly Journal of Economics, 2003, (4): 1279-1333.
[16] Miao Fengchun. Principles of Generative Artificial Intelligence Technology and Its Applicability in Education [J]. Modern Educational Technology, 2023, (11): 5-18.
[17] Sang Xinmin, Xie Yangbin, Yu Zhong, et al. System Engineering Discussions on the Digital Transformation of Education [J]. Modern Educational Technology, 2023, (1): 5-16.
[18] Wei Bin. Analysis of the Fusion Path of Symbolism and Connectionism in Artificial Intelligence [J]. Research on Dialectical Materialism, 2022, (2): 23-29.
[19] Cao Peijie. Smart Education: Educational Reform in the Age of Artificial Intelligence [J]. Educational Research, 2018, (8): 121-128.
[20] Cao Peijie. The Three Realms of Educational Reform in the Age of Artificial Intelligence [J]. Educational Research, 2020, (2): 143-150.
Article cited from: Cao Peijie, Xie Yangbin, Wu Huizi, Yang Yuanyuan, Shen Yuan, Zuo Xiaomei, Huang Baozhong. Current Status, Innovative Architecture, and Application Prospects of Educational Large Models [J]. Modern Educational Technology, 2024, 34(02): 5-12.
