ProfAI – Teaching Co-agent for Pedagogical Mediation in Virtual Learning Environments

← Back to Portfolio

Abstract

Learning Management Systems (LMS), such as Moodle, systematize tasks but offer instructors limited support for individualized student accompaniment at scale. This paper presents ProfAI, an intelligent agent designed to support pedagogical activities within Moodle. The system combines Retrieval-Augmented Generation (RAG) for static content with LMS integration for dynamic interactions. Execution takes place on local or institutional infrastructure, avoiding reliance on external online services and ensuring compliance with personal data protection regulations. ProfAI provides students with individualized feedback and acts as a teaching assistant within the pedagogical dynamic, contributing to secure and replicable models of intelligent interactive agents in e-learning.

Keywords: agent, Moodle, LLM, RAG, AI, e-learning

License & Copyright

Copyright: © 2026 Universidade Aberta
License: Creative Commons Attribution 4.0 International (CC BY 4.0)

This work is licensed under CC BY 4.0, which permits sharing, adaptation, and commercial use with proper attribution. This English translation and summarized version are provided for portfolio purposes with full attribution to the original publication.

Original Publication:
Aragão, G., & Morgado, L. (2025). ProfAI – Agente codocente para mediação pedagógica em ambientes virtuais de aprendizagem. Revista de Ciências da Computação, 20(2). https://journals.uab.pt/index.php/rcc/article/view/431

1. Introduction (Summarized)

Online teaching has become essential in higher education, with Learning Management Systems like Moodle serving as central platforms. However, pedagogical mediation still depends heavily on human intervention, with teachers facing significant cognitive and temporal burdens when providing individualized support to large numbers of students. Existing automated tools rely on rigid pre-programmed rules and require continuous manual configuration, limiting their effectiveness in proactive pedagogical accompaniment.

This article explores the technical feasibility of ProfAI, a co-teaching agent that acts proactively rather than reactively. The system combines RAG (Retrieval-Augmented Generation) architecture with dynamic Moodle integration, local execution, and containerization via Docker and Ollama, ensuring GDPR compliance while providing personalized support to students at scale.

Note: This is a summarized version for portfolio purposes. View the complete article published in Revista de Ciências da Computação, 2025.

2. Current Scenario (Summarized)

Recent years have seen significant growth in AI use in educational environments, particularly with intelligent agents and Large Language Models (LLMs). Traditional chatbots, which dominated e-learning for the past decade, relied heavily on rule-based systems and pre-configured dialogue flows, limiting their effectiveness in complex interactions.

The introduction of the Transformer architecture (Vaswani et al., 2017) and subsequent development of models like GPT revolutionized natural language processing. However, challenges persist: proper handling of personal data remains critical, with GDPR compliance being a major concern when educational systems transmit student data to external LLM services. Additionally, traditional bots require continuous user interaction, contrasting with educational needs for proactive, prolonged pedagogical support.

3. Problem, Concept and Proposal

3.1. Problem and Concept

The use of "classic" bots in e-learning tends to require extensive configuration and manual maintenance (e.g., conditional execution, definition of triggers and decision rules), which limits their capacity to generalize and adapt to new situations. Recent reviews of personalized feedback in digital environments show a predominance of rule-based solutions (~74%), often implemented as autonomous programs that consume LMS data and send weekly emails driven by "if-then" rules (e.g., OnTask, SARA), precisely the type of configuration that demands manual work from the teacher or technical team [Maier & Klotz, 2022]. A related review corroborates this picture: the majority of chatbots (88%) follow pre-defined flows based on strict conditional logic, showing how widespread this model remains [Kuhail et al., 2022].

Furthermore, the generation of personalized feedback is rarely automated contextually: systematic reviews demonstrate that the majority of feedback approaches still depend on manual adjustments and human interpretation, limiting the scalability of pedagogical practices [Maier & Klotz, 2022].

Thus, the manual nature of these processes constitutes one of the main obstacles to the personalization and sustainability of technology-mediated teaching, reinforcing the need for intelligent agents capable of acting as co-teachers in support of the teacher. Moreover, systematic reviews of educational chatbots reveal significant limitations even in systems that use machine learning to optimize responses: the most recurrent limitation is training on insufficient or inadequate datasets, which results in an inability to answer student questions adequately, generating frustration and compromising the learning process [Kuhail et al., 2022]. It was also noted that interacting with a chatbot caused students to lose interest in the activities at hand, a loss that did not occur with human interlocutors, showing that this conversational style can negatively impact learning [Kuhail et al., 2022].

These factors suggest that automation can enable intelligent mediation that acts proactively, without requiring an explicit conversation, offering a possible solution to the problems cited above. Such a tool also needs the context of the discipline as a whole: its content, the course schedule, the grading scheme, the student list, and so on. Another important factor is reducing the manual configuration demanded of the teacher: the system must be able to self-regulate, that is, to identify when and how to act according to the needs of the virtual environment (such as a student question that needs answering) and the temporal moment (such as the start date of an evaluation period).

Hence the proposal of a co-teacher, or co-professor: an agent that has the necessary context to act and understands when and how to act in the virtual learning environment.

For such a self-regulating agent to work, it must avoid bias and be able to operate across a variety of topics. According to Shulman [1986], a good teacher needs several kinds of knowledge: Subject Matter Knowledge, Pedagogical Content Knowledge (PCK), and Curricular Knowledge; above all, the teacher must know how to transform that knowledge into comprehensible teaching.

Achieving this "good" professor requires a vast amount of data, of the kind a human teacher accumulates over many years of exposure. Thanks to LLM technology, such data becomes accessible through natural language. Although no studies allow us to conclude that LLMs have the same level of knowledge as a higher education teacher, there are reports of LLMs performing well compared to doctorate-level professionals, although several of these are internal to companies such as OpenAI [Learning to reason with LLMs, 2024].

Beyond the knowledge necessary to be a teacher, the proposed co-teacher also needs to act on its own initiative, not just react. Just as a teacher has the physical capacity to act, such as correcting tests or publishing content in the virtual learning environment, the co-teacher also needs the capacity to act: to have agency. We can understand the co-teaching agent as having two main parts, the brain and the arms/legs. The brain is the LLM, which contains the knowledge necessary to "think" like a professor. The arms/legs are the tools to which the agent has access in order to act, such as permissions in the virtual learning environment to publish a message in a forum. These tools must allow the agent to interact with the environment and with students: the agent must be able to understand what is happening in the disciplines, obtain answers from students, correct assignments, and publish materials.

3.2. Proposal (ProfAI)

To address the challenges of manual orchestration and lack of personalization identified in the literature, ProfAI is presented as an intelligent agent conceived as a co-teaching agent capable of acting proactively in virtual learning environments. The system combines a RAG (Retrieval-Augmented Generation) architecture — a technique that allows the language model to obtain and use specific content, such as study materials provided by the teacher as primary sources of knowledge, instead of depending exclusively on its general training [Thoeni & Fryer, 2025] — with dynamic integration with the Moodle LMS, operating securely and replicably on local infrastructure.

The project was developed in Node.js (TypeScript) using frameworks for rapid prototyping and development, notably LangChain. A local LLM was used to ensure compliance with GDPR. For this, the Ollama tool was adopted, which runs LLMs locally and exposes an API through which the application sends messages and receives the LLM's responses. To access discipline content and the literature used, the RAG technique is applied through a vector database, which allows natural-language search over the content needed to augment the LLM's knowledge. This resource fills some gaps in LLMs: models trained on globally available past data lack specific knowledge of new information, of particular books, and of the organizational content of the discipline, such as evaluation dates.
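The augmentation step described above can be sketched in plain TypeScript: passages retrieved from the vector database are filtered by similarity score and spliced into the prompt before it reaches the local LLM. All names here are illustrative, not the actual ProfAI API, and the 0.7 threshold is an assumed cut-off.

```typescript
// Minimal sketch of RAG prompt augmentation (illustrative names only).
interface RetrievedChunk {
  text: string;
  score: number; // similarity score returned by the vector search
}

function buildAugmentedPrompt(question: string, chunks: RetrievedChunk[]): string {
  // Keep only the most relevant passages to limit prompt size (assumed threshold).
  const context = chunks
    .filter((c) => c.score > 0.7)
    .map((c) => `- ${c.text}`)
    .join("\n");
  return [
    "You are a co-teaching assistant. Answer using the course context below.",
    "Context:",
    context,
    `Question: ${question}`,
  ].join("\n");
}

const prompt = buildAugmentedPrompt("When is the first evaluation?", [
  { text: "First evaluation opens on 12 June.", score: 0.91 },
  { text: "The course uses Moodle forums.", score: 0.42 },
]);
console.log(prompt);
```

In the real system the retrieved passages come from Qdrant and the assembled prompt is sent to the LLM through Ollama's API; the principle, however, is the same string-level augmentation.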

In addition to this static data, the agent also has access to dynamic data, that is, data from the Moodle LMS (the virtual learning environment used). Access to this data is done via the Moodle Web Services API.
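The Moodle Web Services REST endpoint follows a standard shape (`/webservice/rest/server.php` with a `wstoken`, a `wsfunction`, and a response format). A minimal sketch of building such a request is shown below; the helper function is illustrative, not the ProfAI implementation, and the token and host are placeholders.

```typescript
// Hedged sketch of a Moodle Web Services REST call (helper name is illustrative).
function moodleWsUrl(
  baseUrl: string,
  token: string,
  wsfunction: string,
  params: Record<string, string> = {}
): string {
  const query = new URLSearchParams({
    wstoken: token,              // per-service token issued by the Moodle admin
    wsfunction,                  // e.g. "core_course_get_contents"
    moodlewsrestformat: "json",  // ask Moodle to answer in JSON
    ...params,
  });
  return `${baseUrl}/webservice/rest/server.php?${query.toString()}`;
}

// Example: list the contents of course 42 (a valid token is required in practice).
const url = moodleWsUrl("https://moodle.example.edu", "SECRET_TOKEN",
  "core_course_get_contents", { courseid: "42" });
// const data = await fetch(url).then((r) => r.json());
console.log(url);
```

The same pattern serves any enabled web-service function; which functions are available depends on the permissions granted to the token.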

For the agent to interact recurrently, a system called the ProactiveEngine was created: it has configured triggers that let it search for new information and stay up to date with what is happening in the discipline. The ProactiveEngine makes scheduled calls to fetch Moodle data via the API, so the system is always updated with what is happening in the discipline. This integration raises a challenge: handling the data correctly. Without proper filtering, API requests become excessive (returning more data than necessary) and the system's normal operation becomes overloaded. To avoid these problems, the fields requested from Moodle are selected so as to limit data consumption, and a summarization step is used. The LLM also takes on the role of summarizing and qualifying Moodle data, for example by identifying the intention of a forum post. As a result, the data stored in the database is summarized (not a replica of the LMS), which also makes it easier for the LLM to replicate a professor's action.
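The trigger mechanism can be sketched as a small polling scheduler: each trigger declares how often it should fetch Moodle data, and on every tick the engine decides which triggers are due. The class and trigger names below are assumptions for illustration, not the ProfAI codebase.

```typescript
// Sketch of a ProactiveEngine-style polling scheduler (illustrative names).
interface Trigger {
  name: string;        // e.g. "forum-posts", "assignment-deadlines"
  intervalMs: number;  // how often to poll the Moodle API for this trigger
}

class ProactiveEngine {
  private lastRun = new Map<string, number>();

  constructor(private triggers: Trigger[]) {}

  // Which triggers should fire at time `now` (epoch milliseconds)?
  due(now: number): Trigger[] {
    const ready = this.triggers.filter(
      (t) => now - (this.lastRun.get(t.name) ?? Number.NEGATIVE_INFINITY) >= t.intervalMs
    );
    ready.forEach((t) => this.lastRun.set(t.name, now));
    return ready;
  }
}

const engine = new ProactiveEngine([
  { name: "forum-posts", intervalMs: 5 * 60_000 },  // poll forums every 5 min
  { name: "deadlines", intervalMs: 60 * 60_000 },   // check deadlines hourly
]);
// At t=0 every trigger is due; five minutes later only the forum poll is.
const first = engine.due(0).map((t) => t.name);
const second = engine.due(5 * 60_000).map((t) => t.name);
console.log(first, second);
```

In a running system each due trigger would fire a Moodle API request, with the response summarized by the LLM before being stored.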

To handle GDPR properly, the system runs in a local environment: data is transmitted directly via the API between the computer running the agent (which can be a teacher's personal computer or a university machine) and the Moodle LMS. This preserves confidentiality, since data passes only between university machines or those of its staff, never through third-party systems.

For easy replication across multiple computers, and to allow each teacher to have their own "co-teaching colleague", the system was designed to run with the help of Docker and Ollama (using publicly available LLMs).

4. Architecture and Development

4.1. System Architecture

Content omitted. See full article for details.

4.2. Technologies Used

The technologies used to make RAG function were based mainly on the LangChain framework and the Ollama tool.

The LangChain framework allows easy integration of tools: extra functions to which the LLM can be given access. One of the tools integrated into the agent via LangChain is the search over the vector database. The flow works as follows: the LLM first processes an input and checks which tools it has access to; if a tool can improve its response, that tool is called, which in this case translates into a search. The input to the LLM is then augmented with the search result.
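The decide-call-augment loop just described can be sketched in plain TypeScript. The tool and function names below are illustrative stand-ins; in the real project this wiring is handled by LangChain, and the tool-selection step is performed by the LLM itself rather than the simple check used here.

```typescript
// Plain-TypeScript sketch of the tool-use loop (illustrative stand-ins).
type Tool = { name: string; run: (query: string) => string };

// A canned "vector search" tool: returns a passage only for matching queries.
const searchCourseDocs: Tool = {
  name: "search_course_docs",
  run: (q) =>
    q.toLowerCase().includes("exam")
      ? "Course handbook: the exam takes place in week 14."
      : "",
};

function answerWithTools(input: string, tools: Tool[]): string {
  // Step 1: decide whether a tool applies (stand-in for the LLM's own choice).
  for (const tool of tools) {
    const result = tool.run(input);
    if (result) {
      // Step 2: re-invoke the model with the tool result appended to the input.
      return `Based on "${result}" -> answering: ${input}`;
    }
  }
  // No tool helped: fall back to the model's own knowledge.
  return `Answering from model knowledge: ${input}`;
}

const answer = answerWithTools("When is the exam?", [searchCourseDocs]);
console.log(answer);
```

The key point the sketch captures is that the tool result never replaces the question; it is added to the model's input, so the final answer is grounded in the retrieved content.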

For local, on-premise LLM use, the Ollama tool was chosen, which allows running LLM models on a local machine. LangChain integrates easily with Ollama, which enabled rapid prototyping of the solution.

The vector database used was Qdrant. Its main function in this project is to store the static content of the discipline, such as already stipulated dates and the manuals used during the curricular year. The vector database supports natural-language search, matching the queries that the LLM and its tools produce. This search uses a similarity algorithm, which returns a list of candidate passages ranked by how closely they match the query.
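A toy version of that similarity search makes the ranking concrete: cosine similarity between a query embedding and stored document embeddings, returning the closest matches. Real embeddings come from an embedding model and have hundreds of dimensions; the 3-dimensional vectors below are stand-ins, and the function names are illustrative rather than Qdrant's API.

```typescript
// Toy cosine-similarity search mimicking what a vector database performs.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

interface Doc { text: string; embedding: number[] }

// Return the k documents whose embeddings are closest to the query embedding.
function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return [...docs]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}

const docs: Doc[] = [
  { text: "Evaluation dates", embedding: [0.9, 0.1, 0.0] },
  { text: "Forum etiquette", embedding: [0.1, 0.9, 0.0] },
  { text: "Grading criteria", embedding: [0.8, 0.2, 0.1] },
];
const best = topK([1, 0, 0], docs, 2).map((d) => d.text);
console.log(best); // the two documents most aligned with the query vector
```

The ranked list is exactly what the agent's search tool hands back to the LLM as extra context.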

[Sequence diagram - showing the detailed flow of action generation and response]

These technologies were integrated using Docker, which allowed initializing containers with the configurations necessary for the application to communicate with services such as the LLM, the vector database, and the semi-structured database (MongoDB). The entire project was brought to life in TypeScript on Node.js, which ties all of the project's components together.

4.3. Practical Implementation

Content omitted. See full article for details.

4.4. Security and Privacy

Content omitted. See full article for details.

5. Results

Content omitted. See full article for details.

6. Discussion

Content omitted. See full article for details.

7. Conclusion and Future Work

Content omitted. See full article for details.

References

Holstein, K., McLaren, B. M., & Aleven, V. (2019). Co-designing a real-time classroom orchestration tool to support teacher–AI complementarity. Journal of Learning Analytics, 6(2), 27–52. https://doi.org/10.18608/jla.2019.62.3

Prieto, L. P., Rodríguez-Triana, M. J., Martínez-Maldonado, R., Dimitriadis, Y., & Gašević, D. (2019). Orchestrating learning analytics (OrLA): Supporting inter-stakeholder communication about adoption of learning analytics at the classroom level. Australasian Journal of Educational Technology, 35(4). https://doi.org/10.14742/ajet.4314

Maier, U., & Klotz, C. (2022). Personalized feedback in digital learning environments: Classification framework and literature review. Computers and Education: Artificial Intelligence, 3, 100080. https://doi.org/10.1016/j.caeai.2022.100080

Kuhail, M. A., Alturki, N., Alramlawi, S., & Alhejori, K. (2022). Interacting with educational chatbots: A systematic review. Education and Information Technologies, 28(1), 973–1018. https://doi.org/10.1007/s10639-022-11177-3

Labadze, L., Grigolia, M., & Machaidze, L. (2023). Role of AI chatbots in education: Systematic literature review. International Journal of Educational Technology in Higher Education, 20, 56. https://doi.org/10.1186/s41239-023-00426-1

Adamopoulou, E., & Moussiades, L. (2020). Chatbots: History, technology, and applications. Machine Learning with Applications, 2, 100006. https://doi.org/10.1016/j.mlwa.2020.100006

Winkler, R., & Söllner, M. (2018). Unleashing the potential of chatbots in education: A state-of-the-art analysis. In Academy of Management Annual Meeting (AOM).

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 5998–6008. https://doi.org/10.48550/arXiv.1706.03762

Russell, S., & Norvig, P. (2021). Artificial intelligence: A modern approach (4th ed.). Pearson. ISBN 978-1292401133.

Thorat, S. A., & Jadhav, V. D. (2020). A review on implementation issues of rule-based chatbot systems. In Proceedings of the International Conference on Innovative Computing & Communications (ICICC). http://dx.doi.org/10.2139/ssrn.3567047

Shulman, L. S. (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15(2), 4–14. https://doi.org/10.3102/0013189X015002004

OpenAI. (2024). Learning to reason with LLMs. https://openai.com/index/learning-to-reason-with-llms

Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.

Thoeni, A., & Fryer, L. K. (2025). AI tutors in higher education: Comparing expectations to evidence. https://doi.org/10.31219/osf.io/24tg7_v1
