An introduction to automated Virtual Teaching Assistants and practical guidelines for building one

Virtual assistants have become increasingly popular over the past decade and are now an integral part of our lives. From Apple’s Siri to Amazon’s Alexa and Google Assistant, the most popular digital assistants are generally voice-activated and can be employed for a variety of purposes, such as managing our calendars, starting playlists, and conducting online searches. Meanwhile, chat-based virtual assistants (also known as chatbots), which can be rule-based, AI-driven, or a combination of the two, have also proven useful in a wide range of fields, such as customer support, travel planning, and real estate.

It would be remiss to discuss virtual assistants without mentioning ChatGPT, a popular chat-based Large Language Model (LLM) from OpenAI, whose applicability in the educational domain was also discussed in a previous post, or the more recent GPT-4, which is already being used by Duolingo to provide feedback to learners and enable them to practice conversation skills. However, our focus in this post will be on custom-built Virtual Teaching Assistants (VTAs), which can be a valuable asset at all levels, from primary education to Massive Open Online Courses (MOOCs) and traditional university settings.

VTAs are computer programs that can support both learners and educators across a broad range of settings by automating repetitive tasks, providing feedback on assignments, and facilitating communication among students. Arguably, their main advantage is that by automatically responding to students' questions and requests, they reduce the workload of teachers, who can then focus on the critical requests that genuinely require human intervention, and they save students time by shortening the wait for answers.

In some cases, VTAs can also help improve student engagement and motivation. By providing immediate feedback, proactively messaging students about their tasks, and facilitating communication among students, they can reduce the feeling of loneliness that sometimes causes dropouts in MOOCs.


Previous experiments with VTAs

Arguably, the first extensive experiment conducted to study the effects of VTAs in an educational setting was performed by Goel et al. in 2016. This study led to the development and deployment of Jill Watson, a VTA that Georgia Tech employed in some of its online courses to assist teaching assistants in supporting students. As reported in the paper that introduced Jill Watson (JW), JW was able to autonomously respond “to student introductions, post weekly announcements, and answer routine, frequently asked questions”. The paper noted that when it was published, more than 750 students had interacted with different versions of Jill Watson, and the authors deemed the experiment a great success. It is worthwhile to take a step back and understand why Jill Watson was developed and how it benefited the students.

Georgia Tech had recently launched an online course divided into two sections. The first section was accredited and available only to 45 students, who received support from three teaching assistants. The second section was open and non-accredited; more than forty thousand students enrolled, but they did not receive any support from teaching assistants. Given the numbers involved, the university could not afford to provide human TAs to support everyone, which in turn limited how many students could be accepted into the accredited section. To address this issue, Georgia Tech developed Jill Watson, which was deployed as a VTA in an online class on Knowledge-Based Artificial Intelligence. Interestingly, the students were not informed that JW was a bot, and they interacted with her as if she were a human TA. Although JW was unable to handle some of the students' requests, she could still manage most of the questions, reducing the work required of the human TAs, who could then focus on the questions that required human-to-human interaction. Indeed, and this was one of the most interesting points made in the article, most of the students' requests were similar. This was evident in the forum, where all the students could see all the threads, and even more so in emails and private messages.

Was Jill Watson a successful experiment? Absolutely. Although she couldn't handle every request, she was able to respond to a significant number of them, allowing the same number of teaching assistants to handle a much larger number of students. The students were not told that Jill Watson was a Virtual Teaching Assistant until the end of the course, and when they were, their reaction in a final survey was overwhelmingly positive. Interestingly, most of the students hadn't even realized that Jill Watson was a bot! An interesting TED talk by Prof. Goel walks through the experiment, and it is definitely worth a look.

Similar experiments, generally at smaller scales, were also carried out at other universities, for both on-site and online courses, and the findings are broadly consistent across studies. Examples include Rexy, developed for a recommender systems course at Politecnico di Milano in 2018, and Kwame, a VTA for an online coding course, among others.


Why existing Large Language Models are not always the best solution

In the introduction, we mentioned that ChatGPT and similar large language models (LLMs) could be used to support students alongside other virtual teaching assistants (VTAs). However, relying solely on LLMs might not be the best option, for several reasons. First, although LLMs are highly capable and knowledgeable, they lack specific information about the course in which they would be deployed. This can cause significant issues in an educational setting, such as confusion over the course schedule or which topics are covered. A custom-built VTA may therefore be a better choice for course-related questions, while third-party LLMs can still be used for general queries. Second, LLMs can provide confidently incorrect answers, a phenomenon known as "hallucination" (as demonstrated in this great video about Bing Chat behaving badly). While LLMs are continuously improving, this risk is unlikely to disappear completely. In contrast, custom-built VTAs are simpler and less prone to hallucination, especially if rule-based.


Practical considerations when developing a V(T)A

When developing a custom-built VTA (or any virtual assistant), the first step is choosing the platform to use. It is possible to train end-to-end neural networks to implement the VTA, but doing so might increase the risk of the model hallucinating or answering with meaningless messages. There are several services for building virtual assistants, and we mention here only some of the most popular (for a more extensive list, please have a look here and here):

  • One of the first and most popular platforms is IBM Watson Assistant, used for building both Jill Watson and Rexy; until a few years ago it would have been the go-to solution for most use cases, but nowadays there is more competition and more choice for users.
  • The three players dominating the cloud service provider market – Amazon Web Services (AWS), Microsoft Azure, and Google Cloud – also offer services for building VAs: Amazon Lex, Google Cloud Dialogflow, Azure Bot Service, and Microsoft Power Virtual Agents. The choice among these mostly depends on whether the user is already a customer of, and familiar with, a given cloud provider.
  • Other solutions are offered by smaller companies whose products have nonetheless received very good reviews in the market.

It is difficult – if not impossible – to recommend only one over the others, as the best solution in terms of pricing and capabilities depends on the specific needs of each user. In any case, most providers offer some interesting free trials that can be used to understand if a specific service satisfies your needs.

After choosing the platform to manage the intelligence of the virtual agent, it is necessary to choose how it will be deployed. In many cases the developed VTA can be “plugged into” other platforms, depending on the user’s needs. Common choices are Slack and Messenger, which can be very convenient as they are sometimes already used in educational settings. Some platforms also offer stand-alone solutions that do not rely on external services, which might be a better fit if something like Slack is not already being used for the course.

After choosing the service for implementing the intelligent assistant, and possibly the service for deploying it, it is time to actually build the agent. The details vary between providers, but from a high-level perspective the steps are always the same:

  • Provide some initial rules and template responses for the assistant.
  • If needed, provide specific data required by the assistant (e.g., a table with the lecture schedule or the office hours of the teaching staff).
  • Provide some example interactions to train the model on the course; in practice, this means creating simulated interactions that show the assistant how it should behave in those scenarios, so that it can generalize to new interactions with users.
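To make these steps concrete, here is a minimal, self-contained sketch of a rule-based assistant in plain Python. All intents, templates, and course data below are hypothetical, and a real deployment would rely on one of the platforms mentioned above rather than hand-written keyword matching:

```python
# Minimal rule-based VTA sketch covering the three steps above:
# template responses, course-specific data, and example interactions.
# All names and data are hypothetical placeholders.

# Step 2: course-specific data the assistant needs.
COURSE_DATA = {
    "schedule": "Lectures are on Mondays and Wednesdays, 10:00-12:00.",
    "office_hours": "Office hours are on Fridays, 14:00-15:00.",
}

# Step 1: rules mapping each intent to a template response.
TEMPLATES = {
    "greeting": "Hello! I'm the course assistant. How can I help?",
    "schedule": COURSE_DATA["schedule"],
    "office_hours": COURSE_DATA["office_hours"],
    "fallback": "Sorry, I don't know that yet. A human TA will follow up.",
}

# Step 3: example interactions, used here as keyword patterns per intent.
EXAMPLES = {
    "greeting": ["hi", "hello", "hey"],
    "schedule": ["schedule", "lecture", "when is class"],
    "office_hours": ["office hours", "meet the ta"],
}

def classify(message: str) -> str:
    """Return the intent whose example phrases best match the message."""
    text = message.lower()
    best_intent, best_score = "fallback", 0
    for intent, phrases in EXAMPLES.items():
        score = sum(1 for phrase in phrases if phrase in text)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent

def respond(message: str) -> str:
    """Answer a student message with the template of the matched intent."""
    return TEMPLATES[classify(message)]

print(respond("Hello there!"))
print(respond("When is the next lecture?"))
```

A real platform replaces the keyword matching with a trained intent classifier, but the division of labor (templates, data, example interactions) is the same.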

After these three steps, it is possible to start testing the VA to make sure that its accuracy is acceptable for the specific purpose for which it was built. In practice, this means proceeding iteratively with new interactions: for each one, check how the agent responds, and add the interaction to the training data (correcting the agent's response if it was wrong). This should continue until an acceptable accuracy is reached. Interestingly, in this phase it might be possible to use ChatGPT or other LLMs to simulate students interacting with the VTA!
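This iterative loop can be sketched as follows; the `agent_respond` function and the test cases are hypothetical stand-ins for whichever platform and course you are working with:

```python
# Sketch of an iterative evaluation loop for a VTA.
# `agent_respond` is a hypothetical placeholder for the deployed agent;
# the test messages could also be generated by an LLM simulating students.

def agent_respond(message: str) -> str:
    # Placeholder agent: recognizes only one intent for illustration.
    return "schedule" if "lecture" in message.lower() else "fallback"

# Each test case pairs a simulated student message with the intent
# the agent is expected to recognize.
test_cases = [
    ("When is the next lecture?", "schedule"),
    ("Where can I find the syllabus?", "syllabus"),
]

new_training_data = []
correct = 0
for message, expected in test_cases:
    answer = agent_respond(message)
    if answer == expected:
        correct += 1
    else:
        # Wrong answer: record the corrected interaction so it can be
        # added to the training data for the next iteration.
        new_training_data.append((message, expected))

accuracy = correct / len(test_cases)
print(f"accuracy: {accuracy:.0%}, new training examples: {len(new_training_data)}")
```

The loop is then repeated: retrain (or reconfigure) the agent with the corrected interactions, re-run the test cases, and stop once the accuracy is acceptable.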



In conclusion, VTAs are a technology that is becoming increasingly important in the way we approach education. With the help of natural language processing and machine learning, virtual teaching assistants can provide personalized and efficient support to both students and teachers. As we have seen in this blog post, there are already some excellent examples of virtual teaching assistants in action. These tools have proven useful in a variety of contexts, from K-12 education to higher education and online learning.

If you are interested in implementing virtual teaching assistants in your classroom or institution, there are several platforms and services that can help. Companies like IBM, Microsoft, AWS, and Google offer platforms and tools for creating virtual teaching assistants. Let us know if you have any experience building or using VTAs, and what your experience has been!