Why2000: A Dialogue-Based Explanation Tutoring System

This project developed Why2-Atlas and Why2-AutoTutor, two intelligent tutoring systems that helped students learn conceptual physics. Both tutors gave students a qualitative physics question, such as "Suppose a massive truck collides head-on with a lightweight car. Which vehicle suffers the greater impact force? Explain your answer." Students would type in a paragraph-long explanation, and the tutors would then carry out a dialogue in natural language intended to get students to realize what mistakes they had made in their explanations and to fix them. The Atlas system was developed by VanLehn's group at the University of Pittsburgh, building on earlier work supported by an NSF center, Circle. The AutoTutor system was developed by Graesser's group at the University of Memphis, building on earlier work supported by several sources. Roughly speaking, AutoTutor used mostly statistical natural language processing and simple dialogue management, whereas Atlas used mostly symbolic natural language processing and more complex dialogue management. For the Why2000 project, both systems were adapted to coach students on exactly the same problems.

The two tutoring systems were compared to human tutors and several control treatments. The results, which are reported in a 2007 Cognitive Science article, were quite surprising. On the one hand, Why2-Atlas and Why2-AutoTutor were just as effective as human tutoring! On the other hand, so was a simple step-based tutoring system that did not engage the student in dialogue, but merely had students read text that targeted the flaws in their explanations. This pattern of results held up across seven experiments that varied the measures, subjects, and content of the instruction. The same pattern of results has since been observed in several other studies of human tutoring and computer tutoring (VanLehn, 2011). Overall, these results suggest that a well-engineered step-based tutoring system can be just as effective as the gold standard of instruction, expert human tutoring.

In addition to these empirical results, the project produced a great deal of natural language processing software, tutoring software and experimental materials (cognitive task analyses, training materials, assessments, etc.), much of which is still in use. See the AutoTutor project and the Tutoring Scientific Explanations via Natural Language Dialogue project.



Kurt VanLehn was PI. The University of Pittsburgh group was led by Dr. Pamela Jordan. The group members included:

The University of Memphis group was led by Dr. Art Graesser. It was organized into four subgroups, each led by one or more faculty members:

Graduate students in computer science and psychology were assigned to each of the four groups.


Last update: April 5, 2013.