Automated Data-Driven Hint Generation for Learning Programming

thesis
posted on 2017-07-01, 00:00 authored by Kelly Rivers

Feedback is an essential component of the learning process, but in fields like computer science, which have rapidly increasing class sizes, it can be difficult to provide feedback to students at scale. Intelligent tutoring systems can provide personalized feedback to students automatically, but they can take large amounts of time and expert knowledge to build, especially when determining how to give students hints. Data-driven approaches can be used to provide personalized next-step hints automatically and at scale, by mining previous students’ solutions. I have created ITAP, the Intelligent Teaching Assistant for Programming, which automatically generates next-step hints for students in basic Python programming assignments. ITAP is composed of three stages: canonicalization, where a student's code is transformed to an abstracted representation; path construction, where the closest correct state is identified and a series of edits to that goal state are generated; and reification, where the edits are transformed back into the student's original context. With these techniques, ITAP can generate next-step hints for 100% of student submissions, and can even chain these hints together to generate a worked example.

Initial analysis showed that hints could be used in practice problems in a real classroom environment, but also demonstrated that students' relationships with hints and help-seeking were complex and required deeper investigation. In my thesis work, I surveyed and interviewed students about their experience with help-seeking and using feedback, and found that students wanted more detail in hints than was initially provided. To determine how hints should be structured, I ran a usability study with programmers at varying levels of knowledge, where I found that more novice students needed much higher levels of content and detail in hints than was traditionally given. I also found that examples were commonly used in the learning process, and could serve an integral role in the feedback provision process. I then ran a randomized controlled trial to determine the effect of next-step hints on learning and time-on-task in a practice session, and found that having hints available resulted in students spending 13.7% less time during practice while achieving the same learning results as the control group. Finally, I used the data collected during these experiments to measure ITAP’s performance over time, and found that generated hints improved as data was added to the system.

My dissertation has contributed to the fields of computer science education, learning science, human-computer interaction, and data-driven tutoring. In computer science education, I have created ITAP, which can serve as a practice resource for future programming students during learning. In the learning sciences, I have replicated the expertise reversal effect by finding that more expert programmers want less detail in hints than novice programmers; this finding is important as it implies that programming teachers may provide novices with less assistance than they need. I have contributed to the literature on human-computer interaction by identifying multiple possible representations of hint messages, and analyzing how users react to and learn from these different formats during program debugging. Finally, I have contributed to the new field of data-driven tutoring by establishing that it is possible to always provide students with next-step hints, even without a starting dataset beyond the instructor’s solution, and by demonstrating that those hints can be improved automatically over time.
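
As a rough illustration of the three stages named in the abstract (canonicalization, path construction, reification), the following Python sketch shows one way such a pipeline could be put together. It is not ITAP's implementation: the function names (`canonicalize`, `reify`, `next_step_hint`), the variable-renaming normalization, and the use of line-level `difflib` diffs as the edit representation are all assumptions chosen to keep the example short. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast
import builtins
import difflib
import re


class _Renamer(ast.NodeTransformer):
    """Rename variables and arguments to v0, v1, ... in order of first use."""

    def __init__(self):
        self.mapping = {}  # original name -> canonical name

    def _canon(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        node.arg = self._canon(node.arg)
        return node

    def visit_Name(self, node):
        # Leave builtins such as sum or len untouched (toy heuristic).
        if not hasattr(builtins, node.id):
            node.id = self._canon(node.id)
        return node


def canonicalize(source):
    """Stage 1 (assumed form): abstract away the student's naming choices."""
    renamer = _Renamer()
    tree = renamer.visit(ast.parse(source))
    return ast.unparse(tree).splitlines(), renamer.mapping


def reify(line, inverse):
    """Stage 3 (assumed form): put the student's own names back into a line."""
    return re.sub(r"\bv\d+\b", lambda m: inverse.get(m.group(), m.group()), line)


def next_step_hint(student_source, correct_sources):
    """Return (edit_kind, old_lines, new_lines) for the first edit, or None."""
    student_lines, mapping = canonicalize(student_source)

    def similarity(source):
        goal_lines, _ = canonicalize(source)
        return difflib.SequenceMatcher(None, student_lines, goal_lines).ratio()

    # Stage 2 (assumed form): pick the closest correct state, then diff to it.
    goal_lines, _ = canonicalize(max(correct_sources, key=similarity))
    inverse = {canon: original for original, canon in mapping.items()}
    matcher = difflib.SequenceMatcher(None, student_lines, goal_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            old = [reify(line, inverse) for line in student_lines[i1:i2]]
            new = [reify(line, inverse) for line in goal_lines[j1:j2]]
            return tag, old, new
    return None  # already matches a correct state


if __name__ == "__main__":
    student = "def add_three(a, b, c):\n    total = a + b\n    return total\n"
    solutions = [
        "def add_three(x, y, z):\n    result = x + y + z\n    return result\n",
        "def add_three(x, y, z):\n    return sum([x, y, z])\n",
    ]
    print(next_step_hint(student, solutions))
```

In this sketch the hint is phrased in the student's own variable names even though the comparison happens on the canonical form, which is the point of the reification stage. A real system would work on syntax trees instead of text lines and would check candidate states against test cases; both are omitted here.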

History

Date

2017-07-01

Degree Type

  • Dissertation

Department

  • Human-Computer Interaction Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Kenneth R. Koedinger
