Pre(r)amble
This is part of an (aspirational) series of posts documenting some of the process of (building|cat herding an AI agent to build) an easily hosted Python teaching tool built with just front-end JS and a WASM port of MicroPython.
- You can find Part 1 here
- You can find Part 2 here
- You can find Part 3 here
- You can find Part 4 here
- This is Part 5!
- You can find Part 6 here
I have a (somewhat) up-to-date version of this tool running on the site. Things might occasionally be broken, and things might not work the same way as they did before.
There’s a changelog in the authoring interface to give you a bit of an idea of what’s new or different.
If you’ve used this before, you might need to do a hard refresh to clear out any cached JS or CSS. I should probably put some clearing of IDB storage in there too, just in case. Maybe next week.
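When I do get around to it, a blunt “clear everything” reset will probably look something like the sketch below (rough illustration only, not the app’s current behaviour; note that `indexedDB.databases()` isn’t supported in every browser, hence the feature detection):

```ts
// Rough sketch of a "clear everything" reset; not what the app does today.
// indexedDB.databases() isn't available everywhere, so feature-detect it.
async function clearLocalState(): Promise<void> {
  if (typeof indexedDB.databases === "function") {
    const dbs = await indexedDB.databases();
    for (const db of dbs) {
      if (db.name) {
        indexedDB.deleteDatabase(db.name);
      }
    }
  }
  localStorage.clear();
}
```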
What we had before
See last week’s post for some of the details and screenshots.
- Student verification codes (which are going to need some tweaks to the author side with the new problem list feature)
- Read-only file support in workspaces
- Per-test workspace files
What’s happened since
Here are some screenshots of the latest version with some of the new things from this week.
- Config list (collapsed)
- Config list (expanded)
- Optional feedback message
- Success indicator in the page header
- Success indicator in the config list
- Success indicator on the snapshot
- Admonitions in markdown
- Disabled run key
- Dependent feedback (negative case)
- Dependent feedback (positive case)
UI enhancement
I’ve been trying to make things generally better looking and nicer to use, given that I typically rock a pretty utilitarian aesthetic by default. Some of the changes (you can see several of them in the screenshots above) are:
- Shifting to non-emoji feedback indicators
- Feedback messages are now optional - the indicator and the title should do most of the job
- Run tests now visibly disabled when there are no tests to run
- Added an admonition extension to markdown rendering so authors can highlight different types of content
Problem lists
One of the things that bothered me about the initial design was the friction in getting multiple problems out to students. Each problem would have needed a new link with a URL parameter to load it. Given that some problems might not take very long to solve, this would mean lots of bouncing back to an LMS page or email. However, I still wanted single problems to remain loadable on their own.
So now we support lists of problems (in a JSON file format) that can be passed to the app just like individual problems. If the config file passed to the app ends up being a config list rather than a single problem, the user is presented with a menu to navigate through the different problem files found in the list (with its own problem set title in the page header). This way, we can still respect the “client side only” nature of the app, but also make it possible to have a learning sequence.
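As a rough sketch of the idea (illustrative names and shapes rather than the exact schema the app uses), a config list and the branching logic look something like this:

```ts
// Illustrative shapes only; the real config keys may differ.
interface ProblemConfig {
  title: string;
  description: string;            // problem text shown to the student
  files: Record<string, string>;  // workspace files
  tests: unknown[];               // test suite definition
}

interface ProblemListConfig {
  title: string;      // problem set title shown in the page header
  problems: string[]; // URLs/paths of the individual problem configs
}

// Whatever config the app is pointed at, it branches on the shape it finds.
function isProblemList(config: unknown): config is ProblemListConfig {
  return typeof config === "object"
    && config !== null
    && Array.isArray((config as ProblemListConfig).problems);
}

async function loadConfig(url: string): Promise<void> {
  const config: unknown = await (await fetch(url)).json();
  if (isProblemList(config)) {
    // build the navigation menu, use config.title as the page header
  } else {
    // load a single problem exactly as before
  }
}
```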
I have a config list containing simple test scripts that you can load here.
There is also a basic config list that loads by default.
Persistent success indicators
The problem with students potentially seeing more than a single problem is that they’re less likely to solve everything in just one session, so how do students keep track of their progress? Enter persistent success indicators in a few places:
- The page header now gets a success indicator if the student has passed the test suite for a single problem
- If a problem is part of a list, then we also badge problems with passing solutions in the drop-down menu for navigating between problems (this one was a struggle for the agent, and it broke things a few times 🫠)
- A ‘passing’ snapshot is created (or updated) each time a test suite is passed.
This gives students a few places where, if they pick up a problem set in another session, they can get a good visual indication of whether or not they have passed a problem. The snapshot indicator also tells them when they last passed the test suite.
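As a simplified sketch of the snapshot idea (the app’s real storage lives in IndexedDB, but localStorage keeps the example short, and the key and field names here are illustrative):

```ts
// Simplified sketch using localStorage; the app's storage is IndexedDB
// and the key/field names here are made up for illustration.
interface PassingSnapshot {
  code: string;     // the workspace contents that passed
  passedAt: string; // ISO timestamp of the most recent passing run
}

function recordPass(problemId: string, code: string): void {
  const snapshot: PassingSnapshot = { code, passedAt: new Date().toISOString() };
  localStorage.setItem(`passing:${problemId}`, JSON.stringify(snapshot));
}

function lastPass(problemId: string): PassingSnapshot | null {
  const raw = localStorage.getItem(`passing:${problemId}`);
  return raw ? (JSON.parse(raw) as PassingSnapshot) : null;
}
```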
Feedback rule dependencies
I love feedback flexibility, and from personal experience it’s frustrating when the ability to give feedback is limited by the tools you have to provide it. One of the features I added to testing that made me pretty happy was conditional runs for tests and grouping.
This time I added some flexibility to when feedback rules trigger:
- Each feedback item can be dependent on a list of other feedback items to trigger
- Feedback dependency can be either “must match” or “must not match”
- Combining this with the default hidden behaviour means you can surface “just in time” feedback and build some pretty flexible feedback types
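The dependency check itself is pretty simple. A rough sketch of the idea (illustrative rule shape, not the app’s actual feedback config):

```ts
// Illustrative rule shape; the app's actual feedback config may differ.
interface FeedbackRule {
  id: string;
  matched: boolean;        // did this rule's own condition match?
  mustMatch?: string[];    // ids of rules that must also have matched
  mustNotMatch?: string[]; // ids of rules that must NOT have matched
}

// A rule only fires when its own condition matched and its dependencies hold.
function shouldTrigger(rule: FeedbackRule, all: Map<string, FeedbackRule>): boolean {
  if (!rule.matched) return false;
  const matched = (id: string) => all.get(id)?.matched === true;
  return (rule.mustMatch ?? []).every(matched)
    && (rule.mustNotMatch ?? []).every((id) => !matched(id));
}
```

The “must not match” direction is what lets you hold back a general hint when a more specific rule has already fired, which is the sort of “just in time” behaviour mentioned above.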
(A)Introspection
This week has been a bit of messing around trying to spec out problems in a way that makes it clear what I don’t want to get broken, plus a whole lot of regression testing. Much of that testing is manual, because I need to get around to putting together some decent e2e tests, and because there are a lot of round trips into browser storage, so more isolated tests aren’t always a great indication of whether something has worked (or rather, hasn’t broken anything).
One of the (funny|frustrating) things about agent instruction is the utter lack of self-awareness. There have been a couple of sessions where I’ve been working on a feature and it doesn’t work out the way I intended, or, after using the initial version a few times, I change my mind about how it should work. The agent then turns around and tries to implement legacy migration features for something that hasn’t even been committed yet, let alone pushed. The minor funny version of this is that it’ll write some code, I’ll test it, authorise the next step, and it’ll tell me all about the changes it sees I’ve made, despite having just written that code itself.