Another 61511 Done

Four years ago, I made a push to change how we teach software engineering at the MSc level. I had ambitious plans about how to change the whole sequence, but I was going to start by taking over the first class in the sequence.

The first year was super tough as the person who had been teaching it took medical retirement (sadly). My early ideas just weren’t workable given me and the cohort.

I completely revamped it, esp the coursework, and have edged closer and closer to something with some genuine innovation. It needs another overhaul, but this year went pretty smoothly (still some coursework marking to go).

I won’t say it’s best of breed because I don’t have a lot of comparisons. But it seems good and rather interesting. One class out of four isn’t enough to be transformative, but it’s a start!

Plus I taught the whole day with a unicorn horn on my head. Good times!


Grading Postmortem

I just finished the followup to grading a programming/software engineering assignment with a mostly automated toolkit. The goal was to have “next class” turnaround. It wasn’t quite same day, but it was definitely within 20 hours for the bulk of people. Some of the problem was that Blackboard is terrible. Just terrible. It refused to upload marks and feedback for people who had had multiple submissions and then sent me into a hellscape trying to enter them manually. So there were some upload errors (2 people didn’t have any feedback and 1 had the wrong feedback due to a cut-and-paste fail). Out of 49 submissions, I had 17 people report a problem or request a regrade. Of those, 5 resulted in a change of mark, for a total of 19 marks added. (Each assignment was out of 10, so 490 marks total; 170 were originally awarded and 19 were added on review, i.e., about 10% of the “rightful” 189 marks went missing and needed a manual update. One of those regrades, worth 3 marks, was due to a rogue extra submission after the deadline that was the wrong one to grade, so roughly 8.5% of rightful marks went missing due to actual grader bugs.)

Now, the people with wrong marks generally got a “0”, often when it was obvious that they shouldn’t have. This was because their program would either crash in a new way or return a really unexpected result. In the latter case, since we try to parse the program output, we’d throw an unexpected exception on that odd output. In both scenarios, the grader would crash before it wrote any feedback. Missing feedback was inferred to be an “upload” problem, so those students got a 0 and an unhelpful error message.
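In retrospect, the hardening needed is the standard kind: wrap each submission in a catch-all so that any surprise becomes reviewable feedback rather than a crash. A minimal sketch of what I mean, where grade_submission and write_feedback are hypothetical stand-ins for the real toolkit:

```python
import traceback

def grade_submission(sub):
    # Hypothetical stand-in for the real marking logic; like ours, it can
    # raise on a crash or on surprisingly shaped program output.
    raise ValueError(f"unexpected output from {sub}")

def write_feedback(sub, mark, feedback):
    # Hypothetical stand-in for the feedback writer/uploader.
    print(sub, mark, feedback.splitlines()[0])

def grade_all(submissions):
    for sub in submissions:
        try:
            mark, feedback = grade_submission(sub)
        except Exception:
            # One surprising program must not kill the whole run or decay
            # into a silent 0: flag it for manual review instead.
            mark = None
            feedback = "Grader error, needs manual review:\n" + traceback.format_exc()
        write_feedback(sub, mark, feedback)

grade_all(["alice.py", "bob.py"])
```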

These were stupidly hard bugs to track down! But they point to a couple of holes in our robustness and test isolation approach (we’re generally pretty good on that). In general, I’d like to review the 0s before uploading to confirm them, but the tight time frame was just too much. It was a tradeoff between the real anxiety, pain, and confusion some students would feel at getting an erroneous 0, and delaying feedback for everyone. It’d have been great if I could have turned the corrections around more quickly, but I have only so much time and energy. All students who filed an issue got a resolution by the subsequent Monday evening at the latest; that’s two full days with correct feedback before the next assignment. Obviously, quicker is always better, but this isn’t unreasonable.

At least two people were misled by the feedback, which basically said “You are missing this file” when it should have said “You are missing at least one of this file or that directory.” Oops! That was more work for me than anything else.

In the same-day lab, the students did an over-the-shoulder code review of each other’s first assignment. I wish I had gathered stats on problems found. I told everyone who wanted to file an issue to send me an email only after their code review had discovered no problems and they had some simple test cases passing. In many of those cases, there were very obvious problems that a simple sanity test would have revealed, and oddities in the code which leapt out (to me).

I feel this justifies my decision not to return granular feedback or explicit tests. The program is very small and they have an oracle to test against (they are reverse engineering a small unix utility). The points awarded are few: 2 come from basically not messing up the submission format, and 1 comes from following the spec requirement to use tabs as the output separator.

But the goal of these assignments is to get people thinking about software engineering, not programming per se. They need to reflect on their testing and release process and try to improve them. I had several students ask for detailed feedback so they would lose fewer marks on the next assignment and that’s precisely what I don’t want to do. The learning I’m trying to invoke isn’t “getting this program to pass my tests” but “becoming better at software engineering esp testing and problem solving and spec reading and…”.

It’s difficult, of course, for students to care about the real goals instead of the proxy rewards. That’s just being a person! All I can do is try to set up the proxy rewards and the rest of my teaching so as to promote the real goal as much as possible.

Giving students low marks drives a lot of anxiety and upset on my part. I hate it. I hate it because of their obvious suffering. I hate it because it can provoke angry reactions against me. I hate it because I love seeing people succeed.

But it seems necessary to achieve real learning. At least, I don’t see other ways that are as broadly effective.

Some Good Reads

I really am trying to clean out my tabs, and writing something in depth on each isn’t always cutting it. So here are some quick hits (the link text generally isn’t the title):

  • Burying NoSQL for consistency failures. Essentially the argument is that giving up consistency for availability (cf. the CAP theorem) is a bad move due to increased application complexity AND that many “NewSQL” systems aren’t consistent for a subtle implementation reason.
  • A beautiful performance study of grep tools by the author of ripgrep. Clear, fairly comprehensive, appropriately modest, it seems totally publishable to me. I learned a lot reading it and enjoyed doing so.
  • “Systems programming” != “low level programming” has a nice history of the term and concept. It’d be good to get an analysis of how the phrase “programming in the large” got in.
  • You should read all of Dan Luu, but you could do worse than starting with his “Hardware is Unforgiving”.
Four down, hundreds to go.

GitLab Has Kanban/Trello-Style Boards

And they are linked to the issue tracker! Nice!

They aren’t as nice as Trello’s. The cards are very limited and don’t “flip over”. They don’t provide full access to the issue tracker, so adding comments, even adding full-fledged issues, is hard to impossible from the board. However, I think for managing a workflow, it’s fine. A little clunky, but fine.

So now I can teach them in my software engineering class…which means I need to add them to my material…yay?

It’s panic time around here! Classes are….sooooo close!

CSS Misalignment

Well, here we are in day n of trying to add a simple logo to a showoff presentation. Showoff has a very neat feature set (esp for audience interaction) but is pretty garbage to dork with. I mean, most HTML slideshow systems are, but showoff is screwing me pretty hard.

My current solution is to add the image to the content Markdown. That at least gets me somewhere, even if I have to preprocess the Markdown. BUT, the image is aligned centre and I need it on the left. Now, usually, getting things to align centre is the challenge in CSS, with garbage like “margin-left:auto; margin-right:auto” being the actual canonical move (unless you are dealing with text or floating an image). Of course, I can’t use something like “align: left”, so I’m off in some rathole of position, float, margin nonsense.
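The preprocessing half, at least, is easy. A minimal sketch, with hypothetical slides.md and logo.png names, using an inline float as one candidate escape from the rathole:

```python
from pathlib import Path

# Prepend a left-floated logo to the slide Markdown. The inline style is
# there to override the centred default; file names are illustrative.
LOGO = '<img src="logo.png" alt="logo" style="float: left; margin: 0;"/>\n\n'

src = Path("slides.md")
text = src.read_text()
if LOGO.strip() not in text:  # don't stack logos on reruns
    src.write_text(LOGO + text)
```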

CSS has been around in some form since 1994. 1994. They’ve been deprecating the HTML presentation stuff for quite some time now. But the replacement is just plain worse.

Now I’m sure if I spent enough time really learning all this shit, I would have some reasonable control. But…I don’t want to have to learn all this shit just to do some basic layout, and I shouldn’t have to. LaTeX I sort of forgive just for its sheer age, but 1994! With supposed active development since!

It makes me want to dork with transparent 1-pixel GIFs.

Thinking About Bug Day

Grace Hopper discovered the “first” computer bug…a literal moth shorting out some relays. (I scare quote “first” because, like many “firsts”, it’s complicated; I’m more than happy to credit Hopper, though, esp as it makes such a fun story and it involves an actual, biological, bug.) Last year was the 70th anniversary of her discovery, and the folks at BugSnag put together a nice, if short, series of “worst bugs in history” with a focus on older ones with big property/life effects. It lists some classics which probably had an outsized effect in the literature (e.g., Ariane 5). Unit conversion issues figure prominently and are still poorly handled.
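The cheapest defence I know of is tagging quantities with their units in the type system instead of passing bare floats around. A toy sketch (names illustrative; the conversion factor is the standard 1 lbf ≈ 4.448 N):

```python
from typing import NamedTuple

class Newtons(NamedTuple):
    value: float

class PoundsForce(NamedTuple):
    value: float

def to_newtons(f: PoundsForce) -> Newtons:
    # 1 pound-force is about 4.448 newtons.
    return Newtons(f.value * 4.448)

def impulse(thrust: Newtons, seconds: float) -> float:
    # Newton-seconds by construction: a PoundsForce argument is now a type
    # error for a checker, and visibly wrong to a reader, rather than a
    # silently wrong number.
    return thrust.value * seconds

print(impulse(to_newtons(PoundsForce(100.0)), 2.0))
```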

Bug day is Sept 9th, which is inconvenient for my class. It’d be nice to have some sort of more widespread…celebration? Event? Reflection?

Two Text UI Todo Managers

I’ve been studying Text UI (TUI) frameworks in Python for a while now. In my class, we use the built-in argparse module to manage command line argument handling. I fantasise about pushing up to REPLs and then widget-based, full screen console apps. Ideally, the TUI widget framework would have GUI and Web based backends, but alas none do. Also, I have a couple of grade and exam management tools with argument handling to which I’d like to add a better front end. A TUI is appealing because it’s lightweight, portable, and easier to security audit.
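Part of the appeal is how little code a widget-based console app needs. This is more or less the stock urwid hello-world pattern:

```python
import urwid

def exit_on_q(key):
    # Any unhandled q press tears the main loop down.
    if key in ("q", "Q"):
        raise urwid.ExitMainLoop()

# One text widget, padded to fill the screen: the whole "app".
txt = urwid.Text("Hello from a full screen console app (press q to quit)")
fill = urwid.Filler(txt, valign="top")
urwid.MainLoop(fill, unhandled_input=exit_on_q).run()
```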

In this investigation, I stumbled across a couple of TUI task and todo managers. There are, of course, dozens, based on editors, file formats, etc. These two got juxtaposed by accident: one was in an open tab from Hacker News, and the other is built on one of the frameworks I’m playing with. The first is a node.js app called Taskbook, which has a kind of Trello mentality. The other is a Python/urwid based app called todotxt-machine, designed around the todo.txt file format.

(I haven’t fully investigated but it certainly seems that todo.txt is sufficient to capture Taskbook’s data model.)
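For instance, with the usual todo.txt conventions (a leading x plus dates for completion, (A)-style priorities, +project and @context tags), a Taskbook-ish board and its items might come out as something like this (contents invented for illustration):

```
(A) 2018-09-03 Finish the assignment grader +coursework @urgent
2018-09-03 Review TUI frameworks +coursework
x 2018-09-04 2018-09-03 Read the ripgrep benchmark post +reading
```

Here +project plays the role of a board and priorities stand in for stars.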

So, with minor differences in list layout, they have very similar functionality but very different interaction designs. Taskbook is entirely a shell app: all interaction is via arguments from the shell. This is great for integration with scripts, and thus larger workflows, but can get tedious when manually manipulating lots of items. Its primary mode is batch.

Contrariwise, todotxt-machine is a full screen, interactive console app with a bit of menuing, mouse support, and so on. Its virtues and vices are the reverse of Taskbook’s.

Implementation-wise, though I don’t have actual stats or anything, it’s much easier to add the command line processing than the console UI. Adding the Taskbook argument handling should be a doddle (given feature parity).
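To make that concrete, here’s roughly what a Taskbook-ish command line comes to in argparse; the flag names are simplified guesses at feature parity, not Taskbook’s actual interface:

```python
import argparse

parser = argparse.ArgumentParser(prog="tb", description="Toy Taskbook-style manager")
group = parser.add_mutually_exclusive_group()
group.add_argument("--task", nargs="+", metavar="WORD", help="create a task")
group.add_argument("--note", nargs="+", metavar="WORD", help="create a note")
group.add_argument("--check", type=int, metavar="ID", help="toggle a task as done")
group.add_argument("--delete", type=int, metavar="ID", help="delete an item")

args = parser.parse_args()
if args.task:
    print("would create task:", " ".join(args.task))
```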

In a similar symmetry, Taskbook’s JSON is more ready out of the box for programmatic processing, whereas todo.txt is much easier for people to manipulate by hand.