Demoing YAGNI with FizzBuzz

I use FizzBuzz as a warm up lab and as a key lecture. Tom Dalling has an excellent blog post where they iterate the implementation of FizzBuzz into a fully packaged, fully parameterized with sensible defaults, fully documented module. Which is horrific…completely larded with unnecessary complexity. And the punchline is “The Aristocrats”…er “You Ain’t Gonna Need It”.

I’m wondering if this would work as a lecture or even if it makes sense. Each step is supposed to be inevitable or at least sensible but there’s something made up about it. OTOH, it might go too far toward suspicion of abstraction.

It could be fun if the build was right. Perhaps it would be great if it also made clear when those moves made sense. Right now, it’s only negative. But a lot of those moves are good ones in some contexts.
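To make the contrast concrete, here is a minimal sketch of the two ends of that build. It is my own illustration, not code from Dalling’s post; the function names, the `rules` parameter, and the defaults are all made up for the example.

```python
def fizzbuzz(n):
    """Plain version: does what was asked and nothing else."""
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out


def fizzbuzz_configurable(start=1, stop=100,
                          rules=((3, "Fizz"), (5, "Buzz")),
                          fallback=str):
    """Parameterized version (hypothetical): every knob has a sensible
    default, and nobody has asked for any of them yet."""
    out = []
    for i in range(start, stop + 1):
        # Concatenate the labels of every rule whose divisor divides i.
        word = "".join(label for divisor, label in rules if i % divisor == 0)
        out.append(word or fallback(i))
    return out


if __name__ == "__main__":
    print(fizzbuzz(15))
    print(fizzbuzz_configurable(stop=15))
```

With the defaults, both give the same answer. The second one earns its keep only once a second rule set (or range, or output format) actually turns up, which is roughly the “when do these moves make sense” point.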


The Viz Fallacy

Taxonomies are fun and usually tree-shaped. Trees are “easy” to visualize. When people have a taxonomy they often like to visualize it.

But the results aren’t always so nice. Given that this is a taxonomy of fallacies, I found their misuse of a visualization highly amusing. It is also reminiscent of one of my thesis topics (non logical reductios).

mc et al called this the pathetic fallacy (of RDF).

The Web is (Holy Crap) Insecure

Blame Javascript. And browsers. And the W3C.

77% of 433,000 Sites Use Vulnerable JavaScript Libraries. I’ll bet that various random webpages I put up are vulnerable. Who updates random pages?! Not me!

Of course, my random, unvisited experimental pages are probably not a big problem, but still. Yeek.

Oh my god, SVG is a mess (read the paper). Short answer: It’s almost impossible to use SVG safely. And almost no one is trying.

Something’s going to break in a big way some day.

Continuous Blood Pressure Monitoring

I watch my BP both because of family history and because it has tended to be a bit on the high side. I use a Withings (now Nokia) wireless cuff and it’s ok. Unlike any other system I’ve used, it fails to take a measurement maybe 20-30% of the time.

This paper describes a very cool method for doing truly continuous monitoring. Cuffs can’t do that because you have to go through a compressing cycle which is uncomfortable, takes time, and needs a recovery period. The pulse transit delay approach takes two measurements: one at the heart using an ECG and one at the wrist (measuring wrist pulse). The time between these two events correlates with arterial wall stiffness and thus with BP. You still need to wear an ECG but that’s gotten a lot easier. It’s not something you’d want to do all the time, but it’s clearly much easier than current best practice.
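The timing part can be sketched in a few lines. This is only my reading of the idea, not the paper’s algorithm: it assumes you already have beat timestamps (in seconds) from the ECG and from the wrist sensor, and the function name and the `max_delay` cutoff are made up for the example. It deliberately stops at the delay, since turning that into a BP number needs a calibration I don’t have.

```python
def pulse_transit_times(r_peak_times, wrist_pulse_times, max_delay=0.5):
    """Pair each ECG beat with the first wrist pulse that follows it.

    Both inputs are sorted lists of timestamps in seconds.
    max_delay is an assumed sanity cutoff: a wrist pulse arriving later
    than this is treated as belonging to some other beat.
    Returns one delay (in seconds) per successfully paired beat.
    """
    delays = []
    j = 0
    for r in r_peak_times:
        # Skip wrist pulses that happened at or before this heartbeat.
        while j < len(wrist_pulse_times) and wrist_pulse_times[j] <= r:
            j += 1
        if j < len(wrist_pulse_times):
            delay = wrist_pulse_times[j] - r
            if delay <= max_delay:
                delays.append(delay)
    return delays


if __name__ == "__main__":
    ecg = [0.0, 1.0, 2.0]
    wrist = [0.25, 1.25, 2.25]
    print(pulse_transit_times(ecg, wrist))
```

A shorter delay means the pulse wave travelled faster, which is the signal that gets correlated with stiffer arteries and higher BP.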

And it just seems neat.

“AI” and cheating

Oh Wired:

For years, students have turned to CliffsNotes for speedy reads of books, SparkNotes to whip up talking points for class discussions, and Wikipedia to pad their papers with historical tidbits. But today’s students have smarter tools at their disposal—namely, Wolfram|Alpha, a program that uses artificial intelligence to perfectly and untraceably solve equations. Wolfram|Alpha uses natural language processing technology, part of the AI family, to provide students with an academic shortcut that is faster than a tutor, more reliable than copying off of friends, and much easier than figuring out a solution yourself.

This is gibberish. If the main form of cheating is solving equations, the NLP front end is largely irrelevant. It’s not very good either. Indeed, the article goes on to say it’s not very AIy:

The system is constrained by the limits of its data library: It can’t interpret every question. It also can’t respond in natural language, or what a human would recognize as conversational speech. This is a stumbling block in AI in general. Even Siri, which relies heavily on Mathematica—another Wolfram Research product and the engine behind Wolfram|Alpha—can only answer questions in programmed response scripts, which are like a series of Mad Libs into which it plugs answers before spitting them out of your speaker or onto your screen.

Alpha indeed makes Mathematica more accessible (if only by price!) which makes it easier to use for cheating. But afaict this is a web and economic change, not an “AI” change.

And of course there’s the standard Wolfram silliness. Alpha was crazily hyped when it came out but it really isn’t all that. The denials of Wolfram and his employees are hilarious:

Alan Joyce, the director of content development for Wolfram Alpha, says that cheating is “absolutely the wrong way to look at what we do.” But the staff understands what might make teachers uncomfortable. Historically, education had to emphasize hand calculations, says John Dixon, a program manager at Wolfram Research.

Suuuure dude. Sure. And:

Indeed, the people who are directing the tool’s development view it as an educational equalizer that can give students who don’t have at-home homework helpers—like tutors or highly educated and accessible parents—access to what amounts to a personal tutor. It also has enormous potential within the classroom. A “show steps” button, which reveals the path to an answer, allows teachers to break down the components of a problem, rather than getting bogged down in mechanics. The “problem generator” can pull from real datasets to create relevant examples. “When you start to show educators the potential,” Dixon says, “you can see points where their eyes light up.”

This isn’t reporting. They don’t interview educators (except a random Prof of Astronomy). They don’t talk to people trying to cope with the cheating. They don’t look at anything except Wolfram propaganda.

Beyond JSON

JSON pretty clearly won and won big. This is perhaps inevitable given the dominance of Javascript. And that’s ok! JSON is a pretty decent sort of sexpr and having both lists and dicts makes it pretty useful for quick externalisation of all sorts of data. The friction, in a typical scripting derived language, of manipulating JSON in memory is super low. Slinging dicts and lists is something any Python (or Javascript, or…) programmer is going to find congenial.

But it has some infelicities and I don’t just mean the lack of query and schema languages (which is sorta being addressed). JSON is rather annoying to hand author and doesn’t seem great for documents and document formats. Or even hacking existing documents like HTML…if only because there’s no standard reflection of HTML structure into JSON.

There are some moves to improve this situation.

JSON5 tackles the writability. Probably the biggest move is not having to quote (certain) keys in objects. That helps both reading and writing! For reading, there’s a clear visual difference between key strings and “value” strings. For writing, less quoting!!

The other big one is multi-line strings (with the ‘\’ as the continuation character). Having to have a continuation character sucks, but it’s much better than the status quo ante.

Comments are also a good idea! The rest seem minor, but these definitely make a difference.
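Here is a small side-by-side of those three features (unquoted keys, backslash line continuation, comments). The document contents are invented. I only use Python’s standard `json` module, which is strict JSON, so the JSON5 text is shown as a string and the code just demonstrates that a strict parser rejects it; actually reading it would need a JSON5 parser, which I’m not assuming here.

```python
import json

# JSON5 flavour: a comment, unquoted keys, and a string continued with '\'.
JSON5_TEXT = r"""{
  // comments are allowed
  name: "demo",
  note: "a string that \
continues on the next line"
}"""

# The same data in strict JSON: every key quoted, no comment, one-line string.
JSON_TEXT = '{"name": "demo", "note": "a string that continues on the next line"}'


def parses_as_strict_json(text):
    """Return True if the standard library's strict JSON parser accepts text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(parses_as_strict_json(JSON_TEXT))
    print(parses_as_strict_json(JSON5_TEXT))
```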

Mark Notation is aimed at bringing XML-like structuring and extensibility to JSON. It does this by adding a key syntactic (and semantic!) extension, the Mark object which is a name (think tag/element name), a set of properties (think attributes, but with complex contents), and a list of content objects (think child content). It builds on JSON5 so has those authoring felicities.

Semantically, Mark objects get mapped into pretty simple Javascript objects. I don’t fully understand this claim:

contents: an ordered list of content objects, which are like child nodes of elements in HTML/XML. Mark utilizes a novel feature of JS that JS object can be array-like. It can store both named properties and indexed properties.

I don’t see why this matters as you have a special Mark object which has an explicit contents variable. Ah, maybe:

properties: can be accessed through markObj.prop or markObj[‘prop’] when prop is not a proper JS identifier. You can also use JS for … in loop to iterate through the properties. Unlike normal JS array, Mark object has been specially constructed so that Mark contents are not enumerable, thus do not appear in for … in loop.
contents: can be accessed through markObj[index]. You can also use JS for … of loop to iterate through the content items.

So you don’t have to do a field access but just can use special loops. I don’t see that this would be painful in, say, Python even with field accessing. I might default to making Python Mark Objects iterable over the contents (on the theory that that’s more “normal”).
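Roughly what I have in mind on the Python side, as a sketch. This `Mark` class is hypothetical and mine, not the API of any actual Mark library: iterating gives you the contents, integer indexing gives you a content item, and string indexing gives you a property.

```python
class Mark:
    """Hypothetical Python analogue of a Mark object (not a real library API)."""

    def __init__(self, name, properties=None, contents=None):
        self.name = name                          # think tag/element name
        self.properties = dict(properties or {})  # think attributes
        self.contents = list(contents or [])      # think child content

    def __iter__(self):
        # Plain iteration walks the contents, like JS `for ... of` on a Mark object.
        return iter(self.contents)

    def __len__(self):
        return len(self.contents)

    def __getitem__(self, key):
        # String keys look up properties; anything else indexes the contents.
        if isinstance(key, str):
            return self.properties[key]
        return self.contents[key]


if __name__ == "__main__":
    form = Mark("form", {"action": "/submit"},
                ["Name: ", Mark("input", {"type": "text"})])
    for child in form:          # contents
        print(child)
    print(form["action"])       # property
    print(form.properties)      # explicit field access still works
```

The explicit `properties` and `contents` fields are still there, so nothing is hidden; the dunder methods just make the common case short.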

It would be interesting to compare APIs to see whether this really brings JSON-like ease of programmer use.

And, of course, there’s YAML, which you can think of as JSON++++. (JSON is a subset these days.) It’s designed from the ground up for writability and capturing complex structures. And that it does. The price is considerably more complexity. Like a ton more. (You can define entity like things (actually, more like IDs) on the fly! Inline!) It has excellent embedded complex multiline strings (essentially “here-strings“).

I have to say that it might be easier to grow Mark Notation a bit toward YAML than the reverse. Here-like-strings plus id references go a long way.


XML at 20

Tim Bray has a reflective piece on XML’s birthday last month. The key bit is in the middle:

Twenty years later, it seems obvious that the most important thing about XML is that it was the first. The first data format that anyone could pack anything up into, send across the network to anywhere, and unpack on the other end, without asking anyone’s permission or paying for software, or for the receiver to have to pay attention to what the producer thought they’d produced it for or what it meant.

Hmm. Really? I mean, csv dates to the 70s. It is less well specified, I guess, and simpler. The first isn’t really mentioned but maybe this is part of the “pack anything up into”. But then S-Expressions are easily as expressive as XML and go way way back, though largely de facto standardised. But then there’s ASN.1…maybe that needed permission? I can’t find anything that suggests this, at all, though. I don’t remember any such claims at the time. I do remember a lot of struggling to find XML parsers!

So, I’m very very skeptical about this claim. Maybe, it was the first to grab a lot of attention? But then I’m v. skeptical about its influence. A lot of trends were coming together and I think some notation which was human readable would have pushed forward. XML definitely failed entirely to become the language of web pages or of the web.

Update: On Facebook, Peter points out the other giant howler, to wit, “for the receiver to have to pay attention to what the producer thought they’d produced it for or what it meant.” I guess I’m just inured to this because it’s so ubiquitous in the XML and RDF communities, but yeah, the idea that you don’t have to care what it meant is bonkers and part of that is paying attention to what the producer thought they produced it for. And metric tons of effort went into that (bonkers attribution of magic powers to namespaces, anyone?)

(I’m a bit more charitable to Bray in that thread. Maybe I’ll tackle it another day.)