Unit Tests' Effects on “Testability”

If your code base looks like this attic, I pity you.

Unit tests seem to be generally regarded as critical for quality code bases. Prima facie, the key effect we might expect is correctness. After all, that's what most unit tests aim to test!

Unit testing may not be sufficient for correctness (no test approach is, in the extreme) but it does seem that having lots of tests should promote correctness.

However, this primary, most direct outcome is not the only outcome we might expect:

  1. Proponents of test driven development argue that having good unit tests promotes refactoring and other good practices: you can make changes “with confidence” because your tests protect you from unintended effects.
  2. Some units are harder to test than others, i.e., they are less testable. Intuitively, long functions or methods, complex ones with lots of code paths, and complex signatures all make a given unit hard to test (a toy sketch follows this list)! So we might expect that writing lots of tests tends to promote testable code. We might expect synergy with 1.
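
To make “testable” vs. “less testable” concrete, here's a toy sketch (my own hypothetical example, not from the post or from Dietrich's study): the first function is hard to unit test because it mixes file I/O, a long signature, and several code paths; the second isolates the pricing logic as a pure function that a test can call directly.

```python
# Hypothetical illustration of (un)testability; the names and pricing rules are invented.
import json
from datetime import date

def process_order_hard(path, user_id, coupon, region, log_file, dry_run=False):
    # Hard to test: reads a file, applies pricing rules, and appends to a log,
    # so a test has to stage the filesystem and then inspect the log.
    with open(path) as f:
        order = json.load(f)
    total = sum(item["price"] * item["qty"] for item in order["items"])
    if coupon == "SAVE10" and total > 50:
        total *= 0.9
    if region == "EU":
        total *= 1.2
    if not dry_run:
        with open(log_file, "a") as log:
            log.write(f"{date.today()} {user_id} {total:.2f}\n")
    return total

def order_total(items, coupon=None, eu_vat=False):
    # The same rules as a small pure function: easy to unit test in isolation.
    total = sum(item["price"] * item["qty"] for item in items)
    if coupon == "SAVE10" and total > 50:
        total *= 0.9
    if eu_vat:
        total *= 1.2
    return total

def test_order_total():
    items = [{"price": 30.0, "qty": 2}]
    assert order_total(items) == 60.0
    assert round(order_total(items, coupon="SAVE10"), 2) == 54.0
```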

It all sounds plausible (but defeasible). But what does reality say?

We are living in a golden age for empirical study of software engineering in many ways. There's so much stuff freely accessible on the web (code of all sorts, with revision history, and a vast amount of side matter…issue trackers and mailing lists, documentation, etc.). It's a lot easier to get a survey or experiment going.

That's what Erik Dietrich did in a very nice blog post. He looked at 100 projects off of Github, characterized them, then binned them by the percentage of methods which were test methods. If 50% of your methods are test methods, it's a pretty good bet that the code base is heavily tested.

Right off the bat, we have a striking result:

Of the 100 codebases, 70 had unit tests, while 30 did not.

(I'm really loving the WordPress iPhone app EXCEPT for the fact that I can't strip formatting when pasting text and can't keep that formatting from contaminating subsequent text. That sucks, WP, especially FOR A FREAKING BLOGGING APP!!!

Update: It seems that the formatting nonsense is only in the app but doesn’t come through in the actual post. Yay!)

This could be an artifact of his detector or maybe the tests are elsewhere. Still!

Overall, only 5 of his 10 very natural hypotheses were correct. For example, the prevalence of testing anticorrelated with method length and complexity (more tests went with shorter, simpler methods).

For cyclomatic complexity…this may not be surprising. You generally need more tests for more complex code (to hit all the code paths). Also, as supported by “Beyond Lines of Code: Do We Need More Complexity Metrics?” (from the awesome Making Software, which needs a second edition!!), complexity metrics, including cyclomatic complexity, tend to correlate closely with lines of code. So larger methods and more complex methods are going to march together (and probably nesting too).
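
To make the “more code paths, more tests” point concrete, here's a minimal, hypothetical sketch (my own toy example, not drawn from Dietrich's data): the function below has three decision points, so its cyclomatic complexity is 4, and exercising every branch outcome already takes four test cases. A method with ten such branches needs correspondingly more tests, which is the mechanical link between complexity, size, and test count.

```python
# Toy example: cyclomatic complexity = decision points + 1 = 4 here,
# and covering each branch outcome takes roughly that many test cases.

def shipping_cost(weight_kg: float, express: bool, international: bool) -> float:
    if weight_kg <= 0:            # decision 1
        raise ValueError("weight must be positive")
    cost = 5.0 + 2.0 * weight_kg
    if express:                   # decision 2
        cost *= 1.5
    if international:             # decision 3
        cost += 10.0
    return cost

def test_shipping_cost():
    assert shipping_cost(2.0, express=False, international=False) == 9.0
    assert shipping_cost(2.0, express=True, international=False) == 13.5
    assert shipping_cost(2.0, express=False, international=True) == 19.0
    try:
        shipping_cost(0.0, express=False, international=False)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for non-positive weight")
```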

In any case, this is a very nice start.

Google’s Computer Science Resource…thing (list? library?)

Google has a fancy curated page of computer science educational material, called the Google “Tech Dev Guide”. It's not just a categorized list (e.g., by topic, or by type of material); it has “pathways”!

The basic “Foundations” pathway is weird as it just launches you into a set of problems. The first is a former Google interview question. It looks fun and helpful. There are hints. There’s a discussion. There’s just no orientation. I have no idea why I’m here and what I’m doing.

I'm supposed to have 1 or 2 programming courses under my belt. The first problem's learning objectives are:

This question gives you the chance to practice with algorithms and data structures. It’s also a good example of why careful analysis for Big-O performance is often worthwhile, as is careful exploration of common and worst-case input conditions.

The discussion is interesting (at a glance) but it only has an implicit analytical framework. I'm not given a strategy; I just see it enacted. I wonder how well that works out for people! I'd think this would discourage completion.

(I wonder if Manchester should put one of these together…or maybe I should for Software Engineering…)

The GPL Won

It seems like GPL and related licenses dominate open source projects, by a lot:

At the beginning of his talk, DiBona said that according to Google’s net crawlers, the web now contains over 31 million open source projects, spanning 2 billion lines of code. Forty-eight per cent of these projects are under the GPL, 23 per cent use the LGPL, 14 per cent use the BSD license, 6 per cent use Apache, and 5 per cent use the MIT license. All other licenses are used with under 5 per cent of the projects.

So, GPL variants govern 71% of 31 million projects. Daaaamn. That’s a lot. A lot more than the rest, which are less restrictive.

I confess to being a bit surprised given the hostility (or exasperation) one often encounters from, e.g., business folks when dealing with the GPL. Of course, it has two strong factors: it's viral (so derived projects must use it) and it has a lot of advocacy, both dedicated (think Stallman) and more incidental (think Linux in general).

Ooo, I really have an itch to find out whether virality is a big factor….

Update:

On Facebook, Dan Brickley (thanks Dan!) points out that 1) this survey is from 2011 and 2) more recent surveys point to a shift away from the GPL to more permissive licenses, to wit, MIT and Apache:

Indeed, if we contrast each license’s share of the repositories surveyed by Black Duck [January 2017] versus January 2010, the shift is quite apparent….

In Black Duck’s sample, the most popular variant of the GPL – version 2 – is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%. What this means is that over the course of a seven year period, the GPLv2 has gone from being roughly equal in popularity to the next nine licenses combined to 10% out of first place.

All of which suggests that if we generally meant copyleft when we were talking about open source in 2007, we typically mean permissive when we discuss it today.

Read the whole thing, esp. the bit about the rise of unlicensed projects on Github.

Now, methodologically, their survey is smaller:

This open source licensing data reflects analysis of over two million open source projects from over 9,000 global forges and repositories.

So, it might be the case that the Google population wouldn't show this shift. But, ex ante, a focused crawl is more likely (perhaps) to be dominated by “high quality” repositories, and thus may reflect best or active practice better.

This all still cries out for some causal investigation.

The Concentration of Web Power

In the early days of the Web (and the Internet in general), there was a belief that it was a different kind of system. Distributedly anarchistic; that is, there was no central authority. No one organisation owned…or could own…the whole network. Oh, we worried about some centralising tendencies…browser makers had a lot of power. DNS is pretty centrally controlled. But the decentralised, distributed nature was also supposed to be resilient against various sorts of attack, including attempts to control it. The internet “routes around damage”, including control freaks. The contrast class was the closed “walled gardens” like Compuserve and AOL. These were behemoths of online activity until they were crushed by open systems. Internet systems just scaled better because they were open and decentralised…so the story went.

But the world has changed. While we have more online activity than ever before, the capability for single organisations to control it all has also increased. Many organisations (Google, Apple, Facebook, Amazon, to name a few) have the technical wherewithal and economic resources to build a global infrastructure that could handle the traffic of the current Web. (More orgs could do it if we count reusing the current communication infrastructure. Google experiments notwithstanding, it would be a heavy lift for them to build out broadband and cell towers across the globe. These organisations could, of course, buy mobile carriers and backbone providers…)

Furthermore, governments have been rather effective in controlling the internet, cf. China. The ability to route around such breakage on a mass scale is proving fairly limited.

As a result, power is getting increasingly concentrated in the online world (which, given the economic gains, is translating to power everywhere). We are pretty close to the Compuserve/AOL walled garden world. Even nominally distinct organisations rely heavily on the same high level infrastructure, which is why you see the same awful ads everywhere.

Google seems to be trying hardest for vertical integration. They push Chrome relentlessly and that lets them play with protocols specific to Chrome and their servers. Sure, they generally push them toward standards…but this is a new capability. Facebook tried this a bit (Facebook phone anyone?) but didn’t do so well. Mobile apps make this true for everyone.

Strangely, we're getting some refragmented experiences. I was trying to debug Zoe's new YouTube Channel (a bad experience, to be sure) and some things were different in the iPad app than on the website. Fine, fine. I mean, terrible, but ok. But I tried to debug it on my iPad and I could not open the YouTube website in a browser. It forced me into the app with no other options (“Open in YouTube App? Ok. Cancel.” where “Cancel” means don't open the page!). (Ok, I eventually “googled” how to modify the URL (from “www.” to “m.”) to make it work, but holy hell that was wrong.) I was trying to share a page from the Guardian and got stuck in AMP hell. I could not get rid of the AMP/Google URL to get to the Guardian one (without string hacking).

That AMP page sure loaded fast…

…but the URL, server, connection all belong to Google.

It is interesting that mass phenomena are easier to control than small scale ones, in some respects. There's more money available and most people are not very technically sophisticated. Hell, I'm not. I don't want to be. I leave a lot of things broken because while I could suss it out, I just don't have the time or energy to fight through all the nonsense. (And there is a ton of nonsense.)

So, this is the world we live in. The problem isn't that the big players will never stumble and fall…they might! Yahoo died. If Facebook dies (depending on how it does) it will be traumatic. If Google dies (depending on how it does), it will be very bad. But these are probably survivable events in most scenarios. They'll degrade slowly, other big players will grab some of the pieces, and some new players will get big.

However, it’s hard to see how it won’t be big players from here on out.

Update:

Ruben Verborgh has a related discussion. My quick hit on the key difference is that Ruben isn’t focused on the centralisation and web architectural aspects, but on, roughly, publishers vs. consumers. A world where every site has a preferred app and directs you to it is still (potentially) a decentralised and distributed one. He’s more focused on a different sort of power, i.e., the power of individual site owners over their viewers. Now, obviously, this is related to the concentration of power I focused on since one of the bad things an “owner of the web” can do is exploit the rest of us in a myriad of ways. But I think it’s important to note that concentration and centralisation have not been “app” driven. Network effects (plus quality!) seem sufficient to explain the rise of Google and Apple. Facebook and Twitter rose by network effects alone, I’d say. Once you have a ton of cash and a big user base, you can exploit those in a variety of ways to gain more power. Though this isn’t trivial, witness Google’s persistent failures in social (Google+!?).

Kyle Schreiber writes about AMP lock in (via Daring Fireball):

Make no mistake. AMP is about lock-in for Google. AMP is meant to keep publishers tied to Google. Clicking on an AMP link feels like you never even leave the search page, and links to AMP content are displayed prominently in Google’s news carousel. This is their response to similar formats from both Facebook and Apple, both of which are designed to keep users within their respective ecosystems. However, Google’s implementation of AMP is more broad and far reaching than the Apple and Facebook equivalents. Google’s implementation of AMP is on the open web and isn’t limited to just an app like Facebook or Apple.

AMP (and other URL/browser behavior hijackings like search result URLs) is extra offensive because it hits deep into the working of the Web. But if we all end up living in Facebook all the time, it won’t matter if the URLs “look” independent.

Note that I host my blog on WordPress, for “free.” WordPress is a pretty popular Content Management System with an ecosystem that is hard to rival. But the results are pretty vanilla Web. WordPress.com itself isn't super dominant and it's very easy to break free. There seems to be a material difference in the situations.

Ontology Management on the Gartner Hype Cycle!

The Gartner hype cycle is an analytical construct (of sorts) which tries to capture the relation between a technology and the expectations we have for that technology. It's based on the pretty reasonable observation that, esp. with new technology, there's a tendency for expectations to outrun the current or even potential benefits. Everyone wants to use the new glittery magic, so vendors and specialising consultants do very well for a while. But it turns out that the new technology isn't magic, so people find that they've spent a bunch of money and time and energy and they still have the problems the tech was supposed to magically solve. This leads to a crash in expectations and a backlash against the tech. But lots of new tech is actually useful, used appropriately, so some of the new tech, its shine worn off, finds a place in our toolkit and tech landscape. The Gartner hype cycle is a pretty iconic graph with fun-ish labels:

(The y-axis gets different labels over time.)

And people try to operationalise it:

(Image: a generic, operationalised hype cycle chart.)

But I'm skeptical about a lot of this being rigorously evaluated.

Of course, sometimes a tech takes off and doesn't really stop. It goes pretty straight from trigger to productivity. The iPhone (and iPhone-style phones) comes to mind. It Just Grew. It may level off as it hits saturation, but that's a completely different phenomenon.

This is all pretty banal stuff, but Gartner takes it very seriously (they’ve branded it!).

ANYWAY, this year’s hype cycle, excitingly, includes ontology management for the first time! WE’RE ON THE MAP!

  • 16 new technologies included in the Hype Cycle for the first time this year. These technologies include 4D Printing, Blockchain, General-Purpose Machine Intelligence, 802.11ax, Context Brokering, Neuromorphic Hardware, Data Broker PaaS (dbrPaaS), Personal Analytics, Smart Workspace, Smart Data Discovery, Commercial UAVs (Drones), Connected Home, Machine Learning, Nanotube Electronics, Software-Defined Anything (SDx), and Enterprise Taxonomy and Ontology Management,

Alas, if you look at the graph, we're on the downslope into the Trough of Disillusionment.

And it has a “more than 10 years to mainstream adoption” label.

Ouch!

This is discouraging and perhaps hopeful. Remember that the hype cycle doesn't tell you much about the quality, maturity, or utility of the technology, only the perception, and the influence of perception on the market. (To the degree you believe it at all.) 10 years to mainstream adoption is not 10 years from being a boon for your business or a viable business itself. It means you will often have a hard sell, because people are skeptical.

Update: Oh WordPress. Picture management please.

Grumpy about Textbooks

I definitely need to do more research but I don’t feel that there is a really solid textbook on software engineering. I use Steve McConnell’s Code Complete (second edition) and Making Software for readings.

These are both pretty good. Code Complete is a bible for many people (not for me!) but regardless it’s definitely on a “you should read this if you are a software engineer” list. It has a few problems though:

  1. It's not written with courses in mind, as far as I can tell. It introduces a lot of stuff, sometimes in a helpful order, but other times not. The “learning objectives” are not clear at all.
  2. It's not super well written. You get a lot of interesting lists (e.g., of program qualities) but they are often not coherent, have some redundancies, and are perfunctorily designed. These often feel revelatory on a first read, but if you try to work with them you get a bit grumpy. For example, we have 4 kinds of tests: unit, component, integration, and system. Unit and component tests cover bits of the same size: a unit. The difference is whether the unit is maintained by one team (thus a unit test) or more than one team (a component test). This is bonkers. It's esp. bonkers to compare with integration or system tests. It could be part of an interesting axis (who's the owner vs. who's writing the tests). But there are much better frameworks out there.
  3. It's a bit dated. The second edition came out in 2004 and is thus 12 years old. This doesn't invalidate it per se, but given that the book itself has a prominent discussion of the need for lifelong learning because the fundamentals of software engineering keep changing, it's a problem. I'd prefer something other than Basic as the “other” example language.
  4. It pretends to focus on code construction, but has just enough architecture, etc. to be almost a reasonably complete text. But the scattershot approach is a bit disorienting.

If you read it cover to cover and absorbed it all with an appropriately skeptical eye and organised it appropriately, then you’d be in great shape.

My pal Mark suggested reorienting around The Pragmatic Programmer, which is another classic and definitely on the must-read list. But a lot of my concerns apply to it too. (That there's a basic divide between those pushing Code Complete and those pushing The Pragmatic Programmer is interesting. The line I've seen is that Code Complete aspires to be encyclopaedic while The Pragmatic Programmer is more opinionated and thus effective. Roughly. They both feel scattered to me.)

I could try both (not this year!). Or I could go with The Pragmatic Programmer because it's smaller, and thus the students could plausibly read the whole thing.

But neither feels satisfactory as a textbook. The systematicity and pedagogic logic just don't seem to be there. So I'm left imposing some order on them.

Software Gripes: Scrivener and ConcertWindow (and WordPress)

I think I need regular “features”, i.e., columns of a particular type or theme, to keep the blogging going, so here's a new one near and dear to my heart: ranting about software problems. (I'll throw in other system gripes, but the most common will be software.)

Scrivener

I want to love Scrivener. It certainly is enticing, if a bit complex. I'm trying to use it as a course materials (lectures, quizzes, etc.) management and editing tool. People certainly seem to have had some success with it as such. I think it could also be handy for paper or book writing and esp. grant writing. Grants have a VERY complex and finicky structure, which Scrivener's “break it into bits”, “annotate and organise”, and “hey, templates all the way down” approach looks well suited to handle.

But there's a fundamental problem: The whole Scrivener model is “compiling” the project into a single final document. Really. Uhm…that's bonkers. Even if your final output is conceptually a single book, you very well may want the “out of Scrivener” view to be split up into multiple files. (Think Website with a separate HTML page per Chapter. Or just Website.) For courses, I don't want one output to contain it all; I want lots of documents (syllabus, references, slides broken out by day or by lecture, quizzes, lab sheets, etc.). Scrivener HAS THAT STRUCTURE, but, as far as I can tell, it doesn't like to spit it out. You can “export” the file structure, and maybe that will turn out to be good enough. (I only figured that out today.) But I want some of the structure to be flattened! E.g., if I make a Lecture which has separate subdocuments as “slides”, for some workflows they should be combined! But the whole Lecture shouldn't be combined with all the other lectures. (Except for the global print version.)

Ok, “export” at least lets me write my own custom compiler (a sketch of what I mean follows below). But then why do I have to deal with the “project” structure and explicitly set “export”? Why can't a project just be a directory/file structure in the file system? In other words, why “export”? That adds a really painful step to the process. It makes synching harder, etc.
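
For concreteness, here's a rough sketch of the kind of custom compiler I have in mind, assuming an export that produces one folder per lecture with one Markdown file per slide (the folder layout, file names, and extensions are hypothetical, not anything Scrivener promises):

```python
# Hypothetical post-export "compiler": flatten each lecture's slide files into
# one Markdown document per lecture, while keeping lectures separate.
from pathlib import Path

EXPORT_DIR = Path("scrivener_export/Lectures")  # assumed export location
OUTPUT_DIR = Path("build/lectures")

def compile_lectures(export_dir: Path, output_dir: Path) -> None:
    output_dir.mkdir(parents=True, exist_ok=True)
    for lecture_dir in sorted(p for p in export_dir.iterdir() if p.is_dir()):
        # Combine the lecture's slides in order...
        slides = sorted(lecture_dir.glob("*.md"))
        combined = "\n\n---\n\n".join(s.read_text(encoding="utf-8") for s in slides)
        # ...but emit one output file per lecture, not one global document.
        (output_dir / f"{lecture_dir.name}.md").write_text(combined, encoding="utf-8")

if __name__ == "__main__":
    compile_lectures(EXPORT_DIR, OUTPUT_DIR)
```

The point is the selective flattening: slides merge within a lecture, but each lecture stays its own output document (and a global print version would just be one more pass over the same files).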

Additionally, Scrivener has some simple WYSIWYG formatting (bold, italic, tables, lists, etc.). It also has export to MultiMarkdown. This all seems extremely promising for downstream processing: write using the GUI, export to Markdown, then run tools that parse and manipulate the Markdown to generate the final versions.

Oh, silly me! All Scrivener does is compile snippets written in MultiMarkdown to other formats (HTML or PDF via LaTeX)! You have to write the Markdown.

Well, that sucks. It's not like Scrivener is a world-class Markdown editor with syntax checking, etc. The key formatting features it supports in the GUI are eminently Markdownable, so why not export to it? Indeed, for things like tables, having a reasonable GUI is much, much nicer than hacking Markdown syntax directly. Sigh.

Finally, they have this cork board view. Before 2.7, it defaulted to a cork-textured background and index-card-looking cards. Very skeuomorphic, but in a good way. It took you out of the UI and forced a cognitive mode shift. In 2.7, it defaults to a “flat” interface that is 1) bland and 2) merged visually with every other view.

Sigh. But wait! You can tweak it back. But now, in my preferred Index Card style, they've stuck a pushpin in each card.

Why, why, why, why?! It doesn't read; it doesn't help; it forces a “vertical” orientation (I actually viewed them as piles before). This little tumour does exactly nothing positive. It serves no visual-informatics purpose and, indeed, distracts. It's centred, bright, and in line with meaningful information. This is skeuomorphic madness, where the designer slavishly emulates the real world object without thinking about the design. Pushpins are not a useful informational part of the design…they are there to hold the cards in place. If you lay the cork board flat, you don't need them.

“But Bijan,” you say, “the cork only exists to have pins pushed in! Isn’t that the same problem?”

No, gentle reader. While the cork in the real object is there functionally to be stuck with pins, it has several user interface functions: 1) visual mode switching; it's a very strong cue about the difference in working style, so it provides an information cue; 2) it supports the illusion without affecting other information per se; and 3) it is high contrast yet not obtrusive. The main problem with skeuomorphism is that people take it too far. The idea shouldn't be to exactly replicate the real world object, but to design an interface that works. Flat interfaces generally suck because they are generally designed so that chrome is indistinguishable from content (or not perceptible at all) and content has few distinguishing sub-features. (Microsoft's Metro interface is something of an exception.)

ConcertWindow

Zoe tried to do a ConcertWindow concert last Sunday. There were numerous technical hassles, but we managed to struggle through most of them and have a reasonable concert, most of which most viewers could see. One cool feature is that you can get the full recording of the stream, and the site lets you post a one-song snippet of the recording on their website. This was exactly what we wanted to promote the new album (in progress).

We do not have such a recording.

The reason we do not have one is that they have a “feature” that is supposed to help you debug your streaming. For a given concert slot, you can set up a “test” session which will not be exposed to anyone except your testers and can happen at other times than your scheduled slot. This sounded sensible, but there were a few problems:

  1. It doesn't work from the iOS app, which is how we were going to broadcast the concert. Grr. But ok, we can at least test the basic setup via the browser version.
  2. Testing via the browser version just doesn't help very much. You still need to test via the iOS app. A lot. So we were scheduling test concerts all over the place. That was better in some ways, since that's what exposed that the “Pay what you want” option is really “Pay what you want as long as it is at least $1”. Grr.
  3. When you go to look at your video, they prepend, “for your reference”, all the test video you shot. What? Why? Who wants that? Who wants that in their concert recording? Shouldn't you just save that as a separate file, if at all? Weird.
  4. Oh, and if you tested in your browser, but recorded from iOS, you now have a video that is half test video and half corrupted nothing. That’s right, the “test” mode can corrupt your concert recording. So we have no video of the concert, whatsoever.
  5. In the FAQ for “Preparing for the show” they have “How can I sound check before the show?” which says

    Choose if you’re going to broadcast with Web, iOS, or RTMP, then switch to “Test” mode and start broadcasting. No one will be able to see it on your channel. Click the “Test URL” link below the broadcaster and you’ll be able to see your test stream in real time. You can also send this link to a friend.

    In the FAQ for “After the show” they have “My archived video file has errors and/or the recording is corrupted” (it's on the SECOND PAGE of this FAQ)

    This can sometimes happen if you broadcasted to the same show via multiple devices (iOS + laptop) or in different frame rates / formats.

    To avoid this happening, be sure to broadcast to each show using only one device and one video/audio format.

    If you do broadcast using multiple devices or formats, the live stream will work totally fine, but the archived recording may be corrupted.

    So, the advice they give beforehand can corrupt your recording, because they have a feature (prepending test video) which is completely worthless. And their own help leads you there.

Message to the ConcertWindow programmers who did this: Never corrupt important data. Never. Ever. Especially don't corrupt real data with test data. I mean…come on. Shame.

Message to the ConcertWindow documentation writers who did this: If there is a risk of data corruption…DON'T RECOMMEND ACTIONS THAT RAISE THAT RISK. Oh, and WARN PEOPLE ABOUT THE RISK AND HOW TO MITIGATE IT before they might do the action that destroys their data.

You should be profoundly ashamed of yourselves.

While we’re talking documentation nonsense, let’s consider this gem:

At Concert Window, we give the artist a full private copy of their show, for free. You can use it for any non-commercial use, including uploading it to YouTube. 

The video files are in .mp4 format, which is playable with most major video players including VLC and can be imported into iMovie and Final Cut Pro.

Sometimes, due to errors during broadcast or other reasons, the video files may be corrupted or unplayable. In that case we’re sorry but there’s nothing we can do. This is part of why we offer video archives as a free service.

In addition to downloading your full show recording, you can also create a short highlight video. Here’s an article with more details: How to create a highlight video

*Artists are not allowed to sell their show videos due to copyright restrictions.

First, note the “or other reasons” for corruption…like BEING MISLED BY THE DOCUMENTATION TO HIT A DESIGN BUG WHICH IS KNOWN TO CORRUPT YOUR CONCERT. Maybe you should fix that.

Second, note the nonsense of the highlighted blocks. Zoe owns the copyright for the songs she played and for the performance. The terms of service explicitly SAY that she owns the copyright.

(BTW, the terms of service are absurd and horrible. I’ll break that out in another post.)

WordPress

Current gripe: Adding a category doesn’t put the new category under the parent one you’ve selected.

Also, I want to have categories be more meaningful. I’m currently inserting two key categories into my post title (see current post’s title): Music Monday and Software Gripe. This is wrong. I’m polluting my title with Metadata about my post in order to get the visual effect I want. Boo!