Demoing YAGNI with FizzBuzz

I use FizzBuzz as a warm-up lab and as a key lecture. Tom Dalling has an excellent blog post where they iterate the implementation of FizzBuzz into a fully packaged, fully parameterized (with sensible defaults), fully documented module. Which is horrific…completely larded with unnecessary complexity. And the punchline is “The Aristocrats”…er, “You Ain’t Gonna Need It”.
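To give a flavour of the progression (this is my sketch, not Dalling’s actual code): you start with the obvious thing, and then each “improvement” is locally defensible. Here’s the starting point plus one such step, parameterizing the rules:

```python
def fizzbuzz(n):
    """The straightforward starting point: return the FizzBuzz lines for 1..n."""
    lines = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            lines.append("FizzBuzz")
        elif i % 3 == 0:
            lines.append("Fizz")
        elif i % 5 == 0:
            lines.append("Buzz")
        else:
            lines.append(str(i))
    return lines


def fizzbuzz_general(n, rules=((3, "Fizz"), (5, "Buzz"))):
    """One 'inevitable' generalization: parameterize the substitution rules.

    Locally sensible! But nobody asked for configurable rules.
    """
    lines = []
    for i in range(1, n + 1):
        words = "".join(word for div, word in rules if i % div == 0)
        lines.append(words or str(i))
    return lines
```

Each move is like this: plausible in isolation, and collectively a monster.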

I’m wondering if this would work as a lecture or even if it makes sense. Each step is supposed to be inevitable or at least sensible but there’s something made up about it. OTOH, it might go too far toward suspicion of abstraction.

It could be fun if the build-up was right. Perhaps it would be great if it also made clear when those moves made sense. Right now, it’s only negative, but a lot of those moves are good ones in some contexts.


The Viz Fallacy

Taxonomies are fun and usually tree-shaped. Trees are “easy” to visualize, so when people have a taxonomy they often like to visualize it.

But the results aren’t always so nice. Given that this is a taxonomy of fallacies, I found their misuse of a visualization highly amusing. It is also reminiscent of one of my thesis topics (non-logical reductios).

mc et al called this the pathetic fallacy (of RDF).

The Web is (Holy Crap) Insecure

Blame Javascript. And browsers. And the W3C.

77% of 433,000 Sites Use Vulnerable JavaScript Libraries. I’ll bet that various random webpages I put up are vulnerable. Who updates random pages?! Not me!

Of course, my random, unvisited experimental pages are probably not a big problem, but still. Yeek.

Oh my god, SVG is a mess (read the paper). Short answer: It’s almost impossible to use SVG safely. And almost no one is trying.

Something’s going to break in a big way some day.

Gantt Charts

Oh Gantt charts…the bane of many a student and grant proposal writer.

I spent a good chunk of today creating a starter kit for MSc projects on the School’s gitlab. I’m a big believer in canned setups even when experience says they don’t work that great. At least, not for everyone. But hey, at the very least they might encourage people to use gitlab for their projects!

One thing that always causes problems is the god damned Gantt charts. Don’t get me wrong, a bit of speculative “big picture” planning is a good idea. But…I don’t think we train them in Gantt charts, nor do we supply nice software. Gantt charts are a nightmare! Most Gantt chart software is a nightmare!

I really wanted to use a textual syntax as the common denominator. The three candidates of note were:

  1. Mermaid (a Javascript library with reasonable Markdown integration)
  2. PlantUML, which has a nice, if experimental, syntax with half-assed rendering
  3. pgfgantt, which is comprehensive and thoughtful, with genius output and documentation that makes my gums bleed.

PlantUML has a sorta friendly syntax but seems to want to have every day as a slot, which makes a 9 month project unwieldy.

Mermaid refuses to put days on the top or have flexible slots (e.g., weeks). It also doesn’t have milestones (you have to fake it with a one day task). But it scales reasonably.

pgfgantt truly looks brilliant but I just couldn’t fight it today. It’s perfect for LaTeX docs but maybe not so much for Markdown. Plus, not really a friendly stand alone syntax. Definitely something to explore for another day.

With a lot of experimenting, I was able to coax a decent looking Gantt chart from Mermaid with reasonable source. Here’s the source:

gantt
    dateFormat  YYYY-MM-DD
    title Generic CS MSc Project Plan
       
    section Period 3 
        COMP66090            :done,  des1, 2018-01-29, 2018-03-07

    section Period 4
        Supervisory Meetings (term) :active, sup1, 2018-03-08, 2018-05-04
        POP Submission :sub1, 2018-05-10, 2018-05-11

    section Exams
        Exam Period :exams, 2018-05-06, 2018-06-06
        Semester End : 2018-06-08, 2018-06-09

    section Summer
        Supervisory Meetings : sup2, 2018-06-09, 2018-09-07
        Dissertation Submission :sub2, 2018-09-06, 2018-09-07

and here’s the output:

Not bad! Easy to edit. For students, they can just delete Period 3 and Period 4 and have a chart skeleton ready to go. Mermaid includes some dependency support (hence the names) but we really don’t need that for what we’re trying to do: Break a project into tasks, estimate times for a single person, and try to slot them in a calendar. This does the job.
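Since the chart is just text, a skeleton like this could even be generated. As a toy illustration (my own sketch, with made-up names, not part of the starter kit), here’s a few lines of Python that emit a Mermaid gantt source from a task list:

```python
from datetime import date, timedelta


def mermaid_gantt(title, sections):
    """Emit Mermaid gantt source from (section, [(task, start_date, days)]) data.

    A toy sketch for generating project-plan skeletons; real charts would
    want task ids, status flags, etc.
    """
    lines = ["gantt", "    dateFormat  YYYY-MM-DD", f"    title {title}", ""]
    for section, tasks in sections:
        lines.append(f"    section {section}")
        for name, start, days in tasks:
            end = start + timedelta(days=days)  # dates render as YYYY-MM-DD
            lines.append(f"        {name} : {start}, {end}")
        lines.append("")
    return "\n".join(lines)
```

For example, `mermaid_gantt("Generic CS MSc Project Plan", [("Summer", [("Dissertation Submission", date(2018, 9, 6), 1)])])` produces a chart fragment like the one above.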

Apigee Awesome API Consoles Bite The Dust

Alas.

They were cool. The console was web based, with no sign-in (except for individual services). Lots of standard services (Twitter, Facebook, YouTube, etc.) were predefined. Perfect for class discussion. It could use the authentication from your currently logged-in services, so it was very trivial to get going.

I mean, check it out (until April 15th, 2018).

The alternatives look pretty grim, thus far. There are some nice tools, but none seem to fill the niche.

Python Static Site Generators: External Dependencies

How “big” are they?

Yesterday, I looked at SLOC counts for 4 major Python static site generators. Here’s the summary of their counts wrt python files:

Package            Python files   Python SLOC
Nikola (plugins)   147 (75)       17,506 (8,510, 48%)
Pelican            18             4,539
Ivy                33             1,004
Urubu              10             784

As you can see, there’s a pretty steep gradient, even if we separate out Nikola’s plugins. But this only scratches the surface of each one’s intellectual footprint. Given that “integrating third party components” is on the list of considerations, I thought I’d peek at their dependencies. This reminded me that Nikola might be a bit inflated as I installed it with “extras”. Since this option is easily available from pip it didn’t seem unreasonable, but there might be a “slimmer” version of Nikola. (A quick install of the non-extras version yields 500 more SLOC! No idea what’s up right now.)

Let’s look at the dependencies.

Nikola: Pygments, Pillow, python-dateutil, docutils, mako, Markdown, unidecode, lxml, Yapsy, PyRSS2Gen, logbook, blinker, setuptools, natsort, requests, piexif, doit

Pelican: feedgenerator, jinja2, pygments, docutils, pytz, blinker, unidecode, six, python-dateutil

Ivy: markdown, pygments, pyyaml, jinja2, libmonk, ibis, shortcodes, libjanus, syntex

Urubu: jinja2, pygments, markdown, pyyaml, beautifulsoup4

(Note that the Pelican docs say you have to install Markdown yourself. So it really should be on this list.)
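One way to pull such lists programmatically (rather than reading setup files by hand) is the standard library’s importlib.metadata, which reports the declared dependencies of whatever is installed in your environment. A small sketch, with the caveat that marker parsing here is crude (I just look for the common normalized “extra ==” form):

```python
from importlib.metadata import requires, PackageNotFoundError


def declared_deps(package):
    """Return the declared, non-extra dependencies of an installed package.

    Requirements guarded by an 'extra == ...' marker belong to optional
    extras, so we filter them out; the remainder have any environment
    markers stripped for readability.
    """
    try:
        reqs = requires(package) or []  # None when a package declares nothing
    except PackageNotFoundError:
        return []
    return [r.split(";")[0].strip() for r in reqs if "extra ==" not in r]
```

For example, `declared_deps("pelican")` in an environment with Pelican installed would list roughly the middle column above.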

So, they all outsource:

  • Templating, typically jinja2 (though that’s an extra for Nikola; mako is the “builtin”)
  • Syntax, typically markdown and/or Restructured Text
  • pygments, because nerds gotta have source code highlighting

Nikola has a fair bit of image stuff (Pillow, piexif). It supports image galleries and the like out of the box.

Ivy has some homegrown utils for standard stuff (e.g., libjanus for argument parsing).

Barring install issues, I’m not sure the sheer number of packages should be a deterrent. It seems easy enough to focus down “on the core”, and doing so in the context of a lot of other stuff seems valuable.

Nikola Extended

I wanted to look at the Nikola “extras” and “tests” dependency lists separately:

Extras: Jinja2, husl, pyphen, micawber, pygal, typogrify, phpserialize, webassets, notebook, ipykernel, ghp-import2, ws4py, watchdog, PyYAML, toml

Tests: mock, coverage, pytest, pytest-cov, freezegun, codacy-coverage, colorama

That test stuff is really tempting, both because it suggests that Nikola has great testing and because it touches things I wanted to cover anyway.

Static Site Generators

I’ve become very interested in static site generation systems. They have become increasingly popular, especially as client-side Javascript has replaced many things you needed server control for, and third-party services (such as Disqus) have replaced a lot more. Static sites are generally cheaper to deal with (in total cost) and support some kinds of flexibility that are hugely painful with server-heavy solutions.

I’ve got three major use cases (beside my personal website):

  1. My course websites. Right now I manage these statically but with a hodgepodge of systems including horrid HTML hacking.
  2. Coursework for my software engineering class. I want them to have to deal with a real, moderately large but well behaved system. Right now, they just work on a wc clone from scratch.
  3. Zoe’s website. Or more generally, musician websites.

1 forces a certain flexibility. I want to migrate slowly. I want to incorporate slideshows, notebooks, coursework stuff, etc., and I want progressive display (i.e., week 1’s stuff, then week 2’s stuff…)

3 forces non-expert suitability. It has to be easy enough for people without high levels of computer expertise. I’ll accept requiring a command line.

2 means that the code has to be sensible. Preferably, there would be some tests already. A plugin architecture would be helpful. I need a generally sensible framework so I can assign comprehension, debugging, performance, and extension tasks. It needs to be reasonably testable so some degree of marking or other analysis can be supported by automated tooling. It has to be in Python as well, since that’s what I’m using and that’s what my class uses.

I’d prefer one system for all of these so that I can build up a high level of expertise, examples, etc.. So I’d better not hate it.

Candidates

My unscientific screening leads me to the following candidates.

Pelican and Nikola are heavyweights: big codebases, big communities, lots of functionality, etc. They are also “blog first” frameworks, though they can handle “pages”. Think of them as static WordPress clones.

Urubu and Ivy are smaller, generally one-person efforts. They center on websites with a hierarchical folder structure. Urubu advertises itself as a “microCMS” and I think that’s a fair label for Ivy as well.

We can make this a touch more precise by looking at some metrics for each system. Indeed, some lines-of-code metric seems like a good starting place!

I used three tools: two Python-based tools (radon, metrics) and the more-or-less standard sloccount. My first thought was to just throw each tool at the whole package directory from the command line and see what comes out in a summary. This means that plugins, templates, examples, etc. are all potentially in the mix, but also that I need a decent summary (which ruled out pygount).
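At heart, physical-SLOC counting is simple: walk the tree and count non-blank, non-comment lines. A crude stdlib sketch of what these tools do (real tools handle multi-line strings, encodings, and many languages, so the numbers won’t match exactly):

```python
import os


def sloc_count(root, ext=".py"):
    """Crude physical SLOC: count non-blank, non-comment-only lines.

    Returns (file_count, sloc). Docstrings and string-embedded '#' are
    not handled; this is an approximation, not a sloccount replacement.
    """
    total = files = 0
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            if not name.endswith(ext):
                continue
            files += 1
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                for line in f:
                    stripped = line.strip()
                    if stripped and not stripped.startswith("#"):
                        total += 1
    return files, total
```

The point is just that the metric is cheap to compute, which is partly why every tool reports a slightly different flavour of it.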

Here’s what I get when I try to measure Nikola:

$ radon raw nikola -s
[lots of individual file stats]
** Total **
    LOC: 26393
    LLOC: 12145
    SLOC: 17733
    Comments: 3291
    Single comments: 3964
    Multi: 1245
    Blank: 3451
    - Comment Stats
        (C % L): 12%
        (C % S): 19%
        (C + M % L): 17%

Not bad! And it jibes with sloccount:

$ sloccount nikola
[lots of noise and I edit out blank lines and copyright stuff]
Totals grouped by language (dominant language first):
python:       17063 (100.00%)
Total Physical Source Lines of Code (SLOC)                = 17,063
Development Effort Estimate, Person-Years (Person-Months) = 3.93 (47.19)
 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months)                         = 0.90 (10.81)
 (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule)  = 4.36
Total Estimated Cost to Develop                           = $ 531,251
 (average salary = $56,286/year, overhead = 2.40).
...
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."

Interesting! I like the COCOMO stuff. I talk about COCOMO a tiny bit in class, so that’s fun. The SLOC is close (within 3%), which is fine! Yay!
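The Basic COCOMO formulas sloccount prints are easy to check by hand, which makes for a nice in-class exercise:

```python
def cocomo_basic(sloc):
    """Basic COCOMO, organic mode, as printed by sloccount.

    Effort (person-months) = 2.4 * KSLOC**1.05
    Schedule (months)      = 2.5 * effort**0.38
    """
    ksloc = sloc / 1000
    effort_pm = 2.4 * ksloc ** 1.05
    schedule_m = 2.5 * effort_pm ** 0.38
    return effort_pm, schedule_m


effort, schedule = cocomo_basic(17063)
# effort ≈ 47.2 person-months, schedule ≈ 10.8 months — matching
# sloccount's 47.19 and 10.81 above.
```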

metrics is wack:

$ metrics nikola/**
Metrics Summary:
Files                       Language        SLOC Comment McCabe 
----- ------------------------------ ----------- ------- ------ 
   13                         Python        5806    2373   1302 
----- ------------------------------ ----------- ------- ------ 
   13                          Total        5806    2373   1302

Oy, apparently ** isn’t doing its recursive magic. I did some manual expansion and started to hit diminishing returns at this very long command:

$ metrics nikola/** nikola/**/** nikola/**/**/** nikola/**/**/**/* nikola/**/**/**/**/* nikola/**/**/**/**/**/* nikola/**/**/**/**/**/**/*
Metrics Summary:
Files                       Language        SLOC Comment McCabe 
----- ------------------------------ ----------- ------- ------ 
  102                        Cheetah           0      58      4 
   22                            CSS       16196     275      0 
   98                     JavaScript       21852    5102   4676 
    8                           JSON        6472       0      0 
    5                       markdown           4       0      0 
  147                         Python       17506    5815   2978 
   11               reStructuredText        1750     206     37 
   14                      Text only           0       0      0 
    2                           XSLT          54       0      0 
----- ------------------------------ ----------- ------- ------ 
  409                          Total       63834   11456   7695

Ok! This is pretty cool. Look at all that JavaScript and CSS! The Python count stabilised somewhat earlier at something pretty close to what the others gave. (I also like that the Cheetah has a McCabe of 4 with 0 SLOC!) These are all close enough that I feel pretty confident that, for SLOC of Python, I can use one and be happy (esp. if I double-check against one other).

I’m going to use metrics (which needs the double check just to make sure I expanded enough for each project) because the file counts seem marginally useful.
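Incidentally, the non-recursive ** is the shell’s doing (bash only recurses with shopt -s globstar; zsh does it by default). Python’s own glob has the same gotcha: without an explicit flag, ** quietly degrades to a single *. A small illustration (the paths are throwaway temp files):

```python
import glob
import os
import tempfile

# Build a tiny nested tree: y.py one level down, z.py two levels down,
# x.py at the top.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
for rel in ("x.py", os.path.join("a", "y.py"), os.path.join("a", "b", "z.py")):
    open(os.path.join(root, rel), "w").close()

# Without recursive=True, ** matches exactly one path segment, like a
# shell without globstar — so only a/y.py matches.
shallow = glob.glob(os.path.join(root, "**", "*.py"))

# With recursive=True, ** matches zero or more directories, so all
# three files match.
deep = glob.glob(os.path.join(root, "**", "*.py"), recursive=True)
```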

So, Pelican:

$ metrics pelican/** pelican/**/** pelican/**/**/** pelican/**/**/**/* pelican/**/**/**/**/* pelican/**/**/**/**/**/* pelican/**/**/**/**/**/**/*
Metrics Summary:
Files                       Language        SLOC Comment McCabe 
----- ------------------------------ ----------- ------- ------ 
    5                            CSS         604      69      0 
   34                           HTML         294      89      2 
    1                       Makefile          96       0      0 
   18                         Python        4539     824    912 
----- ------------------------------ ----------- ------- ------ 
   58                          Total        5533     982    914

Whoa! 18 files with 4.5k SLOC! Plus the templates must be simpler (no Javascript?!?!) and the McCabe score is waaay lower (whatever that means).

Ivy:

$ metrics ivy/** ivy/**/** ivy/**/**/** ivy/**/**/**/* ivy/**/**/**/**/* ivy/**/**/**/**/**/* ivy/**/**/**/**/**/**/*
Metrics Summary:
Files                       Language        SLOC Comment McCabe 
----- ------------------------------ ----------- ------- ------ 
    7                            CSS        1121     395      0 
    1                     JavaScript           1       3      3 
    1                       Makefile           9      10      0 
    8                       markdown           9       0      0 
   33                         Python        1004     486    193 
    4                      Text only           0       0      0 
----- ------------------------------ ----------- ------- ------ 
   54                          Total        2144     894    196

And finally, Urubu:

$ metrics urubu/** urubu/**/** urubu/**/**/** urubu/**/**/**/* urubu/**/**/**/**/* urubu/**/**/**/**/**/* urubu/**/**/**/**/**/**/*
Metrics Summary:
Files                       Language        SLOC Comment McCabe 
----- ------------------------------ ----------- ------- ------ 
   10                         Python         784     268    165 
----- ------------------------------ ----------- ------- ------ 
   10                          Total         784     268    165

This is quite the spread! Alas, it really isn’t enough. Nikola is a beast, but it has commensurate functionality and plenty of extension points. Indeed, if we break out the plugins we see that the “real” SLOC is rather lower:

$ metrics nikola/plugins/** nikola/plugins/**/** nikola/plugins/**/**/**
Metrics Summary:
Files                       Language        SLOC Comment McCabe 
----- ------------------------------ ----------- ------- ------ 
    1                        Cheetah           0       0      0 
    1                     JavaScript        1041       0    155 
   75                         Python        8510    3221   1637 
    1                      Text only           0       0      0 
----- ------------------------------ ----------- ------- ------ 
   78                          Total        9551    3221   1792

We’re still twice the SLOC of Pelican with 3-4x the files, but it’s closer.

Basically, I need to decide whether the students should face a “rich”, complex system or a “big enough” one that they can reasonably read all the way through.

More investigation needed.

Of Interest

Lektor is an outlier but interesting. It comes with a (web-based) graphical UI as well as a command line one. I need a command line for class purposes, but a GUI would be helpful for Zoe and, well, me. And it would make for a richer system to play with. Buuuut:

Alternatively you can manually install the command line version with virtualenv if you know how that works. Note that this method is heavily discouraged for anything other than advanced use cases such as build servers.

I don’t know that they’re doing this for a bad reason, but it complicates a big chunk of the story I want to tell students (e.g., let’s talk virtual environments!). The development version installation reads:

If you want to install the development version of Lektor you can do so. It’s the same as with installing the command line application but instead of using PyPI you install directly from git and you need to have npm installed to build the admin UI:

$ git clone https://github.com/lektor/lektor
$ cd lektor
$ make build-js
$ virtualenv venv
$ . venv/bin/activate
$ pip install --editable .

This…worries me. We need npm, etc. I’ll keep it off the eval list for now.