Language Productivity in Terms of Function Points

One of the perennial tough nuts for software engineering research is understanding the effect of programming language on productivity. Just understanding productivity is challenging! A fairly standard metric is debugged, logical lines of source code (LOC), with average productivity for a developer being in the teens of LOC per day (10-38 for large projects, maybe 140 for small ones). This rate is held to be language-independent, so a core hypothesis for improving productivity is to increase the “power” per line of code. This is an argument for ever higher-level languages and rich libraries.
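The arithmetic behind that hypothesis is simple, and a toy sketch makes it concrete. All the numbers below are invented for illustration, not drawn from any study:

```python
# Toy illustration of the "power per line" hypothesis. If debugged LOC/day
# is roughly constant across languages, then productivity measured in
# *features* scales with how much functionality each line expresses.
# All numbers here are made up for illustration.

LOC_PER_DAY = 20  # assumed language-independent output rate

def features_per_day(loc_per_feature: int) -> float:
    """Features delivered per day, given how many lines one feature needs."""
    return LOC_PER_DAY / loc_per_feature

print(features_per_day(500))  # low-level language:    0.04 features/day
print(features_per_day(100))  # higher-level language: 0.20 features/day (5x)
```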

Richard Kenneth Eng has a blog post claiming that Smalltalk is a highly productive language in terms of the number of hours it takes to produce a function point. Function points are an alternative to LOC for measuring software. Just as, when counting lines, we know to ignore comments or lines of “mere” formatting because they don’t contribute to the actual functioning of the program, function point analysis intuitively says not to focus on how many lines are needed to express a bit of functionality but on the functionality itself. In principle, this makes cross-language productivity comparisons more accurate: the more productive the language, the more function points per hour, regardless of the lines of code produced.
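To make the idea concrete, here is a minimal sketch of IFPUG-style unadjusted function point counting. The five element types and the “average complexity” weights are the standard ones; the example app’s counts are invented:

```python
# Illustrative sketch of IFPUG-style unadjusted function point counting.
# A real count classifies each element as low/average/high complexity and
# uses per-class weights; this sketch uses the standard "average" weights.
AVERAGE_WEIGHTS = {
    "external_inputs": 4,           # EI: inputs crossing the app boundary
    "external_outputs": 5,          # EO: reports, messages sent out
    "external_inquiries": 4,        # EQ: input/output retrieval pairs
    "internal_logical_files": 10,   # ILF: data maintained inside the app
    "external_interface_files": 7,  # EIF: data referenced from other apps
}

def unadjusted_function_points(counts: dict) -> int:
    return sum(AVERAGE_WEIGHTS[k] * n for k, n in counts.items())

# A hypothetical small app: 6 inputs, 4 outputs, 3 inquiries, 2 ILFs, 1 EIF.
ufp = unadjusted_function_points({
    "external_inputs": 6,
    "external_outputs": 4,
    "external_inquiries": 3,
    "internal_logical_files": 2,
    "external_interface_files": 1,
})
print(ufp)  # 6*4 + 4*5 + 3*4 + 2*10 + 1*7 = 83 function points
```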

Smalltalk comes out on top (in his selection) in terms of the number of hours needed to produce 1,000 function points: 6,879 for Smalltalk, ranging up to 26,273 for C, the least productive.
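Put in more intuitive units (this is my arithmetic on the two cited endpoints, not the report’s):

```python
# Converting "hours per 1000 function points" into hours/FP and FP/hour.
hours_per_1000_fp = {"Smalltalk": 6879, "C": 26273}

for lang, hours in hours_per_1000_fp.items():
    print(f"{lang}: {hours / 1000:.1f} hours/FP, {1000 / hours:.3f} FP/hour")
# Smalltalk: 6.9 hours/FP, 0.145 FP/hour
# C: 26.3 hours/FP, 0.038 FP/hour

print(f"ratio: {26273 / 6879:.1f}x")  # Smalltalk ~3.8x as productive as C
```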

The cited report is pretty interesting (the blog post keys in on table 16) but it’s big and complex. It’ll take a while to digest. One interesting bit of analysis (table 17) breaks the effort out into code and non-code activities. Roughly, as productivity goes up, the percentage of time shifts between activities: for C it’s 11.42% non-code to 88.58% code, whereas for Smalltalk it’s 43.61% non-code to 56.39% code. This suggests that more of the effort is going into design, looking stuff up, or requirements analysis (or maybe hanging out!).
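A quick sanity check on those percentages (my arithmetic, not a claim the report makes outright): combining them with the table 16 totals, the absolute non-code effort comes out essentially identical for both languages, which suggests the percentage shift is driven entirely by coding effort shrinking.

```python
# Cross-checking table 17's split against table 16's totals:
# non-code hours = total hours per 1000 FP * non-code share.
total_hours = {"C": 26273, "Smalltalk": 6879}      # per 1000 FP (table 16)
non_code_share = {"C": 0.1142, "Smalltalk": 0.4361}  # table 17

for lang in total_hours:
    print(f"{lang}: {total_hours[lang] * non_code_share[lang]:.0f} non-code hours")
# C: 3000, Smalltalk: 3000 -- the absolute non-code effort is constant,
# so the *percentage* shifts only because coding effort gets smaller.
```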

To quote the report directly:

“As can easily be seen, for very low-level languages the problems of LOC metrics are minor. But as language levels increase, a higher percentage of effort goes to non-code work while coding effort progressively gets smaller. Thus LOC metrics are invalid and hazardous for high-level languages.

“It might be thought that omitting non-code effort and only showing coding may preserve the usefulness of LOC metrics, but this is not the case. Productivity is still producing a deliverable for the lowest number of work hours or the lowest amount of effort.

“Producing a feature in 500 lines of Objective-C at a rate of 500 LOC per month has better economic productivity than producing the same feature in 1000 lines of Java at a rate of 600 LOC per month.

“Objective-C took 1 month or 149 work hours for the feature. Java took 1.66 months or 247 hours. Even though coding speed favors Java by a rate of 600 LOC per month to 500 LOC per month for Objective-C, economic productivity clearly belongs to Objective-C because of the reduced work effort.”
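The arithmetic in that example is easy to reproduce. A minimal sketch, assuming (as the quoted numbers imply) that one work month is 149 hours:

```python
# Reproducing the Objective-C vs. Java example from the report.
HOURS_PER_MONTH = 149  # implied by "1 month or 149 work hours"

def effort(feature_loc: int, loc_per_month: int) -> tuple[float, float]:
    """Return (months, work hours) to produce a feature of the given size."""
    months = feature_loc / loc_per_month
    return months, months * HOURS_PER_MONTH

print(effort(500, 500))   # Objective-C: (1.0, 149.0)
print(effort(1000, 600))  # Java: (1.667, 248.3) -- the report rounds the
                          # months to 1.66, giving its figure of 247 hours
```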

I don’t see the methodology for this work spelled out anywhere (and the report uses “mathematical proof” in a weird way). This makes me a bit sad, because it really means that citing these numbers is dubious.

We already discuss lines of code as a complexity metric in our class (using chapter 8 of Making Software). It would be interesting to try to introduce function point analysis, at least conceptually.