All Bits Considered: data to information to knowledge


Analogy trap: Codermetrics

Recently I read the book Codermetrics by Jonathan Alexander, and was left with mixed feelings about its proposition: to introduce sports-style analytics to optimize the performance of software development teams…

The similarities seem uncanny: developing software is a team endeavor, with people in one or more roles working towards a shared goal (a release). Yet in my opinion the differences outweigh the similarities – often to the point of making the analogy superficial and altogether flawed. It is one thing to score 10 goals on the field, and quite another to produce 10 good software-intensive systems – emphasis on "good"; the sports metaphor ignores the quality aspect almost entirely (a goal is a goal is a goal...), and one good system design might be diametrically opposed to an equally good one. While it could be argued that "winning software" is also quantifiable, the approach is much more nuanced, and tends to change over long periods of time. In addition, applying metrics to a software development team immediately affects the dynamics of the team and might introduce more problems than it is supposed to solve. The temporal dimension is also all but left out: months of team training followed by a couple of hours of team effort is hardly analogous to the solitary task of learning and honing one's skills followed by months or even years of a development marathon...


Quality Attributes: either binary or quantifiable

An architectural document that reads like promotional material always makes me wonder... What exactly does the author mean by "easy to maintain"? Does adjusting a mere 20+ configuration files on several machines in a cluster, and umpteen start-up parameters for dozens of processes, qualify? How about going through several tabs with dozens of conflicting options on every instance? "Easy" is in the eye of the beholder; architects must do better than that.

In 1983, Tom DeMarco, in his seminal work Controlling Software Projects: Management, Measurement and Estimation, famously remarked: "You can't control what you can't measure."

This rhymes with a similar sentiment expressed by Lord Kelvin almost a hundred years earlier, in a somewhat more convoluted form characteristic of the times:

"...I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be."

- William Thomson, Baron Kelvin, from 'Electrical Units of Measurement', a lecture delivered at the Institution of Civil Engineers, London (3 May 1883); Popular Lectures and Addresses (1889), Vol. 1, 73. Quoted in American Association for the Advancement of Science, Science (Jan-Jun 1892), 19, 127.

This correlates strongly with the system quality attributes used to control the architecture development process. Any quality attribute to which the architecture must conform has either to be measurable (numbers!) or binary - yes/no - in nature; relative terms just would not cut it (easier? more flexible? robust-ier?). If you state that the system is "adaptable", it means that it is designed to accommodate some anticipated changes; it does not mean that it can swallow any change that comes after the system was designed, and the value of such an attribute could only be relative, as it provides no way to quantify it. On the other hand, if "responsiveness" is the desired quality attribute, then it has to be measurable (unless you are talking about Leibniz's monads, that is); a response time in milliseconds per user at a given number of concurrent users might be one example...

Therefore, unquantifiable adjectives and/or adverbs in quality attributes must be kept to an absolute minimum, and be qualified (e.g. "adaptable within the current technological paradigm").
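As a minimal sketch of what "measurable" means in practice, consider expressing responsiveness as a number rather than an adjective. The 250 ms p95 budget and the sample measurements below are hypothetical, purely for illustration:

```python
# A quantified quality attribute: "p95 response time under 50
# concurrent users must not exceed 250 ms" - checkable, not a vague
# "responsive". The budget and the data below are made-up examples.
import math

def p95(samples_ms):
    """95th-percentile response time (nearest-rank method), in ms."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank percentile
    return ordered[rank - 1]

# Response times in milliseconds, measured under 50 concurrent users.
measurements = [120, 95, 180, 210, 130, 110, 150, 90, 175, 160]

P95_BUDGET_MS = 250  # the quantified attribute

assert p95(measurements) <= P95_BUDGET_MS
```

The point is that such a check can fail or pass objectively, which is exactly what a relative term like "easy" can never do.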


Software Development Trends

Once in a while I stumble across an article that raises questions I am chewing on myself. Here is one, The Decade of Development by Darryl K. Taft.

He proposes a Top 10 list of trends that have impacted Software Development in the past decade. And, of course, I had something to say about each of his points 🙂

1. Web Services/SOA

Yes, most definitely. This is a logical evolution of distributed computing. And it is past its potty-training stage as well. Combined with Platform as a Service, it would virtually (pun intended :)) guarantee an interesting future.

2. Rise of Open Source Software

Yes... No... Maybe. Definitely not to the same extent as the other trends listed.

3. Web becomes #1 development platform

A bit redundant after Web Services/SOA, and hardly a major trend unto itself. Distributed computing and WOA (Web-Oriented Architecture) would be more relevant.

4. The Emergence of Web Frameworks

Most emphatically - YES. Once we stop re-inventing the wheel, things will begin to improve. Take a hint from electronics: hardware engineers do not start designing an ASIC by sifting sand, and yet their creativity still has room to flourish (contrary to the most often heard complaint from framework-phobic developers). A component-based approach and frameworks will lift software engineering from craftsmanship to an industry. It goes from nitty-gritty technical details such as unit testing and logging all the way up to Application Lifecycle Management frameworks in the context of Enterprise Architecture.

5. Web 2.0

I believe that this is a hopelessly over-hyped buzzword. Yes, there are new tools for collaboration, but the idea is hardly new. Groupthink blown to epic proportions.

6. Simple Beats Complex

Any time, I must add! Hardly a decade-long trend, though - I would argue that this goes back as far as human history (if not necessarily in a straight line): an arcane system of tribal lore and taboos gets replaced by a codified legal system (though some might argue that it is no less arcane...). Also, one must beware of oversimplification; Albert Einstein once famously remarked: "As simple as possible, but no simpler."

7. The Rise of Scripting/Dynamic Languages

I have to admit, this caught me completely off-guard at the beginning of the century. I used to regard dynamic languages as second-class citizens, even having witnessed the power of the shell (Korn, C, bash). I suspect that the major factor is increased hardware power, which alleviated the inherently slow performance of scripting languages. Another pet peeve of mine was that scripting languages used to be weakly typed; this either changed (Ruby), or was addressed through a variety of frameworks... [an interesting discussion on the (de)merits of weakly vs. strongly typed languages here]. In retrospect it appears a logical (de)evolution: compiled -> byte code -> script... I predict that the pendulum will swing back, and we'll see a resurgence of compiled languages - maybe self-compiled, JIT-compiled, etc. Just take a look at the emerging EXI - binary XML - standard: one of the oft-touted features of XML was that it is "human readable"; apparently this has ceased to be of paramount importance.
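The weak-vs-strong distinction is worth a tiny illustration. Python, for instance, is dynamically but strongly typed: types are checked at runtime, and mismatches raise errors rather than being silently coerced the way a weakly typed language's implicit conversions would; optional type hints (PEP 484) go further and let static checkers catch such mismatches before the code runs:

```python
# Dynamic but strong typing: mixing str and int raises TypeError
# at runtime instead of silently coercing.
try:
    result = "1" + 1
except TypeError:
    result = "type error"     # strong typing refuses to coerce

# Optional type hints (PEP 484): tools such as mypy can flag a
# call like add("1", 1) without ever running the code.
def add(a: int, b: int) -> int:
    return a + b

print(result)        # -> "type error"
print(add(2, 3))     # -> 5
```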

8. The Developer Community Bifurcates

I disagree; I do not see it as a trend, but rather as a human trait. Once the entry barrier into the field was lowered (thanks a lot, Visual Basic!), the field was swamped with accidental programmers. Even before that, there was sloppily written COBOL and Algol code and atrocious pointer math (just look at how many tools have been created to detect memory leaks in C code).

9. Heterogeneity Rules

Yes, this is an unexpected twist on the old "best-of-breed" adage, facilitated by the inherently heterogeneous web. XML, Web Services, and scripting languages complete the picture.

10. The Emergence of Team Development (and the rise of Agile development)

This is a biggie. Finally, we are at the dawn of engineering, with an (emerging) body of knowledge and methods to tackle notoriously hard-to-pin-down software problems: methodology and frameworks (yes, I do see Software Factories on the horizon!).

I would also add the rise of Architecture, especially Enterprise Architecture, and the understanding of the ultimate importance of the ecosystem in this interconnected age.


The weakest link

Open Kernel Labs claimed to have produced the world's first 100% verified, "bug free" software. Some ~8,000 lines of code were subjected to the scrutiny of "over 10,000 intermediate theorems in over 200,000 lines of formal proof" to make sure that the software is "functionally correct".

By functional correctness they mean that "the implementation always strictly follows a high-level abstract specification of kernel behavior."

As the saying goes, a chain is only as strong as its weakest link... Applied to the software world, one could be tempted to paraphrase: a software stack is as robust as its weakest component. But a software system is more than the sum of its components. By virtue of interdependence, a software-intensive system includes everything - from server to network to software to user... And all of these parts change - not only by themselves but also in response to changes occurring in other parts. Say, a faulty memory chip leads to corruption of an application's working memory set, which introduces instability into the system... What good would perfectly correct code be if it cannot execute? What if the "high-level abstract specification of kernel behavior" was flawed to begin with?

This brings me to the same dilemma I've been pondering in my cozy data world: it IS possible to have perfectly correct data yielding absolutely incorrect information... I sense that Gödel's Incompleteness Theorem is lurking here somewhere…


Release as if there is no tomorrow

Product releases should come in meaningful increments, each release potentially becoming the last in line. This gives the customer (even if the customer is you) a sense of control, and produces a relatively self-contained system at each release... and the possibility to call it quits at any release point.

Branching, in my opinion, is not for releases but rather for investigating alternatives; branches will either be merged back into the trunk or abandoned. The last thing you need is multiple versions of the product floating around... Every release should be tagged, and - if there were changes to the environment - the environment (VM) should be wrapped up and stored alongside the product.