Monday, November 10, 2014

AngularJS Simple Noob SPA

(Note: Link to the doc is at the bottom of this post).

I've been working on formalizing my JS skills for quite a while now. Much as I have loved ActionScript-based development (and it has loved me), the fact of the matter is, the time has come (actually, it came at least two years ago): get serious about JavaScript technologies, both client- and server-side, or get run over by the industry.

If you don't, you assume a lot of risk:
  • Users will complain. They want flexible, adaptive apps that update with new features frequently. 
  • Your talent pool will suffer: UI engineers and developers know that confining yourself to ActionScript-based technologies is short-sighted in terms of career.
Believe me, I have nothing against Flash-based technologies. But the wheel has turned; if you're not already actively moving this way...well...not sure how to finish that.

To that end, you have to pick your guns. There's SO much JS out there. I have chosen, as my core study (no surprises here):
  • AngularJS
  • jQuery
  • Node
  • The related HTML5 capabilities (local storage, canvas, Server-Sent Events, etc.). 
As far as a packaging technology goes, I'm looking at Browserify and RequireJS. People are pretty adamant about one or the other; I'm still sorting it out. (And of course, I keep up on my fundamental Java, I study Clojure when I can, and so on, but that's another discussion. The notion of programming directly to an AST is fascinating to me; I think there's a lot of potential synergy between Clojure and UI.)
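For flavor, here's roughly what the two module styles look like (a sketch; './math' is a hypothetical module):

// CommonJS style, which Browserify bundles up for the browser:
var math = require ( './math' );
console.log ( math.add ( 1, 2 ) );

// AMD style, which RequireJS loads asynchronously:
define ( [ './math' ], function ( math ) {
    console.log ( math.add ( 1, 2 ) );
} );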

I have found (and continue to find as they evolve) these technologies can take care of just about everything, and are relatively stable and well supported compared to the rest of the pack. I get that there's a LOT more and people can be very religious. But I don't think anybody would debate that this is a perfectly acceptable front end stack (and even back end with Node, which I'm increasingly amazed with).

Anyway, I had a conversation with a dev friend, who lamented, "Angular seems great, I want to learn it. But there's SO many modules, so many ways to bootstrap, so many directives...all I want is a simple one pager that shows me how to do the most commonly used, fundamental things."

So I looked around, matched it with what I have found as the most common and basic things I use, and built that simple one pager.

It shows, very simply and without any CSS or other extras that get in the way of the essentials (which in Angular terms are Controllers, Directives, and Filters):
  • This is how you tie a Controller to an area of HTML markup. 
  • This is how you inject data from that Controller into that markup area. 
  • This is how you use directives to control that injection.
  • This is how you use filters to further control the injected data. 
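To give a taste before you download anything, here's my own minimal sketch along those lines (assuming Angular 1.x and made-up module/controller names; the actual file, linked below, is more complete):

<!DOCTYPE html>
<html ng-app="demoApp">
<head>
  <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.26/angular.min.js"></script>
  <script>
    // Define a module and a Controller, and give the Controller some data.
    angular.module ( 'demoApp', [ ] )
      .controller ( 'DemoController', function ( $scope ) {
        $scope.names = [ 'alpha', 'beta', 'gamma' ];
      } );
  </script>
</head>
<!-- ng-controller ties DemoController to this area of markup -->
<body ng-controller="DemoController">
  <ul>
    <!-- ng-repeat is a directive; uppercase is a built-in filter -->
    <li ng-repeat="name in names">{{ name | uppercase }}</li>
  </ul>
</body>
</html>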
Sure, there's way more to learn, and this is by no means how you should plan on writing your apps. But it got my colleague started, so I figured maybe it's a useful cheater.

To use:

You need to have a local or remote web server that you can create a directory on, and put three files in.

If you don't already have a pretty firm grasp of JS and its peculiarities, and don't already have or know how to set up a basic web server (you can use Node, Apache, IIS, Tomcat, whatever), then you have more groundwork to do and really don't belong here yet. 
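(If you do have Node handy, one quick option is the third-party http-server package; an assumption on my part, and one of a dozen ways to do it:

npm install -g http-server
http-server /path/to/myDirectory

It serves the directory on port 8080 by default.)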


Those files are:
  • The downloaded one (views-directives.html)
  • Two "fragment files" (you will need to create these):
    • fragment1.html (should contain the text "This is fragment ONE").
    • fragment2.html (should contain the text "This is fragment TWO"). 
Those fragment files should contain NOTHING else. No html/head/body tags; just that one line of content (they are fragments, not full HTML files).
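For reference, each fragment file really is just its one line, and the main page presumably pulls it in with something like Angular's ng-include directive; a sketch of the idea (the actual markup is in the downloaded file):

<!-- fragment1.html, in its entirety: -->
This is fragment ONE

<!-- and in the main page, something along these lines: -->
<div ng-include="'fragment1.html'"></div>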

Then access it this way, and it should just work:

http://myHost/myDirectory/views-directives.html

Again, there's no formatting, no nothing. It's VERY bare bones. View the source, read the comments, experiment with the individual pieces to get a feel for it.

Here's the file link:

https://drive.google.com/file/d/0B5V0acodILWkQUJPbFA4UGI4azQ/view?usp=sharing

As always, thanks for visiting.

Monday, October 6, 2014

Inversion of Control, and Dependency Injection (IoC + DI)...uhm...what?

I see it all the time: you ask, "do you know what IoC and DI are?" You get, "sure, this framework, and that framework, use them, and I've used those frameworks. And DI is something you do every time you create an instance and pass it as an argument. Three common examples are RobotLegs (ActionScript), AngularJS (JavaScript), and Spring (Java)."

"Ok...now without naming a framework, or using an example from one, in fact without referring to technology at all, tell me what IoC and DI are, and why they are generally spoken of in the same sentence."

If it's a phone conversation, I begin to hear the telltale save-me-Google keyboard taps. If it's in person, I see the forehead produce the uncomfortable light sweat. And I'm not talking junior guys and such; I'm talking Master's degrees, PhDs, the whole bit.

So the honest answer almost every time is, "I use frameworks that have these mechanics. But I really have no idea, apart from the usual 'cleaner/easier' stuff, what these mechanisms are or why the framework designers decided to use them."

It's to be expected. I've never--literally not once--found a simple explanation of these concepts that is understandable apart from a specific language or technology, or even apart from technology completely. I have actually heard IoC and DI referred to as "magic" in an online course on Angular with no further explanation. What is the idea of it?

The following is typical of what I've found, these from Wikipedia:
"In software engineering, inversion of control (IoC) describes a design in which custom-written portions of a computer program receive the flow of control from a generic, reusable library. A software architecture with this design inverts control as compared to traditional procedural programming...Inversion of control is used to increase modularity of the program and make it extensible, and has applications in object-oriented programming and other programming paradigms."
"Dependency injection is a software design pattern that implements inversion of control and allows a program design to follow the dependency inversion principle."
That's just not going to do it. The technobabble is confusing, and increased modularity and so forth (cleaner code, more testable, etc.) aren't good enough reasons; people often respond, "I know how to write modular code and test it; I don't need the overhead of some framework for that."

Thing is, they're correct. You don't. So why bother?

I came up with the following a while back. I've delivered it more than once, and it seems to do the trick.
Say I run a service. It's front-ended by a long desk with a number of employees standing behind it. Sort of like the counter at McDonald's, or an old bank teller counter. Behind these employees are all kinds of filing cabinets, with dozens of drawers. Each drawer is labeled with the name of the part it contains.
It's the job of each employee to hand out a box with the "part du jour" to any customer that walks up to the counter.  
When employees arrive in the morning, they look at a board in the back room that tells them what the part du jour is; some write it down, some use memory, whatever. Each employee then goes to their station at the counter. When a customer walks up to the counter and asks for a part du jour (whatever it might be that day), an employee goes to the filing cabinets, finds a drawer that contains the correct part, takes one out, places it in a box, then returns to the counter and gives the box to the customer.
Potential shortcomings and failures in this system:
  • What if the part du jour needs to change during the day? 
  • What if one of the drawers in the filing cabinet is poorly labeled? 
  • What if the employee simply misread the board, or forgot and guesses?
  • What if you need to recall and replace all the parts that were handed out that day? 
And who knows what else.

At this point, people generally start to talk about enhancing the Employee.
  • Create a process to notify employees of a change in the part.
  • Create a process by which the employee can ensure they got the right part from the drawer.
  • Create a process that tests the employee's understanding of what was on the board.
  • Create a process such that employees take down the names and addresses of all customers.
Starting to sound like some software projects you may have been involved in?

Unfortunately, none of this will really solve the problem; these "fixes" will in fact aggravate it, because the root cause of the problem is that the employee has too much responsibility, a great deal of which has nothing to do with their main function: customer service. Adding more unrelated responsibility to the employee may get you some short-term tactical success, but will fail as a strategy.

To use the parlance of IoC and DI:
The root cause is that the Employee is in Control of Injecting the Dependency into the Box, and shouldn't be.
It translates easily to programming.
  • The Employee is a Class, a View, etc.; a thing that is supposed to have a well-defined responsibility, and that interacts with Customers that need parts from it. 
  • A Customer is another class, a function, a user interface component that displays data, etc.; some element that needs a Dependency Injected from an Employee. 
  • A Dependency is a part the Employee packages and hands out to the Customers that need it. 
So if a given Employee packages and hands out the wrong Dependency, the system will break down at the Customer level somewhere, and you will have to backtrack to find the Employee that gave that Customer the wrong Dependency. Hopefully, this doesn't happen in more than one place or you will have a lot of debugging and fixing to do.

And then, what if one day, you need to issue a recall and replace a given part from every Customer that uses it? Break out the Red Bull.

So on goes the anecdote:
But what if you did this: 
Create a small team, call them "Packing," which is responsible for packing the boxes. They make pre-packaged boxes available to the Customer Service Employees, who in turn give them to the Customers. 
This small team of Packing Employees has a completely separate and specific process for ensuring they put the correct part du jour (Dependency) in the boxes. The Customer Service Employee no longer has the overhead of selection and packing, and can focus entirely on Customer Service. 
In other words, the packing Control has been Inverted away from the Customer Service Employee to a specialized Packing Team. 
With this "framework" methodology of specialized elements interacting to perform a complex task, how many packing errors can a Customer Service employee make? The answer, is...zero. The only mistake they can make is to somehow never give the package to the Customer, which would be a clear failing of the Customer Service process, not Packing.

I get these kinds of follow-on questions, all good ones:

But...what about the Packing Team? Can't they make a mistake?

Sure they can. But again, they have a specialized process. Their ONLY job is to ensure they are putting the right part in the boxes. It involves inspection, validation, etc. Everything you need done to ensure proper packaging, but that you shouldn't be making your Customer Service Employees responsible for.

How does this make the process more airtight?

All employee jobs are much more clearly defined. If a Customer didn't get a box, you talk to Customer Service. If a Customer got the wrong Dependency, you talk to Packing.

But doesn't this mean more overhead? You have to hire more employees, create more processes?

In a way, this is true. But, if you don't break apart these responsibilities, you'll eventually place so much responsibility on Customer Service that they won't be able to keep up with the demand for Dependencies; the system will cease to scale. By breaking the roles apart and implementing more specialized processes as a framework, you make problems easier to find, and can tweak one process without involving the other.

So there you have it: IoC and DI working together to create more stable, easier-to-fix-and-maintain systems. Read the Wikipedia definitions again; hopefully they make more sense now.

Of course there's a lot more to it, but this generally gets the conversation going in the right direction, and I usually see the listener start to discuss opportunities to use IoC and DI together.

As always, thanks for visiting.


Thursday, August 21, 2014

Functional Programming: What, Why?

I've been using functional programming for a very long time, mostly because I'm a UI engineer, and the languages UI devs use often have functional capabilities: JavaScript, ActionScript, etc.

Wait...that's not entirely true. To be precise, I took first-class functions for granted as an occasionally useful capability, usually for event handling or sorting. Imperative programming and OOP rule the roost everywhere I've ever worked, right up to today. When I do see functional forms, it's usually from a contract ActionScript or JavaScript developer, and it tends to look something like this:
ab = function ( afunc ) { return afunc ( function ( x ) { return x; } ); } ( someFunctionFromSomewhere );
I'd ask, "why did you do it that way," and get, "because that's functional programming." I'd reply, "aside from telling me, 'I do it because that's what it is,' what specific reasons do you have?"

Not once, in all my years, have I gotten a straight answer to this question. Nobody has ever sketched out a rough equivalent in OOP and pointed out the pros and cons. In fact, the more cryptic the function, the more satisfaction seemed to come from writing it; huge paragraphs of functions feeding anonymous functions from some global variable somewhere in some imported script. The fact that it would confuse me almost seemed desirable.

In the end, based on observation and discussions with people I've worked with at companies big and small, again right up to today, I had good reason to believe that functional programming is an oft-abused stylistic preference. When you grow up, you go to OOP. One guy, who has taught comp-sci at a prestigious university and written software you may even have purchased, said that functional coding in the average programmer's hands makes a cryptic and impossible-to-debug mess that usually needs to be refactored. The fact that some of this guy's work is reputable cryptology programming makes that sort of funny to me.

But...is that all there is? Is there nothing more? Why are languages like Scala and Clojure, and the mechanisms offered by Groovy and so forth, catching fire?

Qualifier: optimizations not found in many imperative languages, such as memoization, tail-call optimization for recursion, lazy evaluation, mutable vs. immutable data, and so on, aren't what this post is about; of course they matter, but this post is focused on the writing of code.

Anyway, off to the papers, forums, and Kindle store I went.

In general, the introduction is always the same: functional programming is about using higher-order functions; that is, functions that take and/or return other functions. It makes your code more succinct. Write less code, get superior results.

Pithy quotes abound, like this one from  C.A.R. Hoare's 1980 ACM Turing Award Lecture:
"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies."
They all go on to describe how such and such languages already have higher-order function capability, how Haskell has been around for decades, and how things like "lambdas" are coming to Java.

Then the examples. "You could write this sequence of operations this way...but if you use this language and these higher-order function, you could write it this way." And then you start seeing the word "reduce" a lot, with no real idea what exactly reduce does...ok let me go look that up...hmm ok collection operations, and so on...I'm drifting. Back to the functional stuff.

The "but...this way" is always shorter and more succinct. But they omit the implementation of the higher order functions, resulting in my previous drift...which if they placed right under the "shorter, more succinct," code, would appear to remove that benefit (particularly as higher order functions often use other higher order functions...which can be VERY confusing to unravel).

Examples: in Clojure:
( reduce + ( map inc mylist ) )
In C# functional style:
result = mylist.Select ( i => i + 1 ).Sum ( );
In C# imperative style:
int sum = 0; 
foreach ( int i in mylist )
{
   sum += ( i + 1 );
}

Ok...sure, the Clojure and the functional C# are each one line. The imperative C# is five. But I don't need to see any other implementations in the imperative example to know what it does (other than List, but that's true for all three examples). What is the implementation of reduce, map, Select, Sum, etc.?

Argument: "that's the point, you don't need to know that. Functional Programming delegates more optimization and utility functions to the runtime."

Sure, this is true for languages that offer such robust higher-order utility. But what if your language doesn't have a robust list of these amazingly useful and succinct higher-order functions, as pointed out in Marijn Haverbeke's excellent book, Eloquent JavaScript: A Modern Introduction to Programming:
"Unfortunately, a standard JavaScript environment comes with deplorably few essential functions, so we have to write them ourselves or, which is often preferable, make use of somebody else's code."
And...what if I just defined a couple of static functions in OOP and fed that list to them? I could reduce that C# imperative example to this:
int sum = ListIterator.sumOfElementsIn ( myList, 0 );

No higher-order functions, one line, very succinct.

So on the surface (keep reading, I get back to this later), I tended to reject the most commonly given rationales for functional programming as being primary:
  • Make code easier to write.
  • Make code more expressive (reduce need for comments etc.). 
Anybody can write expressive code in any language--well, except maybe Objective-C--and anybody can write unbelievably cryptic code in any language. Some of the functional programming production code I have seen is absolutely impossible to unravel; even the developer can't do it a few days after writing it. And this isn't just the odd incompetent; it's pretty common.

So what else do we have?
  • Make code more about the problem and less about the mechanics. 
This is a matter of expressiveness, and the availability of more robust higher-order functions. Still not really doing it for me; again, somebody has to write those libraries/functions, and beyond a certain point, that person may very well be you, particularly if you use JavaScript.

So, moving along:
  • Reduce the objects I need to be aware of and use. 
Here at last I found some ground to stand on in favor of the functional paradigm. From Neal Ford's Functional Thinking: Paradigm Over Syntax, I got this blurb, and it's not hard to look over a few languages and confirm it:
"In the OOP world, developers are encouraged to create unique data structures, with specific operations attached in the form of methods. Functional programming languages don't try to achieve reuse in quite the same way. In functional programming languages, the preference is for a few key data structures (such as list, set, and map) with highly optimized operations on those data structures. To utilize this machinery, developers pass data structures plus higher-order functions to plug into the machinery, customizing it for a particular use."
This is later backed by a quote from such amateurs as Alan Perlis, the first recipient of the Turing Award:
"It is better to have 100 functions operate on one data structure than 10 functions to operate on 10 data structures."
I found this to be VERY enlightening, my first real "ah HAH" moment. One of the things I shy away from in Java development is, "well what we'll do is extend HashMap and implement with these additional, overridden/overloaded functions...or maybe we don't need the hash at all, so we'll implement a simplified two-array map object..."

Hmm...why don't we just create a function that accepts the map as-is and does what we need, so we're not endlessly creating new kinds of maps? And if we can submit a function along with it, we can even provide the processing. Sort of looking at the problem inside out, relative to OOP. 
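A quick sketch of that inside-out idea (plain JavaScript, hypothetical names):

// One generic function that works on any plain map, with the processing
// passed in, instead of yet another specialized map subclass.
function eachEntry ( map, action ) {
  for ( var key in map )
    if ( map.hasOwnProperty ( key ) )
      action ( key, map [ key ] );
}

eachEntry ( { a: 1, b: 2 }, function ( k, v ) { console.log ( k, v ); } );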

Hey, how about that. I've been instinctively dabbling in functional thinking all along: favor coarse-grained reuse mechanisms, and avoid highly specific ones when you can.

Here are some of those simple examples you need to get you thinking, again from Eloquent JavaScript:

First, a very standard routine to log things from an array, imperative style:
var array = [ 1, 2, 3 ];
for ( var i = 0; i < array.length; i++ ) {
  var current = array [ i ];
  console.log ( current );
}
Ok...make it a function. Still imperative, but more succinct.
function logEach ( array ) {
  for ( var i = 0; i < array.length; i++ )
    console.log ( array [ i ] );
}
var array = [ 1, 2, 3 ];
logEach ( array );

Now we begin to see where higher-order functions fit in. What if I want to log each index? What if I want to log the sum? Should I add params to the function so I can add different loops (or just do something different within the same loop based on the arg)? Write more functions specific to the kind of logging I want? Or some such similar solution?

If you think about it, that's a violation of DRY. I am constantly repeating the need to iterate. I need to sort-of turn it inside out: make the iteration the factored operation, and make the specifics of the iteration a matter of what I feed the factored operation.

Functionally, that looks like this (again from Eloquent JavaScript):
function forEach ( array, action ) {
   for ( var i = 0; i < array.length; i++ )
       action ( array [ i ] );
}
var words = [ 'Wampeter', 'Foma', 'Granfalloon' ];
forEach ( words, console.log );
Now we're talking. If used properly, you do in fact enable more succinct code, even without built-in higher-order functions. That second code block is two lines long and very succinct. And because I factored out the coarser-grained, general "need to iterate an array," I don't have to constantly write those loops.

Now we get to a point of divergence in my thinking: anonymous functions. This could be a result of just not seeing and using them every day.

If I want to do something different with my forEach, I can still do it nice and functionally. For instance, add up the array of numbers instead of log them, logging the result after the fact:
var numbers = [ 1, 2, 3, 4, 5 ], sum = 0;
forEach ( numbers, function ( number ) {
  sum += number;
} );
console.log ( sum );

To me, even in this little frag, the succinctness falls apart. That anonymous function just makes it a lot more cryptic. It would be a LOT clearer if that anonymous function had a name that added to the "readability" of the line. Doing it with the anon func feels like submitting a hard-coded string to a function instead of a value from an enum or const whose name tells you the purpose of that value. It will force you to add a comment.

So I would tend to say: fewer lines don't necessarily mean easier to read. Declare your functions as smartly named variables and use them by reference, exactly as you would any other var, whenever you intend to feed a function as an argument to a function that can serve many purposes (like an iteration function). 
var numbers = [1, 2, 3, 4, 5], 
    sum = 0,
    addTheNumbers = function ( number ) { sum += number; };

forEach ( numbers, addTheNumbers );
console.log ( sum );
Let's add operations that imperative programming would typically implement with multiple loops, or a loop with some kind of if-then or switch qualifier, to see the effect of coarse-grained factoring on your code:
var numbers = [1, 2, 3, 4, 5],
    words = [ 'Wampeter', 'Foma', 'Granfalloon' ],
    sum = 0,
    addTheNumbers = function ( number ) { sum += number; };
forEach ( words, console.log );
forEach ( numbers, addTheNumbers );
console.log ( sum );
Nobody should have a problem understanding this. The closure over "sum" is interesting, but that's a JavaScript thing here.

Also notice: the forEach function can either perform an operation without returning anything, if the operation passed in carries the implementation (e.g. console.log), or it can return values; whatever is useful at the time. Imperative programming just doesn't work this way, and that can be a very hard notion to get your head around (or un-around, and then around again) for an OOP/imperative programmer.

Mold the language to the problem, not the problem to the language. I get it.

So, there you have it; the conduit that finally got me away from just going through the functional motions, and actually "thinking" functional as a way of factoring a problem and writing code. Ever since, I have started seeing opportunities to do these sorts of things in my day-to-day work, and I have to say it's reinvigorated my programming.

As always, thanks for visiting.

Thursday, July 10, 2014

Code Commenting Reality


If you're a coder, you've heard it a thousand times: 

"Comment your code." 

You might even have worked in an environment that mandates javadoc comments on every function, or worse yet, failed to land a job because your code did not have "enough comments".
I used to be a docs and comments nut, years ago. I was one of the few coders that actually maintained my comments over time.

To get right to the point, experience has shown me that "commenting" is generally a knee-jerk default that is abused as a crutch under the guise of "accepted practice".

I'll let the book Clean Code: A Handbook of Agile Software Craftsmanship, do the talking here. If you're a coder and have never heard of this book...well, you want to do something about that. The chapters on commenting and formatting are very relevant to the new wave of "software metrics" (which personally I see as a way of trying to commoditize the art and practice of coding):

"There are certainly times when code makes a poor vehicle for explanation. Unfortunately, many programmers have taken this to mean that code is seldom, if ever, a good means for explanation. This is patently false."

"Usually [comments] are crutches or excuses for poor code or justifications for insufficient decisions."

"It is just plain silly to have a rule that says that every function must have a javadoc, or every variable must have a comment."

- Martin, Robert C. (2008-08-01). Clean Code: A Handbook of Agile Software Craftsmanship

I find this all to be absolutely true, and not "sort of, sometimes."
Here's the perfect example. I see this ALL the time regarding event handlers. I do mean ALL the time, at every single organization and contract I've worked for (look at my LinkedIn profile to get an idea of the scope of that statement):

function handleClick ( obj ) { ... }

This coder will be told, in a review or something, "what's it handle exactly?"

The coder will do this:

// This function handles a click when the user clicks the Submit button
// It accepts the params from the form that the Submit button is part of.
function handleClick ( obj ) { ... }

Hey, I did a good job. I added comments to my code to clarify it. Very professional.

Not really. You copped out on the thinking. It's like taking medication for something that could be prevented by exercise.

This is better, because it removes the need for the comment.

function onFormSubmitClick ( formArgs ) { ... }

Should you never comment anything?

Of course not. But a comment should be considered a temporary compromise: the code is good enough and causes no defects, but isn't easily intelligible with a reasonable amount of perusal. I should probably rethink it when I get a chance, so that I can remove the need for the comment.

So if somebody says to you, "your code doesn't have enough comments," what this more than likely actually means is that your code is difficult to understand without more comments. Which means it's probably not very well written. Try again.

My final, and favorite, quote from Clean Code:

"The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures."

As always, thanks for visiting.


Sunday, June 22, 2014

Code Review Reality

It's been a while since I posted. However, I've been more active in development than ever, and naturally have developed more opinions on certain aspects of coding. Call it a byproduct of experience and observation, coupled with a reporter's instincts (if you didn't know, I was a newspaper reporter in a past life).

One thing I have latched onto is code review.

I'll be blunt as to the reason for this research: I instinctively never bought the idea that ad-hoc code review is a meaningful way of identifying and reducing defects. My research and observation up to this point indicate that code review is generally adopted based on anecdote, is usually implemented in a short-cut fashion ("desk inspections"), and as such has little to no real, consistently measurable result as far as reducing defects.

When you've finished derisively snorting and if you haven't already dismissed this blog post as ignorant, please continue reading.

As some background, over my 2.5 decades as a professional developer, here's how I've seen code review introduced at most organizations:


  • A team of coders creates a product, very quickly, to get a prototype off the ground.
  • That product begins to gain traction.
  • "In the wild" bugs, become embarrassing, and sometimes costly.
  • Management pushes engineering on the matter.
  • Senior coders state that most problems are due to the prototype quality of the code.
  • Senior coders request time to reengineer the tiers of the system.
  • Management rejects this recommendation in favor of delivering selling-point features.
  • Management insists on a process to measure bugs.
  • Bug counting at phases of the development-release cycle is introduced.
  • Management demands that the bugs at each phase of the cycle be drastically reduced or eliminated.
  • Senior coders again state that the prototype nature of the software makes this very difficult.
  • Management again rejects requests to reengineer.
  • Hiring begins in earnest: "more coders means more productivity" (a well-known fallacy).
  • Hiring rates for senior level coders are disappointing.
  • Management looks to things like "process improvement".
  • Code Review, supposedly cheap, easy to implement, and a quality improvement "no brainer", is introduced.
  • It doesn't get the anticipated results; bug patterns appear more or less the same over time.
  • The data is pushed around, argued, determined to be insufficient to draw real conclusions.
  • More processes are introduced (common reluctance to abandon "investment" in a failed system).
  • Nothing really changes.
  • The process is often perceived as ponderous and/or punitive by coders.


Regarding the latter: the litany of articles on how to avoid code review being perceived as punitive and ponderous should make it clear that the sentiment is common. I won't bother citing references for such an easily predictable fact.

I'm sure many of you reading this have seen a similar pattern. Does it mean code review is pointless and ineffective?

Not at all. I want to be clear on this point; I am NOT saying that code review has no purpose.

What I am saying is that short-cut "desk inspection" code review, without the prerequisite processes to improve the code that is submitted for review in the first place, and without building in preparation time to conduct reviews properly, can reduce productivity and morale and increase cost, without any real result in terms of reducing defects.

So if you are going to implement code review, be realistic about what can be achieved with it. If you are doing desk inspection of fragments, you should expect no result beyond improved formatting and adherence to internal standards, and that only if the organization actually has such standards to refer to. If it does not (and most don't), then ad-hoc desk-inspection code review is pointless.

To keep this blog post short of a long white paper, I'll present a handful of findings that correlate with my overall findings and observation.

Consider this statement from "Agile Testing and Quality Strategies: Discipline Over Rhetoric" by Scott Ambler and Associates. Note that he makes an important distinction regarding Agile vs. Serial development:


"First and foremost, I consider the holding of reviews and inspections to be a process smell which indicate that you have made a mistake earlier in the life cycle which could be very likely rectified by another, more effective strategy (non-solo development, for example) which has a much shorter feedback cycle. This goes against traditional wisdom which says that reviews are an effective quality technique, an opinion which is in fact backed up by research -- in the book Practical Software Metrics for Project Management and Process Improvement Robert Grady reports that project teams taking a serial (non-agile) approach that 50 to 75 percent of all design errors can be found through technical reviews. I have no doubt that this is true, but my point is that although reviews are effective quality techniques in non-agile situations that in agile situations there are often much better options available to you than reviews, so therefore you should strive to adopt those quality techniques instead."

There's another important point I infer from this. If you are finding lots of bugs in your code reviews, then your coders need to up their game. An obvious bug caught in review should be quite rare. 


Let's talk about the cost of catching a bug in the coding cycle. I found the cost-per-phase graph from the previous source to be pretty much dead on.


Note that defects found via review or inspection are more costly than defects found in a number of processes that should be implemented before code review, such as Pair Programming, Continuous Integration and Test Driven Development. If bugs are getting through to review...you have a problem that is costing you money.

Another interesting read was the IEEE white paper, "The Impact of Design and Code Reviews on Software Quality: An Empirical Study Based on PSP Data" by Chris F. Kemerer, Member, IEEE Computer Society, and Mark C. Paulk, Senior Member, IEEE.

This paper is related to much more formal and highly academic software development setups than your typical "agile" (I use the term loosely) startup or small development shop has in place (again, stressing the distinction between serial and agile development). But even when using such an environment as the control, it confirms that preparation prior to a review, such as familiarizing oneself with the overall system, requirements, and design, is crucial in achieving the effect of reduced defects.

"Despite consistent finding that inspections are generally effective, Glass has summarized the contradictory empirical results surrounding the factors that lead to effective inspections. Weller found that the preparation rate for an inspection, along with familiarity with the software product, were the two most important factors affecting inspection effectiveness."

So, if you are not intimately familiar with a given area of the software product, and do not have time to prepare for the review by getting a complete understanding of the design and requirements, you should not be expected to conduct reviews that measurably decrease defects. Anything short of this discipline throws the value of code review and inspection, as far as reducing defects goes, into debate:

"In summary, although the benefits of inspection are widely acknowledged, based on these competing views and conflicting arguments the discipline has yet to fully understand the fundamental drivers of inspection costs and benefits."

Let's focus on "preparation" for a code review.

The faux-agile notion of code review is a quick "desk inspection": you get a fragment of code, look it over quickly, and either give it a pass or open an issue for comment. There's a little back and forth, something might get tweaked (e.g. "we put returns before our brackets"), and the coder eventually gets a thumbs up and commits.

I reject the notion that enough meaningful and costly-to-fix bugs can and will be caught in such a review process to make it worthwhile. I found this correlated in numerous sources that were far more critical of the subject, but I decided to lead with the IEEE paper because it speaks to the power and value of code review...if it is done correctly. That means proper preparation, which means more time, which means more cost.

So if you are implementing code review and expecting it to be a cheap way of reducing defects...well, I hope your job doesn't depend on the results.

Here is one such more outspoken source.
From "Code Review - The Meaningless Ritual" by Daniel Pietraru:

"Done properly code reviews are extremely time consuming. Today’s programmers are more efficient, due to tools and better languages, then 30 years ago. They can create more code faster. But the amount of code inspected per hour is about the same because they are not smarter than people 30 years ago. Imagine 1000 lines of code that one programmer can write in 1-2 days.

With a traditional code inspection 4 people are required to inspect this code at a rate of 200 LOC/hour after they prepared the code review by reading the code and understanding the design. If we are optimistic the whole thing will require 6-8 hours per developer. This means a total of 24-32 hours spent on inspecting code produced in 8-16 hours. Probably this kind of review can find a lot of bugs in the code but is it worth the price? How many teams get away with estimates including 2-3 hours of inspection for every hour of development?"

People might say those production numbers are unrealistic. So, I'll give real data. In one month, I pushed 84 commits, with 2000+ deletions and 5700+ changes (numbers from GitHub). Now push those numbers out to a team of, say, 15 coders. In fact, cut the numbers in half if you think I'm unusually productive. Then do the review-cost math submitted by Pietraru.
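To make that concrete (rough arithmetic, using his figures and my GitHub numbers): Pietraru's scenario works out to roughly 24-32 person-hours of proper inspection per 1,000 lines. Fifteen coders producing anywhere near my 5,700+ changed lines a month is 85,000+ lines, which at that rate is over 2,000 person-hours of review per month. Cut it in half and it's still over 1,000.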

EXPENSIVE. And that's just the desk inspection version. With proper inspection, those reviews could have taken a week...just for one coder, me.

And to be frank, if I had to get all of my code through reviews like that, my productivity would have dropped significantly. There's stats for you: if I reduce my productivity to zero, I won't introduce any bugs, proving the value of reviews.

Pietraru also goes on to point out a very interesting fact; where did we get code reviews from at all? Again, numerous sources correlate the following:

"Code inspections or reviews as we know them come from the hardware world. Initially Michael Fagan [from IBM, 1976] used them for hardware design reviews. This fact is very significant because uncaught defects in hardware design are pretty expensive.

Also defects in low level software were very expensive at the time. Imagine you write a piece of code that has to be burned on a ROM chip (non erasable). You cannot afford to wait and test/debug your code until after the ROM was destroyed. But this is not the case anymore. Unit tests and running the code in debuggers for success and failure paths are pretty cheap with modern IDEs and testing frameworks."

This reinforces the notion stated by the "Agile Testing and Quality Strategies" paper cited before; in a shop that purports to be "agile":

"...you should strive to adopt those [other] quality techniques instead".

In closing:

There is simply too much information available showing that ad-hoc "desk-inspection" code reviews, in and of themselves, do not usefully correlate to a measurable reduction in defects. In its useful form, code review is a disciplined practice involving a significant investment of time, which translates to money. It met with success in hardware programming because the investment was offset by the irretrievable cost of producing bad hardware. But this has little to do with most software development today; modern IDEs and productivity tools allow us to write FAR more code than hardware programmers did in the 70s, and we can push out hot patches, update UIs without swapping out server bits, and so forth. We're not worried about the cost of 100,000 bad ROMs.

Ultimately, such short-cut reviews can raise overall costs and have other negative effects. There are other more effective cultural changes and processes available that can really make a difference in the quality of your product.