Measure Once, Cut Twice

Estimation Rules of Thumb

Posted in agile, software development by steve on May 31, 2010

Software project estimation is hard. In fact, it is so hard that estimating within the accuracy most people expect is actually impossible. To get as accurate as humanly possible, read McConnell’s software estimation book,[1] collect your own metrics, and then carefully and critically apply the principles. If you just need to get a quick order of magnitude check, here are some heuristics and techniques for a bottom up approach based on estimating code size.

The basic numbers are:

  • 5-20 LOC per developer per hour
  • 2000 person hours per year
  • 50 LOC/class (Java), 100 LOC/class (C++)

This method uses objects as a proxy for size estimation.[2] You supply the number of objects in the target software and out pops the magic number. The two dominant variables tend to be the number of objects (obviously) and the LOC per developer per hour. The second can often be pulled from historical data. I tend to measure the start when developers are first engaged in serious coding, skipping the early requirements and visioning, and the end when the code is running, unit tested, and lightly functionally tested, i.e. DCUT code (Design-Code-Unit Test). For some teams this is alpha, for others beta, and for others Running Tested Features. However you do it, try to find reasonably consistent points and make your historical measurements.

If you have no historical data, here is a rough continuum:

  1. 25+ LOC/person/hour — prototypes; small trivial projects
  2. 20 LOC/person/hour — small, 2-3 person team with fast micro-requirement turnaround (e.g. onsite customer, or more commonly, the developers are able to fill in many of the details of the requirements)
  3. 10 LOC/person/hour — regular agile team building a non-trivial app
  4. 5 LOC/person/hour — typical enterprise development pace
  5. 1-3 LOC/person/hour — stringent or archaic, unproductive environments (e.g. banking software); you’ll see this in some historical literature, but they are often taking into account the time beyond DCUT

Pick one that seems to fit your team size and environment. Don’t be too optimistic. How big is your team size? Is it a prototype? Do you have to worry about localization, security, scalability? How familiar is the team with the languages, frameworks, and tools? 

The 2000 person hours per year is just a shortcut to account for holidays, sick days, bathroom breaks, and other daily downtime, also known as non-ideal programmer days (or hours).

Now the hard part. How do you figure out the number of objects or lines of code in your future software? The easiest way is by analogy. Find a similar project that either you’ve done or someone else has done. There may be some open source projects that cover some of your project scope. If so, take a look at their code bases. 

Barring that, you’ll need to do some high-level design in order to start figuring out how big your code will be. Knowing how many layers your architecture will have and which frameworks you’ll be using is important. More layers tend to add more code. Frameworks often provide design constraints that you can use to start to enumerate the scope of the code — count the number of services, commands, or functions. Database tables and screens are also good proxies for code size estimation. If you already have a database schema, how many objects will be needed to wrapper it? Will there be a separation of data objects and domain objects? 

Screens tend to map to template files, controllers, views, model proxies, etc. If you have both an existing database schema and requirements that map out screens, you should be in pretty good shape. If you have a pure codebase with no external anchors such as screens, database tables, web services to process, or transactions to fulfill, you may want to take a different approach.
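To make the anchor-counting concrete, here is a minimal sketch in Python. The per-anchor multipliers (three objects per screen, one or two per table) are illustrative guesses standing in for your own architecture, not figures from the post:

```python
# Hypothetical object-count sketch from external anchors (screens, tables).
# The multipliers are illustrative assumptions: each screen maps to roughly a
# template, a controller, and a view model; each table maps to a data object,
# plus a domain object if those two layers are separated.

def count_objects(screens, tables, separate_domain_objects=True):
    objects_per_screen = 3                          # template, controller, view model
    objects_per_table = 2 if separate_domain_objects else 1
    return screens * objects_per_screen + tables * objects_per_table

# 20 screens and 15 tables, with data and domain objects separated:
print(count_objects(20, 15))  # 20*3 + 15*2 = 90 objects
```

The point is not the exact multipliers but that the anchors force you to enumerate scope instead of guessing a single big number.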

Once you have estimated how many objects, it is just a matter of multiplying Objects * LOC per object to get the total LOC, then dividing by LOC per person per hour to get the total person hours. Divide by 2000 to get the person years. 
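The arithmetic above can be sketched end to end. This uses the rules of thumb from earlier in the post (50 LOC per Java class, 10 LOC/person/hour for a regular agile team, 2000 person hours per year); the object count of 150 is a made-up input for illustration:

```python
# Back-of-the-envelope DCUT estimate: objects -> LOC -> person-hours -> person-years.

LOC_PER_CLASS = 50           # Java rule of thumb
LOC_PER_PERSON_HOUR = 10     # regular agile team building a non-trivial app
HOURS_PER_PERSON_YEAR = 2000 # non-ideal hours shortcut

def estimate(num_objects):
    total_loc = num_objects * LOC_PER_CLASS
    person_hours = total_loc / LOC_PER_PERSON_HOUR
    person_years = person_hours / HOURS_PER_PERSON_YEAR
    return total_loc, person_hours, person_years

loc, hours, years = estimate(150)
# 150 objects * 50 LOC = 7500 LOC; / 10 LOC/hr = 750 hours; / 2000 = 0.375 years
```

Swapping in your own historical LOC/person/hour rate is the single change that most improves the estimate.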

Now take a look at the software estimation cone of uncertainty and realize your error bars are probably worse than +/-100%.[3] Still, it is better than nothing at this point. Ideally, you should use this technique along with a couple of others, such as a top-down work breakdown structure, gut checks with a few team members, and/or high level epic estimation via planning poker. Multiple techniques done independently (don’t taint each other!) are more powerful than one expert judgement.

Note this number does not take into account non-code related and other project related costs. Designing the database, setting up build machines, project management, and high level requirements definition should be estimated separately.

[1] Software Estimation, Steve McConnell.
[2] A Discipline for Software Engineering, Watts S. Humphrey.


3 minutes, 3 hours, 3 days

Posted in agile by steve on March 30, 2010

Communication efficiency on projects is intrinsically linked to distance. Consider the following scenario: you have a blocking issue that can be answered quickly by the right person. You’re not exactly sure who the right person is but you know roughly the team or group you should talk to. It is a bit tough to describe in an email, so you need to articulate it verbally, or even better, with some drawing and wild gesticulating. The other group is:

a) Your immediate team. You lean over to the person next to you and start describing it. Within 30 seconds, they probably know if they are the right person or if it is someone else. Bonus: ‘accidental broadcast’ — if you are sitting close together, one or two other teammates probably overheard and may chime in. Within 3 minutes, you have your answer and everyone is back to work.

b) Another team down the hall or on another floor. You’re not sure what their schedule is, so instead of walking over and risking the key person not being there, you book a meeting at least a couple of hours in the future just to be courteous. The meeting time-slot is at least 1/2 hour long since Outlook defaults to that. 3 hours later you have your answer.

c) Another team across the Atlantic/Pacific. You have a one or two hour window the next day that is a good conference call time for both parties. If that isn’t open, you try the next day. After the call, inevitably, there is a dangling thread so you send a quick clarification email. You get an answer the following day, unblocking you. Total calendar time: 3 days.

There are in-between brackets too. Someone who is on your team but not sitting next to you is more likely 30 minutes than 3 minutes (“I’ll wait until I get a coffee to swing by their desk”). Someone only three timezones away is likely 30 hours rather than 3 days.

But that is not the worst part. If anything goes wrong in the communication or clarification goes beyond a quick follow-up email, the communication delay typically jumps up to the next bracket. 3 hours turns into 3 days and 3 days quickly turns into 3 weeks.

At a large company with widely distributed teams, three-week delays happen all the time. A conference call requires a follow-up or two, a key person is on holiday for a week, people are booked during the two-hour timezone window until next Tuesday, and so on.

The important thing to understand is that *there is no fix*. Consider it a fundamental latency attribute of the medium. Timezones, the lack of face-to-face communication, the inability to “instant interrupt” for minor issues, less awareness of people’s “micro-schedules”, and other practical issues such as room bookings and the dreaded 15-minute-delay-while-the-organizer-fights-the-web-conference-software all conspire to alter the bandwidth of the channel.

Higher average latency and less bandwidth means people will default to trying to solve issues themselves when it could have been solved more efficiently with input from someone else. If every person on a ten person team needs to communicate on a weekly basis with someone far away, that adds up to a lot of friction.

What can be done is to organize in such a way that you don’t need to communicate across that channel as often. Conway’s law is a reflection of this. The small, co-located teams recommended by Agile methods like Scrum and XP are a realization of this. They advocate re-organizing the project backlog and/or the teams so you can avoid having to communicate across slow, thin pipes. Break up into mostly independent sub-teams. Invest in the up front retraining or knowledge transfer to make it possible.

When those 3 day or 3 week delays happen repeatedly, instead of creating more processes to improve the communication or looking to technology (video conferencing!) to solve the problem, try to figure out how to avoid the need to communicate in the first place.
