Making Estimates of Software Development Effort

If you don't know where you are going, any road will take you there.
Lewis Carroll

Planning to develop some software without making a serious effort to estimate how much work will be required is careless. It is especially careless now, since so much experience has been built up about actually making software. There are many measures available for making accurate estimates. Here we will briefly look at several that are based on the work usually referred to as COCOMO II. This method is based on matching the requirements to the amount of project effort required to produce the functionality in question using a computer.

For the effort to be worth it, we also need enough abstraction that we can reduce any computer functionality to a common language description. Common languages have abstractions for things that generalized languages have many particular expressions for. Joining all those particular expressions into a single aggregated language makes the discussion much more focused.

The most successful common language for computer functions is one that has only five words in it. The words are Input (meaning a screen), Output (meaning either or both a screen or a file), Query, Files (opened and read, to distinguish them from Output) and Interface (talking to other software or equipment). That is all we have figured out to do with computers at an abstract level, so all we have to do is recognize the requirements that ask for those functions. There is an elaborate methodology called Function Point analysis that studies all sorts of instances of these five words, but for estimating effort, just being able to recognize the basic idea in a requirement is enough.

As you read a requirement, just make a tally mark when you see something that says to collect information: that is an input. When it says to show the information to a user, that is an output.
When there is an implication that some information will be stored and picked back up again at some later time, that is a query. Any place where the requirements talk about defaults, standards or configuration, that is a file. When there is an implication that some part of the requirement talks to some other part or to some existing application, that is an interface. If the requirements are specific, think of all of the words as describing simple things. If the requirements are vague, think of all of the words as describing complicated things.

Function Points are reduced to numbers (since we are headed toward making some calculations) using the following values. Each word is assigned a value as either simple (the small number) or complex (twice the simple). This means that if you had one simple input screen that only asked for 4 or 5 values and one complicated screen that asked for 20, the input function points would be 3 + 6 = 9. If you had one simple and two complicated, the input function points would be 3 + 6 + 6 = 15.

Function           Simple   Complex
Input screens:       3         6
Output to file:      4         8
Query to DB:         3         6
Files read:          7        14
Interfaces:          5        10
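The tally described above can be sketched in a few lines of Python. This is only an illustration: the names `WEIGHTS` and `unadjusted_function_points` are made up here, and the weights are the simple/complex values listed above.

```python
# Simple and complex weights for the five words, as listed above.
WEIGHTS = {
    "input": (3, 6),
    "output": (4, 8),
    "query": (3, 6),
    "file": (7, 14),
    "interface": (5, 10),
}


def unadjusted_function_points(tally):
    """Sum the function points for a tally of requirements.

    tally maps one of the five words to (simple_count, complex_count).
    """
    total = 0
    for kind, (n_simple, n_complex) in tally.items():
        simple_weight, complex_weight = WEIGHTS[kind]
        total += n_simple * simple_weight + n_complex * complex_weight
    return total


# The two input-screen examples from the text.
print(unadjusted_function_points({"input": (1, 1)}))  # 9
print(unadjusted_function_points({"input": (1, 2)}))  # 15
```

Keeping the simple and complex counts separate for each word makes it easy to re-run the tally when a vague requirement is later made specific.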
When you have the total number of function points for all the requirements, you calculate how much project work it would take to produce those functions. To get this project work into a simple enough form that we can compare one project to another, we will calculate the length of time it takes for a project to do all the work to support the design, development, test and deployment associated with the functions in the requirements.

To abstract the estimate down to a true common value, it is reduced to a number of lines of code. Again, this is making a concrete measure to map an actual piece of software back to the requirements associated with the application. It is just a number. The relationship between lines of code and a function point depends upon the programming language that is used to develop the software. The quality of the design could also have some effect, but this method assumes that the design is normal, so it will be reasonably constant.

The following table presents actual counts for common programming languages. These values come from 1,467 completed projects where both Function Points and lines of code were collected. Doing the counts was a 2005 project, and some of the projects where the counts were made were as old as 2003.

Programming language                   LOC/FP (multiplier)
Assembly language                      320
C                                      128
Cobol                                  105
Fortran                                105
Procedural, line numbered               90
Ada                                     70
PHP and Python                          67
Java, C++, C#                           31
Object-oriented languages               30
Fourth generation languages (4GLs)      20
Code generators                         15
Spreadsheets and query languages         6
TIBCO (draw-a-program languages)         4
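Converting a function point total into lines of code is a single multiplication by the table value for the chosen language. A minimal sketch follows, in which the dictionary keys and the function name `estimated_loc` are invented for illustration:

```python
# LOC per function point, copied from the table above.
# The short keys are hypothetical labels chosen for this sketch.
LOC_PER_FP = {
    "assembly": 320,
    "c": 128,
    "cobol": 105,
    "fortran": 105,
    "procedural_line_numbered": 90,
    "ada": 70,
    "php_python": 67,
    "java_cpp_csharp": 31,
    "object_oriented": 30,
    "4gl": 20,
    "code_generator": 15,
    "spreadsheet_query": 6,
    "tibco": 4,
}


def estimated_loc(function_points, language):
    """Lines of code = function points * language multiplier."""
    return function_points * LOC_PER_FP[language]


# The same 100 function points in two different languages.
print(estimated_loc(100, "c"))                # 12800
print(estimated_loc(100, "java_cpp_csharp"))  # 3100
```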
The LOC is calculated by FunctionPoints * language multiplier. COCOMO II is an empirically validated software cost estimating formula. It uses these three simple calculations to derive the effort, duration and staffing for the project that has to produce some number of lines of code.
Effort = 3.6 * (KLOC)^1.20

Simple, no? You take the LOC calculated from the function points, divide by 1,000 to get KLOC, raise that to the power 1.20 and then multiply the result by 3.6. That number is the total "on task" work required to produce the lines of code, in person-months. Since not even programmers work all the time, we have to calculate how long the project will last, which is called Duration.

Duration = 2.5 * (Effort)^0.32

Duration is the scheduled time in calendar months, assuming 8 hour days with regular work weeks. Finally we have to know how many people it will take. This is Effort divided by Duration.

People = Effort / Duration

There are two things to remember about manually calculated COCOMO II. One is that it doesn't make any allowance for how good the people actually are. The second is that it assumes a full software development life cycle. The actual coding time is 50% of the total duration, which includes time for detailed design, code, unit test and integration testing. 10% of the total time is requirements gathering, 15% is high level design, 15% is integration and quality assurance, and the last 10% is transition to production. You can see where the work breakdown structure comes from, as well as the allocation of skill sets.

Another useful measure to remember is that there will be five bugs per function point, and only 80% of these bugs will be fixed before the transition to production (the last 10% of the Duration). It is useful to keep track of the number of bugs found and fixed relative to the project size, since achieving at least 80% of projected fixes is the best measure of the quality of the product before releasing it to production. You will still have 20% of the defects in the code when you get to production. Also, you need to work back from the lines of code written for fixes to the number of "new" function points they represent, because any new FP "introduced" by fixing will still have the same five bugs in it.
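The three calculations, the phase split and the defect projection can be sketched together as follows. The coefficients and percentages are the ones given above; the function names are invented for this sketch. Reading Effort as person-months and Duration as calendar months is the conventional interpretation of these coefficients and is treated here as an assumption.

```python
def cocomo_estimate(loc):
    """Return (effort, duration, people) for a given number of lines of code.

    Assumes effort is in person-months and duration in calendar months.
    """
    kloc = loc / 1000.0
    effort = 3.6 * kloc ** 1.20
    duration = 2.5 * effort ** 0.32
    people = effort / duration
    return effort, duration, people


# Share of the total duration spent in each phase, as listed in the text.
PHASE_SHARE = {
    "requirements gathering": 0.10,
    "high level design": 0.15,
    "detailed design, code, unit test, integration test": 0.50,
    "integration and quality assurance": 0.15,
    "transition to production": 0.10,
}


def phase_schedule(duration):
    """Split the total duration across the life cycle phases."""
    return {phase: duration * share for phase, share in PHASE_SHARE.items()}


def projected_defects(function_points, per_fp=5, fixed_fraction=0.80):
    """Return (total, fixed before production, remaining in production)."""
    total = function_points * per_fp
    fixed = round(total * fixed_fraction)
    return total, fixed, total - fixed


effort, duration, people = cocomo_estimate(10_000)
print(f"{effort:.1f} person-months, {duration:.1f} months, {people:.1f} people")
for phase, months in phase_schedule(duration).items():
    print(f"{phase}: {months:.1f} months")
print(projected_defects(100))
```

Because the exponent on KLOC is greater than 1, doubling the lines of code more than doubles the effort, which is why the exponent cannot be treated as an ordinary multiplier.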
WAYS TO IMPROVE THE ESTIMATE

Function Points are a pretty coarse method to describe computer functionality. One method to improve them extends the idea of complexity by placing the basic functional unit into a project context. This allows for differences between projects like a simple proof of concept and a fully configurable multifunction product.

FP = total x [0.65 + 0.01 x sum(complexity)]

where total is the sum of:

1. Number of user input screens * weight
2. Number of user outputs * weight
3. Number of queries * weight
4. Number of files * weight
5. Number of interfaces * weight

The weight depends on the estimated complexity of each part. After the preliminary analysis is made, the entire application is scored for complexity. The formula for this is

sum(complexity) = sum of "complexity adjustment values"

These complexity adjustment values are based on responses to 14 questions. Each positive answer adds 3, 4 or 5 points. This exercise is easier for all modern applications, since we always answer yes to the first six questions at 3 points each, giving us 18 complexity adjustment points to start.

1. Does the system require reliable backup and recovery?
2. Are data communications required?
3. Are there distributed processing functions?
4. Do you care how fast this runs?
5. Will the system run in an existing environment?
6. Does the system require on-line data entry?

Complexity of 18 to here.

7. Does the on-line data entry require multiple screens? +4
8. Are the master files updated on-line? +4
9. Are there any complex function points in this group? +4
10. Is there complex internal processing? +4
11. Is the code supposed to be reusable? +4
12. Is conversion of data included in the design? +4
13. Is the application supposed to be easy to use? +4
14. Is this for multiple different organizations? +5
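The adjustment can be sketched as below. It assumes, as the text does, that questions 1 to 6 are always answered yes for 18 points, so only questions 7 to 14 are passed in. The names `complexity_sum` and `adjusted_function_points` are invented for illustration.

```python
# Questions 1-6 are assumed to be "yes" for any modern application.
BASELINE_COMPLEXITY = 18

# Points added by a "yes" to each of the remaining questions.
QUESTION_POINTS = {7: 4, 8: 4, 9: 4, 10: 4, 11: 4, 12: 4, 13: 4, 14: 5}


def complexity_sum(yes_answers):
    """Sum the complexity adjustment values for questions answered yes.

    yes_answers is an iterable of question numbers in the range 7 to 14.
    """
    return BASELINE_COMPLEXITY + sum(QUESTION_POINTS[q] for q in set(yes_answers))


def adjusted_function_points(total, complexity):
    """FP = total x [0.65 + 0.01 x sum(complexity)]."""
    return total * (0.65 + 0.01 * complexity)


# 39 unadjusted function points with "yes" to questions 7, 10 and 13.
complexity = complexity_sum([7, 10, 13])
print(complexity)
print(round(adjusted_function_points(39, complexity), 2))
```

With this scoring the multiplier ranges from 0.83 (only the first six questions) to 1.16 (all fourteen), so the adjustment moves the raw total by at most about 17% in either direction.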
COTS ADJUSTMENTS TO COCOMO II

Some projects anticipate the use of commercial off-the-shelf (COTS) software products. When that is planned, the estimate of effort for the project needs to be adjusted by adding the time required to assess, tailor, tune and write glue code for the COTS product. Normally this effort is added to the entire COCOMO II project effort, but the assessment portion is supposed to be finished in the first 25% of the schedule. To make this happen, the project will need extra staffing in the requirements gathering and high level design phases.

COTS product assessment Effort = sum(feature points) * number of products evaluated
Feature points are the attributes by which the project and the user community evaluate the COTS application. There are seventeen of these, and each one adds 0 if you don't care, 3 if it is nice to have and 5 if it is a must have.

Seventeen feature points list:

• Accuracy
• Availability/Robustness: availability, fail safe, fault tolerance, redundancy
• Security
• Product Performance
• Understandability: documentation quality, simplicity
• Ease of Use
• Version Compatibility: upward compatibility
• Inter-component Compatibility
• Flexibility
• Installation/Upgrade Ease
• Portability
• Functionality
• Price
• Maturity
• Vendor Support
• User Training
• Vendor Concessions: willingness to escrow source code, or make specified modifications

Product tailoring/tuning Effort = sum(Tailoring Activities) * number of COTS products

Tailoring Activities Effort

Activity                                          1 point                            5 points
Parameter Specification                           50 to 150 parms                    1001 or more parms
Script Writing                                    5 to 20 line script                51 or more line script
I/O Report & GUI Screen Specification & Layout    automated w/templates              handwritten or custom
Security/Access                                   1 security level w/ 20 profiles    5 security levels w/ 100+
Availability of COTS Tailoring Tools              excellent tools                    no tools
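Both COTS formulas are sums multiplied by a product count, which the following sketch illustrates. The function names are invented here. The text only describes the 1 point and 5 point ends of the tailoring scale, so accepting any whole number from 1 to 5 is an assumption of this sketch, and the units of the resulting effort are left as the text leaves them.

```python
FEATURES = [
    "Accuracy", "Availability/Robustness", "Security", "Product Performance",
    "Understandability", "Ease of Use", "Version Compatibility",
    "Inter-component Compatibility", "Flexibility", "Installation/Upgrade Ease",
    "Portability", "Functionality", "Price", "Maturity", "Vendor Support",
    "User Training", "Vendor Concessions",
]

TAILORING_ACTIVITIES = [
    "Parameter Specification", "Script Writing",
    "I/O Report & GUI Screen Specification & Layout", "Security/Access",
    "Availability of COTS Tailoring Tools",
]


def assessment_effort(ratings, products_evaluated):
    """sum(feature points) * number of products evaluated.

    ratings maps a feature name to 0 (don't care), 3 (nice to have)
    or 5 (must have); features left out count as 0.
    """
    for feature, points in ratings.items():
        if feature not in FEATURES or points not in (0, 3, 5):
            raise ValueError(f"bad rating: {feature}={points}")
    return sum(ratings.values()) * products_evaluated


def tailoring_effort(activity_points, cots_products):
    """sum(Tailoring Activities) * number of COTS products.

    Assumption: each activity is scored as a whole number from 1 to 5.
    """
    for activity, points in activity_points.items():
        if activity not in TAILORING_ACTIVITIES or points not in range(1, 6):
            raise ValueError(f"bad score: {activity}={points}")
    return sum(activity_points.values()) * cots_products


ratings = {"Security": 5, "Price": 3, "Vendor Support": 5, "Portability": 0}
print(assessment_effort(ratings, products_evaluated=3))  # 39

scores = {"Parameter Specification": 1, "Script Writing": 5, "Security/Access": 1}
print(tailoring_effort(scores, cots_products=2))  # 14
```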
Glue Code development for functions found to be unavailable in COTS component

The function points for a glue path are calculated on the basis that any glue code requires six complex interfaces (two from the local application, two from the glue code, and two from the COTS application), which is automatically 60. If there is more than one path, each additional path will also require six complex interfaces. Each glue path automatically has a complexity of 30 (18 + easy to use, multiple organizations, conversion of data and complex operations). This is the long way to say that every glue code path is 1,000 function points, which will add 3 to 4 months of schedule and 3/4 of a person to the full life cycle of the project.
SIMPLE HALSTEAD COMPLEXITY MEASURE

If you have to decide if a piece of code you already have is worth having:

1. Count the total conditional words/operators: If, Then, Else, Case, While, Where, &&, Goto (counts as four)
2. Count total lines of code
3. Divide total lines of code by total conditionals

Ratio      Verdict
10+::1     a simple keeper
5::1       refactor?
2.5::1     discard
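A rough way to automate the count is sketched below. Several choices here are assumptions and not part of the measure as stated: blank lines are not counted as lines of code, words inside comments and strings are counted like any others, and the three ratios are read as bands (10 or more keeps, 2.5 or less discards, anything between is a refactoring candidate). The function names are invented for illustration.

```python
import re

# Conditional words from the list above; Goto counts as four.
CONDITIONAL_WEIGHTS = {
    "if": 1, "then": 1, "else": 1, "case": 1,
    "while": 1, "where": 1, "goto": 4,
}
WORD = re.compile(r"[A-Za-z_]\w*")


def count_conditionals(source):
    """Count conditional words plus every '&&' operator in the source text."""
    count = source.count("&&")
    for word in WORD.findall(source):
        count += CONDITIONAL_WEIGHTS.get(word.lower(), 0)
    return count


def lines_per_conditional(source):
    """Total non-blank lines divided by total conditionals."""
    lines = [line for line in source.splitlines() if line.strip()]
    conditionals = count_conditionals(source)
    if conditionals == 0:
        return float("inf")  # no branching at all
    return len(lines) / conditionals


def verdict(ratio):
    """Map the ratio onto the three bands (the banding is an assumption)."""
    if ratio >= 10:
        return "keep"
    if ratio <= 2.5:
        return "discard"
    return "refactor?"


sample = "if a && b then\n  x = 1\nelse\n  goto done\n"
ratio = lines_per_conditional(sample)
print(count_conditionals(sample), ratio, verdict(ratio))  # 8 0.5 discard
```

A language-aware tokenizer would give a cleaner count, but for a quick keep-or-discard decision on existing code the plain text scan is usually close enough.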
The total number of conditionals equals the total number of QA tests needed for this code to be completely tested.

Function points can be grouped into four functional areas to determine coverage for the application requirements:

• Management (workgroup, application servers, systems on a network, computer)
• Coordination (check pointing and recovery, deactivation and recovery, migration, transaction, replication, events, interface reference tracking)
• Repository (storage, information organization, relocation, types, trading)
• Security (access control, security audit, authentication, integrity, confidentiality, non-repudiation, key management)

If the application does not address all four functional areas, what you have is not a complete description of the application. The estimate will be off by 25% per slighted functional grouping.