Experiences in Implementing Measurement Programs



List of Tables
Table 1: Focus of Cycle Time Indicator
Table 2: Metrics Collection Form [Augustine 99]
Table 3: Data Access Recommendations

Background
Data collected by Howard Rubin of Rubin Systems, Inc. show that four in five metrics programs fail [Pitts 97]. Here, success is defined as a measurement program that lasts for more than two years and that influences the business decisions made by the organization. The primary reasons that metrics programs fail are organizational rather than technical [Rubins 92]:
• not tied to business goals
• irrelevant or not understood by key players
• perceived to be unfair, resisted
• motivated wrong behavior
• expensive, cumbersome
• no action based on the numbers
• no sustained management sponsorship
A successful measurement program is more than collecting data. The benefit and value of software measurement come from the decisions and actions taken in response to analysis of the data, not from the collection of the data [Zubrow 98]. One of the challenges faced by measurement professionals in large, complex organizations (such as those developing and maintaining major software systems) is that so many opportunities for measurement exist. The search for the "right" measures can easily become overwhelming when the selection is not driven by the information requirements that the measures are meant to address. For measurement to be cost effective, it must be designed and targeted to support the business goals of the organization. In their survey of organizations with "reputations for having excellent measurement practices," Rifkin and Cox observed that this tie between measures and goals is one of the characteristics that differentiate successful organizations from the rest [Rifkin 91]. The goal-driven software measurement process produces measures that provide insights into important management issues as identified by the business goals. Since these measures are traceable back to the business goals, the data collection activities are better able to stay focused on their intended objectives. Hence measurement and analysis are planned to support goals; no measurement is done just to collect data for the sake of collection alone.
In this paper, we summarize a number of case studies that illustrate the application of goal-driven measurement in diverse settings. These organizations had very different goals for their measurement programs and chose to focus on measures that were uniquely suited to their needs. The Software Engineering Institute (SEI) provided assistance to these organizations in implementing their measurement programs. Using artifacts and lessons learned from these organizations, we discuss the issues and challenges faced in implementing these measurement programs. Despite the rather obvious differences in the needs of these organizations, the impediments they faced were very similar. In the following sections, we summarize the various steps in the measurement process and provide advice for improving the success of a measurement program.
Overview of the Measurement Programs

Case 1: Measurements Across a Global Enterprise
The purpose of this measurement program was to establish a system of uniform measures across a global enterprise. It included designing measures that supported global business concerns and creating an organizational infrastructure that spanned diverse geographic locations and cultural milieus. The business units involved in this work had different business concerns, processes, native languages, and cultures. These differences were so pervasive that it was sometimes difficult to establish common definitions for even very basic terms, such as the word "project." What was identified as a project in one business unit might be called a task in another. Some business units had multiple tasks that made up projects, while others preferred to call each of those tasks a project. The global scope exacerbated many already difficult technical problems, such as generating definitions that could be applied consistently and normalizing data for comparison purposes.
Among the advantages that this organization was trying to achieve with its measurement program were
• the ability to answer questions about the enterprise (for example: Are we getting better or worse? Is an enterprise-wide improvement program having an effect?)
• the ability to evaluate new technologies, methods, and practices by
  - collecting identical measures to enable meaningful comparisons and trend analysis
  - creating a large pool of project data from which similar projects can be chosen for comparison purposes
• the establishment of a visible, ongoing enterprise focus for software engineering excellence
The participating business units were located around the globe: Tokyo, Singapore, Hong Kong, India, Argentina, and the United States.

Case 2: Assessing the Impact of Software Process Improvement
The purpose of this measurement program was to assess the impact of investment in Software Process Improvement (SPI). As a consequence of the ongoing implementation of SPI activities, the schedule, cost, and quality of future software projects were expected to be significantly better than previous efforts. The indicators in this measurement program were developed to understand, influence, and communicate the actual benefits of software process improvement to completed software projects. The defined measures were to serve as a means of focusing attention on what was important rather than on the many other interesting aspects of software development.

Case 3: Enterprise Performance Management, a Local Perspective
The purpose of this measurement program was to support management in workload balance and effective project management in the context of an ongoing process improvement program. Standardization of measurement across the organization had been a major theme, but the alignment of more "local" performance objectives was needed to reconcile perceived conflicts between what the customer demands and what "corporate" requires. This organization's workload consisted of maintenance and enhancement activities across a portfolio of major systems with a diverse set of users and stakeholders. Strategic planning was conducted at the enterprise and business-unit levels. Relating the performance of small groups of technical staff to the mission of the enterprise was the ultimate goal for this very large organization.

Description of the Methodology
Through a series of workshops, these organizations used the goal-question-(indicator)-metric (GQ(I)M) methodology to define a set of measures related to their business goals. The "I" in parentheses distinguishes the GQ(I)M methodology from the closely related GQM methodology introduced and described by Basili and Rombach [Basili 88, Basili 89, Rombach 89].
The steps of the GQ(I)M approach are organized into three general sets of activities:
1. goal identification
2. indicator identification and the specification of data needed
3. infrastructure assessment and action planning to guide the implementation
In the goal-driven software measurement methodology, business goals are translated into measurement goals (Basili and Weiss, 1984; Briand et al., 1996) by first identifying high-level business goals and then refining them into concrete operational statements with a measurement focus. This refinement process involves probing and expanding each high-level goal by using it to derive quantifiable questions whose answers would assist in managing the organization. The questions provide concrete examples that can lead to statements that identify what type of information is needed. In originally devising this measurement scheme, Basili emphasized the importance of having a purpose for the measurement data before selecting data to collect. Without a purpose, we cannot know what the "right" data would be.
In our elaboration of Basili's methodology, we have added an intermediate step to assist in linking the questions to the measurement data that will be collected. The importance of linking data to the questions they answer is clear in the success Basili has had with the GQM approach. Our experience suggests that identifying questions and measures without visualizing an indicator is often not sufficient to get a successful measurement program started. The displays or reports used to communicate the data (called indicators in our variation of the GQM methodology) are a key link that can determine the success or failure of a measurement program. These indicators serve as a requirement specification for the data that must be gathered, the processing and analysis that must take place, and the schedule by which these activities occur.
Following the specification of indicators, an action-planning step is carried out. First, the existing data collection and measurement activities within the organization are analyzed to avoid duplication and to identify gaps. Priorities, in terms of data to gather to produce the indicators, are assigned. Then tasks are defined to take advantage of existing activities and to address the gaps. Part of the plan also addresses the need for the measurement activities to evolve as the organization's goals change or as new insights are gained using the measurement program.
The goal-driven software measurement approach is described in the SEI's Goal-Driven Software Measurement Guidebook [Park 96]. Figure 2 depicts the general approach we used to derive the indicators and measures. The goal-driven software measurement methodology was implemented in a 10-step course/workshop as shown in Figure 3. Tailored versions of the course are presented in workshop format in an industrial training approach.
Figure 3 (excerpt): Steps of the goal-driven software measurement workshop
Step 1: Identify your business goals
Step 2: Identify what you want to know or learn
Step 3: Identify your subgoals
Step 4: Identify the entities and attributes
Step 5: Formalize your measurement goals

Identification of Business Goals
Clear specification of one or more business goals is a necessary input to a meaningful measurement program. These goals serve to identify the purpose for work underway in the organization. When these goals are well articulated, they raise questions that lead us to evaluate success or failure with regard to the purpose of the work rather than some arbitrary characteristic of the work itself.
In Case 1, the general business goals were articulated by the Chief Technology Officer (CTO). He wanted to measure progress towards the following corporate improvement goals:
• Increase productivity by a factor of 2 over 5 years.
• Improve quality by a factor of 10 over 7 years.
• Reduce development time by 40% over 7 years.
• Reduce maintenance effort by 40% over 7 years.
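To track multi-year goals such as these from one year to the next, it can help to translate each one into an implied constant annual rate. The following is a minimal sketch that assumes compound (geometric) improvement, which the stated goals do not themselves specify; the function name `annual_rate` is illustrative.

```python
def annual_rate(total_factor: float, years: int) -> float:
    """Constant yearly multiplier that compounds to total_factor over years."""
    if total_factor <= 0 or years <= 0:
        raise ValueError("total_factor and years must be positive")
    return total_factor ** (1.0 / years)

# Case 1 goals expressed as (overall multiplier, horizon in years).
# A 40% reduction corresponds to an overall multiplier of 0.6.
goals = {
    "productivity x2 over 5 years": (2.0, 5),
    "quality x10 over 7 years": (10.0, 7),
    "development time -40% over 7 years": (0.6, 7),
    "maintenance effort -40% over 7 years": (0.6, 7),
}

for name, (factor, years) in goals.items():
    yearly = annual_rate(factor, years)
    print(f"{name}: {100 * (yearly - 1):+.1f}% per year")
```

Under this assumption, doubling productivity in five years requires a gain of roughly 15% every year, which gives a yearly checkpoint against which progress can be compared.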
In Case 2, the entire software process improvement initiative was guided by the vision of software excellence articulated by senior management that explicitly describes the attributes of the desired state that are considered essential for the foreseeable future: world-class cycle time, productivity, and quality. The measurement program was to serve as a means of focusing attention on what is important rather than the many other interesting aspects of software development that could be measured.
In Case 3, establishing a common basis for comparing information across a widely distributed organization was a major concern. From the perspective of the organizational subunit, the top priority was to report compelling information that accurately reflects the work being done. From the perspective of the sponsor, the expressed priorities were to support enterprise-wide performance management and to use measurement to support the transfer of process improvement suggestions across the enterprise.
Advice: Clearly specify the goal that is being addressed.

Identification of Indicators
In goal-driven measurement, the primary question is not "What metric should I use?" but "What do I want to know or learn?" Starting with each organization's corporate goals, we conducted workshops with representatives of the organizations to work through the GQ(I)M methodology. This 10-step workshop is illustrated in Figure 3 (above). Our experience shows that it is much easier to postulate indicators and then identify the data items needed to construct them than it is to go directly to the measures. Starting with the raw data (measures or data elements) and creating an indicator can lead to convenient or elegant displays that incorporate the data but fail to answer the questions that drove the data collection. With an indicator specified, the information to be derived from the raw data has been articulated, and we are better able to construct indicators that answer the questions we care about. Also, an indicator or graph is easy to "think about" and "talk to" when you are getting input from others.
Once the indicators had been identified, we found it extremely useful to review the unique focus of each indicator. Table 1 illustrates the work done in Case 2 for one of its indicators.

Table 1: Focus of Cycle Time Indicator
Indicator: Cycle Time
Question Addressed: What is the trend in the number of calendar days typically used by our projects to deliver a software feature (i.e., historical schedule duration)?
Expected Impact of SPI: The average number of days to implement a feature should decrease as a result of SPI. The greatest impact should be seen in large projects.
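A cycle time indicator of this kind could be computed along the following lines. This is a sketch only: the record fields (`period`, `size_class`, `calendar_days`, `features`) and all sample values are hypothetical and are not taken from the Case 2 data. The average used here is total calendar days divided by total features, so larger projects weigh more heavily; an organization might reasonably choose a different averaging rule.

```python
from collections import defaultdict

# Hypothetical completed-project records; all values are illustrative.
projects = [
    {"period": "Y1", "size_class": "large", "calendar_days": 400, "features": 20},
    {"period": "Y1", "size_class": "small", "calendar_days": 90, "features": 6},
    {"period": "Y2", "size_class": "large", "calendar_days": 360, "features": 24},
    {"period": "Y2", "size_class": "small", "calendar_days": 84, "features": 6},
]

def days_per_feature(records):
    """Average calendar days per delivered feature, by (period, size class)."""
    days = defaultdict(int)
    feats = defaultdict(int)
    for r in records:
        key = (r["period"], r["size_class"])
        days[key] += r["calendar_days"]
        feats[key] += r["features"]
    # Skip groups with no delivered features to avoid dividing by zero.
    return {key: days[key] / feats[key] for key in days if feats[key] > 0}

trend = days_per_feature(projects)
for (period, size_class), value in sorted(trend.items()):
    print(period, size_class, round(value, 1))
```

Grouping by size class makes it possible to check the expectation stated in Table 1 that the greatest impact appears in large projects.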
The workshop participants were also asked to visualize success and then answer the following questions:
• What are you going to do with the information?
• What decisions are going to be driven with this data?
The selection of indicators was also driven by a number of factors beyond what we would like to know about each goal. These included
• Who is the audience for the indicators?
• What should be the total number of indicators?
• Can the indicators be interpreted correctly?
• Do the indicators provide an accurate and high-level view?
• Could you collect the data in your organization? Are there major barriers?
CMU/SEI-2001-TN-026
Advice: Maintain traceability from the indicators back to the business goals. If questions arise later about intent, you will be able to look back to the origins and provide implementation decisions that are consistent with your business objectives.
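One lightweight way to keep this traceability is to record, for every indicator, the question it answers and the goal behind that question. The sketch below assumes a simple in-memory structure; the class names, field names, and data elements are illustrative, and the sample goal and question are borrowed from Cases 1 and 2 only to give the example something concrete to hold.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Indicator:
    name: str
    data_elements: list[str] = field(default_factory=list)

@dataclass
class Question:
    text: str
    indicators: list[Indicator] = field(default_factory=list)

@dataclass
class Goal:
    statement: str
    questions: list[Question] = field(default_factory=list)

def trace(goals, indicator_name):
    """Return (goal, question) pairs that justify a given indicator."""
    return [
        (g.statement, q.text)
        for g in goals
        for q in g.questions
        for i in q.indicators
        if i.name == indicator_name
    ]

def untraceable(goals, collected_elements):
    """Data elements being collected that no indicator requires."""
    needed = {
        e
        for g in goals
        for q in g.questions
        for i in q.indicators
        for e in i.data_elements
    }
    return set(collected_elements) - needed

goals = [
    Goal("Reduce development time by 40% over 7 years", [
        Question("What is the trend in calendar days to deliver a feature?", [
            Indicator("Cycle Time",
                      ["project start date", "project end date", "features delivered"]),
        ]),
    ]),
]

print(trace(goals, "Cycle Time"))
print(untraceable(goals, ["project start date", "lines of comments"]))
```

A data element that shows up in `untraceable` is being collected without a stated purpose, which is exactly the situation goal-driven measurement is meant to avoid.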

Classification of Indicators
Once the organizations had defined their goals, we found many of them had difficulty deciding how to tell if or when their business goals had been achieved. While the organizations were able to articulate a strategy and define tasks for achieving their goals, they had difficulty understanding the difference between success indicators (indicators used to determine if the goals have been met) and progress indicators (indicators used for tracking the execution of tasks). These organizations were using progress indicators as a proxy for measuring whether the goal had been achieved. When all the tasks had been executed, the organizations declared success: their goals were met. They did not analyze the outcome of the tasks as part of deciding whether the goals had been met. Execution of the defined tasks is a necessary but not sufficient condition for meeting the goal.
We used the following figure to illustrate the differences among the types of indicators:
1. Success Indicators: These indicators are constructed from the defined success criteria and are used to determine if the goals have been met.
2. Progress Indicators: These indicators are used to track the progress or execution of the defined tasks. A Gantt chart is a good example of this type of indicator. The successful execution of all the defined tasks does not necessarily guarantee that the goal has been successfully met.
3. Analysis Indicators: These indicators are used to assist in analyzing the output of each task. The analyses help test our assumptions about the data we are using to judge progress and success.
As seen in Figure 4, each of these indicator types has a specific use. To assist in postulating success indicators, we asked the workshop participants to think about the following questions:
• How do you know if you achieved the goal?
• How do you define success?
• How do you know if the goal has been met?
From the answers to these or similar questions, the criteria that can be used to decide if the goal has been met are identified. From the success criteria, success indicators can be postulated.
Advice: Have a clear understanding of the type and purpose of each indicator. Articulate clearly the criteria you will use to decide if the goal has been met. Do not use Progress Indicators as a proxy for Success Indicators. Use Analysis Indicators to study the data you use, in order to support accurate progress and success tracking.
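The distinction between progress and success indicators can be made explicit in whatever tooling reports goal status. The following sketch is one possible arrangement, not a prescribed design; the class name, task names, and criterion are hypothetical. The point it illustrates is that completing every task leaves the goal unmet until the success criteria themselves are satisfied.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class GoalStatus:
    tasks_done: dict[str, bool]        # progress indicators: task -> completed?
    success_criteria: dict[str, bool]  # success indicators: criterion -> satisfied?

    def progress(self) -> float:
        """Fraction of defined tasks completed (a progress indicator)."""
        if not self.tasks_done:
            return 0.0
        return sum(self.tasks_done.values()) / len(self.tasks_done)

    def goal_met(self) -> bool:
        """Decided from success criteria only, never from task completion."""
        return bool(self.success_criteria) and all(self.success_criteria.values())

# Hypothetical status: every task is finished, but the outcome has not appeared.
status = GoalStatus(
    tasks_done={"train staff": True, "deploy inspections": True},
    success_criteria={"cycle time reduced": False},
)
print(status.progress(), status.goal_met())
```

Keeping the two calculations in separate methods makes it harder to report task completion as if it were goal achievement.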

Number of Indicators
Selecting the number of indicators was one of the most difficult decisions we had to make. As senior management was the audience for the indicators, the workshop participants decided the indicators should constitute a comprehensive profile that could fit on one page. The intent of this profile was to provide an overview to focus the reviewer's attention on the key issues. Other charts could be attached if necessary.
Advice: Start small and build on success. As a starting point, limit the number of indicators so that they fit on one page.

Using the Full Set of Indicators
Each indicator, taken alone, has an obvious interpretation. Shorter cycle time is good; an increase in the number of errors that a customer finds is bad; and so on. The obvious interpretation is not necessarily the correct one. Cycle time can be shortened in several ways, some of which are clearly not desirable (e.g., skipping testing). A profile should be comprehensive and easy to understand, and it should force an awareness of possible hidden tradeoffs. For example, if testing is sacrificed to reduce cycle time, this may show up in the number of defects reported by the customer.
Advice: Develop a comprehensive set of indicators to detect trends and hidden tradeoffs.
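Reading the indicators as a set can be partly mechanized by flagging periods in which one indicator improves while another worsens. The sketch below is an assumption-laden illustration: the indicator names, the "better when lower or higher" directions, and the sample values are all hypothetical, and a flagged pair is a prompt for analysis, not proof of a tradeoff.

```python
# For each indicator, the direction that counts as improvement (assumed).
BETTER_WHEN = {
    "cycle_time_days": "lower",
    "customer_defects": "lower",
    "productivity": "higher",
}

def direction(name, previous, current):
    """Classify a change as 'improved', 'worsened', or 'unchanged'."""
    if current == previous:
        return "unchanged"
    went_down = current < previous
    improved = went_down == (BETTER_WHEN[name] == "lower")
    return "improved" if improved else "worsened"

def hidden_tradeoffs(previous, current):
    """Pairs (improved, worsened) that deserve a closer look together."""
    changes = {n: direction(n, previous[n], current[n]) for n in BETTER_WHEN}
    improved = [n for n, c in changes.items() if c == "improved"]
    worsened = [n for n, c in changes.items() if c == "worsened"]
    return [(i, w) for i in improved for w in worsened]

# Hypothetical profile values for two consecutive reporting periods.
last_period = {"cycle_time_days": 30, "customer_defects": 12, "productivity": 5.0}
this_period = {"cycle_time_days": 24, "customer_defects": 19, "productivity": 5.0}
print(hidden_tradeoffs(last_period, this_period))
```

In this made-up example, shorter cycle time coincides with more customer-reported defects, which is the pattern the skipped-testing scenario would produce.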

Definitions
Definitions are critical for achieving proper interpretations of the data. Crafting a set of definitions that can be understood, and not misunderstood, is one of the keys to any measurement effort. Without clear definitions, it is impossible to interpret any result in a meaningful way. We have found that the use of specialized templates and checklists enables us to collect and communicate data using a set of uniform definitions.

Indicator Definition Template
In a measurement program that encompasses multiple sites and business units, the issue of good definitions becomes more difficult and much more important. Different sites typically have different processes, business and technical environments, cultures, and assumptions. Due to the global scope of Case 1, the site personnel spoke different languages. To ensure that each unit would construct each indicator the same way using the same measures, assumptions, algorithm, etc., we developed a template for defining and documenting each indicator.
The template includes fields for
• precise objective of the indicator
• inputs
• algorithms
• assumptions
• data collection information
• data reporting information
• analysis and interpretation of results
In all our example cases, the completed templates for each indicator were collected in a measurement handbook that was distributed to each unit. A sample indicator template is provided in Figure 5. Appendix B contains a more detailed description of each field in the indicator template.
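If the handbook is kept in electronic form, the template fields listed above map naturally onto a record type, which also makes it easy to check that no field was left blank before a definition is distributed. This is a sketch under that assumption; the class name, the helper, and the sample text are hypothetical and do not reproduce the template in Figure 5.

```python
from dataclasses import dataclass, fields

@dataclass
class IndicatorDefinition:
    # One attribute per field of the indicator template.
    objective: str
    inputs: str
    algorithm: str
    assumptions: str
    data_collection: str
    data_reporting: str
    analysis_and_interpretation: str

def missing_fields(definition):
    """Names of template fields left blank in a definition."""
    return [
        f.name
        for f in fields(definition)
        if not getattr(definition, f.name).strip()
    ]

# Hypothetical, partially completed definition.
draft = IndicatorDefinition(
    objective="Show the trend in calendar days needed to deliver a feature",
    inputs="Project start date, project end date, number of features",
    algorithm="Total calendar days divided by total features, per period",
    assumptions="",
    data_collection="Collected by the project lead at project close",
    data_reporting="",
    analysis_and_interpretation="A downward trend is expected",
)
print(missing_fields(draft))
```

An organization that tailors the template would add, rename, or remove attributes here, and the completeness check would follow automatically.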
Based on feedback that we have received from organizations, the template was one of the key ingredients to success when implementing a measurement program. Organizations tend to tailor the template to fit their environment. Adding, modifying, or deleting fields, in advance of specifying a set of indicators, can help ensure that the template will be accepted and implemented by the organization. The contents of the metrics collection form are shown in Table 2. Both the indicator template described in Figure 5 and the metric collection form shown in Table 2 provide information so that everyone from the collectors to the decision-makers can understand their purpose in collecting, reporting, and making decisions based on this metric [Augustine 99].

Definition Checklist
Communicating measurement definitions in clear and unambiguous terms is a non-trivial undertaking. To assist in this task, the SEI developed a series of measurement framework checklists for common software measures such as size, effort, milestones, and defects. These framework documents can be downloaded via the SEI Web site at http://www.sei.cmu.edu/sema/publications.html.
The general format of the definition checklists is shown in Figure 6. Each checklist contains an identification section followed by the principal attributes that characterize the object we want to measure. The values that the attribute can assume are listed for each attribute. These values must be both exhaustive (complete) and mutually exclusive (non-overlapping). The checklist also contains columns that specify whether each value of the attribute is included in or excluded from the data collected. By using a checklist of this format, the principal attributes and their values can be explicitly identified.
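The include/exclude structure of such a checklist can be expressed directly as data. The sketch below is illustrative only: the attribute names and values are hypothetical examples for a size measure and are not copied from the SEI checklists. Each attribute takes exactly one value per item, which reflects mutual exclusivity, and an unlisted value is treated as an error, which enforces exhaustiveness.

```python
# Each attribute lists every value it can take and whether that value
# is included in the measure. Attribute names and values are hypothetical.
checklist = {
    "statement_type": {
        "executable": True,
        "declaration": True,
        "comment": False,
        "blank": False,
    },
    "origin": {
        "new": True,
        "modified": True,
        "reused_unchanged": False,
    },
}

def is_counted(item, checklist):
    """True if every attribute of the item has a value marked as included."""
    for attribute, values in checklist.items():
        value = item.get(attribute)
        if value not in values:
            # Values must be exhaustive: an unlisted value is a definition gap.
            raise ValueError(f"{attribute}={value!r} is not covered by the checklist")
        if not values[value]:
            return False
    return True

print(is_counted({"statement_type": "executable", "origin": "new"}, checklist))
print(is_counted({"statement_type": "comment", "origin": "new"}, checklist))
```

Two sites that share the same checklist data will include and exclude the same items, which is the purpose the checklist serves.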

Precise Definitions
When working with multiple sites, we found (in Case 1) that there were significant differences in the assumptions being made about the start and end date of a project, as well as other key dates (milestones) in the development process. It was very difficult to combine data from individual projects into a comprehensive view for the entire enterprise or larger unit unless there was a consistent definition of key dates in the development. We developed a checklist that specified exactly what constitutes the start and end dates of a project as well as other milestones. To develop this checklist, each site presented how they precisely defined a project's start and end date. The data from all the sites were consolidated, and common definitions for these dates were developed. Figure 8 illustrates how the checklist that specifies project start and end dates was developed. In Case 1, English was not the native language of most of the participants. We found it extremely useful to use graphics to illustrate concepts and definitions. Graphics were also extremely useful to prevent misunderstandings when working with the different business units in Case 2. Figure 9 is an example of one of the graphics used with Case 1.

Use of the Data Elements: Etiquette
Using data appropriately is one of the keys to getting a metrics program off the ground, motivating people, and sustaining commitment. If individuals see the metrics as tools to help them succeed, they are likely to rally to the cause. If, on the other hand, they feel robbed of respect and treated like a tool rather than a person, stiff resistance will be the result. To ensure that measures are used appropriately, honor the following three principles:
1. Never allow anyone in the organization to use metrics to measure individuals.
2. Specify how the data are being used by relating them to strategy and providing regular feedback to staff.
3. Have clear rules about who has access to specifics of data, and clear hand-offs when data pass from private status to public. Some data should be made available only to specific individuals, while other data can be accessed at the project or organizational level.
The table below shows the breakdown suggested by Robert Grady [Grady 92].

Collection of Data
Culture may aid or hinder the implementation of a measurement program. In Case 1 (Measurements Across a Global Enterprise), cultural differences among the sites caused us to expend considerable resources. At one site, measurement was ingrained in all activities. At others, our request for effort data was taken as an insult. Considerable time and effort were expended explaining the benefit of collecting and sharing this information. Each time a new individual came on board, we had to re-address this cultural issue.
Advice: Culture is a major issue. Plan to address it early and throughout the implementation. Respect the needs of people involved, and work collaboratively.

The 100% Solution May Not Be Feasible
Trying for the perfect solution that will satisfy all participants may be a futile endeavor. It may be impossible to obtain agreement on all issues by all the participants. A good example of this problem was our solution for the unit of size for Case 1. A number of candidate solutions were proposed, including Function Points, source lines of code (SLOC), and several "proxy" measures such as screens and functions. All have their advantages and disadvantages. Since the primary applications being developed by this organization are database and report-intensive information systems in a variety of languages, Function Points seemed a natural choice. On the other hand, SLOC has the advantage of being relatively inexpensive to collect reliably, given a careful definition of what counts as a line of code and which lines to count. A considerable amount of time and effort was expended defending each particular candidate.
To come to some kind of resolution, we conducted a survey to determine how much software existed in each language and how much would be constructed in each language in the next year. The survey showed that we could count size in SLOC with existing code counters for around 80% or more of the software being developed. Also, using available code counters is a relatively inexpensive way to collect size information when compared to Function Point analysis. This information allowed us to come to the initial solution of using SLOC to determine the size of 80% of the software. The remaining 20% would be addressed later. We also decided to support experimentation with Function Points and other "proxy" measures to get a better sense of the expense and potential advantages.
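The coverage figure from a survey like this one is a simple weighted share. The sketch below shows the arithmetic with made-up numbers chosen so that the result matches the roughly 80% reported above; the languages, volumes, and counter availability are hypothetical and are not the Case 1 survey results.

```python
# Hypothetical survey rows: (language, expected volume next year, counter exists?)
# Volumes are in arbitrary relative units; none of these figures come from Case 1.
survey = [
    ("COBOL", 500, True),
    ("C", 250, True),
    ("SQL", 50, True),
    ("4GL report scripts", 200, False),
]

def counter_coverage(rows):
    """Share of surveyed software that existing code counters can measure."""
    total = sum(size for _, size, _ in rows)
    if total == 0:
        raise ValueError("survey contains no software volume")
    covered = sum(size for _, size, has_counter in rows if has_counter)
    return covered / total

print(f"{counter_coverage(survey):.0%} of the software can be sized in SLOC")
```

The uncovered share identifies where experimentation with Function Points or proxy measures would be most useful.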
Advice: When there is no consensus on how to do something (e.g., measure size), take as your initial position the least costly of the adequate solutions available. Then experiment on a limited basis with other solutions to see if they demonstrate added value. Since no solution will please everyone under these circumstances, adopting any single solution will require some of the adopters to implement something in which they don't fully believe. The lower the cost, the more likely they are to go along. It may be impossible to obtain the 100% solution; 80% may be good enough.

Pilot Implementations
Pilot implementations allow us to test the feasibility and robustness of the definitions, checklists, templates, and procedures developed to implement the measurement program, as well as to develop the following operational aspects:
• forms for collecting and recording data
• data storage and access tools
• who will collect, store, and access data
• tools to aid in collection and analysis
• roll-up procedures
• training
The pilot implementations enabled us to identify a number of problem areas. One identified problem was related to what constituted the end of the project. From the checklist shown in Figure 10 for Case 1, we can see that the end of the project was signaled by customer sign-off at the end of User Acceptance Testing (UAT). During the pilot implementation, we found that some customers would deploy the software and never execute a sign-off at the end of UAT.
As a result of the pilot, we had to modify the checklist so that a measurable and observable event that all could agree on would signal the end of the project.
Advice: Use pilot implementations to verify feasibility and to test definitions, checklists, and templates.

Implementation Time Frame
Implementing a measurement program across many different business units was a tedious, time-consuming process. In a number of cases, major reorganizations and retirement of key management stakeholders during the pilot implementations hampered progress. The new management had other priorities, which made it difficult to maintain project momentum.
Since the basic indicators provided considerable value to the individual business units, the collection and refinement of the indicators, templates, and forms continued.
Advice: Recognize that implementation of a measurement program may take a long time and that management can have a short-term window. Therefore, plan to show some short-term successes before management changes. Start small and build upon success.

Automation/Tools
In general, you should automate the collection of the raw data as much as possible. This will reduce the effort required to collect the raw material on which the measurement program is built. Automation focused on simplifying the recording of primitive data elements will tend to be more beneficial at first. Elaborate data analysis and presentation tools are best selected after the stakeholders have an opportunity to explore the amount of decision support available in the data collected. The ability to refine or add primitive data elements will be more beneficial at first than the ability to perform a complex analysis or draw an intricate display.
Effective communication tools must be selected after the nature of the communication is well understood.
Advice: Make the tool fit the process, not the other way around. Maximize yield of relevant information, while minimizing data collection effort.

Using Analysis Indicators
In Case 3, the rate of implementation for system change requests had decreased dramatically in recent times, and quality concerns expressed by the customer had increased.
Questions about the relationship between productivity in the implementation of change requests and quality led to analyzing the processes used to accomplish changes to the system. Analysis determined that each system release contained changes accomplished using three different processes:
1. using the standard process
2. using an abbreviated process
3. using the process for emergency fixes
Analysis of the delivery rate for change packages revealed the pattern of package types shown in Figure 11. Figure 11 shows the number of change packages implemented by each of the three change processes used for system releases. As can be seen, the processes used to implement change packages have changed dramatically in recent times. A large number of emergency fixes are seen in most releases starting with release 3. The number of changes released with the abbreviated process consistently exceeds the number released using the standard process. The differentiating factor among these processes is the degree of formality and the amount of time allocated to analyzing the change and obtaining agreement to the proposed solution. The standard process requires approval by a very time-consuming Change Control Board, while the approval steps of the other two processes were much less formal and hence less time consuming.
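An analysis indicator like the one in Figure 11 comes down to counting change packages per release, broken down by the process used. The sketch below assumes a change-package log in which each entry records the release and the process; the log format and every sample entry are hypothetical and do not reproduce the Case 3 data.

```python
from collections import Counter

PROCESSES = ("standard", "abbreviated", "emergency")

# Hypothetical change-package log: (release number, process used).
packages = [
    (1, "standard"), (1, "standard"), (1, "abbreviated"),
    (2, "standard"), (2, "abbreviated"), (2, "abbreviated"),
    (3, "abbreviated"), (3, "abbreviated"), (3, "emergency"), (3, "emergency"),
]

def packages_by_release(log):
    """Map each release to a count of change packages per process."""
    table = {}
    for release, process in log:
        if process not in PROCESSES:
            raise ValueError(f"unknown process: {process}")
        table.setdefault(release, Counter())[process] += 1
    return table

# Print one row per release, with counts in the order given by PROCESSES.
for release, counts in sorted(packages_by_release(packages).items()):
    print(release, [counts[p] for p in PROCESSES])
```

Laying the counts out by release makes a shift away from the standard process visible without any further calculation.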
The relationship between quality and the change process used must be understood before modifying the standard process. Whether the occurrence of quality issues leads to unpredictable schedules or the compression of schedules leads to quality problems must be understood. Changes made to the standard process and the abbreviated process may independently affect the schedule or quality performance or both. Improving the standard process (assuming that it will have a uniform effect on performance) requires further analysis. Also, further analysis must be performed to address the quality concerns expressed by the customer.
Advice: Look at your data, and test your assumptions. Don't be afraid to revise your intuition based on evidence.

Motivating the Wrong Behavior
Measurement stakeholders, determined to find useful information to guide decisions, can be led astray by context-dependent information presented out of context. In the example from Case 3 described above, one initial reaction to the situation was a suggestion to eliminate the abbreviated process in an effort to enforce the standard process. However, this would potentially limit the productivity of the project even further, unless some effort to improve the usability of the standard process is undertaken. Merely eliminating an apparently productive avenue for releasing products is an apparent "near term fix" with potentially counterproductive long-term consequences.
Data provided to wider audiences can be censored or distorted when the intentions of data recipients are not understood (or trusted). Defect data from inspections are frequently subject to bias due to effort expended by some to perform pre-review quality checks. However, others spend less time (or no time) preparing for the measurement point. While the performance measures indicate improved performance, the prevalence of undocumented ways of doing work exceeds the organization's ability to balance the workload. The problem compounds itself as the data used become more and more distorted, and the decisions justified by the data become less and less credible.
Matters of trust and motivation aren't the only sources of distortion in data that motivate the wrong behavior. Conflicting goals for performance can frequently lead to tradeoffs in accuracy, which constrain the performance of projects with unintended consequences. Organizations struggling to manage "unit cost" often find themselves defining and redefining a narrowly focused, context-independent, performance index driven by arbitrary definitions and counting rules negotiated in committees. The inclusion or exclusion of various effort categories such as overtime, vacation time, project management, quality assurance, and rework can perpetuate an unrealistic expectation for performance. When one group of roles is motivated by minimizing an arbitrary measure, their influence on another group of roles in the organization can lead to a type of shell game that maximizes the appearance of performance in one respect, while actively sabotaging the reliability of information available.
In many cases, the choice not to collect and examine some data may actually cause people to use the data for more appropriate purposes. Respecting the ownership of data by the people who can most directly act on it will reduce the occurrence of unintended consequences due to conflicting perspectives on what the data mean.
Advice: Beware of unintended consequences, and the perspectives of different stakeholders. Make the right thing to do the easy thing to do.

Summary of Lessons Learned
We have used the Goal-Driven Software Measurement methodology to implement measurement programs in a large number of organizations. The three example cases had different goals for their measurement programs. The solutions to the problems and impediments encountered, the artifacts developed (such as templates and checklists), and the lessons learned will provide insight to others trying to implement a measurement program.
The following is a summary of advice to those currently implementing or considering implementing a measurement program:

Summary of Advice
Use the GQ(I)M methodology to identify your indicators and measures to ensure traceability back to the business goals.
Clearly specify the goal that is being addressed.
Maintain traceability from the indicators back to the business goals. If questions arise later about intent, you will be able to look back to the origins and provide implementation decisions that are consistent with your business objectives.
Have a clear understanding of the type and purpose of each indicator. Articulate clearly the criteria you will use to decide if the goal has been met. Do not use Progress Indicators as a proxy to Success Indicators. Use Analysis Indicators to study the data you use, in order to support accurate progress and success tracking.
Start small and build on success. As a starting point, limit the number of indicators so that they fit on one page. Develop a comprehensive set of indicators to detect trends and hidden tradeoffs.
Customize the indicator template for relevance in your environment by adding, modifying, and deleting fields as required. Define all indicators using the indicator template and use it for precise communication.
Use definition checklists to explicitly define your measures.
Use specialized templates, checklists, and graphics to disseminate unambiguous information that precisely defines the inputs for the measurement program.
Pay close attention to privacy issues pertaining to who can see what portion of the data.
Culture is a major issue; plan to address it early and throughout the implementation. Respect the needs of people involved, and work collaboratively.
When there is no consensus on how to do something (e.g., measure size), take as your initial position the least costly of the adequate solutions available. Then experiment on a limited basis with other solutions to see if they demonstrate added value. Since no solution will please everyone under these circumstances, adopting any single solution will require some of the adopters to implement something in which they don't fully believe. The lower the cost, the more likely they are to go along. It may be impossible to obtain the 100% solution; 80% may be good enough.
Use pilot implementations to verify feasibility and to test definitions, checklists, and templates.
Recognize that implementation of a measurement program may take a long time and that management can have a short-term window. Therefore, plan to show some short-term successes before management moves on. Start small and build upon success.
Make the tool fit the process, not the other way around. Maximize yield of relevant information, while minimizing data collection effort.
Look at your data and test your assumptions. Don't be afraid to revise your intuition based on evidence.
Beware of unintended consequences and the perspectives of different stakeholders. Make the right thing to do the easy thing to do.

Conclusion
When implementing a measurement program, pay special attention to the lessons we learned and to the artifacts we developed, such as templates and checklists. They may assist you in becoming a success data point in Howard Rubin's database of companies that have implemented a measurement program.
maintenance spending is being accomplished by failing to service high-priority requests that should not be neglected.
Customer Satisfaction. This indicator tracks two components of customer satisfaction: satisfaction with the implemented solution and satisfaction with the working relationship with the implementing team.
Cost of Quality (COQ). This analysis breaks down overall costs (in effort-hours) into four categories. We modified the approach used by Crosby to fit the needs of this organization. The categories of cost that we used are:
• rework: total hours for fixing defects discovered prior to release, including the cost of reinspecting and retesting
• appraisal: total hours for inspecting and testing (except when those hours are part of rework)
• prevention: total hours for defect prevention activities, such as Pareto analysis
• performance: costs that are not one of the above (e.g., effort associated with building the product)
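The four-category breakdown can be illustrated with a short calculation. This is a minimal sketch, assuming effort is logged as (category, hours) pairs. The function name `cost_of_quality` and the sample hours are hypothetical, and the category names simply mirror the list above.

```python
# The four Cost of Quality categories named in the text.
CATEGORIES = ("rework", "appraisal", "prevention", "performance")


def cost_of_quality(effort_records):
    """Sum effort-hours per category and return (totals, shares of the total)."""
    totals = {category: 0.0 for category in CATEGORIES}
    for category, hours in effort_records:
        if category not in totals:
            raise ValueError(f"unknown cost category: {category}")
        totals[category] += hours
    grand_total = sum(totals.values())
    shares = {
        category: (totals[category] / grand_total if grand_total else 0.0)
        for category in CATEGORIES
    }
    return totals, shares


if __name__ == "__main__":
    # Hypothetical effort log, invented for illustration only.
    sample = [
        ("rework", 20), ("appraisal", 30),
        ("prevention", 10), ("performance", 140),
    ]
    totals, shares = cost_of_quality(sample)
    for category in CATEGORIES:
        print(f"{category:12s} {totals[category]:7.1f} h  {shares[category]:.0%}")
```

Rejecting unknown categories, rather than silently folding them into performance, is a design choice made here so that gaps in the effort-coding scheme surface early.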

Figure 1: Measurement Program Starts and Successes
Figure 2: Goal-Driven Software Measurement
Figure 3: Goal-Driven Software Measurement Workshop
Figure 4: Types of Indicators
Figure 5: Indicator Template
Figure 6: Definition Checklist
Figure 7: Adapted Staff-Hour Checklist
Figure 8: Developing a Checklist for a Project's Start and End Dates
Figure 9: Date Definition Checklists
Figure 10: Example of a Start and End Date Checklist
Figure 11: Changes per Release Number
Figure 12: Enterprise Profile Example
Figure 13: Indicator Template
Figure 14: Modified Indicator Template

Figure 4: Types of Indicators
Figure 5: Indicator Template
The concept of an indicator template is not unique. Other organizations have recognized the importance of precise communication and of collecting measurements because they need the information rather than because they have the capability to measure. Capt. Thomas Augustine, in "An Effective Metrics Process Model" [Augustine 99], describes a form used to collect the information for each indicator in his organization's metrics plan. The individual fields are very similar to those of the indicator template. The contents of the metrics collection form are shown in Table 2. Both the indicator template described in Figure 5 and the metrics collection form shown in Table 2 provide information so that everyone from the collectors to the decision-makers can understand their purpose in collecting, reporting, and making decisions based on this metric [Augustine 99].

Figure 8: Developing a Checklist for a Project's Start and End Dates
Pay close attention to privacy issues pertaining to who can see what portion of the data.

Figure 10: Example of a Start and End Date Checklist
Figure 11: Changes per Release Number
CMU/SEI-2001-TN-026

Figure 12: Enterprise Profile Example (Data values are for illustrative purposes only)

Table 2: Metrics Collection Form [Augustine 99]
Advice: Customize the indicator template for relevance in your environment by adding, modifying, and deleting fields as required. Define all indicators using the indicator template and use it for precise communication.
In all three examples, we tailored the checklists to the specific environments. In Case 1 and Case 2, we developed a set of customized checklists, based upon the SEI definition checklists, to define the critical data elements. Being able to specify explicitly what attribute values were to be included and excluded in the final value for staff-hours, for example, made it possible to compare data collected from the different organizations. Figure 7 illustrates a portion of the tailored Staff-Hour Definition Checklist developed for Case 1.
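The effect of an include/exclude checklist on a reported staff-hour value can be sketched as follows. This is an illustration only: the attribute names in `CHECKLIST`, the function name `staff_hours`, and the sample entries are hypothetical and are not taken from the SEI Staff-Hour Definition Checklist or from Case 1.

```python
# Hypothetical definition checklist: True means the attribute value is
# included in the reported staff-hour total, False means it is excluded.
CHECKLIST = {
    "regular_time": True,
    "overtime": True,
    "vacation": False,
    "project_management": True,
}


def staff_hours(entries, checklist):
    """Total the hours whose attribute is marked 'include' in the checklist."""
    total = 0.0
    for attribute, hours in entries:
        if attribute not in checklist:
            # An attribute the checklist does not cover is a definition gap.
            raise KeyError(f"attribute not covered by checklist: {attribute}")
        if checklist[attribute]:
            total += hours
    return total


if __name__ == "__main__":
    # Invented effort entries for two organizations, for illustration only.
    org_a = [("regular_time", 160), ("overtime", 12), ("vacation", 8)]
    org_b = [("regular_time", 150), ("project_management", 20), ("vacation", 16)]
    print("Org A:", staff_hours(org_a, CHECKLIST))
    print("Org B:", staff_hours(org_b, CHECKLIST))
```

Applying one shared checklist to both sets of entries is what makes the two totals comparable; changing a single include/exclude decision changes both totals under the same rule.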

Figure 9: Date Definition Checklists
Advice: Use specialized templates, checklists, and graphics to disseminate unambiguous information that precisely defines the inputs for the measurement program.