Appendix A

ccount Project Documents

This appendix contains software engineering project documents for our example software metrics program ccount, including a concept exploration document, a requirements document, a design document, and project coding standards.

Also included is the ccount user documentation, a manual page.

 

ccount C Metrics Tool Concept Exploration

Problem Definition

A primary purpose of software engineering is to manage the complexity of software. Tools are needed to measure the complexity of parts of a software system so that the parts can be changed to make them less complex, or so that they can be given special attention, such as extra testing.

In general, as the size of code increases so does its complexity. Thus, one type of measurement that reflects complexity is the count of noncommentary source lines (NCSL) in a piece of software. The tool proposed, ccount , will provide NCSL counts for C source code files. Specifically, ccount will report NCSL for the file as a whole, for each C function definition in the file, and for code, such as macro definitions and variable declarations, outside any function definition in the file.

Counts of the commentary source lines (CSL), and comment-to-code ratios (CSL/NCSL), can be useful as rough estimates of the adequacy of program documentation. ccount will also count the CSL and compute comment-to-code ratios for the whole file, for each function definition, and for code external to function definitions in the file.

System Justification

There are about twenty C programmers in our shop and hundreds in the company. All could benefit from a simple metrics tool like the one proposed. Although other metrics tools exist, none provide the simple metrics we describe in a small, efficient, and portable tool.

User Characteristics

The proposed tool would be used by C programmers and testers during the coding and testing phases of the life cycle. Users can be assumed to have a good knowledge of the C programming language and the UNIX operating system, and to be familiar with a wide range of UNIX/C programming tools.

Goals for System and Project

ccount should be simple to learn and use, and the metrics it reports should be easy to interpret. Response time should be short to encourage programmers to use the tool often. Project management and program maintenance should be minimal.

Constraints on System and Project

Development of this system must not require the purchase of new hardware or software. Development cannot require more than two person months of effort. Because programmers in our shop and in the company work in a variety of environments, the program should be easy to port to environments with a C compiler.

Solution Strategy

To save time and to check requirements, ccount will first be implemented as a prototype with the major features of the target system. The prototype will be written using the UNIX shell and UNIX tools. If the prototype is fast enough, it will be enhanced to attain full functionality, and revised for robustness, reliability, and maintainability. If the prototype is too slow, the final program will be written in C.

Development, Operation, and Maintenance Environments

ccount will be developed on a Sun 3/50 workstation under Sun OS (a registered trademark of Sun Microsystems, Inc.), a version of the UNIX operating system. The prototype will be written in the UNIX shell language, with major portions written in the awk programming language. If the prototype is inadequate, the final program will be written in C. ccount will be maintained for UNIX systems and perhaps for DOS as well.

Feasibility Analysis

In summary, ccount is both technically and politically feasible.

ccount C Metrics Tool Requirements

Product Overview

A primary purpose of software engineering is to manage the complexity of software. Tools are needed to measure the complexity of parts of a software system so that the parts can be changed to make them less complex, or so that they can be given special attention, such as extra testing.

In general, as the size of code increases so does its complexity. Thus, one type of measurement that reflects complexity is the count of noncommentary source lines (NCSL) in a piece of software. The tool described in this requirements document, ccount , will provide NCSL counts for C source code files. Specifically, ccount will report NCSL counts for the file as a whole, for each C function defined in the file, and for code outside any function definition in the file (such as macro definitions and variable declarations).

Counts of the commentary source lines (CSL), and comment-to-code ratios (CSL/NCSL), can be useful as rough estimates of the adequacy of program documentation. ccount will also report CSL counts and compute comment-to-code ratios for the whole file, for each function definition, and for code outside function definitions.

Development, Operation, and Maintenance Environments

Development Environment . ccount will be developed on a Sun 3/50 workstation under Sun OS (a registered trademark of Sun Microsystems, Inc.), a version of the UNIX operating system. It will be written in the C programming language.

Operating Environment . ccount should be portable (with minor changes) to other environments with a C compiler. This requirement will be tested by porting ccount to two other environments: a Vax running UNIX System V, and a PC running DOS.

Maintenance Environment . ccount will be maintained in its development environment.

Conceptual Model

ccount is built on the model of a standard UNIX filter.

Options and Input . Options and input files are specified on the command line; however, the default function delimiter (see later) can be stored in an operating system environment variable, and input can also be supplied on stdin .

Data Output . Data output is directed to stdout .

Error Output . Error output is directed to stderr .

Conceptual Model . Figure A-1 is the conceptual model for ccount.

User Interface Specifications

This section specifies the input user interface. The output formats are specified in the next section.

Command Line Format . ccount is invoked by the command "ccount," possibly followed by options, possibly followed by a list of input source code file names. The allowed options are -t and -d . Schematically, the ccount command line looks as follows:

ccount [-t] [-d function_delimiter] [file_name ...]

Option Format . As with other UNIX tools, the options may be listed separately as shown, or they may be put together (i.e., as -td ). The white space separating the -d option from its argument is optional. Repeated options are allowed; the argument for the last -d option is used as the function_delimiter . Options and file names may be interspersed. A double dash can be used to mark the end of the options list and the start of the filenames.

Option Actions . The -t (for "tabbed") option specifies that each field in the output should be separated by a single tab. The -d (for "delimiter") option specifies a function start delimiter string. If this string contains white space or characters significant to the UNIX shell, it must be quoted. Details about the functions of these options are provided in the next section.

Input File Specification . A file_name is a path to a C source file. When files are named on the command line, ccount uses the named files for input; if no file is named on the command line, ccount reads stdin .
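The command line rules above can be illustrated with a short hand-written parser. This is only a sketch under stated assumptions: the names cc_options and cc_parse are hypothetical and do not come from the ccount design, and a lone "-" argument is assumed to be an ordinary file name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result record; not part of the ccount design. */
typedef struct {
    int is_tabbed;          /* nonzero if -t was seen */
    const char *delimiter;  /* argument of the last -d, or NULL */
    int nfiles;             /* number of file names collected */
} cc_options;

/*
 * Parse argv following the option rules in the text. File names are
 * stored in files[], which must have room for argc entries.
 * Returns 0 on success, -1 on a bad option or a -d with no string.
 */
int cc_parse(int argc, char *argv[], cc_options *opt, const char *files[])
{
    int i, only_files = 0;

    assert(argv != NULL && opt != NULL && files != NULL);
    opt->is_tabbed = 0;
    opt->delimiter = NULL;
    opt->nfiles = 0;

    for (i = 1; i < argc; i++) {
        const char *arg = argv[i];

        if (only_files || arg[0] != '-' || arg[1] == '\0') {
            files[opt->nfiles++] = arg;          /* plain file name */
        } else if (strcmp(arg, "--") == 0) {
            only_files = 1;                      /* end of the options */
        } else {
            const char *p;
            for (p = arg + 1; *p != '\0'; p++) {
                if (*p == 't') {
                    opt->is_tabbed = 1;
                } else if (*p == 'd') {
                    if (p[1] != '\0')
                        opt->delimiter = p + 1;      /* -dSTRING */
                    else if (i + 1 < argc)
                        opt->delimiter = argv[++i];  /* -d STRING */
                    else
                        return -1;                   /* no string after -d */
                    break;      /* rest of this argument is consumed */
                } else {
                    return -1;                       /* bad option */
                }
            }
        }
    }
    return 0;
}
```

Because the last -d seen simply overwrites the earlier value, repeated options need no special handling.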

Functional Requirements

Incorrect Input . ccount assumes that its input is syntactically correct. There is no need for ccount to make estimates of CSL and NCSL for incorrect programs; consequently ccount is required to work properly only on syntactically correct programs. ccount should not core dump on incorrect input, but otherwise its behavior is undefined.

Definitions of NCSL and CSL . These definitions are as follows:

Note that a blank line is considered neither a CSL nor a NCSL, and that a line containing both code and comments is both a CSL and a NCSL.
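The two observations in this note can be made concrete with a small sketch. It assumes that a line is a CSL if any comment text appears on it and an NCSL if any non-white-space character appears outside a comment; this is our reading of the note, not a restatement of the formal definitions. The names line_class and classify_line are hypothetical, and string and character literals are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: either flag, both, or neither may be set. */
typedef struct {
    int is_csl;    /* line contains comment text */
    int is_ncsl;   /* line contains code outside any comment */
} line_class;

static int is_blank(char c)
{
    return c == ' ' || c == '\t';
}

/*
 * Classify one source line. *in_comment carries the "inside a comment"
 * state from one line to the next and must be 0 before the first line.
 */
line_class classify_line(const char *line, int *in_comment)
{
    line_class result = { 0, 0 };
    const char *p = line;

    assert(line != NULL && in_comment != NULL);
    while (*p != '\0' && *p != '\n') {
        if (*in_comment) {
            if (p[0] == '*' && p[1] == '/') {
                result.is_csl = 1;
                *in_comment = 0;          /* comment ends here */
                p += 2;
            } else {
                if (!is_blank(*p))
                    result.is_csl = 1;    /* comment text on this line */
                p++;
            }
        } else if (p[0] == '/' && p[1] == '*') {
            result.is_csl = 1;
            *in_comment = 1;              /* comment starts here */
            p += 2;
        } else {
            if (!is_blank(*p))
                result.is_ncsl = 1;       /* code on this line */
            p++;
        }
    }
    return result;
}
```

A line such as `x = 1; /* set x */` sets both flags, while an empty line sets neither, even when it falls inside a multi-line comment.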

Definition of Function Delimiter String . It is possible to tell by parsing a source code file where a function definition's code begins, but it is not possible to tell where the comments that go with a function definition begin. Therefore, if ccount is to report accurate counts of CSL by function definition, a special character string marking the beginning of the comments for each function definition must be placed in the source file, and provided to ccount . Such a string is called a function delimiter string , and it marks the line at which counting for a function definition begins. Once counting for a function definition begins, all lines up to and including the line containing the brace ( } ) ending the definition are included in the counts for the function. The line containing the function delimiter string itself counts as a CSL or NCSL for the function, as appropriate.

Missing Function Delimiter . If no function delimiter string is specified, then all lines are regarded as external to any function definition, and no per-function counts are reported.

Function Header Recognition . A quick look at the grammar for the C programming language suggests that recognizing function definition headers, necessary for finding function names, will be difficult. The problem arises for two reasons: C declarators can be complex, and identifiers can name types using the typedef facility. Handling arbitrary function definitions would probably require sophisticated parsing algorithms, a symbol table to track type definitions, and header file processing. Because limited development resources are available for this project, the final program is only required to recognize simple function definition headers. Specifically, the program need only recognize function definition headers in which the function name is not contained in a parenthesized declarator. This restriction simplifies parsing without unduly damaging ccount 's usefulness.

Output Report Contents . For each C source file, ccount will report the following:

Output Report Formats . The output report will contain a section for each file provided as input. The report is produced in either of two formats: tabbed format (specified by the -t option on the command line), and nontabbed format (the default). Report sections consist of a header and a body .

Report Section Header Contents . Report section headers contain file names and dates, though their exact contents depend on the output format, as specified subsequently.

Report Section Body Contents . Report section bodies start with lines reporting counts for each function defined in the file, followed by a line reporting counts of source file lines external to any function definition, followed by a line reporting file totals. As noted earlier, when no function delimiter string is specified, all lines are regarded as external to any function definition, and only the external and total lines appear in the report section body.

Report Section Body Fields . Each line in a report section body contains four fields: the first field is a function name, or "external," or "total," for data lines reporting counts for a function definition, for code external to any function definition, or for all code in the file, respectively. The second field is a CSL count, the third field is a NCSL count, and the fourth field is the ratio of CSL to NCSL expressed as a decimal number rounded to two decimal places, with a leading whole number digit or digits. When the NCSL count is 0, this ratio is undefined, so the report shows a dash (-) in place of the decimal number.
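As an illustration of the fourth field, the following sketch formats the ratio with two decimal places and falls back to a dash when the NCSL count is 0. The function name format_ratio is hypothetical; the rounding is whatever the C library's printf family performs for "%.2f".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the CSL/NCSL ratio field into buf and return buf.
 * A zero NCSL count yields "-" because the ratio is undefined.
 */
char *format_ratio(long csl, long ncsl, char *buf, size_t size)
{
    assert(buf != NULL && size >= 2);
    if (ncsl == 0)
        snprintf(buf, size, "-");
    else
        snprintf(buf, size, "%.2f", (double)csl / (double)ncsl);
    return buf;
}
```

With the counts from the sample reports in this section, 23 CSL and 29 NCSL give the text 0.79.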

Tabbed Format Header Layout . In tabbed format, if the program is reading its input from stdin , the header is empty; otherwise it contains the name of the current file.

Tabbed Format Body Layout . Each field on a line in the body of the report section is left justified and separated from its predecessor by a single tab.

Tabbed Format Output Example and Discussion . An example of a tabbed format output report section is the following:

main.c
function1 23  29 0.79
function2 69 172 0.40 
function3 30 111 0.27 
function4 46  60 0.77 
external 143 212 0.67 
total    313 584 0.54 

Tabbed format is intended to make it easy to use ccount in pipelines with other UNIX filters. For example, the filter cut can be used to select fields from a data stream when these fields are separated by tabs. So if portions of ccount 's output are to be directed to other filters, cut can be used to select data from ccount output delivered in tabbed format. Similarly, if the header line in ccount output is not desired, then files can be piped to ccount through stdin and the header line will not appear in tabbed format output.

Nontabbed Format Header Layout . In nontabbed format, report section headers are five lines long. The first and third lines are blank. The second line contains the current file name, or "stdin" if the program is reading from stdin , followed by a tab, followed by the date and time in the format shown below in the example. The fourth line contains column labels identifying the output fields, and the fifth is a line of 45 dashes separating the header from the body of the report section.

Nontabbed Format Body Layout . Each line in the body of the report section provides four fields of right justified output. The first field occupies 20 character positions. Function names too long to fit in this space are truncated to 20 characters. The CSL and NCSL fields are 6 characters wide; the CSL-NCSL ratio field is 8 characters wide. Fields are separated from one another by a single blank.
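The field widths just described map directly onto a printf-style format string. The sketch below is one way to write it; format_row is a hypothetical name, and the ratio is passed in as already formatted text (a decimal number or a dash).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one nontabbed body line into buf: a 20-character name field
 * (truncated if longer), two 6-character counts, and an 8-character
 * ratio field, all right justified and separated by single blanks.
 * Returns the length of the formatted line, as snprintf does.
 */
int format_row(char *buf, size_t size, const char *name,
               long csl, long ncsl, const char *ratio)
{
    assert(buf != NULL && name != NULL && ratio != NULL);
    return snprintf(buf, size, "%20.20s %6ld %6ld %8s",
                    name, csl, ncsl, ratio);
}
```

The precision in "%20.20s" performs the truncation and the width performs the right justification, so a complete line is always 43 characters long as long as each count fits in 6 digits.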

Nontabbed Format Output Example and Discussion . An example of a nontabbed format output report section is the following:

main.c Thu Mar 3 16:39:21 1988 

Function           CSL     NCSL      CSL/NCSL 
--------------------------------------------- 
function1           23       29          0.79 
function2           69      172          0.40 
function3           30      111          0.27 
function4           46       60          0.77 
 external          143      212          0.67 
    total          313      584          0.54

Nontabbed format is intended for use by software engineers working at their terminals. Consequently this output format is spaced to ease reading and avoid confusion.

Function Delimiter String Restrictions . The ccount function delimiter string can be any string of printable ASCII characters, up to 63 characters in length, that contains at least one nonwhite space character.

Function Delimiter String Determination . The function delimiter string is determined as follows:

ccount -d "/* FN */" prog.c

Function Delimiter String Recognition . ccount will recognize a function delimiter string only if it appears at the beginning of a line. This provision is made so that the function delimiter string can be mentioned in a file without upsetting the counting mechanism.
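The delimiter restrictions and the beginning-of-line rule can be sketched as two small predicates. The names delimiter_ok and starts_function are hypothetical. "Printable ASCII" is taken here to mean the characters from space through tilde, so a tab is rejected; that reading is an assumption.

```c
#include <assert.h>
#include <string.h>

#define MAX_DELIM_LEN 63   /* length limit stated in the requirements */

/* Return 1 if s satisfies the delimiter restrictions, 0 otherwise. */
int delimiter_ok(const char *s)
{
    size_t len;
    int has_nonblank = 0;

    if (s == NULL)
        return 0;
    for (len = 0; s[len] != '\0'; len++) {
        unsigned char c = (unsigned char)s[len];

        if (len == MAX_DELIM_LEN)       /* a 64th character: too long */
            return 0;
        if (c < 0x20 || c > 0x7E)       /* not printable ASCII */
            return 0;
        if (c != ' ')
            has_nonblank = 1;
    }
    return has_nonblank;
}

/* Return 1 if line begins, in its first column, with the delimiter. */
int starts_function(const char *line, const char *delimiter)
{
    assert(line != NULL);
    if (delimiter == NULL)              /* no delimiter in effect */
        return 0;
    return strncmp(line, delimiter, strlen(delimiter)) == 0;
}
```

Because only the start of the line is compared, a delimiter that is indented or mentioned later in a line does not start function counting.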

Nonfunctional Requirements

Performance . On an unloaded Sun 3/50 workstation, ccount must be able to process a source file of forty thousand bytes in 2 seconds or less of CPU time as measured by the UNIX time command.

Input File Size . ccount must be able to process a source file at least one Mbyte long.

Error Handling

Error Actions . Errors will be handled in the following way:

 Error                 Message                     Action
 Bad option            Usage message               Issue message and halt
 No string after -d    Usage message               Issue message and halt
 Bad delimiter string  None                        Process file without delimiter
 Nonexistent file      No such file filename       Issue message, process next file
 Unreadable file       Cannot read file filename   Issue message, process next file
 Out of memory         Memory allocation failure   Issue message and halt
 Internal error        Internal error              Issue message and halt

 

Usage Message Format . The ccount usage message is the following:

Usage: ccount [-t] [-d ] [ ...]

The usage message should start on a new line and end with a new line character.

Error Message Format

All ccount error messages begin with the prefix "ccount:". Error messages should start on a new line and end with a new line character.

User Documentation

Because ccount is a simple UNIX filter, it need only be documented for the user with a manual page.

Foreseeable Changes and Enhancements

The most likely enhancement to ccount is improvement of the parsing mechanism to recognize function definition headers, and hence function names, no matter how complex the function declarator. Another desirable enhancement is the provision of regular expression patterns as function delimiters rather than fixed strings. This enhancement would allow more flexible function delimiter specifications. Another likely enhancement to ccount is to have it check a project specification file and report any files or functions out of specified bounds. For example, a project specification file might state minimum, maximum, and target values for ccount metrics. Functions or source files with values outside the extrema might be logged in a separate file.

ccount C Metrics Tool Design

Overview

Structured design is the design method used for ccount . The dataflow diagram presented as the ccount conceptual model in the requirements document is elaborated into a detailed dataflow diagram. This dataflow diagram is converted into a structure chart, and a data dictionary and mini-specs to go along with the dataflow diagram and the structure chart are presented. These products constitute the architecture of ccount.

Because ccount is so small and simple, several units identified in the structure chart for logical reasons are too small to be compile modules. In the detailed design, units are chosen from the structure chart to be compile modules. All module interfaces, error handling, internal structure, and so on are presented as the detailed design.

Architectural Design

The ccount conceptual model is a dataflow as shown in Figure A-2. The detailed elaboration of this diagram is shown in Figure A-3.

This dataflow diagram has a transform flow; that is, it has the overall form of a filter, with an input portion, a process portion, and an output portion. This form serves as the starting place for transforming the dataflow diagram into a structure chart, as shown in Figure A-4.

Note that every dataflow from the dataflow diagram is captured in a data couple in the structure chart, and every transform bubble in the dataflow diagram is captured in a rectangle in the structure chart. Next we present a data dictionary describing each dataflow-data couple.

 Data Item  Description
 classification  Set of boolean values marking lines as CSL, NCSL, external, function ending
 clean-command-line  Command line arguments array checked and rearranged
 command-line-error  Error code indicating a command line error
 counts  A list of triples of function names and counts of CSL and NCSL
 delimiter  String with the text of the function delimiter string
 error-report  Error and usage messages as specified in the requirements
 file-name  String with the text of a C source code input file name
 file-names  Array of strings with C source code input file names
 file-access-error  Error code indicating a file access error
 function-name  String with the text of a C function name
 function-start-flag  Boolean value true when a line is a function start, false otherwise
 identifier  String with the text of a C program name or a key word
 metrics-report  Output reports as specified in the requirements
 output-format  Boolean value indicating whether output is in tabbed format
 raw-command-line  Original command line arguments in an array of strings
 source-characters  Characters from a C source code input file
 source-line  String with the text of a line from a C source code input file
 token  Name of a C program lexical object

The final element of our architectural design is a mini-spec for each rectangle in the structure chart (corresponding to the transforms in the dataflow diagram).

Module Overview

As discussed earlier, many of the modules in the structure chart are too simple to be compile modules. In choosing units to be compile modules, we have taken the internal nodes of the structure chart, and two leaf nodes that have logically distinct functions. This gives six modules. In addition, we formed another compile module to implement the only complex data type in the program, the count list used to accumulate data and send it from the counting module where it is collected to the reporting module that uses it to generate output. These modules are summarized in Table A-3.

 Module  Description
 ccount  Root module overseeing data and control flow
 params  Checks command line and fetches program parameters
 counter  Processes a file to generate line counts
 classify  Parses lines and classifies them
 report  Generates metric output reports
 error  Generates error messages from error indications
 list  Implements an abstract data type for lists of line counts

 

Module Descriptions

ccount This module contains the program's main function. Its job is to pipe data between the other modules.

Get_Parameters(in: cmd_line, out: is_tabbed, delimiter, files);
do {
   Count_Lines(in: delimiter, file, out: count_list);
   Report_Metrics(in: is_tabbed, count_list, file);
   next file;
} while more files;

params This module checks the command line and fetches the program parameters.

int Get_Parameters(
      int argc,         /* in: argc from main */
      char *argv[],     /* in: argv from main */
      int *is_tabbed,   /* out: output format */
      char **delimiter, /* out: delimiter string */
      char ***files     /* out: array of source files */
)

The argc and argv parameters come straight from the main function's parameter list. The is_tabbed parameter is TRUE if the output is to be in tabbed format, and FALSE otherwise. The delimiter string is set from the command line or, failing that, from the environment variable FDELIM . If the string has length 0 or does not exist, then delimiter is NULL , meaning that there is no delimiter string. Any string on the command line that is not an option and not a delimiter string is considered a file name. No further checking of file names is done. The parameter files is a pointer to an array of strings, that is, a pointer to an array of pointers to characters. The end of this array is marked by a NULL pointer as its last element; hence if no files are specified, the only element of the array is the NULL pointer. The return value is SUCCEED (0) on success and FAIL (1) on failure; however, all errors encountered by this module should cause a program abort.

Clean_Command_Line(in/out: argc, argv); 
if ( error ) return( FAIL ); 
if ( -t on command_line ) 
   is_tabbed = TRUE; 
else 
   is_tabbed = FALSE; 
if ( -d on command_line ) 
   set delimiter; 
for all other strings on command_line 
   add to list of files; 
if ( !delimiter_set ) 
   set delimiter from environment; 
if ( !delimiter_ok ) 
   delimiter = NULL; 
return( SUCCEED );
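The delimiter selection in this pseudocode can be sketched as follows. The function names pick_delimiter and current_delimiter are hypothetical; the sketch covers only the choice between the command line and FDELIM and the zero-length case, not the other delimiter checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Choose the delimiter: the command line value wins if -d was given,
 * otherwise the environment value is used. A missing or zero-length
 * choice yields NULL, meaning "no delimiter string".
 */
const char *pick_delimiter(const char *from_cmd_line, const char *from_env)
{
    const char *chosen = (from_cmd_line != NULL) ? from_cmd_line : from_env;

    if (chosen == NULL || chosen[0] == '\0')
        return NULL;
    return chosen;
}

/* Convenience wrapper that reads the FDELIM environment variable. */
const char *current_delimiter(const char *from_cmd_line)
{
    return pick_delimiter(from_cmd_line, getenv("FDELIM"));
}
```

Note that, as in the pseudocode, an empty string given with -d does not fall back to the environment; it simply leaves the program without a delimiter.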

counter The counter module accepts a C source file name and a delimiter string as input, and produces a list of counts for that source file as output.

void Count_Lines(
     char *file,         /* in: C source file */ 
     char *delimiter,    /* in: delimiter string */ 
     count_list *counts  /* out: the count list */ 
)

When file is the NULL pointer, data should be read from stdin rather than from a file. When the delimiter is NULL , no per-function counts are made. When the count list is empty, no data was generated.

initialize counters; 
Create_List(out: counts); 
open( file ); 
if ( error ) { 
   Error( ACCESS_ERROR ); 
   return; 
} 
for each line in file { 
   Classify_Line(in: line, delimiter, out: classification); 
   update counters based on classification; 
   if ( end_of_function ) { 
      Append_Element(in: func_counters, in/out: counts); 
      reset function_counters; 
   } 
} 
Append_Element(in: extern_counters, in/out: counts); 
Append_Element(in: total_counters, in/out: counts); 
close( file );
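The step "update counters based on classification" might look like the sketch below. The types tally and counters and the function update_counters are hypothetical names chosen for illustration; the design itself does not fix how the counters are stored.

```c
#include <assert.h>
#include <stddef.h>

/* One pair of line counts. */
typedef struct {
    long csl;
    long ncsl;
} tally;

/* The three sets of counters kept while a file is processed. */
typedef struct {
    tally func;    /* current function definition */
    tally ext;     /* code external to any function definition */
    tally total;   /* whole file */
} counters;

static void bump(tally *t, int is_csl, int is_ncsl)
{
    if (is_csl)
        t->csl++;
    if (is_ncsl)
        t->ncsl++;
}

/*
 * Record one classified line. A line counts toward either the current
 * function or the external tally, and always toward the file total.
 */
void update_counters(counters *c, int is_csl, int is_ncsl, int is_extern)
{
    assert(c != NULL);
    bump(is_extern ? &c->ext : &c->func, is_csl, is_ncsl);
    bump(&c->total, is_csl, is_ncsl);
}
```

A blank line has both flags clear and so changes nothing, while a line with both code and a comment increments both counts.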

report The report module accepts a file name, an output format indicator, and a count list for a file, and generates a formatted report on stdout . As a side-effect it also consumes the count list so that it is empty when it is finished.

void Print_Metrics( 
     char *file_name,     /* in: of C source file */ 
     int is_tabbed,       /* in: output format */ 
     count_list *counts   /* in/out: count list */ 
)

The file_name parameter is needed because the file name appears in output reports. If the output format indicator is_tabbed is TRUE then a tabbed format output report is generated; otherwise a nontabbed format report is generated. The count list contains the data; if the list is empty then no report at all should be generated.

if ( Is_Empty_List(in: counts) ) return;
if ( is_tabbed )
   print tabbed header;
else
   print nontabbed header;
for every element in counts {
   Delete_Element(in/out: counts, out: cnt_data);
   if ( is_tabbed )
      print tabbed cnt_data;
   else
      print nontabbed cnt_data;
}

The only point of note here is that if file_name is NULL then in tabbed format the header is empty, and in nontabbed format the file name is "stdin."

classify This module parses C source lines and classifies them as CSL or NCSL, as external to any function or internal to a particular function, and as function ending or not. It also finds function names and returns these for use in the output report.

void Classify_Line( 
   char *line,       /* in: source file line */ 
   char *delimiter,  /* in: delimiter string */ 
   int *is_CSL,      /* out: TRUE iff a CSL */ 
   int *is_NCSL,     /* out: TRUE iff an NCSL */ 
   int *is_extern,   /* out: TRUE iff external */ 
   int *is_func_end, /* out: TRUE iff end of func */ 
   char **func_name  /* out: function name */ 
)

The delimiter string is needed to watch for the start of a function. The four boolean variables suffice to completely classify the input line. The func_name parameter is set to point to a buffer with the function's name in it when the current line is the last line of a function (that is, when is_func_end is TRUE ).

- Tokenizing: Tokenizing depends on what tokens are recognized. For the task at hand, the following tokens must be recognized: identifiers (for function names), comment delimiters (for distinguishing CSL and NCSL), left and right parentheses (for recognizing function headers), left and right braces (for finding the ends of function definitions), line ends (to signal the end of processing), and all others. A private enumeration type for these tokens called token_type is defined as in Table A-4.

 Name  Token
 ID_TKN  Identifiers and keywords
 BEG_CMNT_TKN  Comment start delimiter
 END_CMNT_TKN  Comment end delimiter
 LPAREN_TKN  Left parenthesis
 RPAREN_TKN  Right parenthesis
 LBRACE_TKN  Left curly bracket
 RBRACE_TKN  Right curly bracket
 EOL_TKN  End of line
 OTHER_TKN  Anything else

Recognizing these tokens requires character classification. The following character classes must be distinguished: letter, digit, slash, star, left parenthesis, right parenthesis, left brace, right brace, end of line, white space, and all others. A private enumeration type for these character classes called char_class is defined in Table A-5.

 Name  Description
 WHITE_CH  White space characters
 EOL_CH  The newline character
 LETTER_CH  The uppercase and lowercase letters
 DIGIT_CH  The digits
 STAR_CH  The asterisk
 SLASH_CH  The slash
 L_PAREN_CH  The left parenthesis
 R_PAREN_CH  The right parenthesis
 L_BRACE_CH  The left curly brace
 R_BRACE_CH  The right curly brace
 OTHER_CH  Anything else

Input characters must be mapped to character classes. The standard way to accomplish this is with an array. A preloaded private ASCII array mapping characters to character classes must be defined.

A private buffer for accumulating the text of identifiers must be defined. Once all this machinery is in place, a scanner can be built as a simple finite state machine that takes the (remaining) input line and consumes it character by character until it recognizes a token. The token is returned along with the unconsumed portion of the input line. The scanner also accumulates the text of identifiers.
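A minimal sketch of such a scanner follows. It is an illustration rather than the ccount implementation: it uses the `<ctype.h>` tests in place of the preloaded character class array, treats the underscore as a letter, and assumes a 63-character limit on saved identifier text. The function name next_token and the constant MAX_ID_LEN are hypothetical; the token names are those of Table A-4.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum {
    ID_TKN, BEG_CMNT_TKN, END_CMNT_TKN, LPAREN_TKN, RPAREN_TKN,
    LBRACE_TKN, RBRACE_TKN, EOL_TKN, OTHER_TKN
} token_type;

#define MAX_ID_LEN 63   /* assumed limit on saved identifier text */

static int is_id_char(int c)
{
    return isalnum(c) || c == '_';
}

/*
 * Recognize one token starting at *pos and advance *pos past it.
 * For ID_TKN the identifier text is copied into id_text.
 */
token_type next_token(const char **pos, char id_text[MAX_ID_LEN + 1])
{
    const char *p;
    token_type tkn;
    size_t n = 0;

    assert(pos != NULL && *pos != NULL && id_text != NULL);
    p = *pos;
    while (*p == ' ' || *p == '\t')            /* skip white space */
        p++;

    if (*p == '\0' || *p == '\n') {
        tkn = EOL_TKN;                         /* stay on the line end */
    } else if (isdigit((unsigned char)*p)) {
        while (is_id_char((unsigned char)*p))  /* numbers are not names */
            p++;
        tkn = OTHER_TKN;
    } else if (is_id_char((unsigned char)*p)) {
        while (is_id_char((unsigned char)*p)) {
            if (n < MAX_ID_LEN)
                id_text[n++] = *p;
            p++;
        }
        id_text[n] = '\0';
        tkn = ID_TKN;
    } else if (p[0] == '/' && p[1] == '*') {
        p += 2;
        tkn = BEG_CMNT_TKN;
    } else if (p[0] == '*' && p[1] == '/') {
        p += 2;
        tkn = END_CMNT_TKN;
    } else {
        switch (*p++) {
        case '(': tkn = LPAREN_TKN; break;
        case ')': tkn = RPAREN_TKN; break;
        case '{': tkn = LBRACE_TKN; break;
        case '}': tkn = RBRACE_TKN; break;
        default:  tkn = OTHER_TKN;  break;
        }
    }
    *pos = p;
    return tkn;
}
```

The caller passes the address of a pointer into the line, so the unconsumed portion of the line is simply whatever the pointer refers to after the call. String and character literals are not treated specially in this sketch.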

- Recognizing function names: The function name recognition problem is simplified if we assume that we only need to process tokens outside the braces delimiting a function definition (that is, tokens at nesting level 0), and outside macro definitions. We make these assumptions. Then we can recognize a function name by watching for an identifier immediately followed by a parenthesized expression, in turn immediately followed by either an identifier (a type specifier) or a left brace.

This job can be done with a souped-up finite state machine that works as follows: Watch for identifiers, and save them as candidate function names. Once an identifier is encountered, look for an argument list followed by an identifier or a left brace. If such a string is recognized, save the identifier as a function name. The machine is represented in Table A-6.

 State  Identifier  (     )     {     Other
 0      1           0     0     0     0
 1      1           2     0     0     0
 2      2           2     3     2     2
 3      1(A)        0(D)  0(D)  0(A)  0(D)

 

The A means "accept" and the D "do not accept." The algorithm must count unmatched left parentheses, and allow the transition from State 2 to State 3 only if this count is 0. Finally, whenever a transition is made to State 1 on an identifier, the text of the identifier is saved. The saved text is returned when an accepting state is reached.
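The machine in the table can be sketched in C as shown below. The type and function names (input_class, name_finder, finder_init, finder_feed) and the 64-character name buffers are assumptions made for this illustration. The sketch keeps the candidate identifier and the accepted name in separate buffers so that accepting on an identifier in State 3 does not overwrite the name being returned.

```c
#include <assert.h>
#include <string.h>

typedef enum { IN_ID, IN_LPAREN, IN_RPAREN, IN_LBRACE, IN_OTHER } input_class;

#define NAME_LEN 64   /* assumed buffer size for names */

typedef struct {
    int state;                 /* 0 through 3, as in the table */
    int depth;                 /* count of unmatched left parentheses */
    char candidate[NAME_LEN];  /* identifier saved on entry to State 1 */
    char name[NAME_LEN];       /* most recently accepted function name */
} name_finder;

void finder_init(name_finder *f)
{
    assert(f != NULL);
    f->state = 0;
    f->depth = 0;
    f->candidate[0] = '\0';
    f->name[0] = '\0';
}

/*
 * Feed one token to the machine; text is used only when in == IN_ID.
 * Returns 1 when a function name has been accepted into f->name.
 */
int finder_feed(name_finder *f, input_class in, const char *text)
{
    int accepted = 0;
    int next = 0;

    assert(f != NULL && (in != IN_ID || text != NULL));
    switch (f->state) {
    case 0:
        next = (in == IN_ID) ? 1 : 0;
        break;
    case 1:
        if (in == IN_ID) {
            next = 1;
        } else if (in == IN_LPAREN) {
            f->depth = 1;
            next = 2;
        }
        break;
    case 2:
        next = 2;
        if (in == IN_LPAREN)
            f->depth++;
        else if (in == IN_RPAREN && --f->depth == 0)
            next = 3;                 /* argument list is closed */
        break;
    case 3:
        if (in == IN_ID || in == IN_LBRACE) {
            strcpy(f->name, f->candidate);
            accepted = 1;
        }
        next = (in == IN_ID) ? 1 : 0;
        break;
    default:
        break;
    }
    if (next == 1) {                  /* State 1 is entered only on an identifier */
        strncpy(f->candidate, text, NAME_LEN - 1);
        f->candidate[NAME_LEN - 1] = '\0';
    }
    f->state = next;
    return accepted;
}
```

Feeding the tokens of a header such as `int main(argc, argv) int argc;` accepts "main" when the second `int` arrives, while a declaration ending in a semicolon is rejected in State 3.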

- Line classifying: The line classifier must watch every line for a function delimiter marking the start of function counting. It must keep track of comment delimiters and ignore everything but the end of a line when it is inside a comment. Outside a comment, it must count braces to find the ends of functions, to know when code is outside a function, and to know when to feed tokens to the function name finder. It must also keep track of whether a line is a preprocessor line, so as not to feed macro definition tokens to the function finder. This is not hard, but it is tricky. The following describes the algorithm used for this task:

if ( line starts with delimiter )
   in_function = TRUE;
get first token;
while ( token != end_of_line ) {
   if ( in_comment ) {
      is_CSL = TRUE;
      if ( token == end_of_comment )
         in_comment = FALSE;
   }
   else if ( token == start_of_comment )
      is_CSL = in_comment = TRUE;
   else {
      is_NCSL = TRUE;
      if ( 0 == num_braces && !cpp_line )
         Find_Function_Name(in: token);
      if ( token == left_brace )
         num_braces++;
      else if ( token == right_brace ) {
         num_braces--;
         if ( 0 == num_braces ) {
            if ( in_function ) {
               is_func_end = TRUE;
               Get_Function_Name(out: name);
            }
            in_function = FALSE;
         }
      }
   }
   get next token;
}

error This module contains the error function.

 Error Indication  Description
 BAD_OPTION  Option is not a character
 MISSING_DELIMITER  Missing delimiter string
 ACCESS_ERROR  Cannot access file
 MALLOC_FAILURE  Memory allocation failure
 INTERNAL_ERROR  Input scanner failure

 

The error module also exports the following function:

int Error( 
    error_type indication, /* in: what error */ 
    char *file_name /* in: file name */ 
    )

Error first issues the error message for the indication, then aborts the program if appropriate; otherwise it returns FAIL (1) for error propagation.

list This module implements the abstract data type of count lists. A count list is a list of elements consisting of a string and two long values. In this program, the string is used for the function name, and the long values are used for the CSL and NCSL counts for the function. The count list data type has four operations: a list creation operation, an empty list predicate operation, an append list element operation, and a delete list element operation.

int Is_Empty_List( 
    count_list list /* in: list checked */ 
)
void Create_List( 
    count_list *list /* out: list created */ 
) 
void Append_Element( 
    count_list *list, /* in/out: list changed */ 
    char *name,       /* in: name field */ 
    long CSL,         /* in: first count */ 
    long NCSL         /* in: second count */ 
) 
void Delete_Element( 
    count_list *list, /* in/out: list changed */ 
    char *name,       /* out: name field */ 
    long *CSL,        /* out: first count */ 
    long *NCSL        /* out: second count */ 
)

The function Is_Empty_List returns TRUE if its argument is the empty count list, FALSE otherwise. Create_List makes a new list. Append_Element adds a list element with the indicated values in its fields to the tail of the list, and Delete_Element removes an element from the head of the list, placing the values of its fields in the other parameters. These operations impose a queue discipline on count lists.
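One way to realize this interface is a singly linked list with head and tail pointers, sketched below. The representation (the count_node structure and the two-pointer count_list) is an assumption, as is the behavior on allocation failure and on deleting from an empty list, neither of which the design specifies.

```c
#include <stdlib.h>
#include <string.h>

#define TRUE  1
#define FALSE 0

/* One list element: a name and two counts (hypothetical layout). */
typedef struct count_node {
    char *name;
    long CSL;
    long NCSL;
    struct count_node *next;
} count_node;

/* Head and tail pointers make append and delete constant time. */
typedef struct {
    count_node *head;
    count_node *tail;
} count_list;

void Create_List(count_list *list)
{
    list->head = list->tail = NULL;
}

int Is_Empty_List(count_list list)
{
    return (list.head == NULL) ? TRUE : FALSE;
}

void Append_Element(count_list *list, char *name, long CSL, long NCSL)
{
    count_node *node = malloc(sizeof *node);
    if (node == NULL)
        return;                 /* a real module would report MALLOC_FAILURE */
    node->name = malloc(strlen(name) + 1);
    if (node->name == NULL) {
        free(node);
        return;
    }
    strcpy(node->name, name);
    node->CSL = CSL;
    node->NCSL = NCSL;
    node->next = NULL;
    if (list->tail == NULL)
        list->head = node;      /* first element of an empty list */
    else
        list->tail->next = node;
    list->tail = node;
}

void Delete_Element(count_list *list, char *name, long *CSL, long *NCSL)
{
    count_node *node = list->head;
    if (node == NULL)
        return;                 /* assumption: deleting from an empty list is a no-op */
    strcpy(name, node->name);   /* caller's buffer must be large enough */
    *CSL = node->CSL;
    *NCSL = node->NCSL;
    list->head = node->next;
    if (list->head == NULL)
        list->tail = NULL;
    free(node->name);
    free(node);
}
```

Because elements are appended at the tail and removed from the head, functions are reported in the order in which they were counted.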

Uses Relationships

Table A-8 indicates the uses relationships among ccount modules:

 Module     Uses
 ccount     classify, counter, error, list, params, report
 params     error
 counter    classify, list
 report     list, error
 classify   error
 list       error
 error      (none)

 

Rationale

The ccount problem is simple enough that there are few decisions to be made in the implementation, and certainly no difficult decisions. The decisions that were made are summarized subsequently.

ccount Project Coding Standards

This document lists the guidelines to be used during the coding phase of the ccount development effort.

Naming Conventions

These naming conventions allow program readers to recognize many program objects at a glance. For example, document_type is a type name; Is_Empty is the name of a function returning a boolean value; and MEMORY_ERROR is a constant or a macro name.

Types

Control Structures

Expressions

Formatting

Use of Preprocessor

Comments

Error Checking

Modules and Access to Program Objects

Target Metric Values

Code Checks and Inspections

Run all code through lint. All errors found by lint must either be fixed or explained as lint mistakes.

Inspect each module; if this is not feasible, inspect modules in order from those most to least likely to contain errors, based on the difficulty of their design.

Module Header Comment Template

/***************************** module name ******************************** 
Purpose: One or two declarative sentences describing the contents of the 
         module, with emphasis on explaining the modularization principle(s) 
         governing inclusion of code in this module. 

Provenance: A record of the update history of the module. 

Notes: A discussion of the program goal(s) realized by the code in this 
       module, and its relation to other modules. Also included are remarks 
       regarding data, module, or code dependencies, special features, 
       nonportable features, and so forth. 
***/ 

Function Header Comment Template

 

/*FN*************************************************************************** 

      function_name( parameter_list ) 

Returns: data_type -- purpose of the return value. 

Purpose: A sentence stating the purpose of the function. 

Called by: local_function1, local_function2, ... 
           function1, function2, ... in module1. 
           function1, function2, ... in module2. ... 

Plan: Part 1: Goal of part 1 (line_number) 
      Part 2: Goal of part 2 (line_number) ... 

Notes: Notification of side-effects, references to publications, 
       specifications, and so on, discussion of any other noteworthy 
       features. 
***/

Note that this function header template incorporates a ccount function delimiter string, namely /*FN, in its first line. Including the delimiter string in the header template is an easy way to guarantee that it is used correctly and consistently.
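For illustration, a small function written to this template might look like the following. Comment_Ratio is a hypothetical example, not part of the ccount sources, and returning 0.0 when there is no code is an assumption made for the sketch.

```c
/*FN***************************************************************************

      Comment_Ratio( CSL, NCSL )

Returns: double -- the comment-to-code ratio CSL/NCSL, or 0.0 if NCSL is 0.

Purpose: Compute the comment-to-code ratio for a pair of line counts.

Called by: (illustrative example; no callers)

Plan: Part 1: Guard against division by zero
      Part 2: Compute the ratio

Notes: Hypothetical function shown only to illustrate the header template.
       No side-effects.
***/
double Comment_Ratio(long CSL, long NCSL)
{
    /* Part 1: a region with no code has no meaningful ratio */
    if (NCSL == 0)
        return 0.0;

    /* Part 2: convert before dividing to avoid integer division */
    return (double)CSL / (double)NCSL;
}
```

Because the header begins with /*FN at the start of a line, ccount run with that delimiter would start function counting at the header, so the header's own comment lines count toward the function's CSL.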

ccount Regression Test Script and Test Cases

The following is a regression test script for ccount. The test scripts vary only in the test cases used and in the names of the test output files, so this example can serve as a model for the others.

# This shell script runs a regression test on the program ccount 

echo This script tests ccount 
echo comparing output to the expected result in rtst1.output. 
echo "" 

# run ccount with the test input and capture stderr and stdout

ccount ccount.c > temp.output 2>&1


# test for differences in the output file and the correct answer file 

diff temp.output rtst1.output > diffs.1 

# if a difference was found diff will exit with status 1 
# this value will be in the shell variable "?" 

if test "$?" = 1
then
  echo An error has been found in ccount. Following are the
  echo differences between the actual and expected output.
  cat diffs.1
else
  echo no error found in ccount for this test
  rm diffs.1
  rm temp.output
fi
#run the cumulative nvcc statistics 
nvcs -c

Regression testing was done by running the family of scripts. A master shell script could be used to automate execution of a large family of test scripts.

The following list summarizes the ccount test cases. An initial list was generated from the requirements and then extended with further test cases to bring branch coverage above 90 percent. The annotations indicate the input conditions for each test.

  1. ccount ccount.c (Correct file, no options).
  2. ccount ccount.c -- ccount.c (File before double dash).
  3. ccount -vd ccount.c (Illegal and legal options).
  4. ccount -d'/*FN' ccount.c (Single quoted string).
  5. ccount noread.c (Unreadable file).
  6. ccount -v ccount.c (Illegal option).
  7. ccount bogus.c (Nonexistent C file).
  8. ccount ccount.o (An object file).
  9. ccount -t -d"FB" hello.c (Both options, no space).
  10. ccount -t -d "FB" ccount.c (Both options, space).
  11. ccount -td"FB" ccount.c (Both options, compressed).
  12. ccount -d"FB" < hello.c (One option, no space).
  13. ccount hello.c (Correct file, no options).
  14. ccount -d"x ... x" < hello.c (Oversize delimiter with 65 x's).
  15. ccount - hello.c (Missing option).
  16. ccount "" hello.c (Empty file name).
  17. ccount -? hello.c (Nonalphanumeric option).
  18. ccount -d (Missing delimiter).
  19. ccount ccount (Lexically bad file).