Chapter 15

Find: A Tool for Transitive Pattern Recognition

As shown in Figure 15-1, Find displays the plain text of main.c, which may be scrolled either with the scroll bar to the left of the text or with the PgUp, PgDn, or arrow keys. Selecting wc.c in the ``Modules'' control box causes Find to display wc.c in the program area. The displayed module is denoted by an exclamation mark to its left. If the control boxes are deselected by using the ``Edit/Show Controls'' menu entry, a ``Modules'' menu appears in the tool bar that enables the user to switch between modules. Before continuing the tutorial, reselect the control boxes and main.c in the ``Modules'' box. The resulting display will be the same as that in Figure 15-1.

Find may be seeded with initial patterns to be matched, either from a seed file or interactively from the program display area. Seeds are selected interactively by sweeping out an atom with the left mouse button and invoking either the ``All'' or the ``File'' option in the ``Seed'' menu, depending on whether the selected atom is in global or file scope. Two seed files are included in the tutorial, as shown in Figure 15-2. The file seeds.sd is used in this tutorial, whereas dates.sd is displayed to provide a more illustrative example of a seed file. Patterns can be changed or inserted, one per line, in the seed files, which must have the extension .sd. Patterns may be excluded beyond the default list by populating the excludeSeed region of the file. These patterns should be entered between the curly braces, one pattern per line.

===== File seeds.sd =====

set includeSeed {{seedList {
word
}}}
set excludeSeed {}


===== File dates.sd =====

set includeSeed {{seedList {
ti?me?
da?te?
epoch
ne?we?r
older
ye?a?r
mo?n?th
da?y
ho?u?r
minute
seco?n?d?
}}}
set excludeSeed {}

Figure 15-2 The word-sensitive (seeds.sd) and date-sensitive (dates.sd) seed files
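
The patterns in dates.sd can be read as regular expressions in which ``?'' makes the preceding character optional, so one line covers several abbreviations of the same word. The Python sketch below illustrates that reading. It rests on two assumptions that are not stated in this chapter: that a seed matches an atom when the pattern occurs anywhere inside the atom (consistent with ``word'' matching wordct later in the tutorial), and that Find's pattern syntax agrees with Python's re module for these simple cases. The function name and the sample atoms other than wordct and linect are hypothetical.

```python
import re

# Patterns copied from the seed files shown in Figure 15-2.
WORD_SEEDS = ["word"]
DATE_SEEDS = ["ti?me?", "da?te?", "epoch", "ne?we?r", "older", "ye?a?r",
              "mo?n?th", "da?y", "ho?u?r", "minute", "seco?n?d?"]

def matching_seeds(atom, seeds):
    """Return the seed patterns that occur somewhere inside the atom.

    Assumption: substring-style regular-expression matching.
    """
    return [s for s in seeds if re.search(s, atom)]

# Hypothetical atoms, chosen only to show how the optional characters behave.
for atom in ["wordct", "yr", "cur_month", "timestamp", "linect"]:
    print(atom, matching_seeds(atom, WORD_SEEDS + DATE_SEEDS))
```

Under these assumptions, ``yr'' is picked up by ye?a?r and ``timestamp'' by ti?me?, while ``linect'' matches no seed and would not be highlighted.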

Import the seeds.sd file by selecting the ``Import'' entry in the ``File'' menu, which invokes a standard file selection dialog. Other entries in the ``File'' menu include:

``New'', which clears the state of the current Find session and makes the tool available for analyzing a new feature in a different set of modules. The state of an analysis is savable at any time into a Find document file, which has the .xfd file extension. This document allows the state of an analysis to be saved and restored so large problems may be subdivided into several Find sessions. The Find document contains references to all modules, seeds, included and excluded atoms and any other data necessary to continue working on a particular problem.
``Open'', which opens a previously saved Find document for continuation or review of feature code analysis.
``Save'', which saves the current state of the analysis into a Find document. The ``Save'' menu entry displays a standard file dialog for saving a file the first time it is selected. Thereafter the Find file is saved each time this menu entry is invoked.
``Save As'', which saves a copy of the current Find document. The ``Save As'' menu entry also uses a standard file dialog. The ``New'', ``Open'', ``Save'', and ``Save As'' menu entries interact in a way that has become familiar to most users of GUI-based applications.
``Import'', which allows the reading of both seed and code modules into Find. This tutorial has already demonstrated how the ``Import'' menu entry is used to read seed files. Although the example modules were invoked from the command line in this tutorial, code modules may also be read into Find using the ``Import'' file dialog. Filters exist in the standard file dialog for C and C++ language files. Any other type of text file may also be read into Find.
``Export'', which saves the items listed in the ``Transitive Seeds'' control box. This entry is of limited usefulness to the general user and may be removed in a future release.
``Report'', which publishes the results of an analysis in printable form. The report is formatted to facilitate postprocessing that makes it suitable input for other tools.
``Adiff'', which exports the results of a feature-code analysis in a format compatible with atacdiff (see Chapter 10, ATAC: Testing Modified Code). This output is usable anywhere in the Toolsuite where atacdiff results are used. It is particularly useful in identifying test sets that exercise the feature code that the user delineates with Find.
``Exit'', which exits the program. Find will query whether to save any unsaved state changes in the analysis.

Once Find reads the seed file and code modules, analysis begins. The Find window should look like Figure 15-3. Find highlights all the atoms that satisfy the regular expression represented by ``word''. Each of these is now a candidate for inclusion in or exclusion from the code-feature set at either file or global scope. The white area covering each line that includes a candidate enhances its visibility in the scroll bar. Small candidates in large modules tend to be overlooked without this highlighting.



Figure 15-3 The updated Find window after seeds.sd is imported

Before proceeding with the analysis, take a few moments to become familiar with the operation of the ``Seed Files'' and ``Seed List'' controls at the bottom of the screen. The ``Update'', ``All'' and ``None'' buttons in each control box determine whether the seed file or individual regular-expression seeds are considered when generating candidates. The ``Update'' button functions as a toggle. Selecting the entry ``word'' in the ``Seed List'' control box and clicking the ``Update'' button removes the highlighting of all atoms that match the pattern ``word''. Reselecting ``word'' and clicking the ``Update'' button highlights all atoms that match the pattern ``word'' again. The ``All'' and ``None'' buttons have the expected effects. The ``Seed Files'' control box operates similarly. An exclamation mark to the left of the pattern or file indicates that the item is active.

Classify each of the candidates highlighted in red according to whether it is included or excluded in file or global scope. Atoms that are defined as static or automatic variables should be selected at file scope while those defined as global should be selected at global scope. Click the left mouse button on the red highlighted wordct at the beginning of main.c. The pop-up menu in Figure 15-4 appears. Move the mouse down to select ``Include File''. This will insert wordct into the ``Patterned Seeds'' control box. Other entries in the pop-up menu include:

``Include All'', which selects the atom as a global patterned seed in all code modules. Double clicking an atom is a short cut for selecting this entry.
``Exclude All'', which excludes the atom from further consideration in all code modules.
``Exclude File'', which excludes the atom from further consideration in this file.
``Show/Hide'', which shows candidates in the file that have been selected through the transitive relation. This menu item is not useful at this point but its purpose will become clear later.
``Next'', which selects the next instance of the atom specified. This menu entry is used in conjunction with ``Show/Hide''.



Figure 15-4 The updated Find window as the user begins to classify candidates according to whether they are included or excluded in file or global scope

If a classification needs to be changed, remove entries from the ``Patterned Seeds'' control box by clicking the seed and selecting ``Edit/Delete''.

Click twordct and move the mouse down to select ``Include File''. Repeat this for doword. These atoms will appear in the ``Patterned Seeds'' control box, and all other instances of each atom will change color to either green or yellow depending on whether the atom was included or excluded from further consideration. Atoms that the user does not wish to explicitly exclude may be left alone; they will not appear at the next level in the relation. For example, the candidates that match the ``word'' pattern in module wc.c should be ignored, as should all words within comments.

Now invoke the ``New Level'' entry in the ``Seed'' menu, which computes the next set of candidate atoms and updates the Find window. Scroll down until you see what is displayed in Figure 15-5.



Figure 15-5 The updated Find window after ``New Level'' is selected from the ``Seed'' menu

Going to a new level causes several changes in the window. The title of the ``Patterned Seeds'' control box changes to ``Transitive Seeds'', indicating that atoms are no longer being selected through regular expressions but rather through the transitive relation of residing on the same line in the module. The entry in the ``Seed Files'' control box changes from ``seeds.sd'', the name of the seed file, to ``seedGen_1'', indicating the level of the transitive relation between atoms. Most importantly, a new set of candidate seeds is highlighted in red, and the included seeds remain highlighted in green with red text, which indicates that these atoms were included in a previous classification pass.
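
The transitive relation described above, in which an atom becomes a candidate because it resides on the same line as an already included atom, can be sketched in Python as follows. This is only an illustration of the idea, not Find's implementation. It assumes that atoms are C identifiers, uses a small hypothetical stop list in place of Find's default list, and runs on invented sample lines that are not taken from main.c.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

# Hypothetical stop list; Find's actual default list is not reproduced here.
STOP = {"int", "if", "else", "while", "for", "return", "static", "char"}

def next_level(lines, included, excluded):
    """Return atoms that share a line with an included atom
    and have not yet been classified."""
    candidates = set()
    for line in lines:
        atoms = set(IDENT.findall(line)) - STOP
        if atoms & included:
            candidates |= atoms - included - excluded
    return candidates

# Invented sample lines, for illustration only.
lines = [
    "twordct += wordct;",
    "count(fp, &linect, &wordct, &charct);",
    "print(linect, wordct, charct, file);",
    "fclose(fp);",
]
included = {"wordct", "twordct", "doword"}
excluded = {"linect", "charct", "file"}

print(sorted(next_level(lines, included, excluded)))
# prints ['count', 'fp', 'print']
```

The last sample line contributes nothing because it holds no included atom, which is why excluding irrelevant atoms early keeps later levels uncluttered.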

Other entries in the ``Seed'' menu include:

``All'', which is used in conjunction with manually selected atoms. An unhighlighted atom which is not in the stop list is selected by either double clicking on it or sweeping it out with the left mouse button. On choosing the ``All'' menu entry, the selected text is included as a transitive seed in global scope just as if it were a candidate selected with the pop-up menu.
``File'', similar to ``All'' except the atom is included in file scope.

Go through the modules and exclude the following atoms at file scope: linect, charct, tlinect, tcharct, dochar, doline, and file. Exclude stdin, a global variable, at global scope. These atoms are excluded to relieve clutter on the display at the next level; if they were not excluded, they would remain as candidates. The atoms count and print should be included at global scope, which selects them as candidate atoms in wc.c at the next level. Invoke ``Seed/New Level'' to go to the next level.
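
The difference between file scope and global scope can be made concrete with a small Python sketch of a classification table. The class, its method names, and the rule that a file-scope decision takes precedence over a global one are assumptions made for illustration; the chapter does not describe how Find stores these decisions. Placing linect in main.c is likewise assumed.

```python
class Classification:
    """Hypothetical record of include/exclude decisions by scope."""

    def __init__(self):
        self.global_scope = {}  # atom -> "include" or "exclude"
        self.file_scope = {}    # (module, atom) -> "include" or "exclude"

    def classify(self, atom, decision, module=None):
        """Record a decision; module=None means global scope."""
        if module is None:
            self.global_scope[atom] = decision
        else:
            self.file_scope[(module, atom)] = decision

    def status(self, atom, module):
        """Decision that applies to atom in module, or None if unclassified.

        Assumption: a file-scope decision overrides a global one.
        """
        if (module, atom) in self.file_scope:
            return self.file_scope[(module, atom)]
        return self.global_scope.get(atom)

marks = Classification()
marks.classify("linect", "exclude", module="main.c")  # file scope (module assumed)
marks.classify("stdin", "exclude")                    # global scope
marks.classify("count", "include")                    # global scope

print(marks.status("count", "wc.c"))     # prints include
print(marks.status("linect", "main.c"))  # prints exclude
print(marks.status("linect", "wc.c"))    # prints None
```

A global inclusion such as count is visible from wc.c, whereas a file-scope decision made in one module leaves the same name unclassified in every other module.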

Select the wc.c module and observe that ``count'' is a candidate atom. It is known from the previous level that the third argument of ``count'' is connected with the word counting feature; therefore ``p_nw'' should be included in file scope. Everything else on that line should be excluded at the file level. Select ``Seed/New Level'' to generate new candidates. This action leads to one new candidate atom, ``nw'', which should be included in file scope. Select ``Seed/New Level'' again. At this point no new relevant candidates emerge. Ignore the candidates on the integer declaration line, as there is no true dependency among them: the declaration of a variable does not convey how it is going to be used and hence what it is related to. The analysis is now complete (Figure 15-6). If desired, the state may now be saved in a .xfd file, a report generated, or a .dif file created.



Figure 15-6 The updated Find window after completion of the static analysis

Displaying each module shows the atoms included in the word counting feature for that module in the ``Seed List'' control box. Global atoms are preceded with an exclamation point, and seeds in file scope are preceded with a ``+'' sign.

Examination of Figure 15-6 demonstrates one of the limitations of static analysis. The integer variable ``state'' (e.g., in the statement ``state = OUT'') was not found by Find but is in fact an atom that contributes to counting words, as the code shows. To include the ``state'' variable from this statement (Figure 15-7) in the analysis, double click it with the left mouse button and select the ``Seed/File'' menu entry.



Figure 15-7 Including the variable state from the statement ``state = OUT'' in the analysis

To quit Find, click on the ``File'' button in the top button bar, then select ``Exit''.
