Validation of Software Engineering Research

- - - - - - rough draft - - - - - - work in progress - - - - - - rough draft - - - - - -

The Validation Task

Software engineering does not as yet have well-established norms for validating research results. The range of problems and solutions is wide enough that no single technique will suffice -- but validation is certainly required. The appropriate validation technique depends on characteristics of both the problem and the solution. The alternatives are laid out below; first we need a little notation.

A typical research project begins with a problem in the world, PW, that serves to motivate the research. The researcher translates this to a problem, PM, in a model setting, M. In some cases the model setting is explicit or even shared within a line of research; in other cases it is implicit in the problem statement. The researcher produces a solution, SM, to PM. This may include developing a tool or technique, TM, and demonstrating the solution on examples, EM, all in the model setting. Validation consists of showing first that SM solves PM and then that SM provides guidance for developing a solution, SW, that works for PW in the world at large. This last step -- the end-to-end check on the work -- is altogether too often omitted.
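The notation can be written down as a small record, which makes the two validation steps explicit. This is only an illustrative sketch: the class name, the field names, and the two boolean evidence flags are assumptions introduced here, not part of the notation itself.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchProject:
    """Hypothetical record of one project in the PW / M / PM / SM notation."""
    world_problem: str                    # PW: the motivating problem in the world
    model_setting: str                    # M: explicit, shared, or implicit
    model_problem: str                    # PM: PW translated into M
    model_solution: str                   # SM: the solution to PM
    tool_or_technique: str = ""           # TM: optional tool or technique
    examples: list[str] = field(default_factory=list)  # EM: examples in M
    # Assumed flags recording whether evidence exists for each validation step.
    solves_model_problem: bool = False    # has SM been shown to solve PM?
    guides_world_solution: bool = False   # has SM been shown to guide SW for PW?


def validation_status(project: ResearchProject) -> str:
    """Report how far the two-step validation has been carried."""
    if not project.solves_model_problem:
        return "unvalidated"
    if not project.guides_world_solution:
        # The end-to-end check is the step that is too often omitted.
        return "validated in the model setting only"
    return "validated end to end"


example = ResearchProject(
    world_problem="a placeholder PW",
    model_setting="a placeholder M",
    model_problem="a placeholder PM",
    model_solution="a placeholder SM",
    solves_model_problem=True,
)
print(validation_status(example))
```

A project that stops after the first step reports "validated in the model setting only", which is the situation the paragraph above warns about.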

<<<<<diagram here>>>>

A characteristic of software engineering research is that PM is as faithful as possible to the world, retaining the thorny difficulties and warts that actually made the problem in the world hard.

<<<<<what remains to be done is extend characteristics of problems and solutions, extend list of validation strategies, flesh out descriptions, show connections between problem/solution lists and validation, and provide citations for good examples>>>>

Character of the Problem

  1. Is it an issue whether PW can be done at all?
  2. Is the problem to make something perform better (e.g., run faster)?
  3. Is the issue of PW to extend an existing capability in some particular way?
  4. Are human performance factors at issue?
  5. Is the issue of PW to change the way some task or activity is carried out?
  6. Does the problem have multidimensional (intrinsic) structure?
  7. Does the problem have a natural, easily quantifiable granularity?

Character of the Solution Technique

  1. Does SM impose modularity (e.g., by separating concerns and showing interactions among parts), and does this modularity map to SW?
  2. Is there a natural model of solution complexity (e.g., number of intermediate points) that can be compared to other solutions?
  3. Does the solution involve a fixed sequence of steps to be carried out by a person?

Strategies for Validation

  1. Demonstration by existence (useful if it was an issue whether PW could be done at all)
  2. User studies at varying degrees of rigor
  3. Comparing parameter counts of SM with other solutions (useful if there's a meaningful thing to count)
  4. Case studies -- thoughtfully chosen and justified examples
  5. Performance measurement (and comparison to alternative solutions)
  6. Benchmark performance (assuming benchmarks for PW are defined)
  7. Comparison of SM applied to the field's customary example with other solutions
  8. Analytic coverage arguments (SM does everything other solutions do, plus meaningful additions)
  9. Show that the solution can be usefully incorporated in existing development processes
  10. Formal analysis: may be symbolic/semantic or numerical
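Three of the strategies above carry a stated precondition (items 1, 3, and 6). A minimal sketch of using those preconditions to shortlist strategies follows. The condition labels and the function name are hypothetical, and strategies with no stated precondition are simply treated as always eligible; that is an assumption of the sketch, not a claim that they suit every project.

```python
# Each entry pairs a strategy with the precondition stated for it in the list,
# or None where the list states no precondition.
STRATEGIES = [
    ("Demonstration by existence", "feasibility_in_doubt"),
    ("User studies", None),
    ("Comparing parameter counts", "meaningful_count"),
    ("Case studies", None),
    ("Performance measurement", None),
    ("Benchmark performance", "benchmarks_defined"),
    ("Comparison on the field's customary example", None),
    ("Analytic coverage arguments", None),
    ("Incorporation in existing development processes", None),
    ("Formal analysis", None),
]


def candidate_strategies(conditions: set[str]) -> list[str]:
    """Return strategies whose stated precondition, if any, holds."""
    return [
        name
        for name, precondition in STRATEGIES
        if precondition is None or precondition in conditions
    ]


# Hypothetical project: feasibility was in doubt, and no benchmarks exist.
for name in candidate_strategies({"feasibility_in_doubt"}):
    print(name)
```

The shortlist is only a starting point; choosing among the remaining strategies still depends on the problem and solution characteristics listed earlier.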

This page is part of Mary Shaw's site in the School of Computer Science at Carnegie Mellon University. It was last modified on 11/21/97.