Coding Calibration

Problem
When analyzing qualitative data (e.g., interview results and insight reports), multiple coders (or raters) often code the data to impose structure on a large, unstructured corpus. Unless the coding scheme is determined by prior literature (rare in visualization) or open coding (Creswell, 1997) is used, the coders must construct a coding scheme while analyzing the data (closed coding). This process is iterative and often forces painful re-coding of the entire dataset whenever the coding scheme changes.

Solution
Hold frequent meetings to calibrate the coding scheme while coding randomly selected subsets (about 10%) of the data. During each calibration meeting, the coders' codebooks should be compared and discrepancies between their results discussed. The discussion often leads to refinements of the codebook, and the clarified definitions should be recorded in a shared document. Calibration meetings should continue until no major disagreements remain; even after the coding scheme has stabilized, a new meeting should be called whenever a coder identifies an unclear case. Once the scheme is stable, inter-coder reliability (Tinsley and Weiss, 1975) can be calculated to clarify definitions and prevent minor errors, although high inter-coder reliability cannot guarantee that all coders analyze the data similarly (Armstrong et al., 1997).
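
As a concrete illustration of the reliability check mentioned above, the following sketch computes Cohen's kappa for two coders. Cohen's kappa is one common inter-coder reliability statistic; the pattern itself does not prescribe a specific measure, and the labels and data here are hypothetical.

from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labeled the same items."""
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    # Observed agreement: fraction of items both coders labeled identically.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Expected chance agreement, from each coder's marginal label frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two coders to ten interview excerpts.
coder_a = ["insight", "insight", "usability", "insight", "other",
           "usability", "insight", "other", "usability", "insight"]
coder_b = ["insight", "usability", "usability", "insight", "other",
           "usability", "insight", "insight", "usability", "insight"]
print(f"kappa = {cohens_kappa(coder_a, coder_b):.2f}")  # kappa = 0.66

Unlike raw percent agreement (0.80 here), kappa discounts the agreement expected by chance given how often each coder uses each code, which is why it is preferred for reporting reliability after calibration.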

Consequences
While it may require an additional investment of effort, Coding Calibration ultimately saves resources by establishing a stable coding scheme as early as possible.

Examples
Recent work by Kwon et al. (2012) uses a similar approach and reports on the calibration process to some degree. Such information, including coding schemes, should be described more explicitly in the literature.