Factor Mining

Problem
Choosing an experimental design is key to any user study, but it can be challenging in complex problem domains. The factors that determine how difficult a trial is for a participant to complete successfully may be unknown and hard to control.

Solution
Split the experiment into two phases, where the first phase is an exploratory study used for mining suitable factors, and the second is a straightforward experiment that uses the findings from the first. The exploratory study should use representative trials (possibly generated using Trial Mining). For each trial, calculate each of the metrics that are candidates to be used as factors for the follow-up experiment. When statistically analyzing the results, include all of the candidate metrics in the model and note which ones have a significant main effect on the primary performance measures (for example, completion time or error rate). The significant metrics are the ones you may consider using as factors, and the range of values in the tested trials should give you an indication of which levels to choose for each factor. Interaction effects are also worth including in the model, since they indicate situations where the effect of one metric depends on the value of another.
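
The screening step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in any particular study: it swaps the joint statistical model for a simpler per-metric permutation test on the Pearson correlation between each candidate metric and one performance measure, with a Bonferroni correction for testing several candidates. The names metric_a, metric_b, metric_c, and completion_time are hypothetical, and the trial data is synthetic and generated only so the sketch can run.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation between xs and ys."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_permutations + 1)


def mine_factors(trials, candidate_metrics, performance, alpha=0.05):
    """Return {metric: p_value} for the candidates that pass the screening."""
    ys = [t[performance] for t in trials]
    # Bonferroni correction, since several candidates are tested at once.
    threshold = alpha / len(candidate_metrics)
    significant = {}
    for metric in candidate_metrics:
        xs = [t[metric] for t in trials]
        p = permutation_p_value(xs, ys)
        if p < threshold:
            significant[metric] = p
    return significant


def suggest_levels(trials, metric):
    """Low / medium / high levels taken from the quartiles of the tested values."""
    values = [t[metric] for t in trials]
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {"low": q1, "medium": q2, "high": q3}


# Synthetic exploratory data (an assumption for illustration only):
# completion time depends on metric_a; metric_b and metric_c are pure noise.
rng = random.Random(1)
candidates = ["metric_a", "metric_b", "metric_c"]
trials = []
for _ in range(60):
    a, b, c = rng.random(), rng.random(), rng.random()
    trials.append({
        "metric_a": a,
        "metric_b": b,
        "metric_c": c,
        "completion_time": 2.0 + 5.0 * a + rng.gauss(0, 0.5),
    })

significant = mine_factors(trials, candidates, "completion_time")
levels = {metric: suggest_levels(trials, metric) for metric in significant}
```

A per-metric test like this cannot detect interaction effects and ignores correlations among the candidates, so for a real study a joint model (for example, a regression or mixed-effects model with interaction terms) fitted in a statistics package is the closer match to the procedure described above.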

Consequences
Factor Mining will inevitably add complexity, time, and budget expenditure to a project since it requires an additional phase. Furthermore, in order for the identified factors to be representative, the trials have to be representative as well. This is often problematic: if we knew how to construct a specific trial at a specific level of difficulty, we would likely already know the relevant factors and would not need Factor Mining in the first place. To sidestep this issue, Factor Mining is often used in conjunction with Trial Mining to randomly generate a large number of trials, characterize them, and select representative ones.
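
The generate, characterize, and select loop mentioned above might look roughly like the following Python sketch. Everything in it is a placeholder chosen for illustration: generate_trial produces random 2D point sets rather than real stimuli, mean_pairwise_distance stands in for whatever candidate metric characterizes a trial, and select_representative simply picks trials at evenly spaced ranks of that metric. It is not a description of how Trial Mining is defined elsewhere.

```python
import random


def generate_trial(rng, n_points=10):
    """Hypothetical trial generator: random 2D points in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n_points)]


def mean_pairwise_distance(trial):
    """Hypothetical candidate metric used to characterize a trial."""
    total, count = 0.0, 0
    for i in range(len(trial)):
        for j in range(i + 1, len(trial)):
            (x1, y1), (x2, y2) = trial[i], trial[j]
            total += ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
            count += 1
    return total / count


def select_representative(trials, metric, n_selected):
    """Pick trials at evenly spaced ranks of the metric, lowest to highest."""
    ranked = sorted(trials, key=metric)
    if n_selected == 1:
        return [ranked[len(ranked) // 2]]  # the median trial
    step = (len(ranked) - 1) / (n_selected - 1)
    return [ranked[round(i * step)] for i in range(n_selected)]


# Generate a large pool, characterize it, and keep a small spread of trials.
rng = random.Random(42)
pool = [generate_trial(rng) for _ in range(500)]
selected = select_representative(pool, mean_pairwise_distance, n_selected=5)
```

Selecting by rank covers the whole observed distribution of the metric, including its extremes; with several candidate metrics, the selection would need to cover their joint distribution instead.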

Examples
Factor Mining was recently used in a study on the perception of animated node-link diagrams of dynamic graphs (Ghani et al. 2012). While the readability of static graphs is well understood, this is not true for dynamic graphs (i.e., graphs where edges and vertices appear and disappear over time). The work enumerates a large number of dynamic graph metrics, such as node and edge speed, angular momentum, and topology change, but there exist no results on the relative significance of these candidate metrics. The work therefore ran an exploratory study, which identified node speed and target separation as the important metrics.