Saturday, September 1, 2012

Paper Reading #2: Taming Wild Behavior - The Input Observer for Obtaining Text Entry and Mouse Pointing Measures from Everyday Computer Use


Introduction:
The purpose of this blog entry is to present The Input Observer, a system for measuring real-world text entry and mouse pointing speed and accuracy. “Taming Wild Behavior: The Input Observer for Obtaining Text Entry and Mouse Pointing Measures from Everyday Computer Use” is a scholarly paper by Abigail C. Evans and Jacob O. Wobbrock of the University of Washington. The main goal of The Input Observer is to measure text entry and mouse pointing speed and accuracy “in the wild,” that is, outside of a controlled test environment. Abigail is a research assistant at the University of Washington who specializes in graphic design, web development, and knitting (sweaters, gloves, and such). Jacob is an associate professor at the University of Washington who specializes in human-computer interaction (HCI). A researcher by the name of Hurst performed a similar experiment, tracking mouse movements and text input, but those tests were limited to one application on the computer. The Input Observer, by contrast, gathers input directly from the operating system, so it is not tied to any particular application. In this blog, I will give an overview of Abigail and Jacob’s research paper, cover related work, explain how they evaluated The Input Observer, and share my opinion of their research, findings, and presentation.

Related Work:
One piece of related work on pointing performance, "Accurate measurements of pointing performance from in situ observations," does a pretty good job of measuring pointer input, but it seems to be scripted in the sense that the observation occurs in a controlled program. "Automatically detecting pointing performance" is another example of an in situ study of mouse performance. A slightly different take on pointing performance is considered in "Understanding pointing problems in real world computing environments." Similar studies are done in "Accuracy measures for evaluating computer pointing devices" and "Optimality in human motor performance," as well as in "Goal crossing with mice and trackballs for people with motor impairments" and "Instrumenting the crowd: using implicit behavioral measures to predict task performance." Researchers investigate the problems and advantages of in-the-wild studies in "Being in the thick of in-the-wild studies: the challenges and insights of researcher participation." This topic is also discussed in "Ethnography and participant observation" and "Into the wild: challenges and opportunities for field trial methods."

Summary:
As I mentioned before, the main purpose of The Input Observer is to observe input speed and accuracy in the wild. In normal input speed and accuracy tests, a predetermined script is used to ensure consistent results. Abigail and Jacob believe that the script itself affects the results, because users have to read and copy the test text instead of composing the text as they go. Most of the time in a real-world setting, we invent the text as we type. To truly measure these characteristics in the wild, no scripts could be used, which presented a challenge from the very beginning. To overcome it, the researchers devised an innovative way to measure text input that separates edits from errors. An edit is a change of content: if I type “I went to the supermarket,” but then backspace and write “I went to the bowling alley,” that is an edit, because I was not fixing a grammar or spelling mistake but changing what I wanted to say. Users were not penalized for edits. An error, on the other hand, is a mistake in the typing itself: if I type “I like to splunk,” then backspace partially and correct the misspelled word to get “I like to spelunk,” that is an error. In The Input Observer, errors count against the user. To find usable strings on which to measure input speed and accuracy, The Input Observer records text input until a finalizing key is pressed (punctuation, enter, etc.). The system then checks the string, requiring a minimum length of 24 characters and no edits. If the string passes, the system treats it as one sample and records its input speed and accuracy.
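To make the segmentation step concrete, here is a rough Python sketch of how a keystroke stream could be cut into samples. This is my own illustration, not the authors' code. The 24-character minimum comes from the paper as summarized above; everything else is an assumption: the names `Sample` and `extract_samples`, the set of finalizing keys, the use of "\b" for backspace, and the words-per-minute formula (the conventional one from text entry research, where five characters count as one word). The sketch only counts backspaces; it does not try to tell an edit from an error, which is the hard part the paper addresses.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

MIN_LENGTH = 24                      # minimum sample length, per the post
TERMINATORS = {".", "!", "?", "\n"}  # assumed set of "finalizing" keys
BACKSPACE = "\b"                     # assumed encoding of a backspace event


@dataclass
class Sample:
    text: str        # final text of the segment, after backspaces are applied
    seconds: float   # time from the first to the last key of the segment
    backspaces: int  # raw backspace count (edits and errors not separated)

    @property
    def wpm(self) -> float:
        # Conventional text-entry rate: (characters - 1) / seconds * 60 / 5.
        return (len(self.text) - 1) / self.seconds * 60 / 5


def extract_samples(events: Iterable[Tuple[float, str]]) -> List[Sample]:
    """Split (timestamp_in_seconds, key) events into candidate samples."""
    samples: List[Sample] = []
    buffer: List[str] = []
    start = None
    backspaces = 0
    for timestamp, key in events:
        if start is None:
            start = timestamp          # first key of a new segment
        if key == BACKSPACE:
            backspaces += 1
            if buffer:
                buffer.pop()           # remove the last typed character
            continue
        buffer.append(key)
        if key in TERMINATORS:
            text = "".join(buffer)
            seconds = timestamp - start
            # Keep only segments that are long enough to be a useful sample.
            if len(text) >= MIN_LENGTH and seconds > 0:
                samples.append(Sample(text, seconds, backspaces))
            buffer, start, backspaces = [], None, 0
    return samples
```

For example, feeding in the keystrokes for “Hi there.” followed by a longer sentence would yield a single sample, since the first string falls under the 24-character minimum. A fuller version would also need to drop segments that contain edits, which requires some way of recognizing them.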

Evaluation:
When it came to evaluating the performance of The Input Observer, the team compared two scenarios. First, the team loaded The Input Observer on twelve computers in a test environment and provided a script for users to follow. Next, the team loaded The Input Observer on twelve computers in test users’ homes and did not provide a script. Then, the team used objective quantitative methods to compare the results of the two scenarios. The main point to emphasize about the comparison is that the team did not expect the results to match. They expected the results to differ because they believe people perform differently “in the wild,” when they are composing their own text as they go. That in-the-wild behavior is what the research team hopes to measure with The Input Observer. The results were as follows.





Discussion:
After reading “Taming Wild Behavior: The Input Observer for Obtaining Text Entry and Mouse Pointing Measures from Everyday Computer Use,” I believe that Abigail and Jacob have created a great system that will provide much more useful results than a script-based test of text entry and mouse pointing speed and accuracy. However, I am not happy with their evaluation method, because they compared their results against the very kind of test they were arguing against. The purpose of The Input Observer is to record results in the wild, which means its results should not have to agree with results from a test environment. On the other hand, I do not see a viable alternative for checking their results, because of the wild, unscripted nature of their experiment setting. Therefore, I believe that comparing their results to the test-environment results was the only possible solution.
