Sunday, August 31, 2008

Visual Similarity of Pen Gestures

A. Chris Long, Jr., James A. Landay, Lawrence A. Rowe, and Joseph Michiels

Comments

Nabeel's blog

Summary

The authors propose metrics for evaluating the visual similarity between pen gestures. The limits of human attention and memory make gestures difficult to recall. Their goal is to develop a systematic method for checking the similarity between gestures, so that gestures can be designed to be different enough from one another that users do not confuse them during recall.

The authors conducted two experiments, each with its own gesture set. For the first experiment, the authors wanted to derive metrics for evaluating the similarity between gestures, so they generated a set of gestures that covered a wide variety of shapes and spatial orientations. Participants were shown every possible triad of three different gestures from the set and asked to pick the gesture least similar to the other two.

The authors selected several possible features for comparing similarity. They looked at the 11 features used by Rubine, as well as others derived from examination of the data using a technique called multi-dimensional scaling (MDS). Some of the predictors resulting from the first experiment were curviness, total absolute angle, density, cosine of the angle between the first and last points, and aspect.
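To make a few of these predictors concrete, here is a rough Python sketch of how total absolute angle, aspect, and one density-style measure might be computed from a gesture's point list. This is my own paraphrase of the definitions, not the authors' exact formulations:

```python
import math

def features(points):
    """Rough sketch of a few gesture-similarity features.

    My reading of the definitions, not the authors' exact formulas.
    points: list of (x, y) tuples along the stroke.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    w = max(xs) - min(xs)
    h = max(ys) - min(ys)

    # total stroke length: sum of segment lengths
    length = sum(math.hypot(points[i + 1][0] - points[i][0],
                            points[i + 1][1] - points[i][1])
                 for i in range(len(points) - 1))

    # signed turning angle at each interior point
    angles = []
    for i in range(1, len(points) - 1):
        ax, ay = (points[i][0] - points[i - 1][0],
                  points[i][1] - points[i - 1][1])
        bx, by = (points[i + 1][0] - points[i][0],
                  points[i + 1][1] - points[i][1])
        angles.append(math.atan2(ax * by - ay * bx, ax * bx + ay * by))

    total_abs_angle = sum(abs(a) for a in angles)

    # aspect: deviation of the bounding-box diagonal angle from 45 degrees
    aspect = abs(math.pi / 4 - math.atan2(h, w))

    # one density-style measure: stroke length per unit of bbox diagonal
    diag = math.hypot(w, h)
    density = length / diag if diag else 0.0

    return {"total_abs_angle": total_abs_angle,
            "aspect": aspect,
            "density": density}
```

For a straight horizontal stroke, for instance, the turning angles are all zero, so total absolute angle is 0 and the density measure is 1.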

In the second experiment, the authors wanted to evaluate the predictive power of the metrics derived from the first experiment against new people and gestures, and explore how changing different types of features would affect the results. A new gesture set was derived for the experiment based on these criteria. Some of the resulting predictors for the second experiment were Log(aspect), total absolute angle, and two density metrics.

The authors found that the predictors from experiment 1 predicted better than those from experiment 2. Also, in both experiments the authors noticed that different participants used different features when judging similarity.

Discussion

This work introduces metrics for evaluating the similarity between pen gestures. Developing systematic ways to analyze the similarity between gestures allows developers to design gestures that are dissimilar for commands that are unrelated, and similar gestures for those commands that are related.

I find two faults with this work. The first, as the authors themselves mention, is that they did not let the participants actually draw the gestures. Unforeseen changes in perception may arise when participants are engaged in drawing the gestures. The second is that I do not believe the authors adequately examined research in perception related to this topic. Psychologists have studied the perception of similarity since the early 20th century. Spatial proximity plays a role in how we perceive similarity, and the snapshots of the study show the three gestures close to each other. This could have had an adverse effect on their results.

I think this type of research is incredibly valuable to sketch recognition. My future work would be to run more studies with the issues previously discussed addressed. A continued refinement of the metrics is needed, which can only come with further evaluation and resulting insight.

Specifying Gestures by Example

Dean Rubine

Comments

Ben's blog

Summary

Rubine introduces a new single-stroke gesture recognition algorithm based on statistical pattern recognition, along with a toolkit called GRANDMA for adding gesture recognition to an interactive application. The work is introduced by example through a gesture-based drawing program called GDP that uses GRANDMA.

GRANDMA uses a structure similar to the Model/View/Controller methodology, where controllers handle the input gestures and views are view classes representing the visual objects on screen that allow gesture interaction. GRANDMA contains a gesture designer, which allows a developer to design new gestures and assign them to view classes.

The gesture recognition algorithm breaks a single-stroke gesture down into 13 features. These features include the sine and cosine of the initial angle, the length and angle of the bounding box diagonal, the distance between the first and last points, the sine and cosine of the angle between the first and last points, the total gesture length, and the total angle traversed. The algorithm then classifies the gesture by computing, for each recognizable gesture class, a linear evaluation: the sum of the features, each multiplied by a per-class weight. The class with the maximum evaluation is the recognized gesture.
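As a sketch of what this feature extraction looks like, here is a Python approximation of a few of the geometric features. The exact formulations are my paraphrase of the paper, and the timing-based features (maximum speed, duration) are omitted:

```python
import math

def rubine_features(points):
    """Sketch of a few of Rubine's 13 geometric features.

    My paraphrase of the definitions; timing features are omitted.
    points: list of (x, y) tuples with at least three points.
    """
    x0, y0 = points[0]
    xn, yn = points[-1]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    w = max(xs) - min(xs)
    h = max(ys) - min(ys)

    # initial angle, taken from the first to the third point
    dx, dy = points[2][0] - x0, points[2][1] - y0
    d = math.hypot(dx, dy)
    f1, f2 = dx / d, dy / d               # cos, sin of initial angle

    f3 = math.hypot(w, h)                 # bounding-box diagonal length
    f4 = math.atan2(h, w)                 # angle of the diagonal

    f5 = math.hypot(xn - x0, yn - y0)     # first-to-last distance
    f6 = (xn - x0) / f5 if f5 else 0.0    # cos of first-to-last angle
    f7 = (yn - y0) / f5 if f5 else 0.0    # sin of first-to-last angle

    # total gesture length: sum of segment lengths
    f8 = sum(math.hypot(points[i + 1][0] - points[i][0],
                        points[i + 1][1] - points[i][1])
             for i in range(len(points) - 1))

    return [f1, f2, f3, f4, f5, f6, f7, f8]
```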

To determine the weights for the features, a classical linear discriminator is trained on example feature vectors. The weights are derived from the per-class mean feature vectors and the inverse of an estimate of the common covariance matrix.
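Here is a rough two-feature Python sketch of this training step as I understand it: per-class means, a pooled estimate of the common covariance matrix, weights from its inverse, and classification as the maximum linear evaluation. It is illustrative only, not Rubine's actual implementation:

```python
def train(classes):
    """Sketch of Rubine-style linear-discriminant training for 2 features.

    Illustrative only. classes: {name: [feature_vector, ...]} where each
    feature vector has length 2. Returns per-class weights (w0, w1, w2).
    """
    # per-class mean feature vectors
    means = {c: [sum(f[j] for f in ex) / len(ex) for j in range(2)]
             for c, ex in classes.items()}

    # pooled (common) covariance estimate across all classes
    cov = [[0.0, 0.0], [0.0, 0.0]]
    n = 0
    for c, ex in classes.items():
        for f in ex:
            d = [f[0] - means[c][0], f[1] - means[c][1]]
            for i in range(2):
                for j in range(2):
                    cov[i][j] += d[i] * d[j]
        n += len(ex)
    denom = n - len(classes)
    cov = [[cov[i][j] / denom for j in range(2)] for i in range(2)]

    # invert the 2x2 common covariance matrix
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]

    # weights: w = inv(cov) * mean, with a constant offset term
    weights = {}
    for c, mu in means.items():
        w = [inv[0][0] * mu[0] + inv[0][1] * mu[1],
             inv[1][0] * mu[0] + inv[1][1] * mu[1]]
        w0 = -0.5 * (w[0] * mu[0] + w[1] * mu[1])
        weights[c] = (w0, w[0], w[1])
    return weights

def classify(weights, f):
    """Return the class with the maximum linear evaluation."""
    return max(weights, key=lambda c: weights[c][0]
               + weights[c][1] * f[0] + weights[c][2] * f[1])
```

With two well-separated clusters of example vectors, `classify` picks the cluster a new feature vector falls nearest, in the Mahalanobis sense the pooled covariance defines.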

To deal with ambiguous gestures, the author calculates the probability that the gesture was classified correctly; if that probability falls below 0.95, the gesture is rejected.
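This probability is estimated from the per-class evaluation scores. A small Python sketch of the idea, based on my reading of the paper's formula:

```python
import math

def recognition_probability(scores):
    """Estimated probability that the top-scoring class is correct,
    computed as 1 / sum_j exp(v_j - v_max) over all class scores v_j
    (my reading of Rubine's formula)."""
    v_max = max(scores)
    return 1.0 / sum(math.exp(v - v_max) for v in scores)

def recognize(scores, threshold=0.95):
    """Return the index of the winning class, or None to reject the
    gesture when the estimated probability falls below the threshold."""
    if recognition_probability(scores) < threshold:
        return None
    return scores.index(max(scores))
```

When the top two scores are close, the estimate drops toward 0.5 and the gesture is rejected; a clear winner pushes it toward 1.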

Discussion

This work presents a new algorithm for recognizing gestures specified by example, and a toolkit that makes it easier to add gesture recognition to other applications. Prior gesture recognition systems used hand-coded recognizers. With Rubine's algorithm, a new gesture is recognized by providing the system with a variety of examples of how the gesture can be drawn. No hand-coding is needed.

My biggest fault with this work is the rejection approach. The author says that rejection should not occur when an application supports quick undo. As a user, I would hate having to undo every time the recognizer failed. I would much rather be asked to input the gesture again than have the system do something I did not want. Having to execute an undo also takes me away from my current task, disrupting my flow.

If working further on this algorithm, I would design toward recognizing gestures of more than one stroke. The algorithm already uses the geometry of a single stroke; why not try to do the same for multiple strokes? Difficult, definitely, but potentially worth the effort.

Thursday, August 28, 2008

Introduction to Sketch Recognition

Tracy Hammond and Kenneth Mock

Comments on others

daniel's blog

Summary

The authors present an overview of pen-based interactive systems and applications for these systems. The paper begins with a summary of the technology used in pen-based interaction. Passive digitizers allow the use of any stylus, including one's finger, but suffer from vectoring (unintended triggers when, for example, a palm brushes the digitizer), require touch before recognition, and have lower resolution and accuracy than active digitizers. An active digitizer needs a special stylus, but this eliminates the vectoring and required-touch problems associated with passive digitizers.

The paper also outlines pen-input software features across operating systems; Microsoft Windows has the largest feature set.

The authors compare and contrast the use of large screen displays such as SMART Board versus smaller TabletPC displays. Large displays offer more screen real estate and allow displaying information to multiple people without the need for individual displays. The TabletPC allows for greater accuracy and flexibility in movement.

Several applications of sketch recognition are presented. A few of these are:

  • ChemPad: converts sketched chemical diagrams to 3-D models

  • LADDER MechEng: recognizes and simulates hand-drawn mechanical engineering diagrams

  • LADDER FSM: lets users draw finite state machines and run inputs through them

The process of using LADDER and the GUILD system to build a new sketch interface is outlined. The domain-specific information is defined using LADDER, and GUILD automatically generates a system for recognizing sketches in that domain.

The paper concludes with two case studies and a future work section. The case studies illuminate the advantages of a TabletPC-based lecture, pointing to higher student involvement and better attention spans.

Discussion

The contribution of this paper (or fragments of a book) is an overview of sketch recognition technologies and their applications. Also, the paper points out the benefits of using sketch recognition technologies in an educational environment.

As this paper is an overview, it is difficult, without significant knowledge of the field, to point out possible faults. I had difficulty reading it at first because of its fragmentation, but only until I realized it was not a continuous document.

Future work to pursue would include covering more related hardware (possibly a brief discussion of multi-touch and a comparison), as well as evaluating the systems described to further emphasize the value of sketch recognition.

Wednesday, August 27, 2008

Sketchpad: A Man-machine Graphical Communication System

Ivan E. Sutherland

Comments on others

ben's blog

Summary

Sutherland presents a new (in 1963) graphical communication system called Sketchpad that uses a pen interface instead of a keyboard. Using a light pen and a set of push buttons, a person can create drawings on a computer using Sketchpad.

Sketchpad uses a ring structure to store relationships between elements in a drawing. Elements are structured in a hierarchy where ancestors are more generic than their descendants. This allows for separation of generic and element-specific code, not unlike modern OOP. Sketchpad supports the addition of new element types as well.

Sketchpad supports the display of not only graphical elements, but also abstractions. An example of an abstraction is a constraint block, which is a rule that specifies certain values must be maintained (e.g. making lines parallel). By visualizing these abstractions, Sketchpad allows the user to make changes to them.

Several atomic operations, controlled by the push buttons, provide for the creation of new drawing elements in the display. One of these operations, the copy function, lets the user create a new instance of an existing element, referred to as a "definition picture." Definition pictures can contain "attachers," which are used to relate the definition picture to other elements. Copied instances are linked to each other, so a change to one affects the others. Using this copy functionality, large patterns can easily be created and modified.

Sutherland used Sketchpad for a number of different applications, including linkages, bridge structural diagrams, animation, and electrical circuit diagrams.

Discussion

This paper is obviously significant for being the first to use a pen interface, a groundbreaking achievement that was sadly not followed up on until much later. This work combines human drawing with the computer's mathematical computation. Sketchpad allows a person to apply real-world constraints to a design drawing in a way that isn't possible with pen and paper.

I found two faults with this work. The first is using a flick to terminate a drawing operation; a flick being, as described in the paper, a movement too quick for the tracking program to follow. Since the system uses the pen-and-paper metaphor, I think termination should be done by moving the pen away from the display, though perhaps hardware limitations prevented that. The second is the use of push buttons. I realize hardware limitations may have necessitated this design decision, but that type of interaction does not match the fluidity I feel when just using pen and paper. It does remind me of some recent work I've seen that uses bimanual pen and direct-touch interaction. Not the same idea, but there are similarities in using both hands with a pen in one.

My one question is, "Where's the user study?" Perhaps it is just my HCI background that begs this question. Not that the paper needs a user study; it's innovative, and that's enough of a reason for writing about it. But it seems more people might have pursued this approach had some evaluation shown that this work was an improvement over other interfaces.

The future work for me is fixing the problems described and evaluating the system.

9 Questions

andrew at ecologylab dot net
1st year Ph.D.

Why are you taking this class?

I'm very interested in designing systems that use a more natural and fluid interface than the mouse and keyboard. Particularly, I want to design systems that promote creativity and design. It is my belief that sketch recognition can help in the creation of such systems.

What experience do you bring to this class?

My background is in HCI, more specifically Human-centered Computing (HCC). I've done a healthy dabble in the arts and design. I have familiarity with the humanities.

What do you expect to be doing in 10 years?

Hopefully, not sitting on my ass wondering where my life went wrong in the last 10 years. Ha. No, I expect to be doing fun, exciting, and innovative research in either academia or industry. I think academia will give me more freedom, but I would like to spend a little time in industry before that (a cliché answer, meaning I don't know whether I want academia or industry).

What do you think will be the next biggest technological advancement in computer science?

The end of Microsoft, or the multi-touch MacBook. Both I'm excited for. Seriously, I'd like to see something in wearable computing happen. I'd love to be able to place an iPhone in my pocket, and be able to access features of it from my shirt sleeve such as calendar, google maps, music, and so on.

What was your favorite course in undergrad (CS or otherwise)?

There were three I really liked:
  1. Design Communication Foundations: a fantastic design course that broadened my design skills.
  2. Structures of Interactive Information: really exposed me to the ways information can be represented.
  3. Photography: b&w film; shooting, developing, and making prints.

If you could be another animal, what would it be and why?

Pterodactylus - I like dinosaurs, and flying seems like fun.

What is your favorite motto or slogan?

"Everyone wants to be Cary Grant. Even I want to be Cary Grant."
- Cary Grant

What is your favorite movie?

It's almost impossible to pick a favorite. It will always depend on what day you ask me. Today, it's Amelie.

Name some interesting fact about yourself.

I played in a metal band in high school. We weren't very good.