Monday, September 29, 2008

GLADDER: Combining Gesture and Geometric Sketch Recognition

Paul Corey and Tracy Hammond

Summary

The authors introduce a sketch recognition algorithm that integrates two recognizers: a modified Rubine recognizer, which is a feature-based linear classifier, and the LADDER recognizer, which is a geometric, hierarchical classifier. The approach first runs the Rubine classifier on a stroke and calculates the Mahalanobis distance to the nearest Rubine class. If this distance is below a threshold, the Rubine classification is used; otherwise, the stroke is passed to the LADDER classifier. The evaluation showed a slight improvement for the integrated approach compared with each classifier individually.
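
To make the control flow concrete, below is a minimal Python sketch of the dispatch step as I understand it. The threshold value and the helper interfaces (extract_features, ladder_recognizer, the layout of rubine_classes) are my own assumptions for illustration, not the authors' code.

```python
import numpy as np

# Hypothetical hybrid dispatcher; the threshold and interfaces are assumed.
MAHALANOBIS_THRESHOLD = 3.0  # placeholder cutoff, not the paper's value

def mahalanobis(features, class_mean, class_cov_inv):
    """Mahalanobis distance from a feature vector to one Rubine class."""
    diff = features - class_mean
    return float(np.sqrt(diff @ class_cov_inv @ diff))

def classify(stroke, rubine_classes, ladder_recognizer, extract_features):
    """Try the gesture (Rubine) classifier first; fall back to LADDER."""
    # rubine_classes: {label: (mean_vector, inverse_covariance_matrix)}
    features = extract_features(stroke)
    best_label, best_dist = None, float("inf")
    for label, (mean, cov_inv) in rubine_classes.items():
        d = mahalanobis(features, mean, cov_inv)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist < MAHALANOBIS_THRESHOLD:
        return best_label                       # confident gesture match
    return ladder_recognizer.recognize(stroke)  # geometric fallback
```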

Discussion

Integrating two distinct recognizers, each suited to handling different types of sketches and strokes, is advantageous, as the results indicated. Future work could include investigating alternate hybrids built from different classifiers, as well as determining a method to obtain comparable confidence values from Rubine and LADDER to help better choose which classification to use.

A Domain-Independent System for Sketch Recognition

Bo Yu and Shijie Cai

Summary

The authors present a domain-independent sketch recognition system. The system has two stages: an imprecise stroke approximation stage and a post-processing stage. In the first stage, curvature and direction are calculated for each point. Feature area is used to direct recognition and verify results; the feature area of a stroke is the area enclosed between the stroke's points and a reference object.

The system combines a corner finding approach with a primitive shape approximation approach. First, a check is made to see if the stroke can be approximated by any primitive shape. If it can, the process stops; otherwise, the stroke is divided at its highest curvature point, and the process continues on the two new segments.
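
A rough sketch of that split-and-test loop, under my own assumptions: fits_primitive and curvature below stand in for the paper's primitive tests and curvature computation, and the minimum segment length is a guard I added.

```python
# Hedged illustration of recursive splitting at the highest-curvature point.
def recognize_stroke(points, fits_primitive, curvature, min_points=5):
    """Return the list of primitives that approximate one stroke."""
    shape = fits_primitive(points)
    if shape is not None:
        return [shape]                    # a primitive fits; stop here
    if len(points) < min_points:
        return []                         # too short to split further
    # Otherwise split at the interior point of highest curvature and recurse.
    curvatures = [curvature(points, i) for i in range(1, len(points) - 1)]
    split = curvatures.index(max(curvatures)) + 1
    left = recognize_stroke(points[:split + 1], fits_primitive, curvature, min_points)
    right = recognize_stroke(points[split:], fits_primitive, curvature, min_points)
    return left + right
```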

Line segments are approximated using a least-squares best-fit line and a check that the deviation between the stroke and the best-fit line is small. If it is, the feature area is used to validate the candidate line. Curve segments are approximated by checking whether the direction graph can be approximated as a line, and feature area is once again used to validate candidate curves.
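
As an illustration of the line test, here is a small least-squares check I might write; the deviation threshold is an assumption, and the paper's feature-area validation would follow as a second step.

```python
import numpy as np

# Hedged sketch: accept a segment as a line if the mean perpendicular
# deviation from its least-squares best-fit line is small.
def is_line(points, max_avg_deviation=2.0):
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    if np.ptp(x) < 1e-9:                     # near-vertical stroke
        return np.mean(np.abs(x - x.mean())) < max_avg_deviation
    slope, intercept = np.polyfit(x, y, 1)   # least-squares fit y = m*x + b
    deviation = np.abs(slope * x - y + intercept) / np.hypot(slope, 1.0)
    return deviation.mean() < max_avg_deviation
```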

In the post-processing stage, which occurs after the sketch has been completely drawn, false and redundant objects are removed, strokes are beautified, and domain-independent objects are recognized.

The system has a user interface to help users obtain their desired drawings. A modal toolbox exists for the creation, modification, and deletion of primitive shapes, along with command gestures for copy, delete, undo, and redo.

In the authors' evaluation, the system achieved 98% accuracy for primitive shapes and polylines, and 70% accuracy for hybrid shapes with smooth connections.

Discussion

The incorporation of corner detection with primitive approximation seems beneficial. Corner finding algorithms are still not perfect, so any added assistance in segmenting strokes is useful. The use of command gestures is nice; however, having to use the modal toolbox to enter modification mode is not. I understand the complexity of distinguishing gesture commands from actual sketching, and an interesting piece of future work would be to investigate ways to separate the two. I could see using FlowMenus for this system instead of a modal toolbox.

Eliminating False Positives During Corner Finding By Merging Similar Segments

Aaron Wolin, Brandon Paulson, and Tracy Hammond

Summary

The authors present MergeCF, a corner finding algorithm that computes an initial set of corners and then removes false positives by merging line segments. The initial corners are determined by calculating the curvature and speed at each point; any point whose curvature is above a threshold and whose speed is below another threshold is deemed a corner. Small line segments are then merged with whichever adjacent line segment yields the least primitive fit error. MergeCF reported higher accuracy for both "correct corners found" and "all or nothing" than the algorithms by Sezgin et al. and Kim and Kim, with a significant accuracy increase in "all or nothing."
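
A simplified sketch of the initial corner-candidate step; the curvature and speed helpers and the average-based scaling factors are my assumptions rather than MergeCF's exact values.

```python
# Hedged illustration: high curvature plus low speed marks a candidate corner.
def initial_corners(points, curvature, speed, curv_factor=1.5, speed_factor=0.75):
    curvs = [curvature(points, i) for i in range(len(points))]
    speeds = [speed(points, i) for i in range(len(points))]
    curv_thresh = curv_factor * (sum(curvs) / len(curvs))
    speed_thresh = speed_factor * (sum(speeds) / len(speeds))
    return [i for i in range(1, len(points) - 1)
            if curvs[i] > curv_thresh and speeds[i] < speed_thresh]
```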

Discussion

Removing false positives is important for obtaining high "all or nothing" accuracies, and Wolin et al. have a simple and quick approach for doing so. Further work could analyze which false positives are not being removed and extend the algorithm to remove them as well, thereby increasing "all or nothing" accuracy.

A Curvature Estimation for Pen Input Segmentation in Sketch-based Modeling

Dae Hyun Kim and Myoung-Jun Kim

Summary

Kim and Kim present a corner finding algorithm that can be used on-the-fly by exploiting local curvature information. The authors first resample the stroke so that its points are equidistant, and then compute a curvature estimate for each resampled point using differential geometry. The local curvature information used consists of local convexity and local monotonicity. A point is considered locally convex when the sign of the curvature does not change across a series of points, and local monotonicity checks whether the curvature values across a series of points are continually decreasing. In the proposed algorithm, local monotonicity must meet a threshold requirement before points are explored as possible corners. Local maxima of positive curvature and local minima of negative curvature are then used to select corners.
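
A hedged sketch of the local convexity test as I read it: a candidate point is locally convex when the curvature sign is constant over a small window of resampled points. The window size here is an assumption, not the authors' value.

```python
# Assumed window of 3 resampled points on each side of the candidate.
def is_locally_convex(curvatures, index, window=3):
    lo = max(0, index - window)
    hi = min(len(curvatures), index + window + 1)
    signs = {1 if c > 0 else -1 for c in curvatures[lo:hi] if c != 0}
    return len(signs) <= 1   # all non-zero curvatures share one sign
```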

Evaluations were conducted to see how different drawing styles affected the algorithm, how different approaches compared (curvature estimation only, local convexity only, local convexity plus local monotonicity, and a bending function from Fu et al.), and how the algorithm performs on some special test cases.

Discussion

The approach is interesting and seems to work well for both polylines and arcs. The on-the-fly aspect seems beneficial in some cases, although in most cases a sketch doesn't need to be recognized until it is fully drawn. Because the authors don't compare their algorithm to other approaches, the evaluation is questionable to me. While their algorithm may perform well on their test set, another approach could perform even better. Without such a comparison, they can't truly validate this approach as an improvement over, or alternative to, other corner finding algorithms.

PaleoSketch: Accurate Primitive Sketch Recognition and Beautification

Brandon Paulson and Tracy Hammond

Summary

Paulson and Hammond present PaleoSketch, a sketch recognition approach that recognizes and distinguishes basic shape primitives within a free-hand sketch. The algorithm begins with a pre-recognition stage where curvature and speed graphs for each stroke are computed. The normalized distance between direction extremes (NDDE) and the direction change ratio (DCR) are also calculated to help differentiate polyline shapes from curved shapes. A number of tests are then run on a stroke to determine what type of primitive it represents; the tests are for line, polyline, ellipse, circle, arc, curve, spiral, helix, and complex shapes. Because there is no comparable error metric across the tests, PaleoSketch uses a hierarchy to determine which shape is the best fit. The hierarchy is based on the minimum number of corners (or line segments) a particular shape is expected to have. The authors reported recognition accuracies above 98%.
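
To make the two pre-recognition features concrete, here is a rough sketch of how NDDE and DCR might be computed; the helper names and details are mine and may differ from the paper's exact definitions.

```python
import math

# Hedged sketches of NDDE and DCR; details may differ from the paper.
def directions(points):
    """Direction (angle) of each consecutive segment of the stroke."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def path_length(points):
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def ndde(points):
    """Normalized distance between direction extremes."""
    dirs = directions(points)
    a, b = sorted((dirs.index(max(dirs)), dirs.index(min(dirs))))
    return path_length(points[a:b + 2]) / path_length(points)

def dcr(points):
    """Direction change ratio: max direction change over average change."""
    dirs = directions(points)
    changes = [abs(d1 - d0) for d0, d1 in zip(dirs, dirs[1:])]
    return max(changes) / (sum(changes) / len(changes))
```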

Discussion

PaleoSketch is accurate and useful for a wide variety of applications. The biggest issue with this research is the use of a hierarchy; developing a comparable error metric for the various tests seems like a critical piece of future work. Currently, adding a new primitive means restructuring the hierarchy. With a common error metric, no hierarchy is needed, and new tests can be added easily as long as the error metric can be computed for them.

Sketch Based Interfaces: Early Processing for Sketch Understanding

Tevfik Metin Sezgin, Thomas Stahovich, and Randall Davis

Summary

Sezgin et al. present a corner finding algorithm based on a stroke's curvature and speed data. The approach involves plotting a curvature graph and a speed graph, and an average-based threshold is applied to each graph to avoid the problems of a fixed threshold. The algorithm then generates a hybrid fit to determine which points of the stroke are corners. This process consists of three steps: computing vertex certainties, generating a set of hybrid fits, and selecting the best fit. The hybrid fit with the fewest vertices whose error falls below a specific threshold is selected as the best fit.
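
A hedged sketch of the selection step: starting from the most certain vertices, grow candidate fits and return the smallest one whose error drops below the threshold. The fit_error helper and the threshold value are assumptions on my part, not Sezgin's exact formulation.

```python
# Assumed: ranked_vertices is ordered from most to least certain.
def best_hybrid_fit(stroke, ranked_vertices, fit_error, error_threshold=10.0):
    fit = []
    for vertex in ranked_vertices:
        fit.append(vertex)                  # add the next most certain vertex
        if fit_error(stroke, fit) < error_threshold:
            return sorted(fit)              # fewest vertices meeting the bound
    return sorted(fit)                      # fall back to all candidate vertices
```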

Discussion

The use of curvature and speed information seems critical to finding corners, especially when extending a polyline approach to strokes with curves as well. While Sezgin reports high accuracy for this approach, the accuracy is based solely on whether the correct corners were found and doesn't penalize additional corners that are not correct. Others have reported that his accuracy decreases greatly under an "all or nothing" metric. This is not to say that Sezgin's algorithm is ineffective, but further research needs to be done to address this issue.

Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or its Caricature

David H. Douglas and Thomas K. Peucker

Summary

The authors sought a method to reduce the number of points needed to record a digitized stroke, and proposed a corner finding algorithm to address this issue. The Douglas-Peucker algorithm uses an anchor point and a floating point. Initially, the anchor point is the first point of the stroke and the floating point is the last. The points in between the anchor and floating points are examined to find the one with the greatest perpendicular distance from the line connecting the anchor and floating points. If that distance is less than a threshold value, the span is determined to be a straight line segment. If the distance is greater than the threshold, the span has a curve in it, and the floating point is moved to the point of greatest distance. The algorithm recursively continues until it finds a straight line segment. Once a straight line is found, the floating point becomes the new anchor point and is marked as a corner, and the floating point returns to the last point. The algorithm repeats until no more corners are found.
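
The recursion described above maps fairly directly to code. Below is a standard recursive sketch of the idea; the threshold is whatever tolerance the application chooses.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    if (ax, ay) == (bx, by):
        return math.hypot(px - ax, py - ay)
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    return num / math.hypot(bx - ax, by - ay)

def douglas_peucker(points, threshold):
    """Return the reduced point list (endpoints plus detected corners)."""
    if len(points) < 3:
        return list(points)
    dists = [perpendicular_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    farthest = dists.index(max(dists)) + 1
    if dists[farthest - 1] <= threshold:
        return [points[0], points[-1]]           # approximately straight
    left = douglas_peucker(points[:farthest + 1], threshold)
    right = douglas_peucker(points[farthest:], threshold)
    return left[:-1] + right                     # keep the split corner once
```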

Discussion

The algorithm is effective for finding corners in polyline strokes. However, it struggles to find corners with highly obtuse angles, since such corners appear as nearly straight lines to the algorithm. The algorithm also recursively iterates over points in the stroke multiple times, which can lengthen running time; it seems a sawtooth line with spread-out teeth would take this algorithm a long time to process.

Monday, September 15, 2008

ShortStraw: A Simple and Effective Corner Finder for Polylines

Aaron Wolin and Tracy Hammond

Summary

ShortStraw is a simple corner finding algorithm for polylines. The algorithm first resamples the points of a stroke to be equidistant, using the approach presented by Wobbrock et al. It then uses a bottom-up pass to find initial corner candidates from the lengths of "straws" in the stroke. The straw length at each point is the Euclidean distance between the two points that form a window around it (±3 points). Any straw shorter than 0.95 times the median straw length marks a corner candidate. The algorithm then uses a top-down pass to remove false positives and find missed corners. It first checks whether each pair of consecutive corners has a line between them; if not, a corner has been missed, so the corner threshold is relaxed and new corners are computed between those two corners. A collinearity check is also done on triplets of corners, and the middle corner is removed if the three are found to be collinear. The algorithm's accuracy, for both "correct corners found" and "all-or-nothing," was greater than that of Sezgin et al.'s and Kim and Kim's algorithms.
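
A minimal sketch of the bottom-up pass, assuming the stroke has already been resampled to equidistant points; the window size and the 0.95 factor follow the description above.

```python
import math

def straw_corners(points, window=3, median_factor=0.95):
    """Return indices of corner candidates found from short 'straws'."""
    straws = {}
    for i in range(window, len(points) - window):
        (x0, y0), (x1, y1) = points[i - window], points[i + window]
        straws[i] = math.hypot(x1 - x0, y1 - y0)   # chord across the point
    lengths = sorted(straws.values())
    threshold = median_factor * lengths[len(lengths) // 2]
    # A short straw means the resampled path bends sharply around that point.
    return [i for i, straw in straws.items() if straw < threshold]
```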

Discussion

The algorithm is incredibly simple, and elegant in that simplicity. The concept of straws works as an effective metaphor for the algorithm, and its accuracy speaks for itself. However, one could argue that the test symbols did not contain an equal number of acute, right, and obtuse angles; they favored right and acute angles, which favor the algorithm. If either Sezgin et al.'s or Kim and Kim's algorithm performs better on obtuse corners, the accuracies might be closer on a more balanced set. Still, I doubt those algorithms handle obtuse angles better, as obtuse angles seem like the most difficult corners to recognize.

The obvious problem with this algorithm, as described by the authors, is the difficulty it has in recognizing obtuse angles. A possible solution could be to add a second, more relaxed threshold for obtuse candidate corners. For each obtuse candidate corner, one of the lines could be extended to create an acute angle opposite the obtuse angle, and a threshold applied to that acute angle: anything greater than the threshold (say, 5 degrees) is considered a corner.

Prototype Pruning by Feature Extraction for Handwritten Mathematical Symbol Recognition

Stephen M. Watt and Xiaofang Xie

Summary

The authors present a handwritten mathematical symbol recognizer that uses prototype pruning by feature extraction. A preprocessing stage first smooths strokes, resamples them, normalizes their size, and chops off heads and tails. The features used include geometric features (number of loops, intersections, and cusps), ink-related features (number of strokes, point density), directional features, and global features. Using these features, a prototype pruning step reduces the number of potential classes for an unrecognized symbol. An elastic matching recognizer is the final step, where the unknown symbol is classified.
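
A hedged illustration of what pruning by discrete features could look like; the stored representation and the exact choice of which features to compare are my assumptions, not the authors' implementation.

```python
# Prototypes whose discrete features disagree with the unknown symbol are
# discarded before the more expensive elastic matching step.
def prune_prototypes(unknown, prototypes, keys=("loops", "intersections", "strokes")):
    return [proto for proto in prototypes
            if all(proto["features"][k] == unknown[k] for k in keys)]
```

Only the surviving prototypes would then be passed on to the elastic matching recognizer.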

The recognizer was evaluated on a test set of 227 symbols. The authors compared their results with those derived by J. Kurtzberg. While the accuracy of the authors' recognizer was 1% lower than Kurtzberg's, the number of pruned prototypes was significantly greater: their approach pruned 85.8% of prototypes, while Kurtzberg's pruned 61.5%.

Discussion

The features selected by the authors are interesting. I can see symbols being distinguished by their number of loops, intersections, and cusps, especially with letters and numbers. The basis for their selection is what the authors say humans use in recognizing symbols, yet they don't present any evidence of this. Not that I necessarily disagree with them, but making this statement without evidence, or at least a reason for the insight, seems bold.

Thursday, September 11, 2008

User Sketches: A Quick, Inexpensive, and Effective Way to Elicit More Reflective User Feedback

Maryam Tohidi, William Buxton, Ronald Baecker, and Abigail Sellen

Summary

Usability testing typically involves interacting with potential users of a system through verbal communication and visual observation. The authors noticed that the feedback received using standard usability techniques is often more reactive than reflective: participants report more criticisms than actual suggestions and improvements. To address this issue, the authors looked to incorporate sketching as a feedback mechanism. They ran a traditional usability study comparing designs for a thermostat, and at the end of the study participants were asked to draw their own thermostat design. The authors found that the sketches pointed out the same issues as the more traditional methods but also introduced more suggestive feedback; participants' sketches contained new ideas or combined ideas from the designs shown earlier in the study.

Discussion

The authors introduce a quick and inexpensive way to collect feedback in a usability study. By being allowed to sketch out ideas, participants are better able to think about and convey design suggestions than through verbal or textual means. I think this is a great idea; issues with visual designs are sometimes better communicated through drawing.

One thing I would suggest for this research is to not only let users create their own designs, but also let them sketch changes to an existing design while using the more traditional usability techniques. Participants in these studies are not experts in UI design and often may not understand the restrictions placed on it. That is not to say their new designs cannot be beneficial, just that new ideas may not solve existing issues and may not be practical.

This work does not directly deal with sketch recognition; however, using sketch recognition techniques to help evaluate participant sketches seems useful. Letting the computer produce quantitative results could further reduce the already small amount of time this approach demands.

Sunday, September 7, 2008

Graphical Input through Machine Recognition of Sketches

Christopher F. Herot

Comments

Manoj's blog

Summary

Herot presents an approach to sketch recognition that infers user intent by measuring how quickly a pen is moved and how hard it is pressed. He introduces a system called HUNCH and discusses several sketch processing features within it (e.g., latching and overtracing), including the problems those features face. HUNCH is able to map 2D sketches into 3D structures. The author points out that an understanding of context helps a person make sense of a drawing, and that computers could benefit from a similar knowledge structure. He then details the HUNCH approach, in which context is specified by the user.

The final design is a more interactive system that incorporates improvements to the features of HUNCH. It is able to recognize lines and curves on the fly based on the speed of the stroke and its "bentness." In HUNCH, the user specified when the processing features were run; in the new system, they run in the background to avoid disrupting the user's design flow.

Discussion

The author has designed a graphical input system for recognizing sketches that seeks to use both machine processing of sketch features and human understanding of context. I think it is better to ask the user about context than to determine it wrongly; however, persistently querying the user also seems wrong, so an adequate balance is needed. Questions about context should be asked "in context," that is, appearing at the point of interaction. Forcing the user to look elsewhere on the screen defeats much of the benefit of asking for feedback. This of course means developing clear, transitory affordances for interacting with the system. It sounds as if the author was headed in this direction, but I am uncertain exactly how the user provided contextual information within the presented system.

Wednesday, September 3, 2008

Gestures without Libraries, Toolkits, or Training: A $1 Recognizer for User Interface Prototypes

Jacob O. Wobbrock, Andrew D. Wilson, and Yang Li

Comments

Manoj's blog

Summary

Wobbrock et al. introduce a new lightweight gesture recognizer that doesn't require training or special libraries and toolkits. The algorithm was designed to work on any development platform and to be simple enough for UI designers to define the gestures it recognizes. The algorithm has four steps:
  1. The point path is resampled so that points on the path are equidistant from each other. This is needed so that the speed at which a gesture is drawn has no effect on recognition.
  2. Rotate the gesture once based on the indicative angle.
  3. Scale the gesture to a reference square. Then translate the gesture so its centroid is at the origin.
  4. Find the optimal angle for comparing the gesture with the templates to obtain the best score.
The authors compared their algorithm with both Rubine's recognizer and a template matcher based on Dynamic Time Warping (DTW). The results showed that their algorithm was more accurate than Rubine's and comparable to DTW. The authors also noted that the $1 recognizer had a greater separation between the first- and second-ranked gesture scores.
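
Step 1 is easy to sketch. Below is a minimal resampling routine in the spirit of the published pseudocode; the variable names and the floating-point guard at the end are mine.

```python
import math

def resample(points, n=64):
    """Resample a gesture path to n roughly equidistant points."""
    path_len = sum(math.hypot(x1 - x0, y1 - y0)
                   for (x0, y0), (x1, y1) in zip(points, points[1:]))
    interval = path_len / (n - 1)
    pts = list(points)
    resampled = [pts[0]]
    accumulated = 0.0
    i = 1
    while i < len(pts):
        (x0, y0), (x1, y1) = pts[i - 1], pts[i]
        d = math.hypot(x1 - x0, y1 - y0)
        if d > 0 and accumulated + d >= interval:
            t = (interval - accumulated) / d
            q = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            resampled.append(q)
            pts.insert(i, q)    # q starts the next segment to be measured
            accumulated = 0.0
        else:
            accumulated += d
        i += 1
    if len(resampled) < n:      # guard against floating-point shortfall
        resampled.append(points[-1])
    return resampled[:n]
```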

Discussion

The contribution of this work is a simple gesture recognizer that does not require extra software, training, or expert knowledge in the field of pattern recognition. The authors developed two implementations of the $1 recognizer, one in C# and one in JavaScript. Its ability to be implemented on a variety of platforms, including lightweight ones such as JavaScript and Flash, seems incredibly valuable.

To handle variability in gestures, the authors use the idea of aliasing, so that multiple templates can be assigned to one visual object (e.g., the arrow example in Figure 7 of the paper). The simplicity of the algorithm probably limits them to this approach, but future work could be directed towards addressing this problem better.

An interesting point to me is that, in the evaluation, the authors asked participants to rate the gestures subjectively. It surprises me that this has not been done in other work we have looked at so far. Getting input from actual users about the quality of gestures seems like an important step in choosing which gestures to recognize. Of course, I realize the majority of the work is about the actual recognition of the gestures and not the gestures themselves.

MARQS: Retrieving Sketches Using Domain- and Style-Independent Features Learned From A Single Example Using a Dual-Classifier

Brandon Paulson and Tracy Hammond

Comments

Daniel's blog

Summary

The authors introduce a new multi-stroke sketch recognition algorithm that uses a dual-classifier, along with a system called MARQS that uses this algorithm to search, by sketch, a collection of photos and music organized into albums.

The algorithm uses different classifiers depending on the number of existing training examples. With only a single example, a simple classifier computes a set of features for the query sketch and compares those features against the sketches in a database; a total error is computed, and the sketches with the lowest errors are returned. Each new sketch query is then added as a training example, and once multiple training examples exist, a linear classifier is used instead.
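
A hedged sketch of the single-example case: rank the stored sketches by total feature error against the query. The particular features and any weighting are assumptions on my part, not the paper's exact formulation.

```python
def feature_error(query_features, stored_features):
    """Sum of absolute differences across the global feature vector."""
    return sum(abs(q - s) for q, s in zip(query_features, stored_features))

def rank_sketches(query_features, database):
    """database: list of (label, features); labels returned by increasing error."""
    scored = sorted((feature_error(query_features, feats), label)
                    for label, feats in database)
    return [label for _, label in scored]
```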

Discussion

The contribution of this work is a new sketch recognition algorithm that requires only a single example to recognize a sketch but improves its accuracy by adding new sketches drawn by the user as training examples. The algorithm is domain-independent and is not affected by orientation, scale, or other user-specific variations.

An issue with this research, noted in the paper, is that overfitting can occur as the number of examples increases. The authors propose a threshold to stop adding new examples. A potential variation would be to only add new examples when doing so improves accuracy: any sketch that doesn't help as an example is thrown away, and old examples are removed when new ones offer an improvement. I'm not sure how this would be implemented, but it could be worth future research.

The idea of a sketch query is nice, as some queries cannot easily be defined by words alone. I could see searching something like the U.S. Patent Office's database this way.