Sunday, November 9, 2008

Grouping Text Lines in Freeform Handwritten Notes

Ming Ye, Herry Sutanto, Sashi Raghupathy, Chengyang Li, and Michael Shilman

Summary

The authors present an approach for grouping text lines in handwritten notes that contain both text and shapes. The approach minimizes a cost function built from two terms: the likelihood that a set of strokes forms a line of text, and the configuration consistency. The likelihood is based on measures from a fitted line: the linear regression error and the maximum inter-stroke distances projected onto the fitted line and onto its orthogonal. The configuration consistency term favors groups of text with similar spatial orientation. It is computed by building a neighborhood graph and grouping connected nodes whose connecting edge is shorter than a threshold. A gradient-descent local optimization method minimizes the cost function.
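To make the two ingredients more concrete, here is a minimal Python sketch of how I picture them. It is not the authors' implementation. The function names (fit_line, max_gap, line_features, group_by_threshold) are my own. The line is fitted by total least squares, which is an assumption, since the paper may use ordinary regression. Inter-stroke distance is taken as the largest gap between stroke extents along each axis, and the neighborhood graph is replaced by a plain all-pairs distance check on one representative point per node.

```python
import math

def fit_line(points):
    """Total least squares fit: returns (centroid, unit direction)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # principal axis angle
    return (cx, cy), (math.cos(theta), math.sin(theta))

def max_gap(intervals):
    """Largest empty gap between sorted (lo, hi) intervals, 0 if they overlap."""
    intervals = sorted(intervals)
    gap, reach = 0.0, intervals[0][1]
    for lo, hi in intervals[1:]:
        gap = max(gap, lo - reach)
        reach = max(reach, hi)
    return gap

def line_features(strokes):
    """strokes: list of strokes, each a list of (x, y) points.
    Returns (rms fit error, max gap along the line, max gap across it)."""
    points = [p for s in strokes for p in s]
    (cx, cy), (dx, dy) = fit_line(points)
    along = lambda p: (p[0] - cx) * dx + (p[1] - cy) * dy
    across = lambda p: -(p[0] - cx) * dy + (p[1] - cy) * dx
    rms = math.sqrt(sum(across(p) ** 2 for p in points) / len(points))
    a = [(min(map(along, s)), max(map(along, s))) for s in strokes]
    c = [(min(map(across, s)), max(map(across, s))) for s in strokes]
    return rms, max_gap(a), max_gap(c)

def group_by_threshold(centers, max_edge):
    """Group nodes connected by edges shorter than max_edge (union-find).
    Assumption: edges are all pairs of centers, not a true neighborhood graph."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) < max_edge:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

A small fit error with small gaps in both directions would suggest the strokes plausibly form one text line. How these measures are combined into the actual likelihood is specific to the paper and is not reproduced here.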

An evaluation was done using data from 600 Windows Journal pages from tens of real TabletPC users. The system was evaluated using the recall metric (number of correct lines divided by number of labelled lines in each page), and achieved a recall of 0.93.
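The recall metric is simple enough to sketch in a few lines of Python. This is only an illustration: the function name page_recall is mine, and I am assuming that a predicted line counts as correct only when its set of strokes exactly matches a labelled line, which the paper may define differently.

```python
def page_recall(predicted, labelled):
    """Recall for one page: correct lines / labelled lines.
    Each line is an iterable of stroke ids. Assumption: a predicted
    line is correct only if it matches a labelled line exactly."""
    truth = {frozenset(line) for line in labelled}
    found = {frozenset(line) for line in predicted}
    if not truth:
        raise ValueError("recall is undefined for a page with no labelled lines")
    return len(truth & found) / len(truth)
```

For example, if a page has two labelled lines and the system recovers one of them exactly while splitting the other into two pieces, the recall for that page is 0.5.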

Discussion

This work achieves high accuracy for grouping lines of text in freehand notes. The approach is quite different from many of the other methods we have looked at, but it is also quite focused in terms of the recognition goal. The authors are looking to find lines of text in freehand notes that already contain large amounts of text (often arranged in single or multiple columns). This method would not work for state labels in finite state machines or for text not arranged in a line, such as text formed around an arc.
