Summary
Bishop et al. present an algorithm for determining whether a stroke in a sketch is text or graphics. The algorithm uses 9 features based on the stroke itself, a total least squares (TLS) fit to the stroke, and fragments of the stroke defined by local maxima in curvature. A multilayer perceptron (MLP) is trained to classify on these features. Given the 9-element feature vector for a stroke, the MLP returns the probability that the stroke is text. The spatial and temporal context of successive strokes is used to further improve the classification. A Hidden Markov Model (HMM) combines the probabilities from the feature-based approach with the probabilities from the context approach. A further variant extends the HMM approach by using the gap between strokes as an additional characteristic for classification. The evaluation found that the non-gap HMM performed better than the gap HMM. The feature-based approach performed best on text but worst on graphics. All approaches struggled with graphics.
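
To make the idea of combining per-stroke probabilities with temporal context concrete, here is a minimal sketch of a two-state HMM decoded with the Viterbi algorithm. It is not the paper's implementation: the function name `viterbi_text_graphics`, the `p_stay` transition probability, and the uniform prior are all assumptions chosen for illustration, and the MLP's output probability is used directly as the emission score, which is a simplification.

```python
import math


def viterbi_text_graphics(p_text, p_stay=0.8, prior_text=0.5):
    """Label a stroke sequence as 'text' or 'graphics' with a 2-state HMM.

    p_text     -- per-stroke probabilities of 'text' (e.g. from a classifier)
    p_stay     -- assumed probability that consecutive strokes share a label
    prior_text -- assumed prior probability that the first stroke is text
    """
    if not p_text:
        return []

    states = ("text", "graphics")
    eps = 1e-12  # guards against log(0)

    def emit(state, p):
        # Simplification: the classifier output stands in for the emission.
        q = p if state == "text" else 1.0 - p
        return math.log(max(q, eps))

    def trans(prev, cur):
        q = p_stay if prev == cur else 1.0 - p_stay
        return math.log(max(q, eps))

    prior = {"text": prior_text, "graphics": 1.0 - prior_text}
    score = {s: math.log(max(prior[s], eps)) + emit(s, p_text[0])
             for s in states}
    back = []  # back[i][s] = best predecessor of state s at stroke i + 1

    for p in p_text[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: score[r] + trans(r, s))
            new_score[s] = score[best_prev] + trans(best_prev, s) + emit(s, p)
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path
```

With an input such as `[0.9, 0.8, 0.45, 0.9]`, thresholding each stroke independently at 0.5 would mark the third stroke as graphics, whereas the context-aware decoding with `p_stay=0.8` keeps it as text because its neighbours are confidently text. Setting `p_stay=0.5` removes the context and reduces the decoder to per-stroke thresholding.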
