andrew's sketch rec blog: November 2008

Interactive Learning of Structural Shape Descriptions from Automatically Generated Near-miss Examples

Tracy Hammond and Randall Davis

Summary

Hammond and Davis present an approach for fixing over- and under-constrained shape definitions written in a shape description language. The authors developed the approach for the LADDER shape language. It requires a positive hand-drawn example and a shape description that properly recognizes that example. Over-constrained descriptions are checked by negating each constraint in turn and constructing a near-miss shape that tests the constraint. Under-constrained descriptions are checked similarly, except that negated constraints are added to the description instead of negating existing ones.
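
To make the two checks concrete, here is a minimal sketch of how the candidate descriptions could be enumerated. It is my own illustration, not the authors' code: the `Constraint` class, the constraint names, and both function names are hypothetical, and the step that actually draws a near-miss shape from each variant is left out.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Constraint:
    name: str              # illustrative constraint name, e.g. "parallel"
    args: Tuple[str, ...]  # names of the shape components involved
    negated: bool = False


def negate(c: Constraint) -> Constraint:
    """Return a copy of the constraint with its negation flag flipped."""
    return replace(c, negated=not c.negated)


def over_constraint_candidates(description: List[Constraint]):
    """For each constraint, build the description with only that one negated."""
    candidates = []
    for i, c in enumerate(description):
        variant = description[:i] + [negate(c)] + description[i + 1:]
        candidates.append((c, variant))
    return candidates


def under_constraint_candidates(description: List[Constraint],
                                extra: List[Constraint]):
    """For each constraint not already present, append its negation."""
    candidates = []
    for c in extra:
        if c not in description:
            candidates.append((c, description + [negate(c)]))
    return candidates
```

Each variant would then be turned into a near-miss shape that tests the constraint in question. My assumption is that the user's judgment of that shape decides whether the constraint is dropped or added.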

Discussion

Having used LADDER, I find the ability to uncover under- and over-constrained descriptions incredibly valuable. It's very easy in LADDER to under-constrain a description.

What Are Intelligence? And Why?

Randall Davis

Summary

Randall Davis presents "definitions" of intelligence and reasons why intelligence exists. The definitions of intelligence come from the five areas of study that artificial intelligence researchers draw on: mathematics (logic), psychology, biology, statistics, and economics. Logicists view intelligence through formal calculations of logical rules. Psychologists view intelligence as human behavior. In biology, intelligence is viewed as stimulus-response behavior based on the physiological architecture. Statistics provides a probability theory approach, and economics presents an approach based on utility theory.

Davis proceeds to explain possible reasons why intelligence exists through an exploration of how the human mind evolved. Fossil records show that the encephalization quotient (ratio of brain size to body size) of human ancestors began to increase over four million years ago. However, early man did not begin developing tools and language skills until 300,000 years ago. He points out a number of theories for why this may be. He also notes that evolution is more of a random search than a goal-oriented process, and that the products of evolution are often messy and multifaceted.

A number of examples of animal intelligence are presented. These examples serve to point out differences in intelligence, while also showing similarities between animal intelligence and human intelligence. The goal is to find ways to uncover aspects of human intelligence by investigating simpler forms of intelligence in animals.

He concludes the paper with an exploration of the idea that we think by "reliving." He presents evidence for how we create concrete visual ideas in our minds: to answer questions about what would happen, we picture how something would play out.

Discussion

I enjoyed this reading. It contains a lot of interesting information about areas that I know only a small amount about. I found it very captivating to look at how different areas look at human intelligence, and to theorize how it came about through evolution-based analysis. My only questions are:

How can we apply these different views of intelligence?
How can we integrate the views?
Are certain AI tasks better suited for specific models of intelligence? In other words, is there one view or combination of views that would work best to address a specific focus topic within artificial intelligence?

The view of human intelligence as formed from a messy layering of evolutionary forces is a nice take that makes sense to me. The complexity of human intelligence comes not only from how advanced it is, but also from how complicated and inefficiently designed it is. It makes me think that looking at simpler animal intelligence is a good idea for building a basis for looking at human intelligence.

Magic Paper: Sketch-Understanding Research

Randall Davis

Summary

This paper presents an overview of the field of sketch understanding and, more specifically, a sketch understanding system called Magic Paper. It points out reasons for developing systems that understand freehand sketches. A number of problems with sketch understanding are described. Solutions from other areas, such as speech recognition, are pointed out, along with explanations of why they are not suitable for sketch recognition. The basics of sketch understanding are presented, including sketch representation, finding primitives, and recognizing shapes. A number of sketch-enabled interfaces are described where sketch understanding is connected to a back-end system such as RationalRose or ChemDraw. Techniques for automated learning of new sketch domains and the difficulties associated with this task are presented.

Discussion

This paper presents a nice overview of the issues of sketch understanding, and why solving these issues is so challenging. The goal of this work is to create "magic paper" which affords the same natural and easy interaction as paper, but is capable of understanding what is drawn on the paper. It would seem advances in both hardware technology and software algorithms are still needed to achieve this ambitious goal, but several big steps have already been taken to get us there. The concept of true "magic paper" seems to be the killer app for sketch understanding. Only when "magic paper" is better than or at least comparable to real paper will sketch understanding find itself deeply seated in the daily lives of humans.

Perceptually Supported Image Editing of Text and Graphics

Eric Saund, David Fleet, Daniel Larner, and James Mahoney

Summary

This paper presents an image editing program called ScanScribe. ScanScribe provides special functionality for selecting and structuring groups. Grouping is represented by a lattice structure where an image object can belong to more than one group. ScanScribe has an image analysis technique for separating foreground and background. ScanScribe uses automatic structure recognition to group elements of an image. The group recognition is based on Gestalt laws of human visual perception. No formal evaluation was conducted, but a number of users reported that the system was easy to learn to use.

Discussion

Our ability as humans to easily differentiate text from shapes, and to form groupings of these different objects, seems a valuable place to start looking for methods to employ with machines. I really like the idea of using laws of perception as a basis for building mathematical calculations of similarity. However, the human eye and mind do not function the same way as a computer; therefore, these laws of perception may not translate as easily to machines. As well, other factors, such as domain and contextual knowledge, play a role in our ability to differentiate shape from text.

Grouping Text Lines in Freeform Handwritten Notes

Ming Ye, Herry Sutanto, Sashi Raghupathy, Chengyang Li, and Michael Shilman

Summary

The authors present an approach for grouping text lines in handwritten notes containing both text and shapes. The approach uses a cost function based on two terms: the likelihood that a set of strokes forms a line of text, and the configuration consistency. Likelihood is based on measures from a fitted line: the linear regression error and the maximum inter-stroke distances projected onto the fitted line and its orthogonal. The configuration consistency looks to form groups of text with similar spatial orientation. This is done by computing a neighborhood graph and grouping connected nodes of the graph whose connecting edge's length is below a threshold value. A gradient-descent local optimization method solves the cost minimization for the function.
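
The fitted-line measures are easier to picture with a small sketch. This is my own reading, not the paper's implementation: I fit the line by total least squares (the principal axis of the points) rather than ordinary regression so that vertical lines also work, and I take the "inter-stroke distance" to be the gap between strokes that are consecutive along the line. The names `fit_line` and `line_features` are made up for illustration.

```python
import math


def fit_line(points):
    """Fit a line by total least squares; return (centroid, unit direction)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Angle of the principal axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (cx, cy), (math.cos(theta), math.sin(theta))


def line_features(strokes):
    """Return (fit error, max gap along the line, max gap across the line).

    `strokes` is a list of strokes, each a list of (x, y) points.
    """
    points = [p for s in strokes for p in s]
    (cx, cy), (dx, dy) = fit_line(points)

    def along(p):   # coordinate of p projected onto the fitted line
        return (p[0] - cx) * dx + (p[1] - cy) * dy

    def across(p):  # signed distance of p from the fitted line
        return -(p[0] - cx) * dy + (p[1] - cy) * dx

    # Root-mean-square perpendicular distance to the line.
    fit_error = math.sqrt(sum(across(p) ** 2 for p in points) / len(points))

    # Per stroke: extent along the line and mean offset across it.
    spans = sorted(
        (min(map(along, s)), max(map(along, s)), sum(map(across, s)) / len(s))
        for s in strokes
    )
    max_gap_along = 0.0
    max_gap_across = 0.0
    reach = spans[0][1]  # farthest point covered so far along the line
    for prev, cur in zip(spans, spans[1:]):
        max_gap_along = max(max_gap_along, cur[0] - reach)
        reach = max(reach, cur[1])
        max_gap_across = max(max_gap_across, abs(cur[2] - prev[2]))
    return fit_error, max_gap_along, max_gap_across
```

Under these assumptions, a well-formed text line should give a small fit error and small gaps in both directions, while strokes from two different lines should show a large gap across the fitted line.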

An evaluation was done using data from 600 Windows Journal pages from tens of real TabletPC users. The system was evaluated using the recall metric (number of correct lines divided by number of labelled lines in each page), and achieved a recall of 0.93.

Discussion

This work achieves high accuracy for grouping lines of text in freehand notes. The approach is quite different from many of the other methods we have looked at, but it is also quite focused in terms of the recognition goal. They are looking to find lines of text in freehand notes, which already contain large amounts of text (often arranged in single or multiple columns). This method would not work for state labels in finite state machines or text not arranged in a line, such as text formed around an arc.

Sketch Recognition for Computer-Aided Design

Christopher F. Herot

Summary

The author presents a sketch recognition system for use in computer-aided design that attempts to infer user intention from information on how the user sketches strokes. Speed of a stroke is used to infer whether a stroke is a line, corner, or curve. Over-traced lines are replaced with a thicker line to show emphasis from the user. Using speed, line length, and density of lines around a point, lines that are meant to be connected but were not drawn that way are joined (latched).
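
As a rough illustration of the speed idea, the sketch below flags slow stretches of a stroke as corner candidates. This is not Herot's algorithm: it assumes that the pen slows down at corners, that each sample is an `(x, y, t)` tuple, and that a fixed fraction of the mean speed is a reasonable cutoff. The function names and the `fraction` parameter are hypothetical.

```python
import math


def speeds(samples):
    """Pen speed over each segment between consecutive (x, y, t) samples."""
    out = []
    for (x0, y0, t0), (x1, y1, t1) in zip(samples, samples[1:]):
        dt = t1 - t0
        out.append(math.hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0)
    return out


def corner_candidates(samples, fraction=0.4):
    """Indices of segments that are local speed minima below a cutoff.

    Segment i runs from samples[i] to samples[i + 1]. The cutoff is
    `fraction` times the mean segment speed (an assumed heuristic).
    """
    v = speeds(samples)
    if not v:
        return []
    threshold = fraction * (sum(v) / len(v))
    return [
        i for i in range(1, len(v) - 1)
        if v[i] < threshold and v[i] <= v[i - 1] and v[i] <= v[i + 1]
    ]
```

For an L-shaped stroke drawn quickly along each arm and slowly around the bend, only the segments near the bend are returned. Telling lines from curves would need more than this, for example a curvature measure.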

Herot concludes the paper with a section detailing how the user needs to be involved in the machine's inference of intention.

Discussion

What is presented in this paper is very similar to the previous Herot reading, but what makes this paper interesting to me is the final section. In the previous paper, he only touched on the concepts that he covers in much more detail here. He mentions the idea of coordinating "two concurrent processes," which is essentially the concept of mixed-initiative interaction. His thoughts and ideas on this correlate closely with work that has come to light over the past 10 years, particularly the idea that the machine forms a model of the user and adjusts this model based on user interaction, while at the same time the user is forming a model of the system and needs methods for providing feedback to help the system mimic the user-perceived model.


Sunday, November 9, 2008
