Daniel's blog

Summary
The authors introduce a new multi-stroke sketch recognition algorithm that uses dual classifiers, and a system called MARQS that uses this algorithm to search by sketch over a collection of photos and music organized in albums.

The algorithm chooses a classifier based on the number of available training examples. When only a single example exists, a simple classifier computes a set of features for the query sketch and compares those features against the sketches in a database. A total error is computed for each comparison, and the sketches with the lowest errors are returned. Each new sketch query is then added as a training example. Once multiple training examples exist, a linear classifier is used instead.
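The single-example search could be sketched roughly as follows. The feature names, the absolute-difference error metric, and the toy database here are my own assumptions for illustration, not details from the MARQS paper:

```python
def total_error(query_features, example_features):
    # Sum of absolute differences between corresponding features
    # (one plausible way to combine per-feature errors into a total).
    return sum(abs(q - e) for q, e in zip(query_features, example_features))

def rank_sketches(query_features, database):
    # database: list of (label, feature_vector) pairs, one example each.
    # Return labels ordered by increasing total error (best match first).
    ranked = sorted(database, key=lambda item: total_error(query_features, item[1]))
    return [label for label, _ in ranked]

# Toy feature vectors (aspect ratio, stroke count, density), purely illustrative.
db = [("house", [1.2, 5, 0.3]),
      ("tree",  [0.6, 3, 0.5]),
      ("car",   [2.0, 7, 0.4])]

print(rank_sketches([0.7, 3, 0.45], db))  # ['tree', 'house', 'car']
```

A query whose features most closely match a stored example ranks that example first, which mirrors the lowest-total-error behavior described above.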
Discussion
The contribution of this work is a sketch recognition algorithm that requires only a single example to recognize a sketch, but improves its accuracy by adding new sketches drawn by the user as training examples. The algorithm is domain-independent and is not affected by orientation, scale, or other user-specific features.

An issue with this research, noted in the paper, is that overfitting can occur as the number of examples increases. The authors propose a threshold to stop adding new examples. A potential variation would be to add a new example only when it improves accuracy: any sketch that doesn't help as an example is thrown away, and old ones are removed when new ones offer an improvement. I'm not sure how this would be implemented, but it could be worth future research.
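One way that accept-or-reject variation might look in code: evaluate accuracy on a fixed test set with and without the candidate, and keep it only if accuracy rises. The 1-nearest-neighbour classifier and feature vectors below are stand-in assumptions, not the paper's classifiers:

```python
def classify(examples, feats):
    # 1-nearest-neighbour over stored (feature_vector, label) examples;
    # a simple stand-in for the paper's single-example classifier.
    best = min(examples, key=lambda ex: sum(abs(a - b) for a, b in zip(ex[0], feats)))
    return best[1]

def accuracy(examples, test_set, classify_fn):
    # Fraction of held-out sketches labelled correctly using `examples`.
    correct = sum(1 for feats, label in test_set
                  if classify_fn(examples, feats) == label)
    return correct / len(test_set)

def maybe_add_example(examples, candidate, test_set, classify_fn):
    # Accept the candidate only if it raises accuracy on the fixed test
    # set; otherwise throw it away, as the proposed variation suggests.
    before = accuracy(examples, test_set, classify_fn)
    after = accuracy(examples + [candidate], test_set, classify_fn)
    return examples + [candidate] if after > before else examples
```

This makes the commenter's cost concern concrete: every candidate requires a full pass over the test set, so growing the example set this way is expensive.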
The idea of a sketch query is appealing, since some queries cannot be easily expressed in words. I could see searching something like the U.S. Patent Office's database this way.

1 comment:
Your idea of keeping only those examples that improve accuracy sounds good, but implementation may not be easy, since accuracy can only be tested against a large amount of data (at least, that's what I can think of right now), and it should be the same set of test data each time, because you want to verify that the improvement in accuracy is really the effect of the new example, not caused purely by a change in the test data. So basically, you would need to run the test data each time a new member joins the example set and compare the result against the previous experiment before deciding whether to keep it, which would require a considerable amount of time.