Tools for the Efficient Generation of Hand-Drawn Corpora Based on Context-Free Grammars
Scott MacLean, David Tausky, George Labahn, Edward Lank, Mirette Marzouk
Sketch Based Interfaces and Modeling, 2009, pp. 125--132.
Abstract: In sketch recognition systems, ground-truth data sets serve to both train and test recognition algorithms. Unfortunately, generating data sets that are sufficiently large and varied is frequently a costly and time-consuming endeavour. In this paper, we present a novel technique for creating a large and varied ground-truthed corpus for hand-drawn math recognition. Candidate math expressions for the corpus are generated via random walks through a context-free grammar, the expressions are transcribed by human writers, and an algorithm automatically generates ground-truth data for individual symbols and inter-symbol relationships within the math expressions. While the techniques we develop in this paper are illustrated through the creation of a ground-truthed corpus of mathematical expressions, they are applicable to any sketching domain that can be described by a formal grammar.
Article URL: http://dx.doi.org/10.2312/SBM/SBM09/125-132
BibTeX format:
@inproceedings{MacLean:2009:TFT,
  author = {Scott MacLean and David Tausky and George Labahn and Edward Lank and Mirette Marzouk},
  title = {Tools for the Efficient Generation of Hand-Drawn Corpora Based on Context-Free Grammars},
  booktitle = {Sketch Based Interfaces and Modeling},
  pages = {125--132},
  year = {2009},
}
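
Illustration: the abstract mentions generating candidate expressions "via random walks through a context-free grammar". The following is a minimal sketch of that general idea, not the authors' implementation. The toy grammar, the names GRAMMAR and random_walk, and the max_depth cutoff are all assumptions made for this example; the paper's actual grammar and sampling strategy are not given in this record.

```python
import random

# Hypothetical toy grammar for illustration only; NOT the grammar from the paper.
# Each nonterminal maps to a list of productions. By convention in this sketch,
# the first production of every nonterminal leads toward a terminal, so it can
# be used as a fallback to force termination.
GRAMMAR = {
    "EXPR": [["TERM"], ["EXPR", "+", "TERM"], ["EXPR", "-", "TERM"]],
    "TERM": [["ATOM"], ["TERM", "*", "ATOM"]],
    "ATOM": [["x"], ["y"], ["2"], ["(", "EXPR", ")"]],
}


def random_walk(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` by choosing productions at random; return terminal tokens."""
    if symbol not in GRAMMAR:
        # Anything that is not a nonterminal is emitted as a terminal token.
        return [symbol]
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Assumed cutoff: past max_depth, only the first (terminating) production
        # is allowed, so the recursion cannot grow without bound.
        productions = productions[:1]
    tokens = []
    for child in rng.choice(productions):
        tokens.extend(random_walk(child, rng, depth + 1, max_depth))
    return tokens


if __name__ == "__main__":
    rng = random.Random(2009)  # fixed seed so a run can be repeated
    for _ in range(3):
        print(" ".join(random_walk("EXPR", rng)))
```

Each printed line is one candidate expression that a human writer could then transcribe. The depth cutoff is one simple way to keep expressions finite; weighting productions by probability would be another option.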