Comparison of complex mathematical notation and applications for searching and plagiarism detection

Abstract:

We present the design and implementation of an end-to-end search engine for mathematical formulae. A query can be entered either as a natural-language expression or as a visual query. It is then converted, using defined presentation and transcription schemes for mathematical notation, into a common form suitable for comparison by means of two predefined word distance measures. As part of a complete solution for the acquisition and processing of mathematical queries, we introduce a novel technique for the unification of special symbols and operators which, as we demonstrate with included examples, allows for more flexible and precise search in specific search scenarios.
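
To make the idea of comparing formulae in a common form concrete, the following is a minimal sketch and not the implementation described here. It assumes a formula has already been transcribed into a list of tokens. The abstract does not name the two distance measures, so Levenshtein (edit) distance is used purely as an illustrative stand-in, and the `UNIFY` table and the function names `unify` and `edit_distance` are hypothetical.

```python
# Hypothetical unification table: maps notational variants of an operator
# to one canonical token. The entries are illustrative assumptions only.
UNIFY = {
    "\u00d7": "*",   # multiplication sign
    "\u00b7": "*",   # middle dot
    "\u22c5": "*",   # dot operator
    "\u00f7": "/",   # division sign
    "\u2212": "-",   # Unicode minus sign
    "\u2264": "<=",  # less-than-or-equal
    "\u2265": ">=",  # greater-than-or-equal
}


def unify(tokens):
    """Replace each token by its canonical form, if one is defined."""
    return [UNIFY.get(token, token) for token in tokens]


def edit_distance(a, b):
    """Levenshtein distance between two token sequences.

    Used here as an assumed example of a word distance measure.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# The same expression, a*b-c, written with different operator glyphs.
query = ["a", "\u00d7", "b", "\u2212", "c"]
candidate = ["a", "\u00b7", "b", "-", "c"]

raw = edit_distance(query, candidate)
unified = edit_distance(unify(query), unify(candidate))
```

In this sketch the two token sequences differ in both operator glyphs before unification and become identical after it, which illustrates why mapping symbol variants to a common form can make distance-based matching more tolerant of notational differences.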