Analyzing Digital Documents Using Isothetic Components

eng02
Algorithm used:
Shyamosree Pal, Partha Bhowmick, Arindam Biswas, and Bhargab B. Bhattacharya.
Understanding Digital Documents Using Gestalt Properties of Isothetic Components,
International Journal of Digital Library Systems, 2010 (accepted).









Left: input
Below: output

(Pink isothetic polygons are recognized as graphics,
bluish isothetic polygons as text.)

We show how Gestalt properties can be used to identify the various components of a document image. The idea that the mind perceives a scene holistically, rather than as a collection of disconnected parts, proves effective for document analysis as well. Since the major constituent components (textual or non-textual) of a document page are arranged in a rectilinear fashion, we first perform an isothetic (axis-parallel) decomposition of the components of the page.
The page is then represented as a feature set of the polygonal covers of its distinct regions of interest. Each polygon is iteratively decomposed into sub-polygons that tightly enclose the corresponding sub-components, so that both the overall information and the necessary details are captured at the desired level of precision. These components and sub-components are then analyzed one by one using Gestalt laws (properties), which are explained in detail in the context of this work.
Text regions, tabular structures, and various graphic objects readily exhibit some of the Gestalt properties, which is why our technique recognizes them successfully. We have tested our algorithm on several benchmark datasets, and some relevant results are presented here to demonstrate the effectiveness and elegance of the proposed method.
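
To make the idea of a cover-based feature set concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it approximates an outer isothetic cover by the union of the g x g grid cells that contain at least one foreground pixel, and the features it reports (area, perimeter, bounding box) are assumed for illustration only. All function names are hypothetical.

```python
def occupied_cells(image, g):
    """Grid cells of side g, as (row, col) pairs, holding a foreground pixel.

    `image` is a list of rows; any truthy value counts as foreground.
    """
    return {(y // g, x // g)
            for y, row in enumerate(image)
            for x, value in enumerate(row) if value}


def connected_regions(cells):
    """Split occupied cells into 4-connected regions (one per cover polygon)."""
    remaining = set(cells)
    regions = []
    while remaining:
        seed = remaining.pop()
        region, stack = {seed}, [seed]
        while stack:
            r, c = stack.pop()
            for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if n in remaining:
                    remaining.remove(n)
                    region.add(n)
                    stack.append(n)
        regions.append(region)
    return regions


def region_features(region, g):
    """Area, perimeter and bounding box (in pixels) of one region's cover."""
    # Every cell side that borders a cell outside the region lies on the boundary.
    exposed = sum(1 for r, c in region
                  for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
                  if n not in region)
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    return {
        "area": len(region) * g * g,
        "perimeter": exposed * g,
        # Bounding box as (x0, y0, x1, y1).
        "bbox": (min(cols) * g, min(rows) * g,
                 (max(cols) + 1) * g, (max(rows) + 1) * g),
    }


def page_feature_set(image, g):
    """Features of every region at grid size g, ordered top to bottom, left to right."""
    regions = connected_regions(occupied_cells(image, g))
    features = [region_features(region, g) for region in regions]
    return sorted(features, key=lambda f: (f["bbox"][1], f["bbox"][0]))
```

Calling page_feature_set(image, 12) on a binarized page would give one feature record per coarse region. A smaller g gives tighter covers at the cost of more, smaller regions.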



An intermediate stage of our algorithm




Recognizing flowcharts in a document page from its set of outer isothetic covers

(a) Input document page.

(b) Set of isothetic polygons {P_j^(12) : j = 1, ..., 12}.

(c) The geometric feature set corresponding to the polygons in P^(12).



(d) Input subset.

(e) Set of isothetic polygons for g=6.

(f) Set of isothetic polygons for g=2.

(g) Set of isothetic polygons for g=1.
(d)-(g): Isothetic covers of the components lying inside the polygon P_3^(12), which corresponds to the flowchart component, and of their sub-components, for different grid sizes. The vertex centers of the polygons are shown as red '+' marks.
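
The effect of the grid size can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the cover is approximated by the union of occupied grid cells, the function names are hypothetical, and the grid sizes 6, 2, 1 are taken from the panels above. For each grid size the sketch counts the 4-connected components and the 90-degree and 270-degree vertices of the cover boundary.

```python
def occupied_cells(image, g):
    """Grid cells of side g, as (row, col) pairs, holding a foreground pixel."""
    return {(y // g, x // g)
            for y, row in enumerate(image)
            for x, value in enumerate(row) if value}


def count_components(cells):
    """Number of 4-connected groups of occupied cells (iterative flood fill)."""
    remaining = set(cells)
    count = 0
    while remaining:
        stack = [remaining.pop()]
        count += 1
        while stack:
            r, c = stack.pop()
            for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if n in remaining:
                    remaining.remove(n)
                    stack.append(n)
    return count


def count_vertices(cells):
    """Return (v90, v270): convex and reflex vertices of the cover boundary.

    A grid point is a 90-degree vertex when exactly one of its four incident
    cells is occupied and a 270-degree vertex when exactly three are. Two
    diagonally opposite occupied cells contribute two 90-degree vertices.
    """
    points = {(r + dr, c + dc) for r, c in cells for dr in (0, 1) for dc in (0, 1)}
    v90 = v270 = 0
    for r, c in points:
        # Incident cells in the order NW, NE, SW, SE.
        around = [(r - 1, c - 1) in cells, (r - 1, c) in cells,
                  (r, c - 1) in cells, (r, c) in cells]
        k = sum(around)
        if k == 1:
            v90 += 1
        elif k == 3:
            v270 += 1
        elif k == 2 and around[0] == around[3]:  # diagonal pair
            v90 += 2
    return v90, v270


def refine(image, grid_sizes=(6, 2, 1)):
    """Map each grid size to (components, v90, v270) for the given sub-image."""
    summary = {}
    for g in grid_sizes:
        cells = occupied_cells(image, g)
        summary[g] = (count_components(cells),) + count_vertices(cells)
    return summary
```

Applied to the sub-image cut out by a coarse polygon, refine shows how a single component at a large grid size separates into several sub-components as g decreases. For a cover without holes, each component has four more 90-degree vertices than 270-degree vertices, which gives a quick sanity check on the counts.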


Recognizing tables


Recognizing bar charts


Recognizing architectural plans





Some other snapshots (Input, Intermediate-I, Intermediate-II, Final)