Document processing system which can selectively extract and process regions of a document

Published: March 28, 2000

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

Patent Claims

18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A document processing apparatus comprising: region extracting means for receiving document information read out from a document containing at least some of at least one text, at least one table, at least one drawing, at least one photography and at least one graph, and extracting a plurality of specific document regions corresponding to at least some of the text, the table, the drawing, the photography and the graph for the document information; region recognition means for recognizing the document regions extracted by said region extraction means in accordance with types of the specific document regions; recognition result display means for displaying images indicating a recognition result from said region recognition means; region edit means for editing the images indicating the recognition result; output image forming means for forming an output image corresponding to an edit result from said region edit means or the recognition result; and image output means for outputting the image formed by said output image forming means, wherein said specific document regions include data regarding a position, size, shape, structure and density distribution.

2. An apparatus according to claim 1, wherein said output image forming means includes means for discarding at least one unnecessary region from the recognized regions.

3. An apparatus according to claim 1, wherein said region recognition means includes means for measuring characteristics of the document regions extracted by said region extracting means to recognize a type or importance of the regions in accordance with predetermined rules.

4. An apparatus according to claim 1, wherein said region extracting means includes labeling processing means for labeling the document information to obtain a plurality of connection components each including a plurality of pixels, merging means for merging the connection components to form one merged element and extract it.

5. An apparatus according to claim 1, wherein said merging means connects the connection components between which there is a minimum distance between the pixels.

6. An apparatus according to claim 1, wherein said region recognition means recognizes the document regions in accordance with the position, size, shape, structure and density distribution.

7. An apparatus according to claim 1, wherein said region recognition means recognizes a region positioned at a corner of an image or an extremely small region as a noise region.

8. A document processing apparatus comprising: region extracting means for receiving document information read out from a document containing at least some of at least one text, at least one table, at least one drawing, at least one photography and at least one graph, and extracting a plurality of specific document regions corresponding to at least some of the text, the table, the drawing, the photography and the graph for the document information; region recognition means for recognizing the document regions extracted by said region extraction means in accordance with types of the specific document regions; output image forming means for dividing the recognized regions into independent images and forming an output image from the independent images in accordance with a recognition result from said region recognition means; and image output means for outputting the image formed by said output image forming means, wherein said specific document regions include data regarding a position, size, shape, structure and density distribution.

9. An apparatus according to claim 8, wherein said output image forming means includes means for discarding at least one unnecessary region from the recognized regions.

10. A document processing apparatus comprising: region extracting means for receiving document information read out from a document containing at least some of at least one text, at least one table, at least one drawing, at least one photography and at least one graph, and extracting a plurality of specific document regions corresponding to at least some of the text, the table, the drawing, the photography and the graph for the document information; region recognition means for recognizing the document regions extracted by said region extraction means in accordance with types of the specific document regions; output image forming means for forming an output image in accordance with a recognition result from said region recognition means; image output means for outputting the image formed by said output image forming means; and image accumulation means for accumulating the image output by said image output means, wherein said specific document regions include data regarding a position, size, shape, structure and density distribution.

11. An apparatus according to claim 10, wherein said output image forming means includes means for discarding at least one unnecessary region from the recognized regions.

12. A document processing apparatus comprising: region extracting means for receiving document information read out from a document containing at least some of at least one text, at least one table, at least one drawing, at least one photography and at least one graph, and extracting a plurality of specific regions corresponding to at least some of the text, the table, the drawing, the photography and the graph for the document information; region recognition means for recognizing the document regions extracted by said region extraction means in accordance with types of the specific document regions; recognition result display means for displaying images indicating a recognition result from said region recognition means; region edit means for editing the image indicating the recognition result; output image forming means for forming an output image corresponding to an edit result from said region edit means or the recognition result; image output means for outputting the image formed by said output image forming means; and image accumulation means for accumulating the image output by said image output means, wherein said specific document regions include data regarding a position, size, shape, structure and density distribution.

13. An apparatus according to claim 12, further comprising similarity measuring means for measuring a similarity between the image output by said image output means and the image accumulated in said image accumulation means, and wherein said image output means uses the similarity measured by said similarity measuring means to determine on the basis of the degree of similarity whether the output image is similar to the image already accumulated in said image accumulation means, thereby inhibiting output of the already accumulated image.

14. An apparatus according to claim 12, further comprising associated data setting means for setting at least one of attribute data of an input image of the image accumulated in said image accumulation means and position data on the input image.

15. An apparatus according to claim 12, further comprising image encoding means for encoding the image data or forming a vector of the image data, and wherein said output image forming means encodes the image data in a region or forms the vector of the image data in accordance with the recognition result from said region recognition means.

16. An apparatus according to claim 12, wherein said output image forming means includes means for discarding at least one unnecessary region from the recognized regions.

17. A document processing method comprising the steps of: (a) extracting a plurality of specific document regions from document information read out from a document containing at least some of at least one text, at least one table, at least one drawing, at least one photography and at least one graph, the specific document regions corresponding to at least some of the text, the table, the drawing, the photography and the graph; (b) recognizing the region extracted in the region extraction step in accordance with types of the specific document regions; (c) displaying images indicating a recognition result obtained in the step (b) of recognizing the region; (d) editing the images indicating the recognition result; and (e) forming an output image corresponding to an edit result obtained in the region edit step or the recognition result, wherein the step (b) of recognizing the region recognizes the region using a position, size, shape, structure and density distribution.

18. A document processing method comprising the steps of: (a) extracting a plurality of specific regions associated with each other from document information read out from a document containing at least some of at least one text, at least one table, at least one drawing, at least one photograph and at least one graph, the specific document regions corresponding to at least some of the text, the table, the drawing, the photograph and the graph; (b) recognizing the region extracted in the region extraction step in accordance with types of the specific document regions; (c) displaying images indicating a recognition result obtained in the step (b) of recognizing the region; (d) editing the images indicating the recognition result; (e) forming an output image corresponding to an edit result obtained in the region edit step or the recognition result; and (f) accumulating the image formed in the step (e) of forming an output image, wherein the step (b) of recognizing the region recognizes the region using a position, size, shape, structure and density distribution.
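As an illustration only, the labeling, merging and noise-rejection steps recited in claims 4, 5 and 7 can be sketched in a few lines of Python. This is not the patent's implementation: the function names (`label_components`, `box_gap`, `merge_boxes`, `is_noise`), the use of bounding boxes as a stand-in for pixel-to-pixel distance, and the thresholds `max_gap`, `min_area` and `corner` are all assumptions chosen for the example, since the claims do not specify them.

```python
from collections import deque

def label_components(img):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected foreground blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def box_gap(a, b):
    """Background pixels separating two boxes (0 if they touch or overlap)."""
    dx = max(a[0] - b[2] - 1, b[0] - a[2] - 1, 0)
    dy = max(a[1] - b[3] - 1, b[1] - a[3] - 1, 0)
    return max(dx, dy)

def merge_boxes(boxes, max_gap):
    """Repeatedly merge any two boxes whose gap is at most max_gap (assumed rule)."""
    boxes = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                if box_gap(a, b) <= max_gap:
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    changed = True
                    break
            if changed:
                break
    return boxes

def is_noise(box, width, height, min_area=4, corner=2):
    """Assumed rule: noise if tiny, or if the box lies wholly inside a page corner."""
    x0, y0, x1, y1 = box
    area = (x1 - x0 + 1) * (y1 - y0 + 1)
    near_x = x1 < corner or x0 >= width - corner
    near_y = y1 < corner or y0 >= height - corner
    return area < min_area or (near_x and near_y)

# Hypothetical 14x10 binary page: a corner mark, two nearby blobs, one speck.
PAGE = [
    "11000000000000",
    "11000000000000",
    "00000000000000",
    "00000000000000",
    "00001101100000",
    "00001101100000",
    "00000000000000",
    "00000000000000",
    "00000000000100",
    "00000000000000",
]
img = [[int(c) for c in row] for row in PAGE]
components = label_components(img)
merged = merge_boxes(components, max_gap=1)
kept = [b for b in merged if not is_noise(b, len(img[0]), len(img))]
print(kept)
```

In this toy page the two blobs one pixel apart are merged into a single region, while the corner mark and the isolated speck are rejected as noise. A fuller treatment would measure the actual minimum pixel distance between components rather than the bounding-box gap, and would also use the structure and density distribution named in the claims.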

Detailed Description

Complete technical specification and implementation details from the patent document.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06V, G06T

Patent Metadata

Filing Date

Unknown

Publication Date

March 28, 2000
