An image processing apparatus includes a character recognizing unit and a font type determining unit. The character recognizing unit is configured to determine a character code of a character in a text of a predetermined process unit in an image. The font type determining unit is configured to determine a font type of the character. In a font type determining process, the font type determining unit (a) determines a font type character by character and (b) sets the font types of all characters in the text of the predetermined process unit to a specific font type if the ratio of the number of characters with that font type to the number of all characters in that text exceeds a predetermined threshold value.
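The ratio rule in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function name `unify_font_types`, the list-of-strings representation of per-character font types, and the default threshold of 0.5 are all assumptions made for illustration. The sketch also assumes that, if more than one font type could exceed a low threshold, the most frequent one is used.

```python
from collections import Counter


def unify_font_types(char_fonts, threshold=0.5):
    """Apply the ratio rule to one process unit (for example, one word).

    char_fonts: per-character font types, as produced by step (a).
    threshold:  hypothetical value standing in for the "predetermined
                threshold value"; the source does not state a number.
    """
    if not char_fonts:
        return []
    total = len(char_fonts)
    # Only the most frequent font needs checking: if any font exceeds
    # the threshold, the most frequent one does too.
    font, count = Counter(char_fonts).most_common(1)[0]
    if count / total > threshold:
        # Step (b): every character in the unit takes the specific font.
        return [font] * total
    # No font exceeds the threshold: keep the per-character results.
    return list(char_fonts)
```

With a threshold of 0.6, a four-character word recognized as three characters in one font and one in another would be unified to the first font, since 3/4 exceeds 0.6. A two-character word split evenly between two fonts would be left unchanged at a threshold of 0.5, because the rule requires the ratio to exceed the threshold.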
Legal claims defining the scope of protection, as filed with the USPTO.
1. An image processing apparatus, comprising: a character recognizing unit configured to determine a character code of a character in a text of a predetermined process unit in an image; and a font type determining unit configured to determine a font type of the character; wherein the font type determining unit performs a font type determining process; and in the font type determining process, (a) determines a font type on a character by character basis and (b) sets as a specific font type font types of all characters in a text of the predetermined process unit if a ratio of the number of characters with the specific font type to the number of all characters in the text of the predetermined process unit exceeds a predetermined threshold value.
2. The image processing apparatus according to claim 1, wherein if no ratios of all font types of characters in a text of the predetermined process unit exceed the predetermined threshold value even though the font type determining unit determines the font types character by character, the font type determining unit sets font types of all characters in the text of the predetermined process unit as a font type with the largest ratio.
3. The image processing apparatus according to claim 1, wherein if the predetermined process unit is word, the font type determining unit (a) performs the font type determining process for a word of three or more characters and a word of one character, and (b) for a word of two characters, sets a font type of the word of two characters as a font type identical to a font type of either a previous word or a next word to the word of two characters if a font type of the previous word and a font type of the next word are identical to each other.
4. The image processing apparatus according to claim 3, wherein for the word of two characters, the font type determining unit (b1) without performing the font type determining process, sets a font type of the word of two characters as a font type identical to a font type of either a previous word or a next word to the word of two characters if a font type of the previous word and a font type of the next word are identical to each other, and (b2) performs the font type determining process if a font type of the previous word and a font type of the next word are different from each other.
5. The image processing apparatus according to claim 1, wherein for a text of a superordinate process unit of the predetermined process unit, the font type determining unit sets as a specific font type font types of all characters in a text of the superordinate process unit if a ratio of the number of texts of the predetermined process unit with the specific font type to the number of all texts of predetermined process unit in the text of the superordinate process unit exceeds a predetermined threshold value.
6. The image processing apparatus according to claim 5, wherein the predetermined process unit is word; and the superordinate process unit is either a line or a block as a set of lines.
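The dependent claims above can be read as a small pipeline: resolve each word (claims 1 and 2), special-case two-character words using their neighbors (claims 3 and 4), then apply the same ratio rule one level up, to the words of a line (claims 5 and 6). The following is only a sketch of that reading under stated assumptions. All names (`dominant_font`, `word_font`, `resolve_words`, `resolve_line`) are hypothetical, words are assumed to be non-empty lists of per-character font types, ties are broken in favor of the font seen first, and a neighbor's font is taken from its own per-word result. The claims do not specify any of these details.

```python
from collections import Counter


def dominant_font(fonts):
    """Return the most frequent font and its share of the items.

    Ties go to the font encountered first (an assumption; the claims
    do not say how ties are resolved).
    """
    font, count = Counter(fonts).most_common(1)[0]
    return font, count / len(fonts)


def word_font(char_fonts):
    # Claims 1 and 2 combined: a font above the threshold wins, and
    # otherwise the font with the largest ratio wins. Under the
    # tie-breaking assumption above, both cases yield the dominant font.
    return dominant_font(char_fonts)[0]


def resolve_words(words):
    """words: list of words, each a non-empty list of per-character fonts."""
    base = [word_font(w) for w in words]
    resolved = list(base)
    for i, w in enumerate(words):
        has_both_neighbors = 0 < i < len(words) - 1
        # Claim 4 (b1): a two-character word takes its neighbors' font
        # when the previous and next words agree. Otherwise (b2) the
        # per-word result already stored in `resolved` is kept.
        if len(w) == 2 and has_both_neighbors and base[i - 1] == base[i + 1]:
            resolved[i] = base[i - 1]
    return resolved


def resolve_line(word_fonts, threshold=0.5):
    """Claim 5: unify a line when one font's share of words exceeds the
    threshold (0.5 is a placeholder, not a value from the source)."""
    if not word_fonts:
        return []
    font, share = dominant_font(word_fonts)
    if share > threshold:
        return [font] * len(word_fonts)
    return list(word_fonts)
```

For simplicity this sketch computes a per-word result for every word up front, including two-character words, whereas claim 4 describes skipping the determining process when the neighbors agree; the resulting font assignments are the same under the assumptions listed. A block made of several lines could be handled by calling `resolve_line` on the flattened list of its word fonts.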
February 21, 2019
October 27, 2020