US-6094506

Automatic generation of probability tables for handwriting recognition systems

Published: July 25, 2000
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Patent Claims
44 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method in a computer system for generating a shape feature probability matrix for use in recognizing handwritten characters, the method comprising: receiving a plurality of sample handwritten characters, each sample handwritten character representing a character and having a sequence of one or more strokes, each stroke represented by one of a plurality of shape features that describes a shape of the stroke; determining for each sample handwritten character a shape feature string that represents that character, the shape feature string having the shape feature of each stroke in the sequence of one or more strokes for that character, each of the shape features in a shape feature string having a place within the shape feature string based on the sequence in which the described stroke was handwritten in the sample handwritten character; for each possible combination of pairs of the plurality of shape features, generating a match count, for all possible pairs of shape feature strings representing the plurality of handwritten characters, of all occurrences of the combination in which one of the shape features of the combination is at a place within one of the pair of shape feature strings, in which the other of the shape features of the combination is at the same place within the other of the pair of shape feature strings, and in which each of the pair of shape feature strings represents the same character; generating a total count, for all possible pairs of shape feature strings representing the plurality of handwritten characters, of all occurrences of the combination in which one of the shape features of the combination is at a place within one of the pair of shape feature strings and in which the other of the shape features of the combination is at the same place within the other of the pair of shape feature strings; calculating a probability value based on the generated match count and the generated total count; and storing the calculated probability value for the combination of the shape features in the shape feature probability matrix, so that the stored probability values can be used to recognize handwritten characters.
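
The counting procedure in claim 1 can be illustrated with a brute-force sketch. This is not the patentee's implementation. The function name `shape_probability_matrix`, the `(character, feature_string)` sample layout, the use of unordered pairs as matrix keys, and the choice to compare only the places that both strings share are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def shape_probability_matrix(samples):
    """samples: list of (character, shape_feature_string) tuples (assumed layout)."""
    match, total = Counter(), Counter()
    # Visit every possible pair of sample feature strings once.
    for (char_a, feats_a), (char_b, feats_b) in combinations(samples, 2):
        # zip stops at the shorter string, so only shared places are compared
        # (an assumption; the claim does not say how unequal lengths are handled).
        for fa, fb in zip(feats_a, feats_b):
            key = tuple(sorted((fa, fb)))  # unordered pair of shape features
            total[key] += 1
            if char_a == char_b:           # both strings represent the same character
                match[key] += 1
    # Probability value = match count / total count for each observed pair.
    return {key: match[key] / total[key] for key in total}
```

With the hypothetical samples `[("A", "12"), ("A", "13"), ("B", "12")]`, the pair of features `("2", "3")` occurs twice at the same place and once between samples of the same character, giving a value of 0.5.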

2. The method of claim 1 wherein each shape feature string has a feature length indicating the number of feature shapes in the shape feature string and wherein generating the total count includes: for each of the plurality of shape features, for each of a plurality of possible places within a feature string, for each of a plurality of possible feature lengths, generating a shape/place/length count of a number of times that the shape feature occurs at the place within a shape feature string having the feature length; and for each possible combination of pairs of the plurality of shape features, for each of the plurality of possible places within a feature string, for each of the possible feature lengths, multiplying the shape/place/length count for one of the shape features of the combination by the shape/place/length count for the other shape feature of the combination for the place and the feature length to generate a product; and accumulating the generated products as the total count for the combination of shape features.
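
Claim 2 replaces pairwise comparison with per-place, per-length tallies whose products are accumulated. A minimal sketch of that idea follows, assuming feature strings are plain Python strings and that the hypothetical function `total_counts` keys its result by sorted feature pairs. The product is taken literally as worded, so for two identical shape features it also counts a string paired with itself.

```python
from collections import Counter, defaultdict

def total_counts(feature_strings):
    # counts[(place, length)][shape] = how often `shape` occurs at `place`
    # in feature strings that have `length` features.
    counts = defaultdict(Counter)
    for feats in feature_strings:
        for place, shape in enumerate(feats):
            counts[(place, len(feats))][shape] += 1

    total = Counter()
    for by_shape in counts.values():
        shapes = sorted(by_shape)
        for i, a in enumerate(shapes):
            for b in shapes[i:]:
                # Multiply the two shape/place/length counts and accumulate.
                total[(a, b)] += by_shape[a] * by_shape[b]
    return total
```

The benefit of this arrangement is that the work grows with the number of samples plus the number of feature combinations, rather than with the number of sample pairs.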

3. The method of claim 1 wherein each shape feature string has a feature length indicating the number of feature shapes in the shape feature string and wherein generating the match count includes: for each possible character, for each of the plurality of shape features, for each of a plurality of possible places within a feature string, for each of a plurality of possible feature lengths, generating a character/shape/place/length count of a number of times that the shape feature occurs at the place within a shape feature string representing the character and having the feature length; and for each possible combination of pairs of the plurality of shape features, for each of the plurality of possible places within a feature string, for each of the possible feature lengths, multiplying the character/shape/place/length count for one of the shape features of the combination by the character/shape/place/length count for the other shape feature of the combination for the place and the feature length to generate a product; and accumulating the generated products as the match count for the combination of shape features for the character; and totaling the accumulated products as the match count for the combination of shape features.

4. The method of claim 3 including sorting the shape feature strings according to the represented character.
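
Claims 3 and 4 together suggest sorting the samples by character and then building the same kind of tallies one character at a time. The sketch below is one possible reading, not the patent's code. The name `match_counts` and the `(character, feature_string)` layout are assumptions.

```python
from collections import Counter, defaultdict
from itertools import groupby

def match_counts(samples):
    """samples: list of (character, shape_feature_string) tuples (assumed layout)."""
    match = Counter()
    # Claim 4: sort the feature strings by the character they represent.
    ordered = sorted(samples, key=lambda s: s[0])
    for _char, group in groupby(ordered, key=lambda s: s[0]):
        # character/shape/place/length counts for this one character
        counts = defaultdict(Counter)
        for _, feats in group:
            for place, shape in enumerate(feats):
                counts[(place, len(feats))][shape] += 1
        for by_shape in counts.values():
            shapes = sorted(by_shape)
            for i, a in enumerate(shapes):
                for b in shapes[i:]:
                    # Product of the two counts, totaled across all characters.
                    match[(a, b)] += by_shape[a] * by_shape[b]
    return match
```

Sorting first means each character's tallies can be discarded once its products have been accumulated, so only one character's counts are held in memory at a time.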

5. The method of claim 1 wherein calculating the probability value includes dividing the generated match count by the generated total count.

6. The method of claim 1 wherein the probability value is calculated according to the following formula: [formula not reproduced in this text; source placeholder ##EQU6##] where the probability is the generated match count divided by the generated total count.

7. The method of claim 1 wherein the probability value is logarithmically derived from the generated match count divided by the generated total count.
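
The exact formula referenced in claim 6 is not reproduced on this page, so the following is only a guess at what a logarithmic derivation might look like. The natural logarithm, the function name `log_probability`, and the `floor` used to avoid taking the logarithm of zero are all assumptions.

```python
import math

def log_probability(match_count, total_count, floor=1e-6):
    # Assumed handling: a pair that was never observed gets the floor value.
    if total_count == 0:
        return math.log(floor)
    probability = match_count / total_count
    # Clamp so that a zero probability does not raise a math domain error.
    return math.log(max(probability, floor))
```

Storing logarithms lets a recognizer add per-stroke values instead of multiplying them, which is consistent with the "highest total" wording used later in claim 16.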

8. A computer readable memory for directing a computer to perform in accordance with the method of claim 1.

9. A method in a computer system for identifying a character represented by an unknown handwritten character, the computer system having a plurality of character prototypes and a shape feature probability matrix, each character prototype representing a character and having a sequence of one or more strokes, each stroke represented by one of a plurality of shape features that describes a shape of the stroke, each character prototype being represented by a shape feature string having the shape feature of each stroke in the sequence, each of the shape features in a shape feature string having a place within the shape feature string based on the sequence in which the described stroke was handwritten in the character prototype, the shape feature probability matrix having a probability value for each possible pair of shape features, the method comprising: receiving a sequence of strokes of the unknown handwritten character; generating an unknown shape feature string to represent the unknown handwritten character based on the received sequence of strokes; for each character prototype, for each place in the unknown shape feature string, selecting the shape feature at the place in the unknown feature string; selecting the shape feature at the same place in the shape feature string of the character prototype; and retrieving the probability value for the selected shape features from the shape feature probability matrix; and combining the retrieved probability values to produce a combined probability value that the unknown handwritten character represents the same character as the character prototype; and selecting the character represented by the character prototype with the highest combined probability value as the identified character.
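
The recognition loop of claim 9 might look roughly like the sketch below. The claim does not say how the retrieved values are combined, so summing logarithms (equivalent to multiplying probabilities) is an assumption, as are the function name `recognize`, the dictionary-based matrix keyed by sorted feature pairs, the `floor` value, and the choice to skip prototypes whose stroke count differs from the unknown character's.

```python
import math

def recognize(unknown, prototypes, matrix, floor=1e-6):
    """unknown: shape feature string; prototypes: list of (character, feature_string)."""
    best_char, best_score = None, -math.inf
    for char, proto in prototypes:
        if len(proto) != len(unknown):
            continue  # assumption: only compare prototypes with the same stroke count
        score = 0.0
        for u, p in zip(unknown, proto):
            # Retrieve the probability value for the pair at this place.
            prob = matrix.get(tuple(sorted((u, p))), 0.0)
            score += math.log(max(prob, floor))  # combine by summing logs
        if score > best_score:
            best_char, best_score = char, score
    return best_char
```

For example, with hypothetical prototypes `[("A", "12"), ("B", "13")]` and a matrix that gives matching features high values, the unknown string `"12"` would be identified as `"A"`.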

10. A method in a computer system for generating a position feature probability table for use in recognizing handwritten characters, the method comprising: receiving a plurality of sample handwritten characters, each sample handwritten character representing a character and having a sequence of one or more strokes, each stroke represented by one of a plurality of position features that describe starting and ending coordinates of the stroke; determining for each sample handwritten character a position feature string that represents that character, the position feature string having the position feature of each stroke in the sequence, each of the position features in a position feature string having a place within the position feature string based on the sequence in which the described stroke was handwritten in the sample handwritten character; for each possible feature distance, each feature distance representing a combined distance between the starting coordinates of a pair of position features and between the ending coordinates of the pair of position features, generating a match count of all occurrences of the feature distance between each pair of position features in all possible pairs of position feature strings in which the pair of position features are at the same place within the pair of position feature strings and in which the pair of position feature strings represent the same character; generating a total count of all occurrences of the feature distance between each pair of position features in all possible pairs of position feature strings in which the pair of position features are at the same place within the pair of position feature strings; calculating a probability value based on the generated match count and the generated total count; and storing the calculated probability value for the feature distance in the position feature probability table, so that the stored probability values can be used to recognize handwritten characters.
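
A position feature table can be built the same way as the shape matrix, keyed by feature distance instead of by feature pair. In the sketch below, the feature distance is taken to be the sum of the squared start-point and end-point distances, which is the form described in claim 25. Integer grid coordinates (so that distances fall into a finite set of values), the `((x, y), (x, y))` stroke layout, and both function names are assumptions.

```python
from collections import Counter
from itertools import combinations

def feature_distance(p, q):
    """p, q: position features as ((start_x, start_y), (end_x, end_y))."""
    (ps, pe), (qs, qe) = p, q
    start = (ps[0] - qs[0]) ** 2 + (ps[1] - qs[1]) ** 2
    end = (pe[0] - qe[0]) ** 2 + (pe[1] - qe[1]) ** 2
    return start + end

def position_probability_table(samples):
    """samples: list of (character, [position_feature, ...]) tuples (assumed layout)."""
    match, total = Counter(), Counter()
    for (char_a, feats_a), (char_b, feats_b) in combinations(samples, 2):
        # Compare position features at the same place in both strings.
        for p, q in zip(feats_a, feats_b):
            d = feature_distance(p, q)
            total[d] += 1
            if char_a == char_b:
                match[d] += 1
    return {d: match[d] / total[d] for d in total}
```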

11. The method of claim 10 wherein calculating the probability value includes dividing the generated match count by the generated total count.

12. The method of claim 10 wherein the probability value is calculated according to the following formula: [formula not reproduced in this text; source placeholder ##EQU7##] where the probability is the generated match count divided by the generated total count.

13. The method of claim 10 wherein the probability value is logarithmically derived from the generated match count divided by the generated total count.

14. A computer readable memory for directing a computer to perform in accordance with the method of claim 10.

15. A method in a computer system for identifying a character represented by an unknown handwritten character, the computer system having a plurality of character prototypes and a position feature probability table, each character prototype representing a character having a sequence of one or more strokes, each stroke represented by one of a plurality of position features that describe starting and ending coordinates of the stroke, each character prototype being represented by a position feature string having the position feature of each stroke in the sequence, each of the position features in the position feature string having a place within the position feature string based on the sequence in which the described stroke was handwritten in the character prototype, the position feature probability table having a probability value for each possible feature distance, each feature distance representing a combined distance between the starting coordinates of a pair of position features and between the ending coordinates of the pair of position features, the method comprising: receiving a sequence of strokes of the unknown handwritten character; generating an unknown position feature string to represent the unknown handwritten character based on the received sequence of strokes; for each character prototype, for each place in the unknown position feature string, selecting the position feature at the place in the unknown position feature string; selecting the position feature at the same place in the position feature string of the character prototype; and retrieving the probability value for the position distance between the selected position features from the position feature probability table; and combining the retrieved probability values to produce a combined probability value that the unknown character represents the same character as the character prototype; and selecting the character represented by the character prototype with the highest combined probability value as the identified character.

16. The method of claim 15 wherein the computer system has a shape feature probability matrix, each stroke represented by one of a plurality of shape features that describes a shape of the stroke, each character prototype being represented by a shape feature string having the shape feature of each stroke in the sequence, each of the shape features in a shape feature string having a place within the shape feature string based on the sequence in which the described stroke was handwritten in the character prototype, the shape feature probability matrix having a probability value for each possible pair of shape features, the method further comprising: generating an unknown shape feature string to represent the unknown handwritten character based on the received sequence of strokes; for each character prototype, for each place in the unknown shape feature string, selecting the shape feature at the place in the unknown shape; selecting the shape feature at the same place in the shape feature string of the character prototype; and retrieving the probability value for the selected shape features from the shape feature probability matrix; and combining the retrieved probability values to produce a combined probability value that the unknown handwritten character represents the same character as the character prototype; and wherein the character represented by the character prototype with the highest total of the combined probability value of the shape feature and of the combined probability value of the position feature is selected as the identified character.
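
Claim 16 selects the prototype with the highest total of its shape-based and position-based combined values. A small sketch of that final step follows, assuming the two combined values have already been computed and are stored in dictionaries keyed by character (one prototype per character, which is a simplification). The name `select_character` is hypothetical.

```python
def select_character(shape_scores, position_scores):
    """Both arguments map a character to its combined value (assumed layout)."""
    # Only characters scored by both feature types can be totaled.
    common = shape_scores.keys() & position_scores.keys()
    totals = {c: shape_scores[c] + position_scores[c] for c in common}
    # Pick the character with the highest total, or None if nothing was scored.
    return max(totals, key=totals.get) if totals else None
```

Adding the two values is natural when they are logarithmically derived, because the sum of logarithms corresponds to a product of probabilities.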

17. A method in a computer system for generating a probability table for use in identifying patterns, the method comprising: receiving a plurality of sample patterns; determining for each sample pattern an associated set of characteristics based on that sample pattern; for each possible pair of sample patterns, comparing the pair of sample patterns to generate a set of comparison values based on the characteristics associated with the pair of sample patterns; and for each possible comparison value, generating a total count of the comparison value in the generated sets of comparison values; generating a match count of the comparison value in the sets of comparison values that are generated from pairs of matching sample patterns; calculating a probability value based on the generated match count and the generated total count; and storing the calculated probability value in the probability table so that the stored probability value can be retrieved using the comparison value, so that the stored probability values can be used to recognize handwritten characters.

18. The method of claim 17 wherein each sample pattern comprises handwritten strokes.

19. The method of claim 17 wherein a characteristic of a sample pattern represents a shape of a stroke of the sample pattern.

20. The method of claim 18 wherein the characteristics in each set are ordered by an order in which the strokes are handwritten.

21. The method of claim 19 wherein each characteristic has a place within the set and wherein a comparison value is a pair of characteristics, a first characteristic of the pair is selected from the set of characteristics associated with a first sample pattern of the pair of sample patterns and a second characteristic of the pair is selected from the set of characteristics associated with a second sample pattern of the pair of sample patterns, wherein the first and second characteristics have the same place within the sets of characteristics from which they are selected.

22. The method of claim 20 wherein the probability table is a shape feature probability matrix.

23. The method of claim 17 wherein a characteristic of a sample pattern represents a pair of coordinates describing a starting point and an ending point of a stroke of the sample pattern.

24. The method of claim 21 wherein the characteristics in each set are ordered by the order in which the strokes are handwritten.

25. The method of claim 22 wherein each characteristic has a place within the set and wherein a comparison value is calculated by adding together the squared distance between the starting points and the squared distance between the ending points of two characteristics, the two characteristics being selected from the same place within the sets of characteristics representing the pair of sample patterns.

26. The method of claim 23 wherein the probability table is a position feature probability table.

27. A computer readable memory for directing a computer to perform the method of claim 17.

28. A method in a computer system for identifying an unknown input pattern comprising handwritten strokes, the computer system having a probability table and a plurality of pattern prototypes, each pattern prototype associated with a set of characteristics, a characteristic representing a shape of a stroke, the probability table having a probability value for each of a plurality of comparison values, the method comprising: determining a set of characteristics for the unknown input pattern; for each pattern prototype, comparing the characteristics of the unknown input pattern with the characteristics of the pattern prototype to produce comparison values; retrieving the probability values for the produced comparison values from the probability table; and combining the retrieved probability values to produce a combined probability that the unknown input pattern matches the pattern prototype; and selecting the pattern prototype with the highest combined probability value as identifying the unknown input pattern, wherein the characteristics in each set are ordered by the order in which the strokes are handwritten and each characteristic has a position within the set of characteristics.

29. The method of claim 28 wherein each stroke has a place within the set of characteristics and wherein a comparison value is generated from a pair of characteristics, a first characteristic of the pair selected from the set of characteristics associated with the unknown pattern and a second characteristic of the pair selected from the set of characteristics associated with the pattern prototype, wherein the first and second characteristics have the same place within the sets of characteristics from which they are selected.

30. The method of claim 29 wherein the probability table is a shape feature probability matrix.

31. A method in a computer system for identifying an unknown input pattern comprising handwritten strokes, the computer system having a probability table and a plurality of pattern prototypes, each pattern prototype associated with a set of characteristics, a characteristic representing a shape of a stroke, the probability table having a probability value for each of a plurality of comparison values, the method comprising: determining a set of characteristics for the unknown input pattern; for each pattern prototype, comparing the characteristics of the unknown input pattern with the characteristics of the pattern prototype to produce comparison values; retrieving the probability values for the produced comparison values from the probability table; and combining the retrieved probability values to produce a combined probability that the unknown input pattern matches the pattern prototype; and selecting the pattern prototype with the highest combined probability value as identifying the unknown input pattern, wherein the characteristic in each set has a place within the set and wherein a comparison value is calculated by adding together the squared distance between the starting points and the squared distance between the ending points of two characteristics, the characteristics selected from the same place within the sets of characteristics representing the unknown pattern and the pattern prototype.

32. The method of claim 31 wherein the probability table is a position feature probability table.

33. A computer system for generating a probability table for use in identifying patterns, the computer system having a plurality of sample patterns, each sample pattern associated with a set of characteristics, comprising: means for comparing each pair of sample patterns by generating a set of comparison values based on the characteristics associated with the pair of sample patterns; means for generating a total count of the comparison values in the generated sets of comparison values for each possible comparison value; means for generating a match count of the comparison values in the sets of comparison values that are generated from pairs of matching sample patterns for each possible comparison value; means for calculating a probability value based on the generated match count and the generated total count for each possible comparison value; and means for storing the calculated probability values in the probability table so that the stored probability value can be retrieved using the comparison values.

34. A computer-readable medium containing instructions for causing a computer system to generate a probability table for use in identifying patterns, by: receiving a plurality of sample patterns; determining for each sample pattern an associated set of characteristics based on that sample pattern; for each possible pair of sample patterns, comparing the pair of sample patterns to generate a set of comparison values based on the characteristics associated with the pair of sample patterns; and for each possible comparison value, calculating a probability value based on a total count of the comparison value in the generated sets of comparison values and a match count of the comparison value in the generated sets of comparison values that are generated from pairs of matching sample patterns; and storing the calculated probability value in the probability table so that the stored probability value can be retrieved using the comparison value and used when identifying an unknown pattern with a set of characteristics, so that the stored probability values can be used to recognize handwritten characters.

35. A computer-readable medium containing a data structure for use by a computer system to identify patterns, each pattern having a set of characteristics, the data structure comprising: for each possible pair of characteristics, a stored probability value that is generated from a total count of that pair of characteristics that occur between each possible pair of sample patterns and from a match count of that pair of characteristics that occur between each possible pair of sample patterns that represent the same pattern, wherein each stored probability value can be retrieved from the data structure by using an indication of the pair of characteristics for that stored probability value, so that the stored probability values can be used to identify patterns.
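
One way to picture the data structure of claim 35 is a symmetric two-dimensional table indexed by characteristic. The sketch below is illustrative only. The dense list-of-lists storage, the default of 0.0 for pairs that were never observed, and the names `build_matrix` and `lookup` are assumptions.

```python
def build_matrix(probabilities, characteristics):
    """probabilities: {(a, b): value}; characteristics: list of all possible values."""
    index = {c: i for i, c in enumerate(characteristics)}
    n = len(characteristics)
    matrix = [[0.0] * n for _ in range(n)]
    for (a, b), value in probabilities.items():
        # Store both orientations so lookup does not depend on argument order.
        matrix[index[a]][index[b]] = value
        matrix[index[b]][index[a]] = value
    return matrix, index

def lookup(matrix, index, a, b):
    # The pair of characteristics is the "indication" used to retrieve the value.
    return matrix[index[a]][index[b]]
```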

36. The computer-readable medium of claim 35 wherein each pattern comprises handwritten strokes.

37. The computer-readable medium of claim 36 wherein a characteristic represents a shape of a stroke.

38. The computer-readable medium of claim 37 wherein the set of characteristics are ordered by the order in which the strokes are handwritten.

39. The computer-readable medium of claim 35 wherein each characteristic has a place within the set and wherein a first characteristic of a pair is selected from the set of characteristics associated with a first sample pattern of the pair of sample patterns and a second characteristic of the pair is selected from the set of characteristics associated with a second sample pattern of the pair of sample patterns, wherein the first and second characteristics have the same place within the sets of characteristics from which they are selected.

40. The computer-readable medium of claim 39 wherein the data structure is a shape feature probability matrix.

41. The computer-readable medium of claim 35 wherein a characteristic of a sample pattern represents a pair of coordinates describing a starting point and an ending point of a stroke of the sample pattern.

42. The computer-readable medium of claim 41 wherein the characteristics in each set are ordered by the order in which the strokes are handwritten.

43. The computer-readable medium of claim 42 wherein each characteristic has a place within the set and wherein a comparison value is calculated by adding together the squared distance between the starting points and the squared distance between the ending points of two characteristics, the two characteristics being selected from the same place within the sets of characteristics representing the pair of sample patterns.

44. The computer-readable medium of claim 43 wherein the data structure is a position feature probability table.

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Automatic generation of probability tables for handwriting recognition systems” (US-6094506). https://patentable.app/patents/US-6094506
