8732183

Comparing Strings of Characters

Published: May 20, 2014
Assignee: not available in USPTO data

Patent Claims
24 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method for comparing character strings, the method comprising: identifying a first character string having a first string length, and a second character string having a second string length greater than the first string length; parsing the first character string into one or more first sub-groups of characters; parsing the second character string into one or more second sub-groups of characters; comparing each of the one or more first sub-groups of characters against the one or more second sub-groups of characters; determining a ratio of a number of characters in the one or more first sub-groups of characters that match the one or more second sub-groups of characters and the second string length; based on the ratio being less than a threshold, that comprises a variable value based on the first string length: parsing the first character string into one or more first groups of trigrams, each of the first groups of trigrams comprising three characters of the first character string; parsing the second character string into one or more second groups of trigrams, each of the second groups of trigrams comprising three characters of the second character string; determining a number of trigram matches between the one or more first groups of trigrams and the one or more second groups of trigrams; determining a second ratio of the number of trigram matches and a sum of the first and second groups of trigrams; and based on either of the first or second ratios being greater or equal to the threshold, preparing at least one of the first or second character strings for display.
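The two-stage comparison in claim 1 can be sketched in Python as follows. This is an illustrative reading, not the patented implementation. It assumes that sub-groups are whitespace-separated tokens and that trigrams are overlapping three-character windows, neither of which the claim specifies. The function names `subgroup_ratio`, `trigrams`, `trigram_ratio`, and `strings_match` are hypothetical, and the threshold is passed in as a plain number.

```python
from collections import Counter


def subgroup_ratio(first: str, second: str) -> float:
    """Matched characters of `first` divided by the length of `second`."""
    # Assumption: a sub-group is a whitespace-separated token.
    second_tokens = set(second.split())
    matched = sum(len(tok) for tok in first.split() if tok in second_tokens)
    return matched / len(second) if second else 0.0


def trigrams(text: str) -> list[str]:
    # Assumption: overlapping three-character windows.
    return [text[i:i + 3] for i in range(len(text) - 2)]


def trigram_ratio(first: str, second: str) -> float:
    """Trigram matches divided by the total trigram count of both strings."""
    t1, t2 = trigrams(first), trigrams(second)
    total = len(t1) + len(t2)
    if total == 0:
        return 0.0
    # Multiset intersection: a shared trigram counts only as often as it
    # occurs in both strings.
    matches = sum((Counter(t1) & Counter(t2)).values())
    return matches / total


def strings_match(first: str, second: str, threshold: float) -> bool:
    if subgroup_ratio(first, second) >= threshold:
        return True
    # Trigrams are consulted only when the first ratio is below the threshold.
    return trigram_ratio(first, second) >= threshold
```

Read literally, the denominator counts the trigrams of both strings, so two identical strings score 0.5 in this sketch. Whether the patent intends that scaling is not stated in the claim.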

2. The method of claim 1, further comprising, based on the ratio being less than the threshold, matching the first character string to at least a first sub-group of the one or more second sub-groups of characters of the second character string; and based on a ratio of first string length and the second string length that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.

3. The method of claim 1, further comprising, based on the ratio being less than the threshold, matching a first sub-group of the first character string with a first sub-group of the second character string; matching a last sub-group of the first character string with a last sub-group of the second character string; and based on a ratio of a combined length of the first and last sub-groups of the first character string and the second string length that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.
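The first-and-last matching of claim 3 might look like the sketch below. It assumes sub-groups are whitespace-separated tokens and that both strings need at least two tokens for the first and last sub-groups to be distinct. Both are assumptions, and `first_last_ratio` is a hypothetical name.

```python
def first_last_ratio(first: str, second: str) -> float:
    """Combined length of the first and last sub-groups of `first`,
    divided by the length of `second`, when both ends match."""
    # Assumption: a sub-group is a whitespace-separated token.
    a, b = first.split(), second.split()
    if len(a) < 2 or len(b) < 2:
        return 0.0  # assumption: require distinct first and last sub-groups
    if a[0] != b[0] or a[-1] != b[-1]:
        return 0.0
    return (len(a[0]) + len(a[-1])) / len(second)
```

For example, "john smith" against "john q smith" matches on both ends, giving (4 + 5) / 12 = 0.75.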

4. The method of claim 1, further comprising, based on the ratio being less than the threshold, determining that the characters of the first character string match first characters of at least a subset of the one or more second sub-groups of characters; determining a number of matched characters being equal to a sum of a number of characters in the subset of the one or more second sub-groups of characters; determining a number of unmatched characters being equal to the second string length minus the number of matched characters; and based on a ratio of the number of matched characters and the number of unmatched characters that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.
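Claim 4 reads like an initials or acronym check. A minimal sketch follows, assuming whitespace-separated sub-groups, case-insensitive comparison, and in-order greedy matching of each character of the first string against the leading character of successive sub-groups. None of these choices comes from the claim, and `initials_ratio` is a hypothetical name.

```python
import math


def initials_ratio(first: str, second: str) -> float:
    """Ratio of matched to unmatched characters of `second` when every
    character of `first` is the first character of some sub-group."""
    if not first:
        return 0.0
    i = 0        # position of the next character of `first` to match
    matched = 0  # total length of the sub-groups whose initial matched
    for tok in second.split():
        if i < len(first) and tok[0].casefold() == first[i].casefold():
            matched += len(tok)
            i += 1
    if i < len(first):
        return 0.0  # not every character of `first` found an initial
    unmatched = len(second) - matched
    # Assumption: treat "nothing unmatched" as an unbounded ratio.
    return matched / unmatched if unmatched else math.inf
```

With "IBM" and "International Business Machines", all 29 letters of the three words count as matched and only the two spaces are unmatched, so the ratio is 14.5.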

5. The method of claim 1, further comprising, based on the ratio of the number of trigram matches and a sum of the first and second groups of trigrams being less than the threshold, determining an edit distance value between the first character string and the second character string, the edit distance comprising a number of edits to the first character string so that the first character string matches the second character string; determining a similarity of the first and second character strings based on a ratio of the edit distance of the first string distance; and based on the similarity being greater or equal to the threshold, preparing at least one of the first or second character strings for display.
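The edit-distance fallback of claim 5 could be sketched as below. The claim does not name a specific edit-distance algorithm, so this example uses Levenshtein distance (insertions, deletions, substitutions) as a stand-in. It also assumes the similarity is 1 minus the ratio of the edit distance to the first string length, floored at 0, which is only one possible reading of the claim wording. The names `edit_distance` and `edit_similarity` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def edit_similarity(first: str, second: str) -> float:
    # Assumption: similarity = 1 - distance / len(first), floored at 0.
    if not first:
        return 0.0
    return max(0.0, 1.0 - edit_distance(first, second) / len(first))
```

For "kitten" and "sitting" the distance is 3, so this reading yields a similarity of 0.5.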

6. The method of claim 1, further comprising: determining the variable value based on the first string length, an upper limit of the threshold, a lower limit of the threshold, and a growth rate of the threshold.

7. The method of claim 6, wherein determining the variable value comprises: determining a complement of the threshold that is equal to a sum of (a) a value that equals the first string length multiplied by a function that equals 1 minus the upper limit of the threshold, and (b) a value that equals a log of a ratio of the first string length and the growth rate of the threshold; and determining the variable value based on a value equal to 1 minus a ratio of the complement and the first string length.
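The threshold computation in claim 7 can be written out as a short function. With L the first string length, U the upper limit, and g the growth rate, the claim gives complement = L * (1 - U) + log(L / g) and threshold = 1 - complement / L. The sketch below assumes a natural logarithm (the claim says only "a log") and clamps the result to the range between the lower and upper limits, since the claim does not say how the limits bound the value. `variable_threshold` is a hypothetical name.

```python
import math


def variable_threshold(length: int, upper: float, lower: float,
                       growth: float) -> float:
    """Length-dependent threshold following the formula in claim 7."""
    if length <= 0 or growth <= 0:
        raise ValueError("length and growth must be positive")
    # Assumption: natural log; the claim does not fix the base.
    complement = length * (1.0 - upper) + math.log(length / growth)
    value = 1.0 - complement / length
    # Assumption: clamp to [lower, upper]; the claim names both limits
    # but does not describe how they bound the result.
    return min(upper, max(lower, value))
```

For example, with a length of 10, an upper limit of 0.9, a lower limit of 0.5, and a growth rate of 1, the complement is 1 + ln 10 ≈ 3.3026 and the threshold is about 0.6697. Because log(L / g) / L shrinks as L grows, the value approaches the upper limit for long strings.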

8. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: identifying a first character string having a first string length, and a second character string having a second string length greater than the first string length; parsing the first character string into one or more first sub-groups of characters; parsing the second character string into one or more second sub-groups of characters; comparing each of the one or more first sub-groups of characters against the one or more second sub-groups of characters; determining a ratio of a number of characters in the one or more first sub-groups of characters that match the one or more second sub-groups of characters and the second string length; based on the ratio being less than a threshold, that comprises a variable value based on the first string length: parsing the first character string into one or more first groups of trigrams, each of the first groups of trigrams comprising three characters of the first character string; parsing the second character string into one or more second groups of trigrams, each of the second groups of trigrams comprising three characters of the second character string; determining a number of trigram matches between the one or more first groups of trigrams and the one or more second groups of trigrams; determining a second ratio of the number of trigram matches and a sum of the first and second groups of trigrams; and based on either of the first or second ratios greater or equal to the threshold, preparing at least one of the first or second character strings for display.

9. The non-transitory computer storage medium of claim 8, wherein the operations further comprise, based on the ratio being less than the threshold, matching the first character string to at least a first sub-group of the one or more second sub-groups of characters of the second character string; and based on a ratio of first string length and the second string length that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.

10. The non-transitory computer storage medium of claim 8, wherein the operations further comprise, based on the ratio being less than the threshold, matching a first sub-group of the first character string with a first sub-group of the second character string; matching a last sub-group of the first character string with a last sub-group of the second character string; and based on a ratio of a combined length of the first and last sub-groups of the first character string and the second string length that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.

11. The non-transitory computer storage medium of claim 8, wherein the operations further comprise, based on the ratio being less than the threshold, determining that the characters of the first character string match first characters of at least a subset of the one or more second sub-groups of characters; determining a number of matched characters being equal to a sum of a number of characters in the subset of the one or more second sub-groups of characters; determining a number of unmatched characters being equal to the second string length minus the number of matched characters; and based on a ratio of the number of matched characters and the number of unmatched characters that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.

12. The non-transitory computer storage medium of claim 1, wherein the operations further comprise, based on the ratio of the number of trigram matches and a sum of the first and second groups of trigrams being less than the threshold, determining an edit distance value between the first character string and the second character string, the edit distance comprising a number of edits to the first character string so that the first character string matches the second character string; determining a similarity of the first and second character strings based on a ratio of the edit distance of the first string distance; and based on the similarity being greater or equal to the threshold, preparing at least one of the first or second character strings for display.

13. The non-transitory computer storage medium of claim 8, wherein the operations further comprise: determining the variable value based on the first string length, an upper limit of the threshold, a lower limit of the threshold, and a growth rate of the threshold.

14. The non-transitory computer storage medium of claim 13, wherein determining the variable value comprises: determining a complement of the threshold that is equal to a sum of (a) a value that equals the first string length multiplied by a function that equals 1 minus the upper limit of the threshold, and (b) a value that equals a log of a ratio of the first string length and the growth rate of the threshold; and determining the variable value based on a value equal to 1 minus a ratio of the complement and the first string length.

15. A system of one or more computers configured to perform operations comprising: identifying a first character string having a first string length, and a second character string having a second string length greater than the first string length; parsing the first character string into one or more first sub-groups of characters; parsing the second character string into one or more second sub-groups of characters; comparing each of the one or more first sub-groups of characters against the one or more second sub-groups of characters; determining a ratio of a number of characters in the one or more first sub-groups of characters that match the one or more second sub-groups of characters and the second string length; based on the ratio being less than a threshold, that comprises a variable value based on the first string length: parsing the first character string into one or more first groups of trigrams, each of the first groups of trigrams comprising three characters of the first character string; parsing the second character string into one or more second groups of trigrams, each of the second groups of trigrams comprising three characters of the second character string; determining a number of trigram matches between the one or more first groups of trigrams and the one or more second groups of trigrams; determining a second ratio of the number of trigram matches and a sum of the first and second groups of trigrams; and based on either of the first or second ratios greater or equal to the threshold, preparing at least one of the first or second character strings for display.

16. The system of claim 15, wherein the operations further comprise, based on the ratio being less than the threshold, matching the first character string to at least a first sub-group of the one or more second sub-groups of characters of the second character string; and based on a ratio of first string length and the second string length that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.

17. The system of claim 15, wherein the operations further comprise, based on the ratio being less than the threshold, matching a first sub-group of the first character string with a first sub-group of the second character string; matching a last sub-group of the first character string with a last sub-group of the second character string; and based on a ratio of a combined length of the first and last sub-groups of the first character string and the second string length that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.

18. The system of claim 15, wherein the operations further comprise, based on the ratio being less than the threshold, determining that the characters of the first character string match first characters of at least a subset of the one or more second sub-groups of characters; determining a number of matched characters being equal to a sum of a number of characters in the subset of the one or more second sub-groups of characters; determining a number of unmatched characters being equal to the second string length minus the number of matched characters; and based on a ratio of the number of matched characters and the number of unmatched characters that is greater or equal to the threshold, preparing at least one of the first or second character strings for display.

19. The system of claim 15, wherein the operations further comprise, based on the ratio of the number of trigram matches and a sum of the first and second groups of trigrams being less than the threshold, determining an edit distance value between the first character string and the second character string, the edit distance comprising a number of edits to the first character string so that the first character string matches the second character string; determining a similarity of the first and second character strings based on a ratio of the edit distance of the first string distance; and based on the similarity being greater or equal to the threshold, preparing at least one of the first or second character strings for display.

20. The system of claim 15, wherein the operations further comprise: determining the variable value based on the first string length, an upper limit of the threshold, a lower limit of the threshold, and a growth rate of the threshold.

21. The system of claim 20, wherein determining the variable value comprises: determining a complement of the threshold that is equal to a sum of (a) a value that equals the first string length multiplied by a function that equals 1 minus the upper limit of the threshold, and (b) a value that equals a log of a ratio of the first string length and the growth rate of the threshold; and determining the variable value based on a value equal to 1 minus a ratio of the complement and the first string length.

22. A computer-implemented method for comparing character strings, the method comprising: identifying a first character string having a first string length, and a second character string having a second string length greater than the first string length; parsing the first character string into one or more first sub-groups of characters; parsing the second character string into one or more second sub-groups of characters; comparing each of the one or more first sub-groups of characters against the one or more second sub-groups of characters; determining a ratio of a number of characters in the one or more first sub-groups of characters that match the one or more second sub-groups of characters and the second string length; and based on the ratio being less than a threshold, preparing at least one of the first or second character strings for display, the threshold comprising a variable value based on the first string length, the variable value determined based on the first string length, an upper limit of the threshold, a lower limit of the threshold, and a growth rate of the threshold, the determination comprising: determining a complement of the threshold that is equal to a sum of (a) a value that equals the first string length multiplied by a function that equals 1 minus the upper limit of the threshold, and (b) a value that equals a log of a ratio of the first string length and the growth rate of the threshold; and determining the variable value based on a value equal to 1 minus a ratio of the complement and the first string length.

23. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: identifying a first character string having a first string length, and a second character string having a second string length greater than the first string length; parsing the first character string into one or more first sub-groups of characters; parsing the second character string into one or more second sub-groups of characters; comparing each of the one or more first sub-groups of characters against the one or more second sub-groups of characters; determining a ratio of a number of characters in the one or more first sub-groups of characters that match the one or more second sub-groups of characters and the second string length; and based on the ratio being less than a threshold, preparing at least one of the first or second character strings for display, the threshold comprising a variable value based on the first string length, the variable value determined based on the first string length, an upper limit of the threshold, a lower limit of the threshold, and a growth rate of the threshold, the determination comprising: determining a complement of the threshold that is equal to a sum of (a) a value that equals the first string length multiplied by a function that equals 1 minus the upper limit of the threshold, and (b) a value that equals a log of a ratio of the first string length and the growth rate of the threshold; and determining the variable value based on a value equal to 1 minus a ratio of the complement and the first string length.

24. A system of one or more computers configured to perform operations comprising: identifying a first character string having a first string length, and a second character string having a second string length greater than the first string length; parsing the first character string into one or more first sub-groups of characters; parsing the second character string into one or more second sub-groups of characters; comparing each of the one or more first sub-groups of characters against the one or more second sub-groups of characters; determining a ratio of a number of characters in the one or more first sub-groups of characters that match the one or more second sub-groups of characters and the second string length; and based on the ratio being less than a threshold, preparing at least one of the first or second character strings for display, the threshold comprising a variable value based on the first string length, the variable value determined based on the first string length, an upper limit of the threshold, a lower limit of the threshold, and a growth rate of the threshold, the determination comprising: determining a complement of the threshold that is equal to a sum of (a) a value that equals the first string length multiplied by a function that equals 1 minus the upper limit of the threshold, and (b) a value that equals a log of a ratio of the first string length and the growth rate of the threshold; and determining the variable value based on a value equal to 1 minus a ratio of the complement and the first string length.

Patent Metadata

Filing Date

Unknown

Publication Date

May 20, 2014

Inventors

Shachar Soel
Dmitry Gorenchteine
Udi Cohen

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “COMPARING STRINGS OF CHARACTERS” (8732183). https://patentable.app/patents/8732183
