7725414

Method for Developing a Classifier for Classifying Communications

Published: May 25, 2010
Assignee: not available in USPTO data

Patent Claims
50 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1

1. A computer implemented method for developing a classifier for classifying electronic communications comprising: querying a user for a phrase that indicates that a communication is not related to a concept; receiving a user identification of the phrase; (a) presenting user-generated electronic communications to a user for labeling as relevant or irrelevant, the electronic communications being selected from groups of electronic communications including: a training set group of electronic communications, the training set group of electronic communications being selected by an active learning algorithm; a system-labeled set of electronic communications previously labeled by the system; a test set group of electronic communications, the test set group of electronic communications for testing the accuracy of a current state of a classifier being developed; a faulty set of electronic communications suspected to be previously mis-labeled by the user; and a random set of electronic communications previously labeled by the user; (b) developing the classifier for classifying electronic communications based upon the phrase, the relevant labels and the irrelevant labels assigned by the user during the presenting of the electronic communications to the user; (c) deploying the classifier for use in classifying electronic communications based upon the relevant labels and the irrelevant labels; and (d) storing a set of electronic communications labeled by the classifier in a memory.

2

2. The method of claim 1, wherein the presenting of the electronic communications to the user includes: assessing a first value related to performance that labeling a first set of electronic communications from a first one of the training set group, the system-labeled set, the test set group, the faulty set, or the random set provides to the classifier being developed; assessing a second value related to performance that labeling a second set of electronic communications from a second one of the training set group, the system-labeled set, the test set group, the faulty set, or the random set provides to the classifier being developed; and selecting a next group for labeling based upon a greatest of the first value and the second value provided to the classifier being developed from the assessing.
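The group selection recited in claim 2 can be pictured with a minimal sketch. The claim does not say how a group's value is assessed, so the estimator callables and the numbers below are hypothetical placeholders; only the group names and the "pick the greatest value" rule come from the claim text.

```python
from typing import Callable, Dict

# Group names follow claim 1; how each value is estimated is NOT specified
# by the claims, so the estimators here are hypothetical stand-ins.
def select_next_group(value_of: Dict[str, Callable[[], float]]) -> str:
    """Return the name of the group with the greatest estimated value."""
    scores = {name: estimate() for name, estimate in value_of.items()}
    return max(scores, key=scores.get)

# Illustrative, made-up value estimates for three of the five groups.
estimators = {
    "training": lambda: 0.12,
    "test": lambda: 0.30,
    "faulty": lambda: 0.05,
}

next_group = select_next_group(estimators)
print(next_group)
```

With these assumed estimates, the test set group would be presented next because its estimated value is the largest.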

3

3. A computer implemented method for developing a classifier for classifying electronic communications comprising: querying a user for a phrase that indicates that a communication is not related to a concept; receiving a user identification of the phrase; (a) presenting electronic communications to a user for labeling as relevant or irrelevant, the electronic communications being selected from groups of user-generated electronic communications including: a training set group of electronic communications, the training set group of electronic communications being selected by an active learning algorithm; a test set group of electronic communications, the test set group of electronic communications for testing the accuracy of a current state of a classifier being developed; and a previously-labeled set of electronic communications previously labeled by at least one of the user, the system and another user; (b) developing the classifier for classifying electronic communications based upon the phrase and the relevant labels and the irrelevant labels assigned by the user; (c) deploying the classifier for use in classifying electronic communications based upon the relevant labels and the irrelevant labels; and (d) storing a set of electronic communications labeled by the classifier in a memory.

4

4. The method of claim 3, wherein the previously-labeled set of electronic communications includes electronic communications previously labeled by the user.

5

5. The method of claim 4, wherein the previously-labeled set of electronic communications includes electronic communications suspected by the system to be possibly mis-labeled by the user.
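Claim 5 does not say how the system comes to suspect a user label. One possible reading, sketched below purely as an assumption, is to flag communications where the current classifier confidently disagrees with the label the user assigned. The function name, the tuple layout, and the 0.9 threshold are all hypothetical.

```python
from typing import List, Tuple

# Each item: (communication id, user label, classifier P(relevant)).
# The confidence threshold is an illustrative assumption, not from the patent.
def suspected_mislabels(
    items: List[Tuple[str, bool, float]], threshold: float = 0.9
) -> List[str]:
    """Return ids whose user label the classifier confidently contradicts."""
    flagged = []
    for comm_id, user_label, p_relevant in items:
        # Probability the classifier assigns to the user's own label.
        support = p_relevant if user_label else 1.0 - p_relevant
        if support <= 1.0 - threshold:
            flagged.append(comm_id)
    return flagged

labeled = [
    ("a", True, 0.05),   # user said relevant, classifier strongly disagrees
    ("b", False, 0.40),  # mild disagreement, not flagged
    ("c", False, 0.97),  # user said irrelevant, classifier strongly disagrees
]
faulty_set = suspected_mislabels(labeled)
print(faulty_set)
```

Items flagged this way could then be re-presented to the user as the "faulty set" for confirmation or correction.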

6

6. The method of claim 3, wherein the previously-labeled set of electronic communications includes electronic communications previously labeled by the system.

7

7. The method of claim 3, wherein the previously-labeled set of electronic communications includes electronic communications previously labeled by a user and electronic communications previously labeled by the system.

8

8. The method of claim 3 wherein presenting the electronic communications to the user includes: assessing a first value that labeling a first set of electronic communications from a first one of the training set group, the system-labeled set, the test set group, the faulty set, or the random set will provide to the classifier being developed; assessing a second value related to performance that labeling a second set of electronic communications from a second one of the training set group, the system-labeled set, the test set group, the faulty set, or the random set provides to the classifier being developed; and selecting a next group for labeling based upon the greatest of the first value and the second value that will be provided to the classifier being developed from the assessing.

9

9. The method of claim 3 wherein presenting the electronic communications to the user includes: assessing a value that labeling a set of electronic communications from each group will provide to the classifier being developed; and selecting a next group for labeling based upon achieving known performance bounds for the classifier.

10

10. The method of claim 3 further comprising developing an expression of labeling criteria in an interactive session with the user.

11

11. The method of claim 10, wherein the interactive session includes posing hypothetical questions to the user regarding what type of information the user would consider relevant.

12

12. The method of claim 11, wherein the hypothetical questions elicit “yes”, “no” and “unsure” responses from the user.

13

13. The method of claim 11 wherein subsequent questions are based, at least in part, upon the answers given to previous questions.
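Claims 12 and 13 together describe a session in which answers are "yes", "no" or "unsure" and later questions depend on earlier answers. A minimal sketch of one way this could work follows. The question tree, the question wording, and the rule "skip narrower follow-ups after a 'no'" are assumptions made for illustration; the claims do not prescribe them.

```python
from enum import Enum
from typing import Callable, Dict, List

class Answer(Enum):
    YES = "yes"
    NO = "no"
    UNSURE = "unsure"

# Hypothetical question tree: each broad question maps to narrower follow-ups.
FOLLOW_UPS: Dict[str, List[str]] = {
    "Are posts about the product relevant?": [
        "Are posts about the product's price relevant?",
        "Are posts about the product's advertising relevant?",
    ],
    "Are posts about competitors relevant?": [
        "Are brand comparisons with competitors relevant?",
    ],
}

def run_session(roots: List[str], ask: Callable[[str], Answer]) -> Dict[str, Answer]:
    """Ask questions in order; only descend into follow-ups unless the answer is 'no'."""
    answers: Dict[str, Answer] = {}
    pending = list(roots)
    while pending:
        question = pending.pop(0)
        answers[question] = ask(question)
        if answers[question] is not Answer.NO:
            pending.extend(FOLLOW_UPS.get(question, []))
    return answers

# Scripted user for demonstration; unanswered questions default to "unsure".
scripted = {
    "Are posts about the product relevant?": Answer.YES,
    "Are posts about competitors relevant?": Answer.NO,
}
answers = run_session(list(FOLLOW_UPS), lambda q: scripted.get(q, Answer.UNSURE))
print(len(answers))
```

In this run the competitor follow-up is never asked, because the broader competitor question was answered "no".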

14

14. The method of claim 11 wherein developing an expression of labeling criteria produces a criteria document.

15

15. The method of claim 14 wherein the expression and/or the criteria document include a group of keywords and/or phrases for use by the system in automatically labeling electronic communications.
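As an illustration of how keywords and phrases from a criteria document might drive automatic labeling, the sketch below applies simple case-insensitive substring matching. The matching rule, the function name, and the example keywords are assumptions; the claims only say that such keywords and phrases are used by the system, and that a user-supplied phrase can indicate a communication is not related to the concept.

```python
from typing import Iterable, Optional

def system_label(
    text: str,
    relevant_keywords: Iterable[str],
    irrelevant_phrases: Iterable[str],
) -> Optional[bool]:
    """Return False/True when a phrase or keyword matches, else None (undecided)."""
    lowered = text.lower()
    # A phrase marking the text as unrelated takes precedence (an assumption).
    if any(phrase.lower() in lowered for phrase in irrelevant_phrases):
        return False
    if any(keyword.lower() in lowered for keyword in relevant_keywords):
        return True
    return None

keywords = ["jaguar"]          # hypothetical concept keyword
not_related = ["at the zoo"]   # hypothetical phrase supplied by the user

print(system_label("My Jaguar needs new brakes", keywords, not_related))
print(system_label("Saw a jaguar at the zoo today", keywords, not_related))
print(system_label("Nice weather this week", keywords, not_related))
```

Communications that come back undecided could be left for the user or the trained classifier to label.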

16

16. The method of claim 10 wherein developing an expression of labeling criteria produces a criteria document.

17

17. The method of claim 16 wherein the criteria document includes a list of items that are considered relevant and a list of items that are considered irrelevant.

18

18. The method of claim 17, wherein presenting the electronic communications to the user includes querying the user to identify which item(s) influenced the label on a user-labeled electronic communication.

19

19. The method of claim 16, wherein at least one of the expression or the criteria document include at least one of a group of keywords or phrases for use by the system in automatically labeling electronic communications.

20

20. The method of claim 10 wherein the interactive session is conducted prior to presenting the electronic communications to the user.

21

21. A computer implemented method for developing a classifier for classifying electronic communications comprising: (a) developing an expression of labeling criteria in an interactive session with a user, wherein the interactive session includes querying a user to identify a phrase that indicates that a communication is not related to a concept and receiving a user identification of the phrase; (b) presenting electronic communications to the user for labeling as relevant or irrelevant, wherein the electronic communications are user-generated; (c) developing a classifier for classifying electronic communications based upon the phrase and the relevant labels and the irrelevant labels assigned by the user; (d) deploying the classifier for use in classifying electronic communications based upon the phrase and the relevant labels and the irrelevant labels; (e) storing a set of electronic communications labeled by the classifier in a memory; and wherein at least one of (b) and (c) use the expression of labeling criteria developed in (a).

22

22. The method of claim 21, wherein the interactive session includes posing questions to the user regarding what type of information the user would consider relevant.

23

23. The method of claim 22, wherein the questions elicit “yes”, “no” and “unsure” responses from the user.

24

24. The method of claim 22 wherein subsequent questions are based, at least in part, upon the answers given to previous questions.

25

25. The method of claim 22 wherein the questions are structured from several dimensional levels of relevance, including a first dimension of question segments on a topic, a second dimension of question segments on an aspect of the topic and a third dimension of question segments on a type of discussion.

26

26. The method of claim 25, wherein: the first dimension of question segments on a topic include one or more of the following segments: a first segment concerning a client's product and a second segment concerning a client's competitors; the second dimension of question segments on a topic include one or more of the following segments: a third segment concerning a feature of the topic, a fourth segment concerning the topic itself, a fifth segment concerning corporate activity of the topic, a sixth segment concerning price of the topic, a seventh segment concerning news of the topic and an eighth segment concerning advertising of the topic; and the third dimension of question segments on a topic include one or more of the following segments: a ninth segment concerning a mention of the second dimension segment, a tenth segment concerning a description of the second dimension segment, an eleventh segment concerning a usage statement about the second dimension segment, a twelfth segment concerning a brand comparison involving the second dimension of questions segments, and a thirteenth segment concerning an opinion about the second dimension segment.
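The three dimensions in claims 25 and 26 suggest that questions can be assembled by combining one segment from each dimension. The sketch below does this with a Cartesian product. The segment lists paraphrase claim 26, while the sentence template and variable names are assumptions introduced for illustration.

```python
from itertools import product

# Segment wording paraphrases claim 26; the question template is hypothetical.
TOPICS = ("the client's product", "the client's competitors")
ASPECTS = (
    "a feature of {t}",
    "{t} itself",
    "corporate activity of {t}",
    "the price of {t}",
    "news of {t}",
    "advertising of {t}",
)
DISCUSSIONS = (
    "a mention of",
    "a description of",
    "a usage statement about",
    "a brand comparison involving",
    "an opinion about",
)

# One question per (topic, aspect, discussion type) combination.
questions = [
    f"Would you consider {discussion} {aspect.format(t=topic)} relevant?"
    for topic, aspect, discussion in product(TOPICS, ASPECTS, DISCUSSIONS)
]

print(len(questions))
print(questions[0])
```

With 2 topic segments, 6 aspect segments and 5 discussion-type segments, this yields 60 candidate questions, which a session could prune based on earlier answers.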

27

27. The method of claim 21 wherein developing the expression of labeling criteria produces a criteria document.

28

28. The method of claim 27 wherein the criteria document includes a list of items that are considered relevant and a list of items that are considered irrelevant.

29

29. The method of claim 28 wherein the criteria document includes a group of keywords for use by the system in automatically labeling electronic communications.

30

30. The method of claim 28, wherein presenting the electronic communications to the user includes querying the user which items influenced the label on a user-labeled communication.

31

31. The method of claim 21 wherein the expression of labeling criteria includes a group of keywords and/or phrases for use by the system in automatically labeling electronic communications.

32

32. The method of claim 31 wherein the group of keywords is also for use by the system in gathering electronic communications.

33

33. A computer implemented method for developing a classifier for classifying electronic communications comprising: (a) defining a domain of electronic communications on which a classifier is to operate, wherein the electronic communications are user-generated; (b) collecting a set of electronic communications from the domain; (c) eliciting labeling criteria from a user by querying a user to identify a phrase that indicates that a communication is not related to a concept and receiving the phrase; (d) labeling, by the system, electronic communications from the set of electronic communications according, at least in part, to the labeling criteria elicited from the user; (e) labeling, by the user, electronic communications from the set of electronic communications; (f) building the electronic communications classifier according to a combination of labels applied to electronic communications in (d) and (e); (g) deploying the classifier for use in classifying electronic communications based upon the combination of labels; and (h) storing a labeled set of electronic communications labeled by the classifier in a memory.
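Step (f) of claim 33 builds the classifier from a combination of system-applied and user-applied labels. The claim does not state how conflicts between the two are resolved; the sketch below assumes, only for illustration, that a user label overrides a system label for the same communication. The function and variable names are hypothetical.

```python
from typing import Dict

def combine_labels(
    system_labels: Dict[str, bool], user_labels: Dict[str, bool]
) -> Dict[str, bool]:
    """Merge both label sources; user labels win on conflict (an assumption)."""
    combined = dict(system_labels)
    combined.update(user_labels)
    return combined

system_labels = {"m1": True, "m2": False, "m3": True}   # from step (d)
user_labels = {"m3": False, "m4": True}                  # from step (e)

training_labels = combine_labels(system_labels, user_labels)
print(training_labels)
```

The merged mapping would then serve as the training data handed to whatever learning algorithm builds the classifier.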

34

34. The computer implemented method of claim 33, wherein (d) and (e), and (f) includes selecting electronic communications for labeling by the user targeted to build the electronic communications classifier within known performance bounds.

35

35. The computer implemented method of claim 34, wherein selecting electronic communications for labeling by the user selects electronic communications from groups of electronic communications including: a training set group of electronic communications, the training set group of electronic communications being selected by an active learning algorithm; a test set group of electronic communications for testing the accuracy of a current state of the classifier; and a previously-labeled set of electronic communications previously labeled by at least one of the user, the system and another user.

36

36. The computer implemented method of claim 34, wherein selecting electronic communications for labeling by the user selects electronic communications from groups of electronic communications including: a training set group of electronic communications selected by an active learning algorithm; a system-labeled set of electronic communications previously labeled by the system; a test set group of electronic communications for testing the accuracy of a current state of the classifier being developed; a faulty set of electronic communications suspected to be previously mis-labeled by the user; and a random set of electronic communications previously labeled by the user.

37

37. The computer implemented method of claim 33, wherein the labeling criteria elicited in the eliciting of (c) is used, in part, to determine electronic communications to collect in the collecting of (b).

38

38. The computer implemented method of claim 37, wherein the eliciting (c) involves an interactive session with the user.

39

39. The computer implemented method of claim 37, wherein the labeling criteria elicited in the eliciting (c) is used, in part, by the system to label electronic communications in the labeling (d).

40

40. The computer implemented method of claim 39, wherein the eliciting (c) involves an interactive session with the user.

41

41. The method of claim 33, wherein the building (f) involves an active learning process.
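The claims refer to an active learning process without naming a particular algorithm. Uncertainty sampling is one common choice and is shown below only as an example of the general idea: ask the user to label the communications the current classifier is least sure about. The scores and identifiers are made up for illustration.

```python
from typing import Dict, List

def most_uncertain(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k ids whose P(relevant) is closest to 0.5."""
    return sorted(scores, key=lambda comm_id: abs(scores[comm_id] - 0.5))[:k]

# Hypothetical classifier outputs for unlabeled communications.
unlabeled_scores = {"m1": 0.95, "m2": 0.52, "m3": 0.10, "m4": 0.45}

to_label = most_uncertain(unlabeled_scores, k=2)
print(to_label)
```

Here the two communications scored nearest 0.5 would be sent to the user first, since their labels are expected to be the most informative for the next training round.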

42

42. The computer implemented method of claim 33, wherein the labeling criteria elicited in the eliciting (c) is used, in part, by the system to label electronic communications in the labeling (d).

43

43. The computer implemented method of claim 33, wherein the eliciting (c) involves an interactive session with the user.

44

44. The method of claim 43, wherein the interactive session includes posing questions to the user regarding what type of information the user would consider relevant.

45

45. The method of claim 44, wherein the interactive session also allows the user to provide keywords based upon a criteria the user considers relevant.

46

46. The method of claim 44, wherein the questions elicit “yes”, “no” and “unsure” responses from the user.

47

47. The method of claim 43, wherein the building (f) involves an active learning process.

48

48. A tangible computer readable medium storing instructions that when executed cause a computer to develop a classifier for classifying electronic communications by: querying a user to identify a phrase that indicates that a communication is not related to a concept; receiving a user identification of the phrase; (a) presenting electronic communications to a user for labeling as relevant or irrelevant, the electronic communications being selected from groups of user-generated electronic communications including: a training set group of electronic communications selected by an active learning algorithm; a test set group of electronic communications for testing the accuracy of a current state of a classifier being developed; and a previously-labeled set of electronic communications previously labeled by at least one of the user, the system and another user; (b) developing the classifier for classifying electronic communications based upon the phrase and the relevant labels and the irrelevant labels assigned by the user during presenting electronic communications to the user; (c) deploying the classifier for use in classifying electronic communications based upon the relevant labels and the irrelevant labels; and storing a set of electronic communications labeled by the classifier in a memory.

49

49. A tangible computer readable medium storing instructions that when executed cause a computer to develop a classifier for classifying electronic communications by: (a) developing an expression of labeling criteria in an interactive session with the user, wherein the interactive session includes querying a user for a phrase that indicates that a communication is not related to a concept and receiving a user identification of the phrase; (b) presenting electronic communications to a user for labeling as relevant or irrelevant, wherein the electronic communications are user-generated; and (c) developing a classifier for classifying electronic communications based upon the phrase and the relevant labels and the irrelevant labels assigned by the user; (d) deploying the classifier for use in classifying electronic communications based upon the phrase and the relevant labels and the irrelevant labels; and storing a set of electronic communications labeled by the classifier in a memory; wherein at least one of (b) and (c) use the expression of labeling criteria developed in (a).

50

50. A tangible computer readable medium storing instructions that when executed cause a computer to develop a classifier for classifying electronic communications by: (a) defining a domain of electronic communications on which a classifier is to operate, wherein the electronic communications are user-generated; (b) collecting a set of electronic communications from the domain; (c) eliciting labeling criteria from a user by querying a user for a phrase that indicates that a communication is not related to a concept and receiving a user identification of the phrase; (d) labeling, by the computer system, electronic communications from the set of communications according, at least in part, to the labeling criteria elicited from the user; (e) labeling, by the user, electronic communications from the set of electronic communications; (f) building the electronic communications classifier according to a combination of labels applied to electronic communications in (d) and (e); (g) deploying the classifier for use in classifying electronic communications based upon the combination of labels; and storing a set of electronic communications labeled by the classifier in a memory.

Patent Metadata

Filing Date

Unknown

Publication Date

May 25, 2010

Inventors

Kamal P. Nigam
Robert G. Stockton


Cite as: Patentable. “METHOD FOR DEVELOPING A CLASSIFIER FOR CLASSIFYING COMMUNICATIONS” (7725414). https://patentable.app/patents/7725414
