Automatically Populating Documents About Special Entities

Published: August 19, 2025

Assignee: not available in the USPTO data we have

Inventors: Tiffany DOLAN, David MATZ, Shashank MENDIRATTA, Karen WAI, Mithun GHOSH, Ankit AGARWAL, Azal FATIMA

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method for automatically populating documents about special entities, the method performed by one or more processors of a computing system and comprising: receiving a transmission over a communications network from a computing device, the transmission including user data associated with a user of the computing system; extracting, from the user data, a list of entities associated with the user and a list of events that occurred between the user and the entities; transforming metadata for the events associated with entities of interest into vectorized embeddings in a vector space; selectively classifying, using the vectorized embeddings in conjunction with a binary classifier model, ones of the entities of interest as special entities and ones of the transformed events as special events for a set of documents; assigning, using the vectorized embeddings in conjunction with a multi-class classifier model, one of a plurality of categories to each special event associated with each special entity, each of the categories mapping to a corresponding section within the set of documents; and populating, for each special entity, the corresponding sections within the set of documents based on the categories assigned to the special entity's special events.

2. The method of claim 1, further comprising: identifying, within the list of entities, the entities of interest for the set of documents based on the metadata.

3. The method of claim 2, wherein the metadata for a given entity includes at least an identifier for the given entity, the metadata for a given event includes at least one quantity associated with the given event, and identifying the entities of interest includes: selectively filtering, using the metadata in conjunction with one or more rule-based filters, ones of the entities from the list of entities, the selective filtering including: responsive to determining that the identifier for a given entity appears on an exclusion list, removing the given entity from the list of entities; and responsive to determining that an aggregation metric generated based on the quantities for the events associated with the given entity does not exceed a threshold, removing the given entity from the list of entities; and identifying the remaining entities as the entities of interest.
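
The two rule-based filters recited in claim 3 can be illustrated with a short sketch. This is not taken from the patent: the `Event` type, the function name, the sample identifiers, and the choice of a plain sum as the aggregation metric are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    entity_id: str   # hypothetical identifier linking the event to an entity
    quantity: float  # the "quantity associated with the given event"


def entities_of_interest(entity_ids, events, exclusion_list, threshold):
    """Return the entity ids that survive both rule-based filters."""
    # Assumed aggregation metric: the sum of event quantities per entity.
    totals = defaultdict(float)
    for event in events:
        totals[event.entity_id] += event.quantity

    kept = []
    for entity_id in entity_ids:
        if entity_id in exclusion_list:
            continue  # filter 1: identifier appears on the exclusion list
        if not totals[entity_id] > threshold:
            continue  # filter 2: aggregate does not exceed the threshold
        kept.append(entity_id)
    return kept


events = [
    Event("acme", 400.0),
    Event("acme", 300.0),
    Event("bolt", 50.0),
    Event("corp", 900.0),
]
result = entities_of_interest(["acme", "bolt", "corp"], events, {"corp"}, 600.0)
print(result)  # ['acme']
```

Here "corp" is dropped by the exclusion list even though its total is high, and "bolt" is dropped because its total stays below the threshold.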

4. The method of claim 3, wherein the metadata for the given entity further includes a location of the given entity, and wherein the selective filtering further includes: responsive to determining that the location for a given entity is outside of an area of interest, removing the given entity from the list of entities.

5. The method of claim 3, wherein the metadata for each event further includes a type of the event, and wherein the selective filtering further includes: identifying ones of the events having an excluded type, wherein the aggregation metric excludes the quantities for the identified ones of the events.

6. The method of claim 1, wherein transforming the metadata for the events into the vectorized embeddings includes: extracting, from the list of events, a set of features and attributes for each event; generating, using a sentence transformer, a plurality of feature vectors from the sets of features and attributes, wherein the sentence transformer is a language model (LM) fine-tuned using a sentence transforming learning technique, and wherein the feature vectors incorporate semantic meaning and context from the sets of features and attributes; and embedding the plurality of feature vectors in the vector space.

7. The method of claim 1, further comprising: selectively preclassifying, using a set of predefined attributes in conjunction with a nearest neighbor technique, the transformed events as special events for the set of documents.

8. The method of claim 7, wherein the predefined attributes include positive attributes suggestive of an event being a special event and negative attributes suggestive of an event not being a special event, and wherein selectively preclassifying the transformed events as special events includes: vectorizing the set of predefined attributes; embedding the vectorized attributes in the vector space; generating a tree-based index based on the embeddings; performing, using an approximate nearest-neighbor (ANN) technique in conjunction with the tree-based index, a proximity analysis of each respective event; determining, for each respective event, whether the respective event has one or more neighbors in the tree-based index within a threshold distance based on the proximity analyses; and selectively preclassifying each respective event based on whether the respective event has the one or more neighbors in the tree-based index within the threshold distance, the selective preclassifying including: refraining from preclassifying the respective event and submitting the respective event to the binary classifier model responsive to determining that the respective event does not have one or more neighbors in the tree-based index within the threshold distance; and selectively preclassifying the respective event responsive to determining that the respective event does have one or more neighbors in the tree-based index within the threshold distance, the selective preclassifying including: preclassifying the respective event as a special event when more than a threshold proportion of the neighbors within the threshold distance are associated with the positive attributes; preclassifying the respective event as not a special event when more than the threshold proportion of the neighbors within the threshold distance are associated with the negative attributes; and refraining from preclassifying the respective event and submitting the respective event to the binary classifier model when the neighbors within the threshold distance do not exceed the threshold proportion with respect to the positive attributes or the negative attributes.
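
The three-way decision in claim 8 (special, not special, or defer to the binary classifier) can be sketched as follows. Note the substitution: the claim recites an approximate nearest-neighbor search over a tree-based index, whereas this sketch uses an exact brute-force scan with Euclidean distance so that it stays self-contained. The function name, the 2-D toy vectors, and the threshold values are hypothetical.

```python
import math


def preclassify(event_vec, labeled_attrs, max_distance, min_proportion):
    """Return True (special), False (not special), or None (defer).

    labeled_attrs is a list of (vector, is_positive) pairs standing in for
    the embedded positive and negative predefined attributes.
    """
    # Brute-force stand-in for the ANN lookup: collect the labels of all
    # attribute vectors within the threshold distance of the event.
    neighbors = [
        is_positive
        for vec, is_positive in labeled_attrs
        if math.dist(event_vec, vec) <= max_distance
    ]
    if not neighbors:
        return None  # no neighbors: submit to the binary classifier

    positive_share = sum(neighbors) / len(neighbors)
    if positive_share > min_proportion:
        return True   # mostly positive neighbors
    if (1 - positive_share) > min_proportion:
        return False  # mostly negative neighbors
    return None       # mixed neighborhood: submit to the binary classifier


attrs = [
    ((0.0, 0.0), True),
    ((0.1, 0.0), True),
    ((1.0, 1.0), False),
    ((1.1, 1.0), False),
]
near_positive = preclassify((0.05, 0.0), attrs, 0.2, 0.6)
near_negative = preclassify((1.05, 1.0), attrs, 0.2, 0.6)
isolated = preclassify((5.0, 5.0), attrs, 0.2, 0.6)
mixed = preclassify((0.55, 0.5), attrs, 1.0, 0.6)
```

Returning `None` for both the empty and the mixed neighborhood mirrors the two "refraining" branches of the claim, which both hand the event to the binary classifier model.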

9. The method of claim 8, further comprising, for a given preclassified event: generating a first set of bigrams from the positive attributes; generating a second set of bigrams from the negative attributes; generating a third set of bigrams from the metadata associated with the given preclassified event; determining, for each respective bigram in the third set of bigrams, an edit distance between the respective bigram and each bigram in the first and second sets of bigrams, wherein the edit distance is a Levenshtein distance; and selectively reclassifying the preclassified event based on the edit distances, the selective reclassification including: reclassifying the preclassified event responsive to determining that the event was preclassified as a special event and that at least one bigram in the third set of bigrams has an edit distance below a threshold with at least one of the bigrams in the second set of bigrams; reclassifying the preclassified event responsive to determining that the event was preclassified as not a special event and that at least one bigram in the third set of bigrams has an edit distance below the threshold with at least one of the bigrams in the first set of bigrams; and refraining from reclassifying the preclassified event responsive to determining that (i) the event was preclassified as a special event and none of the bigrams in the third set of bigrams has an edit distance below the threshold with any of the bigrams in the second set of bigrams or (ii) the event was preclassified as not a special event and none of the bigrams in the third set of bigrams has an edit distance below the threshold with any of the bigrams in the first set of bigrams.
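
The bigram check in claim 9 can be sketched as below. The claim does not say whether the bigrams are built from words or characters; this sketch assumes word bigrams. The helper names, the sample strings, and the threshold of 3 are invented for illustration.

```python
def levenshtein(a, b):
    """Classic dynamic-programming Levenshtein (edit) distance."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def word_bigrams(text):
    """Assumed bigram definition: pairs of adjacent lowercase words."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


def should_reclassify(event_text, preclassified_special,
                      positive_attrs, negative_attrs, threshold):
    """True if any event bigram is close to a bigram of the opposite label."""
    opposing = negative_attrs if preclassified_special else positive_attrs
    opposing_bigrams = set().union(*(word_bigrams(a) for a in opposing))
    return any(
        levenshtein(event_bigram, other) < threshold
        for event_bigram in word_bigrams(event_text)
        for other in opposing_bigrams
    )


positive = ["contract labor payment"]
negative = ["employee payroll deposit"]
flagged = should_reclassify("employe payroll run", True, positive, negative, 3)
unflagged = should_reclassify("office rent invoice", True, positive, negative, 3)
```

In the first call the misspelled bigram "employe payroll" sits one edit away from the negative bigram "employee payroll", so an event preclassified as special would be flagged for reclassification; the second event has no bigram near either negative bigram and keeps its preclassification.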

10. The method of claim 9, wherein reclassifying the given event comprises reversing the preclassification for the given event or submitting the given event to the binary classifier model.

11. The method of claim 1, wherein the binary classifier model is a random forest model trained, using labeled events in conjunction with a random forest learning technique, to predict whether a new entity is a special entity for the set of documents and whether a new event is a special event for the set of documents.

12. The method of claim 11, wherein the random forest model is fine-tuned to prioritize recall over precision.

13. The method of claim 11, further comprising: responsive to classifying a given entity as not a special entity, excluding the events associated with the given entity from the set of documents.

14. The method of claim 1, further comprising: selectively excluding ones of the special entities based on the special events, wherein the selective exclusion includes: identifying ones of the events associated with the special entity that are classified as special events; generating an aggregation metric based on the identified special events; and selectively excluding the special entity based on whether the aggregation metric exceeds a threshold, the selective exclusion including: refraining from excluding the special entity responsive to determining that the aggregation metric exceeds the threshold; and excluding the special entity from the set of documents responsive to determining that the aggregation metric does not exceed the threshold.
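
The post-classification exclusion in claim 14 reduces to a threshold test on an aggregate of each entity's special events. A minimal sketch, assuming the aggregation metric is a sum of event quantities (the claim does not fix the metric) and using hypothetical names and values:

```python
def retain_special_entities(special_quantities_by_entity, threshold):
    """Keep only entities whose special-event aggregate exceeds the threshold."""
    return {
        entity: quantities
        for entity, quantities in special_quantities_by_entity.items()
        if sum(quantities) > threshold  # assumed metric: plain sum
    }


special_quantities = {
    "acme": [300.0, 400.0],  # total 700.0, exceeds 600.0
    "bolt": [100.0],         # total 100.0, does not exceed
    "dyno": [600.0],         # total equals the threshold, does not exceed
}
retained = retain_special_entities(special_quantities, 600.0)
```

Only events already classified as special feed the aggregate here, which is what distinguishes this step from the earlier filter in claim 3.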

15. The method of claim 1, wherein the multi-class classifier is a random forest model trained, using labeled events in conjunction with a random forest learning technique, to predict which of a plurality of classes most closely matches a given event.

16. The method of claim 15, wherein each of the plurality of categories corresponds to one of the plurality of classes, and wherein assigning a category to a given special event includes: generating, for each respective category of the plurality of categories, a confidence score indicating an extent to which the given special event is predicted to match the respective category; identifying, for the given special event, a highest scoring category associated with a highest confidence score; and classifying the given special event into the highest scoring category.
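
The scoring and selection steps of claim 16 can be sketched without a trained model. The claim does not state how the random forest derives its confidence scores; this sketch assumes each tree casts one hard vote and uses the vote share per category as the score. The function names and category labels are hypothetical.

```python
from collections import Counter


def confidence_scores(tree_votes, categories):
    """Assumed scoring: fraction of trees voting for each category."""
    counts = Counter(tree_votes)
    return {category: counts[category] / len(tree_votes) for category in categories}


def assign_category(scores):
    """Pick the category with the highest confidence score."""
    return max(scores, key=scores.get)


categories = ["section_a", "section_b", "section_c"]
votes = ["section_b", "section_b", "section_a", "section_b", "section_c"]
scores = confidence_scores(votes, categories)
best = assign_category(scores)
print(best)  # section_b
```

When two categories tie, `max` returns the first one in dictionary order, so a real system would need an explicit tie-breaking rule.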

17. The method of claim 16, wherein the random forest model is iteratively retrained to generate predictions based on revised guidelines, the iterative retraining including: feeding, to a language model (LM), the revised guidelines and historical guidelines previously used to train the random forest model; identifying, based on an output of the LM, one or more changes in the guidelines; generating updated training data for the random forest model based on the identified changes in the guidelines; and retraining the random forest model using the updated training data.

18. The method of claim 1, further comprising, for a given special entity: determining that two or more special events are mapped to a same section within the set of documents; generating, for the same section, an aggregation metric based on the metadata associated with the two or more special events; and populating the same section based on the generated aggregation metric.
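
Claim 18 covers the case where several special events for one entity land in the same document section. A minimal sketch, assuming each event contributes a single numeric quantity and that the aggregation metric is a sum; the section names and values are invented:

```python
from collections import defaultdict


def populate_sections(special_events):
    """Combine events that map to the same section into one value.

    special_events is an iterable of (section, quantity) pairs for a
    single special entity.
    """
    totals = defaultdict(float)
    for section, quantity in special_events:
        totals[section] += quantity  # assumed metric: plain sum
    return dict(totals)


entity_events = [
    ("section_1", 100.0),
    ("section_1", 250.0),
    ("section_2", 40.0),
]
populated = populate_sections(entity_events)
print(populated)  # {'section_1': 350.0, 'section_2': 40.0}
```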

19. The method of claim 1, further comprising: generating a summary, in at least near real-time with the receiving of the user data, the summary indicating at least one of the special entities, the special events, or the populated documents, wherein the summary includes a plain text reasoning for the classification of at least one of the special entities or the special events; after receiving the user data and prior to presenting the summary, presenting a progress visualization to the user for a minimum amount of time longer than the at least near real-time; presenting the summary to the user along with an option to modify at least one of the special entities, special events, or populated documents, wherein the summary is regenerated responsive to receiving one or more modifications from the user; requesting an approval of the summary from the user, wherein the documents are populated after receiving the user's approval; and at least one of: providing, to each special entity, the set of the documents populated based on the special entity's events; or providing the populated documents to one or more higher level entities.

20. A system for automatically populating documents about special entities, the system comprising: one or more processors; and at least one memory coupled to the one or more processors and storing instructions that, when executed by the one or more processors, cause the system to perform operations including: receiving a transmission over a communications network from a computing device, the transmission including user data associated with a user of the system; extracting, from the user data, a list of entities associated with the user and a list of events that occurred between the user and the entities; transforming metadata for the events associated with entities of interest into vectorized embeddings in a vector space; selectively classifying, using the vectorized embeddings in conjunction with a binary classifier model, ones of the entities of interest as special entities and ones of the transformed events as special events for a set of documents; assigning, using the vectorized embeddings in conjunction with a multi-class classifier model, one of a plurality of categories to each special event associated with each special entity, each of the categories mapping to a corresponding section within the set of documents; and populating, for each special entity, the corresponding sections within the set of documents based on the categories assigned to the special entity's special events.

Patent Metadata

Filing Date: Unknown

Publication Date: August 19, 2025

Inventors: Tiffany DOLAN, David MATZ, Shashank MENDIRATTA, Karen WAI, Mithun GHOSH, Ankit AGARWAL, Azal FATIMA
