Legal claims defining the scope of protection, as filed with the USPTO.
2. The computer-readable medium of claim 1, wherein the pointwise mutual information (PMI) score for a current content item is determined based on a formulation of a co-occurrence of the target content item and the current content item in the documents of the document corpus.
3. The computer-readable medium of claim 2, wherein adding content items of the target content type within the current document to the aggregated set of content items comprises adding those content items of the target content type within the current document to the aggregated set of content items that have an associated embedding vector.
5. The computer-readable medium of claim 4, wherein the target content item is a text-based content item.
6. The computer-readable medium of claim 5, wherein the target content item is a text-based content item for which a text-based embedding vector is not available.
8. The computer system of claim 7, wherein the computer system identifies the predetermined first number of closest items of the first content type to the first content item according to a similarity measure of the first embedding vector for the first content item to a plurality of other embedding vectors in the first embedding space, and selects the predetermined first number of closest items whose embedding vectors are closest to the embedding vector of the first content item in the first embedding space.
9. The computer system of claim 8, wherein the computer system determines the first similarity score between the first content item and the second content item according to a cosine similarity measure between the averaged embedding vector in the second embedding space and the second embedding vector of the second content item in the second embedding space.
11. The computer system of claim 10, wherein the computer system is further configured to combine the first similarity score and the second similarity score to generate a combined similarity score between the first content item and the second content item.
13. The computer system of claim 12, wherein the first content item type is a text-based content item type and the second content item type is an image-based content item type.
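The computations named in claims 2, 8, 9, and 11 can be illustrated with a minimal Python sketch. It is not the patented implementation. The independent claims (1, 4, 7, 10, 12) are not reproduced here, so several details are assumptions: the PMI uses the standard document-level form log(p(x, y) / (p(x) p(y))), the averaged vector is taken over second-space vectors looked up for the closest first-space items, and the combined score is a weighted mean. All function and parameter names (`pmi`, `closest_items`, `cross_space_similarity`, `combine`, `weight`) are hypothetical.

```python
import math


def pmi(target, item, documents):
    """Document-level PMI (assumed standard form): log(p(t, i) / (p(t) * p(i))).

    `documents` is a list of sets of content-item ids.
    """
    n = len(documents)
    n_target = sum(1 for d in documents if target in d)
    n_item = sum(1 for d in documents if item in d)
    n_both = sum(1 for d in documents if target in d and item in d)
    if n_both == 0:
        return float("-inf")  # the two items never co-occur
    return math.log((n_both / n) / ((n_target / n) * (n_item / n)))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_items(query_vec, candidates, k):
    """Ids of the k candidates most cosine-similar to query_vec (claim 8).

    `candidates` maps item id -> embedding vector in the first space.
    """
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(query_vec, candidates[cid]),
        reverse=True,
    )
    return ranked[:k]


def average(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cross_space_similarity(query_vec, first_space, second_space, second_item_vec, k):
    """Cosine similarity between an averaged second-space vector and a
    second-space item vector (claim 9).

    Assumption: `second_space` maps each first-space neighbour id to a
    vector in the second embedding space.
    """
    neighbours = closest_items(query_vec, first_space, k)
    averaged = average([second_space[i] for i in neighbours])
    return cosine(averaged, second_item_vec)


def combine(first_score, second_score, weight=0.5):
    """Weighted mean of two similarity scores (assumed combination, claim 11)."""
    return weight * first_score + (1 - weight) * second_score
```

A weighted mean is only one way to combine the two scores. The claims do not specify the combining function, so a product, a maximum, or a learned combination would be equally consistent with the text above.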
April 16, 2024