Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for providing semantic encoding and language generation in a computing system by a processor, comprising: automatically parsing unstructured data into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model; encoding, using the first machine learning model, the unstructured data into a distribution of a plurality of triples based on the one or more knowledge graphs, wherein the encoding further comprises predicted probabilities of relations between entities in the unstructured data; sampling, using a second machine learning model, a set of the plurality of triples from the unstructured data of the one or more knowledge graphs; generating text data from the set of the plurality of triples using the second machine learning model; computing a penalty score for the set of the plurality of triples based on a degree of difference between the unstructured data and the generated text data; and adjusting at least one predicted probability from the first machine learning model based on the determined penalty score.
2. The method of claim 1, further including training the first machine learning model and the second machine learning model using the unstructured data and the list of candidate relations via unsupervised machine learning, wherein the first machine learning model is a semantic encoder and the second machine learning model is a semantic decoder.
3. The method of claim 1, further including using the first machine learning model to: identify the entities in the unstructured data.
4. The method of claim 1, further including using the second machine learning model to: decode the set of the plurality of triples into the text data, wherein a triple includes a subject, object, and predicate in the unstructured data, wherein the subject and object are an entity and a predicate is a relation.
5. The method of claim 1, further including sampling the set of the plurality of triples from the unstructured data of the one or more knowledge graphs for training a plurality of machine learning models via unsupervised machine learning.
6. The method of claim 1, further including: identifying one or more candidate entities in the unstructured data; and using the one or more candidate entities as nodes in the one or more knowledge graphs.
7. A system for providing semantic encoding and language generation in a computing environment, comprising: one or more computers with executable instructions that when executed cause the system to: automatically parse unstructured data into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model; encode, using the first machine learning model, the unstructured data into a distribution of a plurality of triples based on the one or more knowledge graphs, wherein the encoding further comprises predicted probabilities of relations between entities in the unstructured data; sample, using a second machine learning model, a set of the plurality of triples from the unstructured data of the one or more knowledge graphs; generate text data from the set of the plurality of triples using the second machine learning model; compute a penalty score for the set of the plurality of triples based on a degree of difference between the unstructured data and the generated text data; and adjust at least one predicted probability from the first machine learning model based on the determined penalty score.
8. The system of claim 7, wherein the executable instructions when executed cause the system to train the first machine learning model and the second machine learning model using the unstructured data and the list of candidate relations via unsupervised machine learning, wherein the first machine learning model is a semantic encoder and the second machine learning model is a semantic decoder.
9. The system of claim 7, wherein the executable instructions when executed cause the system to use the first machine learning model to: identify the entities in the unstructured data.
10. The system of claim 7, wherein the executable instructions when executed cause the system to use the second machine learning model to: decode the set of the plurality of triples into the text data, wherein a triple includes a subject, object, and predicate in the unstructured data, wherein the subject and object are an entity and a predicate is a relation.
11. The system of claim 7, wherein the executable instructions when executed cause the system to sample the set of the plurality of triples from the unstructured data of the one or more knowledge graphs for training a plurality of machine learning models via unsupervised machine learning.
12. The system of claim 7, wherein the executable instructions when executed cause the system to: identify one or more candidate entities in the unstructured data; and use the one or more candidate entities as nodes in the one or more knowledge graphs.
13. A computer program product for providing semantic encoding and language generation in a computing environment, the computer program product comprising: one or more tangible computer readable storage media, and program instructions collectively stored on the one or more tangible computer readable storage media, the program instruction comprising: automatically parse unstructured data into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model; encode, using the first machine learning model, the unstructured data into a distribution of a plurality of triples based on the one or more knowledge graphs, wherein the encoding further comprises predicted probabilities of relations between entities in the unstructured data; sample, using a second machine learning model, a set of the plurality of triples from the unstructured data of the one or more knowledge graphs; generate text data from the set of the plurality of triples using the second machine learning model; compute a penalty score for the set of the plurality of triples based on a degree of difference between the unstructured data and the generated text data; and adjust at least one predicted probability from the first machine learning model based on the determined penalty score.
14. The computer program product of claim 13, further including program instructions to train the first machine learning model and the second machine learning model using the unstructured data and the list of candidate relations via unsupervised machine learning, wherein the first machine learning model is a semantic encoder and the second machine learning model is a semantic decoder.
15. The computer program product of claim 13, further including program instructions to use the first machine learning model to: identify the entities in the unstructured data.
16. The computer program product of claim 13, further including program instructions to use the second machine learning model to: decode the set of the plurality of triples into the text data, wherein a triple includes a subject, object, and predicate in the unstructured data, wherein the subject and object are an entity and a predicate is a relation.
17. The computer program product of claim 13, further including program instructions to: identify one or more candidate entities in the unstructured data; and use the one or more candidate entities as nodes in the one or more knowledge graphs.
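The encode, sample, generate, penalize, and adjust loop recited in independent claims 1, 7, and 13 can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the claimed implementation. The claims call for two machine learning models, whereas here the "encoder" is a plain probability table, the "decoder" is a string template, the penalty is one minus the Jaccard token overlap, and the update is a multiplicative down-weighting. All names (Triple, encode, sample_triples, generate_text, penalty, adjust, lr) are hypothetical and do not come from the patent.

```python
import random
from collections import namedtuple

# A triple as described in claim 4: subject and object are entities, predicate is a relation.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])


def encode(entities, candidate_relations):
    """Stand-in for the first model: a uniform distribution over the
    candidate relations for every ordered pair of distinct entities."""
    p = 1.0 / len(candidate_relations)
    return {
        (s, o): {r: p for r in candidate_relations}
        for s in entities
        for o in entities
        if s != o
    }


def sample_triples(dist, rng):
    """Draw one relation per entity pair according to the predicted probabilities."""
    triples = []
    for (s, o), probs in dist.items():
        relations = list(probs)
        weights = [probs[r] for r in relations]
        chosen = rng.choices(relations, weights=weights, k=1)[0]
        triples.append(Triple(s, chosen, o))
    return triples


def generate_text(triples):
    """Stand-in for the second model: render each triple with a fixed template."""
    return ". ".join(f"{t.subject} {t.predicate} {t.obj}" for t in triples)


def penalty(source, generated):
    """Assumed 'degree of difference': 1 minus the Jaccard overlap of the token sets."""
    a = set(source.lower().replace(".", " ").split())
    b = set(generated.lower().replace(".", " ").split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def adjust(dist, triples, score, lr=0.5):
    """Down-weight each sampled relation in proportion to the penalty,
    then renormalize so each pair's probabilities still sum to 1."""
    for t in triples:
        probs = dist[(t.subject, t.obj)]
        probs[t.predicate] *= 1.0 - lr * score
        total = sum(probs.values())
        for r in probs:
            probs[r] /= total


if __name__ == "__main__":
    rng = random.Random(0)
    source = "Alice founded Acme"
    dist = encode(["Alice", "Acme"], ["founded", "acquired"])
    for _ in range(20):
        triples = sample_triples(dist, rng)
        score = penalty(source, generate_text(triples))
        adjust(dist, triples, score)
    print(dist[("Alice", "Acme")])
```

In this sketch a sampled set of triples whose generated text diverges from the source loses probability mass, so relations that reconstruct the source more faithfully become relatively more likely over repeated iterations. A real system along the lines of the claims would replace the table and the template with trained encoder and decoder models.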
February 4, 2025