Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for automated application programming interface (API) validation, the method comprising: extracting API information from an API repository, the API information including a parameter placeholder, parameter information related to a parameter of an API endpoint, an endpoint of the API, an endpoint description, a description of the API, a description of the parameter, response information, an authentication requirement information, and an API name; resolving the parameter of the API endpoint, wherein resolution of the parameter includes: extracting the API name, the endpoint, the API description, and the parameter description; retrieving a parameter name and a parameter placeholder from the endpoint; generating a noun phrase list of noun phrases parsed from the endpoint description and the parameter description based on a syntactic analysis; extracting a keyword from the parameter name and the noun phrase list; classifying data type and domain of the parameter based at least partially on a query of the keyword in a third-party data repository; and parsing a label from a query result as a sample parameter value for the parameter in the API; communicating, to a native API system, a request using the sample parameter value for the parameter for the API; comparing a response from the native API system with the response information to determine a validity of the API endpoint; and based on the validity of the API endpoint, verifying integrity of a software application implementing the API endpoint for use with a native software application on the native API system.
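The comparison step at the end of claim 1 can be illustrated with a minimal sketch. The claim does not say how the response is compared with the response information, so the check below (status code plus documented field names) is an assumption, and `endpoint_is_valid` and its parameters are hypothetical names.

```python
from typing import Any, Iterable, Mapping


def endpoint_is_valid(
    status: int,
    body: Mapping[str, Any],
    expected_status: int,
    expected_fields: Iterable[str],
) -> bool:
    """Return True when a live response matches the documented response information.

    Assumption: "response information" is reduced to an expected status code
    and a set of field names that must appear in the response body.
    """
    if status != expected_status:
        return False
    return all(field in body for field in expected_fields)
```

A stricter comparison (value types, nested schemas) would fit the same claim language; this sketch checks only the outermost shape.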
2. The method of claim 1, wherein the parameter is a first parameter of a plurality of parameters, and the endpoint is a first endpoint of a plurality of endpoints, the method further comprises: resolving each parameter of the plurality of parameters of each endpoint of the plurality of endpoints; and generating a validation plan that includes an aggregation of information to create a list of fully resolved parameters to be tested for each endpoint of the plurality of endpoints.
3. The method of claim 1, wherein retrieving the parameter name and the parameter placeholder includes: retrieving the parameter placeholder from the endpoint; removing special characters from the retrieved parameter placeholder; comparing the placeholder name with the retrieved parameter placeholder; in response to the placeholder name being the same as the retrieved parameter placeholder: adding the parameter name as a parameter word; and using an API language model, capturing two or more words in the parameter word and adding a space between the two or more captured words; in response to the placeholder name being different from the retrieved parameter placeholder: adding the parameter name and the retrieved parameter placeholder as parameter words; and using the API language model, capturing two or more words in each of the parameter words and adding a space between the captured words.
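The placeholder handling in claim 3 can be sketched roughly as follows. Two assumptions are made: placeholders appear in curly braces in the endpoint path (for example `/users/{userId}`), and a regular-expression splitter for camelCase and snake_case stands in for the claimed API language model. The names `split_words` and `parameter_words` are hypothetical.

```python
import re


def split_words(token: str) -> str:
    """Stand-in for the API language model: space out camelCase, snake_case, kebab-case."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", token)  # userId -> user Id
    spaced = re.sub(r"[_\-]+", " ", spaced)                 # user_id -> user id
    return spaced.strip().lower()


def parameter_words(endpoint: str, parameter_name: str) -> list[str]:
    """Collect parameter words from the parameter name and the endpoint placeholder."""
    match = re.search(r"\{[^{}]+\}", endpoint)  # assumed placeholder syntax
    if match is None:
        return [split_words(parameter_name)]
    # Remove special characters (braces and other punctuation) from the placeholder.
    placeholder = re.sub(r"[^A-Za-z0-9_]", "", match.group(0))
    if placeholder == parameter_name:
        words = [parameter_name]
    else:
        words = [parameter_name, placeholder]
    return [split_words(word) for word in words]
```

For an endpoint `/albums/{albumId}` with parameter name `id`, the two names differ, so both are kept and the result is `["id", "album id"]`.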
4. The method of claim 1, wherein the extracting the keyword includes: receiving the noun phrase list and the parameter name; computing a similarity score based on a similarity between the parameter name and at least one of the noun phrases of the noun phrase list; and in response to the similarity score being greater than a particular threshold, outputting the at least one of the noun phrases as the keyword.
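The thresholded keyword selection in claim 4 might look like the sketch below. The claim does not name a similarity measure or a threshold value, so the character-level ratio from Python's `difflib.SequenceMatcher` and the default of 0.5 are stand-ins chosen for illustration, and `extract_keywords` is a hypothetical name.

```python
from difflib import SequenceMatcher


def extract_keywords(
    parameter_name: str,
    noun_phrases: list[str],
    threshold: float = 0.5,  # assumed value; the claim says only "a particular threshold"
) -> list[str]:
    """Return the noun phrases whose similarity to the parameter name exceeds the threshold."""
    name = parameter_name.lower()
    keywords = []
    for phrase in noun_phrases:
        # SequenceMatcher.ratio() is a stand-in similarity score in [0, 1].
        score = SequenceMatcher(None, name, phrase.lower()).ratio()
        if score > threshold:
            keywords.append(phrase)
    return keywords
```

With the parameter name `album id` and the noun phrases `album identifier` and `release date`, only `album identifier` clears the 0.5 threshold.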
5. The method of claim 1, wherein classifying the data type and domain includes: computing a first word vector for the extracted keyword and the noun phrases of the noun phrase list; generating the query of the keyword in the third-party data repository; computing a second word vector for classes and categories of a first query result and a third word vector for classes and categories of a second query result; computing a first cosine similarity between the first word vector and the second word vector; computing a second cosine similarity between the first word vector and the third word vector; and returning the first query result in response to the first cosine similarity being greater than the second cosine similarity.
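The selection rule in claim 5 compares cosine similarities between word vectors. A minimal sketch follows, assuming the vectors have already been computed by some embedding model (the claim does not name one). The function names are hypothetical, and the rule is generalized from two query results to any number of candidates.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_query_result(
    keyword_vector: Sequence[float],
    candidates: Sequence[tuple[str, Sequence[float]]],
) -> str:
    """Return the label of the candidate whose class/category vector is closest to the keyword."""
    best = max(candidates, key=lambda item: cosine_similarity(keyword_vector, item[1]))
    return best[0]
```

With exactly two candidates this reduces to the claimed comparison: the first query result is returned when its cosine similarity to the keyword vector is the greater of the two.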
6. The method of claim 1, further comprising: determining whether the parameter name is ID-related; in response to the parameter name being ID-related, determining whether the parameter name is user ID-related or non-user ID-related; in response to the parameter name being user ID-related, generating a user ID authentication identification, the generating the authentication identification including: searching for a sign up keyword with the API name in a public search repository; selecting a particular number of search results; following a link to a page returned in search results; determining, based on a sign up page language model, whether the page is a sign up page; in response to the page being a sign up page, performing an automated sign in process; and in response to the page not being a sign up page: extracting links on the page; computing a similarity score that indicates a similarity between a key list and the extracted links; and in response to the similarity score being greater than a particular threshold, selecting another particular number of results from the search for the keyword.
7. The method of claim 6, wherein the automated sign in process includes: extracting text of one or more fields in the sign up page; entering first values into the fields; submitting the first values in the field of the sign up page; determining whether a sign-up process is complete based on the sign up page language model; in response to the sign-up process being complete, returning the first values as a user ID and a password; and in response to the sign-up process being incomplete, re-extracting text of the fields, entering second values into the fields, and submitting the second values in the field.
8. The method of claim 6, further comprising: requesting authorization to access data automatically; at a log-in prompt page, determining whether a user ID and a password have been extracted; in response to the user ID and the password not being extracted, generating the user ID authentication identification; in response to the user ID and the password being extracted: accessing a granted authorization code; requesting an access token and refresh token using the granted authorization code; and following a particular period of time after which the access token expires, requesting a new access token using the refresh token.
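The token handling at the end of claim 8 resembles the OAuth 2.0 authorization-code flow with refresh tokens, although the claim does not name that standard. The sketch below makes that assumption and uses the standard's field names (`grant_type`, `access_token`, `refresh_token`, `expires_in`). `TokenManager` and the injected `request_token` callable are hypothetical, and no network call is made here.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class TokenManager:
    # Hypothetical callable that sends the parameters to a token endpoint
    # and returns the decoded JSON response.
    request_token: Callable[[Dict[str, str]], Dict[str, Any]]
    clock: Callable[[], float] = time.monotonic
    access_token: Optional[str] = None
    refresh_token: Optional[str] = None
    expires_at: float = 0.0

    def exchange_code(self, authorization_code: str) -> None:
        """Request an access token and refresh token using the granted authorization code."""
        self._store(self.request_token(
            {"grant_type": "authorization_code", "code": authorization_code}
        ))

    def current_token(self) -> str:
        """Return a usable access token, refreshing it once it has expired."""
        if self.access_token is None or self.refresh_token is None:
            raise RuntimeError("call exchange_code() before requesting a token")
        if self.clock() >= self.expires_at:
            self._store(self.request_token(
                {"grant_type": "refresh_token", "refresh_token": self.refresh_token}
            ))
        return self.access_token

    def _store(self, payload: Dict[str, Any]) -> None:
        self.access_token = str(payload["access_token"])
        new_refresh = payload.get("refresh_token")
        if new_refresh is not None:  # keep the old refresh token if none is returned
            self.refresh_token = str(new_refresh)
        self.expires_at = self.clock() + float(payload["expires_in"])
```

Injecting the request function and the clock keeps the expiry logic testable without a live authorization server.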
9. The method of claim 6, wherein in response to the parameter name being non-user ID-related, generating a non-user ID authentication identification, the generating the non-user ID authentication identification including: receiving the parameter name and the noun phrases of the noun phrase list; computing a similarity score indicative of a similarity between the parameter name and the noun phrases; in response to the similarity score being greater than a particular threshold, determining whether an output includes a non-user ID; and in response to the output including the non-user ID: executing the endpoint and retrieving the non-user ID; and in response to the output not including the non-user ID, evaluating another endpoint.
10. The method of claim 1, further comprising storing a validation result in a validation repository.
11. A non-transitory computer-readable medium having encoded therein programming code executable by one or more processors to perform or control performance of operations comprising: extracting API information from an API repository, the API information including a parameter placeholder, parameter information related to a parameter of an API endpoint, an endpoint of the API, an endpoint description, a description of the API, a description of the parameter, response information, an authentication requirement information, and an API name; resolving the parameter of the API endpoint, wherein resolution of the parameter includes: extracting the API name, the endpoint, the API description, and the parameter description; retrieving a parameter name and a parameter placeholder from the endpoint; generating a noun phrase list of noun phrases parsed from the endpoint description and the parameter description based on a syntactic analysis; extracting a keyword from the parameter name and the noun phrase list; classifying data type and domain of the parameter based at least partially on a query of the keyword in a third-party data repository; and parsing a label from a query result as a sample parameter value for the parameter in the API; communicating, to a native API system, a request using the sample parameter value for the parameter for the API; comparing a response from the native API system with the response information to determine a validity of the API endpoint; and based on the validity of the API endpoint, verifying integrity of a software application implementing the API endpoint for use with a native software application on the native API system.
12. The non-transitory computer-readable medium of claim 11, wherein: the parameter is a first parameter of a plurality of parameters; the endpoint is a first endpoint of a plurality of endpoints; and the operations further comprise: resolving each parameter of the plurality of parameters of each endpoint of the plurality of endpoints; and generating a validation plan that includes an aggregation of information to create a list of fully resolved parameters to be tested for each endpoint of the plurality of endpoints.
13. The non-transitory computer-readable medium of claim 11, wherein the extracting the parameter name and the parameter placeholder includes: retrieving the parameter placeholder from the endpoint; removing special characters from the retrieved parameter placeholder; comparing the placeholder name with the retrieved parameter placeholder; in response to the placeholder name being the same as the retrieved parameter placeholder: adding the parameter name as a parameter word; and using an API language model, capturing two or more words in the parameter word and adding a space between the two or more captured words; in response to the placeholder name being different from the retrieved parameter placeholder: adding the parameter name and the retrieved parameter placeholder as parameter words; and using the API language model, capturing two or more words in each of the parameter words and adding a space between the captured words.
14. The non-transitory computer-readable medium of claim 11, wherein the extracting the keyword includes: receiving the noun phrase list and the parameter name; computing a similarity score based on a similarity between the parameter name and at least one of the noun phrases of the noun phrase list; and in response to the similarity score being greater than a particular threshold, outputting the at least one of the noun phrases as the keyword.
15. The non-transitory computer-readable medium of claim 11, wherein classifying the data type and domain includes: computing a first word vector for the extracted keyword and the noun phrases of the noun phrase list; generating the query of the keyword in the third-party data repository; computing a second word vector for classes and categories of a first query result and a third word vector for classes and categories of a second query result; computing a first cosine similarity between the first word vector and the second word vector; computing a second cosine similarity between the first word vector and the third word vector; and returning the first query result in response to the first cosine similarity being greater than the second cosine similarity.
16. The non-transitory computer-readable medium of claim 11, wherein the operations further comprise: determining whether the parameter name is ID-related; in response to the parameter name being ID-related, determining whether the parameter name is user ID-related or non-user ID-related; in response to the parameter name being user ID-related, generating a user ID authentication identification, the generating the authentication identification including: searching for a sign up keyword with the API name in a public search repository; selecting a particular number of search results; following a link to a page returned in search results; determining, based on a sign up page language model, whether the page is a sign up page; in response to the page being a sign up page, performing an automated sign in process; and in response to the page not being a sign up page: extracting links on the page; computing a similarity score that indicates a similarity between a key list and the extracted links; and in response to the similarity score being greater than a particular threshold, selecting another particular number of results from the search for the keyword.
17. The non-transitory computer-readable medium of claim 16, wherein the automated sign in process includes: extracting text of one or more fields in the sign up page; entering first values into the fields; submitting the first values in the field of the sign up page; determining whether a sign-up process is complete based on the sign up page language model; in response to the sign-up process being complete, returning the first values as a user ID and a password; and in response to the sign-up process being incomplete, re-extracting text of the fields, entering second values into the fields, and submitting the second values in the field.
18. The non-transitory computer-readable medium of claim 16, wherein the operations further comprise: requesting authorization to access data; at a log-in prompt page, determining whether a user ID and a password have been extracted; in response to the user ID and the password not being extracted, generating the user ID authentication identification; in response to the user ID and the password being extracted: accessing a granted authorization code; requesting an access token and refresh token using the granted authorization code; and following a particular period of time after which the access token expires, requesting a new access token using the refresh token.
19. The non-transitory computer-readable medium of claim 16, wherein in response to the parameter name being non-user ID-related, generating a non-user ID authentication identification, the generating the non-user ID authentication identification including: receiving the parameter name and the noun phrases of the noun phrase list; computing a similarity score indicative of a similarity between the parameter name and the noun phrases; in response to the similarity score being greater than a particular threshold, determining whether an output includes a non-user ID; and in response to the output including the non-user ID: executing the endpoint and retrieving the non-user ID; and in response to the output not including the non-user ID, evaluating another endpoint.
20. The non-transitory computer-readable medium of claim 11, wherein the operations further comprise storing a validation result in a validation repository.
November 20, 2018