US-20260045324-A1

Machine Learning System for Analyte Identification

Published: February 12, 2026
Assignee: not available in USPTO data we have
Technical Abstract

An example method performed via a computing device for providing support to a mass spectrometry (MS) system includes obtaining from a mass spectral library (i) an ordered hitlist of reference spectra corresponding to a set of fragmentation spectra of an analyte acquired with the MS system; and (ii) a first set of metadata corresponding to the ordered hitlist of reference spectra. The method also includes obtaining from the MS system a second set of metadata corresponding to the set of fragmentation spectra. The method also includes evaluating an order of entries in the ordered hitlist with a machine learning model based on the first set of metadata and the second set of metadata.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

1

A method performed via a computing device for providing support to a mass spectrometry (MS) system, the method comprising: obtaining from a mass spectral library: an ordered hitlist of reference spectra corresponding to a set of fragmentation spectra of an analyte acquired with the MS system; and a first set of metadata corresponding to the ordered hitlist of reference spectra; obtaining from the MS system a second set of metadata corresponding to the set of fragmentation spectra; and evaluating an order of entries in the ordered hitlist with a machine learning (ML) model based on the first set of metadata and the second set of metadata.

2

The method of claim 1, wherein obtaining the ordered hitlist includes submitting a search query with the set of fragmentation spectra to the mass spectral library.

3

The method of claim 1, wherein at least one of the first and second sets of metadata includes a respective set of one or more parameters selected from the group consisting of: a normalized or absolute collision energy; one or more parameters of an ion activation method; a number of candidate compounds; one or more values of a matching score; an identity of a precursor ion; a number of peaks in a spectrum; a sparseness measure; an intensity of a peak and a corresponding accuracy; one or more distances between peaks in a spectrum; one or more spectrum labels; and a mean or average value of a selected numerical characteristic.

4

The method of claim 3, wherein the matching score is computed using a dot product of a corresponding pair of spectra.

5

The method of claim 4, wherein the ordered hitlist is ordered in a descending order of matching scores of the entries.

6

The method of claim 1, wherein the ML model includes a model selected from the group consisting of: a random forest classifier; a gradient boosting model; a k-nearest neighbors algorithm; and a variational autoencoder.

7

The method of claim 1, wherein the evaluating comprises: with an encoder, generating a features vector based on the set of fragmentation spectra of the analyte, the reference spectra from the ordered hitlist, the first set of metadata, and the second set of metadata; and applying the features vector to the ML model.

8

The method of claim 7, wherein the ML model is configured to change the order of the entries.

9

The method of claim 8, further comprising displaying, on a display device, a modified hitlist having the changed order of the entries.

10

The method of claim 1, wherein the ML model is configured to determine one or more of: an estimated probability of the analyte belonging to a specified compound class; an estimated probability of the analyte belonging to a specified chemical class; and an estimated probability of the analyte being from a same compound class as a compound corresponding to a selected reference spectrum from the ordered hitlist.

11

The method of claim 10, wherein at least one of the estimated probabilities differs from a corresponding probability determined at the mass spectral library.

12

The method of claim 11, wherein the ML model is configured to determine an adjustment value to a matching score value provided by the mass spectral library with a respective reference spectrum of the ordered hitlist.

13

A non-transitory computer-readable medium storing instructions that, when executed by the computing device, cause the computing device to perform operations comprising the method of claim 1.

14

An apparatus for providing support to a mass spectrometry (MS) system, the apparatus comprising: an interface device; a processing device; and a memory device including program code, wherein the memory device and the program code are configured to, with the interface device and the processing device, cause the apparatus at least to: obtain from a mass spectral library: an ordered hitlist of reference spectra corresponding to a set of fragmentation spectra of an analyte acquired with the MS system; and a first set of metadata corresponding to the ordered hitlist of reference spectra; obtain from the MS system a second set of metadata corresponding to the set of fragmentation spectra; and evaluate an order of entries in the ordered hitlist with a machine learning (ML) model based on the first set of metadata and the second set of metadata.

15

The apparatus of claim 14, wherein at least one of the first and second sets of metadata includes a respective set of one or more parameters selected from the group consisting of: a normalized or absolute collision energy; one or more parameters of an ion activation method; a number of candidate compounds; one or more values of a matching score; an identity of a precursor ion; a number of peaks in a spectrum; a sparseness measure; an intensity of a peak and a corresponding accuracy; one or more distances between peaks in a spectrum; one or more spectrum labels; and a mean or average value of a selected numerical characteristic.

16

The apparatus of claim 14, wherein the ML model includes a model selected from the group consisting of: a random forest classifier; a gradient boosting model; a k-nearest neighbors algorithm; and a variational autoencoder.

17

The apparatus of claim 14, wherein the memory device and the program code are further configured to, with the interface device and the processing device, cause the apparatus to: with an encoder, generate a features vector based on the set of fragmentation spectra of the analyte, the reference spectra from the ordered hitlist, the first set of metadata, and the second set of metadata; and apply the features vector to the ML model.

18

The apparatus of claim 17, wherein the ML model is configured to change the order of the entries.

19

The apparatus of claim 18, wherein the memory device and the program code are further configured to, with the interface device and the processing device, cause the apparatus to display, on a display device, a modified hitlist having the changed order of the entries.

20

The apparatus of claim 14, wherein the ML model is configured to determine one or more of: an estimated probability of the analyte belonging to a specified compound class; an estimated probability of the analyte belonging to a specified chemical class; and an estimated probability of the analyte being from a same compound class as a compound corresponding to a selected reference spectrum from the ordered hitlist.

Detailed Description

Complete technical specification and implementation details from the patent document.

Various examples relate generally, but not exclusively, to support systems for scientific instruments, such as mass spectrometry systems.

Determining the identity of a compound is one of the main tasks in chemical analysis. For compounds in complex mixtures, fragmentation patterns of their ions can be used for this purpose. The corresponding mass spectra can provide both the elemental composition of the compound and a direct read-out of labile bonds. When combined with gas or liquid chromatography (GC or LC), mass spectrometry (MS) can distinguish hundreds of components in complex mixtures.

Steady advances in the sensitivity and resolution of mass spectrometers continue to provide new capabilities for detecting ever-increasing numbers of components in chemical mixtures. Dealing with the increasing number of identifiable compounds and associated digital data presents a significant challenge to the effective use of such advanced instruments. An example tool for analyzing high-resolution mass spectrometry (HR-MS) data includes the use of spectral libraries: collections of chemical structures and their mass spectra that can support fast, reliable identification of a compound whose fragmentation pattern is measured with the MS instrument.

Disclosed herein are, among other things, various examples, aspects, features, and embodiments of a machine learning (ML) system for analyte identification. In one example, an ML classification algorithm for small molecule identification presents a modified scoring and ranking system that considers certain metadata of the query spectra and of the corresponding mass spectral library matches. Such metadata may include, but are not limited to, the normalized collision energy, ion activation method parameters, and/or pertinent characteristics of the candidate compound pre-selection. In at least some examples, the modified scoring and ranking system beneficially enables the corresponding ML system to generate modified hitlists having a higher proportion of correct candidates at the top thereof due to the combined use of similarity scoring and data-driven model stacking.

One example provides a method performed via a computing device for providing support to an MS system, the method comprising: obtaining from a mass spectral library (i) an ordered hitlist of reference spectra corresponding to a set of fragmentation spectra of an analyte acquired with the MS system; and (ii) a first set of metadata corresponding to the ordered hitlist of reference spectra; obtaining from the MS system a second set of metadata corresponding to the set of fragmentation spectra; and evaluating an order of entries in the ordered hitlist with an ML model based on the first set of metadata and the second set of metadata.

Another example provides an apparatus for providing support to an MS system, the apparatus comprising: an interface device; a processing device; and a memory device including program code, wherein the memory device and the program code are configured to, with the interface device and the processing device, cause the apparatus at least to: obtain from a mass spectral library (i) an ordered hitlist of reference spectra corresponding to a set of fragmentation spectra of an analyte acquired with the MS system; and (ii) a first set of metadata corresponding to the ordered hitlist of reference spectra; obtain from the MS system a second set of metadata corresponding to the set of fragmentation spectra; and evaluate an order of entries in the ordered hitlist with an ML model based on the first set of metadata and the second set of metadata.

Mass spectral libraries are an important resource to analytical chemists across a variety of applications. For example, the National Institute of Standards and Technology (NIST) provides several curated libraries of mass spectral reference data. Additionally, NIST produces and distributes search software for interacting with the libraries. Other mass spectral libraries and the corresponding search software for interacting with those libraries are available from various additional providers including, but not limited to, mzCloud, Mass Frontier, and myLibrary.

A typical mass spectral library search algorithm calculates a matching score (also sometimes referred to as the match factor) between a query spectrum and a set of reference spectra. In some examples, the matching score is represented by an integer between 0 and 999 that quantifies the “similarity” between a pair of spectra. Different search algorithms usually differ in how they compute the matching scores. In different search configurations, the set of reference spectra may include the entire library of spectra (i.e., no pre-search is performed) or a selected subset of the library spectra identified during preselection or pre-search. The search algorithm typically returns a list of reference spectra sorted in the descending order of the matching score. The returned ordered list is often referred to as the “hitlist” of the query.
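The scoring and ranking steps described above can be illustrated with a minimal sketch. It is not the formula used by any particular search software: it assumes unit-width m/z binning and a plain cosine (normalized dot-product) similarity scaled to the 0 to 999 range, and the names `bin_spectrum`, `match_score`, and `build_hitlist` are hypothetical.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into integer m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def match_score(query, reference, bin_width=1.0):
    """Cosine similarity of two binned spectra, scaled to an integer 0..999."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    if norm == 0.0:
        return 0
    return int(round(999 * dot / norm))

def build_hitlist(query, library):
    """Return (name, score) pairs sorted in descending order of score."""
    scored = [(name, match_score(query, ref)) for name, ref in library.items()]
    return sorted(scored, key=lambda hit: hit[1], reverse=True)

# Illustrative (made-up) spectra: lists of (m/z, intensity) pairs.
query = [(41.0, 30.0), (69.0, 100.0), (97.0, 55.0)]
library = {
    "ref_A": [(41.0, 30.0), (69.0, 100.0), (97.0, 55.0)],
    "ref_B": [(41.0, 80.0), (55.0, 100.0)],
    "ref_C": [(120.0, 100.0)],
}
hitlist = build_hitlist(query, library)
```

In this toy library, the reference identical to the query lands at the top of the hitlist and the reference with no shared peaks lands at the bottom.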

Three commonly employed algorithms implemented in NIST MS Search software are the normal-identity, simple-similarity, and hybrid-similarity searches. Each of these algorithms uses some or all of the following basic operations: pre-search, peak matching, dot-product calculation, matching score calculation, and hitlist ranking and display. When used, pre-searching selects a subset of the library spectra likely to score highly. The objective of the normal-identity search algorithm is to return a hitlist that contains the correct identification of the query spectrum, preferably at the top of the hitlist. The objective of both the “simple” and “hybrid” similarity searches is to return a hitlist that can help an analyst to propose a structure for their query compound (analyte).

Whenever an expert analyst reviews a hitlist, they can often pinpoint certain weaknesses in the putative identification of the analyte. One example weakness might occur when the respective experimental conditions of the reference spectrum and query do not sufficiently match. Another example weakness might occur when there are too few peaks for sufficiently confident identification, etc. Some embodiments disclosed herein are directed at replacing the manual, human-centric “expert analyses” of the hitlist by a machine learning (ML) model that can increase confidence in the identification workflow. In some examples, the disclosed ML system for analyte identification can beneficially achieve higher specificity (e.g., through reducing the number of false positives) compared to algorithms relying on the metrics derived mostly from the spectrum similarity determined based on the above-mentioned dot-product metrics and the like.

According to one example, an ML classification algorithm for small molecule identification presents a modified scoring and ranking system that considers annotations and metadata of the query and library-spectra matches. On top of the similarity metric between the unknown and candidate spectra, the ML classification algorithm employing a suitably selected classifier, such as, for example, Random Forest, Logistic Regression, Bayesian Network, XGBOOST, LightGBM and the like, takes into consideration certain metadata, such as the normalized collision energy, ion activation method parameters, and pertinent characteristics of the candidate compound pre-selection (such as the number of candidate compounds, average spectral matching score, and others). In some examples, the side information associated with the fragmentation spectra that is fed into the ML classification algorithm may include one or more of the following: precursor ion m/z, precursor ion formula, charge state, peaks count, peak sparseness, peak accuracy, peak intensity, peak distances, neutral losses, peaks breakdown curve, peaks formula, peaks structure, chemical class, compound class and one or more mean, median, average, standard deviations, standard errors, relative standard deviation, relative error or variance values thereof. In at least some examples, the modified scoring and ranking system beneficially enables the corresponding ML system to generate hitlists having a higher proportion of correct candidates at the top thereof.

FIG. 1 is a block diagram illustrating an ML system 100 for analyte identification according to some examples. The system 100 may be implemented by circuitry (e.g., including electrical and/or optical components), such as one or more programmed computing devices. Examples of computing devices that may, singly or in combination, implement the ML system 100 are described in more detail below in reference to FIG. 8. Additionally, examples of systems of interconnected computing devices, to which the corresponding MS instrument is connected, are described in more detail below in reference to FIG. 9. In the example shown, the ML system 100 includes an input module 110, a service module 120, a model module 130, and an output module 140.

The input module 110 may be provided as part of a user interface through which search queries can be submitted for analyte identification. In one example, a search query submitted through the input module 110 includes one or more fragmentation mass spectra acquired with the corresponding MS instrument. The submitted mass spectra typically correspond to the same unknown compound. Each mass spectrum is a part of a spectrum dataset that typically includes pertinent metadata in addition to the mass spectrum itself. Herein, the term “mass spectrum” refers to a list of (m/z, I) data points, where m, z, and I are mass, charge, and intensity, respectively. The metadata may include one or more spectrum labels, a set of sample characteristics, and/or a set of acquisition parameters with which the spectrum was measured by the MS instrument. Various embodiments may accept one or more of: electron ionization (EI) spectra, small molecule tandem spectra, and peptide tandem spectra. EI searches are typically performed with unit-mass resolution, whereas tandem searches typically accept high-resolution spectra using either relative (ppm) or absolute (m/z) tolerances.
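As a rough illustration of the spectrum dataset and of the two tolerance conventions mentioned above, the following sketch pairs a peak list with a metadata dictionary and compares two m/z values under a relative (ppm) or an absolute (m/z) tolerance. The class name, field names, and metadata keys are assumptions made for this example only.

```python
from dataclasses import dataclass, field

@dataclass
class SpectrumDataset:
    """A mass spectrum plus its metadata (hypothetical container)."""
    peaks: list                                   # list of (m/z, intensity) pairs
    metadata: dict = field(default_factory=dict)  # labels, acquisition parameters

def within_tolerance(mz_query, mz_ref, tol, unit="ppm"):
    """Return True if two m/z values agree within the given tolerance."""
    delta = abs(mz_query - mz_ref)
    if unit == "ppm":
        # Relative tolerance: parts per million of the reference m/z.
        return delta <= mz_ref * tol * 1e-6
    if unit == "mz":
        # Absolute tolerance in m/z units.
        return delta <= tol
    raise ValueError("unit must be 'ppm' or 'mz'")

# Illustrative values only.
spectrum = SpectrumDataset(
    peaks=[(500.0010, 100.0), (250.1200, 40.0)],
    metadata={"activation": "HCD", "collision_energy": 30.0},
)
close_enough = within_tolerance(spectrum.peaks[0][0], 500.0, tol=5, unit="ppm")
```

At m/z 500, a 5 ppm window corresponds to 0.0025 m/z units, so a peak at 500.0010 falls inside it while a peak at 500.0040 does not.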

The service module 120 provides connectivity between the input module 110 and the model module 130. In some examples, the trained ML model provided via the model module 130 resides at the same server as the pertinent mass spectral library. In some other examples, the trained ML model provided via the model module 130 resides at a different network-connected server or at a local (to the user) computing device. In latter examples, the service module 120 also provides operational interconnectivity between the model module 130 and the mass spectral library (not explicitly shown, e.g., see FIG. 4). In some examples, the service module 120 is configured to provide flexibility in data, library, and ML model access modes, e.g., with FaaS functions, various programming languages, “big data” solutions, and the like. Herein, the acronym “FaaS” stands for function as a service, which is a category of cloud computing services allowing customers to develop, run, and manage application functionalities without the complexity of building and maintaining the infrastructure typically associated with developing and launching an app.

In some examples, the trained ML model provided via the model module 130 is a random forest classifier. In some other examples, suitable alternatives to the random forest classifier can similarly be used. Example alternatives include a gradient boosting model, a k-Nearest Neighbors (kNN) algorithm, a variational autoencoder, and so on.

In some examples, the output module 140 is provided as part of the same user interface as the input module 110. In such examples, the service module 120 additionally provides operational connectivity between the model module 130 and the output module 140, e.g., as indicated in FIG. 1 by the dashed arrow. Example outputs generated with the trained ML model of the model module 130 and provided to the user via the output module 140 include one or more of: (i) the estimated probability of the compound characterized by the queried spectra belonging to a given compound class; (ii) the estimated probability of the compound characterized by the queried spectra belonging to a given chemical class; (iii) the estimated probabilities of the queried spectra being similar to some or all of the library spectra from the hitlist; and (iv) comparison plots for the queried spectra and the library spectra from the hitlist.

FIG. 2 is a block diagram illustrating the model module 130 according to one example. In the example shown, the model module 130 includes an encoder 210 and a trained ML model 220. The encoder 210 operates to transform the inputs 202, 204, 206, and 208 into a corresponding features vector 212. The features vector 212 is then processed with the trained ML model 220 to generate an output 222 for the output module 140.

In some examples, the inputs 202, 204, 206, and 208 are as follows. The input 202 has a set of mass spectra of the analyte acquired with the corresponding MS instrument. In some examples, such set includes a single spectrum. In some other examples, such set includes a plurality of spectra. The input 204 has a set of metadata corresponding to the spectra of the input 202. The input 206 has the hitlist spectra obtained from the mass spectral library via a conventional search for closest matches to the spectra of the input 202. The input 208 has a set of metadata corresponding to the hitlist spectra of the input 206 that is retrieved from the mass spectral library together with those spectra.

In one example, to accommodate the widely varying inputs 202, 204, 206, 208 for different analyte samples, the encoder 210 first performs tokenization configured to map the inputs 202, 204, 206, 208 to a corresponding token vector having a fixed length. The token vector is then further encoded by the encoder 210 to obtain the corresponding features vector 212. The features vector 212 can qualitatively be understood as containing the information about various features conveyed via the inputs 202, 204, 206, 208 in the form that can be leveraged to guide the generation of the output 222 in the trained ML model 220.

FIG. 3 is a block diagram illustrating the encoder 210 according to another example. In the example shown, the encoder 210 includes a feature extraction block 310, a feature engineering block 320, and a feature encoding block 330 serially connected as indicated in FIG. 3. The inputs 202, 204, 206, 208 are applied to the feature extraction block 310. The output 212 is generated by the feature encoding block 330.

In the feature extraction block 310, the inputs 202, 204, 206, and 208 are used to identify and extract relevant features from the raw data. The corresponding operations of the block 310 include applying data preparation techniques, such as scaling, missing value imputation, and separating mixed variables into separate features on the raw dataset for further processing. A first intermediate output 312 generated with the feature extraction block 310 is used in the feature engineering block 320 to create new features or to transform existing features. The corresponding operations of the block 320 may include: (i) for pairs of features, computing a first discrete difference of elements; and (ii) for individual features, computing a description of the set with respect to the group including, but not limited to, the rank, size, mean, standard deviation, and absolute deviation of the mean from raw numerical features as well as features based on relative abundance and matching peaks. A second intermediate output 322 generated by the feature engineering block 320 is further transformed in the feature encoding block 330. For example, certain ML algorithms work exclusively with numerical values and, as such, it may be advisable to transform categorical values of relevant features into numerical features, e.g., in the form of vector values. Consequently, in the block 330, certain categorical features, such as the types of analyzers, ion activation methods, and ionization methods, can be transformed from their respective categorical values into the relevant numerical features that can be processed by the ML model employed in the block 130. In some examples, the features vector 212 is additionally processed to make sure that only relevant features, such as the target label, are included therein, whereas outliers and missing values are not present in the dataset used for the training and validation of the ML model 220.
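The feature engineering and feature encoding operations described above might look as follows in a much-simplified form. This is only a sketch under stated assumptions: the metadata keys (`collision_energy`, `activation`), the two-item vocabulary of activation methods, and the function names are hypothetical, and one-hot encoding is just one possible way to turn categorical values into numerical vectors.

```python
from statistics import mean

ACTIVATION_METHODS = ["CID", "HCD"]  # assumed vocabulary for this illustration

def one_hot(value, vocabulary):
    """Encode a categorical value as a numerical 0/1 vector."""
    return [1.0 if value == item else 0.0 for item in vocabulary]

def engineer_features(query_meta, hits):
    """Build one numerical feature row per hitlist entry."""
    mu = mean(hit["score"] for hit in hits)
    rows = []
    for hit in hits:
        row = [
            hit["score"],
            # Pairwise difference between a query and a library feature.
            query_meta["collision_energy"] - hit["collision_energy"],
            # Absolute deviation of the score from the hitlist mean.
            abs(hit["score"] - mu),
        ]
        # Categorical features transformed into numerical vectors.
        row += one_hot(query_meta["activation"], ACTIVATION_METHODS)
        row += one_hot(hit["activation"], ACTIVATION_METHODS)
        rows.append(row)
    return rows

# Illustrative (made-up) values.
query_meta = {"collision_energy": 30.0, "activation": "HCD"}
hits = [
    {"score": 900.0, "collision_energy": 30.0, "activation": "HCD"},
    {"score": 700.0, "collision_energy": 45.0, "activation": "CID"},
]
rows = engineer_features(query_meta, hits)
```

Each resulting row has the same length regardless of the hit, which is what a downstream classifier expecting fixed-size numerical input would need.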

FIG. 4 is a block diagram illustrating a workflow 400 for preparing the inputs 202, 204, 206, 208 according to one example. The workflow 400 includes operating the corresponding MS instrument to acquire one or more MS spectra 412 of the analyte in question. The corresponding set of acquisition parameters for each of the acquired MS spectra 412 is exported from the MS instrument to form at least a portion of the input 204 to the encoder 210.

The workflow 400 also includes using the acquired MS spectra 412 to submit a corresponding search query to a mass spectral library 420 for performing a corresponding search 416 therein. The mass spectral library 420 has a collection of searchable reference spectra annotated with compound identifiers and further associated with the corresponding experimental and compound-related metadata. The search 416 returns a hitlist 430 including a subset of the reference spectra from the mass spectral library 420 and further including the corresponding similarity metrics with respect to the experimentally acquired MS spectra 412. In some examples, the similarity metrics used with the hitlist 430 include the above-mentioned matching scores. A set of experimental and compound-related metadata corresponding to the reference spectra of the hitlist 430 is also retrieved from the mass spectral library 420. In some examples, the mass spectral library 420 is the NIST library or the mzCloud library.

The workflow 400 optionally includes trimming or filtering the set of metadata retrieved from the mass spectral library 420 to remove parts of the metadata representing the features not used with the model module 130. Depending on the system embodiment and the specifics of the used mass spectral library 420, the original set of metadata retrieved from the mass spectral library 420 or the trimmed/filtered subset thereof is used to form the input 208 to the encoder 210.

The workflow 400 also includes applying one or more preprocessing operations 432 to the hitlist 430 and the associated MS spectra 412. Outputs generated with the preprocessing operations 432 are used to form the inputs 202 and 206 to the encoder 210. Examples of the preprocessing operations 432 include: removing the compound labels, removing at least some of the annotations, removing some library compounds from the hitlist based on structural similarity or dissimilarity, removing specific peaks from the hitlist, and the like.

FIG. 5 is a block diagram illustrating a training process 500 used to train the ML model 220 (FIG. 2) according to one example. In general, the ML-model training process depends on the type of the model. In the example shown, the training process 500 corresponds to an embodiment in which the ML model 220 is a random forest classifier. Based on the provided description, a person of ordinary skill in the pertinent art will be able to implement other training processes suitable for other model types, without any undue experimentation.

A random forest classifier includes a plurality of decision trees, each of which outputs a respective prediction. When performing a classification task, each decision tree in the random forest votes for one of the classes to which the input may belong. After all of the trees have voted, the random forest classifier counts which class has the most populous vote, and this class is what the random forest classifier outputs as a final prediction. An individual decision tree splits data into groups of data based on the features represented by the data. The decision tree will continue to split the data into groups until a small set of data under one label (class representation) remains. The decision tree determines where to split the represented features based on a purity measure that measures information gain. For the classification task, the decision tree makes that decision based on the Gini index or entropy, and in the case of regression, based on the residual sum of squares.
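The purity measures mentioned above can be written out in a few lines. The sketch below computes the Gini index and the entropy of a set of class labels, and the impurity reduction achieved by a candidate split; the function names and the labels are illustrative assumptions.

```python
import math
from collections import Counter

def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split_gain(parent, left, right, impurity=gini):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

# Illustrative labels: a perfectly mixed node split into two pure children.
parent = ["hit", "hit", "miss", "miss"]
gain_gini = split_gain(parent, ["hit", "hit"], ["miss", "miss"])
gain_entropy = split_gain(parent, ["hit", "hit"], ["miss", "miss"], impurity=entropy)
```

A decision tree evaluates candidate splits in this way and keeps the one with the largest impurity reduction.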

The random forest logic can briefly be described as follows:
i. Suppose the number of observations is N. These N observations will be sampled at random with replacement.
ii. Suppose there are M features in the observations. A number m, where m<M, will be selected at random at each node from the total number of features, M. The best split on these m variables is used to split the node, and the value of m typically remains constant as the forest grows.
iii. Each decision tree in the forest is grown to its largest extent.
iv. The forest will output a prediction based on the aggregated predictions of the trees in the forest. In different examples, the aggregation can be either the majority vote or average.
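Steps i through iv can be sketched in plain Python as shown below. This is a toy illustration, not the implementation used in the described system: it assumes numerical features, string class labels, Gini-based splits, and majority-vote aggregation, and a node becomes a leaf whenever none of its m randomly chosen features yields an improving split. All function names and the example data are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_impurity = None, gini(labels)
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if impurity < best_impurity:
                best, best_impurity = (f, threshold), impurity
    return best

def grow_tree(rows, labels, m, rng):
    """Steps ii and iii: pick m random features per node, grow without pruning."""
    if len(set(labels)) == 1:
        return labels[0]                                   # pure leaf
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), m))
    if split is None:
        return Counter(labels).most_common(1)[0][0]        # majority leaf
    f, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > threshold]
    return (f, threshold,
            grow_tree([r for r, _ in left], [y for _, y in left], m, rng),
            grow_tree([r for r, _ in right], [y for _, y in right], m, rng))

def predict_tree(tree, row):
    while isinstance(tree, tuple):                         # internal nodes are tuples
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree

def train_forest(rows, labels, n_trees=25, m=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step i: draw N observations at random with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], m, rng))
    return forest

def predict_forest(forest, row):
    """Step iv: aggregate the tree predictions by majority vote."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Made-up features: [matching score, collision-energy difference].
rows = [[900, 0], [850, 5], [870, 2], [400, 20], [350, 25], [300, 30]]
labels = ["hit", "hit", "hit", "miss", "miss", "miss"]
forest = train_forest(rows, labels, n_trees=25, m=1)
```

Calling `predict_forest(forest, [900, 0])` returns the class that most trees vote for. For regression tasks, the vote would be replaced by an average of the tree outputs.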

The training process 500 applied to the random forest classifier 220′ uses a volume 502 of reference MS data with the corresponding experimental-conditions and compound-related metadata. The volume 502 is encoded with the encoder 210 into a corresponding plurality 504 of features vectors 212 for which the ground truth classification is known. The plurality 504 is then split into a validation dataset 510 and a training dataset 520. The training dataset 520 is used to train the random forest classifier 220′, while the validation dataset 510 is used to evaluate the classifier's performance. In one example, 75% of the plurality 504 is used for training, and the remaining 25% is used for evaluation. In other examples, other splitting ratios can also be used. Herein, the notation 220′ indicates that the random forest classifier is not fully trained yet. As indicated above, the corresponding fully trained random forest classifier is denoted with the reference numeral 220 (e.g., see FIG. 2).
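A 75%/25% split of labeled features vectors into training and validation datasets can be sketched as follows. The shuffling step, the fixed seed, and the function name are assumptions of this example; the text above only specifies the splitting ratio.

```python
import random

def split_dataset(vectors, labels, train_fraction=0.75, seed=0):
    """Shuffle the labeled vectors and split them into training and validation sets."""
    indices = list(range(len(vectors)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train = [(vectors[i], labels[i]) for i in indices[:cut]]
    valid = [(vectors[i], labels[i]) for i in indices[cut:]]
    return train, valid

# Eight made-up features vectors with known ground-truth labels.
vectors = [[float(i), float(i % 3)] for i in range(8)]
labels = ["hit" if i % 2 == 0 else "miss" for i in range(8)]
train, valid = split_dataset(vectors, labels)
```

With eight labeled vectors, six end up in the training dataset and two in the validation dataset, and no vector appears in both.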

When properly configured and trained, the random forest classifier 220 should perform approximately equally on both of the datasets 510 and 520. The training process 500 includes an evaluation module 530 configured to evaluate the relative performance of the random forest classifier 220′ on the datasets 510 and 520. Based on the evaluation 530, adjustments 532 are applied to the random forest classifier 220′. The evaluations 530 and the adjustments 532 are iteratively repeated until the applicable training-stoppage criterion is met. At that point, the random forest classifier 220′ is locked and is deemed to be fully trained and suitable for use in the model module 130 (FIG. 2). Example operations performed during the training process 500 may include some or all of the following: data exploration, exploratory data analysis (EDA), setting threshold values, splitting the training data into training and validation datasets, scaling the data representing different features to the same scale, instantiating the forest, making predictions, evaluating the classifier performance using a scoring method, evaluating the classifier performance using a confusion matrix, ranking the features in the order of importance, choosing the number of trees for the forest, choosing the metric(s) used to split the features vectors into data groups, tuning the forest parameters using a random grid search, creating a dictionary of values to choose from, and tuning the forest parameters using a grid search within a delimited parameter space.

FIG. 6 is a flowchart illustrating a method 600 for analyte identification according to some examples. The method 600 can be implemented, e.g., using the ML system 100 of FIG. 1. The method 600 is described below in continued reference to FIGS. 1, 2, 4, and 6.

The method 600 includes the ML system 100 (FIG. 1) obtaining, from an MS system, a set of fragmentation spectra of an analyte and the corresponding set of metadata (in a block 602). In some examples, the set of fragmentation spectra is experimentally measured with the MS system. In some other examples, the set of fragmentation spectra is retrieved from a network-connected storage to which it was previously transferred from the MS system upon completion of the experimental acquisition. In various examples, the set of metadata obtained in the block 602 includes side information pertaining to the experimental conditions, system configuration, and analyte sample.

The method 600 also includes the ML system 100 obtaining, from the mass spectral library 420 (FIG. 4), an ordered hitlist and the corresponding set of metadata (in a block 604). In various examples, the ordered hitlist obtained in the block 604 includes reference spectra corresponding to the set of fragmentation spectra of the analyte obtained in the block 602. The hitlist is typically ordered in a descending order of the matching scores of the entries. In a typical example, a matching score is computed at the mass spectral library 420 using a spectrum similarity measure based on approaches such as the dot product or Spearman's rank correlation coefficient of the corresponding pair of spectra. Operations of the block 604 typically include submitting to the mass spectral library 420, via the service module 120 (FIG. 1), a search query with the set of fragmentation spectra obtained in the block 602.
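
As a rough illustration of a dot-product matching score and of the descending ordering of the hitlist, consider the following sketch. It is not the scoring algorithm of any particular mass spectral library. The peak-binning scheme, the `bin_width` tolerance, and the compound names are assumptions made for the example.

```python
import math


def cosine_score(query, reference, bin_width=0.01):
    """Cosine (normalized dot product) similarity of two centroided spectra.

    Each spectrum is a list of (m/z, intensity) pairs. Peaks are matched by
    rounding m/z onto a fixed grid; bin_width is an assumed tolerance.
    """
    def binned(spectrum):
        bins = {}
        for mz, intensity in spectrum:
            key = round(mz / bin_width)
            bins[key] = bins.get(key, 0.0) + intensity
        return bins

    q, r = binned(query), binned(reference)
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_r = math.sqrt(sum(v * v for v in r.values()))
    return dot / (norm_q * norm_r) if norm_q and norm_r else 0.0


def ordered_hitlist(query, library):
    """Return (name, score) pairs sorted by descending matching score."""
    scored = [(name, cosine_score(query, spec)) for name, spec in library.items()]
    return sorted(scored, key=lambda hit: hit[1], reverse=True)


# Hypothetical query and library spectra.
query = [(100.0, 50.0), (150.0, 100.0), (200.0, 25.0)]
library = {
    "compound_A": [(100.0, 50.0), (150.0, 100.0), (200.0, 25.0)],
    "compound_B": [(100.0, 100.0), (175.0, 40.0)],
    "compound_C": [(300.0, 10.0)],
}
hits = ordered_hitlist(query, library)
```

Note that a score of this kind depends only on peak positions and intensities, so two reference spectra acquired under very different conditions can still receive similar scores.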

In various examples, the sets of metadata obtained in the blocks 602, 604 may include one or more of the following components: a normalized or absolute collision energy; one or more parameters of an ion activation method; a number of candidate compounds; one or more matching scores; an identity of a precursor ion; a precursor ion formula; a fragment ion formula; a number of peaks in a spectrum; a sparseness measure; spectrum peak counts; an intensity of a peak and a corresponding accuracy; one or more distances between peaks in a spectrum; neutral losses; peaks breakdown curves; one or more spectrum labels; a mean; a median; an average; standard deviations; standard errors; a relative standard deviation; relative errors; and/or a variance value of a selected numerical characteristic. In some examples, labels and parameters are grouped into subcategories including, but not limited to, query and library spectra metadata, constructed features metadata differences, score statistics (for example, rank, size, number of unique values, mean, standard deviation, z-score, absolute deviations) over different groupings (such as a group by query spectrum and a group by query spectrum and library compound), and a maximum score in the library compound group. The subcategories are then used to calculate the statistics per query spectrum. In some examples, the two sets of metadata may have the same composition of components. In some other examples, the two sets of metadata may have different respective compositions of components.
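
The score statistics over a grouping by query spectrum can be sketched as follows. This is only an assumed illustration: the dictionary keys (`query_id`, `compound`, `score`) and the choice of the population standard deviation are hypothetical and are not specified in the description.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_score_features(hits):
    """Extend each hit with score statistics computed over its query-spectrum group.

    hits: list of dicts with hypothetical keys 'query_id', 'compound', 'score'.
    Returns new dicts carrying rank, group size, number of unique scores,
    group mean, group standard deviation, z-score, and absolute deviation.
    """
    groups = defaultdict(list)
    for hit in hits:
        groups[hit["query_id"]].append(hit)

    features = []
    for group in groups.values():
        scores = [h["score"] for h in group]
        mu, sigma = mean(scores), pstdev(scores)
        ranked = sorted(group, key=lambda h: h["score"], reverse=True)
        for rank, h in enumerate(ranked, start=1):
            features.append({
                **h,
                "rank": rank,
                "group_size": len(group),
                "n_unique": len(set(scores)),
                "group_mean": mu,
                "group_std": sigma,
                "z_score": (h["score"] - mu) / sigma if sigma else 0.0,
                "abs_dev": abs(h["score"] - mu),
            })
    return features


# Hypothetical hits for a single query spectrum.
hits = [
    {"query_id": "q1", "compound": "B", "score": 80.0},
    {"query_id": "q1", "compound": "A", "score": 90.0},
    {"query_id": "q1", "compound": "C", "score": 70.0},
]
features = group_score_features(hits)
```

Statistics of this kind let a model see not only the absolute score of a hit but also how far that hit stands out from the other candidates returned for the same query.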

The method 600 also includes the ML system 100 evaluating the order of entries in the hitlist with the ML model 220 (in a block 606). In various examples, the evaluation is based, inter alia, on the two sets of metadata obtained in the blocks 602, 604. In one example, operations of the block 606 include: (i) with the encoder 210 (FIG. 2), generating the features vector 212 based on the set of fragmentation spectra of the analyte, reference spectra from the ordered hitlist, and the two sets of metadata obtained in the blocks 602, 604; and (ii) applying the features vector 212 to the ML model 220 (FIG. 2). In response to the features vector 212, the ML model 220 may change the order of the hitlist entries in at least some cases. In such cases, the operations of the block 606 may also include determining an adjustment value to a matching score provided by the mass spectral library 420 with the respective reference spectrum of the ordered hitlist.
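
One possible shape of the encoding step is sketched below. It is an assumption-laden example rather than the disclosed encoder: the selected features, their order, and the metadata keys (`nce`, `precursor`, `match_score`) are hypothetical.

```python
def encode_pair(query_spectrum, query_meta, ref_spectrum, ref_meta):
    """Build a flat features vector for one (query spectrum, library hit) pair.

    Spectra are lists of (m/z, intensity) pairs; the metadata dicts use
    hypothetical keys chosen only for this illustration.
    """
    return [
        float(ref_meta["match_score"]),             # matching score from the library
        abs(query_meta["nce"] - ref_meta["nce"]),   # collision-energy mismatch
        float(len(query_spectrum)),                 # number of peaks in the query
        float(len(ref_spectrum)),                   # number of peaks in the reference
        1.0 if query_meta["precursor"] == ref_meta["precursor"] else 0.0,
    ]


# Hypothetical inputs for one hitlist entry.
query_spectrum = [(100.0, 50.0), (150.0, 100.0)]
query_meta = {"nce": 80.0, "precursor": "[M+H]+"}
ref_spectrum = [(100.0, 45.0), (150.0, 100.0), (200.0, 5.0)]
ref_meta = {"match_score": 88.0, "nce": 20.0, "precursor": "[M+H]+"}

features_vector = encode_pair(query_spectrum, query_meta, ref_spectrum, ref_meta)
```

A vector of this form, computed for every entry of the hitlist, is the kind of input that a classifier or ranking model could consume.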

In some examples, operations of the block 606 include the ML model 220 determining an estimated probability of the analyte belonging to a specified compound class, an estimated probability of the analyte belonging to a specified chemical class, and/or an estimated probability of the analyte being from a same compound class as a compound corresponding to a selected reference spectrum from the ordered hitlist received from the mass spectral library 420 in the block 604. In some cases, at least some of these estimated probabilities may differ from a corresponding probability determined at the mass spectral library 420.

The method 600 also includes the ML system 100 taking a responsive action (in a block 608). The taken responsive action is typically based on the evaluation results of the block 606. In one example, operations of the block 608 include displaying for a user, on a display device, a modified hitlist having a changed order of the entries therein. In another example, operations of the block 608 include notifying the user that the order of entries in the hitlist obtained in the block 604 remains unchanged. In yet another example, operations of the block 608 include suggesting a chemical identity of the analyte and displaying the corresponding molecular and structural information on the display device. In other examples, other responsive actions may also be taken in the block 608. Upon completion of the operations of the block 608, the method 600 is terminated.

FIGS. 7A-7C graphically illustrate certain operations of the method 600 according to one example. More specifically, FIG. 7A graphically shows an MS spectrum 202₁ of an analyte in question acquired by operating the corresponding MS instrument. The normalized collision energy (NCE) value corresponding to that experimental run is NCE=80%. This NCE value is an example of the metadata 204 (FIG. 2) and, as such, is labeled in FIG. 7A using the reference numeral 204₁.

FIGS. 7B and 7C graphically show hitlist spectra 206₁ and 206₂, respectively, obtained from the mass spectral library 420 via the search 416 (FIG. 4) for closest matches to the MS spectrum 202₁. The metadata corresponding to the hitlist spectra 206₁ and 206₂ that are retrieved from the mass spectral library 420 together with those spectra include the NCE values NCE=20% and NCE=80%, respectively. These NCE values are examples of the metadata 208 and, as such, are labeled in FIGS. 7B-7C using the reference numerals 208₁ and 208₂, respectively.

In the example shown, the mass spectral library 420 (FIG. 4) is configured to rank the hitlist entries based on the spectrum similarity metric, e.g., corresponding to approaches such as the cosine (dot product) metric, without considering the NCE values. According to that methodology, the spectrum 206₁ has the highest matching score among the hitlist entries. As such, the mass spectral library 420 ranks the spectrum 206₁ to be at the top of the hitlist. Accordingly, the spectrum 206₂ is listed lower in the hitlist outputted by the mass spectral library 420 than the spectrum 206₁.

Based on the above hitlist rankings, the chemical compound corresponding to the spectrum 206₁ may be predicted to be the same as or closest to the analyte in question. This, however, is an erroneous analyte identification that can beneficially be corrected with the ML system 100 (FIG. 1). More specifically, by also considering the NCE values 204₁, 208₁, and 208₂ as parts of the metadata 204 and 208, the model module 130 (FIGS. 1, 2) re-ranks the hitlist received from the mass spectral library 420. After such re-ranking, the resulting modified hitlist has the spectrum 206₂ at the top of the hitlist. Based on the modified hitlist rankings, the chemical compound corresponding to the spectrum 206₂ may be predicted to be the same as or closest to the analyte in question. In this particular example, the latter prediction provides the correct analyte identification.
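
The effect of taking the NCE metadata into account can be mimicked with a toy re-ranking sketch. The hand-written penalty below merely stands in for a trained ML model and is not the disclosed model. The matching scores (92 and 88), the `nce_weight` factor, and the entry names are invented for illustration; only the NCE values (80% for the query, 20% and 80% for the two hits) follow the example above.

```python
def rerank(query_nce, hitlist, nce_weight=0.5):
    """Re-rank library hits with a simple NCE-mismatch penalty.

    This rule is a stand-in for a trained model. hitlist is a list of dicts
    with hypothetical keys 'name', 'score' (0-100), and 'nce' (percent).
    Each hit receives an adjustment of -nce_weight points per percentage
    point of difference between its NCE and the query NCE.
    """
    adjusted = []
    for hit in hitlist:
        adjustment = -nce_weight * abs(query_nce - hit["nce"])
        adjusted.append({
            **hit,
            "adjustment": adjustment,
            "adjusted_score": hit["score"] + adjustment,
        })
    return sorted(adjusted, key=lambda h: h["adjusted_score"], reverse=True)


# Hitlist as ordered by the library (scores are illustrative assumptions).
library_hitlist = [
    {"name": "spectrum_206_1", "score": 92.0, "nce": 20.0},
    {"name": "spectrum_206_2", "score": 88.0, "nce": 80.0},
]
modified_hitlist = rerank(query_nce=80.0, hitlist=library_hitlist)
top_hit = modified_hitlist[0]["name"]
```

With these assumed numbers, the entry acquired at the same NCE as the query moves to the top of the modified hitlist, and each entry carries the adjustment value applied to its library matching score.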

FIG. 8 is a block diagram illustrating a computing device 800, one or more instances of which can be used in the ML system 100 according to some examples. In some examples, one instance of the computing device 800 is configured to implement the model module 130. In some examples, one or more instances of the computing device 800 can be used in the workflow 400, the process 500, and/or the method 600.

The computing device 800 of FIG. 8 is illustrated as having a number of components, but any one or more of these components may be omitted or duplicated, as suitable for the application and setting. In some embodiments, some or all of the components included in the computing device 800 may be attached to one or more motherboards and enclosed in a housing. In some embodiments, some of those components may be fabricated onto a single system-on-a-chip (SoC) (e.g., the SoC may include one or more electronic processing devices 802 and one or more storage devices 804). Additionally, in various embodiments, the computing device 800 may not include one or more of the components illustrated in FIG. 8, but may include interface circuitry for coupling to the one or more components using any suitable interface (e.g., a Universal Serial Bus (USB) interface, a High-Definition Multimedia Interface (HDMI) interface, a Controller Area Network (CAN) interface, a Serial Peripheral Interface (SPI) interface, an Ethernet interface, a wireless interface, or any other appropriate interface). For example, the computing device 800 may not include a display device 810, but may include display device interface circuitry (e.g., a connector and driver circuitry) to which an external display device 810 may be coupled.

The computing device 800 includes a processing device 802 (e.g., one or more processing devices). As used herein, the terms “electronic processor device” and “processing device” interchangeably refer to any device or portion of a device that processes electronic data from registers and/or memory to transform that electronic data into other electronic data that may be stored in registers and/or memory. In various embodiments, the processing device 802 may include one or more digital signal processors (DSPs), application-specific integrated circuits (ASICs), central processing units (CPUs), graphics processing units (GPUs), server processors, or any other suitable processing devices.

The computing device 800 also includes a storage device 804 (e.g., one or more storage devices). In various embodiments, the storage device 804 may include one or more memory devices, such as random-access memory (RAM) devices (e.g., static RAM (SRAM) devices, magnetic RAM (MRAM) devices, dynamic RAM (DRAM) devices, resistive RAM (RRAM) devices, or conductive-bridging RAM (CBRAM) devices), hard drive-based memory devices, solid-state memory devices, networked drives, cloud drives, or any combination of memory devices. In some embodiments, the storage device 804 may include memory that shares a die with the processing device 802. In such an embodiment, the memory may be used as cache memory and include embedded dynamic random-access memory (eDRAM) or spin transfer torque magnetic random-access memory (STT-MRAM), for example. In some embodiments, the storage device 804 may include non-transitory computer readable media having instructions thereon that, when executed by one or more processing devices (e.g., the processing device 802), cause the computing device 800 to perform any appropriate ones of the methods disclosed herein or portions of such methods.

The computing device 800 further includes an interface device 806 (e.g., one or more interface devices 806). In various embodiments, the interface device 806 may include one or more communication chips, connectors, and/or other hardware and software to govern communications between the computing device 800 and other computing devices. For example, the interface device 806 may include circuitry for managing wireless communications for the transfer of data to and from the computing device 800. The term “wireless” and its derivatives may be used to describe circuits, devices, systems, methods, techniques, communications channels, etc., that may communicate data via modulated electromagnetic radiation through a nonsolid medium. The term does not imply that the associated devices do not contain any wires, although in some embodiments they might not. Circuitry included in the interface device 806 for managing wireless communications may implement any of a number of wireless standards or protocols, including but not limited to Institute of Electrical and Electronics Engineers (IEEE) standards including Wi-Fi (IEEE 802.11 family), IEEE 802.16 standards, Long-Term Evolution (LTE) project along with any amendments, updates, and/or revisions (e.g., advanced LTE project, ultramobile broadband (UMB) project (also referred to as “3GPP2”), etc.). In some embodiments, circuitry included in the interface device 806 for managing wireless communications may operate in accordance with a Global System for Mobile Communication (GSM), General Packet Radio Service (GPRS), Universal Mobile Telecommunications System (UMTS), High Speed Packet Access (HSPA), Evolved HSPA (E-HSPA), or LTE network. In some embodiments, circuitry included in the interface device 806 for managing wireless communications may operate in accordance with Enhanced Data for GSM Evolution (EDGE), GSM EDGE Radio Access Network (GERAN), Universal Terrestrial Radio Access Network (UTRAN), or Evolved UTRAN (E-UTRAN). In some embodiments, circuitry included in the interface device 806 for managing wireless communications may operate in accordance with Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Digital Enhanced Cordless Telecommunications (DECT), Evolution-Data Optimized (EV-DO), and derivatives thereof, as well as any other wireless protocols that are designated as 3G, 4G, 5G, and beyond. In some embodiments, the interface device 806 may include one or more antennas (e.g., one or more antenna arrays) configured to receive and/or transmit wireless signals.

In some embodiments, the interface device 806 may include circuitry for managing wired communications, such as electrical, optical, or any other suitable communication protocols. For example, the interface device 806 may include circuitry to support communications in accordance with Ethernet technologies. In some embodiments, the interface device 806 may support both wireless and wired communication, and/or may support multiple wired communication protocols and/or multiple wireless communication protocols. For example, a first set of circuitry of the interface device 806 may be dedicated to shorter-range wireless communications such as Wi-Fi or Bluetooth, and a second set of circuitry of the interface device 806 may be dedicated to longer-range wireless communications such as global positioning system (GPS), EDGE, GPRS, CDMA, WiMAX, LTE, EV-DO, or others. In some other embodiments, a first set of circuitry of the interface device 806 may be dedicated to wireless communications, and a second set of circuitry of the interface device 806 may be dedicated to wired communications.

The computing device 800 also includes battery/power circuitry 808. In various embodiments, the battery/power circuitry 808 may include one or more energy storage devices (e.g., batteries or capacitors) and/or circuitry for coupling components of the computing device 800 to an energy source separate from the computing device 800 (e.g., to AC line power).

The computing device 800 also includes a display device 810 (e.g., one or multiple individual display devices). In various embodiments, the display device 810 may include any visual indicators, such as a heads-up display, a computer monitor, a projector, a touchscreen display, a liquid crystal display (LCD), a light-emitting diode display, or a flat panel display.

The computing device 800 also includes additional input/output (I/O) devices 812. In various embodiments, the I/O devices 812 may include one or more data/signal transfer interfaces, audio I/O devices (e.g., microphones or microphone arrays, speakers, headsets, earbuds, alarms, etc.), audio codecs, video codecs, printers, sensors (e.g., thermocouples or other temperature sensors, humidity sensors, pressure sensors, vibration sensors, etc.), image capture devices (e.g., one or more cameras), human interface devices (e.g., keyboards, cursor control devices, such as a mouse, a stylus, a trackball, or a touchpad), etc.

Depending on the specific embodiment, various components of the interface devices 806 and/or I/O devices 812 can be configured to output suitable control signals, receive suitable control/telemetry signals, and receive and transmit data streams. In some examples, the interface devices 806 and/or I/O devices 812 include one or more analog-to-digital converters (ADCs) for transforming received analog signals into a digital form suitable for operations performed by the processing device 802 and/or the storage device 804. In some additional examples, the interface devices 806 and/or I/O devices 812 include one or more digital-to-analog converters (DACs) for transforming digital signals provided by the processing device 802 and/or the storage device 804 into an analog form suitable for being transmitted through a communication channel.

FIG. 9 is a block diagram illustrating an MS instrument support system 900 in which some or all of the scientific instrument support methods disclosed herein may be performed according to some examples. Various MS instrument support modules and methods disclosed herein (e.g., the system 100, the workflow 400, the process 500, and/or the method 600) may be implemented by one or more of an MS instrument 910, a user local computing device 920, a service computing device 930, and a remote computing device 940 of the MS instrument support system 900.

Any of the MS instrument 910, the user local computing device 920, the service computing device 930, or the remote computing device 940 may include any of the embodiments of the computing device 800 described above in reference to FIG. 8, and any of the MS instrument 910, the user local computing device 920, the service computing device 930, or the remote computing device 940 may take the form of any appropriate ones of the embodiments of the computing device 800.

The scientific instrument 910, the user local computing device 920, the service computing device 930, and/or the remote computing device 940 may each include a respective processing device 802, a respective storage device 804, and a respective interface device 806. The processing device 802 may take any suitable form, including the form of any of the processing devices 802 discussed herein with reference to FIG. 8, and the processing devices 802 included in different ones of the scientific instrument 910, the user local computing device 920, the service computing device 930, or the remote computing device 940 may take the same form or different forms. The storage device 804 may take any suitable form, including the form of any of the storage devices 804 discussed herein with reference to FIG. 8, and the storage devices 804 included in different ones of the scientific instrument 910, the user local computing device 920, the service computing device 930, or the remote computing device 940 may take the same form or different forms. The interface device 806 may take any suitable form, including the form of any of the interface devices 806 discussed herein with reference to FIG. 8, and the interface devices 806 included in different ones of the scientific instrument 910, the user local computing device 920, the service computing device 930, or the remote computing device 940 may take the same form or different forms.

The MS instrument 910, the user local computing device 920, the service computing device 930, and the remote computing device 940 may be in communication with other elements of the MS instrument support system 900 via communication pathways 908. The communication pathways 908 may communicatively couple the interface devices 806 of different ones of the elements of the MS instrument support system 900, as shown, and may be wired or wireless communication pathways (e.g., in accordance with any of the communication techniques discussed herein with reference to the interface devices 806 of the computing device 800 of FIG. 8). The particular MS instrument support system 900 depicted in FIG. 9 includes communication pathways between each pair of the scientific instrument 910, the user local computing device 920, the service computing device 930, and the remote computing device 940, but this “fully connected” implementation is purely illustrative, and in various embodiments, various ones of the communication pathways 908 may be absent. For example, in some embodiments, the service computing device 930 may not have a direct communication pathway 908 between its interface device 806 and the interface device 806 of the MS instrument 910 but may instead communicate with the MS instrument 910 via the communication pathway 908 between the service computing device 930 and the user local computing device 920 and the communication pathway 908 between the user local computing device 920 and the MS instrument 910. The MS instrument 910 may be included into a more-general and/or more-versatile scientific instrument.

FIGS. 10A-10B illustrate improvements achievable with the method 600 over conventional analyte identification methods according to one example. More specifically, FIG. 10A graphically illustrates a query spectrum 1002 corresponding to an example control compound (Rutin, in this case) used to comparatively evaluate the relative performance of several analyte identification methods. FIG. 10B shows a table containing a ranked list of library hits obtained for the query spectrum 1002. The shown table has six columns that are labeled 1010-1020, respectively. The column 1010 displays the rank of the entries. The column 1012 displays the match scores for the three indicated algorithms. The column 1014 displays the compound names from the library. The column 1016 displays the compound structures. The column 1018 displays the hit summaries. The column 1020 displays the hit metadata from the library. The entries are ranked based on their HighChem HighRes scores.

The top five hits are characterized by relatively high similarity scores. However, none of the top twelve hits is the hit on the correct actual compound (Rutin), which only appears at the thirteenth position. Although each of the top three hits has a HighChem HighRes score above 80, a closer look at the hit summaries (the column 1018) reveals that the number of matching peaks between the query spectrum 1002 and the corresponding library spectra is relatively low, at two.

Additionally, the intensity of peaks does not match sufficiently accurately, as indicated by some of the metadata (the column 1020), such as the collision energy levels. In contrast, according to various embodiments, the ML model 130 is trained to learn the pertinent metadata and parameters to properly re-rank the library hits such that the correct hit gets pushed closer to the top of the hit list. In the example shown, a properly trained embodiment of the ML model 130 will change the rank of Rutin from the number 13 indicated in the table of FIG. 10B to a number within at least the top five hits. In other words, example embodiments disclosed herein will enable a significant reduction in the occurrence of false positives in the compound identification process.

According to one example disclosed above, e.g., in the summary section and/or in reference to any one or any combination of some or all of FIGS. 1-10, provided is a method performed via a computing device for providing support to a mass spectrometry (MS) system, the method comprising: obtaining from a mass spectral library (i) an ordered hitlist of reference spectra corresponding to a set of fragmentation spectra of an analyte acquired with the MS system; and (ii) a first set of metadata corresponding to the ordered hitlist of reference spectra; obtaining from the MS system a second set of metadata corresponding to the set of fragmentation spectra; and evaluating an order of entries in the ordered hitlist with a machine learning (ML) model based on the first set of metadata and the second set of metadata.

In some examples of the above method, obtaining the ordered hitlist includes submitting a search query with the set of fragmentation spectra to the mass spectral library.

In some examples of any of the above methods, at least one of the first and second sets of metadata includes a respective set of one or more parameters selected from the group consisting of: a normalized or absolute collision energy; one or more parameters of an ion activation method; a number of candidate compounds; one or more values of a matching score; an identity of a precursor ion; a number of peaks in a spectrum; a sparseness measure; an intensity of a peak and a corresponding accuracy; one or more distances between peaks in a spectrum; one or more spectrum labels; and a mean or average value of a selected numerical characteristic.

In some examples of any of the above methods, the matching score is computed using a dot product of a corresponding pair of spectra.

In some examples of any of the above methods, the ordered hitlist is ordered in a descending order of matching scores of the entries.

In some examples of any of the above methods, the ML model includes a model selected from the group consisting of: a random forest classifier; a gradient boosting model; a k-nearest-neighbors algorithm; and a variational autoencoder.

In some examples of any of the above methods, the evaluating comprises: with an encoder, generating a features vector based on the set of fragmentation spectra of the analyte, the reference spectra from the ordered hitlist, the first set of metadata, and the second set of metadata; and applying the features vector to the ML model.

In some examples of any of the above methods, the ML model is configured to change the order of the entries.

In some examples of any of the above methods, the method further comprises displaying, on a display device, a modified hitlist having the changed order of the entries.

In some examples of any of the above methods, the ML model is configured to determine one or more of: an estimated probability of the analyte belonging to a specified compound class; an estimated probability of the analyte belonging to a specified chemical class; and an estimated probability of the analyte being from a same compound class as a compound corresponding to a selected reference spectrum from the ordered hitlist.

In some examples of any of the above methods, at least one of the estimated probabilities differs from a corresponding probability determined at the mass spectral library.

In some examples of any of the above methods, the ML model is configured to determine an adjustment value to a matching score value provided by the mass spectral library with a respective reference spectrum of the ordered hitlist.

A non-transitory computer-readable medium storing instructions that, when executed by the computing device, cause the computing device to perform operations comprising any one of the above methods.

According to one example disclosed above, e.g., in the summary section and/or in reference to any one or any combination of some or all of FIGS. 1-10, provided is an apparatus for providing support to a mass spectrometry (MS) system, the apparatus comprising: an interface device; a processing device; and a memory device including program code, wherein the memory device and the program code are configured to, with the interface device and the processing device, cause the apparatus at least to: obtain from a mass spectral library (i) an ordered hitlist of reference spectra corresponding to a set of fragmentation spectra of an analyte acquired with the MS system; and (ii) a first set of metadata corresponding to the ordered hitlist of reference spectra; obtain from the MS system a second set of metadata corresponding to the set of fragmentation spectra; and evaluate an order of entries in the ordered hitlist with a machine learning (ML) model based on the first set of metadata and the second set of metadata.

In some examples of the above apparatus, at least one of the first and second sets of metadata includes a respective set of one or more parameters selected from the group consisting of: a normalized or absolute collision energy; one or more parameters of an ion activation method; a number of candidate compounds; one or more values of a matching score; an identity of a precursor ion; a number of peaks in a spectrum; a sparseness measure; an intensity of a peak and a corresponding accuracy; one or more distances between peaks in a spectrum; one or more spectrum labels; and a mean or average value of a selected numerical characteristic.

In some examples of any of the above apparatus, the ML model includes a model selected from the group consisting of: a random forest classifier; a gradient boosting model; a k-nearest-neighbors algorithm; and a variational autoencoder.

In some examples of any of the above apparatus, the memory device and the program code are further configured to, with the interface device and the processing device, cause the apparatus to: with an encoder, generate a features vector based on the set of fragmentation spectra of the analyte, the reference spectra from the ordered hitlist, the first set of metadata, and the second set of metadata; and apply the features vector to the ML model.

In some examples of any of the above apparatus, the ML model is configured to change the order of the entries.

In some examples of any of the above apparatus, the memory device and the program code are further configured to, with the interface device and the processing device, cause the apparatus to display, on a display device, a modified hitlist having the changed order of the entries.

In some examples of any of the above apparatus, the ML model is configured to determine one or more of: an estimated probability of the analyte belonging to a specified compound class; an estimated probability of the analyte belonging to a specified chemical class; and an estimated probability of the analyte being from a same compound class as a compound corresponding to a selected reference spectrum from the ordered hitlist.

It is to be understood that the above description is intended to be illustrative and not restrictive. Many implementations and applications other than the examples provided would be apparent upon reading the above description. The scope should be determined, not with reference to the above description, but should instead be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. It is anticipated and intended that future developments will occur in the technologies discussed herein, and that the disclosed systems and methods will be incorporated into such future examples. In sum, it should be understood that the application is capable of modification and variation.

All terms used in the claims are intended to be given their broadest reasonable constructions and their ordinary meanings as understood by those knowledgeable in the technologies described herein unless an explicit indication to the contrary is made herein. In particular, use of the singular articles such as “a,” “the,” “said,” etc. should be read to recite one or more of the indicated elements unless a claim recites an explicit limitation to the contrary.

The Abstract is provided to allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in various examples for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed subject matter incorporate more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in fewer than all features of a single disclosed example. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separately claimed subject matter.

Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate as if the word “about” or “approximately” preceded the value or range.

Although the elements in the following method claims, if any, are recited in a particular sequence with corresponding labeling, unless the claim recitations otherwise imply a particular sequence for implementing some or all of those elements, those elements are not necessarily intended to be limited to being implemented in that particular sequence.

Unless otherwise specified herein, the use of the ordinal adjectives “first,” “second,” “third,” etc., to refer to an object of a plurality of like objects merely indicates that different instances of such like objects are being referred to, and is not intended to imply that the like objects so referred-to have to be in a corresponding order or sequence, either temporally, spatially, in ranking, or in any other manner.

Unless otherwise specified herein, in addition to its plain meaning, the conjunction “if” may also or alternatively be construed to mean “when” or “upon” or “in response to determining” or “in response to detecting,” which construal may depend on the corresponding specific context. For example, the phrase “if it is determined” or “if [a stated condition] is detected” may be construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event].”

Also, for purposes of this description, the terms “couple,” “coupling,” “coupled,” “connect,” “connecting,” or “connected” refer to any manner known in the art or later developed in which energy is allowed to be transferred between two or more elements, and the interposition of one or more additional elements is contemplated, although not required. Conversely, the terms “directly coupled,” “directly connected,” etc., imply the absence of such additional elements.

The functions of the various elements shown in the figures, including any functional blocks labeled as “processors” and/or “controllers,” may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and nonvolatile storage. Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.

As used in this application, the terms “circuit” and “circuitry” may refer to one or more or all of the following: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry); (b) combinations of hardware circuits and software, such as (as applicable): (i) a combination of analog and/or digital hardware circuit(s) with software/firmware and (ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions; and (c) hardware circuit(s) and/or processor(s), such as a microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g., firmware) for operation, but the software may not be present when it is not needed for operation. This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.

It should be appreciated by those of ordinary skill in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the disclosure. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

Patent Metadata

Filing Date: August 12, 2024

Publication Date: February 12, 2026

Inventors: Michal Raab, Marek Wadinger, Maria Falaq, Marynka Ulaszewska

Cite as: Patentable. “MACHINE LEARNING SYSTEM FOR ANALYTE IDENTIFICATION” (US-20260045324-A1). https://patentable.app/patents/US-20260045324-A1
