Currently available gender bias assessment techniques consider only two genders, viz. male and female, and are not inclusive of lesbian, gay, bisexual, transgender and queer (LGBTQ) genders. The present disclosure provides a method and a system for assessing gender fairness of large language models (LLMs). The system first receives a context aware prompt comprising a context and a prompt, and then masks names present in the context using a name anonymization technique to obtain a no-name-context. The system then converts many-to-many pronouns present in the no-name-context into many-to-one pronouns, which are then used to create a counterfactual context for each gender pronoun group. Thereafter, the system queries the LLM with a counterfactual prompt created using the counterfactual context to obtain a counterfactual response, which is then used to create an embedding data frame. Further, the system trains a Gaussian Mixture Model using the embedding data frame, which is then utilized to calculate cluster distances. Finally, the system assesses gender bias based on the cluster distances and a predefined distance threshold.
Legal claims defining the scope of protection, as filed with the USPTO.
. A processor implemented method, comprising:
. The processor implemented method of, wherein displaying the gender bias result comprises:
. The processor implemented method of, wherein displaying the gender bias result further comprises:
. The processor implemented method of, wherein conversion of the plurality of many-to-many pronouns present in the no-name-context into the plurality of secondary many-to-one pronouns comprises:
. The processor implemented method of, wherein the step of creating the counterfactual context for each gender pronoun group of the plurality of predefined gender pronoun groups based on the pronoun classification context comprises:
. A system, comprising:
. The system of, wherein for displaying the gender bias result, the one or more hardware processors () are configured by the instructions to:
. The system of, wherein for displaying the gender bias result, the one or more hardware processors () are further configured by the instructions to:
. The system of, wherein for converting of the plurality of many-to-many pronouns present in the no-name-context into the plurality of secondary many-to-one pronouns, the one or more hardware processors () are further configured by the instructions to:
. The system of, wherein for creating the counterfactual context for each gender pronoun group of the plurality of predefined gender pronoun groups based on the pronoun classification context, the one or more hardware processors are further configured by the instructions to:
. One or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors cause:
. The one or more non-transitory machine readable information storage mediums of, wherein displaying the gender bias result comprises:
. The one or more non-transitory machine readable information storage mediums of, wherein displaying the gender bias result further comprises:
. The one or more non-transitory machine readable information storage mediums of, wherein conversion of the plurality of many-to-many pronouns present in the no-name-context into the plurality of secondary many-to-one pronouns comprises:
. The one or more non-transitory machine readable information storage mediums of, wherein the step of creating the counterfactual context for each gender pronoun group of the plurality of predefined gender pronoun groups based on the pronoun classification context comprises:
Complete technical specification and implementation details from the patent document.
This U.S. patent application claims priority under 35 U.S.C. § 119 to: India application No. 202421039028, filed on May 17, 2024. The entire contents of the aforementioned application are incorporated herein by reference.
The disclosure herein generally relates to large language models, and, more particularly, to a method and a system for assessing gender fairness of large language models.
In today's technical era, large language models (LLMs) are being used in varied applications. From content creation to virtual assistants/customer support chatbots, LLMs have found their place in almost all the areas including enterprise applications.
In most enterprise applications, context aware prompts are used by the enterprise/business users to leverage LLMs. A context aware prompt generally comprises a context that has enterprise information and a prompt that is given by the user to the LLM based on the given context. The context is usually enterprise data based on which the business user needs a query to be answered. The LLM, based on the context aware prompt, provides a response which may be relevant and valuable to the business user.
In today's world, where diversity, equity and inclusiveness are paramount, the fairness of the responses generated by the LLM is equally important for enterprises. Currently available LLMs like GPT 3.5, GPT 4, Claude 2 etc., are trained on data that is available on the Internet (e.g., Common Crawl data, approximately petabytes of data) among other datasets. As this data is from the Internet, it may contain gender bias, because such bias is prevalent on the Internet. However, when such LLMs are used as a part of an enterprise application, it is essential that the responses of the LLM do not contain such gender bias.
Currently, there are many techniques available for assessing gender bias of LLMs. However, the available techniques consider only two genders, viz. male and female, and are not inclusive of lesbian, gay, bisexual, transgender and queer (LGBTQ) genders. Further, the available techniques are inefficient when the LLM used is a black box model, because the internal layers and structure of such models are inaccessible.
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one aspect, there is provided a processor implemented method for assessing gender fairness of large language models. The method comprises receiving, by a system via one or more hardware processors, a context aware prompt, the context aware prompt comprising a context and a prompt; masking, by the system via the one or more hardware processors, one or more names present in the context using a name anonymization technique to obtain a no-name-context, wherein the no-name-context comprises a plurality of primary many-to-one pronouns and a plurality of many-to-many pronouns; converting, by the system via the one or more hardware processors, the plurality of many-to-many pronouns present in the no-name-context into a plurality of secondary many-to-one pronouns, wherein an intermediate counterfactual context is obtained after the conversion, wherein the intermediate counterfactual context comprises a plurality of many-to-one pronouns, and wherein the plurality of many-to-one pronouns is a combination of the plurality of primary many-to-one pronouns and the plurality of secondary many-to-one pronouns; identifying, by the system via the one or more hardware processors, each many to one pronoun of the plurality of many-to-one pronouns and one or more honorific titles that are present in the intermediate counterfactual context using a first custom pattern finder, wherein the first custom pattern finder creates a first custom pattern of many to one pronouns and the one or more honorific titles in which each many to one pronoun and each honorific title is separated by an operator; predicting, by the system via the one or more hardware processors, a pronoun classification for each many-to-one pronoun of the plurality of many-to-one pronouns and each honorific title of 
the one or more honorific titles present in the first custom pattern created for the intermediate counterfactual context using a pre-trained decision tree classifier, wherein each many-to-one pronoun and each honorific title is replaced with an associated predicted pronoun classification in the intermediate counterfactual context to obtain a pronoun classification context; creating, by the system via the one or more hardware processors, a counterfactual context for each gender pronoun group of a plurality of predefined gender pronoun groups based on the pronoun classification context; inserting, by the system via the one or more hardware processors, the counterfactual context generated for each gender pronoun group into the prompt to obtain a counterfactual prompt for each gender pronoun group; querying, by the system via the one or more hardware processors, a large language model (LLM) with the counterfactual prompt obtained for each gender pronoun group to obtain a counterfactual response corresponding to each gender pronoun group, wherein a plurality of counterfactual responses are obtained corresponding to a plurality of predefined gender pronoun groups; generating, by the system via the one or more hardware processors, one or more counterfactual response embeddings for each counterfactual response of the plurality of counterfactual responses using one or more sentence embedding models, wherein a plurality of the counterfactual response embeddings is generated by each sentence embedding model of the one or more sentence embedding models; for each sentence embedding model of the one or more sentence embedding models, creating, by the system via the one or more hardware processors, a sentence embedding model based cluster distance list by performing: creating, by the system via the one or more hardware processors, an embedding data frame, wherein the embedding data frame of a sentence embedding model comprises the plurality of counterfactual response embeddings 
generated for the respective sentence embedding model; training, by the system via the one or more hardware processors, a Gaussian Mixture Model for a single cluster using the created embedding data frame of the respective sentence embedding model; determining, by the system via the one or more hardware processors, a cluster center of the single cluster; calculating, by the system via the one or more hardware processors, a cluster distance of each counterfactual response embedding of the plurality of counterfactual response embeddings from the cluster center using a Euclidean distance; and adding, by the system via the one or more hardware processors, the cluster distance calculated for each of the counterfactual response embeddings to the sentence embedding model based cluster distance list predefined for the corresponding sentence embedding model, and wherein the sentence embedding model based cluster distance list obtained after addition of cluster distances comprises a plurality of calculated cluster distances in form of a plurality of elements; determining, by the system via the one or more hardware processors, whether any element among the plurality of elements present in either sentence embedding model based cluster distance lists is more than a predefined distance threshold; and displaying, by the system via the one or more hardware processors, a gender bias result based on the determination.
In an embodiment, displaying the gender bias result comprises: displaying the gender bias result as ‘LLM is unfair’ upon determining that at least one element in the plurality of elements present in either sentence embedding model based cluster distance lists is more than the predefined distance threshold.
In an embodiment, displaying the gender bias result further comprises: displaying the gender bias result as ‘LLM is fair’ upon determining that no element in the plurality of elements present in either sentence embedding model based cluster distance lists is more than the predefined distance threshold.
In an embodiment, the conversion of the plurality of many-to-many pronouns present in the no-name-context into the plurality of secondary many-to-one pronouns comprises: identifying, by the system via the one or more hardware processors, one or more sentences containing the plurality of many-to-many pronouns in the no-name-context using a second custom pattern finder, wherein the second custom pattern finder creates a second custom pattern that identifies the one or more sentences containing the plurality of many-to-many pronouns; and converting, by the system via the one or more hardware processors, each identified many-to-many pronoun to a corresponding pronoun in an unambiguous pronoun group using a custom prompt to the LLM, wherein conversion of each identified many-to-many pronoun to the corresponding pronoun in the unambiguous pronoun group generates the plurality of secondary many-to-one pronouns.
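The sentence-identification part of this embodiment can be illustrated with a minimal sketch. It assumes that many-to-many pronouns are forms such as "her" or "his" whose grammatical role is ambiguous out of context; the pronoun list `AMBIGUOUS`, the naive sentence splitter, and the function name are hypothetical and are not taken from the disclosure. The subsequent disambiguation through a custom prompt to the LLM is not shown.

```python
import re

# Hypothetical list of ambiguous ("many-to-many") pronouns; the disclosure
# does not enumerate them in this passage.
AMBIGUOUS = ["her", "his"]

# Alternation pattern in which each pronoun is separated by the "|" operator.
AMBIGUOUS_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in AMBIGUOUS) + r")\b",
    re.IGNORECASE,
)

def find_ambiguous_sentences(no_name_context: str) -> list[str]:
    """Return the sentences that contain at least one ambiguous pronoun."""
    # Naive splitter (an assumption): break after ., ! or ? plus whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", no_name_context.strip())
    return [s for s in sentences if AMBIGUOUS_PATTERN.search(s)]

context = "PERSON joined in 2020. The manager praised her. He leads a team."
flagged = find_ambiguous_sentences(context)
print(flagged)
```

Only the flagged sentences would then need to be sent to the LLM for conversion, which keeps the number of disambiguation queries small.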
In an embodiment, the step of creating the counterfactual context for each gender pronoun group of the plurality of predefined gender pronoun groups based on the pronoun classification context comprises: identifying, by the system via the one or more hardware processors, a plurality of pronoun classifications present in the pronoun classification context using a third custom pattern finder; mapping, by the system via the one or more hardware processors, each pronoun classification with a corresponding counterfactual gender pronoun, wherein a plurality of mappings are obtained corresponding to the plurality of pronoun classifications; appending, by the system via the one or more hardware processors, the plurality of mappings to one or more honorific titles; creating, by the system via the one or more hardware processors, a third custom pattern comprising a pronoun classification of each of the plurality of mappings, wherein each mapping is separated by the operator; creating, by the system via the one or more hardware processors, a lambda function that fetches a corresponding gender pronoun for each of the pronoun classifications, wherein the lambda function is utilized in a custom pattern substitution function; and creating, by the system via the one or more hardware processors, the counterfactual context for each gender by using the custom pattern substitution function based on the mapping of the pronoun classifications and honorific titles with corresponding counterfactual gender pronouns in the pronoun classification context.
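The pattern substitution described in this embodiment can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the classification labels (`SUBJECT`, `OBJECT`, `POSSESSIVE`, `TITLE`), the bracketed label syntax, and the three pronoun groups are hypothetical. The sketch uses Python's `re` module, where the lambda passed to `sub` fetches the replacement for each matched label.

```python
import re

# Hypothetical gender pronoun groups and classification labels; the actual
# groups and labels used by the disclosure are not listed in this passage.
PRONOUN_GROUPS = {
    "male": {"SUBJECT": "he", "OBJECT": "him", "POSSESSIVE": "his", "TITLE": "Mr."},
    "female": {"SUBJECT": "she", "OBJECT": "her", "POSSESSIVE": "her", "TITLE": "Ms."},
    "neutral": {"SUBJECT": "they", "OBJECT": "them", "POSSESSIVE": "their", "TITLE": "Mx."},
}

# Pattern over the classification labels, each separated by the "|" operator.
LABELS = ["SUBJECT", "OBJECT", "POSSESSIVE", "TITLE"]
LABEL_PATTERN = re.compile(r"\[(" + "|".join(LABELS) + r")\]")

def make_counterfactuals(classification_context: str) -> dict[str, str]:
    """Create one counterfactual context per gender pronoun group."""
    contexts = {}
    for group, mapping in PRONOUN_GROUPS.items():
        # Lambda that fetches the group's pronoun for the matched label.
        fetch = lambda match, mapping=mapping: mapping[match.group(1)]
        contexts[group] = LABEL_PATTERN.sub(fetch, classification_context)
    return contexts

ctx = "[TITLE] PERSON said [SUBJECT] will bring [POSSESSIVE] laptop."
cf = make_counterfactuals(ctx)
for group, text in cf.items():
    print(group, "->", text)
```

The sketch does not adjust verb agreement (for example after singular "they"); a fuller treatment would need to handle that separately.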
In another aspect, there is provided a system for assessing gender fairness of large language models. The system comprises a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to: receive the context aware prompt comprising a context and a prompt; mask one or more names present in the context using a name anonymization technique to obtain a no-name-context, wherein the no-name-context comprises a plurality of primary many-to-one pronouns and a plurality of many-to-many pronouns; convert the plurality of many-to-many pronouns present in the no-name-context into a plurality of secondary many-to-one pronouns, wherein an intermediate counterfactual context is obtained after the conversion, wherein the intermediate counterfactual context comprises a plurality of many-to-one pronouns, and wherein the plurality of many-to-one pronouns is a combination of the plurality of primary many-to-one pronouns and the plurality of secondary many-to-one pronouns; identify each many to one pronoun of the plurality of many-to-one pronouns and one or more honorific titles that are present in the intermediate counterfactual context using a first custom pattern finder, wherein the first custom pattern finder creates a first custom pattern of many to one pronouns and the one or more honorific titles in which each many to one pronoun and each honorific title is separated by an operator; predict a pronoun classification for each identified many-to-one pronoun of the plurality of many-to-one pronouns and each honorific title of the one or more honorific titles present in the first custom pattern created for the intermediate counterfactual context using a pre-trained decision tree classifier, wherein each many-to-one pronoun and each honorific title is replaced with an associated predicted pronoun classification in the intermediate counterfactual context to obtain a pronoun classification context; create a 
counterfactual context for each gender pronoun group of a plurality of predefined gender pronoun groups based on the pronoun classification context; insert the counterfactual context generated for each gender pronoun group into the prompt to obtain a counterfactual prompt for each gender pronoun group; query a large language model (LLM) with the counterfactual prompt obtained for each gender pronoun group to obtain a counterfactual response corresponding to each gender pronoun group, wherein a plurality of counterfactual responses are obtained corresponding to a plurality of predefined gender pronoun groups; generate one or more counterfactual response embeddings for each counterfactual response of the plurality of counterfactual responses using one or more sentence embedding models, wherein a plurality of the counterfactual response embeddings is generated by each sentence embedding model of the one or more sentence embedding models; for each sentence embedding model of the one or more sentence embedding models, create a sentence embedding model based cluster distance list by performing: create an embedding data frame, wherein the embedding data frame of a sentence embedding model comprises the plurality of counterfactual response embeddings generated for the respective sentence embedding model; train Gaussian Mixture Model for a single cluster using the created embedding data frame of the respective sentence embedding model; determine a cluster center of the single cluster; calculate a cluster distance of each counterfactual response embedding of the plurality of counterfactual response embeddings from the cluster center using a Euclidean distance; add the cluster distance calculated for each of the counterfactual response embeddings to the sentence embedding model based cluster distance list predefined for the corresponding sentence embedding model, wherein the sentence embedding model based cluster distance list obtained after addition of cluster distances 
comprises a plurality of calculated cluster distances in form of a plurality of elements; determine whether any element among the plurality of elements present in either sentence embedding model based cluster distance lists is more than a predefined distance threshold; and display a gender bias result based on the determination.
In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors assess gender fairness of large language models by receiving, by a system, a context aware prompt, the context aware prompt comprising a context and a prompt; masking, by the system, one or more names present in the context using a name anonymization technique to obtain a no-name-context, wherein the no-name-context comprises a plurality of primary many-to-one pronouns and a plurality of many-to-many pronouns; converting, by the system, the plurality of many-to-many pronouns present in the no-name-context into a plurality of secondary many-to-one pronouns, wherein an intermediate counterfactual context is obtained after the conversion, wherein the intermediate counterfactual context comprises a plurality of many-to-one pronouns, and wherein the plurality of many-to-one pronouns is a combination of the plurality of primary many-to-one pronouns and the plurality of secondary many-to-one pronouns; identifying, by the system, each many to one pronoun of the plurality of many-to-one pronouns and one or more honorific titles that are present in the intermediate counterfactual context using a first custom pattern finder, wherein the first custom pattern finder creates a first custom pattern of many to one pronouns and the one or more honorific titles in which each many to one pronoun and each honorific title is separated by an operator; predicting, by the system, a pronoun classification for each many-to-one pronoun of the plurality of many-to-one pronouns and each honorific title of the one or more honorific titles present in the first custom pattern created for the intermediate counterfactual context using a pre-trained decision tree classifier, wherein each many-to-one pronoun and each honorific title is replaced with an associated predicted pronoun classification in 
the intermediate counterfactual context to obtain a pronoun classification context; creating, by the system, a counterfactual context for each gender pronoun group of a plurality of predefined gender pronoun groups based on the pronoun classification context; inserting, by the system, the counterfactual context generated for each gender pronoun group into the prompt to obtain a counterfactual prompt for each gender pronoun group; querying, by the system, a large language model (LLM) with the counterfactual prompt obtained for each gender pronoun group to obtain a counterfactual response corresponding to each gender pronoun group, wherein a plurality of counterfactual responses are obtained corresponding to a plurality of predefined gender pronoun groups; generating, by the system, one or more counterfactual response embeddings for each counterfactual response of the plurality of counterfactual responses using one or more sentence embedding models, wherein a plurality of the counterfactual response embeddings is generated by each sentence embedding model of the one or more sentence embedding models; for each sentence embedding model of the one or more sentence embedding models, creating a sentence embedding model based cluster distance list by performing: creating, by the system, an embedding data frame, wherein the embedding data frame of a sentence embedding model comprises the plurality of counterfactual response embeddings generated for the respective sentence embedding model; training, by the system, a Gaussian Mixture Model for a single cluster using the created embedding data frame of the respective sentence embedding model; determining, by the system, a cluster center of the single cluster; calculating, by the system, a cluster distance of each counterfactual response embedding of the plurality of counterfactual response embeddings from the cluster center using a Euclidean distance; and adding, by the system, the cluster distance calculated for each of the 
counterfactual response embeddings to a sentence embedding model based cluster distance list predefined for the corresponding sentence embedding model, wherein the sentence embedding model based cluster distance list obtained after addition of cluster distances comprises a plurality of calculated cluster distances in form of a plurality of elements; determining, by the system, whether any element among the plurality of elements present in either sentence embedding model based cluster distance lists is more than a predefined distance threshold; and displaying, by the system, a gender bias result based on the determination.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
Many large language models (LLMs), such as OpenAI generative pretrained transformer (GPT) 3.5, GPT 4, Claude 2 etc., are powerful models and are often referred to as black box models. In particular, LLMs that are trained on large datasets using proprietary techniques which are not open source, and whose internal structure is inaccessible to the user, are often referred to as black box models. A user can only interact with these kinds of LLMs through prompts by calling a large language model (LLM) application programming interface (API).
As discussed earlier, in case of enterprise use cases, business users leverage these powerful LLMs by utilizing the context aware prompts, where the LLM can respond based on the enterprise information given in the context. This further helps enterprises leverage these powerful LLMs by using their APIs.
However, it is all the more important to ensure the gender based fairness of the LLM when the LLM used is a black box model, because such models do not provide access to their internal layers and structure.
So, a technique that can efficiently analyze gender fairness of LLMs whose internal layers are inaccessible, while ensuring accurate bias assessment, is still to be explored.
Embodiments of the present disclosure overcome the above-mentioned disadvantages by providing a system and a method for assessing gender fairness of large language models. The system of the present disclosure first receives a context aware prompt comprising a context and a prompt. The system then masks names present in the context using a name anonymization technique to obtain a no-name-context. Thereafter, the system converts a plurality of many-to-many pronouns present in the no-name-context into a plurality of secondary many-to-one pronouns and thus obtains an intermediate counterfactual context. Further, the system predicts a pronoun classification for each many-to-one pronoun and one or more honorific titles using a decision tree classifier which is trained on a dataset of many-to-one pronouns and honorific titles created using the LGBTQ+ community pronouns. The predicted pronoun classification is then used to replace each many-to-one pronoun and each honorific title in the intermediate counterfactual context to obtain a pronoun classification context.
Then, based on the pronoun classification context, the system creates a counterfactual context for each gender pronoun group of a plurality of predefined gender pronoun groups, which is then inserted into the prompt to obtain a counterfactual prompt for each gender pronoun group. The system then queries the LLM with the counterfactual prompt obtained for each gender pronoun group to obtain a counterfactual response corresponding to each gender pronoun group.
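The identification and classification of pronouns and honorific titles described above can be sketched as follows. This is a simplified illustration: the lookup table `CLASS_OF` stands in for the pre-trained decision tree classifier, and the token list, class labels, and bracketed label syntax are hypothetical rather than taken from the disclosure.

```python
import re

# Hypothetical lookup standing in for the pre-trained decision tree
# classifier; keys are many-to-one pronouns and honorific titles.
CLASS_OF = {
    "he": "SUBJECT", "she": "SUBJECT", "they": "SUBJECT", "xe": "SUBJECT",
    "him": "OBJECT", "them": "OBJECT", "xem": "OBJECT",
    "their": "POSSESSIVE", "xyr": "POSSESSIVE",
    "mr.": "TITLE", "ms.": "TITLE", "mx.": "TITLE",
}

# Custom pattern: every pronoun and title separated by the "|" operator.
# Lookarounds are used instead of \b because titles end in a period.
PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(t) for t in CLASS_OF) + r")(?!\w)",
    re.IGNORECASE,
)

def to_classification_context(intermediate_context: str) -> str:
    """Replace each pronoun or title with its predicted class label."""
    return PATTERN.sub(
        lambda m: "[" + CLASS_OF[m.group(0).lower()] + "]",
        intermediate_context,
    )

text = "Mr. PERSON said he will share their report with them."
result = to_classification_context(text)
print(result)
```

A trained classifier would replace the dictionary lookup inside the lambda; the surrounding pattern-and-substitute structure would stay the same.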
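As a rough sketch of this step, each counterfactual context can be inserted into a prompt template and sent to the model through a caller-supplied function. The `{context}` placeholder, the template wording, and the `fake_llm` stub are hypothetical; a real deployment would call the LLM provider's API in place of the stub.

```python
from typing import Callable

def build_counterfactual_prompts(template: str, contexts: dict[str, str]) -> dict[str, str]:
    """Insert each group's counterfactual context into the prompt template."""
    return {group: template.replace("{context}", ctx) for group, ctx in contexts.items()}

def query_all(llm: Callable[[str], str], prompts: dict[str, str]) -> dict[str, str]:
    """Query the black box LLM once per gender pronoun group."""
    return {group: llm(prompt) for group, prompt in prompts.items()}

def fake_llm(prompt: str) -> str:
    # Stub standing in for a real LLM API call (an assumption of this sketch).
    return "Received a prompt of " + str(len(prompt)) + " characters."

template = "Context: {context}\nQuestion: Summarize the employee's performance."
contexts = {
    "male": "Mr. PERSON met his targets.",
    "female": "Ms. PERSON met her targets.",
    "neutral": "Mx. PERSON met their targets.",
}
prompts = build_counterfactual_prompts(template, contexts)
responses = query_all(fake_llm, prompts)
```

Because only the pronouns and titles differ between prompts, any systematic difference between the responses can be attributed to the gender pronoun group.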
Thereafter, the system uses sentence embedding models to generate counterfactual response embeddings for each counterfactual response. Then, the system creates a single embedding data frame for all the counterfactual response embeddings, for all the sentence embedding models. The created embedding data frame is then used to train a Gaussian Mixture Model (GMM) for a single cluster of generated counterfactual response embeddings.
Further, the system determines a cluster center of the single cluster created for each sentence embedding model. The determined cluster center is then used by the system to calculate a cluster distance of each counterfactual response embedding from their respective cluster center using a Euclidean distance. The calculated cluster distance for each counterfactual response embedding is then added to a sentence embedding model based cluster distance list created for the respective sentence embedding model.
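The distance computation can be illustrated with a small, dependency-free sketch. It relies on the fact that for a Gaussian mixture with a single component, the maximum-likelihood estimate of the component mean is the sample mean, so the cluster center is computed directly instead of through a GMM library; a library implementation such as scikit-learn's `GaussianMixture(n_components=1)` would be expected to give the same center up to numerical tolerance. The toy two-dimensional embeddings are invented for illustration.

```python
import math

def cluster_distances(embeddings: list[list[float]]) -> list[float]:
    """Euclidean distance of each embedding from the single cluster center."""
    n = len(embeddings)
    # Center of a one-component Gaussian mixture: the per-dimension mean.
    center = [sum(column) / n for column in zip(*embeddings)]
    return [math.dist(vector, center) for vector in embeddings]

# Toy embedding data frame for one sentence embedding model (invented values);
# each row is the embedding of one counterfactual response.
frame = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
distances = cluster_distances(frame)
print(distances)
```

In this toy example the third response lies farthest from the center, which is the kind of outlier the subsequent threshold check is meant to detect.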
Finally, the system determines if any element in any sentence embedding model based cluster distance list is more than a predefined distance threshold. Upon determining that at least one element in any sentence embedding model based cluster distance list is more than the predefined distance threshold, the system displays that the LLM is unfair and it exhibits gender bias for this case.
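The final decision reduces to a simple check over all distance lists, sketched below. The model names, distance values, and threshold are hypothetical; the result strings follow the wording used elsewhere in the disclosure.

```python
def assess_fairness(distance_lists: dict[str, list[float]], threshold: float) -> str:
    """Flag the LLM as unfair if any distance in any list exceeds the threshold."""
    exceeded = any(
        distance > threshold
        for distances in distance_lists.values()
        for distance in distances
    )
    return "LLM is unfair" if exceeded else "LLM is fair"

# Hypothetical cluster distance lists for two sentence embedding models.
lists = {
    "embedding_model_a": [0.12, 0.15, 0.11],
    "embedding_model_b": [0.20, 0.64, 0.18],
}
verdict = assess_fairness(lists, threshold=0.5)
print(verdict)
```

Using `any` across all lists reflects the ensemble behavior described above: a single outlying response under any one embedding model is enough to report bias.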
In the present disclosure, the system and the method use the initial context aware prompt to derive counterfactual prompts which are then used to assess gender bias of the LLM, thereby eliminating the need to have access to internal layers and structure, which are generally not accessible in the case of black box models. Further, the system uses an ensemble of more than one sentence embedding model for assessing gender bias of the LLM, thereby improving accuracy of the bias prediction. The system uses a decision tree classifier which is trained on a dataset of many-to-one pronouns and honorific titles created using the LGBTQ+ community pronouns to predict a pronoun classification for each many-to-one pronoun, thus ensuring inclusiveness of the LGBTQ+ community along with male and female genders.
Referring now to the drawings, and more particularly to, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
illustrates an exemplary representation of an environment related to at least some example embodiments of the present disclosure. Although the environment is presented in one arrangement, other embodiments may include the parts of the environment (or other parts) arranged otherwise depending on, for example, converting many-to-many pronouns present in the no-name-context, creating counterfactual context for each gender pronoun group, generating counterfactual response embeddings etc. The environment generally includes a system and a user device, each coupled to, and in communication with (and/or with access to) a network. It should be noted that although one user device is shown for explanation purposes, there can be multiple user devices.
The network may include, without limitation, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the Internet, a fiber optic network, a coaxial cable network, an infrared (IR) network, a radio frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among two or more of the parts or users illustrated in, or any combination thereof.
Various entities in the environment may connect to the network in accordance with various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, or any combination thereof.
The user device is associated with a user/enterprise user who wants to assess gender fairness of an LLM. Examples of the user device include, but are not limited to, a personal computer (PC), a mobile phone, a tablet device, a Personal Digital Assistant (PDA), a server, a voice activated assistant, a smartphone, and a laptop.
The system includes one or more hardware processors and a memory. The system is first configured to receive a context aware prompt via the network from the user device. The system then uses a counterfactual based gender fairness algorithm that utilizes the received context aware prompt to check/assess the gender bias of the LLM. The counterfactual based gender fairness algorithm is explained in detail with reference to. Thereafter, the system displays a gender bias assessment result of the LLM to the user on the user device.
The number and arrangement of clouds, devices, and/or networks shown in are provided as an example. There may be additional clouds, devices, and/or networks; fewer clouds, devices, and/or networks; different clouds, devices, and/or networks; and/or differently arranged clouds, devices, and/or networks than those shown in. Furthermore, two or more devices shown in may be implemented within a single device, or a single device shown in may be implemented as multiple, distributed systems or devices. Additionally, or alternatively, a set of devices (e.g., one or more devices) of the environment may perform one or more functions described as being performed by another set of systems or another set of devices of the environment (e.g., refer to the scenarios described above).
An exemplary block diagram of a system for assessing gender fairness of large language models is illustrated, in accordance with an embodiment of the present disclosure. In some embodiments, the system is embodied as a cloud-based and/or software as a service (SaaS) based architecture. In some embodiments, the system may be implemented in a server system. In some embodiments, the system may be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
In an embodiment, the system includes one or more processors, communication interface device(s) or input/output (I/O) interface(s), and one or more data storage devices or memory operatively coupled to the one or more processors. The one or more processors may be one or more software processing modules and/or hardware processors. In an embodiment, the hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) are configured to fetch and execute computer-readable instructions stored in the memory. In an embodiment, the system can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
The I/O interface device(s) can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like, and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface device(s) can include one or more ports for connecting a number of devices to one another or to another server.
The memory may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, a database can be stored in the memory, wherein the database may comprise, but is not limited to, the sentence embedding model based cluster distance list, distance threshold, gender pronoun groups, a first custom finder, a second custom finder, one or more processes and the like. The memory further comprises (or may further comprise) information pertaining to input(s)/output(s) of each step performed by the systems and methods of the present disclosure. In other words, input(s) fed at each step and output(s) generated at each step are stored in the memory and can be utilized in further processing and analysis.
It is noted that the system as illustrated and hereinafter described is merely illustrative of an apparatus that could benefit from embodiments of the present disclosure and, therefore, should not be taken to limit the scope of the present disclosure. It is noted that the system may include fewer or more components than those depicted.
The flow diagram figures, collectively, represent an exemplary flow diagram of a method for assessing gender fairness of large language models, in accordance with an embodiment of the present disclosure. The method may use the system for execution. In an embodiment, the system comprises one or more data storage devices or the memory operatively coupled to the one or more hardware processors and is configured to store instructions for execution of steps of the method by the one or more hardware processors. The steps of the flow diagram need not be executed in the order in which they are presented. Further, one or more steps may be grouped together and performed as a single step, or one step may have several sub-steps that may be performed in parallel or sequentially. The steps of the method of the present disclosure will now be explained with reference to the components of the system as depicted in the figures.
At a step of the present disclosure, the one or more hardware processors of the system receive a context aware prompt. The context aware prompt includes a context and a prompt. In an embodiment, the context can be enterprise information and a user may try to prompt the LLM based on the context. In particular, the inputs ‘context’ and ‘prompt’ together make up the ‘context aware prompt’.
At a step of the present disclosure, the one or more hardware processors of the system mask one or more names present in the context using a name anonymization technique to obtain a no-name-context. The name anonymization technique first identifies names present in the context using a name analyzer, such as Presidio Analyzer, and then masks the identified names using a name anonymizer.
In particular, the no-name-context is obtained by removing all the names from the given context. For example, if the context contains a name such as “Mr. Tim Goldman, Loan Officer”, the system may use the name anonymization technique to remove the name ‘Tim Goldman’, and the no-name-context may look like ‘Mr., Loan Officer’. The names are masked so that the LLM has no clue about the actual gender of a subject, since a name can hint at the subject's gender. In some embodiments, the system may use a placeholder such as <PERSON> to mask the name of the subject.
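The masking step can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `mask_names` is hypothetical, and the `names` argument stands in for the output of a name analyzer such as Presidio Analyzer, which is not called here.

```python
import re
from typing import List


def mask_names(context: str, names: List[str], placeholder: str = "<PERSON>") -> str:
    """Replace every whole-word occurrence of each detected name with a placeholder.

    `names` is assumed to come from a separate name analyzer; this sketch
    only performs the masking.
    """
    # Mask longer names first so "Tim Goldman" is handled before a bare "Tim".
    for name in sorted(names, key=len, reverse=True):
        context = re.sub(r"\b" + re.escape(name) + r"\b", placeholder, context)
    return context


masked = mask_names("Mr. Tim Goldman, Loan Officer", ["Tim Goldman"])
print(masked)
```

Passing an empty string as `placeholder` would instead drop the name entirely, leaving the space that preceded it in place.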
The no-name-context includes a plurality of primary many-to-one pronouns and a plurality of many-to-many pronouns. The many-to-one pronouns are pronouns where one pronoun is mapped to one pronoun classification. Examples of many-to-one pronouns include, but are not limited to, Him, Them, Ver, Their and the like.
The many-to-many pronouns are pronouns where one pronoun is mapped to one or more different pronoun classifications. Examples of many-to-many pronouns include, but are not limited to, Aer (pronoun classification: object and possessive), His (pronoun classification: possessive and possessive pronoun), Per (pronoun classification: subject and object), Pers (pronoun classification: possessive and possessive pronoun), Her (pronoun classification: object and possessive), Vis (pronoun classification: possessive and possessive pronoun), and Hir (pronoun classification: object and possessive).
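One way to picture the distinction is as a lookup from each pronoun to the set of classifications it can take. The sketch below is illustrative only: the table name `PRONOUN_CLASSES` and the helper `is_many_to_many` are hypothetical, the many-to-many entries are taken from the examples above, and the classifications given for the many-to-one pronouns are assumptions.

```python
# Pronoun -> set of pronoun classifications it can map to.
PRONOUN_CLASSES = {
    # Many-to-many examples, as listed in the text.
    "aer": {"object", "possessive"},
    "his": {"possessive", "possessive pronoun"},
    "per": {"subject", "object"},
    "pers": {"possessive", "possessive pronoun"},
    "her": {"object", "possessive"},
    "vis": {"possessive", "possessive pronoun"},
    "hir": {"object", "possessive"},
    # Many-to-one examples; the classifications here are assumptions.
    "him": {"object"},
    "them": {"object"},
    "ver": {"object"},
    "their": {"possessive"},
}


def is_many_to_many(pronoun: str) -> bool:
    """A pronoun is many-to-many when it maps to more than one classification."""
    return len(PRONOUN_CLASSES[pronoun.lower()]) > 1


many_to_many = sorted(p for p in PRONOUN_CLASSES if is_many_to_many(p))
many_to_one = sorted(p for p in PRONOUN_CLASSES if not is_many_to_many(p))
print(many_to_many)
print(many_to_one)
```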
At a step of the present disclosure, the one or more hardware processors of the system convert the plurality of many-to-many pronouns present in the no-name-context into a plurality of secondary many-to-one pronouns. In an embodiment, the system first identifies one or more sentences containing the plurality of many-to-many pronouns in the no-name-context using the second custom pattern finder. In particular, the second custom pattern finder creates a second custom pattern that identifies the one or more sentences containing the plurality of many-to-many pronouns.
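A sentence finder of this kind could be sketched with a regular expression. The disclosure does not specify how the second custom pattern is built, so the regex, the naive sentence splitter, and the names `PRONOUN_RE` and `find_ambiguous_sentences` below are all assumptions made for illustration.

```python
import re
from typing import List

# Many-to-many pronouns named in the description.
MANY_TO_MANY = ["his", "vis", "hir", "aer", "her", "per", "pers"]

# Whole-word, case-insensitive alternation over the ambiguous pronouns.
PRONOUN_RE = re.compile(
    r"\b(?:" + "|".join(sorted(MANY_TO_MANY, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def find_ambiguous_sentences(context: str) -> List[str]:
    """Return the sentences that contain at least one many-to-many pronoun."""
    # Naive split on '.', '!' or '?' followed by whitespace; abbreviations
    # such as "Mr." would need a proper sentence tokenizer.
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    return [s for s in sentences if PRONOUN_RE.search(s)]


text = (
    "Per is a student at this school. "
    "The government provided per a scholarship. "
    "They study hard."
)
print(find_ambiguous_sentences(text))
```

A purely lexical match like this also flags the preposition "per" (as in "per year"), which is one reason a later, context-aware conversion step is useful.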
Then, the system converts each identified many-to-many pronoun to a corresponding pronoun in an unambiguous pronoun group using a custom prompt to the LLM. In one embodiment, the system provides the custom prompt to the LLM to convert each identified many-to-many pronoun into the unambiguous ‘Xe’ pronoun group. So, many-to-many pronouns like ‘his’, ‘vis’, ‘hir’, ‘aer’, ‘her’, ‘per’ and ‘pers’ are converted to the corresponding pronouns in the identified unambiguous pronoun group, i.e. the ‘Xe’ gender. For example, if the initial sentence is ‘Per is a student at this school. The government provided per a scholarship.’, then the converted sentence will be ‘Xe is a student at this school. The government provided xem a scholarship.’. Similarly, if another initial sentence is ‘The bike outside is his. His father told him to park the bike outside’, the converted sentence will look like ‘The bike outside is xyrs. Xyr father told xem to park the bike outside’.
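The conversion can be sketched as a prompt template plus a call to a language model. The wording of the custom prompt is not given in the disclosure, so `PROMPT_TEMPLATE` below is hypothetical, as are the function names; `fake_llm` is a canned stand-in so the sketch runs without a model.

```python
from typing import Callable

# Hypothetical prompt wording; the actual custom prompt is not disclosed.
PROMPT_TEMPLATE = (
    "Rewrite the sentence below, replacing each of the pronouns "
    "his, vis, hir, aer, her, per and pers with the matching form from the "
    "xe/xem/xyr/xyrs pronoun group. Change nothing else.\n\n"
    "Sentence: {sentence}"
)


def convert_sentence(sentence: str, llm: Callable[[str], str]) -> str:
    """Ask the supplied LLM callable to rewrite one sentence into the Xe group."""
    prompt = PROMPT_TEMPLATE.format(sentence=sentence)
    return llm(prompt).strip()


def fake_llm(prompt: str) -> str:
    """Canned stand-in for a real model call (assumption, for offline use)."""
    canned = {
        "The government provided per a scholarship.":
            "The government provided xem a scholarship.",
    }
    sentence = prompt.rsplit("Sentence: ", 1)[1]
    # Unknown sentences are echoed back unchanged.
    return canned.get(sentence, sentence)


print(convert_sentence("The government provided per a scholarship.", fake_llm))
```

Passing the model as a callable keeps the conversion logic independent of any particular LLM client.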
The conversion of each identified many-to-many pronoun to the corresponding pronoun in the unambiguous pronoun group generates the plurality of secondary many-to-one pronouns. An intermediate counterfactual context is obtained after the conversion of the plurality of many-to-many pronouns present in the no-name-context into the plurality of secondary many-to-one pronouns. Now, the intermediate counterfactual context includes a plurality of many-to-one pronouns which is basically a combination of the plurality of primary many-to-one pronouns and the plurality of secondary many-to-one pronouns.
At a step of the present disclosure, the one or more hardware processors of the system identify each many-to-one pronoun of the plurality of many-to-one pronouns and one or more honorific titles that are present in the intermediate counterfactual context using the first custom pattern finder. In particular, the first custom pattern finder creates a first custom pattern of the plurality of many-to-one pronouns and the one or more honorific titles in which each many-to-one pronoun and each honorific title is separated by an operator. In one embodiment, the ‘I’ operator is used by the system for separating each many-to-one pronoun.
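A pattern of this shape can be sketched as a regular expression in which the terms are joined by a separator. The sketch assumes the regex alternation operator `|` as the separator, and the term lists, `build_pattern`, and `find_terms` are illustrative names and subsets rather than the disclosed lists.

```python
import re
from typing import List

# Illustrative subsets (assumptions); the disclosure does not enumerate them.
MANY_TO_ONE = ["him", "them", "ver", "their", "xe", "xem", "xyr", "xyrs"]
HONORIFICS = ["Mr.", "Mrs.", "Ms.", "Mx."]


def build_pattern(terms: List[str]) -> "re.Pattern[str]":
    """Join the terms with '|' so the regex matches any one of them."""
    # Longest terms first so "xyrs" is tried before "xyr".
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    # Lookarounds act as word boundaries that also work for "Mr.".
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


FIRST_PATTERN = build_pattern(MANY_TO_ONE + HONORIFICS)


def find_terms(context: str) -> List[str]:
    """Return every pronoun or honorific title found, in order of appearance."""
    return [m.group(0) for m in FIRST_PATTERN.finditer(context)]


print(find_terms("Mr. <PERSON>, Loan Officer. Xyr father told xem to park the bike outside."))
```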
At a step of the present disclosure, the one or more hardware processors of the system predict a pronoun classification for each many-to-one pronoun of the plurality of many-to-one pronouns and each honorific title of the one or more honorific titles present in the first custom pattern created for the intermediate counterfactual context using a pre-trained decision tree classifier. It should be noted that any other classifier, such as random forest (RF) or XGBoost, can be used for the same purpose instead of the decision tree classifier.
To obtain the pre-trained decision tree classifier, a many-to-one pronouns and honorific titles dataset is first created using the LGBTQ+ community pronouns. Then, a decision tree classifier is trained on the created dataset to predict pronoun classification keys (pronoun classifications), with 100% accuracy. The trained decision tree classifier is then used by the system and is referred to as the pre-trained decision tree classifier.
The pre-trained decision tree classifier, when used by the system, predicts the pronoun classification of each many-to-one pronoun and each honorific title present in the first custom pattern. The predicted pronoun classification is then used by the system to obtain a pronoun classification context. In particular, the system replaces each many-to-one pronoun and each honorific title with a corresponding predicted pronoun classification in the intermediate counterfactual context to obtain the pronoun classification context.
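The replacement step can be sketched as a regex substitution driven by a prediction function. In this illustration a plain lookup table stands in for the pre-trained decision tree classifier, and the classification keys, the `{...}` placeholder syntax, and all names are assumptions rather than the disclosed format.

```python
import re

# Stand-in for the pre-trained classifier: term -> classification key.
# Keys and entries are assumptions made for illustration.
CLASSIFICATION = {
    "xe": "subject",
    "xem": "object",
    "xyr": "possessive",
    "xyrs": "possessive_pronoun",
    "him": "object",
    "them": "object",
    "mr.": "title",
}


def predict(term: str) -> str:
    """Return the classification key for one pronoun or honorific title."""
    return CLASSIFICATION[term.lower()]


# Alternation over all known terms, longest first, with word-boundary lookarounds.
PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(t) for t in sorted(CLASSIFICATION, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def to_classification_context(context: str) -> str:
    """Replace each matched term with its predicted classification key."""
    return PATTERN.sub(lambda m: "{" + predict(m.group(0)) + "}", context)


print(to_classification_context("Xyr father told xem to park the bike outside."))
```

Under these assumptions, the output keeps the sentence structure while exposing one slot per pronoun, so a counterfactual context for a given gender pronoun group can later be produced by filling each slot with that group's form.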
November 20, 2025