Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method comprising: receiving, from a client, a request to determine whether a first vertex and a second vertex, of a particular graph data representation, are connected within a particular number of hops; wherein the particular graph data representation includes a plurality of high-degree vertices; in response to receiving the request: based, at least in part, on landmark connectivity data, determining whether the first vertex and the second vertex are connected via one or more high-degree vertices, of the plurality of high-degree vertices, within the particular number of hops; wherein the landmark connectivity data comprises connectivity information for each high-degree vertex of the plurality of high-degree vertices; responsive to determining that the first vertex and the second vertex are connected via the one or more high-degree vertices, indicating, in a response to the request, that the first vertex and second vertex are connected; wherein the method is performed by one or more computing devices.
This invention relates to graph data analysis, specifically determining connectivity between vertices in large-scale graphs with high-degree nodes. The problem addressed is efficiently checking whether two vertices are connected within a specified number of hops in graphs containing many high-degree vertices, which can slow down traditional traversal methods. The method involves receiving a request to check connectivity between a first and second vertex in a graph containing multiple high-degree vertices. Instead of traversing the entire graph, the system uses precomputed landmark connectivity data, which stores connectivity information for each high-degree vertex. This data helps quickly determine if the two vertices are connected via one or more high-degree vertices within the specified hop limit. If a connection is found through these landmarks, the system responds affirmatively. The approach leverages high-degree vertices as landmarks to reduce computational overhead in large graphs. The process is executed by one or more computing devices.
2. The method of claim 1, further comprising: maintaining, in a data store, a plurality of data objects storing the landmark connectivity data; wherein each data object, of the plurality of data objects, is associated with a respective number of hops; prior to determining whether the first vertex and the second vertex are connected within the particular number of hops, identifying a set of data objects, from the plurality of data objects, based on the particular number of hops; and wherein the landmark connectivity data, on which determining whether the first vertex and the second vertex are connected within the particular number of hops is based, is data stored in the identified set of data objects.
This invention relates to graph data processing, specifically methods for efficiently determining connectivity between vertices in a graph using landmark connectivity data. The problem addressed is the computational inefficiency of traditional graph traversal algorithms when checking connectivity within a limited number of hops (steps) between vertices, especially in large-scale graphs. The method involves storing landmark connectivity data in a data store as multiple data objects, where each object is associated with a specific number of hops. Each data object contains connectivity information for vertices within a certain hop distance from a landmark node. When checking if two vertices (a first and second vertex) are connected within a specified hop limit, the system first identifies a subset of data objects that correspond to the given hop distance. The connectivity determination is then performed using only the data from these selected objects, reducing the search space and improving efficiency. This approach optimizes connectivity queries by leveraging precomputed landmark-based connectivity data, avoiding full graph traversals and instead relying on preprocessed data objects indexed by hop distance. The method is particularly useful in applications requiring fast connectivity checks in large graphs, such as social networks, recommendation systems, or network routing.
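As a rough illustration of data objects indexed by hop count, the sketch below keeps one object per hop count and selects the relevant ones for a query. The names `LandmarkObject`, `links`, and `select_objects` are hypothetical, and the selection rule (hop count no greater than the query limit) is an assumption; the claim only says the set is identified based on the number of hops.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class LandmarkObject:
    """Hypothetical data object holding landmark connectivity for one hop count."""
    hops: int
    # vertex id -> landmark ids linked to that vertex within `hops` hops
    links: Dict[int, Set[int]] = field(default_factory=dict)


def select_objects(store: Dict[int, LandmarkObject], hop_limit: int) -> List[LandmarkObject]:
    """Return only the objects whose hop count fits the query's hop limit.

    Assumption: an object is relevant when its hop count is <= hop_limit.
    """
    return [store[h] for h in sorted(store) if h <= hop_limit]


store = {h: LandmarkObject(hops=h) for h in (1, 2, 3)}
selected = select_objects(store, hop_limit=2)
print([obj.hops for obj in selected])  # prints [1, 2]
```

Keying the store by hop count means a query touches only the objects it needs, which is the narrowing step the claim describes.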
3. The method of claim 2, wherein: the identified set of data objects includes a first object containing forward-directional landmark connectivity data and a second object containing backward-directional landmark connectivity data; determining whether the first vertex and the second vertex are connected within the particular number of hops comprises comparing a portion of the forward-directional landmark connectivity data that is associated with the first vertex to a portion of the backward-directional landmark connectivity data that is associated with the second vertex.
The invention relates to a method for determining connectivity between vertices in a network graph using landmark-based connectivity data. The method addresses the challenge of efficiently verifying whether two vertices are connected within a specified number of hops in large-scale networks, where traditional pathfinding algorithms may be computationally expensive. The method involves analyzing a set of data objects that include forward-directional and backward-directional landmark connectivity data. The forward-directional data object contains connectivity information from a landmark vertex to other vertices, while the backward-directional data object contains connectivity information from those vertices back to the landmark. To determine if two vertices (a first and a second vertex) are connected within a specified number of hops, the method compares a portion of the forward-directional data associated with the first vertex to a portion of the backward-directional data associated with the second vertex. This comparison allows the system to infer connectivity without traversing the entire graph, improving efficiency. The method leverages precomputed landmark data to reduce the computational overhead of connectivity queries, making it suitable for large-scale networks where real-time pathfinding is required. The use of both forward and backward connectivity data ensures accurate results while minimizing the need for exhaustive graph traversal.
4. The method of claim 3, wherein: the portion of the forward-directional landmark connectivity data is represented in a first bitmap, and the portion of the backward-directional landmark connectivity data is represented in a second bitmap; each bit position in both the first bitmap and the second bitmap represents a particular vertex in the particular graph data representation; and comparing the portion of the forward-directional landmark connectivity data and the portion of the backward-directional landmark connectivity data comprises performing a bitwise AND on the first and second bitmaps.
This invention relates to graph data processing, specifically methods for analyzing connectivity between vertices in a graph using landmark-based techniques. The problem addressed is efficiently determining connectivity in large-scale graph structures, which is computationally expensive using traditional traversal. The solution represents landmark connectivity data as bitmaps so that comparisons reduce to fast bitwise operations. The portion of the forward-directional landmark connectivity data associated with the first vertex is represented in a first bitmap, and the portion of the backward-directional landmark connectivity data associated with the second vertex is represented in a second bitmap. Each bit position in both bitmaps corresponds to a particular vertex in the graph. To compare the two portions, the method performs a bitwise AND on the first and second bitmaps; any bit that remains set marks a vertex that appears in both portions, which indicates a vertex through which the two query vertices may be linked. Because the comparison is a pass of bitwise operations rather than a graph traversal, it reduces computational overhead and suits large-scale graph analysis. The bitmaps also store the connectivity information compactly.
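The bitmap comparison can be sketched with Python integers standing in for bitmaps. This is a minimal illustration, not the patented implementation: the helper names are hypothetical, and it assumes that a non-zero AND result is what signals a connection.

```python
from typing import Iterable, List


def to_bitmap(vertex_ids: Iterable[int]) -> int:
    """Build a bitmap in which bit i is set when vertex i is listed."""
    bitmap = 0
    for v in vertex_ids:
        bitmap |= 1 << v
    return bitmap


def set_positions(bitmap: int) -> List[int]:
    """Return the vertex ids whose bits are set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


def connected_via_landmark(forward_bitmap: int, backward_bitmap: int) -> bool:
    """Assumption: any bit set in both bitmaps means a shared vertex exists."""
    return (forward_bitmap & backward_bitmap) != 0


# Forward portion associated with the first vertex, backward portion
# associated with the second vertex (toy data).
fwd = to_bitmap([2, 5, 7])
bwd = to_bitmap([1, 5])
print(connected_via_landmark(fwd, bwd))  # prints True
print(set_positions(fwd & bwd))          # prints [5]
```

How the hop counts of the two portions are combined against the query limit is not shown here; the sketch covers only the AND comparison itself.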
5. The method of claim 1, further comprising: receiving, from a client, a second request to determine whether a third vertex and a fourth vertex, of the particular graph data representation, are connected within the particular number of hops; in response to receiving the second request: determining that the third vertex and the fourth vertex are not connected via one or more high-degree vertices, of the plurality of high-degree vertices in the particular graph data representation, based, at least in part, on the landmark connectivity data; responsive to determining that the third vertex and the fourth vertex are not connected via one or more high-degree vertices, automatically running a brute-force exploration of non-landmark vertices in the particular graph data representation; wherein the brute-force exploration of the non-landmark vertices omits, from exploration, paths through the particular graph data representation that involve any of the plurality of high-degree vertices.
This invention relates to graph data analysis, specifically optimizing connectivity queries in large-scale graph structures. The problem addressed is efficiently determining whether two vertices (nodes) in a graph are connected within a specified number of hops (steps), particularly in graphs with high-degree vertices (nodes with many connections) that can slow down traditional traversal methods. The method involves using precomputed landmark connectivity data to quickly assess whether two vertices are connected via high-degree vertices. If the landmark data indicates they are not, the system performs a brute-force exploration of non-landmark vertices, excluding paths that involve high-degree vertices to improve efficiency. This approach reduces computational overhead by avoiding unnecessary traversals through densely connected nodes. The system first receives a request to check connectivity between two vertices within a defined hop limit. If the landmark data confirms they are not connected via high-degree vertices, it then explores only non-landmark vertices, skipping paths that include high-degree nodes. This selective traversal minimizes the search space, improving performance in large graphs where high-degree vertices could otherwise dominate processing time. The method is particularly useful in applications like social network analysis, recommendation systems, or fraud detection, where graph traversal efficiency is critical.
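The two-stage flow described above (landmark check first, then a fallback that skips high-degree vertices) might look like the following sketch. All names are hypothetical. It assumes `fwd` and `bwd` are precomputed maps from a vertex to the landmarks linked to it within the hop budget, and that the query endpoints are not themselves landmarks.

```python
from typing import Dict, List, Set

Graph = Dict[int, List[int]]


def landmark_hit(fwd: Dict[int, Set[int]], bwd: Dict[int, Set[int]],
                 src: int, dst: int) -> bool:
    """True when src and dst share at least one landmark in the precomputed data."""
    return bool(fwd.get(src, set()) & bwd.get(dst, set()))


def explore(graph: Graph, landmarks: Set[int], vertex: int, dst: int,
            hops_left: int, on_path: Set[int]) -> bool:
    """Brute-force, hop-limited exploration that never steps onto a landmark."""
    if vertex == dst:
        return True
    if hops_left == 0:
        return False
    for nxt in graph.get(vertex, []):
        if nxt in landmarks or nxt in on_path:
            continue  # omit paths through high-degree vertices and cycles
        on_path.add(nxt)
        found = explore(graph, landmarks, nxt, dst, hops_left - 1, on_path)
        on_path.discard(nxt)
        if found:
            return True
    return False


def is_connected(graph: Graph, landmarks: Set[int],
                 fwd: Dict[int, Set[int]], bwd: Dict[int, Set[int]],
                 src: int, dst: int, max_hops: int) -> bool:
    """Stage 1: landmark data. Stage 2: fallback over non-landmark vertices."""
    if landmark_hit(fwd, bwd, src, dst):
        return True
    return explore(graph, landmarks, src, dst, max_hops, {src})


graph = {0: [1, 9], 1: [2], 9: [5]}
landmarks = {9}
fwd, bwd = {0: {9}}, {5: {9}}
print(is_connected(graph, landmarks, fwd, bwd, 0, 5, 2))  # prints True
print(is_connected(graph, landmarks, fwd, bwd, 0, 2, 2))  # prints True
```

Skipping landmarks in the fallback is safe only because the first stage has already ruled out connections through them.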
6. The method of claim 5, further comprising: based on the brute-force exploration of the non-landmark vertices, determining an answer as to whether the third vertex and the fourth vertex are connected in the particular graph data representation within the particular number of hops; in response to determining the answer, indicating the answer in a response to the second request.
This invention relates to graph data processing, specifically methods for determining connectivity between vertices in a graph representation. The claim covers how the fallback search started in claim 5 is completed: after the landmark connectivity data shows that two vertices (a third and a fourth vertex) are not connected via high-degree vertices, the system runs a brute-force exploration of non-landmark vertices. Based on that exploration, the system determines an answer as to whether the two vertices are connected within the specified number of hops. The answer, whether the vertices are connected or not, is then indicated in a response to the second request. Together with the landmark check, this fallback lets the system return a result for vertex pairs whose only connections, if any, run through non-landmark vertices.
7. The method of claim 5, wherein: the brute-force exploration of the non-landmark vertices explores a set of levels of connectivity, within the particular graph data representation, from one to the particular number of hops; the method further comprises avoiding cyclical exploration of each level of connectivity, of the set of levels of connectivity, by the brute-force exploration by tracking, in a stack, vertices on a current path of exploration.
This invention relates to graph data analysis, specifically avoiding cyclical exploration during the brute-force search of claim 5. The problem addressed is that an exhaustive search can loop back onto vertices it has already placed on the current path, wasting work or never terminating. The method explores non-landmark vertices across a set of connectivity levels, from one hop up to the specified number of hops. While exploring each level, it tracks the vertices on the current path of exploration in a stack, so a vertex that is already on the path is not visited again within that same path. This stack-based tracking prevents cycles and the redundant computation they cause. The method is useful in applications requiring exhaustive graph analysis, such as network routing, social network analysis, or biological pathway mapping, where connectivity at multiple levels matters.
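A minimal sketch of stack-based path tracking follows, using an explicit stack instead of recursion. The function name, the `skip` parameter (standing in for the high-degree vertices), and the reading of "levels" as stack depths from one to the hop limit are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List

_DONE = object()  # sentinel marking an exhausted neighbour iterator


def connected_within(graph: Dict[int, List[int]], src: int, dst: int,
                     max_hops: int, skip: FrozenSet[int] = frozenset()) -> bool:
    """Hop-limited exploration that keeps the current path on a stack."""
    if src == dst:
        return True
    path = [src]                              # stack: vertices on the current path
    pending = [iter(graph.get(src, []))]      # one neighbour iterator per path entry
    while pending:
        nxt = next(pending[-1], _DONE)
        if nxt is _DONE:                      # level exhausted: backtrack
            pending.pop()
            path.pop()
            continue
        if nxt in skip or nxt in path:        # omit landmarks and avoid cycles
            continue
        if nxt == dst:                        # reached in len(path) hops
            return True
        if len(path) < max_hops:              # room left to go one level deeper
            path.append(nxt)
            pending.append(iter(graph.get(nxt, [])))
    return False


graph = {0: [1], 1: [0, 2], 2: [0, 3], 3: []}
print(connected_within(graph, 0, 3, 3))  # prints True
print(connected_within(graph, 0, 3, 2))  # prints False
```

Because the path never exceeds the hop limit in length, the membership test `nxt in path` stays cheap even on a plain list.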
8. The method of claim 5, further comprising: during the brute-force exploration of the non-landmark vertices, determining that a number of visited non-landmark vertices exceeds a pre-determined threshold; and responsive to determining that the number of visited non-landmark vertices exceeds the pre-determined threshold, performing a modified bidirectional search on the non-landmark vertices; wherein the modified bidirectional search omits, from exploration, paths through the particular graph data representation that involve any of the plurality of high-degree vertices.
This invention relates to graph traversal algorithms, specifically combining brute-force and bidirectional search for connectivity checks in large graphs. The problem addressed is the computational inefficiency of traditional graph traversal techniques when dealing with high-degree vertices, which can significantly slow down exploration due to their numerous connections. The method is a hybrid approach that combines brute-force exploration with a modified bidirectional search. Initially, non-landmark vertices in the graph are explored using brute-force techniques. If the number of visited non-landmark vertices exceeds a predetermined threshold, the algorithm switches to a modified bidirectional search on the non-landmark vertices. The modification is that the search omits paths that pass through any of the high-degree vertices, which are nodes with an unusually large number of connections. Excluding these vertices avoids the cost of expanding their extensive connections. A bidirectional search generally explores from both the start and target vertices and checks whether the two explorations meet; here the goal is to decide whether the vertices are connected within the hop limit, not to find a shortest path. This approach balances thoroughness with efficiency, particularly in graphs where high-degree vertices would otherwise dominate computational resources. The method is applicable to graph-based applications such as network routing, pathfinding, and data analysis.
9. The method of claim 5, further comprising: during the brute-force exploration of the non-landmark vertices, determining that a number of visited non-landmark vertices exceeds a pre-determined threshold; and responsive to determining that the number of visited non-landmark vertices exceeds the pre-determined threshold, performing a modified breadth-first search on the non-landmark vertices; wherein the modified breadth-first search omits, from exploration, paths through the particular graph data representation that involve any of the plurality of high-degree vertices.
This invention relates to graph traversal techniques for optimizing exploration in large-scale graph structures. The problem addressed is the computational inefficiency of brute-force exploration methods when dealing with high-degree vertices, which can lead to excessive processing time and resource consumption. The solution involves a hybrid approach combining brute-force exploration with a modified breadth-first search (BFS) to improve traversal efficiency. The method begins by exploring non-landmark vertices in a graph using brute-force techniques. During this process, the system monitors the number of visited non-landmark vertices. If this number exceeds a predefined threshold, the traversal switches to a modified BFS. The modified BFS avoids paths that pass through high-degree vertices, which are typically computationally expensive to process. By omitting these paths, the method reduces unnecessary exploration and improves overall traversal efficiency. This approach is particularly useful in applications requiring rapid graph analysis, such as network routing, social network analysis, or large-scale data mining, where minimizing traversal time is critical. The adaptive switching between brute-force and modified BFS ensures balanced performance, avoiding the pitfalls of either method used in isolation.
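The threshold-triggered switch can be sketched as below. This is one possible arrangement under stated assumptions: the brute-force stage counts every vertex it visits and raises a hypothetical `VisitLimitExceeded` once the count passes the threshold, at which point a breadth-first search that skips landmarks restarts from the source. The claim does not say whether the second search restarts or resumes.

```python
from collections import deque
from typing import Dict, List, Set

Graph = Dict[int, List[int]]


class VisitLimitExceeded(Exception):
    """Hypothetical signal that the brute-force stage visited too many vertices."""


def brute_force(graph: Graph, src: int, dst: int, max_hops: int,
                landmarks: Set[int], threshold: int) -> bool:
    visited = 0

    def explore(vertex: int, hops_left: int, on_path: Set[int]) -> bool:
        nonlocal visited
        visited += 1
        if visited > threshold:
            raise VisitLimitExceeded
        if vertex == dst:
            return True
        if hops_left == 0:
            return False
        for nxt in graph.get(vertex, []):
            if nxt in landmarks or nxt in on_path:
                continue
            on_path.add(nxt)
            found = explore(nxt, hops_left - 1, on_path)
            on_path.discard(nxt)
            if found:
                return True
        return False

    return explore(src, max_hops, {src})


def bfs_avoiding(graph: Graph, src: int, dst: int, max_hops: int,
                 landmarks: Set[int]) -> bool:
    """Breadth-first search that omits paths through landmark vertices."""
    if src == dst:
        return True
    seen = {src}
    frontier = deque([(src, 0)])
    while frontier:
        vertex, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for nxt in graph.get(vertex, []):
            if nxt in landmarks or nxt in seen:
                continue
            if nxt == dst:
                return True
            seen.add(nxt)
            frontier.append((nxt, hops + 1))
    return False


def query(graph: Graph, src: int, dst: int, max_hops: int,
          landmarks: Set[int], threshold: int) -> bool:
    try:
        return brute_force(graph, src, dst, max_hops, landmarks, threshold)
    except VisitLimitExceeded:
        return bfs_avoiding(graph, src, dst, max_hops, landmarks)


chain = {0: [1], 1: [2], 2: [3], 3: [4]}
print(query(chain, 0, 4, 4, set(), threshold=2))  # prints True
```

The breadth-first stage keeps a global `seen` set, so each vertex is expanded at most once, which bounds the work the brute-force stage could not.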
10. The method of claim 1, further comprising, prior to receiving the request, automatically compiling the landmark connectivity data by: automatically identifying a range of W values, wherein a W value represents a number of high-degree vertices to identify within the particular graph data representation; wherein automatically identifying the range of W values is based, at least in part, on a target data size for the landmark connectivity data; compiling a first test set of landmark connectivity data based on a maximum W from the range of W values; running a plurality of connectivity searches, based on the first test set of landmark connectivity data, by: identifying a plurality of random pairs of vertices within the particular graph data representation, running a connectivity search for each random pair of vertices, of the plurality of random pairs of vertices, using the first test set of landmark connectivity data, and recording statistics of the plurality of connectivity searches; identifying a baseline performance value for the maximum W based on the recorded statistics of the plurality of connectivity searches; compiling second one or more sets of landmark connectivity data based on respective one or more W values from the range of W values; running second one or more pluralities of connectivity searches, wherein each plurality of connectivity searches, of the second one or more pluralities of connectivity searches is based on a respective second set of landmark connectivity data of the second one or more sets of landmark connectivity data; identifying second one or more performance values, for the respective one or more W values, based on respective recorded statistics for each plurality of connectivity searches of the second one or more pluralities of connectivity searches; identifying an optimal W value that is a lowest W value, in the range of W values, associated with a second performance value that (a) satisfies a minimum performance threshold, and (b) is within a pre-determined threshold of the baseline performance value; wherein the landmark connectivity data, on which determining whether the first vertex and the second vertex are connected is based, identifies a number of high-degree vertices equal to the optimal W value.
This invention relates to optimizing graph data processing, specifically improving the efficiency of connectivity searches in large-scale graph representations. The problem addressed is the computational cost of determining connectivity between vertices in a graph, particularly when dealing with high-degree vertices that can slow down search operations. The solution involves automatically compiling landmark connectivity data to enhance search performance. The method begins by identifying a range of W values, where each W value represents the number of high-degree vertices to include in the landmark connectivity data. This range is determined based on a target data size for the landmark connectivity data. A first test set of landmark connectivity data is compiled using the maximum W value from the range. Multiple connectivity searches are then performed using this test set, with random vertex pairs, and performance statistics are recorded. A baseline performance value is derived from these statistics. Next, additional sets of landmark connectivity data are compiled using different W values from the range. For each set, connectivity searches are performed, and performance values are recorded. The optimal W value is selected as the lowest W that meets a minimum performance threshold while remaining within a predetermined threshold of the baseline performance. The final landmark connectivity data includes only the high-degree vertices corresponding to this optimal W value, ensuring efficient connectivity searches while minimizing computational overhead. This approach balances performance and resource usage in graph-based systems.
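The selection of the optimal W value can be sketched as a small search over candidate values. Everything named here is hypothetical: `measure` stands in for the whole step of compiling landmark connectivity data for a given W, running searches on random vertex pairs, and reducing the recorded statistics to one performance value. The sketch assumes a higher value means better performance and reads "within a pre-determined threshold of the baseline" as falling short of the baseline by no more than `tolerance`.

```python
from typing import Callable, Iterable, Optional


def pick_optimal_w(w_values: Iterable[int],
                   measure: Callable[[int], float],
                   min_performance: float,
                   tolerance: float) -> Optional[int]:
    """Return the lowest W that meets both conditions, or None if none does."""
    candidates = sorted(w_values)
    w_max = candidates[-1]
    baseline = measure(w_max)                 # baseline from the maximum W
    for w in candidates:                      # ascending, so the first hit is the lowest
        perf = baseline if w == w_max else measure(w)
        meets_minimum = perf >= min_performance
        near_baseline = baseline - perf <= tolerance
        if meets_minimum and near_baseline:
            return w
    return None


# Toy performance values per W, invented purely for illustration.
scores = {10: 0.50, 20: 0.78, 40: 0.80, 80: 0.81}
print(pick_optimal_w(scores, scores.get, min_performance=0.6, tolerance=0.05))  # prints 20
```

Choosing the lowest qualifying W keeps the landmark connectivity data as small as possible while staying close to the performance of the largest candidate.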
11. One or more non-transitory computer-readable media storing one or more sequences of instructions that, when executed by one or more processors, cause: receiving, from a client, a request to determine whether a first vertex and a second vertex, of a particular graph data representation, are connected within a particular number of hops; wherein the particular graph data representation includes a plurality of high-degree vertices; in response to receiving the request: based, at least in part, on landmark connectivity data, determining whether the first vertex and the second vertex are connected via one or more high-degree vertices, of the plurality of high-degree vertices, within the particular number of hops; wherein the landmark connectivity data comprises connectivity information for each high-degree vertex of the plurality of high-degree vertices; responsive to determining that the first vertex and the second vertex are connected via the one or more high-degree vertices, indicating, in a response to the request, that the first vertex and second vertex are connected.
This invention relates to graph data processing, specifically optimizing connectivity queries in large-scale graph structures containing high-degree vertices. The problem addressed is efficiently determining whether two vertices in a graph are connected within a specified number of hops, particularly in graphs with numerous high-degree vertices that can create computational bottlenecks. The system uses landmark connectivity data to accelerate these queries. This data precomputes and stores connectivity information for all high-degree vertices in the graph, allowing rapid verification of potential paths between any two vertices. When a query is received to check connectivity between a first and second vertex within a given hop limit, the system first checks if they are connected through one or more high-degree vertices using the precomputed landmark data. If such a connection exists, the system confirms the vertices are connected without requiring full graph traversal. This approach reduces computational overhead by leveraging precomputed connectivity patterns of high-degree vertices, which are often critical nodes in large graphs. The solution is particularly valuable for applications requiring frequent connectivity queries in complex network structures.
12. The one or more non-transitory computer-readable media of claim 11, wherein the one or more sequences of instructions further comprise instructions that, when executed by one or more processors, cause: maintaining, in a data store, a plurality of data objects storing the landmark connectivity data; wherein each data object, of the plurality of data objects, is associated with a respective number of hops; prior to determining whether the first vertex and the second vertex are connected within the particular number of hops, identifying a set of data objects, from the plurality of data objects, based on the particular number of hops; and wherein the landmark connectivity data, on which determining whether the first vertex and the second vertex are connected within the particular number of hops is based, is data stored in the identified set of data objects.
This invention relates to a system for efficiently determining connectivity between vertices in a graph using landmark connectivity data. The problem addressed is the computational inefficiency of traditional graph traversal methods, particularly in large-scale networks where direct pathfinding between arbitrary vertices is resource-intensive. The solution involves precomputing and storing connectivity information relative to a set of landmark nodes, allowing rapid verification of whether two vertices are connected within a specified number of hops without exhaustive traversal. The system maintains a data store containing multiple data objects, each storing landmark connectivity data and associated with a specific hop count. When checking connectivity between two vertices within a given hop limit, the system first identifies a subset of data objects corresponding to that hop count. The connectivity determination is then performed using only the relevant subset, reducing computational overhead. This approach leverages precomputed connectivity information to optimize queries, making it particularly suitable for large-scale graph applications such as social networks, recommendation systems, or network routing. The method ensures efficient verification of vertex connectivity while minimizing memory and processing requirements.
13. The one or more non-transitory computer-readable media of claim 12, wherein: the identified set of data objects includes a first object containing forward-directional landmark connectivity data and a second object containing backward-directional landmark connectivity data; determining whether the first vertex and the second vertex are connected within the particular number of hops comprises comparing a portion of the forward-directional landmark connectivity data that is associated with the first vertex to a portion of the backward-directional landmark connectivity data that is associated with the second vertex.
This invention relates to computer-readable media storing instructions for analyzing connectivity between vertices in a graph structure using landmark-based connectivity data. The problem addressed is efficiently determining whether two vertices are connected within a specified number of hops in large-scale graphs, which is computationally expensive using traditional methods. The system maintains landmark connectivity data in a data store as data objects. A first object holds forward-directional landmark connectivity data, indicating which vertices are reachable from a landmark within a certain number of hops. A second object holds backward-directional landmark connectivity data, indicating which vertices can reach a landmark within the same hop limit. To check connectivity between two vertices, the system compares the portion of the forward-directional data associated with the first vertex to the portion of the backward-directional data associated with the second vertex, to determine whether they share common landmarks within the specified hop limit. This approach reduces the need for exhaustive graph traversal, improving efficiency in large networks. It is particularly useful for applications like social network analysis, network routing, or recommendation systems where quick connectivity queries are essential.
14. The one or more non-transitory computer-readable media of claim 13, wherein: the portion of the forward-directional landmark connectivity data is represented in a first bitmap, and the portion of the backward-directional landmark connectivity data is represented in a second bitmap; each bit position in both the first bitmap and the second bitmap represents a particular vertex in the particular graph data representation; and comparing the portion of the forward-directional landmark connectivity data and the portion of the backward-directional landmark connectivity data comprises performing a bitwise AND on the first and second bitmaps.
This invention relates to graph data processing, specifically optimizing the comparison of landmark connectivity data in directed graphs. The problem addressed is efficiently determining connectivity relationships between vertices in large-scale graph structures, which is computationally expensive using traditional methods. The solution involves representing portions of forward-directional and backward-directional landmark connectivity data as bitmaps. Each bit position in these bitmaps corresponds to a specific vertex in the graph. The portion of the forward-directional connectivity data is stored in a first bitmap, while the portion of the backward-directional connectivity data is stored in a second bitmap. To compare these portions, a bitwise AND operation is performed between the two bitmaps. This operation identifies vertices whose bits are set in both bitmaps, that is, vertices that appear in both the forward-directional portion associated with the first vertex and the backward-directional portion associated with the second vertex, without exhaustive traversal. The bitmap representation is compact and speeds up comparisons because bitwise operations are cheap on common hardware. This approach is particularly useful in applications like network analysis, social network modeling, and recommendation systems where graph connectivity patterns need to be analyzed efficiently.
15. The one or more non-transitory computer-readable media of claim 11, wherein the one or more sequences of instructions further comprise instructions that, when executed by one or more processors, cause: receiving, from a client, a second request to determine whether a third vertex and a fourth vertex, of the particular graph data representation, are connected within the particular number of hops; in response to receiving the second request: determining that the third vertex and the fourth vertex are not connected via one or more high-degree vertices, of the plurality of high-degree vertices in the particular graph data representation, based, at least in part, on the landmark connectivity data; responsive to determining that the third vertex and the fourth vertex are not connected via one or more high-degree vertices, automatically running a brute-force exploration of non-landmark vertices in the particular graph data representation; wherein the brute-force exploration of the non-landmark vertices omits, from exploration, paths through the particular graph data representation that involve any of the plurality of high-degree vertices.
This invention relates to graph data processing, specifically optimizing connectivity queries in large-scale graph structures. The problem addressed is efficiently determining whether two vertices (nodes) in a graph are connected within a specified number of hops (steps) while minimizing computational overhead, particularly in graphs with high-degree vertices (nodes with many connections). The system stores a graph data representation and precomputes landmark connectivity data, which identifies high-degree vertices and their connectivity patterns. When a query is received to check if two vertices (a third and fourth vertex) are connected within a certain number of hops, the system first checks if they are connected via any high-degree vertices using the precomputed data. If no such connection exists, the system performs a brute-force exploration of non-landmark vertices, excluding paths that involve high-degree vertices. This approach reduces the search space by leveraging precomputed connectivity information and avoiding unnecessary exploration of high-degree vertices, improving query efficiency. The method is implemented via executable instructions stored on non-transitory computer-readable media.
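The two-phase query described above might look like the following Python sketch. The layout of the landmark connectivity data is an assumption made for illustration: `to_landmark` maps a vertex to the hop counts needed to reach each landmark, and `from_landmark` maps a landmark to the hop counts needed to reach each vertex. All function names are hypothetical, and the query endpoints are assumed to be non-landmark vertices.

```python
def connected_via_landmark(src, dst, max_hops, to_landmark, from_landmark):
    """True if some landmark L gives hops(src->L) + hops(L->dst) <= max_hops."""
    for landmark, hops_in in to_landmark.get(src, {}).items():
        hops_out = from_landmark.get(landmark, {}).get(dst)
        if hops_out is not None and hops_in + hops_out <= max_hops:
            return True
    return False


def brute_force(graph, current, dst, hops_left, landmarks, path):
    """Hop-limited depth-first search that never steps onto a landmark."""
    if current == dst:
        return True
    if hops_left == 0:
        return False
    for nxt in graph.get(current, ()):
        if nxt in landmarks or nxt in path:
            continue  # omit paths through landmarks; avoid cycles
        if brute_force(graph, nxt, dst, hops_left - 1, landmarks, path | {nxt}):
            return True
    return False


def is_connected(graph, src, dst, max_hops, landmarks, to_landmark, from_landmark):
    """Check the landmark data first; fall back to brute force only if needed."""
    if connected_via_landmark(src, dst, max_hops, to_landmark, from_landmark):
        return True
    return brute_force(graph, src, dst, max_hops, landmarks, frozenset([src]))


# Assumed toy directed graph with a single landmark "H".
graph = {"a": ["H", "b"], "H": ["c"], "b": ["d"], "d": ["e"]}
landmarks = {"H"}
to_landmark = {"a": {"H": 1}}
from_landmark = {"H": {"c": 1}}

via_landmark = is_connected(graph, "a", "c", 2, landmarks, to_landmark, from_landmark)
via_fallback = is_connected(graph, "a", "e", 3, landmarks, to_landmark, from_landmark)
too_far = is_connected(graph, "a", "e", 2, landmarks, to_landmark, from_landmark)
```

The fallback can safely skip landmarks because any path through a landmark would already have been found by the first check.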
16. The one or more non-transitory computer-readable media of claim 15 , wherein the one or more sequences of instructions further comprise instructions that, when executed by one or more processors, cause: based on the brute-force exploration of the non-landmark vertices, determining an answer as to whether the third vertex and the fourth vertex are connected in the particular graph data representation within the particular number of hops; in response to determining the answer, indicating the answer in a response to the second request.
This invention relates to graph data processing, specifically answering connectivity queries when precomputed landmark data cannot confirm a connection. The problem addressed is that a negative landmark check is inconclusive: two vertices that are not connected through any high-degree vertex may still be connected through ordinary vertices within the specified number of hops. Building on the preceding claim, the system receives a second request asking whether a third and a fourth vertex are connected within the hop limit, finds no connection via high-degree vertices, and runs a brute-force exploration of non-landmark vertices. Non-landmark vertices are those that were not selected as high-degree landmarks. Based on that exploration, the system determines whether the two vertices are connected within the specified number of hops and indicates the answer in its response to the request. Because paths through high-degree vertices were already ruled out by the landmark data, the brute-force result completes the check and yields a definitive answer.
17. The one or more non-transitory computer-readable media of claim 15 , wherein: the brute-force exploration of the non-landmark vertices explores a set of levels of connectivity, within the particular graph data representation, from one to the particular number of hops; the one or more sequences of instructions further comprise instructions that, when executed by one or more processors, cause avoiding cyclical exploration of each level of connectivity, of the set of levels of connectivity, by the brute-force exploration by tracking, in a stack, vertices on a current path of exploration.
This invention relates to graph data analysis, specifically preventing cyclical traversal during brute-force exploration of a graph. The problem addressed is that an exhaustive search can loop back onto vertices that are already on its current path, wasting work on paths that revisit the same vertices. The solution explores non-landmark vertices in a brute-force manner across a set of connectivity levels, ranging from one hop up to the specified number of hops. To prevent cyclical exploration, the method tracks the vertices on the current path of exploration in a stack data structure. A vertex that is already on the stack is not entered again on the same path, although it may still be reached along a different path. This approach is useful in applications requiring exhaustive or near-exhaustive graph traversal, such as network analysis, social network mapping, or recommendation systems, where identifying connections within a certain range is critical.
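As an illustration of stack-based cycle avoidance, here is a minimal Python sketch of a hop-limited depth-first exploration. It is one plausible reading of the claim, not the patented implementation. The adjacency-dict graph format and the name `connected_within` are assumptions.

```python
_DONE = object()  # sentinel: the neighbor iterator is exhausted


def connected_within(graph, src, dst, max_hops, landmarks=frozenset()):
    """Hop-limited depth-first exploration using an explicit path stack."""
    if src == dst:
        return True
    if max_hops < 1:
        return False
    path = [src]                            # stack: vertices on the current path
    on_path = {src}                         # same vertices, for fast membership tests
    neighbors = [iter(graph.get(src, ()))]  # one neighbor iterator per stack entry
    while neighbors:
        nxt = next(neighbors[-1], _DONE)
        if nxt is _DONE:                    # this level is exhausted: backtrack
            neighbors.pop()
            on_path.discard(path.pop())
            continue
        if nxt in on_path or nxt in landmarks:
            continue                        # would close a cycle, or is a landmark
        if nxt == dst:
            return True
        if len(path) < max_hops:            # room for at least one more hop
            path.append(nxt)
            on_path.add(nxt)
            neighbors.append(iter(graph.get(nxt, ())))
    return False


# Assumed toy graph containing the cycles a -> b -> a and a -> b -> c -> a.
cyclic = {"a": ["b"], "b": ["c", "a"], "c": ["a", "d"]}
found = connected_within(cyclic, "a", "d", 3)
not_found = connected_within(cyclic, "a", "d", 2)
```

The stack only holds the current path, so memory use is bounded by the hop limit rather than by the number of vertices visited overall.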
18. The one or more non-transitory computer-readable media of claim 15 , wherein the one or more sequences of instructions further comprise instructions that, when executed by one or more processors, cause: during the brute-force exploration of the non-landmark vertices, determining that a number of visited non-landmark vertices exceeds a pre-determined threshold; and responsive to determining that the number of visited non-landmark vertices exceeds the pre-determined threshold, performing a modified bidirectional search on the non-landmark vertices; wherein the modified bidirectional search omits, from exploration, paths through the particular graph data representation that involve any of the plurality of high-degree vertices.
The invention relates to optimizing graph traversal algorithms, specifically for scenarios where brute-force exploration of non-landmark vertices becomes computationally expensive. The problem addressed is the inefficiency of exhaustive search methods in large graphs, where the number of vertices visited can grow quickly with each additional hop. The solution is a hybrid approach combining brute-force exploration with a modified bidirectional search. During brute-force exploration of non-landmark vertices, the system monitors the number of visited vertices. If this number exceeds a predetermined threshold, the algorithm switches to a modified bidirectional search. Like the brute-force exploration, this modified search omits paths that pass through high-degree vertices, which keeps the search space small. The search strategy is therefore adjusted dynamically based on how much of the graph the exploration has already visited, rather than being fixed in advance. This approach is particularly useful in applications requiring rapid graph traversal, such as network routing, pathfinding, or data analysis in large-scale graph structures.
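The threshold-triggered switch can be sketched in Python as follows. This is a hedged illustration: the split of the hop budget between the forward and backward halves, the way visits are counted, and all names (`reverse_of`, `bfs_reach`, `bidirectional`, `search`) are assumptions rather than details taken from the claim.

```python
from collections import deque


def reverse_of(graph):
    """Build the reversed adjacency dict of a directed graph."""
    reverse = {}
    for vertex, neighbors in graph.items():
        for nxt in neighbors:
            reverse.setdefault(nxt, []).append(vertex)
    return reverse


def bfs_reach(adj, start, radius, landmarks):
    """Vertices within `radius` hops of `start`, never entering a landmark."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        vertex, dist = queue.popleft()
        if dist == radius:
            continue
        for nxt in adj.get(vertex, ()):
            if nxt not in seen and nxt not in landmarks:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen


def bidirectional(graph, reverse, src, dst, max_hops, landmarks):
    """Meet in the middle: forward from src, backward from dst."""
    forward = bfs_reach(graph, src, (max_hops + 1) // 2, landmarks)
    backward = bfs_reach(reverse, dst, max_hops // 2, landmarks)
    return not forward.isdisjoint(backward)


def search(graph, src, dst, max_hops, landmarks, threshold):
    """Brute force first; switch strategy once `threshold` visits are exceeded."""
    visited = 0
    stack = [(src, 0, frozenset([src]))]
    while stack:
        vertex, dist, path = stack.pop()
        if vertex == dst:
            return True, "brute-force"
        visited += 1
        if visited > threshold:
            reverse = reverse_of(graph)
            answer = bidirectional(graph, reverse, src, dst, max_hops, landmarks)
            return answer, "bidirectional"
        if dist < max_hops:
            for nxt in graph.get(vertex, ()):
                if nxt not in path and nxt not in landmarks:
                    stack.append((nxt, dist + 1, path | {nxt}))
    return False, "brute-force"


# Assumed toy graph; the only route from "s" to "t" is s -> c -> d -> t.
graph = {"s": ["a", "b", "c"], "c": ["d"], "d": ["t"]}
switched = search(graph, "s", "t", 3, set(), threshold=2)
stayed = search(graph, "s", "t", 3, set(), threshold=100)
out_of_range = search(graph, "s", "t", 2, set(), threshold=2)
```

Each half of the bidirectional search only expands about half the hop limit, which is why it tends to touch fewer vertices than a one-sided exploration of the full limit.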
19. The one or more non-transitory computer-readable media of claim 15 , wherein the one or more sequences of instructions further comprise instructions that, when executed by one or more processors, cause: during the brute-force exploration of the non-landmark vertices, determining that a number of visited non-landmark vertices exceeds a pre-determined threshold; and responsive to determining that the number of visited non-landmark vertices exceeds the pre-determined threshold, performing a modified breadth-first search on the non-landmark vertices; wherein the modified breadth-first search omits, from exploration, paths through the particular graph data representation that involve any of the plurality of high-degree vertices.
This invention relates to graph traversal algorithms, specifically optimizing brute-force exploration in graph-based systems. The problem addressed is the computational inefficiency of exhaustive graph traversal, particularly when encountering high-degree vertices that can lead to excessive exploration and resource consumption. The system involves a graph data representation with vertices categorized as landmark or non-landmark, where non-landmark vertices are explored during brute-force traversal. To mitigate performance degradation, the system monitors the number of visited non-landmark vertices. If this count exceeds a predefined threshold, the traversal switches to a modified breadth-first search (BFS). This modified BFS avoids paths involving high-degree vertices, which are vertices with an unusually high number of connections, to prevent unnecessary exploration. The solution dynamically adjusts the traversal strategy based on exploration progress, balancing between brute-force and BFS methods to optimize computational efficiency. The threshold determines when to switch strategies, ensuring that the system avoids excessive resource usage while maintaining thorough exploration of the graph. This approach is particularly useful in large-scale graph processing, such as network analysis or pathfinding, where high-degree vertices can significantly slow down traversal.
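A breadth-first variant of the fallback might look like the Python sketch below. It assumes an adjacency-dict graph and signals the exceeded threshold with an exception. The names `BudgetExceeded`, `brute_force`, `modified_bfs`, and `is_connected` are hypothetical.

```python
from collections import deque


class BudgetExceeded(Exception):
    """Raised when brute force has visited more vertices than allowed."""


def brute_force(graph, vertex, dst, hops_left, landmarks, path, counter, threshold):
    """Hop-limited depth-first exploration that counts every visit."""
    counter[0] += 1
    if counter[0] > threshold:
        raise BudgetExceeded
    if vertex == dst:
        return True
    if hops_left == 0:
        return False
    return any(
        brute_force(graph, nxt, dst, hops_left - 1, landmarks,
                    path | {nxt}, counter, threshold)
        for nxt in graph.get(vertex, ())
        if nxt not in landmarks and nxt not in path
    )


def modified_bfs(graph, src, dst, max_hops, landmarks):
    """Hop-limited breadth-first search that never enters a landmark."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        vertex, dist = queue.popleft()
        if vertex == dst:
            return True
        if dist == max_hops:
            continue
        for nxt in graph.get(vertex, ()):
            if nxt not in seen and nxt not in landmarks:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return False


def is_connected(graph, src, dst, max_hops, landmarks, threshold):
    """Try brute force; hand off to the modified BFS if the budget runs out."""
    try:
        return brute_force(graph, src, dst, max_hops, landmarks,
                           frozenset([src]), [0], threshold)
    except BudgetExceeded:
        return modified_bfs(graph, src, dst, max_hops, landmarks)


# Assumed toy graph: "H" is a landmark, so the 2-hop path s -> H -> t is omitted.
graph = {"s": ["a", "H"], "a": ["b"], "b": ["t"], "H": ["t"]}
landmarks = {"H"}
after_switch = is_connected(graph, "s", "t", 3, landmarks, threshold=1)
omits_landmark_path = is_connected(graph, "s", "t", 2, landmarks, threshold=1)
no_switch = is_connected(graph, "s", "t", 3, landmarks, threshold=100)
```

The second query returns False even though s -> H -> t exists, because both strategies deliberately skip the landmark; in the described system that path would be covered by the landmark connectivity data instead.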
20. The one or more non-transitory computer-readable media of claim 11 , wherein the one or more sequences of instructions further comprise instructions that, when executed by one or more processors, cause, prior to receiving the request, automatically compiling the landmark connectivity data by: automatically identifying a range of W values, wherein a W value represents a number of high-degree vertices to identify within the particular graph data representation; wherein automatically identifying the range of W values is based, at least in part, on a target data size for the landmark connectivity data; compiling a first test set of landmark connectivity data based on a maximum W from the range of W values; running a plurality of connectivity searches, based on the first test set of landmark connectivity data, by: identifying a plurality of random pairs of vertices within the particular graph data representation, running a connectivity search for each random pair of vertices, of the plurality of random pairs of vertices, using the first test set of landmark connectivity data, and recording statistics of the plurality of connectivity searches; identifying a baseline performance value for the maximum W based on the recorded statistics of the plurality of connectivity searches; compiling second one or more sets of landmark connectivity data based on respective one or more W values from the range of W values; running second one or more pluralities of connectivity searches, wherein each plurality of connectivity searches, of the second one or more pluralities of connectivity searches, is based on a respective second set of landmark connectivity data of the second one or more sets of landmark connectivity data; identifying second one or more performance values, for the respective one or more W values, based on respective recorded statistics for each plurality of connectivity searches of the second one or more pluralities of connectivity searches; identifying an optimal W value that is a lowest W value, in the range of W values, associated with a second performance value that (a) satisfies a minimum performance threshold, and (b) is within a pre-determined threshold of the baseline performance value; wherein the landmark connectivity data, on which determining whether the first vertex and the second vertex are connected is based, identifies a number of high-degree vertices equal to the optimal W value.
This invention relates to optimizing graph data processing, specifically improving the efficiency of connectivity searches in large-scale graph representations. The problem addressed is the computational cost of determining connectivity between vertices in a graph, which can be resource-intensive for large datasets. The solution involves automatically compiling landmark connectivity data by identifying an optimal number of high-degree vertices (W value) to include in the data, balancing performance and storage efficiency. The process begins by determining a range of W values based on a target data size for the landmark connectivity data. A first test set is compiled using the maximum W value in the range, and connectivity searches are performed on random vertex pairs using this test set. Performance statistics are recorded to establish a baseline. Additional test sets are then compiled for other W values in the range, and connectivity searches are repeated to measure performance. The optimal W value is selected as the lowest value that meets a minimum performance threshold while remaining within a predefined performance margin of the baseline. The final landmark connectivity data includes only the high-degree vertices corresponding to this optimal W value, ensuring efficient connectivity searches. This approach reduces computational overhead while maintaining accuracy in connectivity determinations.
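The selection rule for the optimal W value can be sketched in Python. The sketch assumes a single performance number per W where higher is better, and reads "within a pre-determined threshold of the baseline" as an absolute difference. Both are assumptions, as are the names `choose_w` and `measure`. The `measure` callable stands in for compiling a test set and running connectivity searches on random vertex pairs.

```python
def choose_w(w_values, measure, min_performance, tolerance):
    """Return the lowest W that performs acceptably close to the largest W.

    measure(w) is a hypothetical callable returning one performance value
    (higher is better) for landmark data built from w high-degree vertices.
    """
    w_max = max(w_values)
    baseline = measure(w_max)  # baseline performance from the maximum W
    for w in sorted(w_values):
        perf = baseline if w == w_max else measure(w)
        if perf >= min_performance and baseline - perf <= tolerance:
            return w
    return None  # no W in the range met the minimum performance threshold


# Made-up performance numbers, purely to exercise the selection logic.
fake_measurements = {10: 40.0, 20: 85.0, 40: 96.0, 80: 100.0}
best_w = choose_w([10, 20, 40, 80], fake_measurements.get,
                  min_performance=80.0, tolerance=10.0)
```

With these illustrative numbers, W = 20 clears the minimum threshold but falls too far below the baseline, so W = 40 is the lowest value that satisfies both conditions.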
November 10, 2020