GFQL Quick Reference#
This quick reference page provides short examples of various parameters and usage patterns.
Basic Usage#
Chaining Operations
g.gfql(ops=[...], engine=EngineAbstract.AUTO)
gfql sequences multiple matchers to express more complex patterns of paths and subgraphs.
ops: Sequence of graph node and edge matchers (ASTObject instances).
engine: Optional execution engine. Typically left unset, which defaults to 'auto'. Use 'cudf' for GPU acceleration and 'pandas' for CPU.
Node Matchers#
n(filter_dict=None, name=None, query=None)
n matches (filters) nodes based on their attributes.
Parameters:
filter_dict: {attribute: value} or {attribute: condition_function}
name: Optional label; adds a boolean column in the result.
query: Custom query string (e.g., "age > 30 and country == 'USA'").
Examples:
Match nodes where type is ‘person’:
n({"type": "person"})
Match nodes with age greater than 30:
n({"age": lambda x: x > 30})
Use a custom query string:
n(query="age > 30 and country == 'USA'")
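The difference between a plain value and a condition function in filter_dict can be pictured with a small plain-Python model. This is only an illustration of the matching rule described above, not GFQL's implementation (GFQL evaluates filters on dataframes); the helper name `matches`, the sample rows, and the assumption that multiple keys are combined with AND are all hypothetical.

```python
# Illustrative row-at-a-time model; GFQL itself works on whole dataframes.
def matches(row, filter_dict):
    """Return True if every attribute in filter_dict accepts the row's value."""
    for attr, cond in filter_dict.items():
        if attr not in row:
            return False
        value = row[attr]
        # A callable is treated as a condition; anything else as an exact value.
        # Assumption: all entries must hold (logical AND).
        ok = cond(value) if callable(cond) else value == cond
        if not ok:
            return False
    return True

# Hypothetical sample data
nodes = [
    {"id": "a", "type": "person", "age": 42},
    {"id": "b", "type": "person", "age": 25},
    {"id": "c", "type": "company", "age": 60},
]

older_people = [r["id"] for r in nodes
                if matches(r, {"type": "person", "age": lambda x: x > 30})]
print(older_people)  # ['a']
```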
Edge Matchers#
e_forward(edge_match=None, hops=1, min_hops=None, max_hops=None, output_min_hops=None, output_max_hops=None, label_node_hops=None, label_edge_hops=None, label_seeds=False, to_fixed_point=False, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, name=None)
e_reverse(edge_match=None, hops=1, min_hops=None, max_hops=None, output_min_hops=None, output_max_hops=None, label_node_hops=None, label_edge_hops=None, label_seeds=False, to_fixed_point=False, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, name=None)
e_undirected(edge_match=None, hops=1, min_hops=None, max_hops=None, output_min_hops=None, output_max_hops=None, label_node_hops=None, label_edge_hops=None, label_seeds=False, to_fixed_point=False, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, name=None)
# alias for e_undirected
e(edge_match=None, hops=1, min_hops=None, max_hops=None, output_min_hops=None, output_max_hops=None, label_node_hops=None, label_edge_hops=None, label_seeds=False, to_fixed_point=False, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, name=None)
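e (an alias for e_undirected) matches edges in either direction based on their attributes, and may also match on each edge's source and destination nodes.
e_forward and e_reverse take the same parameters but traverse edges in the forward and reverse direction, respectively.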
Parameters:
edge_match: {attribute: value} or {attribute: condition_function}
edge_query: Custom query string for edge attributes.
hops: int, maximum number of hops to traverse (shorthand for max_hops, default 1).
min_hops/max_hops: Inclusive traversal bounds (min defaults to 1 unless max_hops is 0; max defaults to hops).
output_min_hops/output_max_hops: Optional post-filter slice; defaults keep all traversed hops up to max_hops.
label_node_hops/label_edge_hops: Optional column names for hop numbers; label_seeds=True adds hop 0 for seeds.
to_fixed_point: bool, continue traversal until no more matches.
source_node_match: Filter for source nodes.
destination_node_match: Filter for destination nodes.
source_node_query: Custom query string for source nodes.
destination_node_query: Custom query string for destination nodes.
name: Optional label.
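The defaults for the inclusive traversal bounds can be written out as a small function. This is a sketch of the rules as stated in the parameter list, not the library's code; the function name `resolve_hop_bounds` is hypothetical, and the lower bound of 0 when the upper bound is 0 is an assumption, since the text only says the default of 1 does not apply in that case.

```python
def resolve_hop_bounds(hops=1, min_hops=None, max_hops=None):
    """Model of the documented defaults for inclusive traversal bounds."""
    # Documented: max defaults to hops.
    upper = hops if max_hops is None else max_hops
    if min_hops is not None:
        lower = min_hops
    else:
        # Documented: min defaults to 1 unless max_hops is 0.
        # Assumption: in that case the lower bound is 0.
        lower = 0 if upper == 0 else 1
    return lower, upper

print(resolve_hop_bounds(hops=2))                  # (1, 2)
print(resolve_hop_bounds(min_hops=2, max_hops=4))  # (2, 4)
print(resolve_hop_bounds(max_hops=0))              # (0, 0)
```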
Examples:
Traverse up to 2 hops forward on edges where status is ‘active’:
e_forward({"status": "active"}, hops=2)
Traverse 2..4 hops but show only hops 3..4 with labels:
e_forward(
    {"status": "active"},
    min_hops=2,
    max_hops=4,
    output_min_hops=3,
    label_edge_hops="edge_hop"
)
Use custom edge query strings:
e_forward(edge_query="weight > 5 and type == 'connects'")
Filter source and destination nodes with match dictionaries:
e_forward(
    source_node_match={"status": "active"},
    destination_node_match={"age": lambda x: x < 30}
)
Filter source and destination nodes with queries:
e_forward(
    source_node_query="status == 'active'",
    destination_node_query="age < 30"
)
Label matched edges:
e_forward(name="active_edges")
Predicates#
graphistry.compute.predicates.ASTPredicate.ASTPredicate
Matches using a predicate on entity attributes.
See GFQL Operator Reference for more information.
Example:
Match nodes where category is ‘A’, ‘B’, or ‘C’:
from graphistry import n, is_in
n({"category": is_in(["A", "B", "C"])})
Combined Examples#
Find people connected to transactions via active relationships:
g.gfql([
    n({"type": "person"}),
    e_forward({"status": "active"}),
    n({"type": "transaction"})
])
Label nodes and edges during traversal:
g.gfql([
    n({"id": "start_node"}, name="start"),
    e_forward(name="edge1"),
    n({"level": 2}, name="middle"),
    e_forward(name="edge2"),
    n({"type": "end_type"}, name="end")
])
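Since name adds a boolean column to the result, labeled matches can be pulled out afterwards with an ordinary boolean filter. The plain-Python sketch below models only that idea; the helper `tag` and the sample rows are hypothetical, and it ignores the path constraints a real GFQL query applies between steps.

```python
def tag(rows, name, predicate):
    """Return copies of rows with a boolean column `name` set by predicate."""
    return [{**row, name: bool(predicate(row))} for row in rows]

# Hypothetical node rows
nodes = [
    {"id": "start_node", "level": 0},
    {"id": "m1", "level": 2},
    {"id": "z9", "level": 5},
]
nodes = tag(nodes, "start", lambda r: r["id"] == "start_node")
nodes = tag(nodes, "middle", lambda r: r["level"] == 2)

# Select rows by their label column, as one would with a dataframe mask
middle_ids = [r["id"] for r in nodes if r["middle"]]
print(middle_ids)  # ['m1']
```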
Traverse until no more matches (fixed point):
g.gfql([
    n({"status": "infected"}),
    e_forward(to_fixed_point=True),
    n(name="reachable")
])
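As a rough mental model, a fixed-point traversal keeps expanding the frontier until an iteration reaches nothing new. The sketch below shows that idea over a plain edge list; it is not how GFQL is implemented, and the function name `reachable_forward` and the sample edges are hypothetical.

```python
from collections import defaultdict

def reachable_forward(edges, seeds):
    """Expand forward from seeds until an iteration reaches no new nodes."""
    out = defaultdict(set)
    for src, dst in edges:
        out[src].add(dst)

    reached = set()
    frontier = set(seeds)
    while frontier:
        # Only nodes not reached before form the next frontier,
        # so the loop also terminates on cyclic graphs.
        frontier = {d for s in frontier for d in out[s]} - reached
        reached |= frontier
    return reached

# Hypothetical edges: a -> b -> c -> a is a cycle, x -> y is disconnected
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("x", "y")]
print(sorted(reachable_forward(edges, {"a"})))  # ['a', 'b', 'c', 'd']
```

In this model the work grows with the size of the reachable region, which is one reason to bound hops when the full closure is not needed.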
Filter by multiple conditions:
g.gfql([
    n({"type": is_in(["server", "database"])}),
    e_undirected({"protocol": "TCP"}, hops=3),
    n(query="risk_level >= 8")
])
Use custom queries in matchers:
g.gfql([
    n(query="age > 30 and country == 'USA'"),
    e_forward(edge_query="weight > 5"),
    n(query="status == 'active'")
])
GPU Acceleration#
Enable GPU mode:
g.gfql([...], engine='cudf')
Example with cuDF DataFrames:
import cudf
import graphistry
# edge_df and node_df are existing pandas DataFrames
e_gdf = cudf.from_pandas(edge_df)
n_gdf = cudf.from_pandas(node_df)
g = graphistry.nodes(n_gdf, 'node_id').edges(e_gdf, 'src', 'dst')
g.gfql([...], engine='cudf')
Remote Mode#
Query existing remote data
g = graphistry.bind(dataset_id='ds-abc-123')
nodes_df = g.gfql_remote([n()])._nodes
Upload graph and run GFQL
g2 = g1.upload()
g3 = g2.gfql_remote([n(), e(), n()])
Enforce CPU or GPU mode on remote GFQL
g3a = g2.gfql_remote([n(), e(), n()], engine='pandas')
g3b = g2.gfql_remote([n(), e(), n()], engine='cudf')
Return only nodes and certain columns
cols = ['id', 'name']
g2b = g1.gfql_remote([n(), e(), n()], output_type="nodes", node_col_subset=cols)
Return only edges and certain columns
cols = ['src', 'dst']
g2b = g1.gfql_remote([n(), e(), n()], output_type="edges", edge_col_subset=cols)
Return only shape metadata
shape_df = g1.chain_remote_shape([n(), e(), n()])
Run remote Python and get back a graph
def my_remote_trim_graph_task(g):
    return (g
        .nodes(g._nodes[:10])
        .edges(g._edges[:10])
    )
g2 = g1.upload()
g3 = g2.python_remote_g(my_remote_trim_graph_task)
Run remote Python and get back a table
def first_n_edges(g):
    return g._edges[:10]
some_edges_df = g.python_remote_table(first_n_edges)
Run remote Python and get back JSON
def first_n_edges(g):
    return g._edges[:10].to_json()
some_edges_json = g.python_remote_json(first_n_edges)
Run remote Python and ensure it runs on CPU or GPU
g3a = g2.python_remote_g(my_remote_trim_graph_task, engine='pandas')
g3b = g2.python_remote_g(my_remote_trim_graph_task, engine='cudf')
Run remote Python, passing as a string
g2 = g1.upload()
# ensure method is called "task" and takes a single argument "g"
g3 = g2.python_remote_g("""
def task(g):
    return (g
        .nodes(g._nodes[:10])
        .edges(g._edges[:10])
    )
""")
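Because the string form requires a function named task that takes a single argument g, it may help to check the source locally before sending it. The helper below is a hypothetical pre-flight check built on the standard-library ast module; it is not part of graphistry and only inspects the function signature.

```python
import ast

def defines_task(source):
    """Check that source defines a top-level function task(g)."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == "task":
            params = [a.arg for a in node.args.args]
            return params == ["g"]
    return False

code = """
def task(g):
    return g.nodes(g._nodes[:10]).edges(g._edges[:10])
"""
print(defines_task(code))                           # True
print(defines_task("def run(g):\n    return g"))    # False
```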
Let Bindings and DAG Patterns#
Use Let bindings to create directed acyclic graph (DAG) patterns with named operations:
Basic Let with named bindings:
from graphistry import let, ref, n, e_forward, gt
result = g.gfql(let({
    'suspects': [n({'risk_score': gt(80)})],
    'connections': ref('suspects', [
        e_forward({'type': 'transaction'}),
        n()
    ])
}))
# Access results by name
suspects = result._nodes[result._nodes['suspects']]
connections = result._edges[result._edges['connections']]
Complex DAG with multiple references:
from graphistry import let, ref, n, e_forward, gt
result = g.gfql(let({
    'high_value': [n({'balance': gt(100000)})],
    'large_transfers': ref('high_value', [
        e_forward({'type': 'transfer', 'amount': gt(10000)}),
        n()
    ]),
    'suspicious': ref('large_transfers', [
        n({'created_recent': True, 'verified': False})
    ])
}))
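The bindings form a DAG because each ref points back at an earlier name. One way to picture a valid evaluation order is a topological sort over those references. The sketch below uses the standard-library graphlib module on a hand-written dependency map; it is a model for reasoning about binding order, not GFQL's internal scheduler, and the `deps` map is a hypothetical restatement of the example above.

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: binding name -> names it references via ref(...)
deps = {
    "high_value": set(),
    "large_transfers": {"high_value"},
    "suspicious": {"large_transfers"},
}

# Each binding appears after everything it references.
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['high_value', 'large_transfers', 'suspicious']

def is_dag(dependencies):
    """Return True if the references contain no cycle."""
    try:
        list(TopologicalSorter(dependencies).static_order())
        return True
    except CycleError:
        return False

print(is_dag({"a": {"b"}, "b": {"a"}}))  # False
```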
Call Operations#
Run graph algorithms like PageRank, community detection, and layouts directly within your GFQL queries:
Compute PageRank:
from graphistry import call, let, ref, n, e
# Use let() to compose filter + enrichment
result = g.gfql(let({
    'persons': [n({'type': 'person'}), e(), n()],
    'ranked': ref('persons', [call('compute_cugraph', {'alg': 'pagerank', 'damping': 0.85})])
}))
# Results have pagerank column
top_nodes = result._nodes.sort_values('pagerank', ascending=False).head(10)
Community detection with Louvain:
from graphistry import call, let, ref, n, e_forward
# Use let() to compose traversal + community detection
result = g.gfql(let({
    'reachable': [n({'active': True}), e_forward(to_fixed_point=True), n()],
    'communities': ref('reachable', [call('compute_cugraph', {'alg': 'louvain'})])
}))
# Results have community column
communities = result._nodes.groupby('community').size()
Filter and compute within Let:
from graphistry import call, let, ref, n, e, gt
# Split mixed chain into separate bindings
result = g.gfql(let({
    'suspects': [n({'flagged': True}), e(), n()],
    'ranked': ref('suspects', [
        call('compute_cugraph', {'alg': 'pagerank'})
    ]),
    'influencers': ref('ranked', [
        n({'pagerank': gt(0.01)})
    ])
}))
Apply layout algorithms:
from graphistry import call, let, ref, n, e_forward, is_in
# Use let() to compose traversal + layout
result = g.gfql(let({
    'entities': [n({'type': is_in(['person', 'company'])}), e_forward(), n()],
    'positioned': ref('entities', [call('fa2_layout', {'iterations': 100})])
}))
# Results have x, y coordinates for visualization
result.plot()
Tip: For subset-based coloring after GFQL, use result.collections(...) and see Layout Settings & Visualization Embedding.
Remote Graph References#
Reference graphs on remote servers for distributed computing:
Basic remote reference:
from graphistry import remote, n, e_forward, gt
result = g.gfql([
    remote(dataset_id='fraud-network-2024'),
    n({'risk_score': gt(90)}),
    e_forward()
])
Combine remote and local data in Let:
result = g.gfql(let({
    'remote_data': remote(dataset_id='historical-2023'),
    'high_risk': ref('remote_data', [
        n({'risk_score': gt(95)})
    ]),
    'connections': ref('high_risk', [
        e_forward({'type': 'transaction'}),
        n()
    ])
}))
Advanced Usage#
Traversal with source and destination node filters and queries:
e_forward(
    edge_query="type == 'follows' and weight > 2",
    source_node_match={"status": "active"},
    destination_node_query="age < 30",
    hops=2,
    name="social_edges"
)
Node matcher with all parameters:
n(
    filter_dict={"department": "sales"},
    query="age > 25 and tenure > 2",
    name="experienced_sales"
)
Edge matcher combining match dictionaries, a query, and a name:
e_reverse(
    edge_match={"transaction_type": "refund"},
    edge_query="amount > 100",
    source_node_match={"status": "inactive"},
    destination_node_match={"region": "EMEA"},
    name="large_refunds"
)
Parameter Summary#
Common Parameters:
filter_dict: Attribute filters (e.g., {"status": "active"})
query: Custom query string (e.g., "age > 30")
hops: Max hops to traverse (shorthand for max_hops, default 1)
to_fixed_point: Continue traversal until no more matches (bool, default False)
name: Label for matchers (str)
source_node_match, destination_node_match: Filters for connected nodes
source_node_query, destination_node_query: Queries for connected nodes
edge_match: Filters for edges
edge_query: Query for edges
engine: Execution engine (EngineAbstract.AUTO, 'cudf', etc.)
Traversal Directions#
Forward Traversal: e_forward(…)
Reverse Traversal: e_reverse(…)
Undirected Traversal: e_undirected(…)
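The three directions differ only in which endpoint of an edge must already be in the current frontier. A minimal plain-Python model of a single hop, with a hypothetical helper `one_hop` and made-up edges (this is an illustration, not the library's code):

```python
def one_hop(edges, frontier, direction):
    """Return nodes one hop from frontier for the given direction."""
    result = set()
    for src, dst in edges:
        # Forward follows src -> dst; reverse follows dst -> src.
        if direction in ("forward", "undirected") and src in frontier:
            result.add(dst)
        if direction in ("reverse", "undirected") and dst in frontier:
            result.add(src)
    return result

# Hypothetical edges: a -> b and c -> a
edges = [("a", "b"), ("c", "a")]
print(sorted(one_hop(edges, {"a"}, "forward")))     # ['b']
print(sorted(one_hop(edges, {"a"}, "reverse")))     # ['c']
print(sorted(one_hop(edges, {"a"}, "undirected")))  # ['b', 'c']
```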
Tips and Best Practices#
Limit hops for performance: Specify hops to control traversal depth.
Use naming for analysis: Apply name to label and filter results.
Combine filters: Use filter_dict and query for precise matching.
Leverage GPU acceleration: Use engine='cudf' for large datasets.
Avoid infinite loops: Be cautious with to_fixed_point=True in cyclic graphs.
Examples at a Glance#
Find paths of up to 3 hops between two nodes:
g.gfql([
    n({g._node: "Alice"}),
    e_undirected(hops=3),
    n({g._node: "Bob"})
])
Match nodes with IDs in a range:
n(query="100 <= id <= 200")
Traverse edges with specific labels:
e_forward({"label": is_in(["knows", "likes"])})
Identify subgraphs based on attributes:
g.gfql([
    n({"community": "A"}),
    e_undirected(hops=2),
    n({"community": "B"}, name="bridge_nodes")
])
Custom edge and node queries:
g.gfql([
    n(query="age >= 18"),
    e_forward(edge_query="interaction == 'message'"),
    n(query="location == 'NYC'")
])