GFQL Hop Matcher

Hop is the core primitive behind a single matcher step in chain.

Calling hop directly can be faster than calling chain, which may help on larger graphs.

Graph hop/traversal operations for PyGraphistry

NOTE: This module is excluded from pyre (.pyre_configuration) because the complexity of hop() causes pyre to hang. Use mypy instead.

graphistry.compute.hop.hop(self, nodes=None, hops=1, *, min_hops=None, max_hops=None, output_min_hops=None, output_max_hops=None, label_node_hops=None, label_edge_hops=None, label_seeds=False, to_fixed_point=False, direction='forward', edge_match=None, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, return_as_wave_front=False, target_wave_front=None, engine=EngineAbstract.AUTO)

Given a graph and some source nodes, return the subgraph of all paths within k hops of the sources.

This can be faster than the equivalent chain([…]) call, which wraps hop with additional steps.

See the chain() examples for usage of many of the parameters.

  • g: Plotter

  • nodes: dataframe with id column matching g._node. None signifies all nodes (default).

  • hops: consider paths of length 1 to ‘hops’ steps, if any (default 1). Shorthand for max_hops.

  • min_hops/max_hops: inclusive traversal bounds; defaults preserve legacy behavior (min=1 unless max=0; max defaults to hops).

  • output_min_hops/output_max_hops: optional output slice applied after traversal; defaults keep all traversed hops up to max_hops. Useful for showing a subrange (e.g., min/max = 2..4 but display only hops 3..4).

  • label_node_hops/label_edge_hops: optional column names for hop numbers (omit or None to skip). Nodes record the first hop step they are reached (1 = first expansion); edges record the hop step that traversed them.

  • label_seeds: when True and labeling, also write hop 0 for seed nodes in the node label column.

  • to_fixed_point: keep hopping until no new nodes are found (ignores hops)

  • direction: ‘forward’, ‘reverse’, ‘undirected’

  • edge_match: dict of kv-pairs to exact match (see also: filter_edges_by_dict)

  • source_node_match: dict of kv-pairs to match nodes before hopping (including intermediate)

  • destination_node_match: dict of kv-pairs to match nodes after hopping (including intermediate)

  • source_node_query: dataframe query to match nodes before hopping (including intermediate)

  • destination_node_query: dataframe query to match nodes after hopping (including intermediate)

  • edge_query: dataframe query to match edges before hopping (including intermediate)

  • return_as_wave_front: Exclude starting node(s) in return, returning only encountered nodes

  • target_wave_front: Only consider these nodes + self._nodes for reachability

  • engine: ‘auto’, ‘pandas’, ‘cudf’ (GPU)

Parameters:
  • self (Plottable)

  • nodes (Any | None)

  • hops (int | None)

  • min_hops (int | None)

  • max_hops (int | None)

  • output_min_hops (int | None)

  • output_max_hops (int | None)

  • label_node_hops (str | None)

  • label_edge_hops (str | None)

  • label_seeds (bool)

  • to_fixed_point (bool)

  • direction (str)

  • edge_match (dict | None)

  • source_node_match (dict | None)

  • destination_node_match (dict | None)

  • source_node_query (str | None)

  • destination_node_query (str | None)

  • edge_query (str | None)

  • target_wave_front (Any | None)

  • engine (EngineAbstract | str)

Return type:

Plottable
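The hop-labeling rules documented for label_node_hops, label_edge_hops, and label_seeds can be pictured with a small pandas-only sketch. This is not the library's implementation: label_hops, its arguments, and the column names are hypothetical, and it covers only forward traversal on a toy graph. It shows nodes receiving the first hop at which they are reached, edges receiving the hop that traversed them, and seeds optionally receiving hop 0.

```python
import pandas as pd

def label_hops(edges, seeds, max_hops, src="src", dst="dst", label_seeds=False):
    """Toy forward traversal that records hop numbers (illustrative only)."""
    node_hop = {s: 0 for s in seeds} if label_seeds else {}
    edge_hop = {}            # edge index -> hop step that traversed it
    visited = set(seeds)
    wave_front = set(seeds)
    for step in range(1, max_hops + 1):
        # Edges leaving the current wave front
        hit = edges[edges[src].isin(wave_front)]
        if hit.empty:
            break
        for idx in hit.index:
            edge_hop.setdefault(idx, step)
        reached = set(hit[dst])
        for n in reached:
            # Keep the first hop at which a node is reached
            node_hop.setdefault(n, step)
        # Only newly discovered nodes expand in the next step
        wave_front = reached - visited
        visited |= reached
    nodes_out = pd.DataFrame({"id": list(node_hop), "hop": list(node_hop.values())})
    edges_out = edges.loc[list(edge_hop)].assign(hop=list(edge_hop.values()))
    return nodes_out, edges_out

# Hypothetical diamond graph: a -> b, a -> c, b -> d, c -> d, d -> e
edges = pd.DataFrame({"src": ["a", "a", "b", "c", "d"],
                      "dst": ["b", "c", "d", "d", "e"]})
nodes_out, edges_out = label_hops(edges, ["a"], max_hops=2)
seeded_nodes, _ = label_hops(edges, ["a"], max_hops=2, label_seeds=True)
```

With max_hops=2, node d is labeled 2 even though two different edges reach it, and node e is never reached.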

graphistry.compute.hop.prepare_merge_dataframe(edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, is_reverse=False)

Prepare a merge DataFrame handling column name conflicts for hop operations. Centralizes the conflict resolution logic for both forward and reverse directions.

Parameters:

edges_indexed : DataFrame
    The indexed edges DataFrame

column_conflict : bool
    Whether there’s a column name conflict

source_col : str
    The source column name

dest_col : str
    The destination column name

edge_id_col : str
    The edge ID column name

node_col : str
    The node column name

temp_col : str
    The temporary column name to use in case of conflict

is_reverse : bool, default=False
    Whether to prepare for reverse direction hop

Returns:

DataFrame
    A merge DataFrame prepared for hop operation

Parameters:
  • edges_indexed (Any)

  • column_conflict (bool)

  • source_col (str)

  • dest_col (str)

  • edge_id_col (str)

  • node_col (str)

  • temp_col (str)

  • is_reverse (bool)

Return type:

Any
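To make the idea of conflict handling concrete, here is a minimal pandas sketch of one way a join key can be prepared when the node column name collides with an edge column name. It is an assumption-laden illustration, not the body of prepare_merge_dataframe: sketch_prepare_merge, its reduced argument list, and the __hop_key__ column are hypothetical.

```python
import pandas as pd

def sketch_prepare_merge(edges, source_col, dest_col, node_col, temp_col,
                         is_reverse=False):
    """Return (merge_df, key) where key is the column to join the wave front on."""
    # The wave front sits on the source side going forward, the destination side in reverse
    key_col = dest_col if is_reverse else source_col
    conflict = node_col in (source_col, dest_col)
    if conflict:
        # Copy the key into a temporary column so no edge column is overwritten
        return edges.assign(**{temp_col: edges[key_col]}), temp_col
    # No clash: expose the key under the node column name
    return edges.rename(columns={key_col: node_col}), node_col

# Hypothetical edge table whose source column is also called "id"
edges = pd.DataFrame({"id": ["a", "b"], "dst": ["b", "c"]})
merge_df, key = sketch_prepare_merge(edges, "id", "dst",
                                     node_col="id", temp_col="__hop_key__")

# Join a one-node wave front against the prepared frame
wave_front = pd.DataFrame({"id": ["a"]})
hop_edges = merge_df.merge(wave_front.rename(columns={"id": key}),
                           on=key, how="inner")
```

The temporary column keeps the original source and destination columns intact, so they can still be read from the merged result.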

graphistry.compute.hop.process_hop_direction(direction_name, wave_front_iter, edges_indexed, column_conflict, source_col, dest_col, edge_id_col, node_col, temp_col, intermediate_target_wave_front, base_target_nodes, target_col, node_match_query, node_match_dict, is_reverse, debugging)

Process a single hop direction (forward or reverse)

Parameters:

direction_name : str
    Name of the direction for debug logging (‘forward’ or ‘reverse’)

wave_front_iter : DataFrame
    Current wave front of nodes to expand from

edges_indexed : DataFrame
    The indexed edges DataFrame

column_conflict : bool
    Whether there’s a name conflict between node and edge columns

source_col : str
    The source column name

dest_col : str
    The destination column name

edge_id_col : str
    The edge ID column name

node_col : str
    The node column name

temp_col : str
    The temporary column name for conflict resolution

intermediate_target_wave_front : DataFrame or None
    Pre-calculated target wave front for filtering

base_target_nodes : DataFrame
    The base target nodes for destination filtering

target_col : str
    The target column for merging (destination or source depending on direction)

node_match_query : str or None
    Optional query for node filtering

node_match_dict : dict or None
    Optional dictionary for node filtering

is_reverse : bool
    Whether this is the reverse direction

debugging : bool
    Whether debug logging is enabled

Returns:

Tuple[DataFrame, DataFrame]
    The processed hop edges and node IDs

Parameters:
  • direction_name (str)

  • wave_front_iter (Any)

  • edges_indexed (Any)

  • column_conflict (bool)

  • source_col (str)

  • dest_col (str)

  • edge_id_col (str)

  • node_col (str)

  • temp_col (str)

  • intermediate_target_wave_front (Any | None)

  • base_target_nodes (Any)

  • target_col (str)

  • node_match_query (str | None)

  • node_match_dict (dict | None)

  • is_reverse (bool)

  • debugging (bool)

Return type:

Tuple[Any, Any]
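A single expansion step of this kind can be sketched in plain pandas. The version below is a simplified stand-in, not the library's code: sketch_hop_step and its arguments are hypothetical, and it uses isin filters instead of the merges that the parameter names suggest. It also skips conflict handling and target wave fronts. It returns the edges crossed in one direction and the IDs of the nodes reached, after optional dict and query filtering.

```python
import pandas as pd

def sketch_hop_step(wave_front, edges, nodes, source_col, dest_col, node_col,
                    node_match_dict=None, node_match_query=None, is_reverse=False):
    """One forward or reverse expansion step (illustrative only)."""
    # Forward walks source -> destination; reverse walks destination -> source
    from_col, to_col = (dest_col, source_col) if is_reverse else (source_col, dest_col)
    # Edges whose near endpoint is in the current wave front
    hop_edges = edges[edges[from_col].isin(wave_front[node_col])]
    # Candidate far-endpoint nodes, then optional filters
    candidates = nodes[nodes[node_col].isin(hop_edges[to_col])]
    if node_match_dict is not None:
        for col, val in node_match_dict.items():
            candidates = candidates[candidates[col] == val]
    if node_match_query is not None:
        candidates = candidates.query(node_match_query)
    # Drop edges that lead to nodes rejected by the filters
    hop_edges = hop_edges[hop_edges[to_col].isin(candidates[node_col])]
    new_node_ids = candidates[[node_col]].drop_duplicates().reset_index(drop=True)
    return hop_edges, new_node_ids

# Hypothetical toy graph
nodes = pd.DataFrame({"id": ["a", "b", "c"], "kind": ["x", "x", "y"]})
edges = pd.DataFrame({"src": ["a", "a", "b"], "dst": ["b", "c", "c"]})

# Forward from "a", keeping only destinations with kind == "x"
fwd_edges, fwd_ids = sketch_hop_step(pd.DataFrame({"id": ["a"]}), edges, nodes,
                                     "src", "dst", "id",
                                     node_match_dict={"kind": "x"})

# Reverse from "c", no filtering
rev_edges, rev_ids = sketch_hop_step(pd.DataFrame({"id": ["c"]}), edges, nodes,
                                     "src", "dst", "id", is_reverse=True)
```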

graphistry.compute.hop.query_if_not_none(query, df)
Parameters:
  • query (str | None)

  • df (Any)

Return type:

Any
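This helper has no description here, so the following is only a guess at the behavior its name and signature imply: apply a dataframe query when one is given, otherwise pass the dataframe through unchanged. The function name query_if_not_none_sketch is hypothetical, and pandas DataFrame.query is assumed as the query mechanism.

```python
from typing import Optional

import pandas as pd

def query_if_not_none_sketch(query: Optional[str], df: pd.DataFrame) -> pd.DataFrame:
    """Filter df with a query string, or return it untouched when query is None."""
    if query is None:
        return df
    return df.query(query)

# Hypothetical data
df = pd.DataFrame({"x": [1, 2, 3]})
unfiltered = query_if_not_none_sketch(None, df)
filtered = query_if_not_none_sketch("x > 1", df)
```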