Collections in PyGraphistry
Collections define labeled subsets of a graph (nodes, edges, or subgraphs) using full GFQL. They enable advanced, layered styling that overrides base encodings when you need precise highlights.
Use collections when you want:
- baseline encodings (for example, by entity type) plus overlays for alerts or critical paths
- multiple overlapping highlights with a priority order
- a UI panel for toggling focused subsets on and off
Collections are evaluated in priority order, with higher priority collections overriding lower ones for styling.
In this notebook, we build sets using GFQL AST helpers, combine them with intersections, and apply node and edge colors. Collections can be based on nodes, edges, or multi-step graph traversals (Chain).
from pathlib import Path
import pandas as pd
import graphistry
from graphistry import collection_set, collection_intersection, n, e_forward, Chain
edges = pd.read_csv(Path('demos/data/honeypot.csv'))
g = graphistry.edges(edges, "attackerIP", "victimIP")
# Use Chain to select subgraphs (nodes + edges) by edge attributes
collections = [
    collection_set(
        expr=Chain([n(), e_forward({"vulnName": "MS08067 (NetAPI)"}), n()]),
        id='netapi',
        name='MS08067 (NetAPI)',
        node_color='#00BFFF',
        edge_color='#00BFFF',
    ),
    collection_set(
        expr=Chain([n(), e_forward({"victimPort": 445.0}), n()]),
        id='port445',
        name='Port 445',
        node_color='#32CD32',
        edge_color='#32CD32',
    ),
    collection_intersection(
        sets=['netapi', 'port445'],
        name='NetAPI + 445',
        node_color='#AABBCC',
        edge_color='#AABBCC',
    ),
]
g2 = g.collections(
    collections=collections,
    show_collections=True,
    collections_global_node_color='CCCCCC',
    collections_global_edge_color='CCCCCC',
)
g2._url_params
# Render (requires graphistry.register(...))
g2.plot()
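To make the three collections above more concrete, the following sketch mimics them in plain Python on a tiny made-up edge list. It is not how PyGraphistry evaluates GFQL: the rows, their values, and the helper `select_subgraph` are all hypothetical. It assumes that a `Chain([n(), e_forward({...}), n()])` set selects the matching edges plus their endpoints, and that an intersection keeps only what both sets share.

```python
# Hypothetical rows shaped like the honeypot edge table (all values are made up).
toy_edges = [
    {"attackerIP": "a1", "victimIP": "v1", "vulnName": "MS08067 (NetAPI)", "victimPort": 445.0},
    {"attackerIP": "a2", "victimIP": "v1", "vulnName": "MS08067 (NetAPI)", "victimPort": 139.0},
    {"attackerIP": "a3", "victimIP": "v2", "vulnName": "OtherVuln", "victimPort": 445.0},
]


def select_subgraph(rows, **edge_filter):
    """Return (node ids, edge indices) for edges whose attributes equal edge_filter."""
    edge_ids = {
        i for i, row in enumerate(rows)
        if all(row[key] == value for key, value in edge_filter.items())
    }
    node_ids = set()
    for i in edge_ids:
        # Each matching edge contributes both of its endpoints.
        node_ids.update((rows[i]["attackerIP"], rows[i]["victimIP"]))
    return node_ids, edge_ids


netapi_nodes, netapi_edges = select_subgraph(toy_edges, vulnName="MS08067 (NetAPI)")
port445_nodes, port445_edges = select_subgraph(toy_edges, victimPort=445.0)

# Intersection: only the nodes and edges present in both sets.
both_nodes = netapi_nodes & port445_nodes
both_edges = netapi_edges & port445_edges
```

In this toy data only the first row matches both filters, so the intersection holds that single edge and its two endpoints.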
Notes and validation
- Order matters: earlier collections override later ones.
- Use collections for priority-based subsets and overlaps; use encode_* for simple column-driven colors.
- Helper constructors: graphistry.collection_set(...) and graphistry.collection_intersection(...) return JSON-friendly dicts (AST inputs wrap to gfql_chain).
- Provide id for sets used by intersections.
- Global colors apply to nodes/edges not in any collection; # is optional.
- Use validate='strict' to raise, or warn=False to silence warnings.
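Two of the notes above can be pictured as small pre-flight checks. The sketch below is illustrative only and does not reproduce the library's own validation: `missing_set_ids` and `with_hash` are hypothetical helpers, and the flat `sets` key on the intersection dict is an assumption that mirrors the `sets=` keyword argument (the real wire format may nest it differently).

```python
def missing_set_ids(collections):
    """List ids referenced by intersections that no set in the list defines."""
    defined = {
        c["id"] for c in collections
        if c.get("type") == "set" and "id" in c
    }
    missing = []
    for c in collections:
        if c.get("type") == "intersection":
            missing.extend(s for s in c.get("sets", []) if s not in defined)
    return missing


def with_hash(color):
    """Normalize 'CCCCCC' and '#CCCCCC' to the same '#CCCCCC' form."""
    return color if color.startswith("#") else "#" + color


# Hypothetical, simplified collection dicts (expr and colors omitted).
example = [
    {"type": "set", "id": "netapi", "name": "MS08067 (NetAPI)"},
    {"type": "set", "name": "Port 445"},  # no id, so it cannot be referenced
    {"type": "intersection", "name": "NetAPI + 445", "sets": ["netapi", "port445"]},
]
problems = missing_set_ids(example)
```

Here `problems` names the one set the intersection cannot resolve, which is the situation the `id` note warns about.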
Wire protocol and pre-encoded strings:
collections_wire = [
    {
        "type": "set",
        "name": "Wire Protocol Example",
        "node_color": "#AA00AA",
        "expr": {
            "type": "gfql_chain",
            "gfql": [
                {"type": "Node", "filter_dict": {"status": "purchased"}}
            ]
        }
    }
]
# Pass wire-protocol dicts directly
g.collections(collections=collections_wire)
# Pass a pre-encoded string as-is; encoded_collections is assumed to be defined elsewhere
g.collections(collections=encoded_collections, encode=False)
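As a rough illustration of what a pre-encoded string could look like, the sketch below serializes the wire-protocol dicts to compact JSON and percent-encodes the result using only the standard library. The exact encoding PyGraphistry expects is not stated here, so treat the JSON-plus-URL-encoding scheme and the name `encoded_guess` as assumptions rather than the library's documented format.

```python
import json
from urllib.parse import quote, unquote

wire = [
    {
        "type": "set",
        "name": "Wire Protocol Example",
        "node_color": "#AA00AA",
        "expr": {
            "type": "gfql_chain",
            "gfql": [{"type": "Node", "filter_dict": {"status": "purchased"}}],
        },
    }
]

# Assumption: compact JSON, then percent-encoding of every reserved character.
encoded_guess = quote(json.dumps(wire, separators=(",", ":")), safe="")

# Round trip: decoding recovers the original structure.
decoded = json.loads(unquote(encoded_guess))
```

If that assumption holds, a string built this way is the kind of value that would be passed with encode=False so it is not encoded a second time.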
Run g2.plot() in a notebook session with valid credentials to render inline.
Overlap priority example
Earlier collections override later ones when they overlap.
collections_priority = [
    collection_set(
        expr=Chain([n(), e_forward({"vulnName": "MS08067 (NetAPI)"}), n()]),
        id="netapi",
        name="MS08067 (NetAPI)",
        node_color="#FFAA00",
        edge_color="#FFAA00",
    ),
    collection_set(
        expr=Chain([n(), e_forward({"victimPort": 445.0}), n()]),
        id="port445",
        name="Port 445",
        node_color="#00BFFF",
        edge_color="#00BFFF",
    ),
]
g.collections(collections=collections_priority)
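The override rule can be sketched in a few lines of plain Python. This is a conceptual model only, not the renderer's implementation: `resolve_colors`, the node ids, and the membership sets are hypothetical, and the fallback stands in for the global node color applied to nodes outside every collection.

```python
def resolve_colors(node_ids, ordered_collections, global_color):
    """Give each node the color of the first collection that contains it."""
    colors = {}
    for node in node_ids:
        colors[node] = global_color  # default for nodes in no collection
        for members, color in ordered_collections:
            if node in members:
                colors[node] = color
                break  # earlier collections win, so stop at the first match
    return colors


# Hypothetical membership: a1 and v1 belong to both collections.
ordered = [
    ({"a1", "a2", "v1"}, "#FFAA00"),        # listed first: wins on overlap
    ({"a1", "a3", "v1", "v2"}, "#00BFFF"),  # listed second
]
colors = resolve_colors(["a1", "a2", "a3", "v1", "v2", "x9"], ordered, "#CCCCCC")
```

Nodes shared by both collections take the first collection's color, nodes found only in the second keep the second color, and the unmatched node falls back to the global color.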
For more on color encodings, see the Color encodings notebook.