Compute API Reference


ComputeMixin module#

class graphistry.compute.ComputeMixin.ComputeMixin(*a, **kw)#

Bases: Plottable

chain(*args, **kwargs)#

Deprecated since version 2.XX.X: Use gfql() instead for a unified API that supports both chains and DAGs.

Chain a list of ASTObject (node/edge) traversal operations

Return the subgraph of matches according to the list of node and edge matchers. If any matchers are named, add a correspondingly named boolean-valued column to the output.

For direct calls, exposes the convenience type List[ASTObject]. Internal operations should prefer Chain.

Use engine='cudf' to force GPU acceleration mode.

Parameters:
  • ops – List[ASTObject] Various node and edge matchers

  • validate_schema – Whether to validate the chain against the graph schema before executing

  • policy – Optional policy dict for hooks

  • context – Optional ExecutionContext for tracking execution state

Returns:

Plotter

Return type:

Plotter

chain_remote(*args, **kwargs)#

Deprecated since version 2.XX.X: Use gfql_remote() instead for a unified API that supports both chains and DAGs.

Remotely run GFQL chain query on a remote dataset.

Uses the latest bound _dataset_id, and uploads current dataset if not already bound. Note that rebinding calls of edges() and nodes() reset the _dataset_id binding.

Parameters:
  • chain (Union[Chain, List[ASTObject], Dict[str, JSONVal]]) – GFQL chain query as a Python object or in serialized JSON format.

  • api_token (Optional[str]) – Optional JWT token. If not provided, refreshes JWT and uses that.

  • dataset_id (Optional[str]) – Optional dataset_id. If not provided, will fallback to self._dataset_id. If not defined, will upload current data, store that dataset_id, and run GFQL against that.

  • output_type (OutputType) – Whether to return nodes and edges ('all', default), a Plottable with just nodes ('nodes'), or a Plottable with just edges ('edges'). For just a dataframe of the resultant graph shape (output_type='shape'), use chain_remote_shape() instead.

  • format (Optional[FormatType]) – Format in which to fetch results. We recommend a columnar format such as parquet, which is the default when output_type is not shape.

  • df_export_args (Optional[Dict[str, Any]]) – Any additional parameters to pass in when the server parses data.

  • node_col_subset (Optional[List[str]]) – When server returns nodes, what property subset to return. Defaults to all.

  • edge_col_subset (Optional[List[str]]) – When server returns edges, what property subset to return. Defaults to all.

  • engine (EngineAbstractType) – Override which run mode GFQL uses. Defaults to 'auto', which auto-detects based on DataFrame type. Also accepts 'pandas' or 'cudf'.

  • validate (bool) – Whether to locally test code, and if uploading data, the data. Default true.

  • persist (bool) – Whether to persist dataset on server and return dataset_id for immediate URL generation. Default false.

Example: Explicitly upload graph and return subgraph where nodes have at least one edge
import graphistry
import pandas
from graphistry import n, e
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
g1 = graphistry.edges(es, 'src', 'dst').upload()
assert g1._dataset_id, "Graph should have uploaded"

g2 = g1.chain_remote([n(), e(), n()])
print(f'dataset id: {g2._dataset_id}, # nodes: {len(g2._nodes)}')

Example: Return subgraph where nodes have at least one edge, with implicit upload
import graphistry
import pandas
from graphistry import n, e
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
g1 = graphistry.edges(es, 'src', 'dst')
g2 = g1.chain_remote([n(), e(), n()])
print(f'dataset id: {g2._dataset_id}, # nodes: {len(g2._nodes)}')

Example: Return subgraph where nodes have at least one edge, with implicit upload, and force GPU mode
import graphistry
import pandas
from graphistry import n, e
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
g1 = graphistry.edges(es, 'src', 'dst')
g2 = g1.chain_remote([n(), e(), n()], engine='cudf')
print(f'dataset id: {g2._dataset_id}, # nodes: {len(g2._nodes)}')

Return type:

Plottable

chain_remote_shape(*args, **kwargs)#

Deprecated since version 2.XX.X: Use gfql_remote_shape() instead for a unified API that supports both chains and DAGs.

Like chain_remote(), except that instead of returning a Plottable, it returns a pd.DataFrame describing the shape of the resulting graph.

Useful as a fast success indicator: when a match finds hits, only the metadata is returned, avoiding the need to return a full graph.

Example: Upload graph and compute number of nodes with at least one edge
import graphistry
import pandas
from graphistry import n, e
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
g1 = graphistry.edges(es, 'src', 'dst').upload()
assert g1._dataset_id, "Graph should have uploaded"

shape_df = g1.chain_remote_shape([n(), e(), n()])
print(shape_df)

Example: Compute number of nodes with at least one edge, with implicit upload, and force GPU mode
import graphistry
import pandas
from graphistry import n, e
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
g1 = graphistry.edges(es, 'src', 'dst')

shape_df = g1.chain_remote_shape([n(), e(), n()], engine='cudf')
print(shape_df)

Return type:

DataFrame

collapse(node, attribute, column, self_edges=False, unwrap=False, verbose=False)#

Topology-aware collapse by given column attribute starting at node

Traverses directed graph from start node node and collapses clusters of nodes that share the same property so that topology is preserved.

Parameters:
  • node (str | int) – start node to begin traversal

  • attribute (str | int) – the given attribute to collapse over within column

  • column (str | int) – the column of nodes DataFrame that contains attribute to collapse over

  • self_edges (bool) – whether to include self edges in the collapsed graph

  • unwrap (bool) – whether to unwrap the collapsed graph into a single node

  • verbose (bool) – whether to print out collapse summary information

Returns:

A new Graphistry instance whose nodes and edges DataFrames contain the collapsed nodes and edges given by column attribute. The nodes and edges DataFrames contain six new columns, collapse_{node | edges} and final_{node | edges}, while the original (node, src, dst) columns are left untouched.

Return type:

Plottable
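
As a rough illustration of what collapsing clusters that share a property means, the sketch below merges directly connected nodes whose value equals the chosen attribute into one super node. It is a simplified stand-in, not the library's algorithm: it ignores the start node and the traversal order, works on plain dicts and tuples rather than DataFrames, and the helper name collapse_sketch and the super node naming scheme (sorted member ids joined by spaces) are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def collapse_sketch(
    node_values: Dict[str, str],
    edges: List[Tuple[str, str]],
    attribute: str,
    self_edges: bool = False,
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Merge directly connected nodes whose column value equals `attribute`.

    Hypothetical helper: returns (node id -> super node name, collapsed edges).
    """
    parent = {node: node for node in node_values}

    def find(node: str) -> str:
        # Follow parent pointers to the cluster representative.
        while parent[node] != node:
            node = parent[node]
        return node

    # Union the endpoints of every edge whose two nodes both carry the attribute.
    for src, dst in edges:
        if node_values[src] == attribute and node_values[dst] == attribute:
            parent[find(src)] = find(dst)

    # Name each cluster after its sorted member ids (naming scheme is an assumption).
    members: Dict[str, List[str]] = {}
    for node in node_values:
        members.setdefault(find(node), []).append(node)
    name_of = {
        node: " ".join(sorted(group))
        for group in members.values()
        for node in group
    }

    # Rewrite edges onto super nodes, dropping duplicates and, unless requested, self edges.
    collapsed: List[Tuple[str, str]] = []
    for src, dst in edges:
        pair = (name_of[src], name_of[dst])
        if pair[0] == pair[1] and not self_edges:
            continue
        if pair not in collapsed:
            collapsed.append(pair)
    return name_of, collapsed


node_values = {"a": "x", "b": "x", "c": "y", "d": "x"}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
names, collapsed_edges = collapse_sketch(node_values, edges, attribute="x")
print(names)
print(collapsed_edges)
```

Here a and b are adjacent and both carry x, so they become one super node, while d also carries x but stays separate because it is only connected through c.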

drop_nodes(nodes)#

Return g with any nodes/edges involving the node id series removed.

filter_edges_by_dict(*args, **kwargs)#

Filter edges to those that match all values in filter_dict.

filter_nodes_by_dict(*args, **kwargs)#

Filter nodes to those that match all values in filter_dict.
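
The "match all values" rule can be sketched without a dataframe library. The example below uses plain dicts as rows; filter_rows_by_dict is a hypothetical helper written for this illustration, and it assumes exact equality on every key in filter_dict.

```python
from typing import Any, Dict, List


def filter_rows_by_dict(
    rows: List[Dict[str, Any]], filter_dict: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Keep rows where every key in filter_dict is present and equal."""
    return [
        row
        for row in rows
        if all(key in row and row[key] == value for key, value in filter_dict.items())
    ]


nodes = [
    {"id": "a", "type": "person", "city": "Oslo"},
    {"id": "b", "type": "person", "city": "Lima"},
    {"id": "c", "type": "company", "city": "Oslo"},
]
matches = filter_rows_by_dict(nodes, {"type": "person", "city": "Oslo"})
print([row["id"] for row in matches])  # ['a']
```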

get_degrees(col='degree', degree_in='degree_in', degree_out='degree_out')#

Decorate nodes table with degree info

Edges must be dataframe-like: pandas, cudf, …

Parameters determine generated column names

Warning: Self-cycles are currently double-counted. This may change.

Example: Generate degree columns

import graphistry
import pandas as pd

edges = pd.DataFrame({'s': ['a','b','c','d'], 'd': ['c','c','e','e']})
g = graphistry.edges(edges, 's', 'd')
print(g._nodes)  # None
g2 = g.get_degrees()
print(g2._nodes)  # pd.DataFrame with 'id', 'degree', 'degree_in', 'degree_out'
Parameters:
  • col (str)

  • degree_in (str)

  • degree_out (str)
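
To make the three columns and the self-cycle warning concrete, here is a minimal sketch over plain edge tuples. It assumes degree is the sum of degree_in and degree_out, which is one reading of "double-counted"; degree_table is a hypothetical helper, not part of the library.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def degree_table(
    edges: List[Tuple[Hashable, Hashable]]
) -> Dict[Hashable, Dict[str, int]]:
    """Compute in-degree, out-degree, and their sum for every node id."""
    degree_in = Counter(dst for _, dst in edges)
    degree_out = Counter(src for src, _ in edges)
    # Collect node ids in first-seen order.
    node_ids = dict.fromkeys(node for edge in edges for node in edge)
    return {
        node: {
            "degree_in": degree_in[node],
            "degree_out": degree_out[node],
            "degree": degree_in[node] + degree_out[node],
        }
        for node in node_ids
    }


# Same edges as the example above, plus one self-cycle on 'z'.
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e"), ("z", "z")]
table = degree_table(edges)
print(table["c"])  # {'degree_in': 2, 'degree_out': 1, 'degree': 3}
print(table["z"])  # {'degree_in': 1, 'degree_out': 1, 'degree': 2}
```

Under this assumption the single self-cycle on z contributes once to degree_in and once to degree_out, so it counts twice in degree.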

get_indegrees(col='degree_in')#

See get_degrees

Parameters:

col (str)

get_outdegrees(col='degree_out')#

See get_degrees

Parameters:

col (str)

get_topological_levels(level_col='level', allow_cycles=True, warn_cycles=True, remove_self_loops=True)#

Label nodes on column level_col based on topological sort depth.

Supports pandas + cudf, using parallelism within each level computation.

Options:

  • allow_cycles: if False and a cycle is detected, throw ValueException; otherwise break the cycle by picking a lowest-in-degree node

  • warn_cycles: if True and a cycle is detected, proceed with a warning

  • remove_self_loops: preprocess by removing self-cycles. Avoids allow_cycles=False, warn_cycles=True messages.

Example:

import graphistry
import pandas as pd

edges_df = pd.DataFrame({'s': ['a', 'b', 'c', 'd'], 'd': ['b', 'c', 'e', 'e']})
g = graphistry.edges(edges_df, 's', 'd')
g2 = g.get_topological_levels()
g2._nodes.info()  # pd.DataFrame with 'id', 'level'

Parameters:
  • level_col (str)

  • allow_cycles (bool)

  • warn_cycles (bool)

  • remove_self_loops (bool)

Return type:

Plottable
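
The idea of labeling nodes by topological depth can be sketched with a level-by-level variant of Kahn's algorithm. This is an illustration under stated assumptions, not the library's implementation: levels start at 0, cycle breaking is not implemented (a cycle raises ValueError instead), and topological_levels is a hypothetical helper over plain edge tuples.

```python
from typing import Dict, Hashable, List, Tuple


def topological_levels(
    edges: List[Tuple[Hashable, Hashable]], remove_self_loops: bool = True
) -> Dict[Hashable, int]:
    """Assign each node the level at which all of its predecessors are done."""
    # Collect node ids before filtering so self-loop-only nodes are kept.
    nodes = list(dict.fromkeys(node for edge in edges for node in edge))
    if remove_self_loops:
        edges = [(src, dst) for src, dst in edges if src != dst]

    indegree = {node: 0 for node in nodes}
    children: Dict[Hashable, List[Hashable]] = {node: [] for node in nodes}
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)

    levels: Dict[Hashable, int] = {}
    frontier = [node for node in nodes if indegree[node] == 0]
    level = 0
    while frontier:
        next_frontier = []
        for node in frontier:
            levels[node] = level
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_frontier.append(child)
        frontier = next_frontier
        level += 1

    if len(levels) < len(nodes):
        # Some nodes never reached in-degree 0, so the graph has a cycle.
        raise ValueError("cycle detected; this sketch does not break cycles")
    return levels


edges = [("a", "b"), ("b", "c"), ("c", "e"), ("d", "e")]
print(topological_levels(edges))  # {'a': 0, 'd': 0, 'b': 1, 'c': 2, 'e': 3}
```

Node e lands on level 3 rather than 1 because it must wait for its deepest predecessor c, even though d is already finished at level 0.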

gfql(*args, **kwargs)#

Execute a GFQL query - either a chain or a DAG

Unified entrypoint that automatically detects query type and dispatches to the appropriate execution engine.

Parameters:
  • query – GFQL query - ASTObject, List[ASTObject], Chain, ASTLet, or dict

  • engine – Execution engine (auto, pandas, cudf)

  • output – For DAGs, name of binding to return (default: last executed)

  • policy – Optional policy hooks for external control (preload, postload, precall, postcall phases)

Returns:

Resulting Plottable

Return type:

Plottable

Policy Hooks

The policy parameter enables external control over GFQL query execution through hooks at four phases:

  • preload: Before data is loaded (can modify query/engine)

  • postload: After data is loaded (can inspect data size)

  • precall: Before each method call (can deny based on parameters)

  • postcall: After each method call (can validate results, timing)

Policies can accept/deny/modify operations. Modifications are validated against a schema and applied immediately. Recursion is prevented at depth 1.

Policy Example

from graphistry.compute.gfql.policy import PolicyContext, PolicyException
from typing import Optional

def create_tier_policy(max_nodes: int = 10000):
    # State via closure
    state = {"nodes_processed": 0}

    def policy(context: PolicyContext) -> Optional[dict]:
        phase = context['phase']

        if phase == 'preload':
            # Force CPU for free tier
            return {'engine': 'pandas'}

        elif phase == 'postload':
            # Check data size limits
            stats = context.get('graph_stats', {})
            nodes = stats.get('nodes', 0)
            state['nodes_processed'] += nodes

            if state['nodes_processed'] > max_nodes:
                raise PolicyException(
                    phase='postload',
                    reason=f'Node limit {max_nodes} exceeded',
                    code=403,
                    data_size={'nodes': state['nodes_processed']}
                )

        elif phase == 'precall':
            # Restrict operations
            op = context.get('call_op', '')
            if op == 'hypergraph':
                raise PolicyException(
                    phase='precall',
                    reason='Hypergraph not available in free tier',
                    code=403
                )

        return None

    return policy

# Use policy
policy_func = create_tier_policy(max_nodes=1000)
result = g.gfql([n()], policy={
    'preload': policy_func,
    'postload': policy_func,
    'precall': policy_func
})

Example: Chain query

from graphistry.compute.ast import n, e

# As list
result = g.gfql([n({'type': 'person'}), e(), n()])

# As Chain object
from graphistry.compute.chain import Chain
result = g.gfql(Chain([n({'type': 'person'}), e(), n()]))

Example: DAG query

from graphistry.compute.ast import let, ref, n, e

dag = let({
    'people': n({'type': 'person'}),
    'friends': ref('people', [e({'rel': 'knows'}), n()])
})
result = g.gfql(dag)

# Select specific output
friends = g.gfql(dag, output='friends')

Example: Transformations (e.g., hypergraph)

from graphistry.compute import hypergraph

# Simple transformation
hg = g.gfql(hypergraph(entity_types=['user', 'product']))

# Or using call()
from graphistry.compute.ast import call
hg = g.gfql(call('hypergraph', {'entity_types': ['user', 'product']}))

# In a DAG with other operations
result = g.gfql(let({
    'hg': hypergraph(entity_types=['user', 'product']),
    'filtered': ref('hg', [n({'type': 'user'})])
}))

Example: Auto-detection

# List → chain execution
g.gfql([n(), e(), n()])

# Single ASTObject → chain execution
g.gfql(n({'type': 'person'}))

# Dict → DAG execution (convenience)
g.gfql({'people': n({'type': 'person'})})
gfql_remote(chain, api_token=None, dataset_id=None, output_type='all', format=None, df_export_args=None, node_col_subset=None, edge_col_subset=None, engine='auto', validate=True, persist=False)#

Run GFQL query remotely.

This is the remote execution version of gfql(). It supports both simple chains and complex DAG patterns with Let bindings, including transformations like hypergraph.

Example:

# Remote hypergraph transformation
hg = g.gfql_remote(call('hypergraph', {'entity_types': ['user', 'product']}))

# Or using typed builder
from graphistry.compute import hypergraph
hg = g.gfql_remote(hypergraph(entity_types=['user', 'product']))

See chain_remote() for detailed documentation (chain_remote is deprecated).

Parameters:
  • chain (Chain | List[ASTObject] | Dict[str, None | bool | str | float | int | List[Any] | Dict[str, Any]])

  • api_token (str | None)

  • dataset_id (str | None)

  • output_type (Literal['all', 'nodes', 'edges', 'shape'])

  • format (Literal['json', 'csv', 'parquet'] | None)

  • df_export_args (Dict[str, Any] | None)

  • node_col_subset (List[str] | None)

  • edge_col_subset (List[str] | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • validate (bool)

  • persist (bool)

Return type:

Plottable

gfql_remote_shape(chain, api_token=None, dataset_id=None, format=None, df_export_args=None, node_col_subset=None, edge_col_subset=None, engine='auto', validate=True, persist=False)#

Get shape metadata for remote GFQL query execution.

This is the remote shape version of gfql(). Returns metadata about the resulting graph without downloading the full data.

See chain_remote_shape() for detailed documentation (chain_remote_shape is deprecated).

Parameters:
  • chain (Chain | List[ASTObject] | Dict[str, None | bool | str | float | int | List[Any] | Dict[str, Any]])

  • api_token (str | None)

  • dataset_id (str | None)

  • format (Literal['json', 'csv', 'parquet'] | None)

  • df_export_args (Dict[str, Any] | None)

  • node_col_subset (List[str] | None)

  • edge_col_subset (List[str] | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • validate (bool)

  • persist (bool)

Return type:

DataFrame

hop(*args, **kwargs)#

Given a graph and some source nodes, return subgraph of all paths within k-hops from the sources

This can be faster than the equivalent chain([…]) call that wraps it with additional steps

See chain() examples for examples of many of the parameters

Parameters:
  • g – Plotter

  • nodes – dataframe with id column matching g._node. None signifies all nodes (default).

  • hops – consider paths of length 1 to 'hops' steps, if any (default 1). Shorthand for max_hops.

  • min_hops/max_hops – inclusive traversal bounds; defaults preserve legacy behavior (min=1 unless max=0; max defaults to hops).

  • output_min_hops/output_max_hops – optional output slice applied after traversal; defaults keep all traversed hops up to max_hops. Useful for showing a subrange (e.g., min/max = 2..4 but display only hops 3..4).

  • label_node_hops/label_edge_hops – optional column names for hop numbers (omit or None to skip). Nodes record the first hop step they are reached (1 = first expansion); edges record the hop step that traversed them.

  • label_seeds – when True and labeling, also write hop 0 for seed nodes in the node label column.

  • to_fixed_point – keep hopping until no new nodes are found (ignores hops)

  • direction – 'forward', 'reverse', 'undirected'

  • edge_match – dict of kv-pairs to exact match (see also: filter_edges_by_dict)

  • source_node_match – dict of kv-pairs to match nodes before hopping (including intermediate)

  • destination_node_match – dict of kv-pairs to match nodes after hopping (including intermediate)

  • source_node_query – dataframe query to match nodes before hopping (including intermediate)

  • destination_node_query – dataframe query to match nodes after hopping (including intermediate)

  • edge_query – dataframe query to match edges before hopping (including intermediate)

  • return_as_wave_front – Exclude starting node(s) in return, returning only encountered nodes

  • target_wave_front – Only consider these nodes + self._nodes for reachability

  • engine – 'auto', 'pandas', 'cudf' (GPU)
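
The interplay of hops, direction, to_fixed_point, return_as_wave_front and hop labels can be sketched as a breadth-first expansion over plain edge tuples. This is an illustrative approximation, not the library's code: hop_sketch is a hypothetical helper, it covers only those four options, and it assumes seed nodes are never re-labeled when a cycle leads back to them.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def hop_sketch(
    edges: List[Tuple[Hashable, Hashable]],
    seeds: Iterable[Hashable],
    hops: int = 1,
    direction: str = "forward",
    to_fixed_point: bool = False,
    return_as_wave_front: bool = False,
) -> Tuple[Set[Hashable], Dict[Hashable, int]]:
    """Return (reached node ids, first hop at which each non-seed node was reached)."""
    seed_set = set(seeds)

    def expand(frontier: Set[Hashable]) -> Set[Hashable]:
        # Follow edges out of the frontier in the requested direction(s).
        reached = set()
        for src, dst in edges:
            if direction in ("forward", "undirected") and src in frontier:
                reached.add(dst)
            if direction in ("reverse", "undirected") and dst in frontier:
                reached.add(src)
        return reached

    first_hop: Dict[Hashable, int] = {}
    visited = set(seed_set)
    frontier = set(seed_set)
    step = 0
    # to_fixed_point ignores hops and stops only when nothing new is found.
    while frontier and (to_fixed_point or step < hops):
        step += 1
        frontier = expand(frontier) - visited
        for node in frontier:
            first_hop[node] = step  # 1 = first expansion
        visited |= frontier

    nodes = set(first_hop) if return_as_wave_front else visited
    return nodes, first_hop


edges = [("a", "b"), ("b", "c"), ("c", "d")]
nodes, labels = hop_sketch(edges, ["a"], hops=2)
print(sorted(nodes), labels)
```

With two forward hops from a, the sketch reaches b at hop 1 and c at hop 2, and leaves d out; setting to_fixed_point=True would continue until d is reached.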

keep_nodes(nodes)#

Limit nodes and edges to those selected by parameter nodes.

For edges, both source and destination must be in nodes.

Nodes can be a list or series of node IDs, or a dictionary. When a dictionary, each key corresponds to a node column, and nodes will be included when all match.
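
A minimal sketch of this selection rule over plain Python data follows. keep_nodes_sketch is a hypothetical helper, the node id column name 'id' is an assumption, and the dictionary form is assumed to mean exact equality per column; the library may interpret dictionary values differently.

```python
from typing import Any, Dict, Hashable, List, Tuple, Union


def keep_nodes_sketch(
    node_rows: List[Dict[str, Any]],
    edges: List[Tuple[Hashable, Hashable]],
    nodes: Union[List[Hashable], Dict[str, Any]],
    node_col: str = "id",
) -> Tuple[List[Dict[str, Any]], List[Tuple[Hashable, Hashable]]]:
    """Keep selected nodes, and edges whose source AND destination are both kept."""
    if isinstance(nodes, dict):
        # Dictionary form: a node is kept when every listed column matches.
        keep = {
            row[node_col]
            for row in node_rows
            if all(key in row and row[key] == value for key, value in nodes.items())
        }
    else:
        keep = set(nodes)
    kept_nodes = [row for row in node_rows if row[node_col] in keep]
    kept_edges = [(src, dst) for src, dst in edges if src in keep and dst in keep]
    return kept_nodes, kept_edges


node_rows = [
    {"id": "a", "kind": "x"},
    {"id": "b", "kind": "x"},
    {"id": "c", "kind": "y"},
]
edges = [("a", "b"), ("b", "c")]
kept_nodes, kept_edges = keep_nodes_sketch(node_rows, edges, ["a", "b"])
print([row["id"] for row in kept_nodes], kept_edges)
```

The edge from b to c is dropped because c is not selected, even though b is.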

materialize_nodes(reuse=True, engine=EngineAbstract.AUTO)#

Generate g._nodes based on g._edges

Uses g._node for node id if exists, else ‘id’

Edges must be dataframe-like: cudf, pandas, …

When reuse=True and g._nodes is not None, use it

Example: Generate nodes

edges = pd.DataFrame({'s': ['a','b','c','d'], 'd': ['c','c','e','e']})
g = graphistry.edges(edges, 's', 'd')
print(g._nodes)  # None
g2 = g.materialize_nodes()
print(g2._nodes)  # pd.DataFrame
Parameters:
  • reuse (bool)

  • engine (EngineAbstract | str)

Return type:

Plottable
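
Conceptually, materializing nodes means deriving a node table from the ids that appear in the edge list. The sketch below shows that idea with plain Python data; materialize_node_ids is a hypothetical helper, and the output ordering (sources first, then destinations) is an artifact of this sketch rather than documented behavior.

```python
from typing import Dict, Hashable, List, Optional, Tuple


def materialize_node_ids(
    edges: List[Tuple[Hashable, Hashable]],
    existing_nodes: Optional[List[Dict[str, Hashable]]] = None,
    reuse: bool = True,
    node_col: str = "id",
) -> List[Dict[str, Hashable]]:
    """Build one row per distinct node id found in the edge list."""
    if reuse and existing_nodes is not None:
        # Mirrors reuse=True: an existing node table is used as-is.
        return existing_nodes
    sources = [src for src, _ in edges]
    destinations = [dst for _, dst in edges]
    unique_ids = dict.fromkeys(sources + destinations)
    return [{node_col: node} for node in unique_ids]


edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
node_table = materialize_node_ids(edges)
print([row["id"] for row in node_table])  # ['a', 'b', 'c', 'd', 'e']
```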

prune_self_edges()#
python_remote_g(*args, **kwargs)#

Remotely run Python code on a remote dataset that returns a Plottable

Uses the latest bound _dataset_id, and uploads current dataset if not already bound. Note that rebinding calls of edges() and nodes() reset the _dataset_id binding.

Parameters:
  • code (Union[str, Callable[..., object]]) – Python code that includes a top-level function def task(g: Plottable) -> Union[str, Dict].

  • api_token (Optional[str]) – Optional JWT token. If not provided, refreshes JWT and uses that.

  • dataset_id (Optional[str]) – Optional dataset_id. If not provided, will fallback to self._dataset_id. If not defined, will upload current data, store that dataset_id, and run code against that.

  • format (Optional[FormatType]) – What format to fetch results. Defaults to ‘parquet’.

  • output_type (Optional[OutputTypeGraph]) – What shape of output to fetch. Defaults to ‘all’. Options include ‘nodes’, ‘edges’, ‘all’ (both). For other variants, see python_remote_shape and python_remote_json.

  • engine (EngineAbstractType) – Override which run mode GFQL uses. Defaults to ‘auto’ which auto-detects based on DataFrame type. Also accepts ‘pandas’ or ‘cudf’.

  • run_label (Optional[str]) – Optional label for the run for serverside job tracking.

  • validate (bool) – Whether to locally test code, and if uploading data, the data. Default true.

Return type:

Any

Example: Upload data and count the results
import graphistry
import pandas
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
# Parentheses allow the method chain to span several lines
g1 = (
    graphistry
    .edges(es, source='src', destination='dst')
    .upload()
)
assert g1._dataset_id is not None, "Upload should have bound a dataset_id"
g2 = g1.python_remote_g(
    code='''
        from graphistry import Plottable

        def task(g: Plottable) -> Plottable:
            return g
    ''',
    engine='cudf')
num_edges = len(g2._edges)
print(f'num_edges: {num_edges}')

python_remote_json(*args, **kwargs)#

Remotely run Python code on a remote dataset that returns json

Uses the latest bound _dataset_id, and uploads current dataset if not already bound. Note that rebinding calls of edges() and nodes() reset the _dataset_id binding.

Parameters:
  • code (Union[str, Callable[..., object]]) – Python code that includes a top-level function def task(g: Plottable) -> Union[str, Dict].

  • api_token (Optional[str]) – Optional JWT token. If not provided, refreshes JWT and uses that.

  • dataset_id (Optional[str]) – Optional dataset_id. If not provided, will fallback to self._dataset_id. If not defined, will upload current data, store that dataset_id, and run code against that.

  • engine (EngineAbstractType) – Override which run mode GFQL uses. Defaults to ‘auto’ which auto-detects based on DataFrame type. Also accepts ‘pandas’ or ‘cudf’.

  • run_label (Optional[str]) – Optional label for the run for serverside job tracking.

  • validate (bool) – Whether to locally test code, and if uploading data, the data. Default true.

Return type:

Any

Example: Upload data and count the results
import graphistry
import pandas
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
# Parentheses allow the method chain to span several lines
g1 = (
    graphistry
    .edges(es, source='src', destination='dst')
    .upload()
)
assert g1._dataset_id is not None, "Upload should have bound a dataset_id"
obj = g1.python_remote_json(
    code='''
        from typing import Any, Dict
        from graphistry import Plottable

        def task(g: Plottable) -> Dict[str, Any]:
            return {'num_edges': len(g._edges)}
    ''',
    engine='cudf')
num_edges = obj['num_edges']
print(f'num_edges: {num_edges}')

python_remote_table(*args, **kwargs)#

Remotely run Python code on a remote dataset that returns a table

Uses the latest bound _dataset_id, and uploads current dataset if not already bound. Note that rebinding calls of edges() and nodes() reset the _dataset_id binding.

Parameters:
  • code (Union[str, Callable[..., object]]) – Python code that includes a top-level function def task(g: Plottable) -> Union[str, Dict].

  • api_token (Optional[str]) – Optional JWT token. If not provided, refreshes JWT and uses that.

  • dataset_id (Optional[str]) – Optional dataset_id. If not provided, will fallback to self._dataset_id. If not defined, will upload current data, store that dataset_id, and run code against that.

  • format (Optional[FormatType]) – What format to fetch results. Defaults to ‘parquet’.

  • output_type (Optional[OutputTypeGraph]) – What shape of output to fetch. Defaults to ‘table’. Options include ‘table’, ‘nodes’, and ‘edges’.

  • engine (EngineAbstractType) – Override which run mode GFQL uses. Defaults to ‘auto’ which auto-detects based on DataFrame type. Also accepts ‘pandas’ or ‘cudf’.

  • run_label (Optional[str]) – Optional label for the run for serverside job tracking.

  • validate (bool) – Whether to locally test code, and if uploading data, the data. Default true.

Return type:

Any

Example: Upload data and count the results
import graphistry
import pandas
es = pandas.DataFrame({'src': [0,1,2], 'dst': [1,2,0]})
# Parentheses allow the method chain to span several lines
g1 = (
    graphistry
    .edges(es, source='src', destination='dst')
    .upload()
)
assert g1._dataset_id is not None, "Upload should have bound a dataset_id"
edges_df = g1.python_remote_table(
    code='''
        from typing import Any
        from graphistry import Plottable

        def task(g: Plottable) -> Any:
            return g._edges
    ''',
    engine='cudf')
num_edges = len(edges_df)
print(f'num_edges: {num_edges}')

to_cudf()#

Convert to GPU mode by converting any defined nodes and edges to cudf dataframes

When nodes or edges are already cudf dataframes, they are left as is

Parameters:

g (Plottable) – Graphistry object

Returns:

Graphistry object

Return type:

Plottable

to_pandas()#

Convert to CPU mode by converting any defined nodes and edges to pandas dataframes

When nodes or edges are already pandas dataframes, they are left as is

Return type:

Plottable

Collapse#

graphistry.compute.collapse.check_default_columns_present_and_coerce_to_string(g)#

Helper to set COLLAPSE columns to nodes and edges dataframe, while converting src, dst, node to dtype(str)

Generates unique internal column names to avoid conflicts with user data. Stores the generated names as attributes on the graph object:

  • g._collapse_node_col

  • g._collapse_src_col

  • g._collapse_dst_col

Parameters:

g (Plottable) – graphistry instance

Returns:

graphistry instance

graphistry.compute.collapse.check_has_set(ndf, parent, child, collapse_node_col)#
Parameters:

collapse_node_col (str)

graphistry.compute.collapse.collapse_algo(g, child, parent, attribute, column, seen)#

Basically candy crush over graph properties in a topology aware manner

Checks whether the child node has the desired property from the parent. There are four cases for (start_node=parent has_attribute, children nodes has_attribute): (T, T), (F, T), (T, F) and (F, F). Depending on the case, we start a recursive collapse (or not) on the children, reassigning nodes and edges.

If (T, T), append the children nodes to start_node, re-assign the name of the node, and update the edge table with the new name.

If (F, T), start k (potentially new) super nodes, with k the number of children of start_node. The start node keeps k outgoing edges.

If (T, F), it is the end of the cluster, and we keep the new node as is; keep going.

If (F, F), keep going.

Parameters:
  • seen (dict)

  • g (Plottable) – graphistry instance

  • child (str | int) – child node to start traversal, for first traversal, set child=parent or vice versa.

  • parent (str | int) – parent node to start traversal, in main call, this is set to child.

  • attribute (str | int) – attribute to collapse by

  • column (str | int) – column in nodes dataframe to collapse over.

Returns:

graphistry instance with collapsed nodes.

graphistry.compute.collapse.collapse_by(self, parent, start_node, attribute, column, seen, self_edges=False, unwrap=False, verbose=True)#

Main call in collapse.py, collapses nodes and edges by attribute, and returns normalized graphistry object.

Parameters:
  • self (Plottable) – graphistry instance

  • parent (str | int) – parent node to start traversal, in main call, this is set to child.

  • start_node (str | int)

  • attribute (str | int) – attribute to collapse by

  • column (str | int) – column in nodes dataframe to collapse over.

  • seen (dict) – dict of previously collapsed pairs – (n1, n2) is seen as different from (n2, n1)

  • verbose (bool) – bool, default True

  • self_edges (bool)

  • unwrap (bool)

Returns:

graphistry instance with collapsed and normalized nodes.

Return type:

Plottable

graphistry.compute.collapse.collapse_nodes_and_edges(g, parent, child)#

Asserts that parent and child node in ndf should be collapsed into super node. Sets new ndf with COLLAPSE nodes in graphistry instance g

This asserts that we SHOULD merge parent and child as a super node; outside logic controls when that is the case. For example, it assumes parent is already in the cluster keys of the COLLAPSE node.

Parameters:
  • g (Plottable) – graphistry instance

  • parent (str | int) – node with attribute in column

  • child (str | int) – node with attribute in column

Returns:

graphistry instance

graphistry.compute.collapse.get_children(g, node_id, hops=1)#

Helper that gets children at k-hops from node node_id

Parameters:
  • g (Plottable)

  • node_id (str | int)

  • hops (int)

Returns:

graphistry instance of hops

graphistry.compute.collapse.get_cluster_store_keys(ndf, node, collapse_node_col)#

Main innovation in finding and adding to super node. Checks if node is a segment in any collapse_node in COLLAPSE column of nodes DataFrame

Parameters:
  • ndf (DataFrame) – node DataFrame

  • node (str | int) – node to find

  • collapse_node_col (str) – the collapse node column name

Returns:

DataFrame of bools of where wrap_key(node) exists in COLLAPSE column

graphistry.compute.collapse.get_edges_in_out_cluster(g, node_id, attribute, column, directed=True)#

Traverses children of node_id and separates them into incluster and outcluster sets depending if they have attribute in node DataFrame column

Parameters:
  • g (Plottable) – graphistry instance

  • node_id (str | int) – node with attribute in column

  • attribute (str | int) – attribute to collapse in column over

  • column (str | int) – column to collapse over

  • directed (bool)

graphistry.compute.collapse.get_edges_of_node(g, node_id, outgoing_edges=True, hops=1)#

Gets edges of node at k-hops from node

Parameters:
  • g (Plottable) – graphistry instance

  • node_id (str | int) – node to find edges from

  • outgoing_edges (bool) – bool, if true, finds all outgoing edges of node, default True

  • hops (int) – the number of hops from node to take, default = 1

Returns:

DataFrame of edges

graphistry.compute.collapse.get_new_node_name(ndf, parent, child, collapse_node_col)#

If child in cluster group, melts name, else makes new parent_name from parent, child

Parameters:
  • ndf (DataFrame) – node DataFrame

  • parent (str | int) – node with attribute in column

  • child (str | int) – node with attribute in column

  • collapse_node_col (str) – the collapse node column name

Returns:

new_parent_name

Return type:

str

graphistry.compute.collapse.has_edge(g, n1, n2, directed=True)#

Checks if n1 and n2 share an edge (directed or not)

Parameters:
  • g (Plottable) – graphistry instance

  • n1 (str | int) – node to check if has edge to n2

  • n2 (str | int) – node to check if has edge to n1

  • directed (bool) – bool, if True, checks only outgoing edges from n1 -> n2, else finds undirected edges

Returns:

bool, if edge exists between n1 and n2

Return type:

bool
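
The effect of the directed flag can be shown on a plain list of edge tuples. has_edge_sketch is a hypothetical stand-in that takes the edge list directly instead of a graphistry instance.

```python
from typing import Hashable, List, Tuple


def has_edge_sketch(
    edges: List[Tuple[Hashable, Hashable]],
    n1: Hashable,
    n2: Hashable,
    directed: bool = True,
) -> bool:
    """True if n1 -> n2 exists; when directed=False, n2 -> n1 also counts."""
    if (n1, n2) in edges:
        return True
    return (not directed) and (n2, n1) in edges


edges = [("a", "b"), ("b", "c")]
print(has_edge_sketch(edges, "a", "b"))                  # True
print(has_edge_sketch(edges, "b", "a"))                  # False
print(has_edge_sketch(edges, "b", "a", directed=False))  # True
```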

graphistry.compute.collapse.has_property(g, ref_node, attribute, column)#

Checks if ref_node is in the node dataframe in column with attribute

Parameters:
  • g (Plottable) – graphistry instance

  • ref_node (str | int) – node to check if it has attribute in column

  • attribute (str | int)

  • column (str | int)

Returns:

bool

Return type:

bool

graphistry.compute.collapse.in_cluster_store_keys(ndf, node, collapse_node_col)#

Checks if node is in collapse_node in COLLAPSE column of nodes DataFrame

Parameters:
  • ndf (DataFrame) – nodes DataFrame

  • node (str | int) – node to find

  • collapse_node_col (str) – the collapse node column name

Returns:

bool

Return type:

bool

graphistry.compute.collapse.melt(ndf, node, collapse_node_col)#

Reduces node if in cluster store, otherwise passes it through.

Example: node = “4” will take any sequence from get_cluster_store_keys, “1 2 3”, “4 3 6”, and return “1 2 3 4 6” when they have a common entry (3).

Parameters:
  • ndf (DataFrame) – node DataFrame

  • node (str | int) – node to melt

  • collapse_node_col (str) – the collapse node column name

Returns:

new_parent_name of super node

Return type:

str

graphistry.compute.collapse.normalize_graph(g, self_edges=False, unwrap=False)#

Final step after collapse traversals are done: removes duplicates and moves the COLLAPSE columns into the respective (node, src, dst) columns of the nodes and edges DataFrames of Graphistry instance g.

Parameters:
  • g (Plottable) – graphistry instance

  • self_edges (bool) – bool, whether to keep duplicates from ndf, edf, default False

  • unwrap (bool) – bool, whether to unwrap node text with ~, default False

Returns:

final graphistry instance

Return type:

Plottable

graphistry.compute.collapse.reduce_key(key)#

Takes “1 1 2 1 2 3” -> “1 2 3”

Parameters:

key (str | int) – node name

Returns:

new node name with duplicates removed

Return type:

str
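
One way to get the documented result is an order-preserving de-duplication of space-separated tokens. The sketch below assumes that tokens are separated by whitespace and that first-seen order is kept; reduce_key_sketch is a hypothetical stand-in, not the library function.

```python
from typing import Union


def reduce_key_sketch(key: Union[str, int]) -> str:
    """Drop repeated tokens from a space-separated key, keeping first-seen order."""
    tokens = str(key).split()
    return " ".join(dict.fromkeys(tokens))


print(reduce_key_sketch("1 1 2 1 2 3"))  # 1 2 3
```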

graphistry.compute.collapse.unpack(g)#

Helper method that unpacks graphistry instance

ex:

ndf, edf, src, dst, node = unpack(g)

Parameters:

g (Plottable) – graphistry instance

Returns:

node DataFrame, edge DataFrame, source column, destination column, node column

graphistry.compute.collapse.unwrap_key(name)#

Unwraps node name: ~name~ -> name

Parameters:

name (str | int) – node to unwrap

Returns:

unwrapped node name

Return type:

str

graphistry.compute.collapse.wrap_key(name)#

Wraps node name -> ~name~

Parameters:

name (str | int) – node name

Returns:

wrapped node name

Return type:

str
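
The wrapping convention shown for wrap_key and unwrap_key (name <-> ~name~) can be sketched as a round trip. These are hypothetical stand-ins: how the library treats names that already contain ~ is not stated here, so the sketch only strips leading and trailing tildes.

```python
from typing import Union


def wrap_key_sketch(name: Union[str, int]) -> str:
    """Wrap a node name in tildes: name -> ~name~."""
    return f"~{name}~"


def unwrap_key_sketch(name: Union[str, int]) -> str:
    """Remove the surrounding tildes: ~name~ -> name."""
    return str(name).strip("~")


wrapped = wrap_key_sketch(42)
print(wrapped)                     # ~42~
print(unwrap_key_sketch(wrapped))  # 42
```

Wrapping gives each id explicit delimiters, so a lookup for ~1~ inside a joined cluster key does not accidentally match ~11~.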

Conditional#

class graphistry.compute.conditional.ConditionalMixin(*a, **kw)#

Bases: Plottable

DGL_graph: Any | None#
addStyle(fg=None, bg=None, page=None, logo=None)#
Parameters:
  • fg (Dict[str, Any] | None)

  • bg (Dict[str, Any] | None)

  • page (Dict[str, Any] | None)

  • logo (Dict[str, Any] | None)

Return type:

Plottable

base_url_client(v=None)#
Parameters:

v (str | None)

Return type:

str

base_url_server(v=None)#
Parameters:

v (str | None)

Return type:

str

bind(source=None, destination=None, node=None, edge=None, edge_title=None, edge_label=None, edge_color=None, edge_weight=None, edge_size=None, edge_opacity=None, edge_icon=None, edge_source_color=None, edge_destination_color=None, point_title=None, point_label=None, point_color=None, point_weight=None, point_size=None, point_opacity=None, point_icon=None, point_x=None, point_y=None, point_longitude=None, point_latitude=None, dataset_id=None, url=None, nodes_file_id=None, edges_file_id=None)#
Parameters:
  • source (str | None)

  • destination (str | None)

  • node (str | None)

  • edge (str | None)

  • edge_title (str | None)

  • edge_label (str | None)

  • edge_color (str | None)

  • edge_weight (str | None)

  • edge_size (str | None)

  • edge_opacity (str | None)

  • edge_icon (str | None)

  • edge_source_color (str | None)

  • edge_destination_color (str | None)

  • point_title (str | None)

  • point_label (str | None)

  • point_color (str | None)

  • point_weight (str | None)

  • point_size (str | None)

  • point_opacity (str | None)

  • point_icon (str | None)

  • point_x (str | None)

  • point_y (str | None)

  • point_longitude (str | None)

  • point_latitude (str | None)

  • dataset_id (str | None)

  • url (str | None)

  • nodes_file_id (str | None)

  • edges_file_id (str | None)

Return type:

Plottable

chain(ops)#

ops is Union[List[ASTObject], Chain]

Parameters:

ops (Any | List[Any])

Return type:

Plottable

chain_remote(chain, api_token=None, dataset_id=None, output_type='all', format=None, df_export_args=None, node_col_subset=None, edge_col_subset=None, engine=None, validate=True, persist=False)#

chain is Union[List[ASTObject], Chain]

Parameters:
  • self (Plottable)

  • chain (Any | Dict[str, None | bool | str | float | int | List[Any] | Dict[str, Any]])

  • api_token (str | None)

  • dataset_id (str | None)

  • output_type (Literal['all', 'nodes', 'edges', 'shape'])

  • format (Literal['json', 'csv', 'parquet'] | None)

  • df_export_args (Dict[str, Any] | None)

  • node_col_subset (List[str] | None)

  • edge_col_subset (List[str] | None)

  • engine (Literal['pandas', 'cudf'] | None)

  • validate (bool)

  • persist (bool)

Return type:

Plottable

chain_remote_shape(chain, api_token=None, dataset_id=None, format=None, df_export_args=None, node_col_subset=None, edge_col_subset=None, engine=None, validate=True, persist=False)#

chain is Union[List[ASTObject], Chain]

Parameters:
  • self (Plottable)

  • chain (Any | Dict[str, None | bool | str | float | int | List[Any] | Dict[str, Any]])

  • api_token (str | None)

  • dataset_id (str | None)

  • format (Literal['json', 'csv', 'parquet'] | None)

  • df_export_args (Dict[str, Any] | None)

  • node_col_subset (List[str] | None)

  • edge_col_subset (List[str] | None)

  • engine (Literal['pandas', 'cudf'] | None)

  • validate (bool)

  • persist (bool)

Return type:

DataFrame

client_protocol_hostname(v=None)#
Parameters:

v (str | None)

Return type:

str

collapse(node, attribute, column, self_edges=False, unwrap=False, verbose=False)#
Parameters:
  • node (str | int)

  • attribute (str | int)

  • column (str | int)

  • self_edges (bool)

  • unwrap (bool)

  • verbose (bool)

Return type:

Plottable

collections(collections=None, show_collections=None, collections_global_node_color=None, collections_global_edge_color=None, encode=True, validate='autofix', warn=True)#
Parameters:
  • collections (str | CollectionSet | CollectionIntersection | List[CollectionSet | CollectionIntersection] | None)

  • show_collections (bool | None)

  • collections_global_node_color (str | None)

  • collections_global_edge_color (str | None)

  • encode (bool)

  • validate (Literal['strict', 'strict-fast', 'autofix'] | bool)

  • warn (bool)

Return type:

Plottable

compute_cugraph(alg, out_col=None, params={}, kind='Graph', directed=True, G=None)#
Parameters:
  • alg (str)

  • out_col (str | None)

  • params (dict)

  • kind (Literal['Graph', 'MultiGraph', 'BiPartiteGraph'])

  • G (Any | None)

Return type:

Plottable

compute_igraph(alg, out_col=None, directed=None, use_vids=False, params={}, stringify_rich_types=True)#
Parameters:
  • alg (str)

  • out_col (str | None)

  • directed (bool | None)

  • use_vids (bool)

  • params (dict)

  • stringify_rich_types (bool)

Return type:

Plottable

conditional_graph(x, given, kind='nodes', *args, **kwargs)#

conditional_graph – p(x|given) = p(x, given) / p(given)

Useful for finding the conditional probability of a node or edge attribute

The returned DataFrame sums to 1 on each column.

Parameters:
  • x – target column

  • given – the dependent column

  • kind – ‘nodes’ or ‘edges’

  • args/kwargs – additional arguments for g.bind(…)

Returns:

A graphistry instance with the conditional graph, whose edges are weighted by the conditional probability. Edges run between x and given; note that g._edges.columns = [given, x, _probs].
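To make the formula p(x|given) = p(x, given) / p(given) concrete, here is a minimal pure-Python sketch that builds edge records with the [given, x, _probs] layout described above. `conditional_edges` and the sample rows are hypothetical; the library itself works on DataFrames, so this only illustrates the arithmetic.

```python
from collections import Counter


def conditional_edges(rows, x, given):
    """Return edge records (given, x, _probs) with _probs = p(x | given)."""
    joint = Counter((row[given], row[x]) for row in rows)  # counts of (given, x)
    marginal = Counter(row[given] for row in rows)         # counts of given
    # p(x, given) / p(given) = (n / N) / (m / N) = n / m
    return [
        {given: g_val, x: x_val, "_probs": n / marginal[g_val]}
        for (g_val, x_val), n in sorted(joint.items())
    ]


rows = [
    {"color": "red", "shape": "circle"},
    {"color": "red", "shape": "square"},
    {"color": "red", "shape": "circle"},
    {"color": "blue", "shape": "square"},
]
for edge in conditional_edges(rows, x="shape", given="color"):
    print(edge)
```

For each value of the given column, the probabilities of its outgoing edges add up to 1.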

conditional_probs(x, given, kind='nodes', how='index')#

Produces a Dense Matrix of the conditional probability of x given y

Args:

x: the column variable of interest, given the column y=given

given: the variable to hold constant

df (pd.DataFrame): dataframe

how (str, optional): One of ‘column’ or ‘index’. Defaults to ‘index’.

kind (str, optional): ‘nodes’ or ‘edges’. Defaults to ‘nodes’.

Returns:

pd.DataFrame: the conditional probability of x given the column y, as a dense, array-like DataFrame

copy()#
Return type:

Plottable

description(description)#
Parameters:

description (str)

Return type:

Plottable

drop_nodes(nodes)#
Parameters:

nodes (Any)

Return type:

Plottable

edges(edges, source=None, destination=None, edge=None, *args, **kwargs)#
Parameters:
  • edges (Callable | Any)

  • source (str | None)

  • destination (str | None)

  • edge (str | None)

  • args (Any)

  • kwargs (Any)

Return type:

Plottable

embed(relation, proto='DistMult', embedding_dim=32, use_feat=False, X=None, epochs=2, batch_size=32, train_split=0.8, sample_size=1000, num_steps=50, lr=0.01, inplace=False, device='cpu', evaluate=True, *args, **kwargs)#
Parameters:
  • relation (str)

  • proto (str | Callable[[Any, Any, Any], Any] | None)

  • embedding_dim (int)

  • use_feat (bool)

  • X (DataFrame | np.ndarray | List[str] | None)

  • epochs (int)

  • batch_size (int)

  • train_split (float | int)

  • sample_size (int)

  • num_steps (int)

  • lr (float)

  • inplace (bool | None)

  • device (str | None)

  • evaluate (bool)

Return type:

Plottable

encode_axis(rows=[])#
Parameters:

rows (List[Dict])

Return type:

Plottable

encode_edge_badge(column, position='TopRight', categorical_mapping=Ellipsis, continuous_binning=Ellipsis, default_mapping=Ellipsis, comparator=Ellipsis, color=Ellipsis, bg=Ellipsis, fg=Ellipsis, for_current=False, for_default=True, as_text=Ellipsis, blend_mode=Ellipsis, style=Ellipsis, border=Ellipsis, shape=Ellipsis)#
Parameters:
  • column (str)

  • position (str)

  • categorical_mapping (Dict[Any, Any] | None)

  • continuous_binning (List[Any] | None)

  • default_mapping (Any | None)

  • comparator (Callable[[Any, Any], int] | None)

  • color (str | None)

  • bg (str | None)

  • fg (str | None)

  • for_current (bool)

  • for_default (bool)

  • as_text (bool | None)

  • blend_mode (str | None)

  • style (Dict[str, Any] | None)

  • border (Dict[str, Any] | None)

  • shape (str | None)

Return type:

Plottable

encode_edge_color(column, palette=Ellipsis, as_categorical=Ellipsis, as_continuous=Ellipsis, categorical_mapping=Ellipsis, default_mapping=Ellipsis, for_default=True, for_current=False)#
Parameters:
  • column (str)

  • palette (List[str] | None)

  • as_categorical (bool | None)

  • as_continuous (bool | None)

  • categorical_mapping (Dict[Any, Any] | None)

  • default_mapping (str | None)

  • for_default (bool)

  • for_current (bool)

Return type:

Plottable

encode_edge_icon(column, categorical_mapping=Ellipsis, continuous_binning=Ellipsis, default_mapping=Ellipsis, comparator=Ellipsis, for_default=True, for_current=False, as_text=False, blend_mode=Ellipsis, style=Ellipsis, border=Ellipsis, shape=Ellipsis)#
Parameters:
  • column (str)

  • categorical_mapping (Dict[Any, str] | None)

  • continuous_binning (List[Any] | None)

  • default_mapping (str | None)

  • comparator (Callable[[Any, Any], int] | None)

  • for_default (bool)

  • for_current (bool)

  • as_text (bool)

  • blend_mode (str | None)

  • style (Dict[str, Any] | None)

  • border (Dict[str, Any] | None)

  • shape (str | None)

Return type:

Plottable

encode_point_badge(column, position='TopRight', categorical_mapping=Ellipsis, continuous_binning=Ellipsis, default_mapping=Ellipsis, comparator=Ellipsis, color=Ellipsis, bg=Ellipsis, fg=Ellipsis, for_current=False, for_default=True, as_text=Ellipsis, blend_mode=Ellipsis, style=Ellipsis, border=Ellipsis, shape=Ellipsis)#
Parameters:
  • column (str)

  • position (str)

  • categorical_mapping (Dict[Any, Any] | None)

  • continuous_binning (List[Any] | None)

  • default_mapping (Any | None)

  • comparator (Callable[[Any, Any], int] | None)

  • color (str | None)

  • bg (str | None)

  • fg (str | None)

  • for_current (bool)

  • for_default (bool)

  • as_text (bool | None)

  • blend_mode (str | None)

  • style (Dict[str, Any] | None)

  • border (Dict[str, Any] | None)

  • shape (str | None)

Return type:

Plottable

encode_point_color(column, palette=Ellipsis, as_categorical=Ellipsis, as_continuous=Ellipsis, categorical_mapping=Ellipsis, default_mapping=Ellipsis, for_default=True, for_current=False)#
Parameters:
  • column (str)

  • palette (List[str] | None)

  • as_categorical (bool | None)

  • as_continuous (bool | None)

  • categorical_mapping (Dict[Any, Any] | None)

  • default_mapping (str | None)

  • for_default (bool)

  • for_current (bool)

Return type:

Plottable

encode_point_icon(column, categorical_mapping=Ellipsis, continuous_binning=Ellipsis, default_mapping=Ellipsis, comparator=Ellipsis, for_default=True, for_current=False, as_text=False, blend_mode=Ellipsis, style=Ellipsis, border=Ellipsis, shape=Ellipsis)#
Parameters:
  • column (str)

  • categorical_mapping (Dict[Any, str] | None)

  • continuous_binning (List[Any] | None)

  • default_mapping (str | None)

  • comparator (Callable[[Any, Any], int] | None)

  • for_default (bool)

  • for_current (bool)

  • as_text (bool)

  • blend_mode (str | None)

  • style (Dict[str, Any] | None)

  • border (Dict[str, Any] | None)

  • shape (str | None)

Return type:

Plottable

encode_point_size(column, categorical_mapping=Ellipsis, default_mapping=Ellipsis, for_default=True, for_current=False)#
Parameters:
  • column (str)

  • categorical_mapping (Dict[Any, int | float] | None)

  • default_mapping (int | float | None)

  • for_default (bool)

  • for_current (bool)

Return type:

Plottable

fa2_layout(fa2_params=None, circle_layout_params=None, singleton_layout=None, partition_key=None, engine='auto', allow_cpu_fallback=False)#
Parameters:
  • fa2_params (Dict[str, Any] | None)

  • circle_layout_params (Dict[str, Any] | None)

  • singleton_layout (Callable[[Plottable, Tuple[float, float, float, float] | Any], Plottable] | None)

  • partition_key (str | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • allow_cpu_fallback (bool)

Return type:

Plottable

filter_edges_by_dict(filter_dict=None)#
Parameters:

filter_dict (dict | None)

Return type:

Plottable

filter_nodes_by_dict(filter_dict=None)#
Parameters:

filter_dict (dict | None)

Return type:

Plottable

filter_weighted_edges(scale=1.0, index_to_nodes_dict=None, inplace=False, kind='nodes')#
Parameters:
  • scale (float)

  • index_to_nodes_dict (Dict | None)

  • inplace (bool)

  • kind (Literal['nodes', 'edges'])

Return type:

Plottable | None

from_cugraph(G, node_attributes=None, edge_attributes=None, load_nodes=True, load_edges=True, merge_if_existing=True)#
Parameters:
  • node_attributes (List[str] | None)

  • edge_attributes (List[str] | None)

  • load_nodes (bool)

  • load_edges (bool)

  • merge_if_existing (bool)

Return type:

Plottable

from_igraph(ig, node_attributes=None, edge_attributes=None, load_nodes=True, load_edges=True, merge_if_existing=True)#
Parameters:
  • ig (Any)

  • node_attributes (List[str] | None)

  • edge_attributes (List[str] | None)

  • load_nodes (bool)

  • load_edges (bool)

  • merge_if_existing (bool)

Return type:

Plottable

from_networkx(G)#
Parameters:

G (Any)

Return type:

Plottable

get_degrees(col='degree', degree_in='degree_in', degree_out='degree_out')#
Parameters:
  • col (str)

  • degree_in (str)

  • degree_out (str)

Return type:

Plottable

get_indegrees(col='degree_in')#
Parameters:

col (str)

Return type:

Plottable

get_outdegrees(col='degree_out')#
Parameters:

col (str)

Return type:

Plottable

get_topological_levels(level_col='level', allow_cycles=True, warn_cycles=True, remove_self_loops=True)#
Parameters:
  • level_col (str)

  • allow_cycles (bool)

  • warn_cycles (bool)

  • remove_self_loops (bool)

Return type:

Plottable

gfql_remote(chain, api_token=None, dataset_id=None, output_type='all', format=None, df_export_args=None, node_col_subset=None, edge_col_subset=None, engine='auto', validate=True, persist=False)#

chain is Union[List[ASTObject], Chain]

Parameters:
  • self (Plottable)

  • chain (Any | Dict[str, None | bool | str | float | int | List[Any] | Dict[str, Any]])

  • api_token (str | None)

  • dataset_id (str | None)

  • output_type (Literal['all', 'nodes', 'edges', 'shape'])

  • format (Literal['json', 'csv', 'parquet'] | None)

  • df_export_args (Dict[str, Any] | None)

  • node_col_subset (List[str] | None)

  • edge_col_subset (List[str] | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • validate (bool)

  • persist (bool)

Return type:

Plottable

gfql_remote_shape(chain, api_token=None, dataset_id=None, format=None, df_export_args=None, node_col_subset=None, edge_col_subset=None, engine='auto', validate=True, persist=False)#

chain is Union[List[ASTObject], Chain]

Parameters:
  • self (Plottable)

  • chain (Any | Dict[str, None | bool | str | float | int | List[Any] | Dict[str, Any]])

  • api_token (str | None)

  • dataset_id (str | None)

  • format (Literal['json', 'csv', 'parquet'] | None)

  • df_export_args (Dict[str, Any] | None)

  • node_col_subset (List[str] | None)

  • edge_col_subset (List[str] | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • validate (bool)

  • persist (bool)

Return type:

DataFrame

graph(ig)#
Parameters:

ig (Any)

Return type:

Plottable

hop(nodes, hops=1, *, min_hops=None, max_hops=None, output_min_hops=None, output_max_hops=None, label_node_hops=None, label_edge_hops=None, label_seeds=False, to_fixed_point=False, direction='forward', edge_match=None, source_node_match=None, destination_node_match=None, source_node_query=None, destination_node_query=None, edge_query=None, return_as_wave_front=False, target_wave_front=None)#
Parameters:
  • nodes (DataFrame | None)

  • hops (int | None)

  • min_hops (int | None)

  • max_hops (int | None)

  • output_min_hops (int | None)

  • output_max_hops (int | None)

  • label_node_hops (str | None)

  • label_edge_hops (str | None)

  • label_seeds (bool)

  • to_fixed_point (bool)

  • direction (str)

  • edge_match (dict | None)

  • source_node_match (dict | None)

  • destination_node_match (dict | None)

  • source_node_query (str | None)

  • destination_node_query (str | None)

  • edge_query (str | None)

  • return_as_wave_front (bool)

  • target_wave_front (DataFrame | None)

Return type:

Plottable

hypergraph(raw_events=None, *, entity_types=None, opts={}, drop_na=True, drop_edge_attrs=False, verbose=True, direct=False, engine='auto', npartitions=None, chunksize=None, from_edges=False, return_as='graph')#
Parameters:
  • raw_events (Any | None)

  • entity_types (List[str] | None)

  • opts (dict)

  • drop_na (bool)

  • drop_edge_attrs (bool)

  • verbose (bool)

  • direct (bool)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • npartitions (int | None)

  • chunksize (int | None)

  • from_edges (bool)

  • return_as (Literal['graph', 'all', 'entities', 'events', 'edges', 'nodes'])

Return type:

Plottable | HypergraphResult | Any

igraph2pandas(ig)#
Parameters:

ig (Any)

Return type:

Tuple[DataFrame, DataFrame]

infer_labels()#
Return type:

Plottable

keep_nodes(nodes)#
Parameters:

nodes (List | Any)

Return type:

Plottable

layout_cugraph(layout='force_atlas2', params={}, kind='Graph', directed=True, G=None, bind_position=True, x_out_col='x', y_out_col='y', play=0)#
Parameters:
  • layout (str)

  • params (dict)

  • kind (Literal['Graph', 'MultiGraph', 'BiPartiteGraph'])

  • G (Any | None)

  • bind_position (bool)

  • x_out_col (str)

  • y_out_col (str)

  • play (int | None)

Return type:

Plottable

layout_graphviz(prog='dot', args=None, directed=True, strict=False, graph_attr=None, node_attr=None, edge_attr=None, skip_styling=False, render_to_disk=False, path=None, format=None)#
Parameters:
  • prog (Literal['acyclic', 'ccomps', 'circo', 'dot', 'fdp', 'gc', 'gvcolor', 'gvpr', 'neato', 'nop', 'osage', 'patchwork', 'sccmap', 'sfdp', 'tred', 'twopi', 'unflatten'])

  • args (str | None)

  • directed (bool)

  • strict (bool)

  • graph_attr (Dict[Literal['_background', 'bb', 'beautify', 'bgcolor', 'center', 'charset', 'class', 'clusterrank', 'colorscheme', 'comment', 'compound', 'concentrate', 'Damping', 'defaultdist', 'dim', 'dimen', 'diredgeconstraints', 'dpi', 'epsilon', 'esep', 'fontcolor', 'fontname', 'fontnames', 'fontpath', 'fontsize', 'forcelabels', 'gradientangle', 'href', 'id', 'imagepath', 'inputscale', 'K', 'label', 'label_scheme', 'labeljust', 'labelloc', 'landscape', 'layerlistsep', 'layers', 'layerselect', 'layersep', 'layout', 'levels', 'levelsgap', 'lheight', 'linelength', 'lp', 'lwidth', 'margin', 'maxiter', 'mclimit', 'mindist', 'mode', 'model', 'newrank', 'nodesep', 'nojustify', 'normalize', 'notranslate', 'nslimit', 'nslimit1', 'oneblock', 'ordering', 'orientation', 'outputorder', 'overlap', 'overlap_scaling', 'overlap_shrink', 'pack', 'packmode', 'pad', 'page', 'pagedir', 'quadtree', 'quantum', 'rankdir', 'ranksep', 'ratio', 'remincross', 'repulsiveforce', 'resolution', 'root', 'rotate', 'rotation', 'scale', 'searchsize', 'sep', 'showboxes', 'size', 'smoothing', 'sortv', 'splines', 'start', 'style', 'stylesheet', 'target', 'TBbalance', 'tooltip', 'truecolor', 'URL', 'viewport', 'voro_margin', 'xdotversion'], ~typing.Any] | None)

  • node_attr (Dict[Literal['area', 'class', 'color', 'colorscheme', 'comment', 'distortion', 'fillcolor', 'fixedsize', 'fontcolor', 'fontname', 'fontsize', 'gradientangle', 'group', 'height', 'href', 'id', 'image', 'imagepos', 'imagescale', 'label', 'labelloc', 'layer', 'margin', 'nojustify', 'ordering', 'orientation', 'penwidth', 'peripheries', 'pin', 'pos', 'rects', 'regular', 'root', 'samplepoints', 'shape', 'shapefile', 'showboxes', 'sides', 'skew', 'sortv', 'style', 'target', 'tooltip', 'URL', 'vertices', 'width', 'xlabel', 'xlp', 'z'], ~typing.Any] | None)

  • edge_attr (Dict[Literal['arrowhead', 'arrowsize', 'arrowtail', 'class', 'color', 'colorscheme', 'comment', 'constraint', 'decorate', 'dir', 'edgehref', 'edgetarget', 'edgetooltip', 'edgeURL', 'fillcolor', 'fontcolor', 'fontname', 'fontsize', 'head_lp', 'headclip', 'headhref', 'headlabel', 'headport', 'headtarget', 'headtooltip', 'headURL', 'href', 'id', 'label', 'labelangle', 'labeldistance', 'labelfloat', 'labelfontcolor', 'labelfontname', 'labelfontsize', 'labelhref', 'labeltarget', 'labeltooltip', 'labelURL', 'layer', 'len', 'lhead', 'lp', 'ltail', 'minlen', 'nojustify', 'penwidth', 'pos', 'samehead', 'sametail', 'showboxes', 'style', 'tail_lp', 'tailclip', 'tailhref', 'taillabel', 'tailport', 'tailtarget', 'tailtooltip', 'tailURL', 'target', 'tooltip', 'URL', 'weight', 'xlabel', 'xlp'], ~typing.Any] | None)

  • skip_styling (bool)

  • render_to_disk (bool)

  • path (str | None)

  • format (Literal['canon', 'cmap', 'cmapx', 'cmapx_np', 'dia', 'dot', 'fig', 'gd', 'gd2', 'gif', 'hpgl', 'imap', 'imap_np', 'ismap', 'jpe', 'jpeg', 'jpg', 'mif', 'mp', 'pcl', 'pdf', 'pic', 'plain', 'plain-ext', 'png', 'ps', 'ps2', 'svg', 'svgz', 'vml', 'vmlz', 'vrml', 'vtx', 'wbmp', 'xdot', 'xlib'] | None)

Return type:

Plottable

layout_igraph(layout, directed=None, use_vids=False, bind_position=True, x_out_col='x', y_out_col='y', play=0, params={})#
Parameters:
  • layout (str)

  • directed (bool | None)

  • use_vids (bool)

  • bind_position (bool)

  • x_out_col (str)

  • y_out_col (str)

  • play (int | None)

  • params (dict)

Return type:

Plottable

layout_settings(play=None, locked_x=None, locked_y=None, locked_r=None, left=None, top=None, right=None, bottom=None, lin_log=None, strong_gravity=None, dissuade_hubs=None, edge_influence=None, precision_vs_speed=None, gravity=None, scaling_ratio=None)#
Parameters:
  • play (int | None)

  • locked_x (bool | None)

  • locked_y (bool | None)

  • locked_r (bool | None)

  • left (float | None)

  • top (float | None)

  • right (float | None)

  • bottom (float | None)

  • lin_log (bool | None)

  • strong_gravity (bool | None)

  • dissuade_hubs (bool | None)

  • edge_influence (float | None)

  • precision_vs_speed (float | None)

  • gravity (float | None)

  • scaling_ratio (float | None)

Return type:

Plottable

materialize_nodes(reuse=True, engine='auto')#
Parameters:
  • reuse (bool)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

Return type:

Plottable

name(name)#
Parameters:

name (str)

Return type:

Plottable

networkx2pandas(G)#
Parameters:

G (Any)

Return type:

Tuple[DataFrame, DataFrame]

networkx_checkoverlap(g)#
Parameters:

g (Any)

Return type:

None

nodes(nodes, node=None, *args, **kwargs)#
Parameters:
  • nodes (Callable | Any)

  • node (str | None)

  • args (Any)

  • kwargs (Any)

Return type:

Plottable

pandas2igraph(edges, directed=True)#
Parameters:
  • edges (DataFrame)

  • directed (bool)

Return type:

Any

pipe(graph_transform, *args, **kwargs)#
Parameters:
  • graph_transform (Callable)

  • args (Any)

  • kwargs (Any)

Return type:

Plottable

plot(graph=None, nodes=None, name=None, description=None, render='auto', skip_upload=False, as_files=False, memoize=True, erase_files_on_fail=True, extra_html='', override_html_style=None, validate='autofix', warn=True)#
Parameters:
  • graph (Any | None)

  • nodes (Any | None)

  • name (str | None)

  • description (str | None)

  • render (bool | Literal['auto'] | ~typing.Literal['g', 'url', 'ipython', 'databricks', 'browser'] | None)

  • skip_upload (bool)

  • as_files (bool)

  • memoize (bool)

  • erase_files_on_fail (bool)

  • extra_html (str)

  • override_html_style (str | None)

  • validate (Literal['strict', 'strict-fast', 'autofix'] | bool)

  • warn (bool)

Return type:

Any

privacy(mode=None, notify=None, invited_users=None, message=None, mode_action=None)#
Parameters:
  • mode (Literal['private', 'organization', 'public'] | None)

  • notify (bool | None)

  • invited_users (List[str] | None)

  • message (str | None)

  • mode_action (Literal['10', '20'] | None)

Return type:

Plottable

protocol(v=None)#
Parameters:

v (str | None)

Return type:

str

prune_self_edges()#
Return type:

Plottable

python_remote_g(code, api_token=None, dataset_id=None, format='parquet', output_type='all', engine='auto', run_label=None, validate=True)#
Parameters:
  • self (Plottable)

  • code (str)

  • api_token (str | None)

  • dataset_id (str | None)

  • format (Literal['json', 'csv', 'parquet'] | None)

  • output_type (Literal['all', 'nodes', 'edges', 'shape'] | ~typing.Literal['table', 'shape'] | ~typing.Literal['json'] | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • run_label (str | None)

  • validate (bool)

Return type:

Plottable

python_remote_json(code, api_token=None, dataset_id=None, engine='auto', run_label=None, validate=True)#
Parameters:
  • self (Plottable)

  • code (str)

  • api_token (str | None)

  • dataset_id (str | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • run_label (str | None)

  • validate (bool)

Return type:

Any

python_remote_table(code, api_token=None, dataset_id=None, format='parquet', output_type='table', engine='auto', run_label=None, validate=True)#
Parameters:
  • self (Plottable)

  • code (str)

  • api_token (str | None)

  • dataset_id (str | None)

  • format (Literal['json', 'csv', 'parquet'] | None)

  • output_type (Literal['table', 'shape'] | None)

  • engine (EngineAbstract | Literal['pandas', 'cudf', 'dask', 'dask_cudf', 'auto'])

  • run_label (str | None)

  • validate (bool)

Return type:

DataFrame

reset_caches()#
Return type:

None

scene_settings(menu=None, info=None, show_arrows=None, point_size=None, edge_curvature=None, edge_opacity=None, point_opacity=None)#
Parameters:
  • menu (bool | None)

  • info (bool | None)

  • show_arrows (bool | None)

  • point_size (float | None)

  • edge_curvature (float | None)

  • edge_opacity (float | None)

  • point_opacity (float | None)

Return type:

Plottable

search(query, cols=None, thresh=5000, fuzzy=True, top_n=10)#
Parameters:
  • query (str)

  • thresh (float)

  • fuzzy (bool)

  • top_n (int)

search_graph(query, scale=0.5, top_n=100, thresh=5000, broader=False, inplace=False)#
Parameters:
  • query (str)

  • scale (float)

  • top_n (int)

  • thresh (float)

  • broader (bool)

  • inplace (bool)

Return type:

Plottable

server(v=None)#
Parameters:

v (str | None)

Return type:

str

session: ClientSession#
settings(height=None, url_params={}, render=None)#
Parameters:
  • height (int | None)

  • url_params (Dict[str, Any])

  • render (bool | Literal['auto'] | ~typing.Literal['g', 'url', 'ipython', 'databricks', 'browser'] | None)

Return type:

Plottable

style(fg=None, bg=None, page=None, logo=None)#
Parameters:
  • fg (Dict[str, Any] | None)

  • bg (Dict[str, Any] | None)

  • page (Dict[str, Any] | None)

  • logo (Dict[str, Any] | None)

Return type:

Plottable

to_arrow(table=None, validate='autofix', warn=True)#
Parameters:
  • table (Any | None)

  • validate (Literal['strict', 'strict-fast', 'autofix'] | bool)

  • warn (bool)

Return type:

Any | None

to_cudf()#
Return type:

Plottable

to_cugraph(directed=True, include_nodes=True, node_attributes=None, edge_attributes=None, kind='Graph')#
Parameters:
  • directed (bool)

  • include_nodes (bool)

  • node_attributes (List[str] | None)

  • edge_attributes (List[str] | None)

  • kind (Literal['Graph', 'MultiGraph', 'BiPartiteGraph'])

Return type:

Any

to_igraph(directed=True, include_nodes=True, node_attributes=None, edge_attributes=None, use_vids=False)#
Parameters:
  • directed (bool)

  • include_nodes (bool)

  • node_attributes (List[str] | None)

  • edge_attributes (List[str] | None)

  • use_vids (bool)

Return type:

Any

to_pandas()#
Return type:

Plottable

transform(df, y=None, kind='nodes', min_dist='auto', n_neighbors=7, merge_policy=False, sample=None, *, return_graph=True, scaled=True, verbose=False)#
Parameters:
  • df (DataFrame)

  • y (DataFrame | None)

  • kind (str)

  • min_dist (str | float | int)

  • n_neighbors (int)

  • merge_policy (bool)

  • sample (int | None)

  • return_graph (bool)

  • scaled (bool)

  • verbose (bool)

Return type:

Tuple[DataFrame, DataFrame] | Plottable

transform_umap(df, y=None, kind='nodes', min_dist='auto', n_neighbors=7, merge_policy=False, sample=None, *, return_graph=True, fit_umap_embedding=True, umap_transform_kwargs={})#
Parameters:
  • df (DataFrame)

  • y (DataFrame | None)

  • kind (Literal['nodes', 'edges'])

  • min_dist (str | float | int)

  • n_neighbors (int)

  • merge_policy (bool)

  • sample (int | None)

  • return_graph (bool)

  • fit_umap_embedding (bool)

  • umap_transform_kwargs (Dict[str, Any])

Return type:

Tuple[DataFrame, DataFrame, DataFrame] | Plottable

umap(X=None, y=None, kind='nodes', scale=1.0, n_neighbors=12, min_dist=0.1, spread=0.5, local_connectivity=1, repulsion_strength=1, negative_sample_rate=5, n_components=2, metric='euclidean', suffix='', play=0, encode_position=True, encode_weight=True, dbscan=False, engine='auto', feature_engine='auto', inplace=False, memoize=True, umap_kwargs={}, umap_fit_kwargs={}, umap_transform_kwargs={}, **featurize_kwargs)#
Parameters:
  • X (DataFrame | np.ndarray | List[str] | None)

  • y (DataFrame | np.ndarray | List[str] | None)

  • kind (Literal['nodes', 'edges'])

  • scale (float)

  • n_neighbors (int)

  • min_dist (float)

  • spread (float)

  • local_connectivity (int)

  • repulsion_strength (float)

  • negative_sample_rate (int)

  • n_components (int)

  • metric (str)

  • suffix (str)

  • play (int | None)

  • encode_position (bool)

  • encode_weight (bool)

  • dbscan (bool)

  • engine (Literal['auto', 'cuml', 'umap_learn'])

  • feature_engine (str)

  • inplace (bool)

  • memoize (bool)

  • umap_kwargs (Dict[str, Any])

  • umap_fit_kwargs (Dict[str, Any])

  • umap_transform_kwargs (Dict[str, Any])

  • featurize_kwargs (Any)

Return type:

Plottable | None

umap_fit(X, y=None, umap_fit_kwargs={})#
Parameters:
  • X (DataFrame)

  • y (DataFrame | None)

  • umap_fit_kwargs (Dict[str, Any])

Return type:

Plottable

umap_lazy_init(res, n_neighbors=12, min_dist=0.1, spread=0.5, local_connectivity=1, repulsion_strength=1, negative_sample_rate=5, n_components=2, metric='euclidean', engine='auto', suffix='', umap_kwargs={}, umap_fit_kwargs={}, umap_transform_kwargs={})#
Parameters:
  • res (Plottable)

  • n_neighbors (int)

  • min_dist (float)

  • spread (float)

  • local_connectivity (int)

  • repulsion_strength (float)

  • negative_sample_rate (int)

  • n_components (int)

  • metric (str)

  • engine (Literal['auto', 'cuml', 'umap_learn'])

  • suffix (str)

  • umap_kwargs (Dict[str, Any])

  • umap_fit_kwargs (Dict[str, Any])

  • umap_transform_kwargs (Dict[str, Any])

Return type:

Plottable

upload(memoize=True, erase_files_on_fail=True, validate='autofix', warn=True)#
Parameters:
  • memoize (bool)

  • erase_files_on_fail (bool)

  • validate (Literal['strict', 'strict-fast', 'autofix'] | bool)

  • warn (bool)

Return type:

Plottable

property url: str | None#
graphistry.compute.conditional.conditional_probability(x, given, df)#
conditional probability function over categorical variables

p(x | given) = p(x, given)/p(given)

Args:

x: the column variable of interest, given the column ‘given’

given: the variable to hold constant

df: dataframe with columns [given, x]

Returns:

pd.DataFrame: the conditional probability of x given the column ‘given’

Parameters:

df (DataFrame)

graphistry.compute.conditional.probs(x, given, df, how='index')#

Produces a Dense Matrix of the conditional probability of x given y=given

Args:

x: the column variable of interest, given the column ‘y’

given: the variable to hold constant

df (pd.DataFrame): dataframe

how (str, optional): One of ‘column’ or ‘index’. Defaults to ‘index’.

Returns:

pd.DataFrame: the conditional probability of x given the column ‘y’, as a dense, array-like DataFrame

Parameters:

df (DataFrame)

Filter by Dictionary#

graphistry.compute.filter_by_dict.filter_by_dict(df, filter_dict=None, engine=EngineAbstract.AUTO)#

return df where rows match all values in filter_dict

Parameters:
  • df (Any)

  • filter_dict (dict | None)

  • engine (EngineAbstract | str)

Return type:

Any
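The "match all values" rule can be sketched over plain row dictionaries. `filter_rows_by_dict` is a hypothetical helper that models equality matching only; the library function operates on DataFrames and takes an engine argument, which this sketch does not model. Treating an empty or missing filter as "keep everything" is an assumption.

```python
def filter_rows_by_dict(rows, filter_dict=None):
    """Keep rows whose values equal every entry in filter_dict."""
    if not filter_dict:
        # Assumption: no filter means every row is kept
        return list(rows)
    return [
        row for row in rows
        if all(row.get(col) == val for col, val in filter_dict.items())
    ]


rows = [
    {"id": "a", "type": "user", "active": True},
    {"id": "b", "type": "user", "active": False},
    {"id": "c", "type": "host", "active": True},
]
print(filter_rows_by_dict(rows, {"type": "user", "active": True}))
# prints [{'id': 'a', 'type': 'user', 'active': True}]
```

A row must satisfy every key in the dictionary, so adding keys can only narrow the result.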

graphistry.compute.filter_by_dict.filter_edges_by_dict(self, filter_dict=None, engine=EngineAbstract.AUTO)#

filter edges to those that match all values in filter_dict

Parameters:
  • self (Plottable)

  • filter_dict (dict | None)

  • engine (EngineAbstract | str)

Return type:

Plottable

graphistry.compute.filter_by_dict.filter_nodes_by_dict(self, filter_dict=None, engine=EngineAbstract.AUTO)#

filter nodes to those that match all values in filter_dict

Parameters:
  • self (Plottable)

  • filter_dict (dict | None)

  • engine (EngineAbstract | str)

Return type:

Plottable