Continuous layout tutorial#

Graphs whose nodes have a continuous attribute can be laid out radially with the new continuous ring layout.

Examples of continuous values include quantities that vary over a range, such as weight and price.

This tutorial covers:

  • Continuous coloring

  • Automated use with smart defaults

  • ring_col: str: Specifying the value dimension

  • reverse: bool: Reversing the axis

  • v_start, v_end, min_r, max_r: Control the value and ring radius ranges

  • normalize_ring_col: Whether to normalize values or pass them through

  • num_rings: int: Picking the number of rings

  • ring_step: float: Changing the ring step size

  • axis: List[str] | Dict[float,str]: Pass in axis labels

  • format_axis: Callable, format_labels: Callable: Changing the labels

For larger graphs, we also describe automatic GPU acceleration support.

Setup#

[1]:
import os
os.environ['LOG_LEVEL'] = 'INFO'
[2]:
from typing import Dict, List
import numpy as np
import pandas as pd
import graphistry

graphistry.register(
    api=3,
    username=FILL_ME_IN,
    password=FILL_ME_IN,
    protocol='https',
    server='hub.graphistry.com',
    client_protocol_hostname='https://hub.graphistry.com'
)

Data#

  • Edges: Load a table of IDS network events for our edges

  • Nodes: IP addresses, computing for each IP the time of the first and last events it was seen in

[3]:
df = pd.read_csv('https://raw.githubusercontent.com/graphistry/pygraphistry/master/demos/data/honeypot.csv')
df = df.assign(t= pd.Series(pd.to_datetime(df['time(max)'] * 1000000000)))
print(df.dtypes)
print(len(df))
df.sample(3)
attackerIP            object
victimIP              object
victimPort           float64
vulnName              object
count                  int64
time(max)            float64
time(min)            float64
t             datetime64[ns]
dtype: object
220
[3]:
     attackerIP       victimIP      victimPort  vulnName          count  time(max)     time(min)     t
144  41.227.194.80    172.31.14.66  445.0       MS08067 (NetAPI)  1      1.417556e+09  1.417556e+09  2014-12-02 21:28:47
116  201.221.119.171  172.31.14.66  445.0       MS08067 (NetAPI)  7      1.414022e+09  1.414021e+09  2014-10-22 23:49:50
31   124.123.70.99    172.31.14.66  445.0       MS08067 (NetAPI)  3      1.413567e+09  1.413566e+09  2014-10-17 17:26:09
[4]:
ip_times = pd.concat([
    df[['attackerIP', 't', 'count', 'time(min)']].rename(columns={'attackerIP': 'ip'}),
    df[['victimIP', 't', 'count', 'time(min)']].rename(columns={'victimIP': 'ip'})
])
ip_times = ip_times.groupby('ip').agg({
    't': ['min', 'max'],
    'count': ['sum'],
    'time(min)': ['min']
}).reset_index()
ip_times.columns = ['ip', 't_min', 't_max', 'count', 'time_min']

print(ip_times.dtypes)
print(ip_times.shape)
ip_times.sample(3)
ip                  object
t_min       datetime64[ns]
t_max       datetime64[ns]
count                int64
time_min           float64
dtype: object
(203, 5)
[4]:
     ip             t_min                t_max                count  time_min
106  212.0.209.7    2014-12-31 03:54:42  2014-12-31 03:54:42  1      1.419998e+09
40   172.31.14.66   2014-09-30 08:43:16  2015-03-03 00:54:49  1321   1.412066e+09
30   124.123.70.99  2014-10-17 17:26:09  2014-10-17 17:26:09  3      1.413566e+09
[5]:
g = graphistry.edges(df, 'attackerIP', 'victimIP').nodes(ip_times, 'ip')

Visualization#

Continuous coloring#

Coloring by the value domain helps visual interpretation, so we encode the smallest node values as cold (blue) and the largest as hot (red):

[6]:
g = g.encode_point_color('count', ['blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow', 'red'], as_continuous=True)
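
To make the continuous encoding more concrete, here is a minimal sketch of how a value can be interpolated across an ordered palette. It is not Graphistry's implementation: the helper `interpolate_color` and the RGB table are hypothetical, and the value range 1 to 1321 is taken from the sample rows shown earlier. The sketch only illustrates why repeating 'blue' and 'yellow' in the palette keeps most of the range cool and reserves red for the largest counts.

```python
from typing import Dict, List, Tuple

# Hypothetical RGB values for the three named colors (assumption for illustration)
RGB: Dict[str, Tuple[int, int, int]] = {
    'blue': (0, 0, 255),
    'yellow': (255, 255, 0),
    'red': (255, 0, 0),
}

def interpolate_color(value: float, v_min: float, v_max: float,
                      palette: List[str]) -> Tuple[int, ...]:
    """Linearly interpolate a value across evenly spaced palette stops."""
    if v_max == v_min or len(palette) < 2:
        return RGB[palette[0]]
    # Normalize to [0, 1] and clamp out-of-range values
    t = min(max((value - v_min) / (v_max - v_min), 0.0), 1.0)
    pos = t * (len(palette) - 1)
    lo = min(int(pos), len(palette) - 2)  # index of the lower stop
    frac = pos - lo                       # distance toward the upper stop
    a, b = RGB[palette[lo]], RGB[palette[lo + 1]]
    return tuple(round(x + (y - x) * frac) for x, y in zip(a, b))

palette = ['blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow', 'red']
low = interpolate_color(1, 1, 1321, palette)      # smallest count
mid = interpolate_color(661, 1, 1321, palette)    # halfway through the range
high = interpolate_color(1321, 1, 1321, palette)  # largest count
```

With seven stops, only the last sixth of the range blends from yellow toward red, so a few heavy hitters stand out against a mostly blue graph.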

Default#

The default layout scans for a numeric column and tries to infer reasonable layout settings.

[8]:
g.ring_continuous_layout().plot(render=False)
[8]:
'https://hub.graphistry.com/graph/graph.html?dataset=363ec1cb22ef4e9dafb474d1add25ed6&type=arrow&viztoken=58585ee0-4a23-4765-81b2-457e7c29380f&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015632&info=true&play=0&lockedR=True&bg=%23E2E2E2'
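
As a rough illustration of what "scanning for a numeric column" can mean, the sketch below picks the first column whose values are all numbers. This is an assumption for illustration only: the helper `first_numeric_column` is hypothetical, it works on a plain dict of lists rather than a dataframe, and the rule Graphistry actually applies is not specified in this tutorial. When the choice matters, pass ring_col explicitly.

```python
from numbers import Real
from typing import Dict, List, Optional

def first_numeric_column(table: Dict[str, List[object]]) -> Optional[str]:
    """Return the name of the first column containing only real numbers."""
    for name, values in table.items():
        # bool is a subclass of int, so exclude it explicitly
        if values and all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
            return name
    return None

# Tiny stand-in for the node table built above
nodes = {
    'ip': ['212.0.209.7', '172.31.14.66'],
    'count': [1, 1321],
    'time_min': [1.419998e9, 1.412066e9],
}
picked = first_numeric_column(nodes)
```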

Pick the value column and reverse direction#

[10]:
g.ring_continuous_layout(
    ring_col='count',
    reverse=True
).plot(render=False)
[10]:
'https://hub.graphistry.com/graph/graph.html?dataset=45d4daf6ce4347818e005e05c218e9f1&type=arrow&viztoken=2b5b95bf-290d-4fa5-a221-50fa53de00d4&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015645&info=true&play=0&lockedR=True&bg=%23E2E2E2'
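
The effect of reverse can be pictured as flipping a linear value-to-radius mapping. The sketch below assumes such a linear mapping and assumes that reversing swaps which end of the value range sits at the center; `value_to_radius` is a hypothetical helper, and the radii 100 and 1000 are illustrative values, not confirmed defaults.

```python
def value_to_radius(value: float, v_start: float, v_end: float,
                    min_r: float, max_r: float, reverse: bool = False) -> float:
    """Map a value in [v_start, v_end] linearly onto a radius in [min_r, max_r]."""
    if v_end == v_start:
        return float(min_r)
    # Normalize to [0, 1] and clamp values outside the range
    t = min(max((value - v_start) / (v_end - v_start), 0.0), 1.0)
    if reverse:
        t = 1.0 - t  # largest values move to the innermost ring
    return min_r + t * (max_r - min_r)

inner = value_to_radius(1, 1, 1321, 100, 1000)
outer = value_to_radius(1321, 1, 1321, 100, 1000)
inner_reversed = value_to_radius(1321, 1, 1321, 100, 1000, reverse=True)
outer_reversed = value_to_radius(1, 1, 1321, 100, 1000, reverse=True)
```

Under these assumptions, the busiest IP sits on the outermost ring by default and at the center when reversed.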

Control the number of rings#

  • Control the number of rings via num_rings

  • Control which values map to the first and last rings via v_start and v_end, expressed in terms of the ring value column g._nodes['count']

    • Let the layout algorithm determine the distance between rings based on num_rings, v_start, and v_end

  • Control the radius of the first and last rings via min_r and max_r

[13]:
g.ring_continuous_layout(
    ring_col='count',
    min_r=500,
    max_r=1000,
    v_start=100,
    v_end=1400,
    num_rings=13
).plot(render=False)
[13]:
'https://hub.graphistry.com/graph/graph.html?dataset=c864818290f64e22a5333c5d4567824f&type=arrow&viztoken=feabda1a-8411-4a68-9c93-206182f9a9ca&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015794&info=true&play=0&lockedR=True&bg=%23E2E2E2'
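
The arithmetic behind these settings can be sketched as follows, assuming rings are evenly spaced in both value and radius and that num_rings rings are delimited by num_rings + 1 boundaries. The helper `ring_boundaries` is hypothetical and is not part of the library.

```python
from typing import List, Tuple

def ring_boundaries(v_start: float, v_end: float, num_rings: int,
                    min_r: float, max_r: float) -> List[Tuple[float, float]]:
    """Return (value, radius) pairs for the num_rings + 1 evenly spaced boundaries."""
    if num_rings < 1:
        raise ValueError('num_rings must be at least 1')
    return [
        (
            v_start + (v_end - v_start) * i / num_rings,  # value at boundary i
            min_r + (max_r - min_r) * i / num_rings,      # radius at boundary i
        )
        for i in range(num_rings + 1)
    ]

# Same settings as the cell above
rings = ring_boundaries(v_start=100, v_end=1400, num_rings=13, min_r=500, max_r=1000)
value_step = rings[1][0] - rings[0][0]
```

With these inputs each ring spans 100 units of count, from 100 at radius 500 to 1400 at radius 1000.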

Control sizes#

  • Control which values map to the first and last rings via v_start and v_end

  • Control the ring step size via v_step (in terms of input column g._nodes['count'])

    • Let the layout algorithm determine num_rings based on v_start, v_end, and v_step

  • Control the radius of the first and last rings via min_r and max_r

[14]:
import math

def round_up_to_nearest_100(x: float) -> int:
    return math.ceil(x / 100) * 100

g.ring_continuous_layout(
    ring_col='count',
    min_r=500,
    max_r=1000,
    v_start=100,
    v_end=round_up_to_nearest_100(g._nodes['count'].max()),
    v_step=100
).plot(render=False)
[14]:
'https://hub.graphistry.com/graph/graph.html?dataset=78cd72af75b74e5280d3c644be9aa768&type=arrow&viztoken=9ac50462-1d05-487c-bec7-303d24a60847&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015950&info=true&play=0&lockedR=True&bg=%23E2E2E2'
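
A minimal sketch of how the ring count follows from the step size, assuming the layout covers the range from v_start to v_end with rings of width v_step. The helper `rings_from_step` is hypothetical, and the maximum count of 1321 is taken from the sample rows shown earlier rather than recomputed.

```python
import math

def round_up_to_nearest_100(x: float) -> int:
    return math.ceil(x / 100) * 100

def rings_from_step(v_start: float, v_end: float, v_step: float) -> int:
    """Number of rings of width v_step needed to cover [v_start, v_end]."""
    if v_step <= 0:
        raise ValueError('v_step must be positive')
    return math.ceil((v_end - v_start) / v_step)

v_end = round_up_to_nearest_100(1321)  # assumed maximum of g._nodes['count']
num_rings = rings_from_step(v_start=100, v_end=v_end, v_step=100)
```

Rounding v_end up to a multiple of v_step keeps the last ring full width, which is why the cell above rounds the maximum count to the nearest 100.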

Control axis labels#

Pass in a label for each radial axis:

[18]:
axis: List[str] = [
    f'ring: {v}'
    for v in
    ['a', 'b', 'c', 'd', 'e', 'f']  # one more than rings
]
print('axis', axis)

g.ring_continuous_layout(
    ring_col='count',
    num_rings=5,
    axis=axis
).plot(render=False)
axis ['ring: a', 'ring: b', 'ring: c', 'ring: d', 'ring: e', 'ring: f']
[18]:
'https://hub.graphistry.com/graph/graph.html?dataset=fd1009384627494a8d983bc7df79d4c3&type=arrow&viztoken=9e1032cd-b2c7-47c6-b912-32b6d7b18a78&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721016118&info=true&play=0&lockedR=True&bg=%23E2E2E2'

Compute a custom label based on the value:

[22]:
def axis_to_title(value: float, step: int, ring_width: float) -> str:
    lbl = int(value)
    return f'Count: {lbl}'

g.ring_continuous_layout(
    ring_col='count',
    num_rings=5,
    format_labels=axis_to_title
).plot(render=False)
[22]:
'https://hub.graphistry.com/graph/graph.html?dataset=b76a773409c9487d913995b9586460d0&type=arrow&viztoken=fbfedc8c-3d15-43ea-b2e4-5e19743e5856&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721016584&info=true&play=0&lockedR=True&bg=%23E2E2E2'
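
A label callback can be previewed locally before plotting. The sketch below assumes the callback receives the ring's value, its index, and the ring width in value units, matching the `(value, step, ring_width)` signature used above; `preview_labels` is a hypothetical helper, and the range 0 to 1000 is illustrative.

```python
from typing import Callable, List

def axis_to_title(value: float, step: int, ring_width: float) -> str:
    return f'Count: {int(value)}'

def preview_labels(v_start: float, v_end: float, num_rings: int,
                   fmt: Callable[[float, int, float], str]) -> List[str]:
    """Call the label formatter once per ring boundary, innermost first."""
    ring_width = (v_end - v_start) / num_rings
    return [fmt(v_start + i * ring_width, i, ring_width) for i in range(num_rings + 1)]

labels = preview_labels(0, 1000, 5, axis_to_title)
```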

Control more aspects of the axis, such as border style:

[30]:
def fancy_axis_transform(axis: List[Dict]) -> List[Dict]:
    """
      - same radii
      - add "Ring ..." to labels
      - color radial axis based on ring number
          * ring 3: internal (blue axis style)
          * ring 6: external (orange axis style)
          * other rings: space (default gray axis style)
    """
    out = []
    print('sample input axis[0]:', axis[0])
    for i, ring in enumerate(axis):
        out.append({
            'r': ring['r'],
            'label': f'Ring {ring["label"]}',
            'internal': i == 3,  # blue
            'external': i == 6,  # orange
            'space': i != 3 and i != 6  # gray
        })
    print('sample output axis[0]:', out[0])
    return out

g.ring_continuous_layout(
    ring_col='count',
    num_rings=10,
    min_r=500,
    max_r=1000,
    format_axis=fancy_axis_transform
).plot(render=False)
sample input axis[0]: {'label': '1.0', 'r': 500.0, 'internal': True}
sample output axis[0]: {'r': 500.0, 'label': 'Ring 1.0', 'internal': False, 'external': False, 'space': True}
[30]:
'https://hub.graphistry.com/graph/graph.html?dataset=63980e32d60d4cebb6b5007ae914e586&type=arrow&viztoken=d60a6b8b-958a-4f9a-a6f9-f3be428d390c&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721016963&info=true&play=0&lockedR=True&bg=%23E2E2E2'
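
Because format_axis is an ordinary function from a list of dicts to a list of dicts, it can be exercised on synthetic input before uploading anything. The sketch below assumes the input shape shown in the sample output above (keys 'label', 'r', 'internal'); the synthetic rings, the helper name `style_axis`, and the one-style-per-ring check are illustrative assumptions.

```python
from typing import Dict, List

def style_axis(axis: List[Dict]) -> List[Dict]:
    """Relabel each ring and mark ring 3 internal, ring 6 external, the rest space."""
    out = []
    for i, ring in enumerate(axis):
        out.append({
            'r': ring['r'],
            'label': f'Ring {ring["label"]}',
            'internal': i == 3,
            'external': i == 6,
            'space': i not in (3, 6),
        })
    return out

# Synthetic input modeled on the printed sample; real input comes from the layout
sample_axis = [
    {'label': str(float(i + 1)), 'r': 500.0 + 50.0 * i, 'internal': True}
    for i in range(11)
]
styled = style_axis(sample_axis)

# Sanity check: every ring carries exactly one style flag
for ring in styled:
    assert sum([ring['internal'], ring['external'], ring['space']]) == 1
```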

GPU Acceleration#

For larger graphs, automatic GPU acceleration triggers when g._nodes is a cudf.DataFrame.

To ensure GPU acceleration is used, set engine='cudf':

[32]:
import cudf

(g
 .nodes(cudf.from_pandas(g._nodes))
 .ring_continuous_layout(engine='cudf')
).plot(render=False)

[32]:
'https://hub.graphistry.com/graph/graph.html?dataset=2731e5857dd541048c3754c902707dcb&type=arrow&viztoken=fee07e68-cf25-4814-9e17-d95bf96b59d4&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721017010&info=true&play=0&lockedR=True&bg=%23E2E2E2'
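
On machines without a GPU stack, importing cudf fails, so a notebook shared across environments may want to choose the engine at runtime. The sketch below checks whether the cudf package is importable; `pick_engine` is a hypothetical helper, and passing 'pandas' as the CPU engine name is an assumption not shown in this tutorial.

```python
import importlib.util

def pick_engine() -> str:
    """Return 'cudf' when the cudf package is importable, otherwise 'pandas'."""
    return 'cudf' if importlib.util.find_spec('cudf') is not None else 'pandas'

engine = pick_engine()
# Usage idea (not executed here): g.ring_continuous_layout(engine=engine)
```

An importable package does not guarantee a working GPU, so treat this as a first filter rather than a full capability check.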