Continuous layout tutorial#
Graphs whose nodes have a continuous attribute can be laid out radially with the new continuous ring layout.
Examples of continuous values are ranges like weight and price.
The tutorial covers:
Continuous coloring
Automated use with smart defaults
ring_col (str): Specifying the value dimension
reverse (bool): Reversing the axis
v_start, v_end, min_r, max_r: Controlling the value and ring radius ranges
normalize_ring_col: Whether to normalize values or pass them through
num_rings (int): Picking the number of rings
ring_step (float): Changing the ring step size
axis (List[str] | Dict[float,str]): Passing in axis labels
format_axis (Callable), format_labels (Callable): Changing the labels
For larger graphs, we also describe automatic GPU acceleration support.
Setup#
[1]:
import os
os.environ['LOG_LEVEL'] = 'INFO'
[2]:
from typing import Dict, List
import numpy as np
import pandas as pd
import graphistry
graphistry.register(
    api=3,
    username=FILL_ME_IN,
    password=FILL_ME_IN,
    protocol='https',
    server='hub.graphistry.com',
    client_protocol_hostname='https://hub.graphistry.com'
)
Data#
Edges: Load a table of IDS network events for our edges
Nodes: IP addresses, computing for each IP the time of the first and last events it was seen in
[3]:
df = pd.read_csv('https://raw.githubusercontent.com/graphistry/pygraphistry/master/demos/data/honeypot.csv')
# time(max) holds epoch seconds; scale to nanoseconds before converting to datetimes
df = df.assign(t=pd.Series(pd.to_datetime(df['time(max)'] * 1000000000)))
print(df.dtypes)
print(len(df))
df.sample(3)
attackerIP object
victimIP object
victimPort float64
vulnName object
count int64
time(max) float64
time(min) float64
t datetime64[ns]
dtype: object
220
[3]:
| | attackerIP | victimIP | victimPort | vulnName | count | time(max) | time(min) | t |
|---|---|---|---|---|---|---|---|---|
| 144 | 41.227.194.80 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 1 | 1.417556e+09 | 1.417556e+09 | 2014-12-02 21:28:47 |
| 116 | 201.221.119.171 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 7 | 1.414022e+09 | 1.414021e+09 | 2014-10-22 23:49:50 |
| 31 | 124.123.70.99 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 3 | 1.413567e+09 | 1.413566e+09 | 2014-10-17 17:26:09 |
[4]:
ip_times = pd.concat([
    df[['attackerIP', 't', 'count', 'time(min)']].rename(columns={'attackerIP': 'ip'}),
    df[['victimIP', 't', 'count', 'time(min)']].rename(columns={'victimIP': 'ip'})
])
ip_times = ip_times.groupby('ip').agg({
    't': ['min', 'max'],
    'count': ['sum'],
    'time(min)': ['min']
}).reset_index()
ip_times.columns = ['ip', 't_min', 't_max', 'count', 'time_min']
print(ip_times.dtypes)
print(ip_times.shape)
ip_times.sample(3)
ip object
t_min datetime64[ns]
t_max datetime64[ns]
count int64
time_min float64
dtype: object
(203, 5)
[4]:
| | ip | t_min | t_max | count | time_min |
|---|---|---|---|---|---|
| 106 | 212.0.209.7 | 2014-12-31 03:54:42 | 2014-12-31 03:54:42 | 1 | 1.419998e+09 |
| 40 | 172.31.14.66 | 2014-09-30 08:43:16 | 2015-03-03 00:54:49 | 1321 | 1.412066e+09 |
| 30 | 124.123.70.99 | 2014-10-17 17:26:09 | 2014-10-17 17:26:09 | 3 | 1.413566e+09 |
[5]:
g = graphistry.edges(df, 'attackerIP', 'victimIP').nodes(ip_times, 'ip')
Visualization#
Continuous coloring#
Coloring nodes and edges by value can aid visual interpretation. Here we color nodes by count, encoding the smallest values as cold (blue) and the largest as hot (red):
[6]:
g = g.encode_point_color('count', ['blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow', 'red'], as_continuous=True)
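To see why the palette repeats some colors, the sketch below blends between evenly spaced color stops in plain Python. It is only an illustration of the idea, assuming the stops are spread evenly over the normalized value range; `interpolate_palette` and the `NAMED` table are hypothetical helpers for this sketch, not part of PyGraphistry, and the renderer's actual blending may differ.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

# Hypothetical name-to-RGB table, for this sketch only
NAMED: Dict[str, RGB] = {
    'blue': (0, 0, 255),
    'yellow': (255, 255, 0),
    'red': (255, 0, 0),
}

def interpolate_palette(t: float, stops: List[str]) -> RGB:
    """Map t in [0, 1] to a color by blending the two nearest evenly spaced stops."""
    t = min(max(t, 0.0), 1.0)          # clamp to the valid range
    pos = t * (len(stops) - 1)         # fractional index into the stop list
    lo = int(pos)
    hi = min(lo + 1, len(stops) - 1)
    frac = pos - lo
    a, b = NAMED[stops[lo]], NAMED[stops[hi]]
    return tuple(round(x + (y - x) * frac) for x, y in zip(a, b))

stops = ['blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow', 'red']
low = interpolate_palette(0.0, stops)
mid = interpolate_palette(0.5, stops)
high = interpolate_palette(1.0, stops)
```

Under this assumption, repeating a stop widens the band of values that share that color: the lowest third of the range stays blue, the middle stays yellow, and only the top of the range shades toward red.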
Default#
The default layout scans for a numeric column and tries to infer reasonable layout settings.
[8]:
g.ring_continuous_layout().plot(render=False)
[8]:
'https://hub.graphistry.com/graph/graph.html?dataset=363ec1cb22ef4e9dafb474d1add25ed6&type=arrow&viztoken=58585ee0-4a23-4765-81b2-457e7c29380f&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015632&info=true&play=0&lockedR=True&bg=%23E2E2E2'
Pick the value column and reverse direction#
[10]:
g.ring_continuous_layout(
    ring_col='count',
    reverse=True
).plot(render=False)
[10]:
'https://hub.graphistry.com/graph/graph.html?dataset=45d4daf6ce4347818e005e05c218e9f1&type=arrow&viztoken=2b5b95bf-290d-4fa5-a221-50fa53de00d4&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015645&info=true&play=0&lockedR=True&bg=%23E2E2E2'
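As a rough mental model of reversing the axis, the sketch below maps a value linearly onto a radius and flips the direction when asked. It assumes a plain linear mapping and that reversing simply mirrors it; `value_to_radius` is a hypothetical helper, not the library's implementation. The unreversed direction (smallest value on the innermost ring) is consistent with the sample axis output later in this tutorial, where label 1.0 sits at r=500.0 for min_r=500.

```python
def value_to_radius(value: float, v_start: float, v_end: float,
                    min_r: float, max_r: float, reverse: bool = False) -> float:
    """Linearly map a value in [v_start, v_end] to a radius in [min_r, max_r]."""
    t = (value - v_start) / (v_end - v_start)
    if reverse:
        t = 1.0 - t  # mirror the axis: small values move to the outer ring
    return min_r + t * (max_r - min_r)

# Illustrative numbers: count range 1..1321, radii 500..1000
inner = value_to_radius(1, v_start=1, v_end=1321, min_r=500, max_r=1000)
flipped = value_to_radius(1, v_start=1, v_end=1321, min_r=500, max_r=1000, reverse=True)
```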
Control the number of rings#
Control the number of rings via num_rings
Control which values map to the first, last rings via v_start and v_end, in terms of the ring value column g._nodes['count']
Let the layout algorithm determine the distance between rings based on num_rings, v_start, and v_end
Control the radius of the first, last rings via min_r, max_r
[13]:
g.ring_continuous_layout(
    ring_col='count',
    min_r=500,
    max_r=1000,
    v_start=100,
    v_end=1400,
    num_rings=13
).plot(render=False)
[13]:
'https://hub.graphistry.com/graph/graph.html?dataset=c864818290f64e22a5333c5d4567824f&type=arrow&viztoken=feabda1a-8411-4a68-9c93-206182f9a9ca&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015794&info=true&play=0&lockedR=True&bg=%23E2E2E2'
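The arithmetic behind these settings can be sketched in plain Python. The helper below assumes evenly spaced rings and one more axis circle than rings (as the axis label example later in this tutorial suggests); `ring_axes` is a hypothetical name used for illustration, and the library's own spacing logic may differ.

```python
from typing import List, Tuple

def ring_axes(v_start: float, v_end: float, num_rings: int,
              min_r: float, max_r: float) -> List[Tuple[float, float]]:
    """Return (value, radius) pairs for num_rings + 1 evenly spaced axis circles."""
    v_step = (v_end - v_start) / num_rings   # value covered by each ring
    r_step = (max_r - min_r) / num_rings     # radial distance between rings
    return [(v_start + i * v_step, min_r + i * r_step)
            for i in range(num_rings + 1)]

# Same settings as the cell above
axes = ring_axes(v_start=100, v_end=1400, num_rings=13, min_r=500, max_r=1000)
first_value, first_radius = axes[0]
last_value, last_radius = axes[-1]
```

With these inputs each ring spans 100 units of count, so the value range and ring count together fix the spacing without setting a step explicitly.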
Control sizes#
Control which values map to the first, last rings via v_start and v_end
Control the ring step size via v_step (in terms of input column g._nodes['count'])
Let the layout algorithm determine the num_rings based on v_start, v_end, and v_step
Control the radius of the first, last rings via min_r, max_r
[14]:
import math

def round_up_to_nearest_100(x: float) -> int:
    return math.ceil(x / 100) * 100

g.ring_continuous_layout(
    ring_col='count',
    min_r=500,
    max_r=1000,
    v_start=100,
    v_end=round_up_to_nearest_100(g._nodes['count'].max()),
    v_step=100
).plot(render=False)
[14]:
'https://hub.graphistry.com/graph/graph.html?dataset=78cd72af75b74e5280d3c644be9aa768&type=arrow&viztoken=9ac50462-1d05-487c-bec7-303d24a60847&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721015950&info=true&play=0&lockedR=True&bg=%23E2E2E2'
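As a quick check of how a step size implies a ring count, the sketch below divides the value range by the step. It assumes the ring count is the range divided by the step, rounded up; `rings_from_step` is a hypothetical helper, and `sample_count` is just one count value taken from the sample rows shown earlier, not necessarily the column maximum.

```python
import math

def round_up_to_nearest_100(x: float) -> int:
    return math.ceil(x / 100) * 100

def rings_from_step(v_start: float, v_end: float, v_step: float) -> int:
    """Number of v_step-wide bands needed to cover [v_start, v_end]."""
    return math.ceil((v_end - v_start) / v_step)

sample_count = 1321  # illustrative value from the sample rows shown earlier
v_end = round_up_to_nearest_100(sample_count)
num_rings = rings_from_step(v_start=100, v_end=v_end, v_step=100)
```

Under that assumption, a step of 100 over the range 100 to 1400 gives the same 13 rings that the previous section requested directly via num_rings.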
Control axis labels#
Pass in a value for each radial axis
[18]:
axis: List[str] = [
    f'ring: {v}'
    for v in
    ['a', 'b', 'c', 'd', 'e', 'f']  # one more than rings
]
print('axis', axis)
g.ring_continuous_layout(
    ring_col='count',
    num_rings=5,
    axis=axis
).plot(render=False)
axis ['ring: a', 'ring: b', 'ring: c', 'ring: d', 'ring: e', 'ring: f']
[18]:
'https://hub.graphistry.com/graph/graph.html?dataset=fd1009384627494a8d983bc7df79d4c3&type=arrow&viztoken=9e1032cd-b2c7-47c6-b912-32b6d7b18a78&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721016118&info=true&play=0&lockedR=True&bg=%23E2E2E2'
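Since the label list needs one more entry than there are rings, it can help to generate the labels instead of typing them. The sketch below is one way to do that, assuming evenly spaced boundary values; `make_axis_labels` and its `prefix` argument are hypothetical names for this example only.

```python
from typing import List

def make_axis_labels(v_start: float, v_end: float, num_rings: int,
                     prefix: str = 'count') -> List[str]:
    """Build num_rings + 1 labels, one per evenly spaced boundary value."""
    step = (v_end - v_start) / num_rings
    return [f'{prefix}: {v_start + i * step:g}' for i in range(num_rings + 1)]

labels = make_axis_labels(v_start=0, v_end=1500, num_rings=5)
# A quick sanity check before passing the list as axis=labels
assert len(labels) == 5 + 1
```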
Compute a custom label based on the value
[22]:
def axis_to_title(value: float, step: int, ring_width: float) -> str:
    lbl = int(value)
    return f'Count: {lbl}'

g.ring_continuous_layout(
    ring_col='count',
    num_rings=5,
    format_labels=axis_to_title
).plot(render=False)
[22]:
'https://hub.graphistry.com/graph/graph.html?dataset=b76a773409c9487d913995b9586460d0&type=arrow&viztoken=fbfedc8c-3d15-43ea-b2e4-5e19743e5856&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721016584&info=true&play=0&lockedR=True&bg=%23E2E2E2'
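Because the label formatter is an ordinary function, it can be tried out on its own before it is handed to the layout. The variant below abbreviates large values; it keeps the three-argument signature used above but ignores step and ring_width, whose exact semantics are not described here. The name `compact_count_label` is a hypothetical choice for this sketch.

```python
def compact_count_label(value: float, step: int, ring_width: float) -> str:
    """Abbreviate values of 1000 or more with a K suffix; round smaller ones."""
    if abs(value) >= 1000:
        return f'{value / 1000:.1f}K'
    return f'{value:.0f}'

# Try the formatter on a couple of made-up ring values
small = compact_count_label(250.0, 2, 100.0)
large = compact_count_label(1300.0, 12, 100.0)
```

It would be passed the same way as above, as format_labels=compact_count_label.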
Control more aspects of the axis, like border style
[30]:
def fancy_axis_transform(axis: List[Dict]) -> List[Dict]:
    """
    - same radii
    - add "Ring ..." to labels
    - color radial axis based on ring number
      * ring 3: internal (blue axis style)
      * ring 6: external (orange axis style)
      * other rings: space (default gray axis style)
    """
    out = []
    print('sample input axis[0]:', axis[0])
    for i, ring in enumerate(axis):
        out.append({
            'r': ring['r'],
            'label': f'Ring {ring["label"]}',
            'internal': i == 3,  # blue
            'external': i == 6,  # orange
            'space': i != 3 and i != 6  # gray
        })
    print('sample output axis[0]:', out[0])
    return out

g.ring_continuous_layout(
    ring_col='count',
    num_rings=10,
    min_r=500,
    max_r=1000,
    format_axis=fancy_axis_transform
).plot(render=False)
sample input axis[0]: {'label': '1.0', 'r': 500.0, 'internal': True}
sample output axis[0]: {'r': 500.0, 'label': 'Ring 1.0', 'internal': False, 'external': False, 'space': True}
[30]:
'https://hub.graphistry.com/graph/graph.html?dataset=63980e32d60d4cebb6b5007ae914e586&type=arrow&viztoken=d60a6b8b-958a-4f9a-a6f9-f3be428d390c&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721016963&info=true&play=0&lockedR=True&bg=%23E2E2E2'
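An axis transform can also be exercised offline on a hand-written list of ring dictionaries. The sketch below marks only the last ring as external and leaves the input untouched. The first sample entry mirrors the printed input above; the second entry is made up for illustration, and `highlight_outer_ring` is a hypothetical name.

```python
from typing import Dict, List

def highlight_outer_ring(axis: List[Dict]) -> List[Dict]:
    """Copy each ring, styling only the last listed ring as 'external'."""
    last = len(axis) - 1
    return [
        {**ring, 'internal': False, 'external': i == last, 'space': i != last}
        for i, ring in enumerate(axis)
    ]

sample = [
    {'label': '1.0', 'r': 500.0, 'internal': True},  # shape taken from the output above
    {'label': '133.0', 'r': 550.0},                  # made-up second ring
]
styled = highlight_outer_ring(sample)
```

Returning new dictionaries rather than editing the input in place keeps the transform easy to test and safe to reuse across layouts.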
GPU Acceleration#
For larger graphs, automatic GPU acceleration triggers when g._nodes is a cudf.DataFrame.
To ensure GPU acceleration is used, set engine='cudf'.
[32]:
import cudf
(g
    .nodes(cudf.from_pandas(g._nodes))
    .ring_continuous_layout(engine='cudf')
).plot(render=False)
[32]:
'https://hub.graphistry.com/graph/graph.html?dataset=2731e5857dd541048c3754c902707dcb&type=arrow&viztoken=fee07e68-cf25-4814-9e17-d95bf96b59d4&usertag=6c2f6dc1-pygraphistry-0+unknown&splashAfter=1721017010&info=true&play=0&lockedR=True&bg=%23E2E2E2'
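When the same notebook has to run on machines with and without RAPIDS, one option is to request the GPU engine only if cudf can be imported. The sketch below checks for the package with the standard library; `layout_engine_kwargs` is a hypothetical helper, and an importable cudf package does not by itself guarantee that a usable GPU is present.

```python
import importlib.util

def layout_engine_kwargs() -> dict:
    """Return {'engine': 'cudf'} only when the cudf package is importable."""
    has_cudf = importlib.util.find_spec('cudf') is not None
    return {'engine': 'cudf'} if has_cudf else {}

kwargs = layout_engine_kwargs()
# Usage, assuming `g` from the cells above:
# g.ring_continuous_layout(**kwargs).plot(render=False)
```

On a machine without cudf the keyword dictionary is empty, so the layout call falls back to its defaults.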