Time ring layout tutorial#
Graphs whose nodes have a time attribute can be laid out radially with the new time ring layout.
The tutorial covers:
Temporal coloring
Automated use with smart defaults
time_col: str: Specifying the time dimension
reverse: bool: Reversing the axis
time_unit: TimeUnit: Changing the ring step time interval
num_rings: int: Picking the number of rings
time_start: np.datetime64, time_end: np.datetime64: Clipping the time interval
min_r: float, max_r: float: Changing chart sizes
format_axis: Callable, format_label: Callable: Changing the labels
For larger graphs, we also describe automatic GPU acceleration support.
Setup#
[2]:
import os
os.environ['LOG_LEVEL'] = 'INFO'
[ ]:
from typing import Dict, List
import numpy as np
import pandas as pd
import graphistry
graphistry.register(
    api=3,
    username=FILL_ME_IN,
    password=FILL_ME_IN,
    protocol='https',
    server='hub.graphistry.com',
    client_protocol_hostname='https://hub.graphistry.com'
)
Data#
Edges: Load a table of IDS network events for our edges
Nodes: IP addresses, computing for each IP the time of the first and last events it was seen in
[4]:
df = pd.read_csv('https://raw.githubusercontent.com/graphistry/pygraphistry/master/demos/data/honeypot.csv')
df = df.assign(t= pd.Series(pd.to_datetime(df['time(max)'] * 1000000000)))
print(df.dtypes)
print(len(df))
df.sample(3)
attackerIP object
victimIP object
victimPort float64
vulnName object
count int64
time(max) float64
time(min) float64
t datetime64[ns]
dtype: object
220
[4]:
|     | attackerIP | victimIP | victimPort | vulnName | count | time(max) | time(min) | t |
|---|---|---|---|---|---|---|---|---|
| 133 | 27.51.48.2 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 10 | 1.423648e+09 | 1.423647e+09 | 2015-02-11 09:54:42 |
| 120 | 217.172.247.126 | 172.31.14.66 | 139.0 | MS08067 (NetAPI) | 13 | 1.424391e+09 | 1.424389e+09 | 2015-02-20 00:16:47 |
| 158 | 46.175.85.19 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 8 | 1.419202e+09 | 1.419201e+09 | 2014-12-21 22:48:14 |
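The cell above turns epoch seconds into timestamps by multiplying into nanoseconds. As a minimal sketch of an equivalent route, `pd.to_datetime` also accepts a `unit` argument. The sample values below are illustrative and are not read from honeypot.csv.

```python
import pandas as pd

# Epoch seconds stored as floats, the same shape as the 'time(max)' column.
# These two values are made up for illustration.
epoch_s = pd.Series([1423648482.0, 1420070400.0])

# unit='s' declares that the numbers count seconds since 1970-01-01 UTC,
# so no manual multiplication into nanoseconds is needed.
t = pd.to_datetime(epoch_s, unit='s')

print(t.iloc[0])  # 2015-02-11 09:54:42
print(t.iloc[1])  # 2015-01-01 00:00:00
```

Either route yields a datetime column that can serve as the node time attribute.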
[5]:
ip_times = pd.concat([
    df[['attackerIP', 't']].rename(columns={'attackerIP': 'ip'}),
    df[['victimIP', 't']].rename(columns={'victimIP': 'ip'})
])
ip_times = ip_times.groupby('ip').agg({'t': ['min', 'max']}).reset_index()
ip_times.columns = ['ip', 't_min', 't_max']
print(ip_times.dtypes)
print(len(ip_times))
ip_times.sample(3)
ip object
t_min datetime64[ns]
t_max datetime64[ns]
dtype: object
203
[5]:
|     | ip | t_min | t_max |
|---|---|---|---|
| 5 | 106.201.227.134 | 2014-11-21 14:38:07 | 2014-11-21 14:38:07 |
| 25 | 122.121.202.157 | 2015-02-10 23:53:52 | 2015-02-10 23:53:52 |
| 59 | 179.25.208.154 | 2015-01-05 23:22:45 | 2015-01-05 23:22:45 |
[6]:
g = graphistry.edges(df, 'attackerIP', 'victimIP').nodes(ip_times, 'ip')
Visualization#
Temporal coloring#
Coloring nodes and edges by time can help visual interpretation, so we encode old as cold (blue) and new as hot (red):
[7]:
g = g.encode_point_color('t_min', ['blue', 'yellow', 'red'], as_continuous=True)
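To make the idea of a continuous palette concrete, here is a rough sketch of piecewise-linear interpolation through blue, yellow, and red. The helper `interpolate_palette` is hypothetical and only illustrates the concept. It is not how Graphistry computes colors internally.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]

def interpolate_palette(x: float, stops: List[RGB]) -> RGB:
    """Map x in [0, 1] onto a piecewise-linear gradient through the stops.

    Hypothetical helper for illustration; requires at least two stops.
    """
    x = min(max(x, 0.0), 1.0)            # clamp into [0, 1]
    scaled = x * (len(stops) - 1)        # position measured in segments
    i = min(int(scaled), len(stops) - 2) # index of the segment's left stop
    frac = scaled - i                    # fraction of the way through the segment
    lo, hi = stops[i], stops[i + 1]
    return tuple(round(a + (b - a) * frac) for a, b in zip(lo, hi))

blue, yellow, red = (0, 0, 255), (255, 255, 0), (255, 0, 0)
palette = [blue, yellow, red]

oldest = interpolate_palette(0.0, palette)  # (0, 0, 255), cold
middle = interpolate_palette(0.5, palette)  # (255, 255, 0)
newest = interpolate_palette(1.0, palette)  # (255, 0, 0), hot
```

Here `x` stands for a node's time normalized so that the earliest `t_min` is 0 and the latest is 1.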
Default#
The default layout scans for a time column and tries to infer reasonable layout settings:
[8]:
g.time_ring_layout().plot(render=False)
[8]:
'https://hub.graphistry.com/graph/graph.html?dataset=df1e3c96e94b4770adb7bd3195f3c5e4&type=arrow&viztoken=2512035f-15a0-4284-aed4-8418e1152826&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920413&info=true&play=2000&lockedR=True&bg=%23E2E2E2'
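One plausible way to scan for a time column is to pick the first column with a datetime dtype. The sketch below assumes that rule. `guess_time_col` is a hypothetical helper, and the library's actual inference may differ.

```python
from typing import Optional

import pandas as pd

def guess_time_col(nodes: pd.DataFrame) -> Optional[str]:
    """Return the first datetime-typed column name, or None if there is none.

    Hypothetical helper; this is an assumed rule, not the library's logic.
    """
    for col in nodes.columns:
        if pd.api.types.is_datetime64_any_dtype(nodes[col]):
            return col
    return None

# Small stand-in for the ip_times node table built earlier
nodes = pd.DataFrame({
    'ip': ['106.201.227.134', '122.121.202.157'],
    't_min': pd.to_datetime(['2014-11-21 14:38:07', '2015-02-10 23:53:52']),
    't_max': pd.to_datetime(['2014-11-21 14:38:07', '2015-02-10 23:53:52']),
})

guessed = guess_time_col(nodes)
print(guessed)  # t_min
```

When several datetime columns exist, as with `t_min` and `t_max` here, passing `time_col` explicitly avoids relying on any inference.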
Pick the time column and reverse direction#
[9]:
g.time_ring_layout(
    time_col='t_min',
    reverse=True
).plot(render=False)
[9]:
'https://hub.graphistry.com/graph/graph.html?dataset=7455256f9d5446e1bf379e9c751be4f2&type=arrow&viztoken=091ebc73-a500-4e71-a162-d1f4c5d57c6e&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920419&info=true&play=2000&lockedR=True&bg=%23E2E2E2'
Use alternate units#
Available units:
‘s’: seconds
‘m’: minutes
‘h’: hours
‘D’: days
‘W’: weeks
‘M’: months
‘Y’: years
‘C’: centuries
[10]:
g.time_ring_layout(
    time_col='t_min',
    time_unit='W',
    num_rings=30
).plot(render=False)
[10]:
'https://hub.graphistry.com/graph/graph.html?dataset=97ecd3d87acf406f9eba87bfec53e634&type=arrow&viztoken=e1553a4c-9e4f-4771-bb21-90a262044c14&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920423&info=true&play=2000&lockedR=True&bg=%23E2E2E2'
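To see how a time unit could translate into ring membership, the sketch below counts whole elapsed units since the earliest timestamp. `ring_index` is a hypothetical helper, and it only handles fixed-length NumPy units such as 's', 'm', 'h', 'D', and 'W'. Months and years vary in length and would need calendar-aware handling.

```python
import numpy as np

def ring_index(times: np.ndarray, unit: str) -> np.ndarray:
    """Whole number of `unit` steps elapsed since the earliest timestamp.

    Hypothetical helper; assumes rings are counted from the minimum time.
    """
    start = times.min()
    step = np.timedelta64(1, unit)
    # Floor-dividing one timedelta by another yields plain integers
    return (times - start) // step

times = np.array(
    ['2014-11-21T14:38:07', '2014-11-22T00:00:00', '2014-11-28T14:38:07'],
    dtype='datetime64[s]',
)

by_day = ring_index(times, 'D')   # [0, 0, 7]
by_week = ring_index(times, 'W')  # [0, 0, 1]
```

A coarser unit merges nearby events onto the same ring, which is why weekly rings suit a dataset spanning a few months.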
Control the ring size, radius, and time interval#
[11]:
g.time_ring_layout(
    time_unit='Y',
    num_rings=2,
    play_ms=0,
    min_r=700,
    max_r=1000
).plot(render=False)
[11]:
'https://hub.graphistry.com/graph/graph.html?dataset=e96440fb19a64bddb8c29e9a7cc80ec4&type=arrow&viztoken=e9c0203e-1d7d-42ff-bb9e-b5b86075d49d&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920426&info=true&play=0&lockedR=True&bg=%23E2E2E2'
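As a rough mental model for `min_r` and `max_r`, the sketch below spaces ring radii evenly between the two bounds. Even spacing is an assumption made for illustration, and `ring_radii` is a hypothetical helper rather than part of the library.

```python
from typing import List

def ring_radii(num_rings: int, min_r: float, max_r: float) -> List[float]:
    """Evenly spaced radii from min_r (innermost) to max_r (outermost).

    Hypothetical helper; assumes linear spacing between the bounds.
    """
    if num_rings < 1:
        raise ValueError('num_rings must be at least 1')
    if num_rings == 1:
        return [float(min_r)]
    step = (max_r - min_r) / (num_rings - 1)
    return [min_r + i * step for i in range(num_rings)]

two_rings = ring_radii(2, 700, 1000)   # [700.0, 1000.0]
four_rings = ring_radii(4, 700, 1000)  # [700.0, 800.0, 900.0, 1000.0]
```

Raising `min_r` leaves a larger empty center, while the gap between `min_r` and `max_r` sets how much room the rings share.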
[12]:
g.time_ring_layout(
    time_unit='Y',
    time_start=np.datetime64('2013'),
    play_ms=0,
    min_r=700,
    max_r=1000
).plot(render=False)
[12]:
'https://hub.graphistry.com/graph/graph.html?dataset=30daf0734b61454f80c89606af5ae24a&type=arrow&viztoken=b8adf44c-2942-49de-a622-bc858031a169&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920428&info=true&play=0&lockedR=True&bg=%23E2E2E2'
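To preview which timestamps fall inside a window set by `time_start` and `time_end`, a boolean mask over the time values is enough. The sketch below uses made-up dates. It only shows the window test and does not say how the layout treats out-of-window nodes.

```python
import numpy as np

# Illustrative timestamps, not taken from the honeypot data
times = np.array(
    ['2012-06-01', '2013-03-15', '2014-11-21', '2015-02-11'],
    dtype='datetime64[D]',
)

# A year-resolution datetime64 compares as the first instant of that year
time_start = np.datetime64('2013')
time_end = np.datetime64('2015')

in_window = (times >= time_start) & (times <= time_end)
kept = times[in_window]

print(kept)  # the 2013-03-15 and 2014-11-21 entries
```

Checking the window this way before plotting helps confirm that the chosen bounds cover the events of interest.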
Control labels#
[13]:
def custom_label(time: np.datetime64, ring: int, step: np.timedelta64) -> str:
    # Format each ring label as "Ring <index>: <date>"
    date_str = pd.Timestamp(time).strftime('%Y-%m-%d')
    return f'Ring {ring}: {date_str}'

g.time_ring_layout(
    format_label=custom_label
).plot(render=False)
[13]:
'https://hub.graphistry.com/graph/graph.html?dataset=88adff411bb04c95843c5eee61ec1b97&type=arrow&viztoken=18f922c2-bbd3-4972-8f5b-1d6d93ea31f3&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920432&info=true&play=2000&lockedR=True&bg=%23E2E2E2'
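A label formatter can be checked on its own before it is passed to the layout. The sketch below follows the `(time, ring, step)` signature used in the cell above. The name `week_label` and the sample arguments are illustrative.

```python
import numpy as np
import pandas as pd

def week_label(time: np.datetime64, ring: int, step: np.timedelta64) -> str:
    """Label a ring by its index and the date at which it starts."""
    date_str = pd.Timestamp(time).strftime('%Y-%m-%d')
    return f'Ring {ring}: week of {date_str}'

# Call the formatter directly with hand-picked arguments
sample = week_label(
    np.datetime64('2015-02-11T09:54:42'),
    3,
    np.timedelta64(1, 'W'),
)
print(sample)  # Ring 3: week of 2015-02-11
```

Testing the function in isolation surfaces formatting mistakes without a round trip to the server.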
[14]:
def custom_axis(axis: List[Dict]) -> List[Dict]:
    """
    Axis with reversed label text
    """
    print('axis item keys', {k: type(axis[0][k]) for k in axis[0].keys()})
    return [
        {**o, 'label': o['label'][::-1]}
        for o in axis
    ]

g.time_ring_layout(
    format_axis=custom_axis
).plot(render=False)
axis item keys {'label': <class 'str'>, 'r': <class 'numpy.float64'>, 'internal': <class 'bool'>}
[14]:
'https://hub.graphistry.com/graph/graph.html?dataset=1403b7ba4c4b4678904546555d656060&type=arrow&viztoken=60e14303-a0bb-4f97-a76a-a68f5ee1bcab&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920438&info=true&play=2000&lockedR=True&bg=%23E2E2E2'
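An axis formatter can likewise be exercised on a hand-built list. The sketch below reuses the item keys printed above ('label', 'r', 'internal') and blanks every other label to reduce clutter. The sample values are made up, and how the viewer renders an empty label is an assumption.

```python
from typing import Dict, List

def label_every_other(axis: List[Dict]) -> List[Dict]:
    """Keep labels on even-indexed rings and blank the rest."""
    return [
        {**o, 'label': o['label'] if i % 2 == 0 else ''}
        for i, o in enumerate(axis)
    ]

# Hand-built axis items with illustrative values
sample_axis = [
    {'label': '2014-11', 'r': 100.0, 'internal': True},
    {'label': '2014-12', 'r': 200.0, 'internal': True},
    {'label': '2015-01', 'r': 300.0, 'internal': False},
]

thinned = label_every_other(sample_axis)
print([o['label'] for o in thinned])  # ['2014-11', '', '2015-01']
```

Because each item is copied with `{**o, ...}`, the radii and other fields pass through unchanged and the input list is not mutated.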
GPU Acceleration#
For larger graphs, automatic GPU acceleration triggers when g._nodes is a cudf.DataFrame.
To ensure GPU acceleration is used, set `engine=
[15]:
import cudf
(g
    .nodes(cudf.from_pandas(g._nodes))
    .time_ring_layout()
).plot(render=False)
[15]:
'https://hub.graphistry.com/graph/graph.html?dataset=d9e58ab3967f4003b7ef292406ba1a47&type=arrow&viztoken=80d707a5-cb03-48f8-897f-5e56e127630b&usertag=1c11b3a4-pygraphistry-0+unknown&splashAfter=1720920444&info=true&play=2000&lockedR=True&bg=%23E2E2E2'
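On machines that may lack a GPU, the conversion can be guarded so the same notebook still runs on CPU. The sketch below is one way to do that. `to_gpu_if_available` is a hypothetical helper, not part of graphistry or cudf.

```python
import pandas as pd

def to_gpu_if_available(df: pd.DataFrame):
    """Return a cudf.DataFrame when cudf is usable, else the input unchanged.

    Hypothetical helper for illustration.
    """
    try:
        import cudf  # only present on RAPIDS-enabled environments
        return cudf.from_pandas(df)
    except Exception:
        # ImportError when cudf is missing, or a CUDA setup failure
        return df

nodes = pd.DataFrame({
    'ip': ['106.201.227.134', '122.121.202.157'],
    't_min': pd.to_datetime(['2014-11-21 14:38:07', '2015-02-10 23:53:52']),
})

nodes_maybe_gpu = to_gpu_if_available(nodes)
print(type(nodes_maybe_gpu).__module__)
```

The returned frame could then be passed to `g.nodes(...)` in place of the unconditional `cudf.from_pandas` call shown above.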