Visualize CSV Mini-App#
Jupyter: File -> Make a copy
Colab: File -> Save a copy in Drive
Run notebook cells by pressing shift-enter
Either edit and run the top cells one-by-one, or edit and run the self-contained version at the bottom
[1]:
#!pip install graphistry -q
[3]:
import pandas as pd
import graphistry
# To specify Graphistry account & server, use:
# graphistry.register(api=3, username='...', password='...', protocol='https', server='hub.graphistry.com')
# For more options, see https://github.com/graphistry/pygraphistry#configure
1. Upload csv#
Use a file by uploading it or via URL.
Run help(pd.read_csv) for more options.
File Upload: Jupyter Notebooks#
If the circle on the top right is not green, click kernel -> reconnect
Go to the file directory (/tree) by clicking the Jupyter logo
Navigate to the directory page containing your notebook
Press the upload button on the top right
File Upload: Google Colab#
Open the left sidebar by pressing the right arrow on the left
Go to the Files tab
Press UPLOAD
Make sure the file goes into /content
File Upload: URL#
Set file_path in the cell below to the actual data URL; pd.read_csv accepts a URL as well as a local path
Run help(pd.read_csv) for more options
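As a minimal sketch of the URL option and of a few `pd.read_csv` parameters: the URL in the comment is a hypothetical placeholder, and the inline CSV text (made-up rows) stands in for a real file so the snippet runs offline.

```python
import io
import pandas as pd

# Hypothetical placeholder; replace with the actual data URL:
# file_path = 'https://example.com/data/honeypot.csv'
# df = pd.read_csv(file_path)

# Inline stand-in for the remote file (made-up rows, same column names)
csv_text = """attackerIP,victimIP,victimPort,vulnName,count
10.0.0.1,10.0.0.9,445.0,ExampleVuln,2
10.0.0.2,10.0.0.9,139.0,ExampleVuln,4
10.0.0.3,10.0.0.9,445.0,ExampleVuln,1
"""

# usecols keeps only the listed columns; nrows caps how many rows are parsed
df_demo = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['attackerIP', 'victimIP', 'victimPort'],
    nrows=2,
)
print('# rows', len(df_demo))
print(df_demo.columns.tolist())
```

Limiting rows and columns at load time can help when a large CSV would otherwise be slow to upload and render.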
[4]:
file_path = './data/honeypot.csv'
df = pd.read_csv(file_path)
print('# rows', len(df))
df.sample(min(len(df), 3))
('# rows', 220)
[4]:
|  | attackerIP | victimIP | victimPort | vulnName | count | time(max) | time(min) |
|---|---|---|---|---|---|---|---|
| 145 | 41.230.211.128 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 2 | 1.421730e+09 | 1.421729e+09 |
| 25 | 122.121.202.157 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 8 | 1.423612e+09 | 1.423611e+09 |
| 75 | 182.68.160.230 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 9 | 1.417438e+09 | 1.417436e+09 |
2. Optional: Clean up CSV#
[5]:
df = df.rename(columns={
# 'attackerIP': 'src_ip',
# 'victimIP': 'dest_ip'
})
df.sample(3)
[5]:
|  | attackerIP | victimIP | victimPort | vulnName | count | time(max) | time(min) |
|---|---|---|---|---|---|---|---|
| 70 | 182.161.224.84 | 172.31.14.66 | 139.0 | MS08067 (NetAPI) | 4 | 1.419954e+09 | 1.419952e+09 |
| 10 | 115.115.227.82 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 2 | 1.413569e+09 | 1.413569e+09 |
| 152 | 46.130.76.13 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 7 | 1.421093e+09 | 1.421092e+09 |
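Beyond renaming columns, cleanup often includes fixing column types. The sketch below uses made-up rows and assumes that time(max) holds Unix epoch seconds and that victimPort was parsed as a float; both are assumptions based on how the sample values look, not something the dataset documents.

```python
import pandas as pd

# Made-up rows with the same column names as the honeypot data
df_demo = pd.DataFrame({
    'attackerIP': ['10.0.0.1', '10.0.0.2'],
    'victimPort': [445.0, 139.0],
    'time(max)': [1.421730e+09, 1.423612e+09],
})

# Rename a column, as in the cell above
df_clean = df_demo.rename(columns={'attackerIP': 'src_ip'})

# Assumption: time(max) is Unix epoch seconds
df_clean['time(max)'] = pd.to_datetime(df_clean['time(max)'], unit='s')

# Ports parsed as floats (445.0); cast to pandas' nullable integer type
df_clean['victimPort'] = df_clean['victimPort'].astype('Int64')

print(df_clean.dtypes)
```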
3. Configure: Visualize with 3 kinds of graphs#
Set mode and the corresponding values:
Mode “A”. See graph from table of (src,dst) edges#
Mode “B”. See hypergraph: Draw row as node and connect it to entities in same row#
Pick which cols to make nodes
If multiple cols share same type (e.g., “src_ip”, “dest_ip” are both “ip”), unify them
Mode “C”. See by creating multiple nodes, edges per row#
Pick how different column values point to other column values
If multiple cols share same type (e.g., “src_ip”, “dest_ip” are both “ip”), unify them
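To make the hypergraph idea in mode "B" concrete, here is a rough pandas-only sketch: each row becomes an event node linked to one entity node per selected column, and columns listed under the same category share entity nodes. This is an illustration only, not Graphistry's implementation; the function name and the node ID format are made up for the example.

```python
import pandas as pd

def sketch_hypergraph_edges(df, node_cols, categories=None):
    """Illustrative only: link one event node per row to its entity nodes."""
    # Each column is its own entity type unless a category unifies it
    col_to_type = {col: col for col in node_cols}
    for type_name, cols in (categories or {}).items():
        for col in cols:
            if col in col_to_type:
                col_to_type[col] = type_name

    rows = []
    for row_id, row in df.iterrows():
        for col in node_cols:
            if pd.isna(row[col]):
                continue  # no entity node for a missing value
            rows.append({
                'src': f'event::{row_id}',
                'dst': f'{col_to_type[col]}::{row[col]}',
                'edge_type': col,
            })
    return pd.DataFrame(rows, columns=['src', 'dst', 'edge_type'])

# Made-up rows: 10.0.0.2 appears once as attacker and once as victim
demo = pd.DataFrame({
    'attackerIP': ['10.0.0.1', '10.0.0.2'],
    'victimIP': ['10.0.0.2', '10.0.0.3'],
    'vulnName': ['ExampleVuln', 'ExampleVuln'],
})
node_cols = ['attackerIP', 'victimIP', 'vulnName']

separate = sketch_hypergraph_edges(demo, node_cols)
unified = sketch_hypergraph_edges(demo, node_cols, {'ip': ['attackerIP', 'victimIP']})

print(len(unified), 'links')
print(separate['dst'].nunique(), 'entities without categories')
print(unified['dst'].nunique(), 'entities with the ip category')
```

Without the category, 10.0.0.2 yields two separate entity nodes (one per column); with it, both columns point at a single ip node, which is the effect of unifying columns of the same type.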
[6]:
#Pick 'A', 'B', or 'C'
mode = 'B'
max_rows = 1000
### 'A' == mode
my_src_col = 'attackerIP'
my_dest_col = 'victimIP'
### 'B' == mode
node_cols = ['attackerIP', 'victimIP', 'vulnName']
categories = { #optional
'ip': ['attackerIP', 'victimIP']
#, 'user': ['owner', 'seller'],
}
### 'C' == mode
edges = {
'attackerIP': [ 'victimIP', 'victimPort', 'vulnName'],
'victimIP': [ 'victimPort'],
'vulnName': [ 'victimIP' ]
}
categories = { #optional
'ip': ['attackerIP', 'victimIP']
#, 'user': ['owner', 'seller'], ...
}
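A misspelled column name in this configuration is easy to overlook. As a small sketch (the helper name is hypothetical, not part of pandas or Graphistry), the check below lists every configured column that is absent from the table:

```python
def find_missing_columns(available, node_cols, categories, edges):
    """Return configured column names not in `available`, in first-seen order."""
    referenced = list(node_cols)
    for cols in categories.values():
        referenced.extend(cols)
    for src, dests in edges.items():
        referenced.append(src)
        referenced.extend(dests)
    available = set(available)
    # dict.fromkeys removes duplicates while keeping first-seen order
    return [col for col in dict.fromkeys(referenced) if col not in available]

columns = ['attackerIP', 'victimIP', 'victimPort', 'vulnName',
           'count', 'time(max)', 'time(min)']

missing = find_missing_columns(
    columns,
    node_cols=['attackerIP', 'victimIP', 'vulnName'],
    categories={'ip': ['attacker_IP', 'victimIP']},  # deliberate typo for the demo
    edges={'attackerIP': ['victimIP', 'victimPort']},
)
print(missing)
```

In the notebook, passing df.columns as the first argument would run the same check against the loaded CSV.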
4. Plot: Upload & render!#
See UI guide
[75]:
g = None
hg = None
num_rows = min(max_rows, len(df))
if mode == 'A':
    g = graphistry.edges(df.sample(num_rows)).bind(source=my_src_col, destination=my_dest_col)
elif mode == 'B':
    hg = graphistry.hypergraph(df.sample(num_rows), node_cols, opts={'CATEGORIES': categories})
    g = hg['graph']
elif mode == 'C':
    nodes = list(edges.keys())
    for dests in edges.values():
        for dest in dests:
            nodes.append(dest)
    node_cols = list(set(nodes))
    hg = graphistry.hypergraph(df.sample(num_rows), node_cols, direct=True, opts={'CATEGORIES': categories, 'EDGES': edges})
    g = hg['graph']
#hg
print(len(g._edges))
g.plot()
('# links', 1100)
('# events', 220)
('# attrib entities', 221)
1100
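In mode 'C', the plotting cell derives node_cols from the edges mapping with list(set(...)), whose ordering is not guaranteed to be the same from run to run. A small sketch, using a hypothetical helper name, that collects the same columns in a stable order:

```python
def node_cols_from_edges(edges):
    """Collect every column named in `edges`, sources first, without duplicates."""
    cols = list(edges.keys())
    for dests in edges.values():
        cols.extend(dests)
    # dict.fromkeys drops repeats while preserving first-seen order
    return list(dict.fromkeys(cols))

edges = {
    'attackerIP': ['victimIP', 'victimPort', 'vulnName'],
    'victimIP': ['victimPort'],
    'vulnName': ['victimIP'],
}
print(node_cols_from_edges(edges))
```

The set of columns is the same as in the original cell; only the ordering becomes deterministic.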
Alternative: Combined#
Split into data loading and cleaning/configuring/plotting.
[59]:
#!pip install graphistry -q
import pandas as pd
import graphistry
#graphistry.register(api=3, username='...', password='...', protocol='https', server='hub.graphistry.com')
##########
#1. Load
file_path = './data/honeypot.csv'
df = pd.read_csv(file_path)
print(df.columns)
print('rows:', len(df))
print(df.sample(min(len(df),3)))
Index([u'attackerIP', u'victimIP', u'victimPort', u'vulnName', u'count',
u'time(max)', u'time(min)'],
dtype='object')
('rows:', 220)
attackerIP victimIP victimPort vulnName count \
81 187.143.247.231 172.31.14.66 445.0 MS04011 (LSASS) 1
47 151.252.204.92 172.31.14.66 139.0 MS08067 (NetAPI) 1
41 125.64.35.68 172.31.14.66 9999.0 MaxDB Vulnerability 6
time(max) time(min)
81 1.420657e+09 1.420657e+09
47 1.422929e+09 1.422929e+09
41 1.420915e+09 1.417479e+09
[79]:
##########
#2. Clean
#df = df.rename(columns={'attackerIP': 'src_ip', 'victimIP': 'dest_ip', 'victimPort': 'protocol'})
##########
#3. Config - Pick 'A', 'B', or 'C'
mode = 'C'
max_rows = 1000
### 'A' == mode
my_src_col = 'attackerIP'
my_dest_col = 'victimIP'
### 'B' == mode
node_cols = ['attackerIP', 'victimIP', 'victimPort', 'vulnName']
categories = { #optional
'ip': ['attackerIP', 'victimIP']
#, 'user': ['owner', 'seller'],
}
### 'C' == mode
edges = {
'attackerIP': [ 'victimIP', 'victimPort', 'vulnName'],
'victimIP': [ 'victimPort' ],
'vulnName': ['victimIP' ]
}
categories = { #optional
'ip': ['attackerIP', 'victimIP']
#, 'user': ['owner', 'seller'], ...
}
##########
#4. Plot
g = None
hg = None
num_rows = min(max_rows, len(df))
if mode == 'A':
    g = graphistry.edges(df.sample(num_rows)).bind(source=my_src_col, destination=my_dest_col)
elif mode == 'B':
    hg = graphistry.hypergraph(df.sample(num_rows), node_cols, opts={'CATEGORIES': categories})
    g = hg['graph']
elif mode == 'C':
    nodes = list(edges.keys())
    for dests in edges.values():
        for dest in dests:
            nodes.append(dest)
    node_cols = list(set(nodes))
    hg = graphistry.hypergraph(df.sample(num_rows), node_cols, direct=True, opts={'CATEGORIES': categories, 'EDGES': edges})
    g = hg['graph']
g.plot()
('# links', 1100)
('# events', 220)
('# attrib entities', 221)