Visually analyze any table as a graph: Our three favorite Graphistry shapings#
Our 3 favorite ways to shape tables into a graph and then visualize it, each with just one line of code!
Simple property graphs: Edge tables
Advanced property graphs: Hypergraphs for more control
AI - UMAP: Automatically link entities with similar properties
Install#
Pip install to run the shaping and analytics locally, on CPU or GPU
For GPU cloud visualization sessions, and for offloading GPU analytics of bigger graphs, get a free username/password or API key at hub.graphistry.com
[ ]:
! pip install -q graphistry[umap_learn]
[15]:
import graphistry
print(graphistry.__version__)
# Make API key at https://hub.graphistry.com/users/personal/key/ (create free account first)
graphistry.register(api=3, personal_key_id=FILL_ME_IN, personal_key_secret=FILL_ME_IN)
0.35.4+66.g9a3a886
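To keep credentials out of the notebook itself, the key pair can be read from the environment before registering. The sketch below is one possible approach, not part of the library: the variable names `GRAPHISTRY_KEY_ID` and `GRAPHISTRY_KEY_SECRET` and both helper functions are hypothetical names chosen for illustration. Only `graphistry.register(api=3, personal_key_id=..., personal_key_secret=...)` comes from the cell above.

```python
import os


def load_graphistry_credentials(env=os.environ):
    """Read the personal key pair from environment variables.

    GRAPHISTRY_KEY_ID / GRAPHISTRY_KEY_SECRET are hypothetical names
    picked for this sketch; use whatever your deployment defines.
    """
    key_id = env.get("GRAPHISTRY_KEY_ID")
    key_secret = env.get("GRAPHISTRY_KEY_SECRET")
    if not key_id or not key_secret:
        raise RuntimeError(
            "Set GRAPHISTRY_KEY_ID and GRAPHISTRY_KEY_SECRET before registering"
        )
    return {"personal_key_id": key_id, "personal_key_secret": key_secret}


def register_from_env():
    """Register with the same arguments as the cell above, sourced from the environment."""
    import graphistry  # imported lazily so the loader works without the package

    graphistry.register(api=3, **load_graphistry_credentials())
```

Calling `register_from_env()` once at the top of a session then replaces the `FILL_ME_IN` placeholders.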
Data#
Sample logs
CPU mode works well for fewer than 10K rows; consider GPU and AI modes for 10K-1B rows
[16]:
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/graphistry/pygraphistry/refs/heads/master/demos/data/honeypot.csv')
df
[16]:
 | attackerIP | victimIP | victimPort | vulnName | count | time(max) | time(min) |
---|---|---|---|---|---|---|---|
0 | 1.235.32.141 | 172.31.14.66 | 139.0 | MS08067 (NetAPI) | 6 | 1.421434e+09 | 1.421423e+09 |
1 | 105.157.235.22 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 4 | 1.422498e+09 | 1.422495e+09 |
2 | 105.186.127.152 | 172.31.14.66 | 445.0 | MS04011 (LSASS) | 1 | 1.419966e+09 | 1.419966e+09 |
3 | 105.227.98.90 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 7 | 1.421742e+09 | 1.421740e+09 |
4 | 105.235.44.218 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 4 | 1.416686e+09 | 1.416684e+09 |
... | ... | ... | ... | ... | ... | ... | ... |
215 | 94.153.13.180 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 1 | 1.423904e+09 | 1.423904e+09 |
216 | 94.243.32.41 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 10 | 1.412510e+09 | 1.412508e+09 |
217 | 95.234.253.23 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 2 | 1.421355e+09 | 1.421354e+09 |
218 | 95.68.116.216 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 20 | 1.420813e+09 | 1.414762e+09 |
219 | 95.74.232.188 | 172.31.14.66 | 445.0 | MS08067 (NetAPI) | 6 | 1.418149e+09 | 1.418148e+09 |
220 rows × 7 columns
1. Simple property graph: Edge tables with attributes#
Each table row represents an edge with properties:
* One column to use as the edge source
* One column as the edge destination
* Remaining columns as edge attributes
Optionally add a nodes table by chaining .nodes(nodes_df, 'my_id_column')
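As a rough illustration of the optional nodes table, the sketch below derives one row per IP from an edge table and binds it with `.nodes(...)`. The small `edges_df` reuses the first rows of the honeypot data. The column names `ip`, `attacks`, and `role` and the helpers `build_nodes` and `bind_graph` are assumptions made for this example, not part of the dataset or the library.

```python
import pandas as pd

# Small stand-in for the honeypot edge table (first rows of the data above)
edges_df = pd.DataFrame({
    "attackerIP": ["1.235.32.141", "105.157.235.22", "105.186.127.152"],
    "victimIP": ["172.31.14.66", "172.31.14.66", "172.31.14.66"],
    "count": [6, 4, 1],
})


def build_nodes(edges_df):
    """Return one row per IP with a role label and a summed attack count."""
    attackers = (
        edges_df.groupby("attackerIP", as_index=False)["count"].sum()
        .rename(columns={"attackerIP": "ip", "count": "attacks"})
        .assign(role="attacker")
    )
    victims = (
        edges_df.groupby("victimIP", as_index=False)["count"].sum()
        .rename(columns={"victimIP": "ip", "count": "attacks"})
        .assign(role="victim")
    )
    # If an IP appears in both columns, keep its attacker row
    return pd.concat([attackers, victims], ignore_index=True).drop_duplicates("ip")


nodes_df = build_nodes(edges_df)


def bind_graph(edges_df, nodes_df):
    """Attach the nodes table to the edge graph, keyed on the 'ip' column."""
    import graphistry  # imported lazily; requires the install step above

    return (
        graphistry
        .edges(edges_df, source="attackerIP", destination="victimIP")
        .nodes(nodes_df, "ip")
    )
```

With a nodes table bound, node attributes such as `role` become available for encodings and filtering in the visualization.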
[21]:
g1 = graphistry.edges(df, source='attackerIP', destination='victimIP')
g1.plot()
[21]:
2. Advanced property graphs: Hypergraphs for more control#
We commonly want a table row to yield multiple edges between multiple columns, not just a src/dst column pair
The first version simply links the entities in 3 columns to one another, so each table row forms a triangle:
[18]:
g2 = graphistry.hypergraph(df, ['attackerIP', 'victimIP', 'vulnName'], direct=True)['graph']
g2.plot()
# links 660
# events 220
# attrib entities 212
[18]:
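To make the triangle shaping concrete, here is a pandas-only sketch of the idea: every pair of selected columns contributes one edge per row, so 3 columns give 3 edges per row, which is consistent with the 660 links reported for 220 rows. This is an illustration of the concept and not the library's implementation; `row_triangles` and the `edgeType` label format are hypothetical.

```python
from itertools import combinations

import pandas as pd

# Two rows taken from the honeypot sample above
rows = pd.DataFrame({
    "attackerIP": ["1.235.32.141", "105.186.127.152"],
    "victimIP": ["172.31.14.66", "172.31.14.66"],
    "vulnName": ["MS08067 (NetAPI)", "MS04011 (LSASS)"],
})


def row_triangles(df, cols):
    """Emit one edge per column pair per row; 3 columns yield a triangle per row."""
    parts = []
    for src_col, dst_col in combinations(cols, 2):
        parts.append(pd.DataFrame({
            "src": df[src_col].astype(str),
            "dst": df[dst_col].astype(str),
            "edgeType": f"{src_col}::{dst_col}",  # illustrative label only
        }))
    return pd.concat(parts, ignore_index=True)


edges = row_triangles(rows, ["attackerIP", "victimIP", "vulnName"])
print(len(edges), "edges from", len(rows), "rows")
```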
We can control many aspects. In this case:
Causally directed edges: attackerIP->victimIP, attackerIP->vulnName, vulnName->victimIP
Combine namespaces: When an IP appears as both a victimIP and an attackerIP, collapse it into one node, instead of treating those columns as distinct node ID namespaces
[6]:
g2b = graphistry.hypergraph(df, ['attackerIP', 'victimIP', 'vulnName'], direct=True, opts={
'EDGES': {
'attackerIP': ['victimIP', 'vulnName'],
'vulnName': ['victimIP']
},
'CATEGORIES': {
'ip': ['attackerIP', 'victimIP']
}
})['graph']
g2b = g2b.encode_point_color('category', categorical_mapping={'ip': 'grey', 'vulnName': 'orange'}, as_categorical=True)
g2b.plot()
# links 660
# events 220
# attrib entities 212
[6]:
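The effect of the two options can be sketched in plain pandas. The example below assumes, for illustration only, that a node ID is formed as `namespace::value`; the IPs and vulnerability names are made up so that one IP appears as both attacker and victim. The helpers `node_id` and `directed_edges` are hypothetical and only mimic the idea behind `EDGES` and `CATEGORIES`.

```python
import pandas as pd

# Made-up rows: 10.0.0.2 is a victim in row 0 and an attacker in row 1
rows = pd.DataFrame({
    "attackerIP": ["10.0.0.1", "10.0.0.2"],
    "victimIP": ["10.0.0.2", "10.0.0.3"],
    "vulnName": ["vulnA", "vulnB"],
})

EDGES = {"attackerIP": ["victimIP", "vulnName"], "vulnName": ["victimIP"]}
CATEGORIES = {"ip": ["attackerIP", "victimIP"]}

# Map each column to its shared namespace; unlisted columns keep their own name
col_to_category = {col: cat for cat, cols in CATEGORIES.items() for col in cols}


def node_id(col, value, categories=col_to_category):
    """Assumed ID scheme for this sketch: '<namespace>::<value>'."""
    return f"{categories.get(col, col)}::{value}"


def directed_edges(df, categories=col_to_category):
    """Expand each row into the directed column pairs listed in EDGES."""
    out = []
    for src_col, dst_cols in EDGES.items():
        for dst_col in dst_cols:
            for _, row in df.iterrows():
                out.append((
                    node_id(src_col, row[src_col], categories),
                    node_id(dst_col, row[dst_col], categories),
                ))
    return pd.DataFrame(out, columns=["src", "dst"])


merged = directed_edges(rows)
separate = directed_edges(rows, categories={})

merged_nodes = set(merged["src"]) | set(merged["dst"])
separate_nodes = set(separate["src"]) | set(separate["dst"])
print(len(merged_nodes), "nodes with a shared ip namespace")
print(len(separate_nodes), "nodes with per-column namespaces")
```

With the shared `ip` namespace, 10.0.0.2 is a single node that both receives and sends edges; with per-column namespaces it would be split into two unrelated nodes.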
3. AI - UMAP: Automatically link entities with similar properties#
[19]:
#g3.reset_caches()
g3 = graphistry.nodes(df).umap(X=['attackerIP', 'victimIP', 'victimPort', 'vulnName', 'count', 'time(max)', 'time(min)'])
g3.plot()
[19]:
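For intuition about what "linking entities with similar properties" means, the sketch below builds a plain nearest-neighbor graph with pandas and NumPy. It is deliberately not UMAP: it only one-hot encodes the categorical column, standardizes the features, and links each row to its closest row, which is a simplified stand-in for the similarity graph that UMAP-style methods start from. The function `knn_edges` and the sample frame are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# A few feature columns from the first rows of the honeypot sample
df_small = pd.DataFrame({
    "victimPort": [139.0, 445.0, 445.0, 445.0],
    "vulnName": [
        "MS08067 (NetAPI)", "MS08067 (NetAPI)",
        "MS04011 (LSASS)", "MS08067 (NetAPI)",
    ],
    "count": [6, 4, 1, 7],
})


def knn_edges(df, k=1):
    """Link each row to its k most similar rows by Euclidean distance."""
    # One-hot encode categorical columns; numeric columns pass through
    X = pd.get_dummies(df, dtype=float).to_numpy(dtype=float)
    # Standardize each feature so no single column dominates the distance
    std = X.std(axis=0)
    std[std == 0] = 1.0
    X = (X - X.mean(axis=0)) / std
    # Pairwise distances, with self-distances masked out
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]
    src = np.repeat(np.arange(len(df)), k)
    return pd.DataFrame({"src": src, "dst": nearest.ravel()})


similarity_edges = knn_edges(df_small, k=1)
print(similarity_edges)
```

The brute-force distance matrix is quadratic in the number of rows, so this sketch only suits small tables; the point is the shape of the result, an edge list inferred from row similarity.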
Next steps#
Learn:
Try:
Install the pygraphistry client
Create a free Graphistry Hub GPU account, or even host your own GPU server
… Then log in and try the file uploader!
Explore more of the Graphistry ecosystem:
Louie.AI: GenAI-first notebooks, dashboards, and pipelines, including for working with Graphistry
Dashboards: Use in Snowflake’s Streamlit, Databricks, PowerBI, and more
GFQL: The first dataframe-native graph query language, our new open source system, including optional GPU acceleration and the ability to switch between local & remote execution