Utilities#
Arrow uploader Module#
- class graphistry.arrow_uploader.ArrowUploader(server_base_path='http://nginx', view_base_path='http://localhost', name=None, description=None, edges=None, nodes=None, node_encodings=None, edge_encodings=None, token=None, dataset_id=None, nodes_file_id=None, edges_file_id=None, metadata=None, certificate_validation=True, org_name=None)#
Bases:
object
- Parameters:
edges (Table | None)
nodes (Table | None)
org_name (str | None)
- arrow_to_buffer(table)#
- Parameters:
table (Table)
- cascade_privacy_settings(mode=None, notify=None, invited_users=None, mode_action=None, message=None)#
- Cascade:
local (passed in)
global
hard-coded
- Parameters:
mode (Literal['private', 'organization', 'public'] | None)
notify (bool | None)
invited_users (List[str] | None)
mode_action (str | None)
message (str | None)
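The cascade order listed above (local, then global, then hard-coded) can be sketched as a per-key fallback. This is an illustrative sketch only: `HARD_CODED_DEFAULTS`, its values, and `cascade_settings` are hypothetical names chosen for the example, not the library's actual code or defaults.

```python
from typing import Any, Dict, Optional

# Hypothetical hard-coded fallbacks, for illustration only
HARD_CODED_DEFAULTS: Dict[str, Any] = {
    "mode": "private",
    "notify": False,
    "invited_users": [],
}


def cascade_settings(
    local: Optional[Dict[str, Any]] = None,
    global_: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Resolve each key from local, then global, then hard-coded values."""
    local = local or {}
    global_ = global_ or {}
    resolved: Dict[str, Any] = {}
    for key, default in HARD_CODED_DEFAULTS.items():
        if local.get(key) is not None:
            resolved[key] = local[key]      # passed in by the caller
        elif global_.get(key) is not None:
            resolved[key] = global_[key]    # session-wide setting
        else:
            resolved[key] = default         # last-resort fallback
    return resolved
```

A value of None at one level is treated as "not set", so the lookup falls through to the next level.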
- property certificate_validation#
- create_dataset(json, validate=True)#
- Parameters:
validate (bool)
- property dataset_id: str#
- property description: str#
- property edge_encodings#
- property edges: Table | None#
- property edges_file_id: str#
- g_to_edge_bindings(g)#
- g_to_edge_encodings(g)#
- g_to_node_bindings(g)#
- g_to_node_encodings(g)#
- login(username, password, org_name=None)#
- maybe_bindings(g, bindings, base={})#
- maybe_post_share_link(g)#
Skip if .privacy() was never called. Returns True or False depending on whether it was called.
- Return type:
bool
- property metadata#
- property name: str#
- property node_encodings#
- property nodes: Table | None#
- property nodes_file_id: str#
- property org_name: str | None#
- pkey_login(personal_key_id, personal_key_secret, org_name=None)#
- post(as_files=True, memoize=True, validate=True)#
Note: likely want to pair with self.maybe_post_share_link(g)
- Parameters:
as_files (bool)
memoize (bool)
validate (bool)
- Return type:
- post_arrow(arr, graph_type, opts='')#
- Parameters:
arr (Table)
graph_type (str)
opts (str)
- post_arrow_generic(sub_path, tok, arr, opts='')#
- Parameters:
sub_path (str)
tok (str)
arr (Table)
- Return type:
Response
- post_edges_arrow(arr=None, opts='')#
- Parameters:
arr (Table | None)
- post_edges_file(file_path, file_type='csv')#
- post_file(file_path, graph_type='edges', file_type='csv')#
- post_g(g, name=None, description=None)#
Warning: main post() does not call this
- post_nodes_arrow(arr=None, opts='')#
- Parameters:
arr (Table | None)
- post_nodes_file(file_path, file_type='csv')#
Set sharing settings. Any settings not passed here will cascade from PyGraphistry or defaults.
- Parameters:
obj_pk (str)
obj_type (str)
privacy (Privacy | None)
- refresh(token=None)#
- property server_base_path: str#
- sso_get_token(state)#
Koa, 04 May 2022: Use state to get token.
- sso_login(org_name=None, idp_name=None)#
Koa, 04 May 2022: Get SSO login auth_url or token.
- property token: str#
- verify(token=None)#
- Return type:
bool
- property view_base_path: str#
Arrow File Uploader Module#
- class graphistry.ArrowFileUploader.ArrowFileUploader(uploader)#
Bases:
object
Implement file API with focus on Arrow support
Memoization in this class is based on reference equality, while the plotter's memoization is hash-based. The plotter therefore resolves equal-valued objects with different identities first, so by the time ArrowFileUploader compares them, identities are already unified and the faster reference-based check suffices.
- Example: Upload files with per-session memoization
uploader : ArrowUploader
arr : pa.Table
afu = ArrowFileUploader(uploader)
file1_id = afu.create_and_post_file(arr)[0]
file2_id = afu.create_and_post_file(arr)[0]
assert file1_id == file2_id # memoizes by default (memory-safe: weak refs)
- Example: Explicitly create a file and upload data for it
uploader : ArrowUploader
arr : pa.Table
afu = ArrowFileUploader(uploader)
file1_id = afu.create_file()
afu.post_arrow(arr, file1_id)
file2_id = afu.create_file()
afu.post_arrow(arr, file2_id)
assert file1_id != file2_id
- Parameters:
uploader (Any)
- create_and_post_file(arr, file_id=None, file_opts={}, upload_url_opts='erase=true', memoize=True)#
Create file and upload data for it.
Default upload_url_opts='erase=true' throws exceptions on parse errors and deletes the upload.
Default memoize=True skips uploading 'arr' when it was previously uploaded in the current session.
See File REST API for file_opts (file create) and upload_url_opts (file upload).
- Parameters:
arr (Table)
file_id (str | None)
file_opts (dict)
upload_url_opts (str)
memoize (bool)
- Return type:
Tuple[str, dict]
- create_file(file_opts={})#
Creates File and returns file_id str.
- Defaults:
file_type: 'arrow'
See File REST API for file_opts.
- Parameters:
file_opts (dict)
- Return type:
str
- post_arrow(arr, file_id, url_opts='erase=true')#
Upload new data to an existing file id.
Default url_opts='erase=true' throws exceptions on parse errors and deletes the upload.
See File REST API for url_opts (file upload).
- Parameters:
arr (Table)
file_id (str)
url_opts (str)
- Return type:
dict
- uploader: Any = None#
- graphistry.ArrowFileUploader.DF_TO_FILE_ID_CACHE: WeakKeyDictionary = <WeakKeyDictionary>#
- NOTE: Will switch to pa.Table -> … when RAPIDS upgrades from pyarrow,
which adds weakref support
- class graphistry.ArrowFileUploader.MemoizedFileUpload(file_id, output)#
Bases:
object
- Parameters:
file_id (str)
output (dict)
- file_id: str#
- output: dict#
- class graphistry.ArrowFileUploader.WrappedTable(arr)#
Bases:
object
- Parameters:
arr (Table)
- arr: Table#
- graphistry.ArrowFileUploader.cache_arr(arr)#
Hold reference to most recent memoization entries.
Hack until RAPIDS supports Arrow 2.0, when pa.Table becomes weakly referenceable.
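The identity-based, weak-reference memoization described in this module can be sketched as follows. This is a minimal illustration under assumptions, not the module's implementation: `Wrapped`, `CACHE`, `RECENT`, and `upload_once` are hypothetical names, the "upload" is simulated by a counter, and the size of the recent-entries buffer is arbitrary.

```python
import itertools
import weakref
from collections import deque


class Wrapped:
    """Weakly referenceable wrapper that hashes and compares by payload identity."""

    def __init__(self, payload):
        self.payload = payload

    def __hash__(self):
        return id(self.payload)

    def __eq__(self, other):
        return isinstance(other, Wrapped) and self.payload is other.payload


# Entries vanish once no strong reference to the wrapper key remains
CACHE = weakref.WeakKeyDictionary()
# Strong references to the most recent keys keep their entries alive
RECENT = deque(maxlen=2)
_ids = itertools.count(1)


def upload_once(payload):
    """Return a cached file id for this exact object, else 'upload' it."""
    key = Wrapped(payload)
    if key in CACHE:
        return CACHE[key]
    file_id = f"file-{next(_ids)}"  # stand-in for a real upload
    CACHE[key] = file_id
    RECENT.append(key)
    return file_id
```

Two equal-valued but distinct objects get separate entries here, which is why a hash-based layer upstream is useful for unifying identities first.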
Validation#
- graphistry.validate.validate_encodings.cascade_encoding(base_encoding, encoding)
- graphistry.validate.validate_encodings.validate_complex(encodings, kind, attributes=None)
- Parameters:
attributes (List | None)
- graphistry.validate.validate_encodings.validate_complex_encoding(kind, mode, name, enc, attributes=None)
- Parameters:
attributes (List | None)
- graphistry.validate.validate_encodings.validate_complex_encoding_badge(kind, mode, name, badge)
- graphistry.validate.validate_encodings.validate_complex_encoding_color(base_path, kind, mode, name, enc)
- graphistry.validate.validate_encodings.validate_complex_encoding_icon(kind, mode, name, enc)
- graphistry.validate.validate_encodings.validate_edge_encodings(encodings, edge_attributes=None)
- Parameters:
edge_attributes (List | None)
- graphistry.validate.validate_encodings.validate_encodings(node_encodings, edge_encodings, node_attributes=None, edge_attributes=None)
Validate node and edge encodings for compatibility with the given attributes.
This function processes and validates the node_encodings and edge_encodings against the provided node and edge attributes, ensuring they follow the expected format. If any encoding is invalid, a ValueError is raised with details. It performs a subset of the checks the server runs, and is run by the uploader.
- Parameters:
node_encodings (dict) – Encodings for the nodes in the graph.
edge_encodings (dict) – Encodings for the edges in the graph.
node_attributes (Optional[List]) – List of node attributes to validate encodings against.
edge_attributes (Optional[List]) – List of edge attributes to validate encodings against.
- Returns:
A dictionary containing the validated encodings for nodes and edges, in the form {'node_encodings': {...}, 'edge_encodings': {...}}.
- Return type:
dict
- Example:
node_encodings = {'color': 'blue', 'size': 5}
edge_encodings = {'weight': 0.2}
result = validate_encodings(node_encodings, edge_encodings)
# {'node_encodings': {'color': 'blue', 'size': 5}, 'edge_encodings': {'weight': 0.2}}
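The validate-or-raise pattern described above can be illustrated with a much-reduced sketch. The function name `check_encoding_columns` and its single rule (each encoding value names a column that must appear in the attribute list) are assumptions made for this example; the library's actual checks are more extensive and may differ.

```python
from typing import Dict, List, Optional


def check_encoding_columns(
    encodings: Dict[str, str],
    attributes: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Return encodings unchanged if every referenced column is known.

    Hypothetical rule for illustration: each encoding value names a column.
    When attributes is None, the column check is skipped.
    """
    if not isinstance(encodings, dict):
        raise ValueError(
            f"encodings must be a dict, got {type(encodings).__name__}"
        )
    if attributes is not None:
        missing = sorted(
            col for col in encodings.values() if col not in attributes
        )
        if missing:
            raise ValueError(
                f"encodings reference unknown attributes: {missing}"
            )
    return encodings
```

Failing early on the client side in this way gives a quicker, more local error than waiting for the server to reject the upload.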
- graphistry.validate.validate_encodings.validate_encodings_generic(encodings, kind, required_bindings)
- graphistry.validate.validate_encodings.validate_mapping(mapping, base_path)
- graphistry.validate.validate_encodings.validate_node_encodings(encodings, node_attributes=None)
- Parameters:
node_attributes (List | None)
- graphistry.validate.validate_encodings.validate_style(base_path, enc)
Versioneer#
Git implementation of _version.py.
- exception graphistry._version.NotThisMethod#
Bases:
Exception
Exception raised if a method is not valid for the current scenario.
- class graphistry._version.VersioneerConfig#
Bases:
object
Container for Versioneer configuration parameters.
- graphistry._version.get_config()#
Create, populate and return the VersioneerConfig() object.
- graphistry._version.get_keywords()#
Get the keywords needed to look up the version information.
- graphistry._version.get_versions()#
Get version information or return default if unable to do so.
- graphistry._version.git_get_keywords(versionfile_abs)#
Extract version information from the given file.
- graphistry._version.git_pieces_from_vcs(tag_prefix, root, verbose, run_command=<function run_command>)#
Get version from ‘git describe’ in the root of the source tree.
This only gets called if the git-archive ‘subst’ keywords were not expanded, and _version.py hasn’t already been rewritten with a short version string, meaning we’re inside a checked out source tree.
- graphistry._version.git_versions_from_keywords(keywords, tag_prefix, verbose)#
Get version information from git keywords.
- graphistry._version.plus_or_dot(pieces)#
Return a + if we don’t already have one, else return a .
- graphistry._version.register_vcs_handler(vcs, method)#
Create decorator to mark a method as the handler of a VCS.
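A decorator that registers a function as the handler for a VCS and method can be sketched as below. The `HANDLERS` registry and the example handler are hypothetical names for illustration; this is a sketch of the general pattern rather than a copy of the module's code.

```python
from typing import Callable, Dict

# Registry: VCS name -> method name -> handler function (hypothetical name)
HANDLERS: Dict[str, Dict[str, Callable]] = {}


def register_vcs_handler(vcs: str, method: str) -> Callable:
    """Create a decorator that stores the function under HANDLERS[vcs][method]."""

    def decorate(f: Callable) -> Callable:
        HANDLERS.setdefault(vcs, {})[method] = f
        return f  # the function itself is returned unchanged

    return decorate


@register_vcs_handler("git", "get_keywords")
def example_get_keywords(path: str) -> dict:
    """Illustrative handler; a real one would parse the version file."""
    return {"path": path}
```

Callers can then look up a handler by name, for example `HANDLERS["git"]["get_keywords"]`, without importing it directly.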
- graphistry._version.render(pieces, style)#
Render the given version pieces into the requested style.
- graphistry._version.render_git_describe(pieces)#
TAG[-DISTANCE-gHEX][-dirty].
Like 'git describe --tags --dirty --always'.
Exceptions: 1: no tags. HEX[-dirty] (note: no ‘g’ prefix)
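The TAG[-DISTANCE-gHEX][-dirty] format, including the no-tags exception, can be sketched as a small function. The `pieces` keys used here ("closest-tag", "distance", "short", "dirty") and the function name `sketch_git_describe` are assumptions of this sketch.

```python
def sketch_git_describe(pieces: dict) -> str:
    """Render TAG[-DISTANCE-gHEX][-dirty], or HEX[-dirty] when untagged."""
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"]:
            # Commits since the tag, then 'g' plus the short hash
            rendered += "-%d-g%s" % (pieces["distance"], pieces["short"])
    else:
        # Exception 1: no tags, so only the bare hash (no 'g' prefix)
        rendered = pieces["short"]
    if pieces["dirty"]:
        rendered += "-dirty"
    return rendered
```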
- graphistry._version.render_git_describe_long(pieces)#
TAG-DISTANCE-gHEX[-dirty].
Like 'git describe --tags --dirty --always --long'. The distance/hash is unconditional.
Exceptions: 1: no tags. HEX[-dirty] (note: no ‘g’ prefix)
- graphistry._version.render_pep440(pieces)#
Build up version string, with post-release “local version identifier”.
Our goal: TAG[+DISTANCE.gHEX[.dirty]] . Note that if you get a tagged build and then dirty it, you’ll get TAG+0.gHEX.dirty
Exceptions: 1: no tags. git_describe was just HEX. 0+untagged.DISTANCE.gHEX[.dirty]
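The TAG[+DISTANCE.gHEX[.dirty]] rule, together with the "+ or ." separator choice, can be sketched as follows. The function names and the `pieces` keys ("closest-tag", "distance", "short", "dirty") are assumptions of this sketch.

```python
def sketch_plus_or_dot(tag: str) -> str:
    """Return '+' unless the tag already contains one, else '.'."""
    return "." if "+" in tag else "+"


def sketch_render_pep440(pieces: dict) -> str:
    """Render TAG[+DISTANCE.gHEX[.dirty]] with a local version identifier."""
    tag = pieces["closest-tag"]
    if tag:
        rendered = tag
        if pieces["distance"] or pieces["dirty"]:
            rendered += sketch_plus_or_dot(tag)
            rendered += "%d.g%s" % (pieces["distance"], pieces["short"])
            if pieces["dirty"]:
                rendered += ".dirty"
    else:
        # Exception 1: no tags
        rendered = "0+untagged.%d.g%s" % (pieces["distance"], pieces["short"])
        if pieces["dirty"]:
            rendered += ".dirty"
    return rendered
```

A tagged build that is then dirtied has distance 0, which is how the TAG+0.gHEX.dirty form mentioned above arises.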
- graphistry._version.render_pep440_old(pieces)#
TAG[.postDISTANCE[.dev0]] .
The “.dev0” means dirty.
Exceptions: 1: no tags. 0.postDISTANCE[.dev0]
- graphistry._version.render_pep440_post(pieces)#
TAG[.postDISTANCE[.dev0]+gHEX] .
The “.dev0” means dirty. Note that .dev0 sorts backwards (a dirty tree will appear “older” than the corresponding clean one), but you shouldn’t be releasing software with -dirty anyways.
Exceptions: 1: no tags. 0.postDISTANCE[.dev0]
- graphistry._version.render_pep440_pre(pieces)#
TAG[.post0.devDISTANCE] – No -dirty.
Exceptions: 1: no tags. 0.post0.devDISTANCE
- graphistry._version.run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None)#
Call the given command(s).
- graphistry._version.versions_from_parentdir(parentdir_prefix, root, verbose)#
Try to determine the version from the parent directory name.
Source tarballs conventionally unpack into a directory that includes both the project name and a version string. We will also support searching up two directory levels for an appropriately named parent directory.
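The parent-directory search described above can be sketched as a loop over the root directory and the two levels above it. The function name `sketch_version_from_parentdir` and its return convention (the version string, or None when nothing matches) are assumptions of this sketch; the real function may report failure differently.

```python
import os
from typing import Optional


def sketch_version_from_parentdir(parentdir_prefix: str, root: str) -> Optional[str]:
    """Return the version suffix of the first matching directory name, if any."""
    for _ in range(3):  # root itself, then up to two levels above it
        dirname = os.path.basename(root)
        if dirname.startswith(parentdir_prefix):
            # Whatever follows the prefix is taken as the version string
            return dirname[len(parentdir_prefix):]
        root = os.path.dirname(root)
    return None
```

For example, with prefix `graphistry-` a root of `/tmp/graphistry-0.33.0/src/pkg` resolves two levels up to the version `0.33.0`.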