7.4. Python client#

This page contains the API reference of the functions in the vantage6-client package.

7.4.1. User Client#

vantage6.client#

vantage6 clients

This module contains a base client. The container client (the client used by master algorithms) and the user client are derived from this base client.

Client#: alias of UserClient

class ClientBase(host, port, path='/api')#

Bases: object

Common interface to the central server.

Contains the basis for all other clients, e.g. UserClient, NodeClient and AlgorithmClient. This includes a basic interface to authenticate, send generic requests, create tasks and retrieve results.

class SubClient(parent)#

Bases: object

Create sub groups of commands using this SubClient

Parameters:: parent (UserClient) – The parent client

authenticate(credentials, path='token/user')#

Authenticate to the vantage6-server

It allows users, nodes and containers to sign in. Credentials can either be a username/password combination or a JWT authorization token.

Parameters:

credentials (dict) – Credentials used to authenticate
path (str, optional) – Endpoint used for authentication. This differs for users, nodes and containers, by default “token/user”

Raises:

Exception – Failed to authenticate

Returns:

Whether or not the user is authenticated. The alternative is that the user is redirected to set up two-factor authentication.

Return type:

bool

property base_path: str#

Full path to the server URL. Combination of host, port and api-path

Returns:: Server URL
Return type:: str

generate_path_to(endpoint)#

Generate URL to endpoint using host, port and endpoint

Parameters:: endpoint (str) – Endpoint for which a full path needs to be generated
Returns:: URL to the endpoint
Return type:: str
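
As a rough illustration of how host, port and API path combine into a URL, the sketch below builds the same kind of string. The helper names `build_base_path` and `build_endpoint_url` are hypothetical, and the exact formatting used by the library (for example how slashes are normalized) is an assumption.

```python
def build_base_path(host: str, port: int, path: str = "/api") -> str:
    """Combine host (including protocol), port and API path into one URL."""
    return f"{host}:{port}{path}"


def build_endpoint_url(host: str, port: int, endpoint: str, path: str = "/api") -> str:
    """Append an endpoint to the base path, using exactly one slash as separator."""
    base = build_base_path(host, port, path).rstrip("/")
    return f"{base}/{endpoint.lstrip('/')}"


# The host below is a placeholder, not a real server.
print(build_endpoint_url("https://server.example.org", 443, "task"))
# prints: https://server.example.org:443/api/task
```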

get_results(id_=None, state=None, include_task=False, task_id=None, node_id=None, params={})#

Get task result(s) from the central server

Depending on whether an id is specified, either a single result or a list of results is returned. An attempt is made to decrypt the input and result fields of each result. This fails if the public key at the server is not derived from the current private key, or when the result is not from your organization.

Parameters:

id_ (int, optional) – Id of the result, by default None
state (str, optional) – The state of the task (e.g. open), by default None
include_task (bool, optional) – Whether to include the originating task, by default False
task_id (int, optional) – The id of the originating task, this will return all results belonging to this task, by default None
node_id (int, optional) – The id of the node at which this result has been produced, this will return all results from this node, by default None
params (dict, optional) – Additional query parameters, by default {}

Returns:

Containing the result(s)

Return type:

dict

property headers: dict#

Defines headers that are sent with each request. This includes the authorization token.

Returns:: Headers
Return type:: dict

property host: str#

Host including protocol (HTTP/HTTPS)

Returns:: Host address of the vantage6 server
Return type:: str

property name: str#

Return the node’s/client’s name

Returns:: Name of the user or node
Return type:: str

property path: str#

Path/endpoint at the server where the api resides

Returns:: Path to the api
Return type:: str

property port: int#

Port on which vantage6 server listens

Returns:: Port number
Return type:: int

post_task(name, image, collaboration_id, input_='', description='', organization_ids=None, data_format='legacy', database='default')#

Post a new task at the server

It will also encrypt input_ for each receiving organization.

Parameters:

name (str) – Human readable name for the task
image (str) – Docker image name containing the algorithm
collaboration_id (int) – Collaboration id of the collaboration for which the task is intended
input_ (str, optional) – Task input, by default ‘’
description (str, optional) – Human readable description of the task, by default ‘’
organization_ids (list, optional) – Ids of organizations (within the collaboration) that need to execute this task, by default None
data_format (str, optional) – Data format used to send and receive data. Possible values: ‘json’, ‘pickle’, ‘legacy’ (‘legacy’ uses pickle serialization), by default ‘legacy’
database (str, optional) – Database label to use for the task, by default ‘default’

Returns:

Containing the task meta-data

Return type:

dict

Raises:

AssertionError – Encryption has not yet been set up.

refresh_token()#

Refresh an expired token using the refresh token

Raises:

Exception – Authentication Error!
AssertionError – Refresh URL not found

Return type:

None

request(endpoint, json=None, method='get', params=None, first_try=True, retry=True)#

Create http(s) request to the vantage6 server

Parameters:

endpoint (str) – Endpoint of the server
json (dict, optional) – payload, by default None
method (str, optional) – Http verb, by default ‘get’
params (dict, optional) – URL parameters, by default None
first_try (bool, optional) – Whether this is the first attempt of this request. Default True.
retry (bool, optional) – Try request again after refreshing the token. Default True.

Returns:

Response of the server

Return type:

dict
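
The interplay of `first_try` and `retry` can be pictured with the sketch below: a request is attempted, and if the token turns out to be expired, the token is refreshed once and the request is repeated. This is not the library's implementation. `TokenExpiredError`, `request_with_refresh` and the fake callables are hypothetical names, and how the real client detects an expired token is an assumption.

```python
class TokenExpiredError(Exception):
    """Hypothetical signal that the server rejected an expired token."""


def request_with_refresh(send, refresh_token, first_try=True, retry=True):
    """Call send(); on an expired token, refresh once and try again."""
    try:
        return send()
    except TokenExpiredError:
        # Only the first attempt may trigger a refresh, and only if retry is on.
        if not (first_try and retry):
            raise
        refresh_token()
        return request_with_refresh(send, refresh_token, first_try=False, retry=retry)


# Fake server interaction used only to exercise the sketch.
state = {"token_valid": False, "calls": 0}


def fake_send():
    state["calls"] += 1
    if not state["token_valid"]:
        raise TokenExpiredError()
    return {"msg": "ok"}


def fake_refresh():
    state["token_valid"] = True


response = request_with_refresh(fake_send, fake_refresh)
```

With `retry=False` the same expired token would propagate as an exception instead of triggering a refresh.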

setup_encryption(private_key_file)#

Enable the encryption module for the communication

This will attach a Crypter object to the client. It will also verify that the public key at the server matches the local private key. In case they differ, the local public key is uploaded to the server.

Parameters:: private_key_file (str) – File path of the private key file
Raises:: AssertionError – If the client is not authenticated
Return type:: None

property token: str#

JWT Authorization token

Returns:: JWT token
Return type:: str

class ContainerClient(token, *args, **kwargs)#

Bases: ClientBase

Container interface to the local proxy server (central server).

An algorithm container should never communicate directly with the central server, so it has no internet connection. It can, however, talk to a local proxy server that has an interface to the central server. This ensures that the algorithm container does not share data with others, and it allows the results to be encrypted for a specific receiver. This client is therefore an interface to the local proxy server rather than to the central server. Because the two interfaces are identical, this detail can be ignored in practice.

authenticate()#: Containers obtain their key via their host Node.

create_new_task(input_, organization_ids=[])#

Create a new (child) task at the central server.

Containers are allowed to create child tasks (having the same run_id) at the central server. The docker image must be the same as the docker image of this container itself.

Parameters:

input_ – input to the task
organization_ids – organization ids which need to execute this task

get_algorithm_address_by_label(task_id, label)#

Return the IP address plus port number of a given port label

Return type:: str

get_algorithm_addresses(task_id)#: Return IP address and port number of other algorithm containers involved in a task so that VPN can be used for communication

get_organizations_in_my_collaboration()#

Obtain all organizations in the collaboration.

The container runs in a Node which is part of a single collaboration. This method retrieves all organization data that are within that collaboration. This can be used to target specific organizations in a collaboration.

get_results(task_id)#

Obtain results from a specific task at the server

Containers are allowed to obtain the results of their children (having the same run_id at the server). The permissions are checked at the central server.

Parameters:: task_id (int) – id of the task from which you want to obtain the results

post_task(name, image, collaboration_id, input_='', description='', organization_ids=[], database='default')#

Post a new task at the central server.

! To create a new task from the algorithm container you should use the create_new_task function !

A task created from a container is not encrypted by the container itself, because the container should never have access to the private key of its organization. Instead, encryption takes place in the local proxy server through which the algorithm communicates (indirectly) with the central server. For this reason the post_task function is overridden here.

Parameters:

name (str) – human-readable name
image (str) – docker image name of the task
collaboration_id (int) – id of the collaboration in which the task should run
input_ – input to the task
description – human-readable description
organization_ids (list) – ids of the organizations where this task should run

Return type:

dict

refresh_token()#

Containers cannot refresh their token.

TODO: we might want to notify node/server about this…
TODO: make a more useful exception

class ServerInfo(host: str, port: int, path: str)#

Bases: NamedTuple

Data-class to store the server info.

Variables:

host (str) – Address (including protocol, e.g. https://) of the vantage6 server
port (int) – Port number on which the server listens
path (str) – Path of the api, e.g. ‘/api’

host: str#: Alias for field number 0

path: str#: Alias for field number 2

port: int#: Alias for field number 1
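
Because ServerInfo is a NamedTuple, its fields can be read by name or by position. The sketch below mirrors the documented fields in a stand-in class named `ServerInfoSketch` (a hypothetical name, so that nothing needs to be imported from the package); the host value is a placeholder.

```python
from typing import NamedTuple


class ServerInfoSketch(NamedTuple):
    """Stand-in with the same three fields, in the documented order."""

    host: str  # field number 0
    port: int  # field number 1
    path: str  # field number 2


info = ServerInfoSketch(host="https://server.example.org", port=443, path="/api")

# Tuple unpacking follows the field order: host, port, path.
host, port, path = info
```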

class UserClient(*args, verbose=False, log_level='debug', **kwargs)#

Bases: ClientBase

User interface to the vantage6-server

class Collaboration(parent)#

Bases: SubClient

Collection of collaboration requests

create(name, organizations, encrypted=False)#

Create new collaboration

Parameters:

name (str) – Name of the collaboration
organizations (list) – List of organization ids which participate in the collaboration
encrypted (bool, optional) – Whether the collaboration should be encrypted or not, by default False

Returns:

Containing the new collaboration meta-data

Return type:

dict

get(id_)#

View specific collaboration

Parameters:: id_ (int) – Id of the collaboration you want to view
Returns:: Containing the collaboration information
Return type:: dict

list(scope='organization', name=None, encrypted=None, organization=None, page=1, per_page=20, include_metadata=True)#

View your collaborations

Parameters:

scope (str, optional) – Scope of the list, accepted values are organization and global. In case of organization you get the collaborations in which your organization participates. If you specify global you get the collaborations which you are allowed to see.
name (str, optional) – Filter collaborations by name (with LIKE operator)
organization (int, optional) – Filter collaborations by organization id
encrypted (bool, optional) – Filter collaborations by whether or not they are encrypted
page (int, optional) – Pagination page, by default 1
per_page (int, optional) – Number of items on a single page, by default 20
include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing collaboration information

Return type:

list of dicts

Notes

Pagination does not work in combination with scope organization as pagination is missing at endpoint /organization/<id>/collaboration

class Node(parent)#

Bases: SubClient

Collection of node requests

create(collaboration, organization=None, name=None)#

Parameters:

collaboration (int) – Collaboration id to which this node belongs
organization (int, optional) – Organization id to which this node belongs. If no id provided the users organization is used. Default value is None
name (str, optional) – Name of the node. If no name is provided the server will generate one. Default value is None

Returns:

Containing the meta-data of the new node

Return type:

dict

delete(id_)#

Deletes a node

Parameters:: id_ (int) – Id of the node you want to delete
Returns:: Message from the server
Return type:: dict

get(id_)#

View specific node

Parameters:: id_ (int) – Id of the node you want to inspect
Returns:: Containing the node meta-data
Return type:: dict

kill_tasks(id_)#

Kill all tasks currently running on a node

Parameters:: id_ (int) – Id of the node of which you want to kill the tasks
Returns:: Message from the server
Return type:: dict

list(name=None, organization=None, collaboration=None, is_online=None, ip=None, last_seen_from=None, last_seen_till=None, page=1, per_page=20, include_metadata=True)#

List nodes

Parameters:

name (str, optional) – Filter by name (with LIKE operator)
organization (int, optional) – Filter by organization id
collaboration (int, optional) – Filter by collaboration id
is_online (bool, optional) – Filter on whether nodes are online or not
ip (str, optional) – Filter by node VPN IP address
last_seen_from (str, optional) – Filter if node has been online since date (format: yyyy-mm-dd)
last_seen_till (str, optional) – Filter if node has been online until date (format: yyyy-mm-dd)
page (int, optional) – Pagination page, by default 1
per_page (int, optional) – Number of items on a single page, by default 20
include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Return type:

list[dict]

Returns:

list of dicts – Containing meta-data of the nodes
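
The last_seen_from and last_seen_till filters expect dates as yyyy-mm-dd strings. A minimal sketch of building them from `datetime.date` objects follows; the helper name `last_seen_filters` is hypothetical, and passing the result to a user client through an attribute such as `client.node` is an assumption about how the Node sub-client is exposed.

```python
from datetime import date


def last_seen_filters(since: date, until: date) -> dict:
    """Build keyword arguments for the last_seen_* filters (yyyy-mm-dd)."""
    if since > until:
        raise ValueError("'since' must not be later than 'until'")
    return {
        "last_seen_from": since.isoformat(),
        "last_seen_till": until.isoformat(),
    }


filters = last_seen_filters(date(2023, 1, 1), date(2023, 1, 31))

# With an authenticated user client (not created here), the filters could be
# passed along as keyword arguments, for example:
#   client.node.list(is_online=True, **filters)   # `client.node` is assumed
```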

update(id_, name=None, organization=None, collaboration=None)#

Update node information

Parameters:

id_ (int) – Id of the node you want to update
name (str, optional) – New node name, by default None
organization (int, optional) – Change the owning organization of the node, by default None
collaboration (int, optional) – Changes the collaboration to which the node belongs, by default None

Returns:

Containing the meta-data of the updated node

Return type:

dict

class Organization(parent)#

Bases: SubClient

Collection of organization requests

create(name, address1, address2, zipcode, country, domain, public_key=None)#

Create new organization

Parameters:

name (str) – Name of the organization
address1 (str) – Street and number
address2 (str) – City
zipcode (str) – Zip or postal code
country (str) – Country
domain (str) – Domain of the organization (e.g. vantage6.ai)
public_key (str, optional) – Public key of the organization. This can be set later, by default None

Returns:

Containing the information of the new organization

Return type:

dict

get(id_=None)#

View specific organization

Parameters:: id_ (int, optional) – Organization id of the organization you want to view. If no id is provided, your own organization is displayed. Default value is None.
Returns:: Containing the organization meta-data
Return type:: dict

list(name=None, country=None, collaboration=None, page=None, per_page=None, include_metadata=True)#

List organizations

Parameters:

name (str, optional) – Filter by name (with LIKE operator)
country (str, optional) – Filter by country
collaboration (int, optional) – Filter by collaboration id
page (int, optional) – Pagination page, by default 1
per_page (int, optional) – Number of items on a single page, by default 20
include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing meta-data information of the organizations

Return type:

list[dict]

update(id_=None, name=None, address1=None, address2=None, zipcode=None, country=None, domain=None, public_key=None)#

Update organization information

Parameters:

id_ (int, optional) – Organization id, by default None
name (str, optional) – New organization name, by default None
address1 (str, optional) – Address line 1, by default None
address2 (str, optional) – Address line 2, by default None
zipcode (str, optional) – Zipcode, by default None
country (str, optional) – Country, by default None
domain (str, optional) – Domain of the organization (e.g. iknl.nl), by default None
public_key (str, optional) – public key, by default None

Returns:

The meta-data of the updated organization

Return type:

dict

class Result(parent)#

Bases: SubClient

from_task(task_id, include_task=False)#

Get all results from a specific task

Parameters:

task_id (int) – Id of the task to get results from
include_task (bool, optional) – Whether to include the task or not, by default False

Returns:

Containing the results

Return type:

list[dict]

get(id_, include_task=False)#

View a specific result

Parameters:

id_ (int) – Id of the result you want to inspect
include_task (bool, optional) – Whether to include the task or not, by default False

Returns:

Containing the result data

Return type:

dict

list(task=None, organization=None, state=None, node=None, include_task=False, started=None, assigned=None, finished=None, port=None, page=None, per_page=None, include_metadata=True)#

List results

Parameters:

task (int, optional) – Filter by task id
organization (int, optional) – Filter by organization id
state (str, optional) – Filter by state (e.g. ‘open’)
node (int, optional) – Filter by node id
include_task (bool, optional) – Whether to include the task or not, by default False
started (tuple[str, str], optional) – Filter on a range of start times (format: yyyy-mm-dd)
assigned (tuple[str, str], optional) – Filter on a range of assign times (format: yyyy-mm-dd)
finished (tuple[str, str], optional) – Filter on a range of finished times (format: yyyy-mm-dd)
port (int, optional) – Port on which result was computed
page (int, optional) – Pagination page number, defaults to 1
per_page (int, optional) – Number of items per page, defaults to 20
include_metadata (bool, optional) – Whether to include pagination metadata, defaults to True

Returns:

If include_metadata is True, a dictionary is returned containing the key ‘data’ which contains a list of results, and a key ‘links’ which contains the pagination metadata. When include_metadata is False, the metadata wrapper is stripped and only a list of results is returned

Return type:

dict | list[dict]
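
Since the return value is either a wrapper dictionary or a bare list, calling code may want to normalize it. The sketch below does so using only the documented ‘data’ key; the helper name `extract_items` is hypothetical and the sample payloads are invented for illustration.

```python
def extract_items(response):
    """Return the list of items whether or not the metadata wrapper is present."""
    if isinstance(response, dict):
        # include_metadata=True: the items live under the 'data' key.
        return response["data"]
    # include_metadata=False: the response already is the list of items.
    return list(response)


# Invented sample payloads; real results contain more fields.
wrapped = {"data": [{"id": 1}, {"id": 2}], "links": {}}
bare = [{"id": 1}, {"id": 2}]
```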

class Role(parent)#

Bases: SubClient

create(name, description, rules, organization=None)#

Parameters:

name (str) – Role name
description (str) – Human readable description of the role
rules (list) – Rules that this role contains
organization (int, optional) – Organization to which this role belongs. In case this is not provided the users organization is used. By default None

Returns:

Containing meta-data of the new role

Return type:

dict

delete(role)#

Delete role

Parameters:: role (int) – CAUTION! Id of the role to be deleted. If you remove roles that are attached to you, you might lose access!
Returns:: Message from the server
Return type:: dict

get(id_)#

View specific role

Parameters:: id_ (int) – Id of the role you want to inspect
Returns:: Containing meta-data of the role
Return type:: dict

list(name=None, description=None, organization=None, rule=None, user=None, include_root=None, page=1, per_page=20, include_metadata=True)#

List of roles

Parameters:

name (str, optional) – Filter by name (with LIKE operator)
description (str, optional) – Filter by description (with LIKE operator)
organization (int, optional) – Filter by organization id
rule (int, optional) – Only show roles that contain this rule id
user (int, optional) – Only show roles that belong to a particular user id
include_root (bool, optional) – Include roles that are not assigned to any particular organization
page (int, optional) – Pagination page, by default 1
per_page (int, optional) – Number of items on a single page, by default 20
include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing roles meta-data

Return type:

list[dict]

update(role, name=None, description=None, rules=None)#

Update role

Parameters:

role (int) – Id of the role to be updated
name (str, optional) – New name of the role, by default None
description (str, optional) – New description of the role, by default None
rules (list, optional) – CAUTION! This will not add rules but replace them. If you remove rules from your own role you lose access. By default None

Returns:

Containing the updated role data

Return type:

dict

class Rule(parent)#

Bases: SubClient

get(id_)#

View specific rule

Parameters:: id_ (int) – Id of the rule you want to view
Returns:: Containing the information about this rule
Return type:: dict

list(name=None, operation=None, scope=None, role=None, page=1, per_page=20, include_metadata=True)#

List of all available rules

Parameters:

name (str, optional) – Filter by rule name
operation (str, optional) – Filter by operation
scope (str, optional) – Filter by scope
role (int, optional) – Only show rules that belong to this role id
page (int, optional) – Pagination page, by default 1
per_page (int, optional) – Number of items on a single page, by default 20
include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing all the rules from the vantage6 server

Return type:

list of dicts

class Task(parent)#

Bases: SubClient

create(collaboration, organizations, name, image, description, input, data_format='legacy', database='default')#

Create a new task

Parameters:

collaboration (int) – Id of the collaboration to which this task belongs
organizations (list) – Organization ids (within the collaboration) which need to execute this task
name (str) – Human readable name
image (str) – Docker image name which contains the algorithm
description (str) – Human readable description
input (dict) – Algorithm input
data_format (str, optional) – IO data format used, by default ‘legacy’
database (str, optional) – Database name to be used at the node

Returns:

Containing the task meta-data

Return type:

dict
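
A typical use of create is to submit a task and then wait for its results. The sketch below wires the documented create and wait_for_results calls together. Several things are assumptions: that the Task sub-client is reachable as `client.task`, that the returned task dictionary has an ‘id’ key, and the image name and input dictionary, which are placeholders. The stub classes stand in for a real authenticated client.

```python
def run_task(client, collaboration_id, organization_ids, image, algorithm_input):
    """Create a task and block until its results are available."""
    task = client.task.create(
        collaboration=collaboration_id,
        organizations=organization_ids,
        name="example-task",
        image=image,
        description="Task created from a documentation sketch",
        input=algorithm_input,
    )
    # Assumption: the task meta-data exposes the task id under "id".
    return client.wait_for_results(task["id"])


class _FakeTasks:
    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        return {"id": 42}


class _FakeClient:
    def __init__(self):
        self.task = _FakeTasks()

    def wait_for_results(self, task_id, sleep=1):
        # Invented result shape, only for this stub.
        return {"task_id": task_id}


client = _FakeClient()
results = run_task(
    client,
    collaboration_id=1,
    organization_ids=[2, 3],
    image="registry.example.org/algorithm:latest",  # placeholder image name
    algorithm_input={"method": "average"},  # placeholder input
)
```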

delete(id_)#

Delete a task

Also removes the related results.

Parameters:: id_ (int) – Id of the task to be removed
Returns:: Message from the server
Return type:: dict

get(id_, include_results=False)#

View specific task

Parameters:

id_ (int) – Id of the task you want to view
include_results (bool, optional) – Whether to include the results or not, by default False

Returns:

Containing the task data

Return type:

dict

kill(id_)#

Kill a task running on one or more nodes

Note that this does not remove the task from the database, but merely halts its execution (and prevents it from being restarted).

Parameters:: id_ (int) – Id of the task to be killed
Returns:: Message from the server
Return type:: dict

list(initiator=None, initiating_user=None, collaboration=None, image=None, parent=None, run=None, name=None, include_results=False, description=None, database=None, result=None, status=None, page=1, per_page=20, include_metadata=True)#

List tasks

Parameters:

name (str, optional) – Filter by the name of the task. The name is matched with the LIKE operator, e.g. E% matches task names that start with an ‘E’.
initiator (int, optional) – Filter by initiating organization
initiating_user (int, optional) – Filter by initiating user
collaboration (int, optional) – Filter by collaboration
image (str, optional) – Filter by Docker image name (with LIKE operator)
parent (int, optional) – Filter by parent task
run (int, optional) – Filter by run
include_results (bool, optional) – Whether to include the results in the tasks, by default False
description (str, optional) – Filter by description (with LIKE operator)
database (str, optional) – Filter by database (with LIKE operator)
result (int, optional) – Only show tasks that contain this result id
status (str, optional) – Filter by task status (e.g. ‘active’, ‘pending’, ‘completed’, ‘crashed’)
page (int, optional) – Pagination page, by default 1
per_page (int, optional) – Number of items on a single page, by default 20
include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Return type:

dict

Returns:

dict – dictionary containing the key ‘data’ which contains the tasks and a key ‘links’ containing the pagination metadata
OR
list – when ‘include_metadata’ is set to false, it removes the metadata wrapper. I.e. directly returning the ‘data’ key.
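
To make the LIKE matching used by the name, image, description and database filters more concrete, the sketch below emulates the pattern semantics locally with a regular expression. It only illustrates what a pattern such as E% selects and is not how the server evaluates the filter. The helper names are hypothetical, the `_` single-character wildcard follows standard SQL and is an assumption here, and case sensitivity depends on the database backend and is not modeled.

```python
import re


def like_to_regex(pattern: str):
    """Translate a LIKE pattern into an equivalent compiled regular expression."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")  # any sequence of characters, including none
        elif char == "_":
            parts.append(".")  # exactly one character (standard SQL)
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.DOTALL)


def like_matches(pattern: str, value: str) -> bool:
    """Check whether the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


names = ["Example task", "Evaluation", "average-run"]
starting_with_e = [name for name in names if like_matches("E%", name)]
```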

class User(parent)#

Bases: SubClient

create(username, firstname, lastname, password, email, organization=None, roles=[], rules=[])#

Create new user

Parameters:

username (str) – Used to login to the service. This can not be changed later.
firstname (str) – Firstname of the new user
lastname (str) – Lastname of the new user
password (str) – Password of the new user
email (str) – Email address of the new user
organization (int) – Organization id this user should belong to
roles (list of ints) – Role ids that are assigned to this user. Note that you can only assign roles if you own the rules within this role.
rules (list of ints) – Rule ids that are assigned to this user. Note that you can only assign rules that you own

Returns:

Containing data of the new user

Return type:

dict

get(id_=None)#

View user information

Parameters:: id_ (int, optional) – User id, by default None. When no id is provided your own user information is displayed
Returns:: Containing user information
Return type:: dict

list(username=None, organization=None, firstname=None, lastname=None, email=None, role=None, rule=None, last_seen_from=None, last_seen_till=None, page=1, per_page=20, include_metadata=True)#

List users

Parameters:

username (str, optional) – Filter by username (with LIKE operator)
organization (int, optional) – Filter by organization id
firstname (str, optional) – Filter by firstname (with LIKE operator)
lastname (str, optional) – Filter by lastname (with LIKE operator)
email (str, optional) – Filter by email (with LIKE operator)
role (int, optional) – Show only users that have this role id
rule (int, optional) – Show only users that have this rule id
last_seen_from (str, optional) – Filter users that have logged on since (format yyyy-mm-dd)
last_seen_till (str, optional) – Filter users that have logged on until (format yyyy-mm-dd)
page (int, optional) – Pagination page, by default 1
per_page (int, optional) – Number of items on a single page, by default 20
include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing the meta-data of the users

Return type:

list of dicts
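
The page and per_page parameters return one page at a time. To collect every item, pages can be requested until a short page comes back, as in the sketch below. The helper `fetch_all` and its `max_pages` safety limit are hypothetical, and the fake list function only imitates the documented page, per_page and include_metadata parameters.

```python
def fetch_all(list_method, per_page=20, max_pages=1000, **filters):
    """Collect items from all pages of a paginated list method."""
    items = []
    for page in range(1, max_pages + 1):
        batch = list_method(
            page=page, per_page=per_page, include_metadata=False, **filters
        )
        items.extend(batch)
        if len(batch) < per_page:
            # A page that is not full is the last one.
            break
    return items


# Fake data source with 45 users, used only to exercise the sketch.
_USERS = [{"id": i} for i in range(1, 46)]


def fake_user_list(page=1, per_page=20, include_metadata=True, **_filters):
    start = (page - 1) * per_page
    batch = _USERS[start:start + per_page]
    return {"data": batch, "links": {}} if include_metadata else batch


all_users = fetch_all(fake_user_list, per_page=20)
```

With a real client the first argument would be a bound list method of one of the sub-clients.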

update(id_=None, firstname=None, lastname=None, organization=None, rules=None, roles=None, email=None)#

Update user details

If you do not supply an id, your own user is updated.

Parameters:

id_ (int) – User id of the user you want to update
firstname (str) – Your first name
lastname (str) – Your last name
organization (int) – Organization id of the organization you want to be part of. This can only be done by super-users.
rules (list of ints) – USE WITH CAUTION! Rule ids that should be assigned to this user. All previous assigned rules will be removed!
roles (list of ints) – USE WITH CAUTION! Role ids that should be assigned to this user. All previous assigned roles will be removed!
email (str) – New email from the user

Returns:

A dict containing the updated user data

Return type:

dict

class Util(parent)#

Bases: SubClient

Collection of general utilities

change_my_password(current_password, new_password)#

Change your own password by providing your current password

Parameters:

current_password (str) – Your current password
new_password (str) – Your new password

Returns:

Message from the server

Return type:

dict

generate_private_key(file_=None)#

Generate new private key

Parameters:: file_ (str, optional) – Path where to store the private key, by default None
Return type:: None

get_server_health()#

View the health of the vantage6-server

Returns:: Containing the server health information
Return type:: dict

get_server_version()#

View the version number of the vantage6-server

Returns:: A dict containing the version number
Return type:: dict

reset_my_password(email=None, username=None)#

Start reset password procedure

Either a username or an email address needs to be provided.

Parameters:

email (str, optional) – Email address of your account, by default None
username (str, optional) – Username of your account, by default None

Returns:

Message from the server

Return type:

dict

reset_two_factor_auth(password, email=None, username=None)#

Start reset procedure for two-factor authentication

The password and either a username or an email address must be provided.

Parameters:

password (str) – Password of your account
email (str, optional) – Email address of your account, by default None
username (str, optional) – Username of your account, by default None

Returns:

Message from the server

Return type:

dict

set_my_password(token, password)#

Set a new password using a recovery token

Token can be obtained through .reset_my_password(…)

Parameters:

token (str) – Token obtained from reset_my_password
password (str) – New password

Returns:

Message from the server

Return type:

dict

set_two_factor_auth(token)#

Set up two-factor authentication using a recovery token after you have lost access.

Token can be obtained through .reset_two_factor_auth(…)

Parameters:: token (str) – Token obtained from reset_two_factor_auth
Returns:: Message from the server
Return type:: dict

authenticate(username, password, mfa_code=None)#

Authenticate as a user

It also collects some additional info about your user.

Parameters:

username (str) – Username used to authenticate
password (str) – Password used to authenticate
mfa_code (str | int, optional) – Six-digit two-factor authentication code, by default None

Return type:

None

wait_for_results(task_id, sleep=1)#

Polls the server to check when results are ready, and returns the results when the task is completed.

Parameters:

task_id (int) – ID of the task that you are waiting for
sleep (float) – Interval in seconds between checks if task is finished. Default 1.

Returns:

A dictionary with the results of the task, after it has completed.

Return type:

dict
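
The polling behaviour described above can be sketched as a loop that checks for completion, sleeps, and finally fetches the results. This is an illustration of the idea, not the library's implementation. The function `poll_for_results`, its callables and the `max_checks` limit are hypothetical additions.

```python
import time


def poll_for_results(is_finished, fetch_results, sleep=1.0, max_checks=None):
    """Wait until is_finished() is true, then return fetch_results()."""
    checks = 0
    while not is_finished():
        checks += 1
        if max_checks is not None and checks >= max_checks:
            raise TimeoutError("task did not finish within the allowed checks")
        time.sleep(sleep)  # interval between two checks, in seconds
    return fetch_results()


# Fake task that reports completion on the third check.
_status = iter([False, False, True])
results = poll_for_results(lambda: next(_status), lambda: {"data": "done"}, sleep=0)
```

A limit such as `max_checks` avoids waiting forever on a task that never completes; wait_for_results as documented here has no such parameter.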

vantage6.client.utils#

class LogLevel(value)#

Enum for the different log levels

Variables:

DEBUG (str) – The debug log level
INFO (str) – The info log level
WARN (str) – The warn log level
ERROR (str) – The error log level
CRITICAL (str) – The critical log level

print_qr_code(json_data)#

Print the QR code for 2fa with additional info of how to use it.

This function should work in any terminal or Python scripting environment. Therefore, all is printed regardless of log level

Parameters:: json_data (dict) – A dictionary containing the secret and URI to generate the QR code
Return type:: None

show_qr_code_image(qr_uri)#

Print a QR code image to the user’s Python environment

Parameters:: qr_uri (str) – An OTP-auth URI used to generate the QR code
Return type:: None

7.4.2. Algorithm Client#

vantage6.client.algorithm_client#

class AlgorithmClient(token, *args, **kwargs)#

Bases: ClientBase

Interface to communicate between the algorithm container and the central server via a local proxy server.

An algorithm container cannot communicate directly with the central server as it has no internet connection. The algorithm can, however, talk to a local proxy server which has an interface to the central server. This way we make sure that the algorithm container does not share details with others, and we can also encrypt the results for a specific receiver. Thus, this is not an interface to the central server but to the local proxy server. However, the interface looks identical to make it easier to use.

Parameters:

token (str) – JWT (container) token, generated by the node the algorithm container runs on
*args – Arguments passed to the parent ClientBase class.
**kwargs – Arguments passed to the parent ClientBase class.

class Collaboration(parent)#

Bases: SubClient

Get information about the collaboration.

get()#

Get the collaboration data.

Returns:: Dictionary containing the collaboration data.
Return type:: dict

class Node(parent)#

Bases: SubClient

Get information about the node.

get()#

Get the node data.

Returns:: Dictionary containing data on the node this algorithm is running on.
Return type:: dict

class Organization(parent)#

Bases: SubClient

Get information about organizations in the collaboration.

get(id_)#

Get an organization by ID.

Parameters:: id_ (int) – ID of the organization to retrieve
Returns:: Dictionary containing the organization data.
Return type:: dict

list()#

Obtain all organizations in the collaboration.

Returns:: List of organizations in the collaboration.
Return type:: list[dict]

class Result(parent)#

Bases: SubClient

Result client for the algorithm container.

This client is used to obtain results of tasks with the same run_id from the central server.

get(task_id)#

Obtain results from a specific task at the server.

Containers are allowed to obtain the results of their children (having the same run_id at the server). The permissions are checked at the central server.

Note that the returned results are not decrypted. The algorithm is responsible for decrypting the results.

Parameters:: task_id (int) – ID of the task from which you want to obtain the results
Returns:: List of results. The type of the results depends on the algorithm.
Return type:: list

class Task(parent)#

Bases: SubClient

A task client for the algorithm container.

It provides functions to get task information and create new tasks.

create(input_, organization_ids=None, name='subtask', description=None)#

Create a new (child) task at the central server.

Containers are allowed to create child tasks (having the same run_id) at the central server. The docker image must be the same as the docker image of this container itself.

Parameters:

input_ (bytes) – Input to the task. Should be b64 encoded.
organization_ids (list[int]) – List of organization IDs that should execute the task.
name (str, optional) – Name of the subtask
description (str, optional) – Description of the subtask

Returns:

Dictionary containing information on the created task

Return type:

dict
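Since the input is expected as b64-encoded bytes, the following sketch shows one way to prepare such a value with the standard library. The use of JSON as the serialization step, and the method and kwargs keys in the example payload, are assumptions made for illustration; they are not taken from this reference.

```python
import base64
import json


def encode_input(payload: dict) -> bytes:
    """Serialize a dict to JSON (assumed format) and base64-encode it."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(raw)


def decode_input(encoded: bytes) -> dict:
    """Reverse of encode_input, shown to make the round trip explicit."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


# Hypothetical subtask input; the key names are placeholders.
subtask_input = {"method": "partial_average", "kwargs": {"column": "age"}}
encoded_input = encode_input(subtask_input)
```

The encoded value could then be passed as input_ when creating the child task.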

get(task_id)#

Retrieve a task at the central server.

Parameters:: task_id (int) – ID of the task to retrieve
Returns:: Dictionary containing the task information
Return type:: dict

class VPN(parent)#

Bases: SubClient

A VPN client for the algorithm container.

It provides functions to obtain the IP addresses of other containers.

get_addresses(only_children=False, only_parent=False, include_children=False, include_parent=False, label=None)#

Get information about the VPN IP addresses and ports of other algorithm containers involved in the current task. These addresses can be used as targets for VPN communication.

Parameters:

only_children (bool, optional) – Only return the IP addresses of the children of the current task, by default False. Incompatible with only_parent.
only_parent (bool, optional) – Only return the IP address of the parent of the current task, by default False. Incompatible with only_children.
include_children (bool, optional) – Include the IP addresses of the children of the current task, by default False. Incompatible with only_parent, superseded by only_children.
include_parent (bool, optional) – Include the IP address of the parent of the current task, by default False. Incompatible with only_children, superseded by only_parent.
label (str, optional) – The label of the port you are interested in, which is set in the algorithm Dockerfile. If this parameter is set, only the ports with this label will be returned.

Returns:

List of dictionaries containing the IP address and port number, and other information to identify the containers. If obtaining the VPN addresses from the server fails, a dictionary with a ‘message’ key is returned instead.

Return type:

list[dict] | dict
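Because the call may return either a list of addresses or a dictionary with a ‘message’ key, callers usually need to tell the two apart before using the result. The helper below is a minimal sketch of that check on plain Python values; usable_addresses is a hypothetical name, and the ip and port keys in the sample data are assumed for illustration.

```python
from typing import Union


def usable_addresses(response: Union[list, dict]) -> list:
    """Return the address entries, or raise if a failure was reported."""
    if isinstance(response, dict):
        # Per the return description, a failure yields a dict with a 'message' key.
        raise RuntimeError(
            response.get("message", "could not obtain VPN addresses")
        )
    return response


# Sample payloads; the 'ip' and 'port' keys are assumptions.
ok_response = [
    {"ip": "10.76.0.2", "port": 8888},
    {"ip": "10.76.0.3", "port": 8888},
]
failed_response = {"message": "VPN is not available"}

targets = [f"{a['ip']}:{a['port']}" for a in usable_addresses(ok_response)]

try:
    usable_addresses(failed_response)
    failure = None
except RuntimeError as exc:
    failure = str(exc)
```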

get_child_addresses()#

Get the IP addresses and port numbers of the children of the current task.

Returns:: List of dictionaries containing the IP address and port number, and other information to identify the containers. If obtaining the VPN addresses from the server fails, a dictionary with a ‘message’ key is returned instead.
Return type:: list[dict] | dict

get_parent_address()#

Get the IP address and port number of the parent of the current task.

Returns:: Dictionary containing the IP address and port number, and other information to identify the containers. If obtaining the VPN addresses from the server fails, a dictionary with a ‘message’ key is returned instead.
Return type:: dict

request(*args, **kwargs)#

Make a request to the central server. This overwrites the parent function so that containers will not try to refresh their token, which they would be unable to do.

Parameters:

*args – Arguments passed to the parent ClientBase.request function.
**kwargs – Arguments passed to the parent ClientBase.request function.

Returns:

Response from the central server.

Return type:

dict

7.4.3. Algorithm tooling#

vantage6.tools.wrapper#

This module contains algorithm wrappers. These wrappers are used to provide different data adapters to the algorithms. This way we only need to write the algorithm once and can use it with different data adapters.

Currently the following wrappers are available:

DockerWrapper (= CSVWrapper)
SparqlDockerWrapper
ParquetWrapper
SQLWrapper
OMOPWrapper
ExcelWrapper

When writing the Dockerfile for the algorithm, you can call the auto_wrapper, which will automatically select the correct wrapper based on the database type. The database type is set by the vantage6 node based on its configuration file.

For legacy reasons, the docker_wrapper, sparql_docker_wrapper and parquet_wrapper are still available. These wrappers are deprecated and will be removed in the future.

The multi_wrapper is used when multiple databases are connected to a single algorithm. This wrapper is separated from the other wrappers because it is not compatible with the smart_wrapper.

class CSVWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:

database_uri (str) – URI of the csv file, supplied by the node
input_data (dict) – Unused, as csv files do not require a query

Returns:

The data from the csv file

Return type:

pandas.DataFrame

CsvWrapper#: alias of CSVWrapper

DockerWrapper#: alias of CSVWrapper

class ExcelWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:

database_uri (str) – URI of the excel file, supplied by the node
input_data (dict) – May contain a ‘sheet_name’, which is passed to pandas.read_excel

Returns:

The data from the excel file

Return type:

pandas.DataFrame

class MultiDBWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Supply all URIs to the algorithm. Note that this does not load the data from the databases, but only passes on the URIs, so the algorithm needs to load the data itself.

Parameters:

database_uri (str) – Unused, as all database URIs are passed on to the algorithm.
input_data (dict) – Unused

Returns:

A dictionary with the database label as key and the URI as value

Return type:

dict

class OMOPWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:

database_uri (str) – URI of the OMOP database, supplied by the node
input_data (dict) – Contains a JSON cohort definition from the ATLAS tool, to retrieve the data from the database

Returns:

The data from the database

Return type:

pandas.DataFrame

class ParquetWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:

database_uri (str) – URI of the parquet file, supplied by the node
input_data (dict) – Unused, as no additional settings are required

Returns:

The data from the parquet file

Return type:

pandas.DataFrame

class SQLWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:

database_uri (str) – URI of the SQL database, supplied by the node
input_data (dict) – Contains a ‘query’, to retrieve the data from the database

Returns:

The data from the database

Return type:

pandas.DataFrame

class SparqlDockerWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:

database_uri (str) – URI of the triplestore, supplied by the node
input_data (dict) – Can contain a ‘query’, to retrieve the data from the triplestore

Returns:

The data from the triplestore

Return type:

pandas.DataFrame

class WrapperBase#

Bases: ABC

abstract static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:

database_uri (str) – URI of the database to read
input_data (dict) – User defined input, which may contain a query for the database
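To make the interface concrete, the sketch below defines a stand-in abstract base with the same load_data signature and one subclass for tab-separated files. Both class names are hypothetical and nothing here imports vantage6. The sketch also returns a list of dictionaries to stay within the standard library, whereas the wrappers documented here return a pandas.DataFrame.

```python
import csv
import os
import tempfile
from abc import ABC, abstractmethod


class WrapperBaseSketch(ABC):
    """Stand-in that mirrors the documented WrapperBase interface."""

    @staticmethod
    @abstractmethod
    def load_data(database_uri: str, input_data: dict):
        """Load the local data from the database."""


class TSVWrapperSketch(WrapperBaseSketch):
    """Hypothetical wrapper for tab-separated files."""

    @staticmethod
    def load_data(database_uri: str, input_data: dict) -> list:
        # input_data is unused here: a flat file does not need a query.
        with open(database_uri, newline="") as handle:
            return list(csv.DictReader(handle, delimiter="\t"))


# Write a small example file and load it through the wrapper.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "patients.tsv")
    with open(path, "w", newline="") as handle:
        handle.write("age\tsex\n31\tF\n45\tM\n")
    rows = TSVWrapperSketch.load_data(path, {})
```

Because load_data is abstract in the base class, the base cannot be instantiated on its own; only subclasses that implement it can.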

wrap_algorithm(module, load_data=True, use_new_client=False, log_traceback=False)#

Wrap an algorithm module to provide input and output handling for the vantage6 infrastructure.

Data is received in the form of files, whose location should be specified in the following environment variables:

INPUT_FILE: input arguments for the algorithm
OUTPUT_FILE: location where the results of the algorithm should be stored
TOKEN_FILE: access token for the vantage6 server REST api
DATABASE_URI: either a database endpoint or path to a csv file.
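The four variables listed above can be collected in one place before any file is read. The following is a minimal sketch of that step, assuming nothing beyond the variable names given here; WrapperLocations and read_wrapper_locations are hypothetical names, and the example values are placeholders.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass
class WrapperLocations:
    input_file: str
    output_file: str
    token_file: str
    database_uri: str


def read_wrapper_locations(
    environ: Mapping[str, str] = os.environ,
) -> WrapperLocations:
    """Collect the four documented variables; a missing one raises KeyError."""
    return WrapperLocations(
        input_file=environ["INPUT_FILE"],
        output_file=environ["OUTPUT_FILE"],
        token_file=environ["TOKEN_FILE"],
        database_uri=environ["DATABASE_URI"],
    )


# Placeholder values for illustration only.
example_env = {
    "INPUT_FILE": "/tmp/input",
    "OUTPUT_FILE": "/tmp/output",
    "TOKEN_FILE": "/tmp/token",
    "DATABASE_URI": "/tmp/data.csv",
}
locations = read_wrapper_locations(example_env)
```

Passing the mapping as an argument keeps the function easy to test without modifying the real process environment.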

The wrapper is able to parse a number of input file formats. The available formats can be found in vantage6.tools.data_format.DataFormat. When the input is not pickle (legacy), the format should be specified in the first bytes of the input file, followed by a ‘.’.

It is also possible to specify the desired output format. This is done by including the parameter ‘output_format’ in the input parameters. Again, the list of possible output formats can be found in vantage6.tools.data_format.DataFormat.

It is still possible that output serialization will fail even if the specified format is listed in the DataFormat enum. Algorithms can in principle return any python object, but not every serialization format will support arbitrary python objects. When dealing with unsupported algorithm output, the user should use ‘pickle’ as output format, which is the default.

The other serialization formats support the following algorithm output:

built-in primitives (int, float, str, etc.)
built-in collections (list, dict, tuple, etc.)
pandas DataFrames

Parameters:

module (str) – Python module name of the algorithm to wrap.
load_data (bool, optional) – Whether to load the data into a pandas DataFrame or not, by default True
use_new_client (bool) – Whether to use the new AlgorithmClient or the old ContainerClient, by default False
log_traceback (bool) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

auto_wrapper(module, load_data=True, use_new_client=False, log_traceback=False)#

Wrap an algorithm module to provide input and output handling for the vantage6 infrastructure. This function will automatically select the correct wrapper based on the database type.

Parameters:

module (str) – Python module name of the algorithm to wrap.
load_data (bool, optional) – Whether to load the data or not, by default True
use_new_client (bool, optional) – Whether to use the new client or not, by default False
log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

docker_wrapper(module, load_data=True, use_new_client=False, log_traceback=False)#

Specific wrapper for CSV only data sources. Use the auto_wrapper to automatically select the correct wrapper based on the database type.

Parameters:

module (str) – Module name of the algorithm package.
load_data (bool, optional) – Whether to load the data into a pandas DataFrame or not, by default True
use_new_client (bool, optional) – Whether to use the new or old client, by default False
log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

load_input(input_file)#

Try to read the specified data format and deserialize the rest of the stream accordingly. If this fails, assume the data format is pickle.

Parameters:: input_file (str) – Path to the input file
Returns:: Deserialized input data
Return type:: Any
Raises:: DeserializationException – Failed to deserialize input data
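The format detection with pickle fallback can be sketched on raw bytes as follows. This is an illustration rather than the library code: it takes bytes instead of a file path, only recognises a json prefix, and DeserializationError is a local stand-in for the documented exception.

```python
import json
import pickle


class DeserializationError(Exception):
    """Local stand-in for the documented DeserializationException."""


def load_input_sketch(raw: bytes):
    """Read '<format>.<payload>'; fall back to pickle without a known prefix."""
    prefix, separator, payload = raw.partition(b".")
    try:
        if separator and prefix == b"json":
            return json.loads(payload.decode("utf-8"))
        # No recognised prefix: treat the whole stream as legacy pickle.
        return pickle.loads(raw)
    except Exception as exc:
        raise DeserializationError("could not deserialize input") from exc


from_json = load_input_sketch(b'json.{"method": "sum"}')
from_pickle = load_input_sketch(pickle.dumps({"method": "sum"}))

try:
    load_input_sketch(b"json.{not valid json")
    error_message = None
except DeserializationError as exc:
    error_message = str(exc)
```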

multidb_wrapper(module, use_new_client=False, log_traceback=False)#

Specific wrapper for multiple data sources.

Parameters:

module (str) – Module name of the algorithm package.
use_new_client (bool, optional) – Whether to use the new or old client, by default False
log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

parquet_wrapper(module, use_new_client=False, log_traceback=False)#

Specific wrapper for Parquet only data sources. Use the auto_wrapper to automatically select the correct wrapper based on the database type.

Parameters:

module (str) – Module name of the algorithm package.
use_new_client (bool, optional) – Whether to use the new or old client, by default False
log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

select_wrapper(database_type)#

Select the correct wrapper based on the database type.

Parameters:: database_type (str) – The database type to select the wrapper for.
Returns:: The wrapper for the specified database type.
Return type:: derivative of WrapperBase
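Selecting a wrapper by database type is essentially a lookup from a type string to a class. The sketch below shows that idea with empty stand-in classes; the type strings ("csv", "excel" and so on), the case-insensitive matching and the ValueError for unknown types are all assumptions, not behaviour confirmed by this reference.

```python
class WrapperBaseSketch:
    """Stand-in base class for the wrappers listed in this module."""


class CSVWrapperSketch(WrapperBaseSketch):
    pass


class ExcelWrapperSketch(WrapperBaseSketch):
    pass


class SQLWrapperSketch(WrapperBaseSketch):
    pass


# Assumed mapping from database type strings to wrapper classes.
_WRAPPERS = {
    "csv": CSVWrapperSketch,
    "excel": ExcelWrapperSketch,
    "sql": SQLWrapperSketch,
}


def select_wrapper_sketch(database_type: str):
    """Return the wrapper class registered for the given database type."""
    try:
        return _WRAPPERS[database_type.lower()]
    except KeyError:
        raise ValueError(f"unsupported database type: {database_type!r}") from None


chosen = select_wrapper_sketch("CSV")
```

A dictionary lookup keeps the mapping in one place, so supporting another database type only requires one more entry.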

sparql_wrapper(module, use_new_client=False, log_traceback=False)#

Specific wrapper for SPARQL only data sources. Use the auto_wrapper to automatically select the correct wrapper based on the database type.

Parameters:

module (str) – Module name of the algorithm package.
use_new_client (bool, optional) – Whether to use the new or old client, by default False
log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

write_output(output_format, output, output_file)#

Write output to output_file using the format from output_format.

If output_format == None, write output as pickle without indicating format (legacy method)

Parameters:

output_format (str) – Data type of the output e.g. ‘pickle’, ‘json’, ‘csv’, ‘parquet’
output (Any) – Output of the algorithm, could be any type
output_file (str) – Path to the output file

Return type:

None
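The difference between the legacy path and an explicit format can be sketched as below. The sketch assumes that an explicit format is written as a prefix followed by a ‘.’, mirroring what is described for input files; that layout is an assumption for output, and only the json and pickle formats are handled.

```python
import json
import os
import pickle
import tempfile


def write_output_sketch(output_format, output, output_file: str) -> None:
    """Write `output` to `output_file`, prefixed with the format if given."""
    with open(output_file, "wb") as handle:
        if output_format is None:
            # Legacy method: bare pickle, no format indicator.
            pickle.dump(output, handle)
        elif output_format == "json":
            handle.write(b"json.")  # assumed '<format>.' prefix
            handle.write(json.dumps(output).encode("utf-8"))
        elif output_format == "pickle":
            handle.write(b"pickle.")  # assumed '<format>.' prefix
            pickle.dump(output, handle)
        else:
            raise ValueError(f"unsupported output format: {output_format!r}")


result = {"mean": 4.5}
with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "out_json")
    legacy_path = os.path.join(tmp, "out_legacy")
    write_output_sketch("json", result, json_path)
    write_output_sketch(None, result, legacy_path)
    with open(json_path, "rb") as handle:
        json_bytes = handle.read()
    with open(legacy_path, "rb") as handle:
        legacy_bytes = handle.read()
```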

vantage6.tools.mock_client#

class ClientMockProtocol(datasets, module)#

The ClientMockProtocol is used to test your algorithm locally. It mimics the behaviour of the client and its communication with the server.

Parameters:

datasets (list[str]) – A list of paths to the datasets that are used in the algorithm.
module (str) – The name of the module that contains the algorithm.

create_new_task(input_, organization_ids=None)#

Create a new task with the MockProtocol and return the task id.

Parameters:

input_ (dict) – The input data that is passed to the algorithm. This should at least contain the key ‘method’, which is the name of the method that should be called. Another often used key is ‘master’, which indicates that this container is a master container. Other keys depend on the algorithm.
organization_ids (list[int], optional) – A list of organization ids that should run the algorithm.

Returns:

The id of the task.

Return type:

int

get_organizations_in_my_collaboration()#

Get mocked organizations.

Returns:: A list of mocked organizations.
Return type:: list[dict]

get_results(task_id)#

Return the results of the task with the given id.

Parameters:: task_id (int) – The id of the task.
Returns:: The results of the task.
Return type:: list[dict]

get_task(task_id)#

Return the task with the given id.

Parameters:: task_id (int) – The id of the task.
Returns:: The task details.
Return type:: dict

class MockAlgorithmClient(datasets, module, node_id=None, collaboration_id=None, organization_id=None)#

The MockAlgorithmClient mimics the behaviour of the AlgorithmClient. It can be used to mock the behaviour of the AlgorithmClient and its communication with the server.

Parameters:

datasets (list[dict]) – A list of dictionaries that contain the datasets that are used in the mocked algorithm. Each dictionary should contain the following:

{"database": str | pd.DataFrame, "type": str, "input_data": dict}

where database is the path/URI to the database, type is the database type (as listed in the node configuration) and input_data contains the input data that is normally passed to the algorithm wrapper.

Note that if the database is a pandas DataFrame, the type and input_data keys are not required.
module (str) – The name of the module that contains the algorithm.
node_id (int, optional) – Sets the mocked node id to this value. Defaults to 1.
collaboration_id (int, optional) – Sets the mocked collaboration id to this value. Defaults to 1.
organization_id (int, optional) – Sets the mocked organization id to this value. Defaults to 1.
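The shape of the datasets argument can be checked before it is handed to the mock client. The sketch below builds two example entries and reports missing keys for path-based databases; missing_keys is a hypothetical helper, and the file paths, the type strings and the query are placeholders chosen for illustration.

```python
REQUIRED_FOR_PATHS = ("type", "input_data")


def missing_keys(entry: dict) -> list:
    """List the keys an entry still needs, following the description above."""
    if "database" not in entry:
        return ["database"]
    if isinstance(entry["database"], str):
        # A path/URI needs a type and input_data; a DataFrame would not.
        return [key for key in REQUIRED_FOR_PATHS if key not in entry]
    return []


# Placeholder datasets, one entry per mocked data source.
datasets = [
    {"database": "./data/org_a.csv", "type": "csv", "input_data": {}},
    {
        "database": "sqlite:///data/org_b.db",
        "type": "sql",
        "input_data": {"query": "SELECT * FROM patients"},
    },
]

problems = [missing_keys(entry) for entry in datasets]
incomplete = missing_keys({"database": "./data/org_c.csv"})
```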

class Collaboration(parent)#

Collaboration subclient for the MockAlgorithmClient

get(is_encrypted=True)#

Get mocked collaboration

Parameters:: is_encrypted (bool) – Whether the collaboration is encrypted or not. Default True.
Returns:: A mocked collaboration.
Return type:: dict

class Node(parent)#

Node subclient for the MockAlgorithmClient

get(is_online=True)#

Get mocked node

Parameters:: is_online (bool) – Whether the node is online or not. Default True.
Returns:: A mocked node.
Return type:: dict

class Organization(parent)#

Organization subclient for the MockAlgorithmClient

get(id_)#

Get mocked organization by ID

Parameters:: id_ (int) – The id of the organization.
Returns:: A mocked organization.
Return type:: dict

list()#

Get mocked organizations in the collaboration.

Returns:: A list of mocked organizations in the collaboration.
Return type:: list[dict]

class Result(parent)#

Result subclient for the MockAlgorithmClient

get(task_id)#

Return the results of the task with the given id.

Parameters:: task_id (int) – The id of the task.
Returns:: The results of the task.
Return type:: list[dict]

class SubClient(parent)#

Create sub groups of commands using this SubClient

Parameters:: parent (MockAlgorithmClient) – The parent client

class Task(parent)#

Task subclient for the MockAlgorithmClient

create(input_, organization_ids, name='mock', description='mock', *args, **kwargs)#

Create a new task with the MockProtocol and return the task id.

Parameters:

input_ (dict) – The input data that is passed to the algorithm. This should at least contain the key ‘method’, which is the name of the method that should be called. Another often used key is ‘master’, which indicates that this container is a master container. Other keys depend on the algorithm.
organization_ids (list[int]) – A list of organization ids that should run the algorithm.
name (str, optional) – The name of the task, by default “mock”
description (str, optional) – The description of the task, by default “mock”

Returns:

The id of the task.

Return type:

int

get(task_id)#

Return the task with the given id.

Parameters:: task_id (int) – The id of the task.
Returns:: The task details.
Return type:: dict

vantage6.tools.dispatch_rpc#

dispatch_rpc(data, input_data, module, token, use_new_client=False, log_traceback=False)#

Load the algorithm module and call the correct method to run an algorithm.

Parameters:

data (Any) – The data that is passed to the algorithm.
input_data (dict) – The input data that is passed to the algorithm. This should at least contain the key ‘method’ which is the name of the method that should be called. Another often used key is ‘master’ which indicates that this container is a master container. Other keys depend on the algorithm.
module (str) – The name of the module that contains the algorithm.
token (str) – The JWT token that is used to authenticate from the algorithm container to the server.
use_new_client (bool, optional) – Whether to use the new client or the old client, by default False
log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Returns:

The result of the algorithm.

Return type:

Any
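The core idea of loading a module by name and calling the method named in the input can be sketched with importlib. This is a simplified illustration, not the library implementation: it ignores the token, the client and the ‘master’ key, the args and kwargs input keys are assumptions, and the standard-library statistics module stands in for an algorithm package.

```python
import importlib


def dispatch_sketch(data, input_data: dict, module: str):
    """Import `module` and call the function named in input_data['method']."""
    lib = importlib.import_module(module)
    method_name = input_data["method"]
    method = getattr(lib, method_name, None)
    if not callable(method):
        raise AttributeError(
            f"module {module!r} has no method {method_name!r}"
        )
    # 'args' and 'kwargs' are assumed input keys, used here for illustration.
    args = input_data.get("args", [])
    kwargs = input_data.get("kwargs", {})
    return method(data, *args, **kwargs)


# `statistics.mean` plays the role of an algorithm method.
average = dispatch_sketch([1, 2, 3, 4], {"method": "mean"}, "statistics")

try:
    dispatch_sketch([1, 2], {"method": "no_such_method"}, "statistics")
    dispatch_error = None
except AttributeError as exc:
    dispatch_error = str(exc)
```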

vantage6.tools.util#

error(msg)#

Print an error message to stdout.

Parameters:: msg (str) – Error message to be printed
Return type:: None

info(msg)#

Print an info message to stdout.

Parameters:: msg (str) – Message to be printed
Return type:: None

warn(msg)#

Print a warning message to stdout.

Parameters:: msg (str) – Warning message to be printed
Return type:: None

7.4.4. Custom exceptions#

vantage6.client.exceptions#

exception DeserializationException#: Exception raised when deserialization of algorithm input or result fails.