7.4. Python client#

This page contains the API reference for the functions in the vantage6-client package.

7.4.1. User Client#

vantage6.client#

vantage6 clients

This module contains a base client. The container client (used by master algorithms) and the user client are derived from this base client.

Client#

alias of UserClient

class ClientBase(host, port, path='/api')#

Bases: object

Common interface to the central server.

Contains the basis for all other clients, e.g. UserClient, NodeClient and AlgorithmClient. This includes a basic interface to authenticate, send generic requests, create tasks and retrieve results.

class SubClient(parent)#

Bases: object

Create sub groups of commands using this SubClient

Parameters:

parent (UserClient) – The parent client

authenticate(credentials, path='token/user')#

Authenticate to the vantage6-server

It allows users, nodes and containers to sign in. Credentials can either be a username/password combination or a JWT authorization token.

Parameters:
  • credentials (dict) – Credentials used to authenticate

  • path (str, optional) – Endpoint used for authentication. This differs for users, nodes and containers, by default “token/user”

Raises:

Exception – Failed to authenticate

Returns:

Whether or not user is authenticated. Alternative is that user is redirected to set up two-factor authentication

Return type:

bool

property base_path: str#

Full path to the server URL. Combination of host, port and api-path

Returns:

Server URL

Return type:

str

generate_path_to(endpoint)#

Generate URL to endpoint using host, port and endpoint

Parameters:

endpoint (str) – Endpoint for which a full path needs to be generated

Returns:

URL to the endpoint

Return type:

str
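As a rough illustration of how host, port and api path could combine into a URL, here is a minimal sketch. The helper names `build_base_path` and `build_path_to` are hypothetical stand-ins, and the exact joining rules (for example, how slashes are normalized) are an assumption rather than the library's confirmed behaviour.

```python
def build_base_path(host: str, port: int, path: str = "/api") -> str:
    """Combine host (including protocol), port and api path into one URL."""
    return f"{host}:{port}{path}"


def build_path_to(base_path: str, endpoint: str) -> str:
    """Append an endpoint to the base path with exactly one slash in between."""
    return f"{base_path.rstrip('/')}/{endpoint.lstrip('/')}"


# Hypothetical server address, used for illustration only.
base = build_base_path("https://server.example", 443)
url = build_path_to(base, "token/user")
print(url)
```

Stripping slashes on both sides means that `"token/user"` and `"/token/user"` produce the same URL in this sketch.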

get_results(id_=None, state=None, include_task=False, task_id=None, node_id=None, params={})#

Get task result(s) from the central server

Depending on whether an id is specified, either a single result or a list of results is returned. The client attempts to decrypt the input and result fields of each result. Decryption fails if the public key at the server is not derived from the current private key, or if the result does not belong to your organization.

Parameters:
  • id_ (int, optional) – Id of the result, by default None

  • state (str, optional) – The state of the task (e.g. open), by default None

  • include_task (bool, optional) – Whether to include the originating task, by default False

  • task_id (int, optional) – The id of the originating task, this will return all results belonging to this task, by default None

  • node_id (int, optional) – The id of the node at which this result has been produced, this will return all results from this node, by default None

  • params (dict, optional) – Additional query parameters, by default {}

Returns:

Containing the result(s)

Return type:

dict

property headers: dict#

Defines headers that are sent with each request. This includes the authorization token.

Returns:

Headers

Return type:

dict

property host: str#

Host including protocol (HTTP/HTTPS)

Returns:

Host address of the vantage6 server

Return type:

str

property name: str#

Return the node’s/client’s name

Returns:

Name of the user or node

Return type:

str

property path: str#

Path/endpoint at the server where the api resides

Returns:

Path to the api

Return type:

str

property port: int#

Port on which vantage6 server listens

Returns:

Port number

Return type:

int

post_task(name, image, collaboration_id, input_='', description='', organization_ids=None, data_format='legacy', database='default')#

Post a new task at the server

It will also encrypt input_ for each receiving organization.

Parameters:
  • name (str) – Human readable name for the task

  • image (str) – Docker image name containing the algorithm

  • collaboration_id (int) – Collaboration id of the collaboration for which the task is intended

  • input_ (str, optional) – Task input, by default ‘’

  • description (str, optional) – Human readable description of the task, by default ‘’

  • organization_ids (list, optional) – Ids of organizations (within the collaboration) that need to execute this task, by default None

  • data_format (str, optional) – Data format used to send and receive data. Possible values are ‘json’, ‘pickle’ and ‘legacy’; ‘legacy’ uses pickle serialization. By default ‘legacy’

  • database (str, optional) – Database label to use for the task, by default ‘default’

Returns:

Containing the task meta-data

Return type:

dict

Raises:

AssertionError – Encryption has not yet been set up.

refresh_token()#

Refresh an expired token using the refresh token

Raises:
  • Exception – Authentication Error!

  • AssertionError – Refresh URL not found

Return type:

None

request(endpoint, json=None, method='get', params=None, first_try=True, retry=True)#

Create http(s) request to the vantage6 server

Parameters:
  • endpoint (str) – Endpoint of the server

  • json (dict, optional) – payload, by default None

  • method (str, optional) – Http verb, by default ‘get’

  • params (dict, optional) – URL parameters, by default None

  • first_try (bool, optional) – Whether this is the first attempt of this request. Default True.

  • retry (bool, optional) – Try request again after refreshing the token. Default True.

Returns:

Response of the server

Return type:

dict
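The `retry` parameter describes a refresh-then-retry pattern. The sketch below shows that pattern in isolation; it is not the library's implementation. `TokenExpired`, `send` and `refresh` are hypothetical names, and how the real client detects an expired token is not stated in this reference.

```python
from typing import Callable


class TokenExpired(Exception):
    """Raised by the hypothetical `send` callable when the token is rejected."""


def request_with_refresh(send: Callable[[], dict],
                         refresh: Callable[[], None],
                         retry: bool = True) -> dict:
    """Call `send`; if the token has expired, refresh it once and try again."""
    try:
        return send()
    except TokenExpired:
        if not retry:
            raise
        refresh()      # obtain a fresh access token
        return send()  # second and final attempt


# Stand-in callables so the sketch runs without a server.
state = {"token_valid": False, "refresh_count": 0}


def send() -> dict:
    if not state["token_valid"]:
        raise TokenExpired()
    return {"msg": "ok"}


def refresh() -> None:
    state["token_valid"] = True
    state["refresh_count"] += 1


response = request_with_refresh(send, refresh)
print(response)
```

Retrying only once avoids an endless loop when the refresh itself does not yield a usable token.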

setup_encryption(private_key_file)#

Enable the encryption module for the communication

This will attach a Crypter object to the client. It will also verify that the public key at the server matches the local private key. In case they differ, the local public key is uploaded to the server.

Parameters:

private_key_file (str) – File path of the private key file

Raises:

AssertionError – If the client is not authenticated

Return type:

None

property token: str#

JWT Authorization token

Returns:

JWT token

Return type:

str

class ContainerClient(token, *args, **kwargs)#

Bases: ClientBase

Container interface to the local proxy server (central server).

An algorithm container should never communicate directly with the central server, so it has no internet connection. It can, however, talk to a local proxy server that has an interface to the central server. This ensures that the algorithm container does not share anything with others, and it allows the results to be encrypted for a specific receiver. This client is therefore not an interface to the central server but to the local proxy server. Because the two interfaces are identical, this detail can be ignored in practice.

authenticate()#

Containers obtain their key via their host Node.

create_new_task(input_, organization_ids=[])#

Create a new (child) task at the central server.

Containers are allowed to create child tasks (having the same run_id) at the central server. The docker image must be the same as the docker image of this container itself.

Parameters:
  • input_ – input to the task

  • organization_ids – organization ids which need to execute this task

get_algorithm_address_by_label(task_id, label)#

Return the IP address plus port number of a given port label

Return type:

str

get_algorithm_addresses(task_id)#

Return IP address and port number of other algorithm containers involved in a task so that VPN can be used for communication

get_organizations_in_my_collaboration()#

Obtain all organizations in the collaboration.

The container runs in a Node which is part of a single collaboration. This method retrieves all organization data that are within that collaboration. This can be used to target specific organizations in a collaboration.

get_results(task_id)#

Obtain results from a specific task at the server

Containers are allowed to obtain the results of their children (having the same run_id at the server). The permissions are checked at the central server.

Parameters:

task_id (int) – id of the task from which you want to obtain the results
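The container methods above can be combined to send a child task to every organization in the collaboration. This is a minimal sketch under two assumptions that are not confirmed by this reference: each organization is a dict with an `"id"` key, and the value returned by `create_new_task` contains the new task's `"id"`. `FakeContainerClient` is a stand-in so the sketch runs without a node or server.

```python
def start_child_task(client, input_):
    """Create a child task for every organization in the collaboration."""
    organizations = client.get_organizations_in_my_collaboration()
    # Assumption: each organization is a dict with an "id" key.
    organization_ids = [org["id"] for org in organizations]
    task = client.create_new_task(input_, organization_ids=organization_ids)
    # Assumption: the returned task meta-data contains the new task's "id".
    return task["id"]


class FakeContainerClient:
    """Stand-in that mimics the documented method names only."""

    def get_organizations_in_my_collaboration(self):
        return [{"id": 1}, {"id": 2}]

    def create_new_task(self, input_, organization_ids=None):
        self.sent_to = organization_ids
        return {"id": 42}

    def get_results(self, task_id):
        return [{"result": None}]


client = FakeContainerClient()
task_id = start_child_task(client, input_={"example": "input"})
results = client.get_results(task_id)
print(task_id, client.sent_to)
```

This reference does not state whether `get_results` waits for the child tasks to finish, so real code may need to poll until all results are available.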

post_task(name, image, collaboration_id, input_='', description='', organization_ids=[], database='default')#

Post a new task at the central server.

! To create a new task from the algorithm container you should use the create_new_task function !

A task created from a container still needs to be encrypted. The container does not do this itself, because it should never have access to the private key of its organization. Instead, the encryption takes place in the local proxy server through which the algorithm communicates with the central server. This is why the post_task function is overridden here.

Parameters:
  • name (str) – human-readable name

  • image (str) – docker image name of the task

  • collaboration_id (int) – id of the collaboration in which the task should run

  • input_ – input to the task

  • description – human-readable description

  • organization_ids (list) – ids of the organizations where this task should run

Return type:

dict

refresh_token()#

Containers cannot refresh their token.

TODO we might want to notify node/server about this… TODO make a more useful exception

class ServerInfo(host: str, port: int, path: str)#

Bases: NamedTuple

Data-class to store the server info.

Variables:
  • host (str) – Address (including protocol, e.g. https://) of the vantage6 server

  • port (int) – Port number on which the server listens

  • path (str) – Path of the api, e.g. ‘/api’

host: str#

Alias for field number 0

path: str#

Alias for field number 2

port: int#

Alias for field number 1
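Because ServerInfo is a NamedTuple, its fields can be read by name or by position. The class below is a local stand-in that mirrors the documented fields and their order, so the sketch runs without the package installed; the server address is a made-up example.

```python
from typing import NamedTuple


class ServerInfo(NamedTuple):
    """Local stand-in mirroring the documented fields and their order."""
    host: str
    port: int
    path: str


info = ServerInfo(host="https://server.example", port=443, path="/api")

# Fields can be read by name or by position (host=0, port=1, path=2).
assert info.host == info[0]

# Tuple unpacking follows the field order.
host, port, path = info
print(f"{host}:{port}{path}")
```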

class UserClient(*args, verbose=False, log_level='debug', **kwargs)#

Bases: ClientBase

User interface to the vantage6-server

class Collaboration(parent)#

Bases: SubClient

Collection of collaboration requests

create(name, organizations, encrypted=False)#

Create new collaboration

Parameters:
  • name (str) – Name of the collaboration

  • organizations (list) – List of organization ids which participate in the collaboration

  • encrypted (bool, optional) – Whether the collaboration should be encrypted or not, by default False

Returns:

Containing the new collaboration meta-data

Return type:

dict

get(id_)#

View specific collaboration

Parameters:

id_ (int) – Id of the collaboration you want to view

Returns:

Containing the collaboration information

Return type:

dict

list(scope='organization', name=None, encrypted=None, organization=None, page=1, per_page=20, include_metadata=True)#

View your collaborations

Parameters:
  • scope (str, optional) – Scope of the list, accepted values are organization and global. In case of organization you get the collaborations in which your organization participates. If you specify global you get the collaborations which you are allowed to see.

  • name (str, optional) – Filter collaborations by name (with LIKE operator)

  • organization (int, optional) – Filter collaborations by organization id

  • encrypted (bool, optional) – Filter collaborations by whether or not they are encrypted

  • page (int, optional) – Pagination page, by default 1

  • per_page (int, optional) – Number of items on a single page, by default 20

  • include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing collaboration information

Return type:

list of dicts

Notes

  • Pagination does not work in combination with scope organization as pagination is missing at endpoint /organization/<id>/collaboration

class Node(parent)#

Bases: SubClient

Collection of node requests

create(collaboration, organization=None, name=None)#

Register new node

Parameters:
  • collaboration (int) – Collaboration id to which this node belongs

  • organization (int, optional) – Organization id to which this node belongs. If no id is provided, the user’s organization is used. Default value is None

  • name (str, optional) – Name of the node. If no name is provided the server will generate one. Default value is None

Returns:

Containing the meta-data of the new node

Return type:

dict

delete(id_)#

Deletes a node

Parameters:

id_ (int) – Id of the node you want to delete

Returns:

Message from the server

Return type:

dict

get(id_)#

View specific node

Parameters:

id_ (int) – Id of the node you want to inspect

Returns:

Containing the node meta-data

Return type:

dict

kill_tasks(id_)#

Kill all tasks currently running on a node

Parameters:

id_ (int) – Id of the node whose tasks you want to kill

Returns:

Message from the server

Return type:

dict

list(name=None, organization=None, collaboration=None, is_online=None, ip=None, last_seen_from=None, last_seen_till=None, page=1, per_page=20, include_metadata=True)#

List nodes

Parameters:
  • name (str, optional) – Filter by name (with LIKE operator)

  • organization (int, optional) – Filter by organization id

  • collaboration (int, optional) – Filter by collaboration id

  • is_online (bool, optional) – Filter on whether nodes are online or not

  • ip (str, optional) – Filter by node VPN IP address

  • last_seen_from (str, optional) – Filter if node has been online since date (format: yyyy-mm-dd)

  • last_seen_till (str, optional) – Filter if node has been online until date (format: yyyy-mm-dd)

  • page (int, optional) – Pagination page, by default 1

  • per_page (int, optional) – Number of items on a single page, by default 20

  • include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Return type:

list[dict]

Returns:

  • list of dicts – Containing meta-data of the nodes
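The date filters expect strings in yyyy-mm-dd format, which is what `date.isoformat()` produces. The sketch below builds such a filter for "nodes seen in the last N days". The helper name `nodes_seen_since` is hypothetical, and `FakeNodeClient` is a stand-in that only records the filters it receives.

```python
from datetime import date, timedelta


def nodes_seen_since(node_client, days, today=None):
    """List nodes that have been online within the last `days` days."""
    today = today or date.today()
    since = (today - timedelta(days=days)).isoformat()  # yyyy-mm-dd
    return node_client.list(last_seen_from=since, include_metadata=False)


class FakeNodeClient:
    """Stand-in that records the filters instead of calling a server."""

    def list(self, **filters):
        self.filters = filters
        return []


fake = FakeNodeClient()
nodes = nodes_seen_since(fake, days=7, today=date(2024, 3, 10))
print(fake.filters["last_seen_from"])
```

Passing `today` explicitly keeps the helper deterministic, which is convenient when testing.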

update(id_, name=None, organization=None, collaboration=None)#

Update node information

Parameters:
  • id_ (int) – Id of the node you want to update

  • name (str, optional) – New node name, by default None

  • organization (int, optional) – Change the owning organization of the node, by default None

  • collaboration (int, optional) – Changes the collaboration to which the node belongs, by default None

Returns:

Containing the meta-data of the updated node

Return type:

dict

class Organization(parent)#

Bases: SubClient

Collection of organization requests

create(name, address1, address2, zipcode, country, domain, public_key=None)#

Create new organization

Parameters:
  • name (str) – Name of the organization

  • address1 (str) – Street and number

  • address2 (str) – City

  • zipcode (str) – Zip or postal code

  • country (str) – Country

  • domain (str) – Domain of the organization (e.g. vantage6.ai)

  • public_key (str, optional) – Public key of the organization. This can be set later, by default None

Returns:

Containing the information of the new organization

Return type:

dict

get(id_=None)#

View specific organization

Parameters:

id_ (int, optional) – Id of the organization you want to view. If no id is provided, your own organization is shown. By default None

Returns:

Containing the organization meta-data

Return type:

dict

list(name=None, country=None, collaboration=None, page=None, per_page=None, include_metadata=True)#

List organizations

Parameters:
  • name (str, optional) – Filter by name (with LIKE operator)

  • country (str, optional) – Filter by country

  • collaboration (int, optional) – Filter by collaboration id

  • page (int, optional) – Pagination page, by default 1

  • per_page (int, optional) – Number of items on a single page, by default 20

  • include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing meta-data information of the organizations

Return type:

list[dict]

update(id_=None, name=None, address1=None, address2=None, zipcode=None, country=None, domain=None, public_key=None)#

Update organization information

Parameters:
  • id_ (int, optional) – Organization id, by default None

  • name (str, optional) – New organization name, by default None

  • address1 (str, optional) – Address line 1, by default None

  • address2 (str, optional) – Address line 2, by default None

  • zipcode (str, optional) – Zipcode, by default None

  • country (str, optional) – Country, by default None

  • domain (str, optional) – Domain of the organization (e.g. iknl.nl), by default None

  • public_key (str, optional) – public key, by default None

Returns:

The meta-data of the updated organization

Return type:

dict

class Result(parent)#

Bases: SubClient

from_task(task_id, include_task=False)#

Get all results from a specific task

Parameters:
  • task_id (int) – Id of the task to get results from

  • include_task (bool, optional) – Whether to include the task or not, by default False

Returns:

Containing the results

Return type:

list[dict]

get(id_, include_task=False)#

View a specific result

Parameters:
  • id_ (int) – Id of the result you want to inspect

  • include_task (bool, optional) – Whether to include the task or not, by default False

Returns:

Containing the result data

Return type:

dict

list(task=None, organization=None, state=None, node=None, include_task=False, started=None, assigned=None, finished=None, port=None, page=None, per_page=None, include_metadata=True)#

List results

Parameters:
  • task (int, optional) – Filter by task id

  • organization (int, optional) – Filter by organization id

  • state (int, optional) – Filter by state (e.g. ‘open’)

  • node (int, optional) – Filter by node id

  • include_task (bool, optional) – Whether to include the task or not, by default False

  • started (tuple[str, str], optional) – Filter on a range of start times (format: yyyy-mm-dd)

  • assigned (tuple[str, str], optional) – Filter on a range of assign times (format: yyyy-mm-dd)

  • finished (tuple[str, str], optional) – Filter on a range of finished times (format: yyyy-mm-dd)

  • port (int, optional) – Port on which result was computed

  • page (int, optional) – Pagination page number, defaults to 1

  • per_page (int, optional) – Number of items per page, defaults to 20

  • include_metadata (bool, optional) – Whether to include pagination metadata, defaults to True

Returns:

If include_metadata is True, a dictionary is returned containing the key ‘data’ which contains a list of results, and a key ‘links’ which contains the pagination metadata. When include_metadata is False, the metadata wrapper is stripped and only a list of results is returned

Return type:

dict | list[dict]
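Since the return value is either a wrapper dictionary or a plain list, calling code may want to normalize it. A minimal sketch, where `unwrap` is a hypothetical helper and the item dictionaries are made-up sample data:

```python
def unwrap(response):
    """Return the list of items whether or not pagination metadata is included."""
    if isinstance(response, dict):
        return response["data"]  # strip the metadata wrapper
    return response              # already a plain list


with_metadata = {"data": [{"id": 1}, {"id": 2}], "links": {}}
without_metadata = [{"id": 1}, {"id": 2}]

items = unwrap(with_metadata)
print(items == unwrap(without_metadata))
```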

class Role(parent)#

Bases: SubClient

create(name, description, rules, organization=None)#

Register new role

Parameters:
  • name (str) – Role name

  • description (str) – Human readable description of the role

  • rules (list) – Rules that this role contains

  • organization (int, optional) – Organization to which this role belongs. If this is not provided, the user’s organization is used. By default None

Returns:

Containing meta-data of the new role

Return type:

dict

delete(role)#

Delete role

Parameters:

role (int) – CAUTION! Id of the role to be deleted. If you remove roles that are attached to you, you might lose access!

Returns:

Message from the server

Return type:

dict

get(id_)#

View specific role

Parameters:

id_ (int) – Id of the role you want to inspect

Returns:

Containing meta-data of the role

Return type:

dict

list(name=None, description=None, organization=None, rule=None, user=None, include_root=None, page=1, per_page=20, include_metadata=True)#

List of roles

Parameters:
  • name (str, optional) – Filter by name (with LIKE operator)

  • description (str, optional) – Filter by description (with LIKE operator)

  • organization (int, optional) – Filter by organization id

  • rule (int, optional) – Only show roles that contain this rule id

  • user (int, optional) – Only show roles that belong to a particular user id

  • include_root (bool, optional) – Include roles that are not assigned to any particular organization

  • page (int, optional) – Pagination page, by default 1

  • per_page (int, optional) – Number of items on a single page, by default 20

  • include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing roles meta-data

Return type:

list[dict]

update(role, name=None, description=None, rules=None)#

Update role

Parameters:
  • role (int) – Id of the role to be updated

  • name (str, optional) – New name of the role, by default None

  • description (str, optional) – New description of the role, by default None

  • rules (list, optional) – CAUTION! This will not add rules but replace them. If you remove rules from your own role you lose access. By default None

Returns:

Containing the updated role data

Return type:

dict
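Because `rules` replaces the existing rules instead of extending them, adding a rule safely means sending the union of the old and new rule ids. The sketch below assumes the caller already knows the role's current rule ids (they could, for instance, be collected by listing rules filtered by role id). `add_rules_to_role` is a hypothetical helper and `FakeRoleClient` is a stand-in that echoes what it receives.

```python
def add_rules_to_role(role_client, role_id, current_rule_ids, extra_rule_ids):
    """Extend a role's rules without dropping the ones it already has."""
    # update() replaces the rule list, so send the union of old and new ids.
    merged = sorted(set(current_rule_ids) | set(extra_rule_ids))
    return role_client.update(role=role_id, rules=merged)


class FakeRoleClient:
    """Stand-in with the documented update() signature."""

    def update(self, role, name=None, description=None, rules=None):
        return {"id": role, "rules": rules}


updated = add_rules_to_role(
    FakeRoleClient(), role_id=4, current_rule_ids=[1, 2, 5], extra_rule_ids=[2, 3]
)
print(updated["rules"])
```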

class Rule(parent)#

Bases: SubClient

get(id_)#

View specific rule

Parameters:

id (int) – Id of the rule you want to view

Returns:

Containing the information about this rule

Return type:

dict

list(name=None, operation=None, scope=None, role=None, page=1, per_page=20, include_metadata=True)#

List of all available rules

Parameters:
  • name (str, optional) – Filter by rule name

  • operation (str, optional) – Filter by operation

  • scope (str, optional) – Filter by scope

  • role (int, optional) – Only show rules that belong to this role id

  • page (int, optional) – Pagination page, by default 1

  • per_page (int, optional) – Number of items on a single page, by default 20

  • include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing all the rules from the vantage6 server

Return type:

list of dicts

class Task(parent)#

Bases: SubClient

create(collaboration, organizations, name, image, description, input, data_format='legacy', database='default')#

Create a new task

Parameters:
  • collaboration (int) – Id of the collaboration to which this task belongs

  • organizations (list) – Organization ids (within the collaboration) which need to execute this task

  • name (str) – Human readable name

  • image (str) – Docker image name which contains the algorithm

  • description (str) – Human readable description

  • input (dict) – Algorithm input

  • data_format (str, optional) – IO data format used, by default ‘legacy’

  • database (str, optional) – Database name to be used at the node, by default ‘default’

Returns:

Containing the meta-data of the new task

Return type:

dict

delete(id_)#

Delete a task

Also removes the related results.

Parameters:

id_ (int) – Id of the task to be removed

Returns:

Message from the server

Return type:

dict

get(id_, include_results=False)#

View specific task

Parameters:
  • id_ (int) – Id of the task you want to view

  • include_results (bool, optional) – Whether to include the results or not, by default False

Returns:

Containing the task data

Return type:

dict

kill(id_)#

Kill a task running on one or more nodes

Note that this does not remove the task from the database, but merely halts its execution (and prevents it from being restarted).

Parameters:

id_ (int) – Id of the task to be killed

Returns:

Message from the server

Return type:

dict

list(initiator=None, initiating_user=None, collaboration=None, image=None, parent=None, run=None, name=None, include_results=False, description=None, database=None, result=None, status=None, page=1, per_page=20, include_metadata=True)#

List tasks

Parameters:
  • name (str, optional) – Filter by the name of the task, matched with a LIKE operator. For example, E% matches task names that start with an ‘E’.

  • initiator (int, optional) – Filter by initiating organization

  • initiating_user (int, optional) – Filter by initiating user

  • collaboration (int, optional) – Filter by collaboration

  • image (str, optional) – Filter by Docker image name (with LIKE operator)

  • parent (int, optional) – Filter by parent task

  • run (int, optional) – Filter by run

  • include_results (bool, optional) – Whether to include the results in the tasks, by default False

  • description (str, optional) – Filter by description (with LIKE operator)

  • database (str, optional) – Filter by database (with LIKE operator)

  • result (int, optional) – Only show tasks that contain this result id

  • status (str, optional) – Filter by task status (e.g. ‘active’, ‘pending’, ‘completed’, ‘crashed’)

  • page (int, optional) – Pagination page, by default 1

  • per_page (int, optional) – Number of items on a single page, by default 20

  • include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Return type:

dict | list[dict]

Returns:

If include_metadata is True, a dictionary containing the key ‘data’, which holds the tasks, and the key ‘links’, which holds the pagination metadata. If include_metadata is False, the metadata wrapper is removed and the contents of the ‘data’ key are returned directly as a list.
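The `page`, `per_page` and `include_metadata` parameters shared by the list methods can be used to walk through all pages. The sketch below is one way to do that; it assumes that a page past the end comes back short or empty, which this reference does not confirm. `iter_all` is a hypothetical helper and `fake_list` is a stand-in for a real list method.

```python
def iter_all(list_fn, per_page=20, **filters):
    """Yield every item from a paginated list method, one page at a time."""
    page = 1
    while True:
        items = list_fn(page=page, per_page=per_page,
                        include_metadata=False, **filters)
        yield from items
        if len(items) < per_page:  # a short (or empty) page is the last one
            return
        page += 1


def fake_list(page, per_page, include_metadata, **filters):
    """Stand-in that serves five items in pages, like a list endpoint might."""
    data = [{"id": i} for i in range(1, 6)]
    start = (page - 1) * per_page
    return data[start:start + per_page]


tasks = list(iter_all(fake_list, per_page=2))
print([task["id"] for task in tasks])
```

With a real client, the bound list method would be passed in place of `fake_list`, together with any filters such as `status`.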

class User(parent)#

Bases: SubClient

create(username, firstname, lastname, password, email, organization=None, roles=[], rules=[])#

Create new user

Parameters:
  • username (str) – Used to log in to the service. This cannot be changed later.

  • firstname (str) – Firstname of the new user

  • lastname (str) – Lastname of the new user

  • password (str) – Password of the new user

  • email (str) – Email address of the new user

  • organization (int) – Organization id this user should belong to

  • roles (list of ints) – Role ids that are assigned to this user. Note that you can only assign roles if you own the rules within this role.

  • rules (list of ints) – Rule ids that are assigned to this user. Note that you can only assign rules that you own

Returns:

Containing data of the new user

Return type:

dict

get(id_=None)#

View user information

Parameters:

id_ (int, optional) – User id, by default None. When no id is provided, your own user information is displayed

Returns:

Containing user information

Return type:

dict

list(username=None, organization=None, firstname=None, lastname=None, email=None, role=None, rule=None, last_seen_from=None, last_seen_till=None, page=1, per_page=20, include_metadata=True)#

List users

Parameters:
  • username (str, optional) – Filter by username (with LIKE operator)

  • organization (int, optional) – Filter by organization id

  • firstname (str, optional) – Filter by firstname (with LIKE operator)

  • lastname (str, optional) – Filter by lastname (with LIKE operator)

  • email (str, optional) – Filter by email (with LIKE operator)

  • role (int, optional) – Show only users that have this role id

  • rule (int, optional) – Show only users that have this rule id

  • last_seen_from (str, optional) – Filter users that have logged on since (format yyyy-mm-dd)

  • last_seen_till (str, optional) – Filter users that have logged on until (format yyyy-mm-dd)

  • page (int, optional) – Pagination page, by default 1

  • per_page (int, optional) – Number of items on a single page, by default 20

  • include_metadata (bool, optional) – Whether to include the pagination metadata. If this is set to False the output is no longer wrapped in a dictionary, by default True

Returns:

Containing the meta-data of the users

Return type:

list of dicts

update(id_=None, firstname=None, lastname=None, organization=None, rules=None, roles=None, email=None)#

Update user details

If you do not supply a user id, your own user is updated.

Parameters:
  • id_ (int) – Id of the user you want to update

  • firstname (str) – Your first name

  • lastname (str) – Your last name

  • organization (int) – Organization id of the organization you want to be part of. This can only be done by super-users.

  • rules (list of ints) – USE WITH CAUTION! Rule ids that should be assigned to this user. All previous assigned rules will be removed!

  • roles (list of ints) – USE WITH CAUTION! Role ids that should be assigned to this user. All previous assigned roles will be removed!

  • email (str) – New email address of the user

Returns:

A dict containing the updated user data

Return type:

dict

class Util(parent)#

Bases: SubClient

Collection of general utilities

change_my_password(current_password, new_password)#

Change your own password by providing your current password

Parameters:
  • current_password (str) – Your current password

  • new_password (str) – Your new password

Returns:

Message from the server

Return type:

dict

generate_private_key(file_=None)#

Generate new private key

Parameters:

file_ (str, optional) – Path where to store the private key, by default None

Return type:

None

get_server_health()#

View the health of the vantage6-server

Returns:

Containing the server health information

Return type:

dict

get_server_version()#

View the version number of the vantage6-server

Returns:

A dict containing the version number

Return type:

dict

reset_my_password(email=None, username=None)#

Start reset password procedure

Either a username or an email address needs to be provided.

Parameters:
  • email (str, optional) – Email address of your account, by default None

  • username (str, optional) – Username of your account, by default None

Returns:

Message from the server

Return type:

dict

reset_two_factor_auth(password, email=None, username=None)#

Start reset procedure for two-factor authentication

The password and either a username or an email address must be provided.

Parameters:
  • password (str) – Password of your account

  • email (str, optional) – Email address of your account, by default None

  • username (str, optional) – Username of your account, by default None

Returns:

Message from the server

Return type:

dict

set_my_password(token, password)#

Set a new password using a recovery token

Token can be obtained through .reset_my_password(…)

Parameters:
  • token (str) – Token obtained from reset_my_password

  • password (str) – New password

Returns:

Message from the server

Return type:

dict
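The two password methods form a two-step recovery flow: request a token, then use it to set a new password. The sketch below assumes the token reaches the user outside of the return value (for example by email), which this reference does not specify. `recover_password` and `read_token` are hypothetical names, and `FakeUtil` is a stand-in that records the calls it receives.

```python
def recover_password(util, email, read_token, new_password):
    """Request a recovery token, then use it to set a new password."""
    util.reset_my_password(email=email)
    token = read_token()  # e.g. ask the user to paste the token they received
    return util.set_my_password(token, new_password)


class FakeUtil:
    """Stand-in with the documented method names and parameters."""

    def __init__(self):
        self.calls = []

    def reset_my_password(self, email=None, username=None):
        self.calls.append("reset_my_password")
        return {"msg": "recovery started"}

    def set_my_password(self, token, password):
        self.calls.append("set_my_password")
        return {"msg": "password updated"}


util = FakeUtil()
reply = recover_password(
    util, "alice@example.org", lambda: "token-from-user", "new-s3cret"
)
print(util.calls)
```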

set_two_factor_auth(token)#

Setup two-factor authentication using a recovery token after you have lost access.

Token can be obtained through .reset_two_factor_auth(…)

Parameters:

token (str) – Token obtained from reset_two_factor_auth

Returns:

Message from the server

Return type:

dict

authenticate(username, password, mfa_code=None)#

Authenticate as a user

It also collects some additional info about your user.

Parameters:
  • username (str) – Username used to authenticate

  • password (str) – Password used to authenticate

  • mfa_code (str | int, optional) – Six-digit two-factor authentication code

Return type:

None
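A typical session starts by authenticating and, where a private key is available, enabling encryption. The sketch below wraps those two documented calls in a hypothetical `login` helper. `FakeClient` is a stand-in so the sketch runs without a server; with the real package, the client would instead be constructed from the host, port and path arguments described for ClientBase.

```python
def login(client, username, password, mfa_code=None, private_key_file=None):
    """Authenticate and, if a key file is given, enable encryption."""
    client.authenticate(username, password, mfa_code=mfa_code)
    if private_key_file is not None:
        # setup_encryption requires an authenticated client, so it comes second.
        client.setup_encryption(private_key_file)
    return client


class FakeClient:
    """Stand-in that records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    def authenticate(self, username, password, mfa_code=None):
        self.calls.append(("authenticate", username))

    def setup_encryption(self, private_key_file):
        self.calls.append(("setup_encryption", private_key_file))


client = login(FakeClient(), "alice", "s3cret",
               private_key_file="/path/to/private_key.pem")
print(client.calls)
```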

wait_for_results(task_id, sleep=1)#

Polls the server to check when results are ready, and returns the results when the task is completed.

Parameters:
  • task_id (int) – ID of the task that you are waiting for

  • sleep (float) – Interval in seconds between checks if task is finished. Default 1.

Returns:

A dictionary with the results of the task, after it has completed.

Return type:

dict
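Creating a task and waiting for its results can be combined in one helper. This is a sketch under assumptions that this reference does not confirm: the Task sub client is reachable as `client.task`, and the meta-data returned by `create` contains the new task's `"id"`. The image name and input are made-up placeholders, and `FakeClient` is a stand-in so the sketch runs without a server.

```python
from types import SimpleNamespace


def run_and_wait(client, collaboration, organizations, image, input_,
                 name="example task", description=""):
    """Create a task and block until its results are available."""
    task = client.task.create(
        collaboration=collaboration,
        organizations=organizations,
        name=name,
        image=image,
        description=description,
        input=input_,
    )
    # Assumption: the task meta-data contains the new task's "id".
    return client.wait_for_results(task["id"], sleep=1)


class FakeClient:
    """Stand-in exposing only what the helper needs."""

    def __init__(self):
        self.task = SimpleNamespace(create=self._create)

    def _create(self, **kwargs):
        self.created = kwargs
        return {"id": 7}

    def wait_for_results(self, task_id, sleep=1):
        return {"task_id": task_id, "data": []}


client = FakeClient()
results = run_and_wait(
    client,
    collaboration=1,
    organizations=[2, 3],
    image="registry.example/algorithm:latest",  # hypothetical image name
    input_={"example": "input"},                # placeholder input
)
print(results)
```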

vantage6.client.utils#

class LogLevel(value)#

Enum for the different log levels

Variables:
  • DEBUG (str) – The debug log level

  • INFO (str) – The info log level

  • WARN (str) – The warn log level

  • ERROR (str) – The error log level

  • CRITICAL (str) – The critical log level

print_qr_code(json_data)#

Print the QR code for 2fa with additional info of how to use it.

This function should work in any terminal or Python scripting environment. Therefore, all is printed regardless of log level

Parameters:

json_data (dict) – A dictionary containing the secret and URI to generate the QR code

Return type:

None

show_qr_code_image(qr_uri)#

Print a QR code image to the user’s Python environment

Parameters:

qr_uri (str) – An OTP-auth URI used to generate the QR code

Return type:

None

7.4.2. Algorithm Client#

vantage6.client.algorithm_client#

class AlgorithmClient(token, *args, **kwargs)#

Bases: ClientBase

Interface to communicate between the algorithm container and the central server via a local proxy server.

An algorithm container cannot communicate directly with the central server as it has no internet connection. The algorithm can, however, talk to a local proxy server which has an interface to the central server. This way we make sure that the algorithm container does not share details with others, and we can also encrypt the results for a specific receiver. Thus, this is not an interface to the central server but to the local proxy server. However, the interface looks identical to make it easier to use.

Parameters:
  • token (str) – JWT (container) token, generated by the node the algorithm container runs on

  • *args – Arguments passed to the parent ClientBase class.

  • **kwargs – Arguments passed to the parent ClientBase class.

class Collaboration(parent)#

Bases: SubClient

Get information about the collaboration.

get()#

Get the collaboration data.

Returns:

Dictionary containing the collaboration data.

Return type:

dict

class Node(parent)#

Bases: SubClient

Get information about the node.

get()#

Get the node data.

Returns:

Dictionary containing data on the node this algorithm is running on.

Return type:

dict

class Organization(parent)#

Bases: SubClient

Get information about organizations in the collaboration.

get(id_)#

Get an organization by ID.

Parameters:

id_ (int) – ID of the organization to retrieve

Returns:

Dictionary containing the organization data.

Return type:

dict

list()#

Obtain all organizations in the collaboration.

The container runs in a Node which is part of a single collaboration. This method retrieves all organization data that are within that collaboration. This can be used to target specific organizations in a collaboration.

Returns:

List of organizations in the collaboration.

Return type:

list[dict]

class Result(parent)#

Bases: SubClient

Result client for the algorithm container.

This client is used to obtain results of tasks with the same run_id from the central server.

get(task_id)#

Obtain results from a specific task at the server.

Containers are allowed to obtain the results of their children (having the same run_id at the server). The permissions are checked at the central server.

Note that the returned results are not decrypted. The algorithm is responsible for decrypting the results.

Parameters:

task_id (int) – ID of the task from which you want to obtain the results

Returns:

List of results. The type of the results depends on the algorithm.

Return type:

list

class Task(parent)#

Bases: SubClient

A task client for the algorithm container.

It provides functions to get task information and create new tasks.

create(input_, organization_ids=None, name='subtask', description=None)#

Create a new (child) task at the central server.

Containers are allowed to create child tasks (having the same run_id) at the central server. The docker image must be the same as the docker image of this container itself.

Parameters:
  • input_ (bytes) – Input to the task. Should be b64 encoded.

  • organization_ids (list[int]) – List of organization IDs that should execute the task.

  • name (str, optional) – Name of the subtask

  • description (str, optional) – Description of the subtask

Returns:

Dictionary containing information on the created task

Return type:

dict
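As an illustration of how Organization.list() and Task.create() might be combined, here is a minimal sketch. It assumes that the sub-clients are reachable as the attributes organization and task on the client, and that each organization dictionary has an "id" key; neither is stated on this page. A stand-in client is used so that the sketch runs without a node or server.

```python
from types import SimpleNamespace


def start_subtask(client, input_, name="subtask"):
    """Create one child task that targets every organization in the collaboration."""
    organizations = client.organization.list()
    org_ids = [org["id"] for org in organizations]  # assumes an "id" key
    return client.task.create(input_=input_, organization_ids=org_ids, name=name)


# Stand-in client (hypothetical) that records which organizations were targeted.
created = []


def _fake_create(input_, organization_ids=None, name="subtask", description=None):
    created.append(organization_ids)
    return {"id": 1, "name": name}


stub_client = SimpleNamespace(
    organization=SimpleNamespace(list=lambda: [{"id": 10}, {"id": 11}]),
    task=SimpleNamespace(create=_fake_create),
)

task = start_subtask(stub_client, input_=b"encoded-input")
print(task)
```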

get(task_id)#

Retrieve a task at the central server.

Parameters:

task_id (int) – ID of the task to retrieve

Returns:

Dictionary containing the task information

Return type:

dict

class VPN(parent)#

Bases: SubClient

A VPN client for the algorithm container.

It provides functions to obtain the IP addresses of other containers.

get_addresses(only_children=False, only_parent=False, include_children=False, include_parent=False, label=None)#

Get information about the VPN IP addresses and ports of other algorithm containers involved in the current task. These addresses can be used to send VPN communication to.

Parameters:
  • only_children (bool, optional) – Only return the IP addresses of the children of the current task, by default False. Incompatible with only_parent.

  • only_parent (bool, optional) – Only return the IP address of the parent of the current task, by default False. Incompatible with only_children.

  • include_children (bool, optional) – Include the IP addresses of the children of the current task, by default False. Incompatible with only_parent, superseded by only_children.

  • include_parent (bool, optional) – Include the IP address of the parent of the current task, by default False. Incompatible with only_children, superseded by only_parent.

  • label (str, optional) – The label of the port you are interested in, which is set in the algorithm Dockerfile. If this parameter is set, only the ports with this label will be returned.

Returns:

List of dictionaries containing the IP address and port number, and other information to identify the containers. If obtaining the VPN addresses from the server fails, a dictionary with a ‘message’ key is returned instead.

Return type:

list[dict] | dict
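Because the return value is either a list of addresses or a dictionary with a ‘message’ key, calling code may want to separate the two cases explicitly. The helper below is a hypothetical sketch, and the "ip" and "port" keys in the example data are illustrative assumptions.

```python
def addresses_or_raise(response):
    """Return the address list, or raise if an error dictionary was returned."""
    if isinstance(response, dict) and "message" in response:
        raise RuntimeError(f"Could not obtain VPN addresses: {response['message']}")
    return list(response)


# Successful case: a list of address dictionaries is passed through.
ok = addresses_or_raise([{"ip": "10.76.0.2", "port": 5000}])

# Failure case: a dictionary with a 'message' key triggers an exception.
try:
    addresses_or_raise({"message": "VPN not available"})
    failed = False
except RuntimeError:
    failed = True
```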

get_child_addresses()#

Get the IP addresses and port numbers of the children of the current task.

Returns:

List of dictionaries containing the IP address and port number, and other information to identify the containers. If obtaining the VPN addresses from the server fails, a dictionary with a ‘message’ key is returned instead.

Return type:

list[dict] | dict

get_parent_address()#

Get the IP address and port number of the parent of the current task.

Returns:

Dictionary containing the IP address and port number, and other information to identify the containers. If obtaining the VPN addresses from the server fails, a dictionary with a ‘message’ key is returned instead.

Return type:

dict

request(*args, **kwargs)#

Make a request to the central server. This overrides the parent function so that containers will not try to refresh their token, which they would be unable to do.

Parameters:
  • *args – Arguments passed to the parent ClientBase.request function.

  • **kwargs – Arguments passed to the parent ClientBase.request function.

Returns:

Response from the central server.

Return type:

dict

7.4.3. Algorithm tooling#

vantage6.tools.wrapper#

This module contains algorithm wrappers. These wrappers are used to provide different data adapters to the algorithms. This way we only need to write the algorithm once and can use it with different data adapters.

Currently the following wrappers are available:
  • DockerWrapper (= CSVWrapper)

  • SparqlDockerWrapper

  • ParquetWrapper

  • SQLWrapper

  • OMOPWrapper

  • ExcelWrapper

When writing the Dockerfile for the algorithm, you can call the auto_wrapper, which will automatically select the correct wrapper based on the database type. The database type is set by the vantage6 node based on its configuration file.

For legacy reasons, the docker_wrapper, sparql_wrapper and parquet_wrapper are still available. These wrappers are deprecated and will be removed in the future.

The multidb_wrapper is used when multiple databases are connected to a single algorithm. This wrapper is separated from the other wrappers because it is not compatible with the auto_wrapper.

class CSVWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:
  • database_uri (str) – URI of the csv file, supplied by the node

  • input_data (dict) – Unused, as csv files do not require a query

Returns:

The data from the csv file

Return type:

pandas.DataFrame

CsvWrapper#

alias of CSVWrapper

DockerWrapper#

alias of CSVWrapper

class ExcelWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:
  • database_uri (str) – URI of the excel file, supplied by the node

  • input_data (dict) – May contain a ‘sheet_name’, which is passed to pandas.read_excel

Returns:

The data from the excel file

Return type:

pandas.DataFrame

class MultiDBWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Supply all URIs to the algorithm. Note that this does not load the data from the databases, but only passes on the URIs, so the algorithm needs to load the data itself.

Parameters:
  • database_uri (str) – Unused, as all database URIs are passed on to the algorithm.

  • input_data (dict) – Unused

Returns:

A dictionary with the database label as key and the URI as value

Return type:

dict
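Since this wrapper passes on only a label-to-URI dictionary, the algorithm has to open each database itself. The sketch below assumes, purely for illustration, that every URI is a path to a CSV file; the helper name and the labels are hypothetical.

```python
import csv
import os
import tempfile


def load_csv_databases(databases):
    """Read every CSV file in a {label: uri} mapping into a list of row dicts."""
    loaded = {}
    for label, uri in databases.items():
        with open(uri, newline="") as handle:
            loaded[label] = list(csv.DictReader(handle))
    return loaded


# Build two small CSV files that stand in for the databases of a node.
tmp_dir = tempfile.mkdtemp()
uris = {}
for label, content in {"default": "age\n31\n45\n", "extra": "age\n62\n"}.items():
    path = os.path.join(tmp_dir, f"{label}.csv")
    with open(path, "w", newline="") as handle:
        handle.write(content)
    uris[label] = path

data = load_csv_databases(uris)
print(data)
```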

class OMOPWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:
  • database_uri (str) – URI of the OMOP database, supplied by the node

  • input_data (dict) – Contains a JSON cohort definition from the ATLAS tool, to retrieve the data from the database

Returns:

The data from the database

Return type:

pandas.DataFrame

class ParquetWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:
  • database_uri (str) – URI of the parquet file, supplied by the node

  • input_data (dict) – Unused, as no additional settings are required

Returns:

The data from the parquet file

Return type:

pandas.DataFrame

class SQLWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:
  • database_uri (str) – URI of the sql database, supplied by the node

  • input_data (dict) – Contains a ‘query’, to retrieve the data from the database

Returns:

The data from the database

Return type:

pandas.DataFrame

class SparqlDockerWrapper#

Bases: WrapperBase

static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:
  • database_uri (str) – URI of the triplestore, supplied by the node

  • input_data (dict) – Can contain a ‘query’, to retrieve the data from the triplestore

Returns:

The data from the triplestore

Return type:

pandas.DataFrame

class WrapperBase#

Bases: ABC

abstract static load_data(database_uri, input_data)#

Load the local privacy-sensitive data from the database.

Parameters:
  • database_uri (str) – URI of the database to read

  • input_data (dict) – User defined input, which may contain a query for the database

wrap_algorithm(module, load_data=True, use_new_client=False, log_traceback=False)#

Wrap an algorithm module to provide input and output handling for the vantage6 infrastructure.

Data is received in the form of files, whose location should be specified in the following environment variables:

  • INPUT_FILE: input arguments for the algorithm

  • OUTPUT_FILE: location where the results of the algorithm should be stored

  • TOKEN_FILE: access token for the vantage6 server REST api

  • DATABASE_URI: either a database endpoint or path to a csv file.

The wrapper is able to parse a number of input file formats. The available formats can be found in vantage6.tools.data_format.DataFormat. When the input is not pickle (legacy), the format should be specified in the first bytes of the input file, followed by a ‘.’.

It is also possible to specify the desired output format. This is done by including the parameter ‘output_format’ in the input parameters. Again, the list of possible output formats can be found in vantage6.tools.data_format.DataFormat.

It is still possible that output serialization will fail even if the specified format is listed in the DataFormat enum. Algorithms can in principle return any python object, but not every serialization format will support arbitrary python objects. When dealing with unsupported algorithm output, the user should use ‘pickle’ as output format, which is the default.

The other serialization formats support the following algorithm output:

  • built-in primitives (int, float, str, etc.)

  • built-in collections (list, dict, tuple, etc.)

  • pandas DataFrames

Parameters:
  • module (str) – Python module name of the algorithm to wrap.

  • load_data (bool, optional) – Whether to load the data into a pandas DataFrame or not, by default True

  • use_new_client (bool) – Whether to use the new AlgorithmClient or the old ContainerClient, by default False

  • log_traceback (bool) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None
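The environment variables listed for wrap_algorithm could be collected as in the following sketch. The helper read_wrapper_environment and the example paths are hypothetical; only the four variable names come from the description above.

```python
import os


def read_wrapper_environment(environ=os.environ):
    """Collect the locations that the wrapper expects to find in the environment."""
    names = ("INPUT_FILE", "OUTPUT_FILE", "TOKEN_FILE", "DATABASE_URI")
    missing = [name for name in names if name not in environ]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}


# Example values (assumptions); a node would normally set these for the container.
example_env = {
    "INPUT_FILE": "/mnt/data/input",
    "OUTPUT_FILE": "/mnt/data/output",
    "TOKEN_FILE": "/mnt/data/token",
    "DATABASE_URI": "/mnt/data/default.csv",
}
settings = read_wrapper_environment(example_env)
print(settings["DATABASE_URI"])
```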

auto_wrapper(module, load_data=True, use_new_client=False, log_traceback=False)#

Wrap an algorithm module to provide input and output handling for the vantage6 infrastructure. This function will automatically select the correct wrapper based on the database type.

Parameters:
  • module (str) – Python module name of the algorithm to wrap.

  • load_data (bool, optional) – Whether to load the data or not, by default True

  • use_new_client (bool, optional) – Whether to use the new client or not, by default False

  • log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

docker_wrapper(module, load_data=True, use_new_client=False, log_traceback=False)#

Specific wrapper for CSV only data sources. Use the auto_wrapper to automatically select the correct wrapper based on the database type.

Parameters:
  • module (str) – Module name of the algorithm package.

  • load_data (bool, optional) – Whether to load the data into a pandas DataFrame or not, by default True

  • use_new_client (bool, optional) – Whether to use the new or old client, by default False

  • log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

load_input(input_file)#

Try to read the specified data format and deserialize the rest of the stream accordingly. If this fails, assume the data format is pickle.

Parameters:

input_file (str) – Path to the input file

Returns:

Deserialized input data

Return type:

Any

Raises:

DeserializationException – Failed to deserialize input data
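The convention of a format name followed by a ‘.’ at the start of the input, with pickle as the fallback, can be sketched as follows. This is a simplified illustration that only recognises a json prefix; read_input is a hypothetical helper, not the library function, and the real set of formats is defined by DataFormat.

```python
import json
import pickle


def read_input(raw: bytes):
    """Deserialize `raw`, using the bytes before the first '.' as the format name."""
    prefix, separator, payload = raw.partition(b".")
    if separator and prefix == b"json":
        return json.loads(payload.decode("utf-8"))
    # No recognised prefix: treat the whole stream as legacy pickle.
    # Only unpickle input that comes from a trusted source.
    return pickle.loads(raw)


json_input = b"json." + json.dumps({"method": "average"}).encode("utf-8")
legacy_input = pickle.dumps({"method": "average"})

parsed_json = read_input(json_input)
parsed_legacy = read_input(legacy_input)
```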

multidb_wrapper(module, use_new_client=False, log_traceback=False)#

Specific wrapper for multiple data sources.

Parameters:
  • module (str) – Module name of the algorithm package.

  • use_new_client (bool, optional) – Whether to use the new or old client, by default False

  • log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

parquet_wrapper(module, use_new_client=False, log_traceback=False)#

Specific wrapper for Parquet only data sources. Use the auto_wrapper to automatically select the correct wrapper based on the database type.

Parameters:
  • module (str) – Module name of the algorithm package.

  • use_new_client (bool, optional) – Whether to use the new or old client, by default False

  • log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

select_wrapper(database_type)#

Select the correct wrapper based on the database type.

Parameters:

database_type (str) – The database type to select the wrapper for.

Returns:

The wrapper for the specified database type.

Return type:

derivative of WrapperBase
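The relationship between a database type and a WrapperBase derivative can be pictured as a lookup table. The sketch below uses stand-in classes and a hypothetical mapping; the type strings that real nodes use, and the way the library performs the lookup, may differ.

```python
from abc import ABC, abstractmethod


class DemoWrapperBase(ABC):
    """Stand-in for the documented base class with its abstract static method."""

    @staticmethod
    @abstractmethod
    def load_data(database_uri, input_data):
        """Load the data behind `database_uri`."""


class DemoCSVWrapper(DemoWrapperBase):
    @staticmethod
    def load_data(database_uri, input_data):
        # CSV files need no query, so input_data is ignored.
        return f"rows read from {database_uri}"


class DemoSQLWrapper(DemoWrapperBase):
    @staticmethod
    def load_data(database_uri, input_data):
        return f"rows returned by {input_data['query']!r}"


# Hypothetical mapping from database type to wrapper class.
WRAPPERS = {"csv": DemoCSVWrapper, "sql": DemoSQLWrapper}


def pick_wrapper(database_type):
    """Return a wrapper instance for the given database type."""
    try:
        return WRAPPERS[database_type.lower()]()
    except KeyError:
        raise ValueError(f"Unknown database type: {database_type}") from None


wrapper = pick_wrapper("sql")
message = wrapper.load_data("sqlite:///demo.db", {"query": "SELECT 1"})
```

Keeping load_data static means a wrapper carries no state of its own, so selecting a wrapper is only a matter of choosing the right class.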

sparql_wrapper(module, use_new_client=False, log_traceback=False)#

Specific wrapper for SPARQL only data sources. Use the auto_wrapper to automatically select the correct wrapper based on the database type.

Parameters:
  • module (str) – Module name of the algorithm package.

  • use_new_client (bool, optional) – Whether to use the new or old client, by default False

  • log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Return type:

None

write_output(output_format, output, output_file)#

Write output to output_file using the format from output_format.

If output_format == None, write output as pickle without indicating format (legacy method)

Parameters:
  • output_format (str) – Data type of the output e.g. ‘pickle’, ‘json’, ‘csv’, ‘parquet’

  • output (Any) – Output of the algorithm, could be any type

  • output_file (str) – Path to the output file

Return type:

None
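A minimal sketch of the two output paths follows: legacy pickle when no format is given, and otherwise a payload that indicates its format. It assumes that the format is indicated by the same name-plus-‘.’ prefix described for input files, which this page does not confirm for output. The helper write_result is hypothetical and writes to a binary stream rather than to a file path.

```python
import io
import json
import pickle


def write_result(output_format, output, stream):
    """Serialize `output` to a binary stream in the requested format."""
    if output_format is None:
        # Legacy method: plain pickle, without a format indicator.
        stream.write(pickle.dumps(output))
    elif output_format == "json":
        # Assumed convention: format name, a '.', then the payload.
        stream.write(b"json." + json.dumps(output).encode("utf-8"))
    else:
        raise ValueError(f"Unsupported output format: {output_format}")


json_stream = io.BytesIO()
write_result("json", {"mean": 4.5}, json_stream)

legacy_stream = io.BytesIO()
write_result(None, {"mean": 4.5}, legacy_stream)
```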

vantage6.tools.mock_client#

class ClientMockProtocol(datasets, module)#

The ClientMockProtocol is used to test your algorithm locally. It mimics the behaviour of the client and its communication with the server.

Parameters:
  • datasets (list[str]) – A list of paths to the datasets that are used in the algorithm.

  • module (str) – The name of the module that contains the algorithm.

create_new_task(input_, organization_ids=None)#

Create a new task with the MockProtocol and return the task id.

Parameters:
  • input_ (dict) – The input data that is passed to the algorithm. This should at least contain the key ‘method’ which is the name of the method that should be called. Another often used key is ‘master’ which indicates that this container is a master container. Other keys depend on the algorithm.

  • organization_ids (list[int], optional) – A list of organization ids that should run the algorithm.

Returns:

The id of the task.

Return type:

int

get_organizations_in_my_collaboration()#

Get mocked organizations.

Returns:

A list of mocked organizations.

Return type:

list[dict]

get_results(task_id)#

Return the results of the task with the given id.

Parameters:

task_id (int) – The id of the task.

Returns:

The results of the task.

Return type:

list[dict]

get_task(task_id)#

Return the task with the given id.

Parameters:

task_id (int) – The id of the task.

Returns:

The task details.

Return type:

dict

class MockAlgorithmClient(datasets, module, node_id=None, collaboration_id=None, organization_id=None)#

The MockAlgorithmClient mimics the behaviour of the AlgorithmClient. It can be used to mock the behaviour of the AlgorithmClient and its communication with the server.

Parameters:
  • datasets (list[dict]) –

    A list of dictionaries that contain the datasets that are used in the mocked algorithm. Each dictionary should have the form {"database": str | pd.DataFrame, "type": str, "input_data": dict}, where database is the path/URI to the database, type is the database type (as listed in the node configuration) and input_data contains the input data that is normally passed to the algorithm wrapper.

    Note that if the database is a pandas DataFrame, the type and input_data keys are not required.

  • module (str) – The name of the module that contains the algorithm.

  • node_id (int, optional) – Sets the mocked node id to this value. Defaults to 1.

  • collaboration_id (int, optional) – Sets the mocked collaboration id to this value. Defaults to 1.

  • organization_id (int, optional) – Sets the mocked organization id to this value. Defaults to 1.
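The expected shape of the datasets argument can be illustrated with a small check. The helper check_datasets, the file paths and the "csv" type string are assumptions made for this sketch; the rule that type and input_data are only needed for path/URI databases follows the note above.

```python
def check_datasets(datasets):
    """Verify that every dataset entry has the keys described for `datasets`."""
    for index, entry in enumerate(datasets):
        if "database" not in entry:
            raise ValueError(f"datasets[{index}] is missing 'database'")
        if isinstance(entry["database"], str):
            # A path/URI needs a database type and the wrapper input.
            for key in ("type", "input_data"):
                if key not in entry:
                    raise ValueError(f"datasets[{index}] is missing {key!r}")
    return True


# Hypothetical paths; each entry mocks the database of one node.
datasets = [
    {"database": "./data/hospital_a.csv", "type": "csv", "input_data": {}},
    {"database": "./data/hospital_b.csv", "type": "csv", "input_data": {}},
]
valid = check_datasets(datasets)
```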

class Collaboration(parent)#

Collaboration subclient for the MockAlgorithmClient

get(is_encrypted=True)#

Get mocked collaboration

Parameters:

is_encrypted (bool) – Whether the collaboration is encrypted or not. Default True.

Returns:

A mocked collaboration.

Return type:

dict

class Node(parent)#

Node subclient for the MockAlgorithmClient

get(is_online=True)#

Get mocked node

Parameters:

is_online (bool) – Whether the node is online or not. Default True.

Returns:

A mocked node.

Return type:

dict

class Organization(parent)#

Organization subclient for the MockAlgorithmClient

get(id_)#

Get mocked organization by ID

Parameters:

id_ (int) – The id of the organization.

Returns:

A mocked organization.

Return type:

dict

list()#

Get mocked organizations in the collaboration.

Returns:

A list of mocked organizations in the collaboration.

Return type:

list[dict]

class Result(parent)#

Result subclient for the MockAlgorithmClient

get(task_id)#

Return the results of the task with the given id.

Parameters:

task_id (int) – The id of the task.

Returns:

The results of the task.

Return type:

list[dict]

class SubClient(parent)#

Create sub groups of commands using this SubClient

Parameters:

parent (MockAlgorithmClient) – The parent client

class Task(parent)#

Task subclient for the MockAlgorithmClient

create(input_, organization_ids, name='mock', description='mock', *args, **kwargs)#

Create a new task with the MockProtocol and return the task id.

Parameters:
  • input_ (dict) – The input data that is passed to the algorithm. This should at least contain the key ‘method’ which is the name of the method that should be called. Another often used key is ‘master’ which indicates that this container is a master container. Other keys depend on the algorithm.

  • organization_ids (list[int]) – A list of organization ids that should run the algorithm.

  • name (str, optional) – The name of the task, by default “mock”

  • description (str, optional) – The description of the task, by default “mock”

Returns:

The id of the task.

Return type:

int

get(task_id)#

Return the task with the given id.

Parameters:

task_id (int) – The id of the task.

Returns:

The task details.

Return type:

dict

vantage6.tools.dispatch_rpc#

dispatch_rpc(data, input_data, module, token, use_new_client=False, log_traceback=False)#

Load the algorithm module and call the correct method to run an algorithm.

Parameters:
  • data (Any) – The data that is passed to the algorithm.

  • input_data (dict) – The input data that is passed to the algorithm. This should at least contain the key ‘method’ which is the name of the method that should be called. Another often used key is ‘master’ which indicates that this container is a master container. Other keys depend on the algorithm.

  • module (str) – The name of the module that contains the algorithm.

  • token (str) – The JWT token that is used to authenticate from the algorithm container to the server.

  • use_new_client (bool, optional) – Whether to use the new client or the old client, by default False

  • log_traceback (bool, optional) – Whether to print the full error message from algorithms or not, by default False. Algorithm developers should only use this option if they are sure that the error message does not contain any sensitive information.

Returns:

The result of the algorithm.

Return type:

Any
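The idea of looking up the method named in input_data and calling it with the data can be sketched as below. This is not the library implementation: the "kwargs" key and the helper call_method are assumptions, and the real dispatcher may, for example, apply naming conventions or pass a client for master containers.

```python
from types import ModuleType


def call_method(module, input_data, data):
    """Call the function named by input_data['method'] with the loaded data."""
    method_name = input_data["method"]
    method = getattr(module, method_name, None)
    if not callable(method):
        raise AttributeError(f"Module has no callable named {method_name!r}")
    # "kwargs" is an assumed key that holds keyword arguments for the method.
    return method(data, **input_data.get("kwargs", {}))


def column_sum(data, column):
    """Toy algorithm method: add up one column of the local data."""
    return sum(row[column] for row in data)


# Stand-in algorithm module that contains the toy method.
demo_algorithm = ModuleType("demo_algorithm")
demo_algorithm.column_sum = column_sum

result = call_method(
    demo_algorithm,
    {"method": "column_sum", "kwargs": {"column": "age"}},
    [{"age": 31}, {"age": 45}],
)
```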

vantage6.tools.util#

error(msg)#

Print an error message to stdout.

Parameters:

msg (str) – Error message to be printed

Return type:

None

info(msg)#

Print an info message to stdout.

Parameters:

msg (str) – Message to be printed

Return type:

None

warn(msg)#

Print a warning message to stdout.

Parameters:

msg (str) – Warning message to be printed

Return type:

None

7.4.4. Custom exceptions#

vantage6.client.exceptions#

exception DeserializationException#

Exception raised when deserialization of algorithm input or result fails.