Dataset Version Representation

DatasetVersionRepresentation

Represents the metadata of a Vectice dataset version.

A DatasetVersionRepresentation exposes the information recorded in the Vectice app for one specific version of a dataset, so that it can be retrieved and read through the API.

Hint

A dataset version ID starts with 'DTV-' (shown here as 'DTV-XXX'). Retrieve the ID in the Vectice app, then pass it to one of the following methods to get the dataset version: connect.dataset_version('DTV-XXX') or connect.browse('DTV-XXX') (see the Connection page).
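As a minimal sketch of the hint above, the helper below checks the ID prefix before asking for the dataset version. The `connection` argument stands for the object the hint calls `connect`; the helper name `fetch_dataset_version` is hypothetical and not part of the Vectice API, and the 'DTV-' prefix check assumes that 'XXX' in the hint is only a placeholder.

```python
def fetch_dataset_version(connection, dataset_version_id):
    """Fetch a dataset version by ID, failing early on a malformed ID.

    `connection` is assumed to be the connection object referred to as
    `connect` in the hint; this helper is illustrative only.
    """
    if not dataset_version_id.startswith("DTV-"):
        raise ValueError(
            f"Expected an ID starting with 'DTV-', got {dataset_version_id!r}"
        )
    # Documented call: returns the dataset version for this ID.
    return connection.dataset_version(dataset_version_id)
```

Checking the prefix locally avoids a round trip to the server when an ID of another asset type is pasted by mistake.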

Attributes:

Name Type Description
id str

The unique identifier of the dataset version.

project_id str

The identifier of the project to which the dataset version belongs.

name str

The name of the dataset version. For dataset versions it corresponds to the version number.

description str

The description of the dataset version.

properties List[Dict[str, Any]]

The properties associated with the dataset version.

resources List[Dict[str, Any]]

A summary of the resources in the dataset version, giving the type, the number of files, and the aggregated total number of columns for each resource.

iteration_origin str | None

The iteration in which this dataset version was created.

phase_origin str | None

The phase in which this dataset version was created.

dataset_representation DatasetRepresentation

Holds information about the source dataset linked to the dataset version, where all versions are grouped together.

creator Dict[str, str]

Creator of the dataset version.

asdict

asdict()

Transforms the DatasetVersionRepresentation into an organised dictionary.

Returns:

Type Description
Dict[str, Any]

The object represented as a dictionary
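One possible use of `asdict()` is to log or store a snapshot of the metadata. The sketch below serializes the dictionary to JSON; the helper name `dataset_version_to_json` is hypothetical, and `default=str` is a defensive assumption in case some values are not natively JSON-serializable.

```python
import json


def dataset_version_to_json(dataset_version):
    """Serialize the dictionary form of a dataset version to a JSON string."""
    data = dataset_version.asdict()
    # default=str renders any value json cannot encode (an assumption about
    # the dictionary's contents, which this page does not detail).
    return json.dumps(data, indent=2, sort_keys=True, default=str)
```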

download_attachments

download_attachments(attachments=None, output_dir=None)

Downloads a list of attachments associated with the current dataset version.

Parameters:

Name Type Description Default
attachments list[str] | str | None

A list of attachment file names or a single attachment file name to be downloaded. If None, all attachments will be downloaded.

None
output_dir str | None

The directory path where the attachments will be saved. If None, the current working directory is used.

None

Returns:

Type Description
None

None
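A minimal sketch of downloading attachments into a dedicated folder follows. The helper name `download_attachments_to` is hypothetical. Whether `download_attachments` creates a missing directory is not stated here, so the sketch creates it beforehand as a precaution.

```python
from pathlib import Path


def download_attachments_to(dataset_version, output_dir, attachments=None):
    """Download attachments into `output_dir`, creating it if needed.

    `attachments` may be a file name, a list of file names, or None
    (None means all attachments, per the parameter description).
    """
    target = Path(output_dir)
    # Precaution: make sure the destination exists before downloading.
    target.mkdir(parents=True, exist_ok=True)
    dataset_version.download_attachments(
        attachments=attachments, output_dir=str(target)
    )
    return target
```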

get_table

get_table(table)

Retrieves a table associated with the current dataset version.

Parameters:

Name Type Description Default
table str

The name of the table.

required

Returns:

Type Description
DataFrame

The data from the specified table as a DataFrame.
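When a dataset version holds several tables, they can be loaded in one pass. The sketch below is an illustration only: the helper name `load_tables` is hypothetical, and how `get_table` reports an unknown table name is not documented on this page, so no error handling is attempted.

```python
def load_tables(dataset_version, table_names):
    """Return a dict mapping each table name to the data from get_table."""
    tables = {}
    for name in table_names:
        # Documented call: returns the table's data as a DataFrame.
        tables[name] = dataset_version.get_table(name)
    return tables
```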

has_column

has_column(column)

Checks whether this dataset version has a column whose name matches the column parameter.

Parameters:

Name Type Description Default
column str

The name of the column to search for.

required

Returns:

Type Description
bool

Whether the dataset version has a column matching that name or not.
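A common use of `has_column` is a schema check before training or reporting. The sketch below collects the required columns that are absent; the helper name `missing_columns` is hypothetical and not part of the Vectice API.

```python
def missing_columns(dataset_version, required_columns):
    """Return the required column names that the dataset version lacks."""
    return [
        column
        for column in required_columns
        if not dataset_version.has_column(column)
    ]
```

An empty list means every required column was found.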

list_attachments

list_attachments()

Retrieves the attachments associated with the current dataset version and prints them in a table format.

Returns:

Type Description
list[Attachment]

A list of Attachment instances corresponding to the dataset version.

list_lineage_children

list_lineage_children()

Retrieves all the lineage children of the current dataset version.

Returns:

Type Description
list[DatasetVersionRepresentation | ModelVersionRepresentation]

The list of dataset versions or model versions in which the current dataset version is used as input.

list_lineage_inputs

list_lineage_inputs()

Retrieves all the lineage inputs of the current dataset version.

Returns:

Type Description
list[DatasetVersionRepresentation]

The list of dataset versions used as inputs of the current dataset version.
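Because each lineage input is itself a DatasetVersionRepresentation, the full upstream lineage can be walked by calling `list_lineage_inputs()` repeatedly. The breadth-first sketch below relies only on the documented `id` attribute and `list_lineage_inputs()` method; the helper name `collect_upstream_ids` is hypothetical.

```python
from collections import deque


def collect_upstream_ids(dataset_version):
    """Return the IDs of all upstream dataset versions, nearest first."""
    seen = {dataset_version.id}
    ordered = []
    queue = deque([dataset_version])
    while queue:
        current = queue.popleft()
        for parent in current.list_lineage_inputs():
            # Skip versions already visited via another path.
            if parent.id not in seen:
                seen.add(parent.id)
                ordered.append(parent.id)
                queue.append(parent)
    return ordered
```

The `seen` set ensures that a version reachable through several paths is reported once.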

list_resources_items

list_resources_items()

Lists all the resource items of the DatasetVersionRepresentation.

Returns:

Type Description
list[ResourceItemRepresentation]

The list of resource items.

properties_as_dataframe

properties_as_dataframe()

Transforms the properties of the DatasetVersionRepresentation into a DataFrame for better readability.

Returns:

Type Description
DataFrame

A pandas DataFrame containing the properties of the dataset version.

resources_as_dataframe

resources_as_dataframe()

Transforms the resources of the DatasetVersionRepresentation into a DataFrame for better readability.

Returns:

Type Description
DataFrame

A pandas DataFrame containing the resources of the dataset version.

update

update(
    properties=None,
    attachments=None,
    columns_description=None,
)

Updates the dataset version through the API.

Parameters:

Name Type Description Default
properties dict[str, str | int] | list[Property] | Property | None

The new properties of the dataset version.

None
attachments TAttachment | None

The new attachments of the dataset version.

None
columns_description dict[str, str] | str | None

A dictionary, or the path to a CSV file, that maps each column name to a description. It should follow the format { "column_name": "Description", ... }

None

Returns:

Type Description
None

None
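A minimal sketch of an `update` call follows, using the dictionary forms of `properties` and `columns_description` described above. The helper name `annotate_dataset_version` and the property key `row_count` are hypothetical, chosen only for illustration.

```python
def annotate_dataset_version(dataset_version, row_count, column_notes):
    """Attach a row-count property and column descriptions to a version.

    `column_notes` maps column names to descriptions, for example
    {"age": "Age in years"}.
    """
    # "row_count" is an illustrative key, not a name required by Vectice.
    properties = {"row_count": int(row_count)}
    dataset_version.update(
        properties=properties,
        columns_description=dict(column_notes),
    )
```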