Dataset Version Representation
DatasetVersionRepresentation
Represents the metadata of a Vectice dataset version.
A Dataset Version Representation shows information about a specific version of a dataset from the Vectice app. It makes it easier to get and read this information through the API.
Hint
A dataset version ID has the form 'DTV-XXX'. Retrieve the ID in the Vectice App, then pass it to either of the following methods to get the dataset version:
connect.dataset_version('DTV-XXX')
or connect.browse('DTV-XXX')
(see the Connection page).
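As a rough sketch of the hint above, the helper below fetches a dataset version by ID and collects a few of its documented attributes. It assumes `connect` is an already-authenticated Vectice connection object (see the Connection page). `summarize_dataset_version` is a hypothetical helper name, not part of the API, and 'DTV-XXX' is a placeholder ID.

```python
def summarize_dataset_version(connect, dataset_version_id):
    """Fetch a dataset version and return a few of its documented attributes.

    `connect` is assumed to be an authenticated Vectice connection object.
    """
    dataset_version = connect.dataset_version(dataset_version_id)
    return {
        "id": dataset_version.id,
        "name": dataset_version.name,
        "description": dataset_version.description,
        "project_id": dataset_version.project_id,
    }
```

A call such as `summarize_dataset_version(connect, 'DTV-XXX')` would then return a plain dictionary that is easy to print or log.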
Attributes:
Name | Type | Description |
---|---|---|
id | `str` | The unique identifier of the dataset version. |
project_id | `str` | The identifier of the project to which the dataset version belongs. |
name | `str` | The name of the dataset version. For dataset versions, it corresponds to the version number. |
description | `str` | The description of the dataset version. |
properties | `List[Dict[str, Any]]` | The properties associated with the dataset version. |
resources | `List[Dict[str, Any]]` | A summary of the resources, giving the type, number of files, and aggregated total number of columns for each resource in the dataset version. |
iteration_origin | `str \| None` | The iteration in which this dataset version was created. |
phase_origin | `str \| None` | The phase in which this dataset version was created. |
dataset_representation | `DatasetRepresentation` | Holds information about the source dataset linked to the dataset version, where all versions are grouped together. |
creator | `Dict[str, str]` | Creator of the dataset version. |
asdict
asdict()
download_attachments
download_attachments(attachments=None, output_dir=None)
Downloads a list of attachments associated with the current dataset version.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
attachments | `list[str] \| str \| None` | A list of attachment file names or a single attachment file name to be downloaded. If None, all attachments will be downloaded. | None |
output_dir | `str \| None` | The directory path where the attachments will be saved. If None, the current working directory is used. | None |
Returns:
Type | Description |
---|---|
`None` | None |
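The parameters above can be sketched in use as follows. This is a minimal example, assuming `dataset_version` is a DatasetVersionRepresentation obtained from a connection. `download_attachments_to` is a hypothetical helper name. Creating the target directory beforehand is a precaution taken by this sketch, since the documentation does not say whether the method creates it.

```python
from pathlib import Path


def download_attachments_to(dataset_version, directory, attachments=None):
    """Download attachments of a dataset version into `directory`.

    `attachments` may be a file name, a list of file names, or None
    (None downloads every attachment, per the parameter table).
    """
    target = Path(directory)
    # Precaution: make sure the directory exists before downloading.
    target.mkdir(parents=True, exist_ok=True)
    dataset_version.download_attachments(
        attachments=attachments,
        output_dir=str(target),
    )
    return target
```

Passing `attachments=None` keeps the documented default of downloading everything, while a single name or a list narrows the download.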
get_table
get_table(table)
Retrieves a table associated with the current dataset version.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table | `str` | The name of the table. | required |
Returns:
Type | Description |
---|---|
`DataFrame` | The data from the specified table as a DataFrame. |
has_column
has_column(column)
list_attachments
list_attachments()
list_lineage_children
list_lineage_children()
Retrieves all the lineage children of the current dataset version.
Returns:
Type | Description |
---|---|
`list[DatasetVersionRepresentation \| ModelVersionRepresentation]` | The list of dataset versions or model versions that use the current dataset version as input. |
list_lineage_inputs
list_lineage_inputs()
Retrieves all the lineage inputs of the current dataset version.
Returns:
Type | Description |
---|---|
`list[DatasetVersionRepresentation]` | The list of dataset versions used as inputs to the current dataset version. |
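Because `list_lineage_inputs()` returns dataset versions that expose the same method, the full upstream lineage can be gathered by walking the inputs repeatedly. The following is a minimal sketch under that assumption; `collect_upstream_ids` is a hypothetical helper name, and each call to `list_lineage_inputs()` is assumed to trigger an API request, so the walk may be slow on deep lineages.

```python
def collect_upstream_ids(dataset_version):
    """Return the IDs of every dataset version upstream of `dataset_version`.

    Walks list_lineage_inputs() iteratively and skips versions already
    visited, so shared ancestors are reported only once.
    """
    seen = set()
    ordered_ids = []
    stack = list(dataset_version.list_lineage_inputs())
    while stack:
        current = stack.pop()
        if current.id in seen:
            continue
        seen.add(current.id)
        ordered_ids.append(current.id)
        stack.extend(current.list_lineage_inputs())
    return ordered_ids
```

The visited set also guards against revisiting a version that is reachable through more than one path.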
list_resources_items
list_resources_items()
Lists all the resource items of the DatasetVersionRepresentation.
Returns:
Type | Description |
---|---|
`list[ResourceItemRepresentation]` | The list of resource items. |
properties_as_dataframe
properties_as_dataframe()
Transforms the properties of the DatasetVersionRepresentation into a DataFrame for better readability.
Returns:
Type | Description |
---|---|
`DataFrame` | A pandas DataFrame containing the properties of the dataset version. |
resources_as_dataframe
resources_as_dataframe()
Transforms the resources of the DatasetVersionRepresentation into a DataFrame for better readability.
Returns:
Type | Description |
---|---|
`DataFrame` | A pandas DataFrame containing the resources of the dataset version. |
update
update(
properties=None,
attachments=None,
columns_description=None,
)
Updates the dataset version through the API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
properties | `dict[str, str \| int] \| list[Property] \| Property \| None` | The new properties of the dataset version. | None |
attachments | `TAttachment \| None` | The new attachments of the dataset version. | None |
columns_description | `dict[str, str] \| str \| None` | A dictionary, or the path to a CSV file, that maps each column name to a description. Should follow the format { "column_name": "Description", ... } | None |
Returns:
Type | Description |
---|---|
`None` | None |
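As an illustration of the parameter shapes above, the sketch below checks a properties dictionary and a column-description dictionary against the documented types before calling `update()`. It assumes `dataset_version` is a DatasetVersionRepresentation. `annotate_dataset_version` is a hypothetical helper, and the local validation is a design choice of this sketch (failing early, before any request is sent), not something the API is known to require.

```python
def annotate_dataset_version(dataset_version, properties, column_descriptions):
    """Validate inputs against the documented shapes, then call update().

    properties:          dict[str, str | int]
    column_descriptions: dict[str, str], e.g. {"column_name": "Description"}
    """
    for key, value in properties.items():
        if not isinstance(key, str) or not isinstance(value, (str, int)):
            raise TypeError(f"property {key!r} must map a str to a str or int")
    for column, text in column_descriptions.items():
        if not isinstance(column, str) or not isinstance(text, str):
            raise TypeError(f"description for column {column!r} must be a str")
    dataset_version.update(
        properties=properties,
        columns_description=column_descriptions,
    )
```

For example, `annotate_dataset_version(dataset_version, {"rows": 1200}, {"age": "Customer age in years"})` would pass one property and one column description; the property and column names here are made up for illustration.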