
Base data wrapper

DataWrapper

DataWrapper(
    name,
    usage=None,
    derived_from=None,
    inputs=None,
    type=None,
)

Deprecated. Base class for data wrappers.

Wrappers are deprecated.

Instead, use Dataset and Resource.

Use DataWrapper subclasses to assign datasets to steps. The Vectice library supports a handful of common cases. To cover other cases, derive from this base class and override its abstract methods, _build_metadata() and _fetch_data().

Examples:

To create a custom data wrapper, inherit from DataWrapper, and implement the _build_metadata() and _fetch_data() methods:

from vectice.models.datasource.datawrapper import DataWrapper
from vectice.models.datasource.datawrapper.metadata import FilesMetadata

class MyDataWrapper(DataWrapper):
    _origin = "Data source name"

    def _build_metadata(self) -> FilesMetadata:  # (1)
        files = ...  # fetch file list from your custom storage
        total_size = ...  # compute total file size
        return FilesMetadata(
            size=total_size,
            origin=self._origin,
            files=files,
            usage=self.usage,
        )

    def _fetch_data(self) -> dict[str, bytes]:
        files_data = {}
        for file in self.metadata.files:
            file_contents = ...  # fetch file contents from your custom storage
            files_data[file.name] = file_contents
        return files_data
  1. Return FilesMetadata for data stored in files, DBMetadata for data stored in a database.
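
The example above needs the Vectice package and a real storage backend. To show only the subclassing pattern, here is a minimal self-contained sketch that uses stand-in classes. `StubFile`, `StubFilesMetadata`, `StubWrapper`, and `InMemoryWrapper` are hypothetical names invented for this illustration and are not part of Vectice. The build-on-first-access caching in the `metadata` and `data` properties is an assumption, not something this page confirms about the real base class.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class StubFile:
    """Hypothetical stand-in for a file entry; not a Vectice class."""
    name: str
    size: int


@dataclass
class StubFilesMetadata:
    """Hypothetical stand-in for FilesMetadata; not the real signature."""
    size: int
    origin: str
    files: list[StubFile]


class StubWrapper(ABC):
    """Hypothetical stand-in for DataWrapper, reduced to its two hooks."""

    def __init__(self, name: str):
        self._name = name
        self._metadata = None
        self._data = None

    @abstractmethod
    def _build_metadata(self) -> StubFilesMetadata: ...

    @abstractmethod
    def _fetch_data(self) -> dict[str, bytes]: ...

    @property
    def metadata(self) -> StubFilesMetadata:
        # Assumption: metadata is built on first access, then cached.
        if self._metadata is None:
            self._metadata = self._build_metadata()
        return self._metadata

    @property
    def data(self) -> dict[str, bytes]:
        # Assumption: data is fetched on first access, then cached.
        if self._data is None:
            self._data = self._fetch_data()
        return self._data


class InMemoryWrapper(StubWrapper):
    """Serves files from a dict instead of a custom storage backend."""

    _origin = "In-memory store"

    def __init__(self, name: str, store: dict[str, bytes]):
        super().__init__(name)
        self._store = store

    def _build_metadata(self) -> StubFilesMetadata:
        files = [StubFile(name=k, size=len(v)) for k, v in self._store.items()]
        total_size = sum(f.size for f in files)
        return StubFilesMetadata(size=total_size, origin=self._origin, files=files)

    def _fetch_data(self) -> dict[str, bytes]:
        # Mirrors the loop above: one entry per file listed in the metadata.
        return {f.name: self._store[f.name] for f in self.metadata.files}


wrapper = InMemoryWrapper("demo", {"a.csv": b"x,y\n1,2\n", "b.csv": b"z\n3\n"})
```

Because both hooks are abstract, instantiating `StubWrapper` directly raises `TypeError`; only subclasses that implement both methods can be created.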

Parameters:

Name (type, default): description

name (str, required): The name of the data wrapper.
usage (DatasetSourceUsage | None, default None): The usage of the dataset.
derived_from (list[int] | None, default None): The list of dataset ids to create a new dataset from.
inputs (list[int] | None, default None): Deprecated. Use derived_from instead.
type (DatasetType | None, default None): The type of the dataset.

Raises:

ValueError: When both type and usage are None, or when type is DatasetType.MODELING and usage is None.
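
The interplay between type and usage may be easier to follow as a standalone function. The sketch below mirrors the checks in the constructor source shown on this page. `StubDatasetType`, its `OTHER` member, and `resolve_type` are hypothetical names used only for illustration; the real `DatasetType` members other than `MODELING` are not listed here.

```python
from __future__ import annotations

from enum import Enum


class StubDatasetType(Enum):
    """Hypothetical stand-in for DatasetType."""
    MODELING = "MODELING"
    OTHER = "OTHER"  # hypothetical non-modeling member, for illustration only


def resolve_type(type: StubDatasetType | None, usage: object | None) -> StubDatasetType:
    """Return the effective dataset type, or raise ValueError on invalid input."""
    if type is None:
        if usage is None:
            raise ValueError("'type' and 'usage' cannot be both None")
        # A usage without an explicit type defaults to a modeling dataset.
        return StubDatasetType.MODELING
    if type is StubDatasetType.MODELING and usage is None:
        raise ValueError(f"'usage' must be set when 'type' is {type!r}")
    return type
```

In short: a usage alone implies a modeling dataset, a modeling dataset always needs a usage, and any other type may omit it.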

Source code in src/vectice/models/datasource/datawrapper/data_wrapper.py
@abstractmethod
@deprecate(
    warn_at="23.1",
    fail_at="23.3",
    remove_at="23.4",
    reason="The data wrappers are deprecated in favor of "
    "the new Dataset and Resource classes. "
    "Using data wrappers will fail in v{fail_at}. "
    "They will be removed in v{remove_at}.",
)
@deprecate(
    parameter="inputs",
    warn_at="23.1",
    fail_at="23.2",
    remove_at="23.3",
    reason="The 'inputs' parameter is renamed 'derived_from'. "
    "Using 'inputs' will raise an error in v{fail_at}. "
    "The parameter will be removed in v{remove_at}.",
)
def __init__(
    self,
    name: str,
    usage: DatasetSourceUsage | None = None,
    derived_from: list[int] | None = None,
    inputs: list[int] | None = None,
    type: DatasetType | None = None,
):
    """Initialize a data wrapper.

    Parameters:
        name: The name of the data wrapper.
        usage: The usage of the dataset.
        derived_from: The list of dataset ids to create a new dataset from.
        inputs: Deprecated. Use `derived_from` instead.
        type: The type of the dataset.

    Raises:
        ValueError: When both `type` and `usage` are `None`, or when `type` is `DatasetType.MODELING` and `usage` is `None`.
    """
    if not derived_from and inputs:
        derived_from = inputs

    if type is None:
        if usage is None:
            raise ValueError("'type' and 'usage' cannot be both None")
        type = DatasetType.MODELING

    elif type is DatasetType.MODELING and usage is None:
        raise ValueError(f"'usage' must be set when 'type' is {DatasetType.MODELING!r}")

    self._old_name = name
    self._name = name
    self._type = type
    self._usage = usage
    self._derived_from = derived_from
    self._metadata = None
    self._data = None

data property

data: dict[str, bytes]

The wrapper's data.

Returns:

Type Description
dict[str, bytes]

The wrapper's data.

derived_from property

derived_from: list[int] | None

The datasets from which this wrapper is derived.

Returns:

Type Description
list[int] | None

The datasets from which this wrapper is derived.

inputs property

inputs: list[int] | None

Deprecated. Use derived_from instead.

Returns:

Type Description
list[int] | None

The datasets from which this wrapper is derived.
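
As a rough illustration of how a deprecated alias such as inputs can coexist with derived_from, here is a minimal sketch. It uses the standard library's `warnings` module rather than Vectice's own `@deprecate` decorator, and the class name `AliasDemo` is hypothetical. Only the fallback in `__init__` is taken from the constructor source on this page; whether the real property emits a warning is an assumption.

```python
from __future__ import annotations

import warnings


class AliasDemo:
    """Hypothetical class showing a deprecated alias for derived_from."""

    def __init__(self, derived_from: list[int] | None = None, inputs: list[int] | None = None):
        # Same fallback as the constructor: use inputs only if derived_from is empty.
        if not derived_from and inputs:
            derived_from = inputs
        self._derived_from = derived_from

    @property
    def derived_from(self) -> list[int] | None:
        return self._derived_from

    @property
    def inputs(self) -> list[int] | None:
        # Assumption: the alias warns on access, then returns the same value.
        warnings.warn(
            "'inputs' is deprecated; use 'derived_from' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._derived_from
```

Both properties read the same underlying attribute, so code that still passes or reads inputs sees the value it would get from derived_from.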

metadata writable property

metadata: FilesMetadata

The wrapper's metadata.

Returns:

Type Description
FilesMetadata

The wrapper's metadata.

name writable property

name: str

The wrapper's name.

Returns:

Type Description
str

The wrapper's name.

type property

type: DatasetType

The wrapper's dataset type.

Returns:

Type Description
DatasetType

The wrapper's dataset type.

usage property

usage: DatasetSourceUsage | None

The wrapper's dataset usage.

Returns:

Type Description
DatasetSourceUsage | None

The wrapper's dataset usage.