
Base resource

Resource

Resource(
    paths,
    dataframes=None,
    capture_schema_only=False,
    columns_description=None,
)

Base class for resources.

Use Resource subclasses to assign datasets to steps. The Vectice library supports a handful of common cases out of the box. Additional cases are generally easy to support by deriving from this base class. In particular, subclasses must override this class's abstract methods (_build_metadata() and _fetch_data()).

Examples:

To create a custom resource class, inherit from Resource, and implement the _build_metadata() and _fetch_data() methods:

from __future__ import annotations  # lets `str | list[str]` parse on Python < 3.10

from vectice import Resource, DatasetSourceOrigin, FilesMetadata

class MyResource(Resource):
    _origin = "Data source name"

    def __init__(
        self,
        paths: str | list[str],
    ):
        super().__init__(paths=paths)

    def _build_metadata(self) -> FilesMetadata:  # (1)
        files = ...  # fetch file list from your custom storage
        total_size = ...  # compute the total size of the files referenced by self._paths
        return FilesMetadata(
            size=total_size,
            origin=self._origin,
            files=files,
            usage=self.usage,
        )

    def _fetch_data(self) -> dict[str, bytes]:
        files_data = {}
        for file in self.metadata.files:
            file_contents = ...  # fetch file contents from your custom storage
            files_data[file.name] = file_contents
        return files_data
(1) Return FilesMetadata for data stored in files, or DBMetadata for data stored in a database.
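To make the placeholders in the example above more concrete, here is a minimal sketch for files on the local filesystem. It deliberately does not import vectice: `LocalFile` and `LocalFilesMetadata` are hypothetical stand-ins for the library's file entry and FilesMetadata types, and `build_metadata` and `fetch_data` are illustrative names for the logic that would live in `_build_metadata()` and `_fetch_data()`. The real classes may take different arguments.

```python
from __future__ import annotations

import os
from dataclasses import dataclass, field


@dataclass
class LocalFile:
    """Hypothetical stand-in for a single file entry in the metadata."""
    name: str
    size: int


@dataclass
class LocalFilesMetadata:
    """Hypothetical stand-in for FilesMetadata (not the Vectice class)."""
    size: int
    origin: str
    files: list[LocalFile] = field(default_factory=list)


def build_metadata(paths: str | list[str], origin: str = "Local files") -> LocalFilesMetadata:
    """List each path and sum the file sizes, as _build_metadata() might."""
    if isinstance(paths, str):
        paths = [paths]
    files = [LocalFile(name=path, size=os.path.getsize(path)) for path in paths]
    total_size = sum(file.size for file in files)
    return LocalFilesMetadata(size=total_size, origin=origin, files=files)


def fetch_data(metadata: LocalFilesMetadata) -> dict[str, bytes]:
    """Read every listed file into memory, as _fetch_data() might."""
    files_data = {}
    for file in metadata.files:
        with open(file.name, "rb") as handle:
            files_data[file.name] = handle.read()
    return files_data
```

In a real subclass, the same two steps would read from self._paths and return the library's own metadata type instead of these stand-ins.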

data

data()

The resource's data.

Returns:

Type  Description
dict  The resource's data.

metadata

metadata(value)

Set the resource's metadata.

Parameters:

Name   Type      Description           Default
value  Metadata  The metadata to set.  required

usage

usage(value)

Set the resource's usage.

Parameters:

Name   Type                Description        Default
value  DatasetSourceUsage  The usage to set.  required
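The class example earlier on this page reads self.metadata and self.usage as plain attributes, which suggests that both are exposed as properties with setters rather than as ordinary methods. Below is a minimal sketch of that pattern, assuming attribute-style access. `SketchResource` and `Usage` are hypothetical stand-ins, not Vectice classes, and the enum members are illustrative only.

```python
from enum import Enum


class Usage(Enum):
    """Hypothetical stand-in for DatasetSourceUsage."""
    TRAINING = "TRAINING"
    TESTING = "TESTING"


class SketchResource:
    """Hypothetical stand-in showing a property with a setter."""

    def __init__(self):
        self._usage = None

    @property
    def usage(self):
        # Reading `resource.usage` returns the stored value.
        return self._usage

    @usage.setter
    def usage(self, value):
        # Writing `resource.usage = ...` stores the new value.
        self._usage = value


resource = SketchResource()
resource.usage = Usage.TRAINING  # assignment invokes the setter
```

Under this assumption, a value is set by assignment (`resource.usage = ...`) rather than by calling `usage(value)` directly.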