S3 resource
S3Resource
S3Resource(s3_client, bucket_name, resource_path)
Bases: Resource
Wrap columnar data and its metadata in AWS S3.
Vectice stores only metadata (data about your dataset), which it receives through a resource. Vectice does not store the dataset itself.
This resource wraps data that you have stored in AWS S3. You assign it to a step.
from vectice import S3Resource
from boto3.session import Session

s3_session = Session(  # (1)
    aws_access_key_id="...",
    aws_secret_access_key="...",
    region_name="us-east-1",
)
s3_client = s3_session.client(service_name="s3")  # (2)
s3_resource = S3Resource(
    s3_client,
    bucket_name="my_bucket",
    resource_path="my_resource_path",
)
1. See boto3 sessions.
2. See boto3 session client.
Note that these three concepts are distinct, even if easily conflated:
- Where the data is stored
- The format at rest (in storage)
- The format when loaded in a running Python program
Notably, the statistics collectors provided by Vectice operate only on the last of these, the in-memory format, and only when the data is loaded as a pandas DataFrame.
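The distinction can be made concrete with a minimal sketch. It assumes a CSV object stored in S3 (the format at rest) that is read into a pandas DataFrame (the format in memory). The helper `load_csv_from_s3` is hypothetical and not part of Vectice; it relies only on boto3's `get_object` and pandas' `read_csv`.

```python
import io

import pandas as pd


def load_csv_from_s3(s3_client, bucket_name, key):
    """Hypothetical helper, not part of Vectice: read one CSV object into a DataFrame."""
    # Where the data is stored: an object in an S3 bucket.
    response = s3_client.get_object(Bucket=bucket_name, Key=key)
    # The format at rest: raw CSV bytes.
    raw_bytes = response["Body"].read()
    # The format in memory: a pandas DataFrame.
    return pd.read_csv(io.BytesIO(raw_bytes))
```

With the `s3_client` from the example above, `load_csv_from_s3(s3_client, "my_bucket", "my_resource_path")` would return a DataFrame, assuming that key holds a CSV file.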
Parameters:
Name | Type | Description | Default |
---|---|---|---|
s3_client | Client | The client used to interact with Amazon S3. | required |
bucket_name | str | The name of the bucket to get data from. | required |
resource_path | str | The paths of the resources to get. | required |
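Before wrapping a location, it can help to check which objects a given `bucket_name` and path actually cover. The sketch below is an assumption, not Vectice behavior: how `S3Resource` resolves `resource_path` is not shown on this page, and `list_keys` is a hypothetical helper that treats the path as a key prefix and uses boto3's `list_objects_v2` with continuation tokens.

```python
def list_keys(s3_client, bucket_name, prefix):
    """Hypothetical helper, not part of Vectice: list object keys under a prefix."""
    keys = []
    kwargs = {"Bucket": bucket_name, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent from the response when nothing matches.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
        if not page.get("IsTruncated"):
            return keys
        # More results remain: request the next page.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

An empty result for `list_keys(s3_client, "my_bucket", "my_resource_path")` would suggest a typo in the bucket name or path, or missing list permissions.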
Source code in src/vectice/models/resource/s3_resource.py