Metadata
metadata ¶
Column ¶
Column(name, data_type, stats=None, category_type=None)
Model a column of a dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`name` | `str` | The name of the column. | required |
`data_type` | `str` | The type of the data contained in the column. | required |
`stats` | `BooleanStat \| TextStat \| NumericalStat \| DateStat \| None` | Additional statistics about the column. | `None` |
`category_type` | `ColumnCategoryType \| None` | Column category type. | `None` |
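As a minimal sketch of the signature above: the `Column` dataclass below is a hypothetical stand-in that only mirrors the documented parameters and defaults, so the snippet runs without the library installed. The real class and its import path are not shown in this section.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Column:
    """Hypothetical stand-in mirroring the documented Column signature."""

    name: str
    data_type: str
    # In the documented API this is BooleanStat | TextStat | NumericalStat | DateStat | None.
    stats: Optional[Any] = None
    # In the documented API this is ColumnCategoryType | None.
    category_type: Optional[Any] = None


# Only name and data_type are required; stats and category_type default to None.
age = Column(name="age", data_type="int64")
print(age)
```

The column name and the `"int64"` type string are illustrative values, not ones prescribed by the documentation.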
DBColumn ¶
DBColumn(
name,
data_type,
is_unique=None,
nullable=None,
is_private_key=None,
is_foreign_key=None,
stats=None,
)
Bases: Column
Model a column of a dataset, like a database column.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`name` | `str` | The name of the column. | required |
`data_type` | `str` | The type of the data contained in the column. | required |
`is_unique` | `bool \| None` | Whether the column uniquely defines a record. | `None` |
`nullable` | `bool \| None` | Whether the column can contain null values. | `None` |
`is_private_key` | `bool \| None` | Whether the column uniquely defines a record, individually or with other columns (can be null). | `None` |
`is_foreign_key` | `bool \| None` | Whether the column refers to another one, individually or with other columns (cannot be null). | `None` |
`stats` | `BooleanStat \| TextStat \| NumericalStat \| DateStat \| None` | Additional statistics about the column. | `None` |
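The boolean flags all default to `None` rather than `False`, which suggests three states: true, false, and not specified. The sketch below illustrates that reading with a hypothetical stand-in dataclass that mirrors the documented `DBColumn` signature; treating `None` as "unknown" is an assumption, not something the text states.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DBColumn:
    """Hypothetical stand-in mirroring the documented DBColumn signature."""

    name: str
    data_type: str
    is_unique: Optional[bool] = None
    nullable: Optional[bool] = None
    is_private_key: Optional[bool] = None
    is_foreign_key: Optional[bool] = None
    stats: Optional[Any] = None


# Illustrative table schema; names and type strings are made up for the example.
columns = [
    DBColumn("order_id", "int", is_unique=True, nullable=False, is_private_key=True),
    DBColumn("customer_id", "int", nullable=False, is_foreign_key=True),
    DBColumn("note", "varchar"),
]

# Columns whose nullability was never specified keep the default None.
unspecified = [c.name for c in columns if c.nullable is None]
print(unspecified)
```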
DBMetadata ¶
DBMetadata(dbs, size, usage=None, origin=None)
Bases: Metadata
Describes the metadata of a dataset that comes from a database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`dbs` | `list[MetadataDB]` | The list of databases. | required |
`size` | `int \| None` | The size of the metadata. | required |
`usage` | `DatasetSourceUsage \| None` | The usage of the metadata. | `None` |
`origin` | `str \| None` | The origin of the metadata. | `None` |
DatasetSourceType ¶
File ¶
File(
name,
size=None,
fingerprint=None,
created_date=None,
updated_date=None,
uri=None,
columns=None,
dataframe=None,
content_type=None,
extra_metadata=None,
display_name=None,
capture_schema_only=False,
)
Bases: Source
Describe a dataset file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`name` | `str` | The name of the file. | required |
`size` | `int \| None` | The size of the file. | `None` |
`fingerprint` | `str \| None` | The hash of the file. | `None` |
`created_date` | `str \| None` | The creation date of the file. | `None` |
`updated_date` | `str \| None` | The date of the last update of the file. | `None` |
`uri` | `str \| None` | The URI of the file. | `None` |
`columns` | `list[Column] \| None` | The columns coming from the dataframe, with their statistics. | `None` |
`dataframe` | `Optional` | A dataframe that lets Vectice optionally compute more metadata about this resource, such as column statistics, size, number of rows, and number of columns. Supports pandas and Spark. | `None` |
`content_type` | `Optional` | HTTP 'Content-Type' header for this file. | `None` |
`extra_metadata` | `Optional` | Extra metadata to be captured. | `None` |
`display_name` | `Optional` | Name that will be shown in the Web App. | `None` |
`capture_schema_only` | `Optional` | Whether to capture only the schema, or both the schema and the column statistics of the dataframe. | `False` |
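The `size` and `fingerprint` values can be derived from the file itself before describing it. The sketch below uses a hypothetical stand-in dataclass that mirrors the documented `File` signature, plus an illustrative helper `describe_file` that is not part of the library. SHA-256 as the hash and bytes as the size unit are assumptions; the text only says "the hash of the file" and "the size of the file".

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class File:
    """Hypothetical stand-in mirroring the documented File signature."""

    name: str
    size: Optional[int] = None
    fingerprint: Optional[str] = None
    created_date: Optional[str] = None
    updated_date: Optional[str] = None
    uri: Optional[str] = None
    columns: Optional[list] = None
    dataframe: Optional[Any] = None
    content_type: Optional[str] = None
    extra_metadata: Optional[Any] = None
    display_name: Optional[str] = None
    capture_schema_only: bool = False


def describe_file(path: str, content_type: Optional[str] = None) -> File:
    """Illustrative helper: fill name, size, fingerprint and uri from a local path."""
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()  # hash choice is an assumption
    return File(
        name=os.path.basename(path),
        size=os.path.getsize(path),  # size in bytes (assumed unit)
        fingerprint=digest,
        uri=path,
        content_type=content_type,
    )


# Write a tiny CSV to a temporary location so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("id,amount\n1,9.99\n")
    path = tmp.name

csv_file = describe_file(path, content_type="text/csv")
os.remove(path)
print(csv_file.name, csv_file.size, csv_file.fingerprint[:12])
```

Leaving `dataframe` unset means no column statistics are computed here; per the table above, passing a dataframe is what lets the library compute them.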
FilesMetadata ¶
FilesMetadata(files, size=None, usage=None, origin=None)
Bases: Metadata
The metadata of a set of files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`files` | `list[File]` | The list of files of the dataset. | required |
`size` | `int \| None` | The size of the set of files. | `None` |
`usage` | `DatasetSourceUsage \| None` | The usage of the dataset. | `None` |
`origin` | `str \| None` | Where the dataset files come from. | `None` |
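One plausible way to fill `size` is to sum the sizes of the individual files. The sketch below does that with hypothetical stand-in dataclasses (a trimmed `File` and a `FilesMetadata` mirroring the documented signature). Whether the library expects the sum, or computes the value itself when `size` is left as `None`, is not stated here, so treat the summation as an assumption.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class File:
    """Trimmed hypothetical stand-in: only the fields this example needs."""

    name: str
    size: Optional[int] = None


@dataclass
class FilesMetadata:
    """Hypothetical stand-in mirroring the documented FilesMetadata signature."""

    files: List[File]
    size: Optional[int] = None
    usage: Optional[Any] = None  # DatasetSourceUsage | None in the documented API
    origin: Optional[str] = None


# Illustrative file names and sizes.
files = [File("train.csv", size=2048), File("test.csv", size=512)]

# Files with an unknown size (None) contribute 0 to the total.
total_size = sum(f.size or 0 for f in files)

metadata = FilesMetadata(files=files, size=total_size, origin="local disk")
print(metadata.size)
```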
Metadata ¶
Metadata(type, size=None, usage=None, origin=None)
This class describes the metadata of a dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`size` | `int \| None` | The size of the file. | `None` |
`type` | `DatasetSourceType` | The type of file. | required |
`usage` | `DatasetSourceUsage \| None` | The usage made of the data. | `None` |
`origin` | `str \| None` | The origin of the data. | `None` |
MetadataDB ¶
MetadataDB(
name,
columns,
rows_number=None,
size=None,
updated_date=None,
created_date=None,
uri=None,
dataframe=None,
extra_metadata=None,
display_name=None,
capture_schema_only=False,
type=TableType.UNKNOWN,
)
Bases: Source
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`name` | `str` | The name of the table. | required |
`columns` | `list[DBColumn] \| None` | The columns that compose the table. | required |
`rows_number` | `int \| None` | The number of rows in the table. | `None` |
`size` | `int \| None` | The size of the table. | `None` |
`updated_date` | `str \| None` | The date of the last update of the table. | `None` |
`created_date` | `str \| None` | The creation date of the table. | `None` |
`uri` | `str \| None` | The URI of the table. | `None` |
`dataframe` | `Optional` | A dataframe that lets Vectice optionally compute more metadata about this resource, such as column statistics, size, number of rows, and number of columns. Supports pandas and Spark. | `None` |
`extra_metadata` | `Optional` | Extra metadata to be captured. | `None` |
`display_name` | `Optional` | Name that will be shown in the Web App. | `None` |
`capture_schema_only` | `Optional` | Whether to capture only the schema, or both the schema and the column statistics of the dataframe. | `False` |
`type` | `Optional` | The table type. | `TableType.UNKNOWN` |
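The classes above nest: a `DBMetadata` holds a list of `MetadataDB` tables, and each table holds a list of `DBColumn` objects. The sketch below shows that composition with hypothetical stand-ins that mirror the documented signatures (trimmed where noted). `TableType` is modeled with only the `UNKNOWN` member named in the signature, and its string value is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, List, Optional


class TableType(Enum):
    """Stand-in enum; only UNKNOWN is named in the text, its value is assumed."""

    UNKNOWN = "UNKNOWN"


@dataclass
class DBColumn:
    """Trimmed hypothetical stand-in: only the required fields."""

    name: str
    data_type: str


@dataclass
class MetadataDB:
    """Hypothetical stand-in mirroring the documented MetadataDB signature."""

    name: str
    columns: Optional[List[DBColumn]]
    rows_number: Optional[int] = None
    size: Optional[int] = None
    updated_date: Optional[str] = None
    created_date: Optional[str] = None
    uri: Optional[str] = None
    dataframe: Optional[Any] = None
    extra_metadata: Optional[Any] = None
    display_name: Optional[str] = None
    capture_schema_only: bool = False
    type: TableType = TableType.UNKNOWN


@dataclass
class DBMetadata:
    """Hypothetical stand-in mirroring the documented DBMetadata signature."""

    dbs: List[MetadataDB]
    size: Optional[int]  # required argument, but None is an accepted value
    usage: Optional[Any] = None
    origin: Optional[str] = None


# Illustrative table; names, types and row count are made up for the example.
orders = MetadataDB(
    name="orders",
    columns=[DBColumn("order_id", "int"), DBColumn("total", "float")],
    rows_number=1000,
)

# size has no default on DBMetadata, so it is passed explicitly even when unknown.
db_metadata = DBMetadata(dbs=[orders], size=None, origin="warehouse")
print(db_metadata.dbs[0].name, db_metadata.dbs[0].type)
```

Note that `columns` on `MetadataDB` and `size` on `DBMetadata` are both required arguments whose types still allow `None`, so they must be passed explicitly.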