Autolog

autolog

Auto-Logging assets with Vectice.

IMPORTANT INFORMATION

Autolog is continuously evolving to enhance supported libraries, environments, and functionalities to provide an improved user experience.

Your feedback is highly valued. Please send any feedback to support@vectice.com.


The autolog feature in Vectice allows for seamless documentation and management of your data science projects. Please note the following details about this feature.

This feature is designed to work specifically with notebooks.

1. Installation: Make sure you install the Vectice package using the following command:

```bash
pip install vectice
```

2. Supported libraries and environments: Vectice automatically identifies and logs assets created with the supported libraries and environments listed below.

Supported libraries and environments
  • Dataframe: Pandas, Spark.
  • Model: Scikit-learn, XGBoost, LightGBM, CatBoost, Keras, PyTorch, Statsmodels.
  • Graphs: Matplotlib, Seaborn, Plotly.
  • Environments: Colab, Jupyter, Vertex, SageMaker, Databricks, PyCharm and VS Code notebooks.

3. General behavior: Vectice autolog provides three methods: autolog.config, autolog.notebook, and autolog.cell. These methods are designed to log to Vectice every asset that exists as a variable in the notebook's memory. Review the specific behavior documented for each of these three methods.

IMPORTANT INFORMATION
  • For GRAPHS, make sure each graph is saved as a file or displayed as a plot so that it is automatically logged in the iteration, for example:
    • In inline environments such as JupyterLab and JupyterHub (not IDEs), plt.show() can flush the canvas, which may prevent displayed or saved graphs from being captured. To avoid this, call plt.savefig() to save your figures instead of, or before, calling plt.show(). See https://matplotlib.org/stable/users/explain/figure/interactive.html
    • fig.write_image("my figure.png") (for Plotly)
  • For METRICS, Vectice currently recognizes sklearn metrics for automatic association with models.
    • When multiple models with different metrics make the association ambiguous, Vectice won't link them automatically. To establish the link, place each model and its respective metrics within the same notebook cell.
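
The graph guidance above can be sketched as follows for Matplotlib. This is a minimal illustration rather than Vectice code: the file name training_loss.png and the plotted values are made up for the example, and the only point is the order of the calls (save first, then show).

```python
from pathlib import Path

import matplotlib.pyplot as plt

# Illustrative data only; any figure is handled the same way.
epochs = [1, 2, 3, 4, 5]
loss = [0.90, 0.62, 0.45, 0.37, 0.33]

fig, ax = plt.subplots()
ax.plot(epochs, loss, marker="o")
ax.set_xlabel("epoch")
ax.set_ylabel("loss")
ax.set_title("Training loss")

# Save BEFORE showing: in inline environments plt.show() can flush the canvas.
output_path = Path("training_loss.png")  # hypothetical file name
fig.savefig(output_path)

plt.show()
```

Calling savefig on the figure object (rather than relying on the implicit current figure) keeps the saved file tied to the figure you built, whatever plt.show() does afterwards.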

config

config(api_token, phase, host=None)

Configures the autolog functionality within Vectice.

The config method allows you to establish a connection to Vectice and specify the phase in which you want to autolog your work.

```python
# Configure autolog
from vectice import autolog

autolog.config(
    api_token="your-api-key",  # Paste your API key
    phase="PHA-XXX",           # Paste your Phase ID
)
```

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| api_token | str | The API token provided inside your Vectice app (API key). | required |
| phase | str | The ID of the phase in which you wish to autolog your work as an iteration. | required |
| host | str \| None | The backend host to which the client will connect. If not found, the default endpoint https://app.vectice.com is used. | None |
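
Because config takes the API key as a plain string, one option is to keep it out of the notebook and read the arguments from environment variables. The sketch below is only an illustration under stated assumptions: the helper autolog_settings and the variable names VECTICE_API_TOKEN, VECTICE_PHASE_ID, and VECTICE_HOST are names chosen for this example, not something this page documents. Leaving host as None relies on the default endpoint described in the parameter list.

```python
import os
from typing import Mapping, Optional


def autolog_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for autolog.config from environment variables.

    The variable names are hypothetical; adapt them to your own setup.
    """
    env = os.environ if env is None else env
    return {
        "api_token": env["VECTICE_API_TOKEN"],  # required: KeyError if unset
        "phase": env["VECTICE_PHASE_ID"],       # required: KeyError if unset
        "host": env.get("VECTICE_HOST"),        # optional: None if unset
    }


# Explicit mapping so the sketch runs without real credentials.
settings = autolog_settings(
    {"VECTICE_API_TOKEN": "your-api-key", "VECTICE_PHASE_ID": "PHA-XXX"}
)
print(settings["host"])  # None
```

In a notebook, the result would then be passed on as autolog.config(**autolog_settings()).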

phase_config

phase_config(phase=None)

Updates the phase in which autolog logs assets. This replaces the phase configured in autolog.config.

If phase is None, the method prints the currently configured phase.

Ensure that you have configured autolog with your Vectice API token before using this method.

```python
# After autolog is configured
autolog.phase_config(
    phase="PHA-XXX",  # Paste your Phase ID
)
```

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| phase | str \| None | The ID of the phase in which you wish to autolog your work as an iteration. | None |

notebook

notebook(
    note=None,
    capture_schema_only=True,
    capture_comments=True,
)

Automatically logs all supported models, dataframes, and graphs from your notebook as assets in the Vectice App.

IMPORTANT INFORMATION

Autolog must be configured at the beginning of your notebook to capture all relevant information. Cells executed prior to configuring autolog may not have their assets recorded by the autolog.notebook() method and may need to be run again.

```python
# ...
# Add this command at the end of the notebook to log all the assets in memory
autolog.notebook()
```

Ensure that the required assets are in memory before calling this method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| note | str \| None | The note or comment to log to the iteration associated with autolog.notebook. | None |
| capture_schema_only | bool | Whether to capture only the schema of the dataframes, or both the schema and column statistics. If set to False, both the schema and column statistics are captured. Note that this option may require additional processing time because the statistics must be computed. | True |
| capture_comments | bool | Whether comments should be automatically logged as notes inside Vectice. If set to True, autolog captures all comments that start with '##'. | True |
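
To make the capture_comments rule concrete, the sketch below lists which comments in a piece of cell source start with '##'. It uses Python's standard tokenize module and only illustrates the '##' convention; it is not how Vectice extracts comments, and the helper name double_hash_comments and the sample cell are made up for this example.

```python
import io
import tokenize


def double_hash_comments(source: str) -> list[str]:
    """Return the text of every comment in `source` that starts with '##'."""
    notes = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and tok.string.startswith("##"):
            notes.append(tok.string[2:].strip())
    return notes


cell_source = '''
## Baseline model trained on the cleaned dataset
# single-hash comments do not match
threshold = 0.5
## Threshold chosen by hand, to be tuned later
'''

print(double_hash_comments(cell_source))
# ['Baseline model trained on the cleaned dataset', 'Threshold chosen by hand, to be tuned later']
```

Comments written with a single '#' stay private to the notebook, so the '##' prefix works as an explicit opt-in for text you want to appear as a note.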

cell

cell(
    create_new_iteration=False,
    note=None,
    capture_schema_only=True,
    capture_comments=True,
)

Automatically logs all supported models, dataframes, and graphs from a specific notebook cell within the Vectice platform.

This method lets you selectively log the assets of a particular notebook cell, with an optional control to log those assets inside a new iteration.

Ensure that you have configured the autolog with your Vectice API token and the relevant phase ID before using this method.

```python
# ...
# Add this command at the end of the desired cell to log all of the cell's assets
autolog.cell()
```

Place the command at the end of the desired cell to log all assets within that cell.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| create_new_iteration | bool | If set to False, assets are logged in the last updated iteration. Otherwise, a new iteration is created for logging the cell's assets. | False |
| note | str \| None | The note or comment to log to the iteration associated with autolog.cell. | None |
| capture_schema_only | bool | Whether to capture only the schema of the dataframes, or both the schema and column statistics. If set to False, both the schema and column statistics are captured. Note that this option may require additional processing time because the statistics must be computed. | True |
| capture_comments | bool | Whether comments should be automatically logged as notes inside Vectice. If set to True, autolog captures all comments that start with '##'. | True |
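
The trade-off behind capture_schema_only can be pictured with a plain-Python sketch of the general idea: a schema needs only column names and types, whereas statistics require a pass over the values. This does not reproduce what Vectice actually computes; the helper names, the choice of statistics, and the sample data are all assumptions made for the example.

```python
from statistics import mean

# A tiny column-oriented table standing in for a dataframe (made-up data).
table = {
    "age": [34, 51, 27, 45],
    "income": [42000.0, 58500.0, 39000.0, 61250.0],
    "segment": ["a", "b", "a", "c"],
}


def schema(columns: dict) -> dict:
    """Column name -> type name, inferred from the first value only."""
    return {name: type(values[0]).__name__ for name, values in columns.items()}


def column_statistics(columns: dict) -> dict:
    """Min, max, and mean for numeric columns; needs a full pass over the data."""
    stats = {}
    for name, values in columns.items():
        if all(isinstance(v, (int, float)) for v in values):
            stats[name] = {"min": min(values), "max": max(values), "mean": mean(values)}
    return stats


print(schema(table))             # cheap: independent of the number of rows
print(column_statistics(table))  # costlier: grows with the number of rows
```

On large dataframes the second step dominates, which is why keeping the default of True is the faster choice when column statistics are not needed.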

get_connection

get_connection()

Get the Connection from autolog.config(...) to interact with the Vectice base Python API.

Ensure that you have configured the autolog with your Vectice API token and the relevant phase ID before using this method.

```python
# ...
# After autolog is configured
connect = autolog.get_connection()
```