Tracking

Introduction to tracking

Tracking is crucial for ensuring that insights from model development and experimentation can be captured, used and reproduced. The aim of experiment tracking is to uncover relationships between, for instance, parameter choices and output metrics. Tracking also makes it possible to reproduce results produced in the past or by others. It is useful in a production deployment context as well, since it allows you to monitor performance and debug potential modeling errors or pipeline issues. With Rebase Tracking it is possible to track several different object types, including:

Scalar numbers such as parameters or metrics
Energy assets represented as EnergyDataModel objects
Links such as pointers to specific code commits (e.g. Rebase Pipelines)
Artifacts such as datasets and model objects

In Rebase Platform, all tracking originates from Tables, which act as containers to organise tracked objects.

Tracking scalar parameters or metrics

To track metrics, parameters and other scalar values, call rb.track() with a dictionary of the data to be logged and the table it should be logged to:

import rebase as rb
import numpy as np

for param in np.arange(10): 
    metric = 2 + param**2
    rb.track(data={"param": param, "metric": metric}, table="my_table")

The table argument to rb.track() is a unique table identifier.
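
To make the logged payloads concrete, the following minimal sketch collects the same records in a plain list instead of sending them. It does not use the rebase package; the `records` list is a hypothetical stand-in for the table and is only there so the dictionaries can be inspected locally.

```python
# Plain-Python stand-in for the loop above: collect the records instead of
# tracking them, so each payload can be inspected without the rebase package.
records = []  # hypothetical local stand-in for the table "my_table"

for param in range(10):
    metric = 2 + param**2
    records.append({"param": param, "metric": metric})

# Show the first few dictionaries that would be passed as `data`.
for record in records[:3]:
    print(record)
```

Each dictionary in `records` has the same shape as the `data` argument passed to rb.track() in the loop above.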

The tracked parameters and metrics can then be viewed in the Rebase Platform UI as shown below.

TODO: Add an illustration here showing the table functionality in Rebase Platform

It is also possible to track nested dictionaries (up to second-level nesting). This is helpful, for instance, when separating metrics across training and validation datasets. Here is an example of logging nested dictionaries:

import rebase as rb

def train(dataset):
    # Some script producing RMSE (root mean squared error) and
    # MAE (mean absolute error) on both train and validation datasets
    # ...

    return rmse_train, mae_train, rmse_valid, mae_valid

# `dataset` is assumed to have been loaded earlier
rmse_train, mae_train, rmse_valid, mae_valid = train(dataset)

rb.track(
    data={
        "train": {"rmse": rmse_train, "mae": mae_train},
        "valid": {"rmse": rmse_valid, "mae": mae_valid},
    },
    table="my_table",
)
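
Because nesting is limited, it may be useful to check a dictionary's depth before logging it. The sketch below is one possible way to do that in plain Python. The helper `nesting_depth` is hypothetical and not part of the rebase package, and it assumes that "second-level nesting" means a dictionary whose values may themselves be dictionaries, as in the example above. The metric values are placeholders, not real results.

```python
def nesting_depth(value):
    """Return how many dictionary levels `value` contains (0 for a non-dict)."""
    if not isinstance(value, dict):
        return 0
    # An empty dict still counts as one level.
    return 1 + max((nesting_depth(v) for v in value.values()), default=0)


# Placeholder values with the same shape as the example above.
data = {
    "train": {"rmse": 0.12, "mae": 0.09},
    "valid": {"rmse": 0.15, "mae": 0.11},
}

# Assumption: a dict of dicts (depth 2) is the deepest structure supported.
MAX_DEPTH = 2

if nesting_depth(data) > MAX_DEPTH:
    raise ValueError("data is nested more deeply than two levels")
```

A dictionary that fails this check could be flattened or split into several calls before it is tracked.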

Nested dictionaries are shown in the Rebase Platform UI as tables with expandable content, as illustrated below.

TODO: Add an illustration here showing the nested dictionary functionality in Rebase Platform

Tracking energy assets

Energy asset objects are tracked by passing the asset to the rb.track() method through the data argument:

import rebase as rb

pvsystem = rb.PVSystem(capacity=1000, orientation=180, tilt=20)

rb.track(data={"pvsystem": pvsystem}, table="pvsystems")

Tracking artifacts

Some objects are easier to track as artifacts (in an object store) than in a table. Tracking artifacts is useful, for instance, when tracking files (such as .csv, .pickle, .h5, .netcdf, etc.). To track an object as an artifact, use rb.track_artifact() instead of rb.track():

import rebase as rb

rb.track_artifact(data="dataset.csv")

Illustration in the Rebase Platform of tracked artifacts.
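
The artifact example above assumes that dataset.csv already exists on disk. As a minimal sketch, the file could be produced with Python's standard csv module before it is tracked. The column names and rows are made up for illustration, the file is written to a temporary directory to avoid overwriting anything, and the rb.track_artifact() call is left as a comment because it requires the rebase package.

```python
import csv
import tempfile
from pathlib import Path

# Illustrative rows only; the column names are hypothetical.
rows = [
    {"timestamp": "2024-01-01T00:00", "power_kw": 0.0},
    {"timestamp": "2024-01-01T01:00", "power_kw": 12.5},
]

# Write to a temporary directory so no existing file is overwritten.
path = Path(tempfile.mkdtemp()) / "dataset.csv"

with path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["timestamp", "power_kw"])
    writer.writeheader()
    writer.writerows(rows)

# With the rebase package available, the file could then be tracked as in
# the example above:
# rb.track_artifact(data=str(path))
```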
