Hugging Face Accelerate
DVCLive allows you to add experiment tracking capabilities to your Hugging Face Accelerate projects.
Usage
If you have dvclive installed, the DVCLive Tracker will be used for
tracking experiments and logging metrics, parameters, and plots for
accelerate>=0.25.0. Unlike with Hugging Face Transformers callback, you must
specify what to log. The following snippet shows a full example:
from accelerate import Accelerator
# optional, `log_with` defaults to "all"
accelerator = Accelerator(log_with="dvclive")
# log hyperparameters
hps = {"num_iterations": 5, "learning_rate": 1e-2}
accelerator.init_trackers("my_project", config=hps)
# log metrics
accelerator.log({"train_loss": 1.12, "valid_loss": 0.8})
# log model
accelerator.save_state("checkpoint_dir")
if accelerator.is_main_process:
live = accelerator.get_tracker("dvclive", unwrap=True)
live.log_artifact("checkpoint_dir")
# end logging
accelerator.end_training()Initialize
from accelerate import Accelerator
# optional, `log_with` defaults to "all"
accelerator = Accelerator(log_with="dvclive")
accelerator.init_trackers(project_name="my_project")To customize tracking, include arguments to be passed to the Live instance
using init_kwargs like:
accelerator.init_trackers(
project_name="my_project",
init_kwargs={"dvclive": {"dir": "my_directory"}}
)Log parameters
To log hyperparameters, add them using config like:
hps = {"num_iterations": 5, "learning_rate": 1e-2}
accelerator.init_trackers("my_project", config=hps)Log metrics
To log metrics:
accelerator.log({"train_loss": 1.12, "valid_loss": 0.8})Optionally pass the step (if not passed, it will be auto-incremented):
accelerator.log({"train_loss": 1.12, "valid_loss": 0.8}, step=1)Additional logging
To log models, other artifacts, or images, retrieve the Live instance from the
tracker:
accelerator.save_state("checkpoint_dir")
if accelerator.is_main_process:
live = accelerator.get_tracker("dvclive", unwrap=True)
live.log_artifact("checkpoint_dir")accelerator.is_main_process ensures it is only called on the main process.
This is handled automatically by the Accelerate Tracker in the other methods
above, but in this example we are directly calling Live instead.
unwrap=True returns the Live instance instead of the Tracker, so that
you can call any Live methods.
End
Finally, end the experiment to trigger Live.end():
accelerator.end_training()