Integrating Ray Tune, Hugging Face Transformers and W&B

Update 03/21/2021: I published my modified version of run_glue.py as a public gist on GitHub.
Update 03/22/2021: My pull request has been accepted. The CustomTrainer subclass is no longer needed in current and future versions of the transformers library: you can now use the Trainer directly.
</div></div></section><section name="b41e" class="section section--body section--last"><div class="section-divider"><hr class="section-divider"></div><div class="section-content"><div class="section-inner sectionLayout--insetColumn"><p name="001c" id="001c" class="graf graf--p graf--leading">There are a few articles, notebooks and code samples that teach how to integrate Ray Tune and Hugging Face Transformers, but they either leave out Weights & Biases or do not work anymore due to changes made to the library.</p><ul class="postList"><li name="e530" id="e530" class="graf graf--li graf-after--p">Hyperparameter Optimization for 🤗Transformers: A guide</li><li name="a847" id="a847" class="graf graf--li graf-after--li">Hyperparameter Search with Transformers and Ray Tune</li></ul><p name="aded" id="aded" class="graf graf--p graf-after--li">After some hours of experimentation, I figured out the right way to integrate them. First of all, there is a bug in the Trainer that won’t
allow you to use W&B and Ray Tune at the same time. I have already submitted a PR on this, but regardless of whether they accept it or not, you can currently fix this bug by creating a subclass that inherits from the Trainer:</p><p class="graf graf--p graf-after--p">Besides the CustomTrainer object, you’ll have to create two functions: model_init and hp_space_fn. model_init simply has to return your model, and hp_space_fn has to return the config that will be used by Ray Tune and W&B.</p><p name="102f" id="102f" class="graf graf--p graf-after--p">A few points regarding the code below:</p><ul class="postList"><li name="94be" id="94be" class="graf graf--li graf-after--p">You can get your wandb API key at wandb.ai/authorize. I like to set it as an environment variable and run my scripts as ‘API_KEY=… WANDB_PROJECT=my_project_name python run_glue.py …’.</li><li name="d6db" id="d6db" class="graf graf--li graf-after--li">The model_init function is meant to run inside a modified version of run_glue.py.</li></ul><p class="graf graf--p graf-after--li">To launch the search, call the hyperparameter_search Trainer method. Any additional parameters (such as time_budget_s) will be passed directly to tune.run, as stated in the docs.</p>
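As an illustration of what hp_space_fn can look like, here is a minimal, dependency-free sketch. The hyperparameter names and values below are placeholder assumptions: in a real script the entries would be ray.tune sampling primitives such as tune.choice or tune.loguniform, and Ray's W&B integration reads the project name and API key out of the nested "wandb" entry of the trial config.

```python
import os

def hp_space_fn(trial):
    # Placeholder search space; in the real script these values would be
    # ray.tune primitives, e.g. tune.loguniform(1e-5, 5e-5).
    config = {
        "learning_rate": 2e-5,
        "num_train_epochs": 3,
        "per_device_train_batch_size": 16,
    }
    # Ray's W&B logger picks up the project and API key from the trial
    # config, so they are merged into it here.
    wandb_config = {
        "wandb": {
            "project": os.environ.get("WANDB_PROJECT", "my_project_name"),
            "api_key": os.environ.get("API_KEY", ""),
            "log_config": True,
        }
    }
    config.update(wandb_config)
    return config
```

Returning a fresh dictionary per call matters: Ray Tune invokes this function once per trial to sample a new configuration.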
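The model_init requirement can be sketched without any heavy dependencies. The point is that hyperparameter_search builds a fresh model at the start of every trial, so model_init must be a zero-argument factory rather than an already-built model object; make_model_init below is a hypothetical helper (not part of transformers), and a plain dict stands in for the real model constructor.

```python
def make_model_init(model_ctor, *args, **kwargs):
    """Return a zero-argument factory: each call builds a brand-new
    model, which is what Trainer.hyperparameter_search expects."""
    def model_init():
        return model_ctor(*args, **kwargs)
    return model_init

# In run_glue.py the constructor would be something like
# AutoModelForSequenceClassification.from_pretrained(model_name, config=config);
# a plain dict stands in here so the sketch stays dependency-free.
model_init = make_model_init(dict, num_labels=2)
```

Passing a pre-instantiated model instead of a factory is a common mistake: every trial would then fine-tune the same weights, contaminating the search.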
<p class="graf graf--p graf-after--figure">The best hyperparameters will be returned as a dictionary that can be accessed at best_run.hyperparameters.</p><p class="graf graf--p graf-after--p">The code in this article was written against transformers version 4.4.0.dev0.</p><p name="c7e3" id="c7e3" class="graf graf--p graf-after--p">P.S.: If for some reason you want to completely disable wandb, it is enough to omit the loggers argument in your call to trainer.hyperparameter_search, comment out the config.update(wandb_config) line in hp_space_fn, and remove the WandbCallback from the trainer.</p>
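Coming back to best_run.hyperparameters: the object returned by hyperparameter_search has the shape of transformers.trainer_utils.BestRun. The stdlib stand-in below mirrors its fields (the concrete values are made up for illustration) and shows how the winning values can be read back out before the final full training run.

```python
from typing import Any, Dict, NamedTuple

class BestRun(NamedTuple):
    # Mirrors the fields of transformers.trainer_utils.BestRun.
    run_id: str
    objective: float
    hyperparameters: Dict[str, Any]

# Illustrative values only; in practice this comes from
# best_run = trainer.hyperparameter_search(...).
best_run = BestRun(
    run_id="0",
    objective=0.91,
    hyperparameters={"learning_rate": 2e-5, "num_train_epochs": 3},
)

# The winning values can then be copied back into your TrainingArguments
# before retraining on the full schedule.
for name, value in best_run.hyperparameters.items():
    print(f"{name} = {value}")
```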