Blog

Evolutionary HPO explained

Why is hyperparameter optimization so hard for reinforcement learning, and how is AgileRL changing this?

GRPO and evolutionary HPO

Combining GRPO with evolutionary hyperparameter optimization to squeeze the most out of small LLMs