Qwen2.5-7B Surpasses o1: Microsoft's rStar-Math Makes a Stunning Debut
Discover Microsoft's rStar-Math: A cost-effective, self-evolving solution that outperforms OpenAI's o1 in math reasoning with small language models (1.5B-7B)
OpenAI's o1 has bent the trade-off curve between model scale and performance sharply upward.
It replicates the success of AlphaGo-style reinforcement learning in the large-model space: give the model more compute and it produces more intelligent outputs, eventually surpassing human level.
However, this breakthrough rests on massive compute and steep inference costs: the o1-preview API charges $15 per million input tokens and $60 per million output tokens, while the latest o3 model reportedly runs to thousands of dollars per query on complex reasoning tasks.
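To put those prices in perspective, here is a minimal sketch of the cost arithmetic for a single reasoning-heavy call. The per-token prices come from the figures above; the token counts are hypothetical, chosen only to illustrate how long hidden reasoning traces dominate the bill.

```python
# Sketch of o1-preview API cost arithmetic.
# Prices from the article: $15 / 1M input tokens, $60 / 1M output tokens.
# Token counts below are hypothetical, for illustration only.

INPUT_PRICE_PER_TOKEN = 15 / 1_000_000   # USD per input token
OUTPUT_PRICE_PER_TOKEN = 60 / 1_000_000  # USD per output token

def query_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one API call."""
    return (input_tokens * INPUT_PRICE_PER_TOKEN
            + output_tokens * OUTPUT_PRICE_PER_TOKEN)

# A reasoning-heavy query: a modest prompt, but a long chain of thought.
# (Hypothetical: 2k input tokens, 30k reasoning + answer tokens.)
cost = query_cost(input_tokens=2_000, output_tokens=30_000)
print(f"~${cost:.2f} per query")  # ~$1.83
```

Even at these modest (assumed) token counts, the output side accounts for almost the entire cost, and the bill scales linearly with how long the model "thinks."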
The industry has been searching for a more cost-effective and efficient alternative. That answer may have arrived sooner than expected.
The paper topping the Hugging Face leaderboard today demonstrates the potential of smaller models: a research team from Microsoft Research Asia has introduced rStar-Math.
rStar-Math shows that small language models (SLMs) in the 1.5B–7B range can match or even surpass OpenAI's o1 on mathematical reasoning, without any distillation from larger models.