Qwen2: World's Strongest Open-Source Model Beats Llama 3
Unveiling Qwen2: 6 Key Advantages That Crown Alibaba Cloud as the Global Leader in Open-Source Models
The title of strongest open-source model has changed hands overnight! Alibaba Cloud's Qwen2 has topped multiple authoritative global rankings.
The most powerful language model in the open-source world, Qwen2, is here!
On June 7, the Tongyi Qianwen team released Qwen2, a new series of open-source models. The series comes in five sizes: 0.5B, 1.5B, 7B, 72B, and a 57B-A14B mixture-of-experts (MoE) model. Qwen2-72B performs much better than Qwen1.5-110B, the largest model of the previous generation. It is stronger than Llama3-70B, the best open-source model from the US, and it also beats many closed-source Chinese models such as Wenxin 4.0. That makes Qwen2-72B the strongest open-source large model today.
Qwen2's Unmatched Open-Source Capabilities: Outperforming Llama 3-70B and Closed-Source Models
The new Qwen2 series has five pre-trained models: 0.5B, 1.5B, 7B, 57B-A14B, and 72B.
Compared to the earlier Qwen1.5, Qwen2 performs much better overall.
On a respected leaderboard, the previous Qwen1.5-110B was already better than many closed Chinese models like Wenxin 4.0. The new Qwen2-72B increases the lead over these models.
Qwen2-72B also outperforms the current top open models such as Llama3-70B and Mixtral-8x22B.
Qwen2-72B ranked highest in many important tests. These tests checked language understanding, knowledge, coding, math, and multiple languages. So Qwen2-72B is now the strongest open large model in the world.
The smaller Qwen2 models also perform very well. They generally beat other open models of the same size or even larger models. The Qwen2-7B model does much better than recent top models, especially in code and understanding Chinese.
Three Generations in a Year: Qwen2 Dominates the Open-Source Model Scene
Alibaba Cloud released Qwen1.5 in February this year.
Just over three months later, they released Qwen2.
Compared with Qwen1.5, Qwen2 brings clear improvements in logical reasoning, multilingual ability, long-text handling, code, and math.
1. Major Improvements in Code and Math, Crushing Llama 3
Qwen2's code training builds on the strengths of CodeQwen1.5, which made Qwen2 better across many programming languages. For math, Qwen2-72B-Instruct was trained on large-scale, high-quality data. On many benchmarks, Qwen2-72B-Instruct did much better than Llama 3-70B-Instruct.
2. Supports 128k-Token Long Texts, Plus an Open-Source Agent Solution
The picture shows that, on the Needle in a Haystack test set, Qwen2-72B-Instruct handles information extraction perfectly at context lengths of up to 128k tokens.
Other Qwen2 models also do very well:
Qwen2-7B-Instruct handles contexts of up to 128k tokens almost perfectly.
Qwen2-57B-A14B-Instruct handles contexts of up to 64k tokens.
The two smaller Qwen2 models support contexts of up to 32k tokens.
In addition to the long-context models, Alibaba Cloud has open-sourced an agent solution that can efficiently process documents of up to 1 million tokens.
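The general idea behind a Needle in a Haystack test can be sketched as follows. This is a minimal illustration, not the harness used for the results above: the function names, the filler text, the "secret code" needle, and the toy stand-in model are all assumptions made for this example, and context length is counted in sentences rather than in model tokens.

```python
import random

def build_haystack(filler_sentences, needle, num_sentences, depth):
    """Return a long text with `needle` inserted at a relative depth in [0, 1]."""
    rng = random.Random(0)  # fixed seed so the generated text is reproducible
    body = [rng.choice(filler_sentences) for _ in range(num_sentences)]
    position = round(depth * num_sentences)
    body.insert(position, needle)
    return " ".join(body)

def score_answer(answer, expected):
    """Count a retrieval as correct when the expected fact appears in the answer."""
    return expected.lower() in answer.lower()

def run_grid(ask_model, lengths, depths, filler, needle, question, expected):
    """Evaluate every (context length, needle depth) pair; return a dict of booleans."""
    results = {}
    for n in lengths:
        for d in depths:
            prompt = build_haystack(filler, needle, n, d) + "\n\n" + question
            results[(n, d)] = score_answer(ask_model(prompt), expected)
    return results

def toy_model(prompt):
    """Hypothetical stand-in that searches the prompt; replace with a real model call."""
    marker = "The secret code is "
    start = prompt.find(marker)
    return prompt[start:start + len(marker) + 4] if start != -1 else "unknown"

FILLER = [
    "The sky was clear that day.",
    "Traffic was light on the bridge.",
    "The market opened early.",
]
NEEDLE = "The secret code is 7421."

grid = run_grid(
    toy_model,
    lengths=[10, 100, 1000],
    depths=[0.0, 0.5, 1.0],
    filler=FILLER,
    needle=NEEDLE,
    question="What is the secret code?",
    expected="7421",
)
```

Plotting the resulting grid with context length on one axis and needle depth on the other gives the kind of heatmap usually shown for this test. A real evaluation would swap `toy_model` for a call to the model under test and measure length with that model's tokenizer.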
3. Enhanced Safety, Comparable to GPT-4
The table shows the percentage of harmful responses that large models gave to multilingual queries in four categories: illegal activity, fraud, pornography, and privacy violations.
Significance tests show that Qwen2-72B-Instruct performs on par with GPT-4 in safety and significantly better than Mixtral-8x22B.
Llama 3 was not included in the comparison because it did not handle multilingual prompts well.
These results support Alibaba Cloud's standing as a leader in open-source large models.
In August 2023, Alibaba Cloud became the first Chinese tech company to open-source its model, Qwen. In February 2024, it released Qwen1.5. Less than four months later, it open-sourced the full Qwen2 series.
In under a year, Qwen's 72B and 110B models have topped open model rankings multiple times.
Alibaba Cloud shared innovative methods used to develop Qwen2:
All Qwen2 model sizes now use GQA, speeding up inference and reducing memory usage.
All models are pretrained on 32k-token context data, and the instruction-tuned 7B and 72B models can handle contexts of up to 128k tokens.
Multilingual capabilities were improved for 27 languages besides Chinese and English.
Post-training combines supervised fine-tuning, reward model training, online DPO, and online merging.
High-quality training data is obtained through automated methods with minimal manual annotation.
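To see why GQA (grouped-query attention) reduces memory usage, it helps to look at the key/value cache kept during inference. With GQA, several query heads share one key/value head, so the cache shrinks by the ratio of query heads to key/value heads. The sketch below is a back-of-the-envelope calculation only; the layer count, head counts, head dimension, and 16-bit storage are hypothetical values chosen for illustration, not a published Qwen2 configuration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of the key/value cache for one sequence.

    Each layer stores two tensors (keys and values), and each tensor holds
    num_kv_heads * head_dim values per token.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value

# Hypothetical configuration, chosen only for illustration.
LAYERS, QUERY_HEADS, KV_HEADS, HEAD_DIM = 80, 64, 8, 128
SEQ_LEN = 32 * 1024  # a 32k-token context

# Standard multi-head attention: one key/value head per query head.
mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped-query attention: 8 query heads share each key/value head.
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha / 2**30:.1f} GiB")
print(f"GQA cache: {gqa / 2**30:.1f} GiB")
print(f"Reduction: {mha // gqa}x")
```

Under these assumed numbers the cache for a single 32k-token sequence drops from 80 GiB to 10 GiB. A smaller cache also means less data to read per generated token, which is where the inference speedup comes from.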
Open Source Qwen and Alibaba Cloud
Open source has been central to the internet era, going back to Unix and the Internet itself. Without open source, a handful of companies would control the technology, stifling new applications and new ideas.
Large AI models need open source too. Open-source models attract talent, and small and medium-sized companies can build new things on top of them.
Alibaba Cloud believes in the power of open source. CTO Zhou Jingren says open source is a key strategy for Alibaba Cloud.
Open source helps create new technology and also gives developers worldwide a platform.
Alibaba Cloud has published its large models on ModelScope. It offers services covering model selection, training, security, and deployment, along with tools for building applications for customers.
Openness helps the community grow. The ModelScope community hosts open-source AI models and has become a major AI community in China.
Choosing between open source and closed source is a major business decision. For large AI models, how to make money from open source is not yet clear. Meta is considering profit sharing with cloud services.
Alibaba Cloud is a cloud provider, so its logic is clear. Large AI models increase the demand for cloud computing. The open Qwen models draw vendors and developers to Alibaba's computing and services, and that forms a business model.
Alibaba's path differs from those of Meta, Microsoft, and AWS. It uses the open Qwen models and its tool platform, aiming to set industry standards and convince customers to choose Alibaba Cloud.
This cloud + AI demonstration effect stands out among global large model providers.
Qwen: A Key Piece of Alibaba's AGI Vision
Alibaba is showing confidence in the new AGI battle, and it has a clear strategy.
In a letter to shareholders, Alibaba says e-commerce and cloud computing are its two core businesses. It will keep investing to lead in key technologies such as AI.
AI is making big leaps forward and will change many industries profoundly. Cloud computing helps AI grow quickly because it can process big data and train complex models.
Alibaba Cloud uses its infrastructure and new technology to push AI forward. The global respect the Qwen models have earned is a result of this cloud and AI strategy.
New technology always drives competition. By combining AI and cloud computing, Alibaba Cloud is creating new business models for the AI era and finding a way to keep growing.
In a recent talk with JPMorgan, Alibaba Chairman Joe Tsai stressed that AI and cloud computing are very important. He said Alibaba believes strongly in the vision of AGI.
It now looks as though Alibaba Cloud has built a key part of this AGI vision.