Today’s Open Source (2024-07-22): RWKV-6-World 14B Supports 100+ Languages and Code

Explore cutting-edge AI open-source models like RWKV-6-World, xLAM, and Athene-70B. Discover innovative solutions for language processing and virtual dressing!

Meng Li

Jul 22, 2024

Here are some interesting AI open-source models and frameworks I wanted to share today:

Project: RWKV/RWKV-6-World 14B

The RWKV Foundation has released the RWKV-6-World 14B model, the strongest dense pure RNN large language model to date.

In recent performance tests, the RWKV-6-World 14B model's English performance is equivalent to Llama2 13b.

Additionally, among models with the same parameters, RWKV-6-World 14B has the strongest multilingual performance, supporting over 100 languages and code.

https://huggingface.co/BlinkDL/rwkv-6-world

Project: xLAM

xLAM, developed by Salesforce, is a series of large language models focused on function calls, available in 1b and 7b parameter sizes.

These models aim to enhance decision-making by translating user intent into executable actions, suitable for automation workflows in various fields.

Optimized for efficient deployment on personal devices, they support offline functionality and enhanced privacy.

https://huggingface.co/Salesforce/xLAM-7b-fc-r

https://huggingface.co/Salesforce/xLAM-1b-fc-r

Project: Athene-70B

Athene-70B is a large model based on Llama-3-70B-Instruct, trained with Reinforcement Learning from Human Feedback (RLHF).

Developed by the Nexusflow team, it scored highly in the Chatbot Arena benchmark Arena-Hard-Auto, suitable for dialogue generation tasks.

https://huggingface.co/Nexusflow/Athene-70B

Project: IMAGDressing

IMAGDressing is an interactive modular apparel generation project for virtual dressing.

It uses a clothing UNet, combining CLIP's semantic features and VAE's texture features for user-controllable editing.

The project also released the IGPair dataset with over 300,000 pairs of clothing and wearing images and established a standard data assembly pipeline.

IMAGDressing can be combined with extensions like ControlNet, IP-Adapter, T2I-Adapter, and AnimateDiff to enhance diversity and controllability.

https://arxiv.org/abs/2407.12705

https://github.com/muzishen/imagdressing

Project: Tricycle

Tricycle is a fast, from-scratch deep learning library using Python and Numpy.

It can train a GPT-2 (124M) model on a GPU, processing 2.3 billion tokens in less than three days.

Tricycle aims to help anyone with some Python experience understand deep learning, providing a complete implementation from an autograd engine to GPU support.

https://github.com/bclarkson-code/Tricycle

Project: Metron

Metron is a tool for benchmarking LLM inference systems. It offers various evaluation methods and resource versions, supporting integrations with multiple open-source and public API systems, helping developers and researchers assess and optimize LLM performance.

https://arxiv.org/abs/2407.07000

https://github.com/project-metron/metron

Project: LLM-on-Ray

LLM-on-Ray is a comprehensive solution designed to help users build, customize, and deploy LLMs on Intel platforms using the Ray framework. Whether starting from scratch with pretraining, fine-tuning an existing model, or deploying a production-ready LLM endpoint service, this project simplifies complex processes into manageable steps. LLM-on-Ray leverages Ray's distributed computing capabilities, ensuring efficient fault tolerance and cluster resource management, making LLM projects more reliable and scalable.

https://github.com/intel/llm-on-ray

Open Source Today (2024-07-19): Mistral Collaborates with NVIDIA to Open Source 12B Model

Meng Li

Jul 19

Open Source Today (2024-07-19): Mistral Collaborates with NVIDIA to Open Source 12B Model

I'll share some interesting AI open-source models and frameworks today. Project: Mistral Nemo Mistral-Nemo, co-trained by Mistral AI and NVIDIA, is a large model with 12B parameters and supports 128k context. It includes Base and Instruct versions. Among models of similar size, Mistral-Nemo excels in reasoning, world knowledge, and code accuracy.

Read full story

AI Disruption

Open Source Today (2024-07-19): Mistral Collaborates with NVIDIA to Open Source 12B Model

Discussion about this post