Today's Open Source (2024-07-12): InternVL 2.0 Multimodal Model Series

Discover the latest in AI: InternVL-2.0, EchoMimic, LAMDA-TALENT, ServerlessLLM, and FlashAttention-3. Boost your projects with these powerful tools!

Jul 12, 2024

Here are some interesting open-source AI models and frameworks from today.

Project: InternVL-2.0

InternVL 2.0, developed by Shanghai AI Laboratory, is part of the "Shusheng·Wanxiang" multimodal large model series. It includes a range of instruction-tuned models, from 1B to 108B parameters. The largest model (the Pro version) is not directly downloadable; access is through an API, for which you need to request a trial. InternVL 2.0 is trained with an 8k context window on data that includes long texts, multiple images, and videos. As a result, it handles these input types significantly better than InternVL 1.5.

[Learn more]

Project: EchoMimic

EchoMimic, released by Ant Group, is an open-source project that generates realistic audio-driven portrait animations with editable keypoint control. It can create synchronized facial expressions and lip movements based on input audio and supports pose driving and audio-keypoint hybrid driving.

[Learn more]

Project: LAMDA-TALENT

LAMDA-TALENT is a comprehensive toolbox and benchmark platform for tabular data learning. It integrates over 20 advanced deep learning models, 10 classic algorithms, and 300 diverse datasets to help improve model performance on tabular data. It covers classic methods, tree-based methods, and popular deep learning methods, and it is designed so that new datasets and methods are easy to add.

[Learn more]

Project: ServerlessLLM

ServerlessLLM aims to make LLMs serverless, offering cost-effective and scalable deployment solutions. This project removes the need for users to manage servers and infrastructure, making LLM deployment more convenient. The serverless approach allows automatic scaling based on demand, ensuring smooth operation under varying traffic levels without manual intervention. ServerlessLLM provides an easy-to-use interface for deploying and managing large language models.
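To make "automatic scaling based on demand" concrete, here is a minimal sketch of one possible scaling rule. It is an illustration only, not ServerlessLLM's actual API or policy: the function name `desired_replicas` and its parameters are hypothetical, and the rule (scale to zero when idle, otherwise one replica per batch of pending requests, up to a cap) is an assumption chosen for simplicity.

```python
import math


def desired_replicas(pending_requests, requests_per_replica, max_replicas):
    """Hypothetical scaling rule: how many model replicas to run right now.

    pending_requests:     requests currently waiting or in flight
    requests_per_replica: assumed capacity of a single replica
    max_replicas:         upper bound set by the available hardware
    """
    if requests_per_replica <= 0:
        raise ValueError("requests_per_replica must be positive")
    if pending_requests <= 0:
        # No traffic: scale to zero so idle models hold no resources.
        return 0
    needed = math.ceil(pending_requests / requests_per_replica)
    return min(max_replicas, needed)
```

With a capacity of 4 requests per replica and a cap of 8, 9 pending requests would call for 3 replicas, and no pending requests would call for none. A real system also has to account for model loading time, which is where much of the difficulty in serverless LLM serving lies.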

[Learn more]

Project: FlashAttention-3

FlashAttention-3 has been released in beta. It is 1.5 to 2.0 times faster than FlashAttention-2 and reaches up to 740 TFLOPS in FP16, which is 75% of the H100 GPU's theoretical peak, up from 35% with FlashAttention-2. Like earlier versions, FlashAttention-3 reorders the attention computation and uses tiling and recomputation, so memory usage grows linearly with sequence length instead of quadratically.
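The tiling idea can be shown with a small pure-Python sketch. This is not the FlashAttention-3 kernel and it covers only the forward pass: it leaves out recomputation in the backward pass and all of the H100-specific optimizations. The function names and the `block_size` parameter are hypothetical. The point is that keys and values are processed one block at a time, and each query keeps only a running maximum, a running sum, and an accumulator, so the full N x N score matrix is never stored.

```python
import math


def naive_attention(Q, K, V):
    """Reference version: computes every score for a query before the softmax."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    out = []
    for q in Q:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in K]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(len(V[0]))])
    return out


def tiled_attention(Q, K, V, block_size=2):
    """Tiled version: visits K/V block by block with an online softmax."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    dv = len(V[0])
    out = []
    for q in Q:
        running_max = -math.inf   # largest score seen so far
        running_sum = 0.0         # softmax denominator, relative to running_max
        acc = [0.0] * dv          # unnormalized weighted sum of values
        for start in range(0, len(K), block_size):
            k_block = K[start:start + block_size]
            v_block = V[start:start + block_size]
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_block]
            new_max = max(running_max, max(scores))
            # Rescale earlier partial results to the new maximum.
            correction = math.exp(running_max - new_max)
            weights = [math.exp(s - new_max) for s in scores]
            running_sum = running_sum * correction + sum(weights)
            acc = [a * correction + sum(w * v[j] for w, v in zip(weights, v_block))
                   for j, a in enumerate(acc)]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out
```

Both functions return the same result up to floating-point rounding, but the tiled version only ever holds `block_size` scores per query at once. That is the property that lets the real kernels keep their working set in fast on-chip memory.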

FlashAttention3 is Here: H100 Utilization Soars to 75%

Meng Li

Jul 12


Open Source Today (2024-07-11): NuminaMath 7B TIR, AI Math Olympiad Winner

Meng Li

Jul 11


AI Disruption
