Open Source Today (2024-08-28): CogVideoX-5b on RTX 3060; Apple's Training-Free Multimodal Model

Discover top AI open-source projects like Zhipu's CogVideoX-5b and Apple's SlowFast-LLaVA, perfect for video generation and multimodal understanding.

Aug 28, 2024

Here are some interesting AI open-source models and frameworks I wanted to share today:

Project: CogVideoX/CogVideoX-5b

Zhipu has added the new open-source CogVideoX-5b to its CogVideoX series, promising higher video generation quality and better visual effects than the earlier CogVideoX-2B.

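CogVideoX-5b requires only 18GB of VRAM for inference at FP16 precision and 40GB for fine-tuning. An RTX 3060 has 12GB of VRAM, so running the model on that card presumably relies on memory-saving options such as CPU offloading rather than on holding the whole model in GPU memory.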
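As a rough sanity check on the memory figures above, here is a back-of-the-envelope calculation of how much VRAM the weights alone would take at FP16. The parameter count of about 5 billion is an assumption read off the "5b" in the model name, and the estimate ignores the text encoder, VAE, and activations, so treat it as a sketch rather than a measurement.

```python
# Back-of-the-envelope estimate of weight memory only.
# ASSUMPTION: ~5 billion parameters, inferred from the "5b" in the model name.
PARAMS = 5_000_000_000
BYTES_PER_PARAM_FP16 = 2  # FP16 stores each parameter in 2 bytes


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed for the weights in GB (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


weights_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM_FP16)

QUOTED_INFERENCE_GB = 18  # inference figure quoted in this post
RTX_3060_GB = 12          # VRAM on the common desktop RTX 3060

print(f"FP16 weights alone: {weights_gb:.1f} GB")
print(f"Remainder of the quoted 18 GB: {QUOTED_INFERENCE_GB - weights_gb:.1f} GB")
print(f"Quoted figure fits in 12 GB as-is: {QUOTED_INFERENCE_GB <= RTX_3060_GB}")
```

Under these assumptions the weights account for roughly 10GB, which is why a 12GB card would need some form of offloading or other memory optimization to run the model.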

CogVideoX is Zhipu's open-source video generation model, sharing the same origin as the previously released "Qingying." The series includes models of various sizes, with the earlier open-source version being CogVideoX-2B.

https://github.com/THUDM/CogVideo

https://huggingface.co/THUDM/CogVideoX-5b

Project: SlowFast-LLaVA

SlowFast-LLaVA is a training-free multimodal large language model from Apple, focused on video understanding and reasoning.

It captures detailed spatial semantics and long-range temporal context without exceeding the token budget of typical LLMs.
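The token-budget point can be made concrete with a little arithmetic. As the name suggests, the design combines a slow pathway (few frames, many tokens per frame) with a fast pathway (many frames, few tokens per frame). The sketch below compares that split with encoding every frame at full detail. The frame counts and grid sizes are illustrative assumptions, not figures from this post, and the `Pathway` class is a hypothetical helper, not part of the project's code.

```python
from dataclasses import dataclass


@dataclass
class Pathway:
    """A set of video frames, each encoded as a grid of visual tokens."""
    frames: int
    grid_h: int
    grid_w: int

    def tokens(self) -> int:
        # Total visual tokens = frames x tokens per frame
        return self.frames * self.grid_h * self.grid_w


# ASSUMED values, chosen only to illustrate the trade-off.
slow = Pathway(frames=10, grid_h=12, grid_w=24)   # few frames, fine spatial detail
fast = Pathway(frames=50, grid_h=4, grid_w=4)     # many frames, coarse spatial detail
dense = Pathway(frames=50, grid_h=24, grid_w=24)  # every frame at full detail

slowfast_total = slow.tokens() + fast.tokens()

print(f"Slow pathway tokens: {slow.tokens()}")
print(f"Fast pathway tokens: {fast.tokens()}")
print(f"SlowFast total:      {slowfast_total}")
print(f"Dense baseline:      {dense.tokens()}")
```

With these assumed numbers, the two pathways together use 3,680 tokens, versus 28,800 for the dense baseline, while still covering all 50 frames.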

The model doesn’t need fine-tuning and performs as well as, or even better than, the most advanced video LLMs in video QA tasks and benchmarks.

https://github.com/apple/ml-slowfast-llava

https://arxiv.org/abs/2407.15841

Project: Kotaemon

Kotaemon is an open-source RAG development tool. It provides a simple UI for end-users to perform RAG-based Q&A and supports multiple LLM API providers (e.g., OpenAI, Cohere) and local LLMs.

For developers, it offers a framework to build custom RAG document Q&A pipelines, with a UI built through Gradio for customization and visualization.

https://github.com/Cinnamon/kotaemon

Project: RAGLAB

RAGLAB is a modular, research-focused open-source framework dedicated to RAG algorithms.

It includes reproductions of 6 existing RAG algorithms and a comprehensive evaluation system with 10 benchmark datasets, facilitating fair comparisons and efficient development of new algorithms, datasets, and metrics.

https://github.com/fate-ubw/RAGLAB

https://arxiv.org/abs/2408.11381

Project: RAGChecker

RAGChecker is an advanced automated evaluation framework for assessing and diagnosing Retrieval-Augmented Generation (RAG) systems.

It offers a comprehensive set of metrics and tools for in-depth performance analysis, helping developers and researchers accurately evaluate, diagnose, and improve their RAG systems.
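To give a feel for what metric-based RAG evaluation involves, here is a minimal sketch of claim-level precision and recall. It is not RAGChecker's API: the function name is hypothetical, and claims are matched by exact string equality purely for illustration, whereas a real evaluator would need a model to judge whether one claim supports another.

```python
from __future__ import annotations


def claim_precision_recall(
    response_claims: set[str], reference_claims: set[str]
) -> tuple[float, float]:
    """Hypothetical helper: score a response's claims against reference claims.

    Precision = share of response claims that appear in the reference.
    Recall    = share of reference claims that the response covers.
    """
    correct = response_claims & reference_claims
    precision = len(correct) / len(response_claims) if response_claims else 0.0
    recall = len(correct) / len(reference_claims) if reference_claims else 0.0
    return precision, recall


# Made-up example data for illustration only.
reference = {
    "the model is open source",
    "the model generates video",
    "the model has 5b parameters",
    "the model is hosted on hugging face",
}
response = {
    "the model is open source",
    "the model generates video",
    "the model generates audio",  # not supported by the reference
}

precision, recall = claim_precision_recall(response, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision points to unsupported statements in the generated answer, while low recall suggests that retrieval or generation missed information the reference contains.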

https://github.com/amazon-science/ragchecker

https://arxiv.org/abs/2408.08067

Project: Llama Stack

Llama Stack defines and standardizes the building blocks needed to bring generative AI applications to market.

These modules cover the entire development lifecycle: from model training and fine-tuning to product evaluation and deploying AI agents in production.

https://github.com/meta-llama/llama-stack
