30 Latest AI Open Source Projects of the Week (2024.11.25-2024.11.29)
Explore 30 innovative AI open-source models and frameworks from this week (2024.11.25-2024.11.29), including Tülu3, LTX-Video, and more, featuring state-of-the-art performance across various tasks.
I’m sharing some interesting AI open-source models and frameworks for this week (2024.11.25-2024.11.29).
There are a total of 30 AI open-source projects.
Project: Tülu3
Tülu3 is a leading family of instruction-following models, offering fully open-source datasets, code, and recipes. It serves as a comprehensive guide to modern post-training techniques.
Tülu3 targets state-of-the-art performance on benchmarks such as MATH, GSM8K, and IFEval as well as in chat-based applications, and is available in open-source 8B and 70B versions.
Project: LTX-Video
LTX-Video is the first video generation model based on DiT (Diffusion Transformer) that can generate high-quality videos in real time.
The model produces 24 FPS video at a resolution of 768x512 faster than the clip takes to watch.
Trained on a large-scale diverse video dataset, it generates high-resolution videos with realism and variety.
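To make the "faster than the time it takes to watch them" claim concrete, here is a minimal sketch of the arithmetic. The function names and the timing figures in the usage note are hypothetical illustrations, not measurements from the project; only the 24 FPS figure comes from the description above.

```python
def clip_duration_seconds(num_frames: int, fps: float = 24.0) -> float:
    """Playback time of a clip with `num_frames` frames at `fps` frames per second."""
    return num_frames / fps


def real_time_factor(num_frames: int, generation_seconds: float, fps: float = 24.0) -> float:
    """Ratio of playback time to generation time; above 1.0 means faster than real time."""
    return clip_duration_seconds(num_frames, fps) / generation_seconds


def is_faster_than_real_time(num_frames: int, generation_seconds: float, fps: float = 24.0) -> bool:
    """True when generating the clip takes less time than watching it."""
    return generation_seconds < clip_duration_seconds(num_frames, fps)
```

For example, a 120-frame clip at 24 FPS plays for 5 seconds, so a (hypothetical) generation time of 4 seconds would count as faster than real time, while 6 seconds would not.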
Project: NumPro
Number-Prompt (NumPro) is an innovative approach that transforms video temporal grounding (VTG) into an intuitive process, akin to flipping through a comic strip, by adding unique numeric identifiers to video frames.
The approach improves VTG performance at no additional computational cost, achieving gains of up to 6.9% in mIoU for moment retrieval and 8.5% in mAP for highlight detection.
https://github.com/yongliang-wu/NumPro
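As a rough illustration of how numbered frames relate to the mIoU metric mentioned above, the sketch below maps the frame numbers of a retrieved moment to timestamps and scores the result against a ground-truth span. The helper names, the zero-based numbering, and the constant sampling rate are assumptions made for this example; they are not taken from the NumPro code base.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds)


def frames_to_span(first_frame: int, last_frame: int, fps: float) -> Span:
    """Convert the numeric labels of the first and last matching frames to seconds.

    Assumption: frames are numbered from 0 and sampled at a constant `fps`.
    """
    return (first_frame / fps, last_frame / fps)


def temporal_iou(pred: Span, truth: Span) -> float:
    """Intersection over union of two time spans."""
    intersection = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - intersection
    return intersection / union if union > 0 else 0.0


def mean_iou(preds: List[Span], truths: List[Span]) -> float:
    """mIoU: the average temporal IoU over all query/ground-truth pairs."""
    return sum(temporal_iou(p, t) for p, t in zip(preds, truths)) / len(preds)
```

With frames sampled at 2 FPS, a model that answers "frames 4 to 10" yields the span (2.0, 5.0) seconds; against a ground truth of (3.0, 6.0) seconds the two spans overlap for 2 seconds out of a 4-second union, giving an IoU of 0.5.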
Project: Multi-IF
Multi-IF, released by Meta AI, is a multilingual, multi-turn dialogue dataset designed to support research in instruction-following tasks.
The dataset contains text in multiple languages and is suited to natural language processing tasks involving multi-turn dialogue generation and understanding.
https://huggingface.co/datasets/facebook/Multi-IF
Project: Llama OCR
Llama OCR is an OCR library based on the Llama 3.2 vision model, capable of converting documents into Markdown format.
The project uses Together AI's free endpoints for image parsing and offers paid endpoints for better performance and higher rate limits. Users can convert images to Markdown with simple API calls.
https://github.com/Nutlope/llama-ocr
Project: RAGLite
RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) supporting PostgreSQL or SQLite databases.
It lets users choose from various language model providers and rerankers and offers multiple acceleration options. RAGLite relies only on lightweight, open-source dependencies, converts a variety of document formats, and supports performance evaluation for both retrieval and generation.
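For readers new to the pattern, the following is a toy sketch of retrieval-augmented generation over a SQLite store, written with only the Python standard library. It does not use RAGLite's API: every function name here is hypothetical, and the bag-of-words similarity is a stand-in for the learned embedders and rerankers a real toolkit would use.

```python
import math
import sqlite3
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_store(chunks: List[str]) -> sqlite3.Connection:
    """Store document chunks in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
    conn.executemany("INSERT INTO chunks (body) VALUES (?)", [(c,) for c in chunks])
    conn.commit()
    return conn


def retrieve(conn: sqlite3.Connection, query: str, top_k: int = 2) -> List[str]:
    """Return the `top_k` stored chunks most similar to the query."""
    query_vec = embed(query)
    rows = conn.execute("SELECT body FROM chunks").fetchall()
    ranked = sorted(rows, key=lambda row: cosine(query_vec, embed(row[0])), reverse=True)
    return [row[0] for row in ranked[:top_k]]


def build_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Assemble the prompt that would be sent to a language model."""
    context = "\n".join(retrieve(conn, question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The retrieval step narrows the stored chunks to the few most relevant ones, and the generation step (not shown) would pass the assembled prompt to whichever language model provider is configured.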