Today's Open Source (2024-10-22): IBM Open-Sources Enterprise-Level AI Granite 3.0
Discover open-source projects like Granite 3.0, Allegro, Whisper Turbo MLX, BaseAI, and more. Cutting-edge AI, text-to-video, and memory agents.
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: Granite 3.0
The key highlights of this model series include low latency, support for tool-based scenarios, retrieval-augmented generation technology, and support for MoE hybrid models on small devices.
The model is trained from scratch using a two-stage strategy. In the first stage, it is trained on 10 trillion tokens from various domains, and in the second stage, it is fine-tuned on carefully curated high-quality data to enhance performance for specific tasks.
It supports multiple languages and can be used for tasks such as text summarization, classification, extraction, and question-answering.
https://huggingface.co/ibm-granite/granite-3.0-8b-base
https://huggingface.co/ibm-granite/granite-3.0-2b-instruct
Project: Allegro
Allegro is a powerful text-to-video generation model that can produce high-quality videos from simple text inputs.
The model supports generating videos up to 6 seconds long, with a frame rate of 15 FPS and a resolution of 720p.
Allegro leverages advanced machine-learning techniques to offer users a convenient text-to-video conversion experience.
https://github.com/rhymes-ai/Allegro
Project: Whisper Turbo MLX
Whisper Turbo MLX is a fast and lightweight speech recognition project, based on the MLX implementation of the Whisper model.
This project is designed to provide efficient audio transcription functionality, with all the code contained in a file of fewer than 300 lines, making it suitable for scenarios that require rapid processing of large volumes of audio data.
https://github.com/JosefAlbers/whisper-turbo-mlx
Project: BaseAI
BaseAI is a framework for building declarative and composable AI-driven LLM products.
It allows users to develop AI agent pipelines on their local machines, integrating tools and memory functions (RAG).
BaseAI is designed to simplify the AI product development process, offering comprehensive documentation and learning guides.
https://github.com/LangbaseInc/BaseAI
Project: rag-chatbot
This project enables users to engage in interactive chats with multiple PDF files locally.
It supports models from Huggingface and Ollama, with plans to include multilingual chat support.
The project provides a simple user interface through Gradio, allowing users to run it easily either locally or on Kaggle.
https://github.com/datvodinh/rag-chatbot
Project: LangGraph ReAct Memory Agent
Memory Agent is an example of a ReAct-style agent designed to store memory through tools.
This approach allows the agent to persist important information within a conversation thread and configure the memory scope based on user ID, enabling the bot to learn user preferences.