Open Source Today (2024-07-10): Tencent Open-Sources a Multimodal AI Framework for Huawei Ascend
5 Cutting-Edge AI Projects: Transforming Tech Today - Open Source Breakthroughs
Here are some interesting open-source AI models and frameworks from today.
Project: MLLM-NPU
MLLM-NPU is an open-source framework from Tencent ARC Lab for training multimodal large language models (MLLMs) on Huawei Ascend NPUs. It lets users flexibly combine different visual encoders, adapters, LLMs, and the corresponding generation components into an MLLM, and it supports training, inference, and image generation. The project ships a high-performance MLLM implementation called SEED-X, but users can also assemble their own MLLM from different modules as needed.
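To make the "mix and match modules" idea concrete, here is a minimal Python sketch of an encoder, adapter, and LLM wired together behind one interface. It is not MLLM-NPU's actual API: the `MLLM` class and the `toy_*` functions are hypothetical stand-ins, and real components would be neural networks rather than one-line functions.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

Vector = List[float]


@dataclass
class MLLM:
    """Hypothetical container: any component can be swapped independently."""
    visual_encoder: Callable[[str], Vector]  # image path -> visual features
    adapter: Callable[[Vector], Vector]      # visual features -> LLM inputs
    llm: Callable[[Vector, str], str]        # LLM inputs + prompt -> text

    def answer(self, image: str, prompt: str) -> str:
        features = self.visual_encoder(image)
        embeddings = self.adapter(features)
        return self.llm(embeddings, prompt)


def toy_encoder(image: str) -> Vector:
    # Stand-in feature: the length of the file name.
    return [float(len(image))]


def toy_adapter(features: Vector) -> Vector:
    # Stand-in projection: scale every feature.
    return [2.0 * x for x in features]


def toy_llm(embeddings: Vector, prompt: str) -> str:
    # Stand-in LLM: report how many visual tokens it received.
    return f"{prompt} -> {len(embeddings)} visual token(s)"


model = MLLM(toy_encoder, toy_adapter, toy_llm)

# Swap only the adapter; the encoder and LLM stay untouched.
wider = replace(model, adapter=lambda features: features + features)
```

Calling `model.answer("cat.png", "Describe")` runs the three stages in order, and `wider` shows that replacing a single module does not require touching the others.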
Project: MobileCPM
MobileCPM is an open-source toolset for deploying large models on mobile devices, aimed at helping developers integrate large models seamlessly into their apps. It offers a demo app with various example agents like translators, poets, storytellers, and motivational coaches for different use cases. Developers can customize agents by adding or modifying prompts and replacing models on the device to meet business needs. The project currently supports iOS, with an Android version coming soon.
Project: Vanna
Vanna is an open-source framework that uses retrieval-augmented generation (RAG) to improve the accuracy of generated SQL. It lets users query SQL databases in natural language, which simplifies data querying and analysis. The project supports several user interfaces and provides detailed documentation to help users get started quickly.
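The RAG idea behind text-to-SQL can be sketched in a few lines of Python: retrieve the schema snippets most relevant to the question, then put them into the prompt sent to the model. This is an illustration only and does not use Vanna's API. The names `retrieve`, `build_prompt`, and `KNOWLEDGE` are made up for this example, and simple keyword overlap stands in for the embedding-based search a real system would typically use.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    # Lowercase word-like tokens, keeping underscores for column names.
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question: str, documents: List[str], k: int = 2) -> List[str]:
    # Rank documents by how many tokens they share with the question.
    q = tokens(question)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: List[str], k: int = 2) -> str:
    # The retrieved context is what grounds the generated SQL in the schema.
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"


# Hypothetical knowledge base: table definitions for a toy database.
KNOWLEDGE = [
    "CREATE TABLE customers (id INT, name TEXT, country TEXT)",
    "CREATE TABLE orders (id INT, customer_id INT, total REAL)",
    "CREATE TABLE products (id INT, title TEXT, price REAL)",
]

prompt = build_prompt("total of orders per customer_id", KNOWLEDGE, k=1)
```

With `k=1`, only the `orders` table definition ends up in the prompt, so the model sees the columns it needs and nothing unrelated.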
Project: CopilotKit
CopilotKit is an open-source framework for building custom AI assistants and integrating AI chatbots and agents into applications. Through plugins, the built-in chatbot can interact with an application's front end, back end, and third-party services (such as Salesforce and Dropbox), helping users quickly retrieve information and perform actions within the app.
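The general pattern behind "a chatbot that performs actions in the app" is an action registry: the application registers named functions, and the assistant's structured output is dispatched to them. The Python sketch below illustrates that pattern only. CopilotKit itself targets JavaScript/TypeScript applications, and `action`, `dispatch`, and `create_ticket` are hypothetical names, not part of its API.

```python
from typing import Callable, Dict

# Registry of app functions the assistant is allowed to call.
ACTIONS: Dict[str, Callable[..., str]] = {}


def action(name: str):
    """Decorator that registers a function under a name the model can request."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        ACTIONS[name] = fn
        return fn
    return register


@action("create_ticket")
def create_ticket(title: str) -> str:
    # Stand-in for a real back-end or third-party call.
    return f"ticket created: {title}"


def dispatch(call: dict) -> str:
    # `call` mimics a model's structured output: {"name": ..., "arguments": {...}}.
    name = call["name"]
    if name not in ACTIONS:
        return f"unknown action: {name}"
    return ACTIONS[name](**call.get("arguments", {}))


result = dispatch({"name": "create_ticket", "arguments": {"title": "Reset password"}})
```

Keeping the registry explicit means the assistant can only trigger functions the application has chosen to expose.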
Project: Phi-3-Vision-MLX
Phi-3-Vision-MLX is a versatile AI framework optimized for Apple Silicon, combining the Phi-3-Vision multimodal model with the latest Phi-3-Mini-128K language model. This project provides easy-to-use interfaces for various AI tasks, from advanced text generation to visual question answering and code execution.