Open Source Today (2024-08-22): NVIDIA Mistral, Llama3 QA, C.ai Prompt Framework
Discover the latest AI open-source projects: NVIDIA Mistral, Cerebras DocChat, x-flux-comfyui, Prompt Poet, SiriLLama, and MLE-Agent.
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: Minitron/Mistral-NeMo-Minitron-8B-Base
NVIDIA has open-sourced the Mistral-NeMo-Minitron-8B-Base, a large language model derived by pruning and refining the Mistral-NeMo 12B.
Previously, NVIDIA released Llama-3.1-Minitron-4B, pruned from Llama-3.1-8B, as well as Minitron 8B and 4B, which were pruned versions of the NVIDIA Nemotron-4 15B model.
Minitron is a series of small language models (SLMs) by NVIDIA. The models are first pruned in embedding size, attention heads, and MLP dimensions, then further trained through distillation.
https://huggingface.co/nvidia/Minitron-4B-Base
https://huggingface.co/nvidia/Minitron-8B-Base
https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base
https://huggingface.co/nvidia/Mistral-NeMo-Minitron-8B-Base
Project: DocChat
Cerebras Llama3-DocChat-1.0-8B is a large language model based on Llama 3, designed for document question answering.
It leverages the latest research in document QA, particularly from Nvidia’s ChatQA models.
Using synthetic data generation and efficient training, Cerebras Llama3-DocChat was fully trained in just a few hours on a single Cerebras system.
https://github.com/Cerebras/DocChat
https://huggingface.co/cerebras/Dragon-DocChat-Context-Encoder
https://huggingface.co/cerebras/Dragon-DocChat-Query-Encoder
https://huggingface.co/cerebras/Llama3-DocChat-1.0-8B
Project: x-flux-comfyui
x-flux-comfyui is a deep learning project based on ComfyUI, mainly for training and applying ControlNet and LoRA models.
The project provides various pre-trained models and tools, supports low memory mode, and is suited for computer vision tasks.
https://github.com/XLabs-AI/x-flux-comfyui
Project: Prompt Poet
Character AI’s open-source Prompt Poet simplifies prompt design with a low-code approach, ideal for both developers and non-technical users.
It combines YAML and Jinja2, allowing flexible and dynamic prompt creation, and enhancing efficiency and quality in AI model interactions. This tool saves time on string operations, letting users focus on crafting the best prompts.
https://github.com/character-ai/prompt-poet
Project: SiriLLama
SiriLLama is an Apple Shortcut that allows access to locally running LLMs via Siri or the Shortcuts UI.
It uses Langchain and supports open-source models from Ollama and Fireworks AI.
Users can interact with local LLMs from any Apple device on the same network.
https://github.com/0ssamaak0/SiriLLama
Project: MLE-Agent
MLE-Agent is an intelligent assistant designed for machine learning engineers and researchers.
It can automatically create AI/ML baselines, integrate Arxiv and Papers with Code for best practices and latest methods, provide smart debugging, organize project structures, and include various AI/ML features and MLOps tools, all aimed at seamless workflows.