Today's Open Source (2024-10-17): NVIDIA Open-Sources Llama 3.1 Nemotron 70B

Explore NVIDIA's Llama 3.1 Nemotron-70B model, Mini-Omni2, Ichigo, Toy Box Flux, and more cutting-edge AI projects in this detailed breakdown.

Oct 17, 2024

Here are some interesting AI open-source models and frameworks I wanted to share today:

Project: Nemotron-70B

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA, designed to enhance the effectiveness of user query responses generated by LLMs.

This model has excelled in multiple automated alignment benchmarks, surpassing powerful models like GPT-4o and Claude 3.5 Sonnet.

It is trained through RLHF (particularly REINFORCE) and supports the HuggingFace Transformers codebase.

https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

https://huggingface.co/nvidia/Llama-3_1-Nemotron-51B-Instruct

Project: Mini-Omni2

Mini-Omni2 is an all-in-one interactive model capable of understanding image, audio, and text inputs, engaging users in end-to-end voice conversations.

This project features real-time voice output, multimodal understanding, and flexible interaction capabilities, supporting an interrupt mechanism during speech.

https://github.com/gpt-omni/mini-omni2

Project: Ichigo

Ichigo is an open, ongoing research experiment aimed at expanding text-based language models to include native "listening" capabilities.

Inspired by Meta's Chameleon paper, this project employs an early fusion technique with the goal of creating an open data, open-weight voice assistant similar to on-device Siri.

Ichigo is trained on public platforms to progressively enhance its multi-turn conversation abilities and is capable of rejecting inaudible queries.

https://github.com/homebrewltd/ichigo

Project: Toy Box Flux

Toy Box Flux is an AI image-generation-based project aimed at creating uniquely styled toy designs by combining existing 3D LoRA and Coloring Book Flux LoRA.

This project focuses on generating cute 3D toy render images, suitable for both objects and human themes. Future versions are planned to enhance stylistic consistency through further training on generated outputs.

https://huggingface.co/renderartist/toyboxflux

Project: Awesome o1

Awesome o1 is a collection of papers related to OpenAI’s o1 project, featuring multiple research papers in the fields of machine learning and natural language processing.

These papers explore topics ranging from solving mathematical problems to language model reasoning, showcasing the latest research advances and technical applications in these areas.

https://github.com/srush/awesome-o1

Project: Tabled

Tabled is a small library for detecting and extracting tables.

It uses Surya to locate all tables in PDFs, identify rows and columns, and format cells into Markdown, CSV, or HTML.

This project aims to simplify the process of extracting table data from documents, supporting various document formats including PDFs, images, Word documents, and PowerPoint.

https://github.com/VikParuchuri/tabled

Today's Open Source (2024-10-16): FunASR Speech Recognition Toolkit

Meng Li

October 16, 2024

Today's Open Source (2024-10-16): FunASR Speech Recognition Toolkit

Here are some interesting AI open-source models and frameworks I wanted to share today:

Read full story

AI Disruption

Today's Open Source (2024-10-16): FunASR Speech Recognition Toolkit

Discussion about this post