Today's Open Source (2024-08-02): Andrej Karpathy Releases nano-llama31, Simplifies Llama 3.1 Inference
Explore the latest AI open-source projects like nano-llama31, LlamaVoice, FLUX.1, Stable Fast 3D, GLOMAP, and LLM-Chinese for cutting-edge advancements.
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: nano-llama31
nano-llama31 is an inference project for Llama 3.1, developed by Andrej Karpathy. It aims to achieve model inference with minimal dependencies.
The current code can run an 8B base model on a single 40GB+ GPU. The codebase is about 900 lines, primarily using PyTorch and tiktoken.
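For a rough sense of the GPU figure above, the memory taken by the weights alone can be estimated as parameter count times bytes per parameter. The sketch below is back-of-the-envelope arithmetic only: the source does not say which precision the code uses, the round 8e9 parameter count is an assumption, and activations, the KV cache, and framework overhead all come on top.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: a round 8e9 parameters for the "8B" model.
NUM_PARAMS = 8e9

# Compare two common storage precisions (weights only, no activations or cache).
for name, nbytes in [("float32", 4), ("bfloat16", 2)]:
    print(f"{name}: ~{weight_memory_gb(NUM_PARAMS, nbytes):.0f} GB for weights")
```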
The project plans to further reduce the model size and add fine-tuning capabilities.
https://github.com/karpathy/nano-llama31
Project: LlamaVoice
LlamaVoice is a large-scale voice generation model based on Llama, with support for both inference and training.
Instead of predicting discrete speech tokens as traditional pipelines do, it predicts continuous features directly, which simplifies the process.
The model uses a Variational Autoencoder (VAE) to predict latent features and is trained jointly with a Large Language Model (LLM) to improve the performance and flexibility of voice generation.
https://github.com/OpenT2S/LlamaVoice
Project: FLUX.1
FLUX.1 is an image generation model developed by Black Forest Labs, a startup founded by core engineers behind Stable Diffusion. It comes in professional ([pro]), developer ([dev]), and fast ([schnell]) versions to meet different needs. According to Black Forest Labs, it outperforms Midjourney v6.0, DALL·E 3 (HD), and Stable Diffusion 3-Ultra.
The reported strengths are visual quality, prompt adherence, flexibility in size and aspect ratio, typography, and output diversity.
All FLUX.1 models support multiple aspect ratios and resolutions between 0.1 and 2.0 megapixels.
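To make the megapixel range concrete, here is a small sketch that turns a pixel budget and an aspect ratio into a width and height. The helper name `dims_for` and the choice to snap each side to a multiple of 16 are assumptions for illustration, not something taken from the FLUX.1 code or documentation.

```python
import math

# Range quoted in the text, in megapixels.
MIN_MP, MAX_MP = 0.1, 2.0


def dims_for(megapixels: float, aspect_w: int, aspect_h: int,
             multiple: int = 16) -> tuple[int, int]:
    """Return (width, height) close to the requested pixel budget.

    `multiple` is a hypothetical alignment constraint; adjust or drop it
    to match whatever the actual pipeline requires.
    """
    if not MIN_MP <= megapixels <= MAX_MP:
        raise ValueError(f"megapixels must be within [{MIN_MP}, {MAX_MP}]")

    pixels = megapixels * 1_000_000
    # Solve width * height = pixels with width / height = aspect_w / aspect_h.
    height = math.sqrt(pixels * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


w, h = dims_for(1.0, 16, 9)
print(w, h, f"{w * h / 1e6:.2f} MP")
```

Because of the snapping, the resulting pixel count only approximates the requested budget.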
https://github.com/black-forest-labs/flux
https://huggingface.co/black-forest-labs/FLUX.1-schnell
https://huggingface.co/black-forest-labs/FLUX.1-dev
Project: Stable Fast 3D
Stable Fast 3D (SF3D) is a 3D asset generation model by Stability AI. SF3D can generate a high-quality 3D asset from a single image in about 0.5 seconds. The generated assets include UV-mapped meshes, material parameters, and albedo colors with reduced baked-in lighting.
https://huggingface.co/stabilityai/stable-fast-3d
https://github.com/Stability-AI/stable-fast-3d
Project: GLOMAP
GLOMAP is a general-purpose global structure-from-motion pipeline for image-based reconstruction.
GLOMAP takes a COLMAP database as input and outputs a COLMAP sparse reconstruction.
Compared to COLMAP, this project offers a more efficient and scalable reconstruction process, usually 1-2 orders of magnitude faster, with comparable or superior reconstruction quality.
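The database-in, sparse-reconstruction-out flow described above can be sketched as a sequence of command lines. The subcommand and flag names below are recalled from the COLMAP and GLOMAP READMEs and should be treated as assumptions; check them against each tool's `--help` before use. The helper `glomap_pipeline` and the `database.db` and `sparse` names are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex
from pathlib import Path


def glomap_pipeline(image_dir: str, workspace: str) -> list[list[str]]:
    """Build (but do not run) the commands for a COLMAP + GLOMAP run.

    Flag names are assumptions based on the projects' READMEs.
    """
    ws = Path(workspace)
    db = str(ws / "database.db")   # COLMAP database that GLOMAP reads
    out = str(ws / "sparse")       # where the sparse reconstruction is written
    return [
        # 1. Extract features into the COLMAP database.
        ["colmap", "feature_extractor",
         "--image_path", image_dir, "--database_path", db],
        # 2. Match features between image pairs.
        ["colmap", "exhaustive_matcher", "--database_path", db],
        # 3. Run GLOMAP's global mapper on the populated database.
        ["glomap", "mapper", "--database_path", db,
         "--image_path", image_dir, "--output_path", out],
    ]


for cmd in glomap_pipeline("images", "work"):
    print(shlex.join(cmd))
```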
https://github.com/colmap/glomap
Project: LLM-Chinese
This project adapts gemma2 to Chinese. It is tutorial-focused, guiding readers through fine-tuning an LLM for Chinese.