Today's Open Source (2024-09-25): Mini-LLaVA, a Simplified Visual Language Model
Discover exciting open-source AI models, including Mini-LLaVA for visual language tasks and GraphReasoning for knowledge graphs. Explore creativity with Gemma-2-2B and more!
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: Mini-LLaVA
Mini-LLaVA is a simplified version of the LLaVA visual language model. It can process images, videos, and text inputs.
This project is based on Llama 3.1 and can run on just one GPU.
The codebase is kept deliberately minimal while still handling multi-modal inputs, making it a practical starting point for complex vision-and-text tasks.
https://github.com/fangyuan-ksgk/Mini-LLaVA
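For readers unfamiliar with the LLaVA recipe, here is a minimal PyTorch sketch of the general pattern a project like Mini-LLaVA follows: a vision encoder produces patch features, a small projector maps them into the language model's embedding space, and the LLM consumes image and text embeddings together. All class names, module choices, and dimensions below are illustrative assumptions, not the repository's actual API.

```python
# Minimal sketch of a LLaVA-style vision-language model (illustrative, not Mini-LLaVA's API).
import torch
import torch.nn as nn

class VisionLanguageSketch(nn.Module):
    def __init__(self, vision_encoder: nn.Module, language_model: nn.Module,
                 vision_dim: int = 1024, text_dim: int = 4096):
        super().__init__()
        self.vision_encoder = vision_encoder              # e.g. a CLIP image tower (assumed)
        self.projector = nn.Linear(vision_dim, text_dim)  # image features -> LLM embedding space
        self.language_model = language_model              # e.g. a Llama 3.1 causal LM (per the project)

    def forward(self, pixel_values: torch.Tensor, text_embeds: torch.Tensor):
        image_feats = self.vision_encoder(pixel_values)   # (batch, num_patches, vision_dim)
        image_embeds = self.projector(image_feats)        # (batch, num_patches, text_dim)
        # Simplest combination: prepend the image tokens to the text tokens.
        inputs_embeds = torch.cat([image_embeds, text_embeds], dim=1)
        return self.language_model(inputs_embeds=inputs_embeds)
```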
Project: Google Gemma-2-2B-ArliAI-RPMax-v1.1
Gemma-2-2B-ArliAI-RPMax-v1.1 is a variant of gemma-2-2b-it and part of the RPMax model series.
RPMax models are trained on diverse, de-duplicated datasets and tuned for creative writing and role-play, with the goal of high creativity and little repetition across different characters and scenarios.
Early user feedback suggests the series has a distinct style that sets it apart from other RP models.
https://huggingface.co/ArliAI/Gemma-2-2B-ArliAI-RPMax-v1.1
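Since the weights are on the Hugging Face Hub, the quickest way to try the model is a standard Transformers generation loop. The model ID comes from the link above; the chat template and sampling settings are assumptions worth checking against the model card.

```python
# Quick role-play prompt with Hugging Face Transformers (settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ArliAI/Gemma-2-2B-ArliAI-RPMax-v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user",
             "content": "You are a weary innkeeper in a snowbound village. Greet a late-night traveler."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.9)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

At 2B parameters the model fits comfortably on a single consumer GPU in bfloat16, which is part of the appeal for local role-play use.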
Project: GraphReasoning
The GraphReasoning project uses generative AI to turn a corpus of 1,000 scientific papers into an ontological knowledge graph. It then performs an in-depth structural analysis: computing node degrees, detecting communities, assessing connectivity, and evaluating clustering coefficients and the centrality of key nodes.
This graph has inherent scale-free properties and high connectivity. It can be used for graph reasoning, revealing unseen interdisciplinary relationships, answering queries, identifying knowledge gaps, and proposing new material designs.
https://github.com/lamm-mit/GraphReasoning
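The structural analyses listed above (degrees, communities, connectivity, clustering, centrality) are standard graph metrics. The sketch below shows how they might be computed with networkx on any extracted concept graph; it is an independent illustration using a toy graph, not code from the repository.

```python
# Standard graph metrics with networkx (toy graph stands in for a paper-derived concept graph).
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()                                    # placeholder graph

degrees = dict(G.degree())                                    # node degree distribution
communities = community.greedy_modularity_communities(G)      # community structure
is_connected = nx.is_connected(G)                             # global connectivity
avg_clustering = nx.average_clustering(G)                     # clustering coefficient
centrality = nx.betweenness_centrality(G)                     # which nodes bridge topics

hubs = sorted(centrality, key=centrality.get, reverse=True)[:5]
print(f"connected={is_connected}, communities={len(communities)}, "
      f"avg_clustering={avg_clustering:.3f}, top_hubs={hubs}")
```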
Project: cog-flux
cog-flux provides Cog-based inference for the FLUX image-generation models from Black Forest Labs, supporting two variants: FLUX.1 [schnell] and FLUX.1 [dev].
The project offers several features, including compilation with torch.compile, cuDNN-based fast attention, NSFW checking, and img2img support.
Users can run these models through Replicate's API or in the browser, or customize them on their own hardware.
https://github.com/replicate/cog-flux
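Running a FLUX model through Replicate's Python client might look like the snippet below. The model slug and input fields are assumptions based on Replicate's public FLUX listings; check the model page before relying on them.

```python
# Calling a FLUX model via Replicate's Python client (slug and inputs assumed).
import replicate  # pip install replicate; requires REPLICATE_API_TOKEN in the environment

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a watercolor fox reading a newspaper"},
)
print(output)  # typically a list of URLs to the generated image(s)
```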
Project: Flow-Judge-v0.1
Flow-Judge-v0.1 is an open-source, lightweight language model built for evaluating LLM systems. It is trained on synthetic evaluation data, can run on several inference backends, including Hugging Face Transformers and vLLM, and provides an extensible architecture for defining custom metrics and scoring rubrics.
The design aims to make evaluation more accurate, faster, and easier to customize.
https://github.com/flowaicom/flow-judge
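Because vLLM is one of the supported backends, a bare-bones way to exercise the judge model could look like the sketch below. The Hugging Face model ID and the free-form rubric prompt are assumptions for illustration; the flow-judge library itself wraps prompting and scoring in its own metric templates.

```python
# Bare-bones evaluation prompt via vLLM (model ID and prompt format are assumptions).
from vllm import LLM, SamplingParams

llm = LLM(model="flowaicom/Flow-Judge-v0.1")  # assumed model ID; verify on the Hub

rubric = (
    "Evaluate the RESPONSE for factual consistency with the CONTEXT.\n"
    "Give a score from 1 to 5 and a brief justification.\n\n"
    "CONTEXT: The Eiffel Tower is in Paris.\n"
    "RESPONSE: The Eiffel Tower is located in Berlin.\n"
)
result = llm.generate([rubric], SamplingParams(max_tokens=256, temperature=0.0))
print(result[0].outputs[0].text)
```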
Project: nanoGPT-mup
nanoGPT-mup is a fork of nanoGPT, the simplest, fastest repository for training and fine-tuning medium-sized GPT models.
It adds a minimal implementation of the maximal update parameterization (muP) and serves as companion material to a practitioner's guide on muP.
The project's code is clear and easy to read, making it suitable for training new models from scratch or fine-tuning pre-trained checkpoints.
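For context, the core muP idea the repo demonstrates is: tune hyperparameters on a narrow base model, then shrink the hidden-weight initialization variance and Adam learning rate with the width multiplier and damp the output logits by the same factor, so the tuned settings transfer to wider models. The sketch below illustrates those commonly quoted rules in isolation; the exact parameterization, including the treatment of embeddings and the readout, follows the guide and the repository, not this snippet.

```python
# Simplified illustration of muP-style width scaling for Adam (not the repo's implementation).
import torch
import torch.nn as nn

base_width, width = 256, 1024
m = width / base_width                    # width multiplier relative to the tuned base model
base_lr, base_std = 3e-4, 0.02            # hyperparameters tuned at base_width (illustrative values)

hidden = nn.Linear(width, width, bias=False)
nn.init.normal_(hidden.weight, std=base_std / m**0.5)     # hidden init variance shrinks like 1/m

readout = nn.Linear(width, 50304, bias=False)             # LM head; vocab size illustrative
nn.init.zeros_(readout.weight)                            # zero-initializing the readout is a common muP choice

def lm_logits(hidden_states: torch.Tensor) -> torch.Tensor:
    return readout(hidden_states) / m                     # 1/m multiplier on the output logits

optimizer = torch.optim.AdamW(
    [{"params": hidden.weight, "lr": base_lr / m}]        # hidden-weight Adam LR scales like 1/m
)
```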