Today's Open Source (2024-10-30): SD 3.5 Medium Open Source Release
Explore SD 3.5 Medium's text-to-image generation, Eliza's multi-agent framework, and cutting-edge AI orchestration tools like Dynamiq and HyperCloning.
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: SD 3.5 Medium
Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer (MMDiT) model focused on text-to-image generation.
This model shows significant improvements in image quality, typography, complex prompt comprehension, and resource efficiency.
With 2.5 billion parameters, it uses an improved MMDiT-X architecture and training methods to generate images at resolutions from 0.25 to 2 megapixels.
https://github.com/Stability-AI/sd3.5
https://huggingface.co/stabilityai/stable-diffusion-3.5-medium
Project: Eliza
Eliza is a multi-agent simulation framework supporting conversational agents on Twitter and Discord platforms.
The project allows users to add multiple unique characters and provides full Discord and Twitter connectors, including support for Discord voice channels.
Eliza features RAG memory for conversations and documents, enabling it to read links and PDFs, transcribe audio and video, and summarize conversations.
Eliza is highly extensible: users can create their own actions and clients to expand its functionality.
By default, it supports the Nous Hermes Llama 3.1B model for local inference and OpenAI for cloud inference.
https://github.com/ai16z/eliza
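Characters in Eliza are defined as JSON files. The sketch below is a hypothetical minimal character; the field names are recalled from the example characters bundled with the repository and should be checked against the current schema:

```json
{
  "name": "TechBot",
  "clients": ["discord", "twitter"],
  "bio": ["An upbeat open-source enthusiast who tracks new AI releases."],
  "lore": ["Has strong opinions about diffusion models."],
  "style": {
    "all": ["concise", "friendly"],
    "chat": ["asks follow-up questions"],
    "post": ["short, punchy takes"]
  }
}
```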
Project: Dynamiq
Dynamiq is an orchestration framework for agents and large language model (LLM) applications, aimed at simplifying the development of AI-powered applications.
It focuses on retrieval-augmented generation (RAG) and orchestration of LLM agents, providing an all-in-one generative AI solution.
https://github.com/dynamiq-ai/dynamiq
Project: VLMEvalKit
VLMEvalKit is an open-source evaluation toolkit designed for large vision-language models (LVLMs).
It supports the evaluation of around 100 vision-language models and covers over 40 benchmarks.
Using generation-based evaluation, the toolkit removes the heavy data-preparation work otherwise spread across multiple repositories, and reports results obtained via both exact matching and LLM-based answer extraction.
https://github.com/open-compass/VLMEvalKit
Project: MMIE
MMIE is a large-scale benchmark for evaluating interleaved multimodal comprehension and generation in large vision-language models (LVLMs).
It provides a robust framework for assessing these capabilities across diverse domains, backed by reliable automated metrics.
https://github.com/Lillianwei-h/MMIE
Project: HyperCloning
HyperCloning is a project for accelerating the pre-training of large language models by initializing them from smaller pre-trained models.
The expanded network preserves the function of the small model, transferring its knowledge so that the larger model starts from higher accuracy and reaches its final quality in fewer training steps.
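The core idea of function-preserving expansion can be illustrated with a single linear layer: tiling a small weight matrix and rescaling reproduces the small layer's outputs (duplicated) when the input is likewise duplicated. This is a simplified numpy sketch of the principle, not the project's actual implementation:

```python
import numpy as np


def clone_linear(w_small: np.ndarray, factor: int = 2) -> np.ndarray:
    """Expand a (d_out, d_in) weight to (factor*d_out, factor*d_in).

    Tiling the weight and dividing by `factor` keeps the layer
    function-preserving: feeding the expanded layer an input that is the
    original input duplicated `factor` times yields the original output
    duplicated `factor` times.
    """
    return np.tile(w_small, (factor, factor)) / factor


rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3))   # small layer: 3 -> 4
x = rng.standard_normal(3)

w_big = clone_linear(w)           # expanded layer: 6 -> 8
x_big = np.tile(x, 2)             # duplicated input

y_small = w @ x
y_big = w_big @ x_big             # equals y_small duplicated twice

assert np.allclose(y_big, np.tile(y_small, 2))
```

Because the expanded model computes the same function as the small one at initialization, training continues from the small model's accuracy rather than from scratch.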