Today's Open Source (2024-10-12): SJTU Releases libcom, All-in-One Image Composition Toolbox!

Explore top AI open-source projects like libcom for image composition, MLE-Bench for benchmarking AI agents, and Swarm for multi-agent coordination.

Oct 12, 2024

Here are some interesting AI open-source models and frameworks I wanted to share today:

Project: libcom

libcom is an image composition toolbox. It inserts a foreground object into a background image and produces a realistic composite by resolving inconsistencies in appearance, geometry, and semantics between the two.

libcom covers various tasks related to image composition, including image blending, standard/artistic image harmonization, shadow generation, object placement, generative composition, quality evaluation, and more.

For each task, libcom integrates one or two methods that balance efficiency and effectiveness, and it is updated continuously as better approaches become available.
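
To make the simplest of these tasks concrete, here is a minimal sketch of mask-based alpha blending in plain Python. It is not libcom's API or implementation: the function name `alpha_blend` and the tiny 2x2 grayscale images are hypothetical, chosen only to show the idea of mixing foreground and background pixels under a mask.

```python
def alpha_blend(foreground, background, mask):
    """Blend two same-sized grayscale images (lists of rows).

    mask holds values in [0, 1]: 1.0 keeps the foreground pixel,
    0.0 keeps the background pixel, and values in between mix the two.
    """
    return [
        [m * f + (1.0 - m) * b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, mask)
    ]


# Hypothetical 2x2 images, purely for illustration.
foreground = [[200.0, 200.0], [200.0, 200.0]]
background = [[100.0, 100.0], [100.0, 100.0]]
mask = [[1.0, 0.5], [0.0, 0.25]]

composite = alpha_blend(foreground, background, mask)
print(composite)
```

Blending of this kind only mixes pixels. The harmonization, shadow generation, and placement tasks listed above address the remaining mismatches that a plain blend leaves visible.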

https://github.com/bcmi/libcom

Project: MLE-Bench

MLE-Bench is a benchmarking tool for evaluating the performance of AI agents in machine learning engineering.

The project provides tools and datasets for testing how well AI systems handle machine learning tasks.

It draws on 75 Kaggle competition datasets to give a thorough evaluation of agents' engineering abilities.

https://github.com/openai/mle-bench/

Project: LLM Evaluation Guidebook

The LLM Evaluation Guidebook explains how to evaluate large language models (LLMs).

It offers multiple methods for assessing models, guides users in designing their own evaluations, and shares tips and tricks from practical experience.

Whether you're a developer of production models, a researcher, or an enthusiast, you'll find the information you need.

https://github.com/huggingface/evaluation-guidebook

Project: lm.rs

lm.rs is a minimal language model inference project written in Rust, designed to run locally on CPUs.

Inspired by Karpathy’s llama2.c and llm.c, it aims to perform complete language model inference without relying on machine learning libraries.

It currently supports the multimodal Phi-3.5-vision model and the text-only Phi-3.5-mini model, along with Llama 3.2 and Gemma 2 models. It also provides quantized models for better performance.
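
For readers unfamiliar with quantization, the sketch below shows the general idea using symmetric 8-bit quantization in plain Python. This is an assumption for illustration, not lm.rs's actual scheme or code: the function names and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Avoid dividing by zero when every weight is 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, purely for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized, restored)
```

Storing one byte per weight instead of four cuts memory use and bandwidth, which matters most for CPU inference. The cost is a small rounding error, at most half the scale per weight.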

https://github.com/samuel-vitorino/lm.rs

Project: CogVideoX Factory

CogVideoX Factory is a collection of memory-optimized fine-tuning scripts for the video generation model CogVideoX.

The project uses TorchAO and DeepSpeed to support fine-tuning video models within 24 GB of GPU memory.

Users can leverage this project to perform text-to-video and image-to-video generation tasks, with various fine-tuning and training scripts available to accommodate different generation needs.

https://github.com/a-r-r-o-w/cogvideox-factory

Project: Swarm

Swarm is an experimental framework managed by the OpenAI Solutions team, designed to build, coordinate, and deploy multi-agent systems.

Its primitives for agent orchestration and handoffs keep agent coordination and execution lightweight, highly controllable, and easy to test.

Swarm primarily demonstrates the handoff and routine patterns explored in "Orchestrating Agents: Handoffs & Routines," and it suits scenarios that involve a large number of independent functions and instructions.
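
The handoff pattern can be sketched in a few lines of plain Python. This is not Swarm's API: the `Agent` class, the `run` loop, and the keyword matching below are hypothetical stand-ins, with keyword matching taking the place of the language model's decision to transfer the conversation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Agent:
    name: str
    instructions: str
    # Maps a trigger keyword to a function that returns the next agent.
    handoffs: Dict[str, Callable[[], "Agent"]] = field(default_factory=dict)

    def respond(self, message: str) -> Optional["Agent"]:
        """Return the agent to hand off to, or None to keep the conversation."""
        for keyword, transfer in self.handoffs.items():
            if keyword in message.lower():
                return transfer()
        return None


def run(agent: Agent, messages: List[str]) -> List[str]:
    """Process messages in order, recording which agent handled each one."""
    handled_by = []
    for message in messages:
        next_agent = agent.respond(message)
        if next_agent is not None:
            agent = next_agent  # the handoff: a new agent takes over
        handled_by.append(agent.name)
    return handled_by


refunds = Agent("refunds", "Process refund requests.")
triage = Agent(
    "triage",
    "Route the user to the right agent.",
    handoffs={"refund": lambda: refunds},
)

trace = run(triage, ["Hello", "I need a refund", "Thanks"])
print(trace)
```

Because each agent is just a set of instructions plus the functions it may call, a handoff in this sketch is nothing more than a function that returns another agent, which keeps the routing easy to test in isolation.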

https://github.com/openai/swarm
