Open Source Today (2024-09-14): Tencent Unveils GameGen-O, First Open-World Game Model

Tencent GameGen-O is the first model for generating open-world video games using diffusion transformers. It supports interactive control through text, action, and video prompts.

Sep 14, 2024

Here are some interesting AI open-source models and frameworks I wanted to share today:

Project: Tencent GameGen-O

Tencent released GameGen-O, the first diffusion transformer model designed for generating open-world video games. It uses the proprietary OGameData dataset and GPT-4o for data annotation. The model follows the Latte and OpenSora V1.2 framework principles.

GameGen-O simulates a wide range of game engine features, including innovative characters, dynamic environments, complex actions, and diverse events, enabling high-quality open-world generation.

It also supports interactive control: users can steer the generated game content with text, action, and video prompts.

https://github.com/GameGen-O/GameGen-O/

Project: GenAI_Agents

GenAI_Agents is a comprehensive resource offering tutorials and implementations on generative AI agents, ranging from basic to advanced.

It aims to help users build intelligent, interactive AI systems, from simple chatbots to complex multi-agent setups.

https://github.com/NirDiamant/GenAI_Agents

Project: AI Youtube Shorts Generator

AI Youtube Shorts Generator is a Python tool that uses GPT-4, FFmpeg, and OpenCV to analyze videos automatically, pick out the most interesting segments, and reformat them as short-form clips.

The tool is currently in version 0.1, so there may be some bugs.
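To make the "find the best part, then cut it" idea concrete, here is a minimal sketch. It is not the project's code: the per-second interest scores are hypothetical stand-ins for whatever the GPT-4 analysis produces, and the helper names `best_window` and `ffmpeg_cut_command` are made up for illustration. Only the FFmpeg options (`-ss`, `-i`, `-t`, `-vf crop=...`, `-c:a copy`) are standard.

```python
def best_window(scores, window):
    """Return the start index of the contiguous window with the highest total score."""
    if window <= 0 or window > len(scores):
        raise ValueError("window must be between 1 and len(scores)")
    current = sum(scores[:window])
    best_sum, best_start = current, 0
    # Slide the window one step at a time, updating the running sum.
    for start in range(1, len(scores) - window + 1):
        current += scores[start + window - 1] - scores[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start


def ffmpeg_cut_command(src, dst, start_s, duration_s):
    """Build an FFmpeg argument list that cuts a segment and crops it to 9:16."""
    return [
        "ffmpeg",
        "-ss", str(start_s),        # seek to the segment start (seconds)
        "-i", src,
        "-t", str(duration_s),      # keep this many seconds
        "-vf", "crop=ih*9/16:ih",   # centered vertical crop for short-form video
        "-c:a", "copy",             # leave the audio stream untouched
        dst,
    ]


# Hypothetical interest scores, one per second of video.
scores = [1, 2, 8, 9, 7, 1, 0]
start = best_window(scores, window=3)
command = ffmpeg_cut_command("input.mp4", "short.mp4", start, 3)
```

The argument list can be passed to `subprocess.run(command, check=True)` on a machine with FFmpeg installed; the file names here are placeholders.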

https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator

Project: AddressCLIP

Developed by the Chinese Academy of Sciences and Alibaba Cloud, AddressCLIP is a large model for street-level geolocation from a single photo.

It is a visual-language model designed to locate an image within a city based on street view photos.

The project proposes an end-to-end framework that solves image address localization through image-text alignment and image-geo matching. The authors also built three datasets for image address localization, and the method performs strongly on them.
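The image-text alignment idea can be illustrated with a toy example: embed the photo and each candidate address in the same vector space, then pick the address whose embedding is closest to the photo's. The sketch below is only an assumption-laden illustration, not AddressCLIP's code. The three-dimensional vectors and address names are invented, and a real system would get its embeddings from trained image and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def locate(image_embedding, address_embeddings):
    """Return (address, score) for the address most similar to the image."""
    scored = (
        (address, cosine(image_embedding, embedding))
        for address, embedding in address_embeddings.items()
    )
    return max(scored, key=lambda pair: pair[1])


# Made-up text embeddings for three hypothetical addresses.
ADDRESSES = {
    "Example Road": [0.9, 0.1, 0.0],
    "Sample Street": [0.1, 0.9, 0.1],
    "Demo Avenue": [0.0, 0.2, 0.9],
}

# Made-up image embedding for a street photo.
photo = [0.2, 0.8, 0.1]

best_address, score = locate(photo, ADDRESSES)
print(best_address, round(score, 3))
```

With these toy vectors the photo lands on "Sample Street", because its embedding points in nearly the same direction as that address's embedding.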

https://github.com/xsx1001/AddressCLIP

Project: FLUX-Controlnet-Inpainting

Released by Alibaba’s Alimama creative team, FLUX-Controlnet-Inpainting is an image inpainting project.

It provides an inpainting ControlNet checkpoint for the FLUX.1-dev model, intended for image inpainting and content generation and optimized for inference at 768x768 resolution.

The project is in its alpha version, with future updates planned.

https://github.com/alimama-creative/FLUX-Controlnet-Inpainting

Project: doc-comments-ai

doc-comments-ai is a tool that uses large language models (LLMs) to generate code documentation.

With just a few terminal commands, users can generate documentation using OpenAI or fully local LLMs.

It integrates LangChain, Tree-sitter, llama.cpp, and Ollama, supports multiple programming languages, and can run fully locally so that no code leaves your machine.

https://github.com/fynnfluegge/doc-comments-ai
