Today's Open Source (2024-09-05): Yi-Coder Models 1.5B & 9B, 128K Context Support
Explore top AI open-source projects like Yi-Coder, TinyAgent, GuideLLM, and more, offering advanced code generation, reasoning, and layout recovery models.
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: Yi-Coder
Yi-Coder is an open-source code generation model series from 01.AI. It comes in 1.5B and 9B parameter sizes, each available in base and chat versions.
Yi-Coder supports a 128K context length. The 9B version builds on Yi-9B with an additional 2.4 trillion high-quality training tokens sourced from GitHub repositories and from code data filtered out of CommonCrawl.
Yi-Coder-9B outperforms other models under 10B parameters, such as CodeQwen1.5 7B and CodeGeeX4 9B, and is even competitive with DeepSeek-Coder 33B.
https://github.com/01-ai/Yi-Coder
https://huggingface.co/01-ai/Yi-Coder-1.5B
https://huggingface.co/01-ai/Yi-Coder-9B
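For readers who want to try the model, here is a minimal sketch of code completion with the base 1.5B checkpoint linked above. It assumes the Hugging Face transformers library and PyTorch are installed. The helper names build_prompt and complete are made up for this illustration, and the generation settings are arbitrary defaults, not recommendations from the Yi-Coder authors.

```python
def build_prompt(signature: str, docstring: str) -> str:
    """Format a function stub for a base (non-chat) code model to complete."""
    return f'{signature}\n    """{docstring}"""\n'


def complete(prompt: str,
             model_id: str = "01-ai/Yi-Coder-1.5B",
             max_new_tokens: int = 64) -> str:
    """Continue the prompt with the given causal language model."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the stub and let the model generate the function body.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling complete(build_prompt("def fibonacci(n: int) -> int:", "Return the n-th Fibonacci number.")) downloads the weights on first use and returns the stub followed by the model's continuation. The chat versions would normally be prompted through the tokenizer's chat template instead of a raw stub.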
Project: TinyAgent
TinyAgent focuses on bringing advanced reasoning and function-calling capabilities to small language models (SLMs) that can be safely deployed on edge devices.
It is trained on high-quality, curated data and uses the LLMCompiler to manage function calls effectively.
TinyAgent interacts with macOS apps to assist users with tasks like writing emails, managing contacts, scheduling events, and organizing Zoom meetings.
https://github.com/SqueezeAILab/TinyAgent
https://arxiv.org/pdf/2409.00608
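To make the function-calling idea concrete, here is a toy sketch of an LLMCompiler-style plan: a planner model emits numbered tool calls, and later steps refer to earlier results with placeholders such as $1. This is not TinyAgent's code. The two tools, the contact data, and the plan format are hypothetical stand-ins chosen for illustration.

```python
import re
from typing import Callable

# Hypothetical tools standing in for real app integrations.
def get_email_address(name: str) -> str:
    contacts = {"Alice": "alice@example.com"}
    return contacts[name]

def compose_email(to: str, subject: str) -> str:
    return f"Draft to {to}: {subject}"

TOOLS: dict[str, Callable[..., str]] = {
    "get_email_address": get_email_address,
    "compose_email": compose_email,
}

# A plan as a planner model might emit it: (step id, tool name, arguments).
# "$N" means "substitute the result of step N".
PLAN = [
    (1, "get_email_address", ["Alice"]),
    (2, "compose_email", ["$1", "Project update"]),
]

def run_plan(plan):
    """Execute steps in order, resolving $N placeholders from earlier results."""
    results: dict[int, str] = {}
    for step_id, tool_name, args in plan:
        resolved = [
            results[int(arg[1:])] if re.fullmatch(r"\$\d+", arg) else arg
            for arg in args
        ]
        results[step_id] = TOOLS[tool_name](*resolved)
    return results
```

This sketch runs the steps one after another. Because the placeholders make dependencies explicit, a fuller executor could also dispatch steps that do not depend on each other in parallel.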
Project: RapidLayoutRecover
RapidLayoutRecover restores layout information from document images, combining layout analysis, OCR, table, and formula recognition.
It exports document images directly to Word or TXT files, making their content easier to reuse and edit.
https://github.com/RapidAI/RapidLayoutRecover
Project: GuideLLM
GuideLLM is a tool for evaluating and optimizing the deployment of large language models (LLMs).
By simulating real-world workloads, GuideLLM helps users assess LLM performance, resource needs, and cost on various hardware setups.
This helps teams run efficient, scalable, and cost-effective LLM inference services without sacrificing output quality.
https://github.com/neuralmagic/guidellm
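As a rough illustration of what such an evaluation measures, the sketch below summarizes one simulated load test into throughput and latency figures. It does not use GuideLLM's API. The function name summarize, its inputs, and the choice of metrics are assumptions made for this example.

```python
import math
import statistics

def summarize(latencies_s, output_tokens, wall_time_s):
    """Summarize one benchmark run.

    latencies_s: end-to-end latency of each request, in seconds
    output_tokens: number of tokens generated for each request
    wall_time_s: total duration of the run, in seconds
    """
    ordered = sorted(latencies_s)
    # Nearest-rank 95th percentile: the smallest value with at least
    # 95% of the samples at or below it.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "requests_per_s": len(ordered) / wall_time_s,
        "output_tokens_per_s": sum(output_tokens) / wall_time_s,
        "mean_latency_s": statistics.mean(ordered),
        "p95_latency_s": ordered[p95_index],
    }
```

Comparing these numbers across hardware setups and request rates, together with the hourly cost of each setup, gives the kind of performance and cost picture described above.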
Project: real_world_prompting
Anthropic offers this comprehensive prompting tutorial to help experienced developers learn more about prompt engineering. It includes five courses covering how to apply advanced prompting techniques to real-world tasks.
https://github.com/anthropics/courses/tree/master/real_world_prompting
Project: AlphaNLHoldem
AlphaNLHoldem is an unofficial implementation of AlphaHoldem, a heads-up (1v1) no-limit Texas Hold'em poker AI. The implementation is built with TensorFlow and Ray.
It provides a clean codebase for applying model-free reinforcement learning to Hold'em-like games and aims to replicate AlphaHoldem's results.