Today's Open Source (2024-09-09): DeepSeek-V2.5 Combines General and Coding Capabilities in an Upgrade
Explore open-source AI projects including DeepSeek-V2.5, DocAI, and Rerankers, covering research automation, document extraction, and collaborative AI systems.
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: DeepSeek-V2.5
DeepSeek-V2.5 is an upgraded version of DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
The new model merges the general-purpose and coding capabilities of those two models. It is also better aligned with human preferences, with improvements in writing and instruction following.
https://huggingface.co/deepseek-ai/DeepSeek-V2.5
Project: DocAI
DocAI is a project that extracts structured data from unstructured documents.
It uses Answer.AI's Byaldi, OpenAI's GPT-4o, and LangChain’s structured output features.
The tool automates information extraction from a variety of documents, which is useful across many domains.
https://github.com/madisonmay/docai
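To make the structured-output idea concrete, here is a minimal sketch in plain Python. It is not DocAI's code and does not call Byaldi, GPT-4o, or LangChain. The `Invoice` schema, its field names, and the `parse_structured_output` helper are all hypothetical, and the model reply is a hard-coded stand-in. The sketch only shows the general pattern: declare the fields you want, then validate the model's JSON reply against that schema.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Invoice:
    """Hypothetical target schema; the field names are illustrative only."""
    vendor: str
    invoice_number: str
    total: float


def parse_structured_output(raw: str) -> Invoice:
    """Check a JSON reply against the Invoice schema and coerce its types."""
    data = json.loads(raw)
    expected = {f.name for f in fields(Invoice)}
    missing = expected - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Invoice(
        vendor=str(data["vendor"]),
        invoice_number=str(data["invoice_number"]),
        total=float(data["total"]),
    )


# Stand-in for a model reply; a real pipeline would obtain this from an LLM call.
reply = '{"vendor": "Acme Corp", "invoice_number": "INV-001", "total": "1250.50"}'
invoice = parse_structured_output(reply)
print(invoice)
```

Failing loudly on a missing field, rather than filling in a default, keeps incomplete extractions from passing silently into downstream systems.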
Project: Rerankers
Rerankers is a lightweight, low-dependency library that supports the popular reranking and cross-encoder models.
It gives users one simple API for accessing different reranking models, regardless of their architecture. The project is developed by the AnswerDotAI team.
https://github.com/answerdotai/rerankers
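The idea of a single API over interchangeable reranking models can be sketched in plain Python. This is not the Rerankers library's API. `ToyReranker`, `RankedDoc`, and the `word_overlap` scorer are hypothetical names, and the word-overlap score is a toy stand-in for a real cross-encoder. The point is only that callers use the same `rank()` call no matter which scorer sits behind it.

```python
from dataclasses import dataclass
from typing import Callable, List

# A scorer maps (query, document) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


@dataclass
class RankedDoc:
    text: str
    score: float
    rank: int


def word_overlap(query: str, doc: str) -> float:
    """Toy scorer: fraction of query words that also appear in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


class ToyReranker:
    """Hypothetical wrapper exposing one rank() call over any scorer."""

    def __init__(self, scorer: Scorer):
        self.scorer = scorer

    def rank(self, query: str, docs: List[str]) -> List[RankedDoc]:
        # Score every document, then sort from most to least relevant.
        scored = sorted(
            ((self.scorer(query, d), d) for d in docs),
            key=lambda pair: pair[0],
            reverse=True,
        )
        return [
            RankedDoc(text=d, score=s, rank=i + 1)
            for i, (s, d) in enumerate(scored)
        ]


docs = [
    "Bananas are rich in potassium.",
    "A cross-encoder scores a query and a document together.",
    "Reranking reorders retrieved documents by relevance to the query.",
]
results = ToyReranker(word_overlap).rank("how does reranking score a document", docs)
for r in results:
    print(r.rank, round(r.score, 2), r.text)
```

Swapping in a different model only means passing a different scorer function; the calling code stays the same.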
Project: AI-Driven Research Assistant
AI-Driven Research Assistant is a multi-agent system designed to automate complex research processes.
It uses LangChain, OpenAI's GPT models, and LangGraph to simplify tasks such as hypothesis generation, data analysis, visualization, and report writing. It is aimed at researchers and data scientists who want to improve their productivity.
https://github.com/starpig1129/ai-data-analysis-MulitAgent
Project: GenAgent
The GenAgent framework creates workflows to build collaborative AI systems. These workflows are turned into code to help LLM agents better understand them.
GenAgent learns from human-designed workflows and can create new ones. These workflows can be used as collaborative systems to complete complex tasks.
https://github.com/xxyQwQ/GenAgent
Project: legkilo-dataset
The Leg-KILO Dataset contains recordings from a quadruped robot (Unitree Go1), including joint encoder, contact sensor, IMU, and LiDAR data.
The dataset supports research on and development of Kinematic-Inertial-Lidar Odometry (KILO) for dynamic quadruped robots in different environments.