Today's Open Source Releases (2024-08-06): InternLM2.5 20B, ModelBest's Multimodal MiniCPM-V 2.6
Explore the latest AI open-source models: InternLM2.5-20B, CogVideoX, MiniCPM-V 2.6, Palmyra-Med-Fin, DistillKit, and pdfdeal. Enhance your AI projects today!
Here are some interesting AI open-source models and frameworks I wanted to share today:
Project: InternLM2.5/InternLM2.5-20B
InternLM2.5 has released the 20B-parameter base model InternLM2.5-20B and the chat-tuned model InternLM2.5-20B-chat. These models are reported to outperform Gemma 2 27B and come close to Llama 3.1 70B on various benchmarks.
InternLM2.5, developed by Shanghai AI Laboratory, also includes InternLM2.5-7B, InternLM2.5-7B-Chat, and InternLM2.5-7B-Chat-1M. The series supports tool use that can gather and analyze information from hundreds of web pages, with stronger instruction comprehension, tool selection, and reflection on results.
https://huggingface.co/internlm/internlm2_5-20b
https://huggingface.co/internlm/internlm2_5-20b-chat
Project: CogVideoX
Zhipu has open-sourced its video generation model CogVideoX, which shares its lineage with the company's previously released video product "Qingying."
The CogVideoX series includes models of several sizes; CogVideoX-2B is the one currently available. At FP16 precision it requires 18GB of VRAM for inference and 40GB for fine-tuning (batch size 1). The model accepts prompts of up to 226 tokens and generates 6-second videos at 8 fps and a resolution of 720x480.
https://huggingface.co/THUDM/CogVideoX-2b
https://github.com/THUDM/CogVideo
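To make the CogVideoX-2B figures concrete, here is a small arithmetic sketch. It only uses the numbers quoted above (6 seconds, 8 fps, 720x480, 226 prompt tokens). The helper names are made up for illustration and are not part of the CogVideo codebase, and the real pipeline may produce a frame more or fewer than the nominal count.

```python
# Figures taken from the description above; helper names are illustrative only.
DURATION_S = 6
FPS = 8
WIDTH, HEIGHT = 720, 480
MAX_PROMPT_TOKENS = 226


def nominal_frame_count(duration_s: int = DURATION_S, fps: int = FPS) -> int:
    """Frames implied by duration x frame rate (a real pipeline may differ by one)."""
    return duration_s * fps


def raw_clip_bytes(frames: int, width: int = WIDTH, height: int = HEIGHT,
                   channels: int = 3) -> int:
    """Size of the uncompressed clip, assuming 8-bit RGB (one byte per channel)."""
    return frames * width * height * channels


def fits_prompt_budget(token_ids: list, limit: int = MAX_PROMPT_TOKENS) -> bool:
    """True if an already-tokenized prompt stays within the stated token limit."""
    return len(token_ids) <= limit


frames = nominal_frame_count()
print(frames)                  # 48
print(raw_clip_bytes(frames))  # 49766400 bytes, roughly 50 MB before encoding
```

A prompt longer than the limit would need to be shortened or truncated before generation; how the actual tokenizer counts tokens is outside this sketch.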
Project: MiniCPM-V 2.6
MiniCPM-V 2.6 is the latest and most powerful model in the MiniCPM-V series by ModelBest (OpenBMB).
Built on SigLip-400M and Qwen2-7B, it has a total of 8B parameters. MiniCPM-V 2.6 shows significant performance improvements over MiniCPM-Llama3-V 2.5 and introduces new capabilities for multi-image and video understanding.
https://huggingface.co/openbmb/MiniCPM-V-2_6
https://huggingface.co/openbmb/MiniCPM-V-2_6-int4
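Since an int4 variant is published alongside the full-precision model, a rough back-of-the-envelope estimate shows why that matters. The sketch below only counts the memory needed to hold 8B weights and assumes exactly 16 or 4 bits per parameter; activations, the KV cache, and quantization metadata are ignored, so real usage will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GiB (no activations or KV cache)."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 8e9  # total parameter count quoted above

for label, bits in [("fp16", 16), ("int4", 4)]:
    print(f"{label}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
# fp16: ~14.9 GiB
# int4: ~3.7 GiB
```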
Project: Palmyra-Med-Fin
Writer has introduced Palmyra-Med and Palmyra-Fin, two specialized models for the medical and financial industries.
According to Writer, these models surpass GPT-4, Med-PaLM-2, and Claude 3.5 Sonnet on domain-specific tasks. Palmyra-Med scores an average of 85.9% on medical benchmarks, while Palmyra-Fin passed the CFA Level 3 exam with a score of 73%. The models are available via Writer's API, no-code tools, and the Writer framework, and support local or private cloud deployment.
https://huggingface.co/Writer/Palmyra-Fin-70B-32K
https://huggingface.co/Writer/Palmyra-Med-70B-32K
https://writer.com/blog/palmyra-med-fin-models/
Project: DistillKit
DistillKit, open-sourced by Arcee-AI, simplifies and accelerates the distillation of large language models (LLMs).
It offers modules for data preprocessing, model training, performance optimization, and evaluation. Its API and integrated architecture make model compression and deployment easier, particularly in resource-limited environments, which can improve application response times and reduce computing costs.
https://github.com/arcee-ai/DistillKit
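For readers new to distillation, here is a minimal sketch of the classic logit-based objective, in which a student model is trained to match a teacher's softened output distribution. This is the generic textbook technique written in plain Python for one token position. It is not DistillKit's API, and the function names and the temperature value are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL between softened teacher and student distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p, q)


teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, teacher))              # 0.0 (student matches teacher)
print(distillation_loss(teacher, [0.5, 1.0, 4.0]) > 0)  # True (student disagrees)
```

In practice this term is usually averaged over all token positions and mixed with the ordinary cross-entropy loss on the training labels.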
Project: pdfdeal
pdfdeal is a Python wrapper for the Doc2X API that also offers local PDF processing. It aims to improve recall over PDF content in retrieval-augmented generation (RAG).
The project supports converting PDF files to Markdown or LaTeX, includes OCR for image text recognition, and provides various useful file-processing tools.
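One reason Markdown output can help a RAG pipeline is that headings give natural chunk boundaries. The sketch below is a generic illustration of heading-based chunking on an already converted Markdown string. It does not call pdfdeal or Doc2X, and the function name and sample text are hypothetical.

```python
import re


def split_markdown_by_heading(markdown: str) -> list:
    """Split Markdown into chunks, one per heading, keeping the heading as a title."""
    chunks = []
    title, body = "", []

    def flush():
        # Emit the current chunk unless it is completely empty.
        if title or any(line.strip() for line in body):
            chunks.append({"title": title, "text": "\n".join(body).strip()})

    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)", line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return chunks


sample = "# Intro\nHello.\n\n## Method\nStep one.\nStep two.\n"
for chunk in split_markdown_by_heading(sample):
    print(chunk)
```

Each chunk can then be embedded and indexed separately, with the title stored as metadata for retrieval.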