APMIC CaiGunn Model
| | CaiGunn 34B | CaiGunn 34Bx2 (Coming soon) |
|---|---|---|
| Training architecture | APMIC Brainformers + Llama | Mamba + Transformer + MoDE |
| Computing architecture | NeMo Framework | ANF |
| Hardware requirements | | |
| Features | | |
Cost halved, with no compromise on performance

- After completing Imitation Learning with APMIC, inference cost can be reduced by over 50%.
- Outperforms well-known models in Chinese and English on specific evaluation criteria.
- Supports domain-specific fine-tuning via the CaiGunn platform.
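The imitation learning mentioned above works much like knowledge distillation: a smaller student model is trained to match a larger teacher's output distribution, and serving the smaller student is what cuts inference cost. A minimal NumPy sketch of a distillation-style loss (an illustrative assumption on our part, not APMIC's actual training code):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among non-top classes.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) over softened distributions; minimizing it
    # trains the student to imitate the teacher's behavior.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean())

teacher = np.array([[4.0, 1.0, 0.5]])
close_student = np.array([[3.8, 1.1, 0.4]])
far_student = np.array([[0.5, 1.0, 4.0]])
# A student whose logits track the teacher's incurs a lower loss.
assert distillation_loss(close_student, teacher) < distillation_loss(far_student, teacher)
```

In practice this loss term is combined with the usual next-token objective; the cost saving comes entirely from the student's smaller size at serving time.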
MMLU Rankings (English)

| Company Name | Model | Score |
|---|---|---|
| OpenAI | GPT-4 (not deployable) | 86.5 |
| APMIC | CaiGunn | 75.7 |
| Google | Gemini Pro (not deployable) | 71.8 |
| Mistral AI | Mixtral-8x7B | 71.4 |
| OpenAI | GPT-3.5 (not deployable) | 70.0 |
| Meta | LLaMA 65B | 68.9 |
| Google | Gemma 7B | 64.6 |
TMMLU+ Rankings (Traditional Chinese)

| Company Name | Model | Score |
|---|---|---|
| OpenAI | GPT-4 (not deployable) | 60.40 |
| APMIC | CaiGunn-zh | 55.20 |
| MediaTek | Breeze-7B | 40.35 |
| Mistral AI | Mixtral-8x7B | 36.93 |
| NTU | Taiwan-LLM-13B | 21.36 |
| Innolux | Bailong-instruct-7B | 6.80 |
Capable of Handling 21 Times the Data of OpenAI
Input Data Volume That Can Be Processed at the Same Cost

| Company Name | Model | Tokens |
|---|---|---|
| APMIC | CaiGunn | 21.4x |
| Google | Gemini | 12.0x |
| OpenAI | GPT-3.5 Turbo | 1x |
Output Data Volume That Can Be Processed at the Same Cost

| Company Name | Model | Tokens |
|---|---|---|
| APMIC | CaiGunn | 8.7x |
| Google | Gemini | 5.3x |
| OpenAI | GPT-3.5 Turbo | 1x |
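The multiples in the two tables above can be read as per-token cost ratios: if a model processes N times the tokens for the same spend, its per-token cost is 1/N of the baseline's. A short sketch of that arithmetic (the figures are the ones from the tables; the helper name is ours):

```python
# Relative token volume processed at equal spend, normalized to
# GPT-3.5 Turbo = 1x (figures from the tables above).
input_multiple = {"CaiGunn": 21.4, "Gemini": 12.0, "GPT-3.5 Turbo": 1.0}
output_multiple = {"CaiGunn": 8.7, "Gemini": 5.3, "GPT-3.5 Turbo": 1.0}

def relative_cost_per_token(volume_multiple: float) -> float:
    # N times the tokens at equal spend => 1/N the per-token cost.
    return 1.0 / volume_multiple

for model in input_multiple:
    cin = relative_cost_per_token(input_multiple[model])
    cout = relative_cost_per_token(output_multiple[model])
    print(f"{model}: input ~{cin:.3f}x, output ~{cout:.3f}x of baseline per-token cost")
```

On this reading, CaiGunn's input tokens cost roughly 5% of the baseline's and its output tokens roughly 11%.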
Flexible Free Deployment
Instant Cloud Usage
Use well-known large language models instantly through CaiGunn. The platform includes built-in automatic text preprocessing, retrieval-augmented generation (RAG), image-and-text output, version control, model testing, preview, and deployment. It also supports custom model training, fine-tuning, and inference in the cloud, so you can run your own models there.
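The RAG step the platform automates follows a standard shape: retrieve the passages most relevant to the user's question, then build a grounded prompt for the model. A self-contained toy sketch (a bag-of-words retriever stands in for a real vector store; CaiGunn's internal pipeline is not public, so none of these names are its API):

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    # Lowercase word counts; keeps hyphenated terms like "on-premises" whole.
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    # Rank documents by similarity to the query, keep the best top_k.
    q = tokens(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokens(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    # Ground the model's answer in the retrieved passages only.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "CaiGunn supports on-premises deployment on NVIDIA DGX hardware.",
    "The cafeteria menu changes every Tuesday.",
    "Fine-tuned domain models can lower inference cost.",
]
print(build_prompt("Which hardware supports on-premises deployment?", docs))
```

Production systems replace the bag-of-words scoring with embedding similarity over a vector index, but the retrieve-then-prompt structure is the same.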
Enterprise On-Premises Usage
CaiGunn's enterprise edition offers powerful features, supporting not only NVIDIA DGX and HGX hardware but also hybrid- and private-cloud architectures across AWS, Azure, Google Cloud, Oracle Cloud, DGX Cloud, and other cloud providers, ensuring data privacy and protection.
Developer Zone
Low Inference Cost
After fine-tuning, models trained in specific domains can provide lower inference costs.
Support for Application Scenarios
The language model is only the core: it also powers further application systems from APMIC, such as customer service, knowledge management, and contract recognition.
Data Confidentiality
With Confidential Computing on NVIDIA H100-class or newer hardware, data remains confidential from training through deployment.