⚠️ DeepSeek-R1 Deprecation Notice
The deepseek-reasoner endpoint will be retired on July 24, 2026. Please migrate to deepseek-v4-flash with thinking mode enabled. New pricing at $0.18/$0.80 per 1M tokens — up to 75% cheaper than R1.
Models & Pricing
Panda World provides access to leading Chinese LLMs through a single OpenAI-compatible endpoint. You only pay for what you use — no monthly commitments. All prices shown are in USD per million tokens (text), per image, or per video.
Pricing Table
All prices in USD. Text models use asymmetric input/output pricing — input is cheaper (attracts developers), output carries a premium margin. Image/video models are priced per generation. Updated weekly.
Chat Models
| Model ID | Provider | Upstream Endpoint | Upstream Input ($/1M) | Upstream Output ($/1M) | PW Input ($/1M) | PW Output ($/1M) | Context |
|---|---|---|---|---|---|---|---|
deepseek-v4-flash | DeepSeek | api.deepseek.com | $0.14 | $0.28 | $0.18 | $0.60 | 1M |
deepseek-v4-pro | DeepSeek | api.deepseek.com | $1.74 | $3.48 | $2.20 | $6.80 | 1M |
deepseek-reasoner | DeepSeek | api.deepseek.com | $0.14 | $0.28 | $0.80 | $3.20 | 64K |
minimax-m2.5 | MiniMax | api.minimax.chat | $0.15 | $1.15 | $0.20 | $1.50 | 205K |
qwen-flash | Alibaba Cloud | dashscope.aliyuncs.com | $0.10 | $0.30 | $0.12 | $0.45 | 1M |
qwen3-32b | Alibaba Cloud | dashscope.aliyuncs.com | $0.70 | $1.40 | $0.90 | $2.80 | 128K |
qwen-3.5-plus | Alibaba Cloud | dashscope.aliyuncs.com | $0.80 | $3.00 | $1.00 | $3.20 | 1M |
qwen3-max | Alibaba Cloud | dashscope.aliyuncs.com | $2.00 | $7.50 | $2.80 | $8.00 | 128K |
All requests are routed through our gateway. The upstream endpoint is the direct API domain of the model provider listed above, shown for full transparency.
Image Generation Models
| Model ID | Provider | Upstream Price | Panda World Price | Max Resolution |
|---|---|---|---|---|
wan-2.7-image | Alibaba Cloud | $0.03 | $0.05 | 2048×2048 |
kolors | Kuaishou | $0.02 | $0.04 | 1024×1024 |
Video Generation Models
Prices are per second of generated video. Final cost depends on video duration.
| Model ID | Provider | Upstream Price | Panda World Price | Max Duration |
|---|---|---|---|---|
wan-2.7-video | Alibaba Cloud | $0.07/sec | $0.10/sec | up to 15s |
Batch Processing — 50% Off
All models are available at 50% off the standard price for batch (async) processing. Batch jobs are typically completed within 1–6 hours, with a maximum processing time of 24 hours. There is no minimum batch size — use it for any workload, though we recommend 50+ requests for maximum throughput benefit.
Cost Comparison vs. US Providers
Comparison against comparable US provider models (input prices). Competitor prices as of May 2026.
| Model | Panda World (Input) | US Competitor | Savings |
|---|---|---|---|
| DeepSeek-V4-Flash | $0.18 | GPT-4.1 ($2.00) | 91% |
| MiniMax M2.5 | $0.20 | Claude Sonnet 4.6 ($3.00) | 93% |
| DeepSeek-R1 | $0.80 | o1 ($15.00) | 95% |
| Qwen3.5-Plus | $1.00 | GPT-5 ($8.00) | 87.5% |
| Qwen3-Max | $2.80 | Claude 3.5 Sonnet ($10.00) | 72% |
| Qwen-Flash | $0.12 | Claude Haiku 4.5 ($1.00) | 88% |
Markup Disclosure
We believe in full transparency. The difference between the upstream provider price and Panda World's price covers:
- Global edge infrastructure — multi-region proxy servers in Singapore, US-West, and Tokyo for low-latency access worldwide
- OpenAI-compatible translation layer — we maintain and update the compatibility layer so you don't have to
- Billing & analytics — usage tracking, cost analysis, and multi-model billing in one place
- Support — technical support and monitoring around the clock
- Payment processing fees — Lemon Squeezy and foreign currency conversion costs
Model Selection Guide
DeepSeek Models
deepseek-v4-flash (V4-Flash): Latest generation, 284B MoE with 1M token context. Excellent balance of speed, capability, and cost. The default choice for most use cases.
deepseek-reasoner (DeepSeek-R1): DEPRECATED — will be retired July 24, 2026. Use deepseek-v4-flash with thinking mode enabled instead. Set "thinking": {"type": "enabled"} in your API request for equivalent or better reasoning quality at 75% lower cost.
deepseek-v4-pro (V4-Pro): Flagship 1.6T MoE model with 1M token context. Best quality among DeepSeek models, ideal for complex tasks and agentic workflows.
MiniMax Models
minimax-m2.5 (M2.5): Frontier-level 230B MoE model. SWE-Bench #1, exceptional at coding, agent tasks, and function calling. 20x cheaper than Claude Opus 4. Best for developers building AI-powered applications.
Qwen Models
qwen-3.5-plus: Latest-generation flagship with 1M token context. Strong English and multilingual performance. Excellent for RAG, document processing, and complex reasoning tasks.
qwen3-max: Largest and most capable Qwen model. Best for Chinese language tasks and long-context applications (up to 128K tokens).
qwen-flash: Ultra-lightweight model at the lowest price point. Perfect for high-volume production workloads, bulk processing, and simple tasks where cost matters most.
qwen3-32b: More efficient 32B variant. Good price-performance balance for most production workloads.
Image Generation Models
wan-2.7-image (Wan 2.7-Image): Alibaba's latest text-to-image model. Supports multilingual text rendering, style transfer, and up to 2048×2048 resolution. $0.06 per image with fast inference.
kolors (Kolors): Kuaishou's open-source text-to-image model. Good quality at a lower price point. Suitable for batch image generation and cost-sensitive projects.
Video Generation Models
wan-2.7-video (Wan 2.7): Alibaba's open-weight video model with 7 generation modes including text-to-video, image-to-video, and video editing. Good balance of quality and price.
Pricing Notes
- Billing is based on total tokens processed (input + output combined)
- Token counting follows each model provider's tokenizer
- Requests that return errors due to provider failures are not billed
- Prices are subject to change; we will notify you via email of any changes
- Volume discounts available — contact us for enterprise pricing
- Prompt Caching: standard cache (5min TTL) at write 1.25×, cached reads 0.10× base price. Extended caching (60min TTL) coming soon.