⚠️ DeepSeek-R1 Deprecation Notice

The deepseek-reasoner endpoint will be retired on July 24, 2026. Please migrate to deepseek-v4-flash with thinking mode enabled. New pricing at $0.18/$0.80 per 1M tokens — up to 75% cheaper than R1.

Migration: change model from "deepseek-reasoner" to "deepseek-v4-flash" and add "thinking": {"type": "enabled"} in your API request.

Models & Pricing

Panda World provides access to leading Chinese LLMs through a single OpenAI-compatible endpoint. You only pay for what you use — no monthly commitments. All prices shown are in USD per million tokens (text), per image, or per video.

Pricing Table

All prices in USD. Text models use asymmetric input/output pricing — input is cheaper (attracts developers), output carries a premium margin. Image/video models are priced per generation. Updated weekly.

Chat Models

Model IDProviderUpstream EndpointUpstream Input ($/1M)Upstream Output ($/1M)PW Input ($/1M)PW Output ($/1M)Context
deepseek-v4-flashDeepSeekapi.deepseek.com$0.14$0.28$0.18$0.601M
deepseek-v4-proDeepSeekapi.deepseek.com$1.74$3.48$2.20$6.801M
deepseek-reasonerDeepSeekapi.deepseek.com$0.14$0.28$0.80$3.2064K
minimax-m2.5MiniMaxapi.minimax.chat$0.15$1.15$0.20$1.50205K
qwen-flashAlibaba Clouddashscope.aliyuncs.com$0.10$0.30$0.12$0.451M
qwen3-32bAlibaba Clouddashscope.aliyuncs.com$0.70$1.40$0.90$2.80128K
qwen-3.5-plusAlibaba Clouddashscope.aliyuncs.com$0.80$3.00$1.00$3.201M
qwen3-maxAlibaba Clouddashscope.aliyuncs.com$2.00$7.50$2.80$8.00128K

All requests are routed through our gateway. The upstream endpoint is the direct API domain of the model provider listed above, shown for full transparency.

Image Generation Models

Model IDProviderUpstream PricePanda World PriceMax Resolution
wan-2.7-imageAlibaba Cloud$0.03$0.052048×2048
kolorsKuaishou$0.02$0.041024×1024

Video Generation Models

Prices are per second of generated video. Final cost depends on video duration.

Model IDProviderUpstream PricePanda World PriceMax Duration
wan-2.7-videoAlibaba Cloud$0.07/sec$0.10/secup to 15s

Batch Processing — 50% Off

All models are available at 50% off the standard price for batch (async) processing. Batch jobs are typically completed within 1–6 hours, with a maximum processing time of 24 hours. There is no minimum batch size — use it for any workload, though we recommend 50+ requests for maximum throughput benefit.

Cost Comparison vs. US Providers

Comparison against comparable US provider models (input prices). Competitor prices as of May 2026.

ModelPanda World (Input)US CompetitorSavings
DeepSeek-V4-Flash$0.18GPT-4.1 ($2.00)91%
MiniMax M2.5$0.20Claude Sonnet 4.6 ($3.00)93%
DeepSeek-R1$0.80o1 ($15.00)95%
Qwen3.5-Plus$1.00GPT-5 ($8.00)87.5%
Qwen3-Max$2.80Claude 3.5 Sonnet ($10.00)72%
Qwen-Flash$0.12Claude Haiku 4.5 ($1.00)88%

Markup Disclosure

We believe in full transparency. The difference between the upstream provider price and Panda World's price covers:

  • Global edge infrastructure — multi-region proxy servers in Singapore, US-West, and Tokyo for low-latency access worldwide
  • OpenAI-compatible translation layer — we maintain and update the compatibility layer so you don't have to
  • Billing & analytics — usage tracking, cost analysis, and multi-model billing in one place
  • Support — technical support and monitoring around the clock
  • Payment processing fees — Lemon Squeezy and foreign currency conversion costs

Model Selection Guide

DeepSeek Models

deepseek-v4-flash (V4-Flash): Latest generation, 284B MoE with 1M token context. Excellent balance of speed, capability, and cost. The default choice for most use cases.

deepseek-reasoner (DeepSeek-R1): DEPRECATED — will be retired July 24, 2026. Use deepseek-v4-flash with thinking mode enabled instead. Set "thinking": {"type": "enabled"} in your API request for equivalent or better reasoning quality at 75% lower cost.

deepseek-v4-pro (V4-Pro): Flagship 1.6T MoE model with 1M token context. Best quality among DeepSeek models, ideal for complex tasks and agentic workflows.

MiniMax Models

minimax-m2.5 (M2.5): Frontier-level 230B MoE model. SWE-Bench #1, exceptional at coding, agent tasks, and function calling. 20x cheaper than Claude Opus 4. Best for developers building AI-powered applications.

Qwen Models

qwen-3.5-plus: Latest-generation flagship with 1M token context. Strong English and multilingual performance. Excellent for RAG, document processing, and complex reasoning tasks.

qwen3-max: Largest and most capable Qwen model. Best for Chinese language tasks and long-context applications (up to 128K tokens).

qwen-flash: Ultra-lightweight model at the lowest price point. Perfect for high-volume production workloads, bulk processing, and simple tasks where cost matters most.

qwen3-32b: More efficient 32B variant. Good price-performance balance for most production workloads.

Image Generation Models

wan-2.7-image (Wan 2.7-Image): Alibaba's latest text-to-image model. Supports multilingual text rendering, style transfer, and up to 2048×2048 resolution. $0.06 per image with fast inference.

kolors (Kolors): Kuaishou's open-source text-to-image model. Good quality at a lower price point. Suitable for batch image generation and cost-sensitive projects.

Video Generation Models

wan-2.7-video (Wan 2.7): Alibaba's open-weight video model with 7 generation modes including text-to-video, image-to-video, and video editing. Good balance of quality and price.

Pricing Notes

  • Billing is based on total tokens processed (input + output combined)
  • Token counting follows each model provider's tokenizer
  • Requests that return errors due to provider failures are not billed
  • Prices are subject to change; we will notify you via email of any changes
  • Volume discounts available — contact us for enterprise pricing
  • Prompt Caching: standard cache (5min TTL) at write 1.25×, cached reads 0.10× base price. Extended caching (60min TTL) coming soon.