⚠️ DeepSeek-R1 Deprecation Notice

The deepseek-reasoner endpoint will be retired on July 24, 2026. Please migrate to deepseek-v4-flash with thinking mode enabled. New pricing at $0.18/$0.80 per 1M tokens — up to 75% cheaper than R1.

Migration: change model from "deepseek-reasoner" to "deepseek-v4-flash" and add "thinking": {"type": "enabled"} in your API request.

Models & Pricing

Panda World provides access to leading Chinese LLMs through a single OpenAI-compatible endpoint. You only pay for what you use — no monthly commitments. All prices shown are in USD per million tokens (text), per image, or per video.

Pricing Table

All prices in USD. Text models use asymmetric input/output pricing — input is cheaper (attracts developers), output carries a premium margin. Image/video models are priced per generation. Updated weekly.

Chat Models

Model ID	Provider	Upstream Endpoint	Upstream Input ($/1M)	Upstream Output ($/1M)	PW Input ($/1M)	PW Output ($/1M)	Context
`deepseek-v4-flash`	DeepSeek	`api.deepseek.com`	$0.14	$0.28	$0.18	$0.60	1M
`deepseek-v4-pro`	DeepSeek	`api.deepseek.com`	$1.74	$3.48	$2.20	$6.80	1M
`deepseek-reasoner`	DeepSeek	`api.deepseek.com`	$0.14	$0.28	$0.80	$3.20	64K
`minimax-m2.5`	MiniMax	`api.minimax.chat`	$0.15	$1.15	$0.20	$1.50	205K
`qwen-flash`	Alibaba Cloud	`dashscope.aliyuncs.com`	$0.10	$0.30	$0.12	$0.45	1M
`qwen3-32b`	Alibaba Cloud	`dashscope.aliyuncs.com`	$0.70	$1.40	$0.90	$2.80	128K
`qwen-3.5-plus`	Alibaba Cloud	`dashscope.aliyuncs.com`	$0.80	$3.00	$1.00	$3.20	1M
`qwen3-max`	Alibaba Cloud	`dashscope.aliyuncs.com`	$2.00	$7.50	$2.80	$8.00	128K

All requests are routed through our gateway. The upstream endpoint is the direct API domain of the model provider listed above, shown for full transparency.

Image Generation Models

Model ID	Provider	Upstream Price	Panda World Price	Max Resolution
`wan-2.7-image`	Alibaba Cloud	$0.03	$0.05	2048×2048
`kolors`	Kuaishou	$0.02	$0.04	1024×1024

Video Generation Models

Prices are per second of generated video. Final cost depends on video duration.

Model ID	Provider	Upstream Price	Panda World Price	Max Duration
`wan-2.7-video`	Alibaba Cloud	$0.07/sec	$0.10/sec	up to 15s

Batch Processing — 50% Off

All models are available at 50% off the standard price for batch (async) processing. Batch jobs are typically completed within 1–6 hours, with a maximum processing time of 24 hours. There is no minimum batch size — use it for any workload, though we recommend 50+ requests for maximum throughput benefit.

Cost Comparison vs. US Providers

Comparison against comparable US provider models (input prices). Competitor prices as of May 2026.

Model	Panda World (Input)	US Competitor	Savings
DeepSeek-V4-Flash	$0.18	GPT-4.1 ($2.00)	91%
MiniMax M2.5	$0.20	Claude Sonnet 4.6 ($3.00)	93%
DeepSeek-R1	$0.80	o1 ($15.00)	95%
Qwen3.5-Plus	$1.00	GPT-5 ($8.00)	87.5%
Qwen3-Max	$2.80	Claude 3.5 Sonnet ($10.00)	72%
Qwen-Flash	$0.12	Claude Haiku 4.5 ($1.00)	88%

Markup Disclosure

We believe in full transparency. The difference between the upstream provider price and Panda World's price covers:

Global edge infrastructure — multi-region proxy servers in Singapore, US-West, and Tokyo for low-latency access worldwide
OpenAI-compatible translation layer — we maintain and update the compatibility layer so you don't have to
Billing & analytics — usage tracking, cost analysis, and multi-model billing in one place
Support — technical support and monitoring around the clock
Payment processing fees — Lemon Squeezy and foreign currency conversion costs

Model Selection Guide

DeepSeek Models

deepseek-v4-flash (V4-Flash): Latest generation, 284B MoE with 1M token context. Excellent balance of speed, capability, and cost. The default choice for most use cases.

deepseek-reasoner (DeepSeek-R1): DEPRECATED — will be retired July 24, 2026. Use deepseek-v4-flash with thinking mode enabled instead. Set "thinking": {"type": "enabled"} in your API request for equivalent or better reasoning quality at 75% lower cost.

deepseek-v4-pro (V4-Pro): Flagship 1.6T MoE model with 1M token context. Best quality among DeepSeek models, ideal for complex tasks and agentic workflows.

MiniMax Models

minimax-m2.5 (M2.5): Frontier-level 230B MoE model. SWE-Bench #1, exceptional at coding, agent tasks, and function calling. 20x cheaper than Claude Opus 4. Best for developers building AI-powered applications.

Qwen Models

qwen-3.5-plus: Latest-generation flagship with 1M token context. Strong English and multilingual performance. Excellent for RAG, document processing, and complex reasoning tasks.

qwen3-max: Largest and most capable Qwen model. Best for Chinese language tasks and long-context applications (up to 128K tokens).

qwen-flash: Ultra-lightweight model at the lowest price point. Perfect for high-volume production workloads, bulk processing, and simple tasks where cost matters most.

qwen3-32b: More efficient 32B variant. Good price-performance balance for most production workloads.

Image Generation Models

wan-2.7-image (Wan 2.7-Image): Alibaba's latest text-to-image model. Supports multilingual text rendering, style transfer, and up to 2048×2048 resolution. $0.06 per image with fast inference.

kolors (Kolors): Kuaishou's open-source text-to-image model. Good quality at a lower price point. Suitable for batch image generation and cost-sensitive projects.

Video Generation Models

wan-2.7-video (Wan 2.7): Alibaba's open-weight video model with 7 generation modes including text-to-video, image-to-video, and video editing. Good balance of quality and price.

Pricing Notes

Billing is based on total tokens processed (input + output combined)
Token counting follows each model provider's tokenizer
Requests that return errors due to provider failures are not billed
Prices are subject to change; we will notify you via email of any changes
Volume discounts available — contact us for enterprise pricing
Prompt Caching: standard cache (5min TTL) at write 1.25×, cached reads 0.10× base price. Extended caching (60min TTL) coming soon.