Overview
- How MiniCPM-V 4.6 (1.6 GB) compares to Qwen3.5-0.8B (text) and Gemma4-E2B (~7 GB) on the same machine
- Measuring time to first token (TTFT) and throughput with Ollama's streaming API
- When to pick vision vs text-only vs larger edge models for agentic stacks
- Reproducible benchmark script you can re-run after Ollama updates
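
As a rough illustration of the TTFT and throughput measurement listed above, the sketch below streams a completion from Ollama's `/api/generate` endpoint and times the first non-empty chunk. It is not the script from TUTORIAL.md: the function names `measure_stream` and `benchmark` are hypothetical, the URL assumes Ollama's default local port, and the model tag is whatever you have pulled locally. Throughput is taken from the `eval_count` and `eval_duration` fields that Ollama reports in the final chunk.

```python
import json
import time
import urllib.request
from typing import Callable, Iterable, Optional

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def measure_stream(lines: Iterable[bytes], start: float,
                   clock: Callable[[], float] = time.perf_counter) -> dict:
    """Compute TTFT and decode throughput from a stream of NDJSON chunks."""
    first_token_at: Optional[float] = None
    final: dict = {}
    for raw in lines:
        if not raw.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        # The first chunk carrying non-empty text marks time to first token.
        if first_token_at is None and chunk.get("response"):
            first_token_at = clock()
        if chunk.get("done"):
            final = chunk  # the last chunk carries the timing counters
            break
    end = clock()
    eval_count = final.get("eval_count", 0)
    eval_ns = final.get("eval_duration", 0)  # reported in nanoseconds
    return {
        "ttft_s": None if first_token_at is None else first_token_at - start,
        "total_s": end - start,
        "tokens": eval_count,
        "tokens_per_s": eval_count / (eval_ns / 1e9) if eval_ns else None,
    }


def benchmark(model: str, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one streaming request and return the measured metrics."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        # An HTTPResponse iterates line by line, one JSON object per line.
        return measure_stream(resp, start)
```

Calling `benchmark("<your-model-tag>", "Describe this benchmark in one sentence.")` against a running Ollama instance returns a dict of the four metrics. TTFT measured this way includes model load time on a cold start, so a warm-up request before timing is usually worthwhile. Parsing is kept separate from the network call so the timing logic can be checked without a server.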

| Guide | Use case |
|---|---|
| MiniCPM-V MCP Server | Vision tools in Cursor |
| OpenClaw + MiniCPM-V | Photo assistant on messaging |
| Qwen Agentic RAG | Text-only agentic RAG |

Full tutorial
See TUTORIAL.md.