Overview

  • How MiniCPM-V 4.6 (1.6 GB) compares to Qwen3.5-0.8B (text) and Gemma4-E2B (~7 GB) on the same machine
  • Measuring TTFT and throughput with Ollama's streaming API
  • When to pick vision vs text-only vs larger edge models for agentic stacks
  • Reproducible benchmark script you can re-run after Ollama updates
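
As a rough illustration of the TTFT and throughput measurement mentioned above, the sketch below streams one completion from Ollama's `/api/generate` endpoint and times it. It is not the repository's benchmark script. The function names `measure_stream` and `bench`, the default host, and the choice to take throughput from the `eval_count` and `eval_duration` fields of the final chunk are assumptions made for this example. The model tag is passed in by the caller, because the exact Ollama tags for the models compared here are not given in this overview.

```python
import json
import time
import urllib.request
from typing import Callable, Iterable


def measure_stream(lines: Iterable[bytes], start: float,
                   clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume newline-delimited JSON chunks and derive timing figures.

    TTFT is the wall-clock delay until the first chunk that carries text.
    Throughput uses the counters Ollama reports in the final chunk
    (eval_duration is in nanoseconds).
    """
    ttft = None
    final = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        chunk = json.loads(raw)
        if ttft is None and chunk.get("response"):
            ttft = clock() - start
        if chunk.get("done"):
            final = chunk
            break
    total = clock() - start
    eval_count = final.get("eval_count")
    eval_ns = final.get("eval_duration")
    tps = eval_count / (eval_ns / 1e9) if eval_count and eval_ns else None
    return {"ttft_s": ttft, "total_s": total,
            "tokens": eval_count, "tokens_per_s": tps}


def bench(model: str, prompt: str,
          host: str = "http://localhost:11434") -> dict:
    """Send one streaming request to a local Ollama server and time it."""
    body = json.dumps({"model": model, "prompt": prompt,
                       "stream": True}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=body,
        headers={"Content-Type": "application/json"})
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        # The response body is iterated line by line as chunks arrive.
        return measure_stream(resp, start)
```

With an Ollama server running locally, `bench("<model-tag>", "Describe a sunset in one sentence.")` returns a dictionary of timings for whichever model tag you have pulled. The first request after a model loads includes load time in its TTFT, so a warm-up call before measuring is probably worthwhile.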

Benchmark terminal demo

Comparison table — 16 GB Mac shootout

Guide                   Use case
MiniCPM-V MCP Server    Vision tools in Cursor
OpenClaw + MiniCPM-V    Photo assistant on messaging
Qwen Agentic RAG        Text-only agentic RAG

Full tutorial

See TUTORIAL.md.
