GPT Image 2 MCP

Generate and edit images in Claude with OpenAI's gpt-image-2 — text rendering that finally works.

For the Chinese version, see README.zh.md.

What is this?

A Model Context Protocol (MCP) server that exposes OpenAI's gpt-image-2 (released 2026-04-21) and gpt-image-1 to Claude as two tools: text-to-image and multi-reference edit. Built for the IG / poster / product-shot workflow — where dense in-image text rendering and reference fusion actually matter.

Inputs over 4 MB or 1024 px are auto-resized before upload, so you don't have to babysit aspect ratios or compress reference photos by hand.
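As a rough illustration of the resize rule above, here is a minimal sketch of the threshold check and the target-dimension arithmetic. The function names are hypothetical and the actual logic in server.py may differ; treating "4 MB" as 4 × 1024 × 1024 bytes and "1024 px" as the longest side are assumptions.

```python
MAX_BYTES = 4 * 1024 * 1024  # "4 MB" from the text; binary megabytes are an assumption
MAX_SIDE = 1024              # assumed to apply to the longest side, in pixels


def needs_resize(size_bytes: int, width: int, height: int) -> bool:
    """True when the file is over 4 MB or its longest side exceeds 1024 px."""
    return size_bytes > MAX_BYTES or max(width, height) > MAX_SIDE


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Scale (width, height) down so the longest side equals max_side.

    The aspect ratio is preserved; dimensions already within bounds
    are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4000 × 3000 photo would be cached at 1024 × 768. A file that is over 4 MB but already within 1024 px keeps its dimensions here; how such files are re-encoded is not covered by this sketch.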

Features

Two tools, no leaks: gpt_image_generate (text → image) and gpt_image_edit (text + 1–5 references + optional mask → image)
Auto-resize on input: oversized images are downscaled to a temporary 1024 px PNG copy, so the API call doesn't time out or fail with HTTP 413 (payload too large)
gpt-image-2 by default, easy fallback to gpt-image-1 for accounts without org verification
Slash skills /gpt-image and /gpt-image-edit with a curated parameter cheatsheet (size, quality, model fallback, cost-aware defaults)
Saves to disk and returns a path — survives the 60 s MCP timeout (image still lands on disk even if MCP layer disconnects)
.env-based key handling so macOS GUI Claude can find your OPENAI_API_KEY without launchd hacks

Quick Start (Cowork users)

Download claude-gpt-image2-mcp-v0.1.0.plugin from Releases.
In Cowork: Plugins → Install → Select file.

Set up your API key (paste into the plugin folder's .env):

cd <plugin-dir>
cp .env.example .env
# Edit .env: OPENAI_API_KEY=sk-proj-YOUR_KEY
chmod 600 .env

Restart Cowork (Cmd+Q, reopen).
Try it: /gpt-image a minimalist poster, white text on a black background, with the text "保持好奇" (Stay Curious)

Quick Start (Claude Desktop / Claude Code users)

For users not on Cowork:

Clone:

git clone https://github.com/Impossible-Studio-TW/claude-gpt-image2-mcp.git
cd claude-gpt-image2-mcp
cp .env.example .env   # then edit .env with your OpenAI key
chmod 600 .env

Register in ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "gpt-image2": {
      "command": "bash",
      "args": ["/absolute/path/to/claude-gpt-image2-mcp/run.sh"]
    }
  }
}

Restart Claude. First call auto-creates a venv (~30 s); subsequent calls are instant.
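If you would rather not edit the JSON by hand, the registration step can be scripted. This is a sketch only: register_server is a hypothetical helper, not part of this repo, and it assumes the config file, if present, contains valid JSON.

```python
import json
from pathlib import Path


def register_server(config_path: Path, run_sh: Path, name: str = "gpt-image2") -> dict:
    """Add or update one MCP server entry, leaving other config keys untouched."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    servers = config.setdefault("mcpServers", {})
    # Same shape as the JSON snippet above: bash runs the repo's run.sh.
    servers[name] = {"command": "bash", "args": [str(run_sh)]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS you would call it with Path.home() / "Library/Application Support/Claude/claude_desktop_config.json" and the absolute path to run.sh in your clone. Back up the config file first.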

Configuration

OPENAI_API_KEY is read from a .env file inside the plugin directory — more reliable than shell env on macOS, since GUI apps don't inherit shell config. The .env file is in .gitignore; never commit it.
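To make the .env approach concrete, here is a minimal stdlib-only sketch of loading KEY=VALUE pairs from a file next to the server script. load_env_file is a hypothetical name; the real server.py may use a library such as python-dotenv instead, and this sketch handles only simple unquoted or quoted single-line values.

```python
import os
from collections.abc import MutableMapping
from pathlib import Path


def load_env_file(path: Path, environ: MutableMapping[str, str] = os.environ) -> dict[str, str]:
    """Parse simple KEY=VALUE lines and copy them into environ.

    Keys already present in environ are not overridden.
    Returns the pairs that were parsed from the file.
    """
    loaded: dict[str, str] = {}
    if not path.is_file():
        return loaded
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        if key:
            loaded[key] = value
            environ.setdefault(key, value)
    return loaded
```

Reading the file relative to the script (for example Path(__file__).parent / ".env") is what makes this independent of the shell environment that GUI apps never see.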

Default save folder is the user's ~/Pictures/gpt-image2/. Override per call with save_to=<absolute path>, or edit DEFAULT_SAVE_DIR in server.py.

Prerequisites:

Python 3.10+ (brew install python@3.12 on macOS)
An OpenAI API key with billing enabled
For gpt-image-2: organization verification at https://platform.openai.com/settings/organization/general (Persona ID + selfie, ~5 min to submit, up to 1–2 days to propagate). Without verification you can still use gpt-image-1 by passing model="gpt-image-1".

Tools (MCP API)

gpt_image_generate(
    prompt: str,
    size: str = "1024x1024",
    quality: str = "auto",
    save_to: str | None = None,
    model: str = "gpt-image-2",
) -> str   # absolute path to saved PNG, or "[Error: ...]"

gpt_image_edit(
    prompt: str,
    image_paths: list[str],         # 1–5 references; multi-image = char-lock attempt
    mask_path: str | None = None,   # transparent areas get edited
    size: str = "1024x1024",
    quality: str = "auto",
    save_to: str | None = None,
    model: str = "gpt-image-2",
) -> str
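Because both tools return either a path or an "[Error: ...]" string, callers need to tell the two apart. The sketch below shows one way to do that and to retry with gpt-image-1 when gpt-image-2 fails. The helper names are hypothetical, and generate stands in for any callable with the gpt_image_generate signature above; retrying on every error (including content-policy rejections) is a simplification.

```python
from pathlib import Path
from typing import Callable

ERROR_PREFIX = "[Error:"  # error strings start this way, per the signature comment above


def is_error(result: str) -> bool:
    """True when a tool result is an error string rather than a file path."""
    return result.startswith(ERROR_PREFIX)


def generate_with_fallback(generate: Callable[..., str], prompt: str, **kwargs) -> Path:
    """Try gpt-image-2 first, then gpt-image-1; raise if both return an error.

    Do not pass model= in kwargs; this helper sets it on each attempt.
    """
    last = ""
    for model in ("gpt-image-2", "gpt-image-1"):
        last = generate(prompt=prompt, model=model, **kwargs)
        if not is_error(last):
            return Path(last)
    raise RuntimeError(last)
```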

Param	Common values	Notes
`size`	`1024x1024` square / `1024x1536` portrait / `1536x1024` landscape	gpt-image-2 supports custom multiples of 16 up to 3840 px, max aspect 3:1
`quality`	`low` (~$0.006) / `medium` (~$0.04) / `high` (~$0.165) / `auto`	`auto` lets the model pick based on prompt complexity
`model`	`gpt-image-2` / `gpt-image-1`	Pass `gpt-image-1` if your org isn't verified yet
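The custom-size rule for gpt-image-2 in the table can be checked before making a call. This is a sketch of that rule as stated here (multiples of 16, up to 3840 px, aspect ratio at most 3:1); is_valid_size is a hypothetical helper, and applying the limits to each side independently is an assumption. The API may enforce further constraints not listed in this README.

```python
import re


def is_valid_size(size: str, max_side: int = 3840, step: int = 16, max_aspect: float = 3.0) -> bool:
    """Check a "WIDTHxHEIGHT" string against the gpt-image-2 rule in the table."""
    match = re.fullmatch(r"(\d+)x(\d+)", size)
    if match is None:
        return False
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        return False
    if width % step or height % step:
        return False  # both sides must be multiples of 16
    if max(width, height) > max_side:
        return False  # neither side may exceed 3840 px
    # Longest side may be at most 3x the shortest side.
    return max(width, height) <= max_aspect * min(width, height)
```

The three common values in the table all pass; "1000x1000" fails because 1000 is not a multiple of 16, and "3840x1024" fails the 3:1 aspect limit.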

Slash skills /gpt-image and /gpt-image-edit ship with a Chinese / English keyword router so phrases like "高品質" (high quality) or "/gpt-image-edit" map cleanly to the right tool and parameters.

Demo

Demo coming soon — see RECORD_DEMO.md for the recording script.

Why this exists

I'm a non-developer building IG content and brand decks daily. Every image-generation tool I tried was great until it had to render Chinese text on a poster, at which point the output became unusable. gpt-image-2 is the first model I've used that gets text right on the first try, and I wanted it inside Claude with the rest of my creative ops, not in another tab. The .env loading and auto-resize are the kind of thing you only notice after a week of failed launches; this plugin deals with that pain once so you don't have to.

Pricing reference

Quality	1024×1024	Notes
low	~$0.006	Drafts, fast iteration
medium	~$0.04	Default-quality production
high	~$0.165	Best detail / text fidelity
auto	varies	Model picks based on prompt complexity

Edit operations add input image tokens (~$0.02–0.05 extra at low quality).
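For budgeting a batch, the figures above reduce to simple multiplication. The sketch below is a back-of-the-envelope estimator using only the approximate prices in this table; estimate_cost is a hypothetical helper, and applying the low-quality edit surcharge range to every quality level is an assumption, since the README only states it for low quality.

```python
# Approximate USD per 1024x1024 image, copied from the table above.
PRICE_PER_IMAGE = {"low": 0.006, "medium": 0.04, "high": 0.165}
# Extra input-token cost per edit; the README quotes this range at low quality only.
EDIT_SURCHARGE = (0.02, 0.05)


def estimate_cost(quality: str, count: int = 1, edit: bool = False) -> tuple[float, float]:
    """Return a (minimum, maximum) USD estimate for `count` 1024x1024 images."""
    if quality not in PRICE_PER_IMAGE:
        raise ValueError(f"no fixed price for quality {quality!r}")
    base = PRICE_PER_IMAGE[quality] * count
    if not edit:
        return base, base
    return base + EDIT_SURCHARGE[0] * count, base + EDIT_SURCHARGE[1] * count
```

Ten medium-quality generations come to roughly $0.40; five low-quality edits land somewhere between about $0.13 and $0.28. quality="auto" has no fixed price, so the helper rejects it.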

Known caveats

First-call delay: ~30 s for venv bootstrap; instant afterwards.
MCP 60 s timeout: high-quality generations can exceed 60 s. The image still saves to disk — check the default folder or your save_to path.
Multi-image character lock: gpt-image-1 multi-image is weak fusion only (good for style cues, not strict identity preservation). gpt-image-2 (post-verification) is meaningfully better. For true character lock, dedicated tools (Dreamina, Midjourney character ref, Flux PuLID, custom LoRAs) still outperform.
Content policy: OpenAI rejects prompts featuring real celebrities, copyrighted characters, explicit content, or violence.
Org verification gate: gpt-image-2 requires verification; until then, fall back to gpt-image-1 via model="gpt-image-1".
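When a call hits the 60 s timeout described above, the image can usually be recovered by looking for the newest PNG in the save folder. A minimal sketch, assuming outputs are written as .png files directly inside that folder; newest_image is a hypothetical helper, not part of the plugin.

```python
import time
from pathlib import Path
from typing import Optional


def newest_image(folder: Path, within_seconds: float = 300.0) -> Optional[Path]:
    """Return the most recently modified PNG in folder, or None.

    Only files modified within the last `within_seconds` are considered,
    so an old image is not mistaken for the one that just timed out.
    """
    cutoff = time.time() - within_seconds
    candidates = [p for p in folder.glob("*.png") if p.stat().st_mtime >= cutoff]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

With the default configuration you would call newest_image(Path.home() / "Pictures" / "gpt-image2").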

Contributing

See CONTRIBUTING.md. Issues, PRs, and feature suggestions are welcome. This is a personal project shared as-is, and maintenance is best-effort.

License

MIT — see LICENSE.

Author

Built by Icy — Impossible Studio TW / Taipei, Taiwan
GitHub: @Impossible-Studio-TW
X / Threads / IG: TBD

PRs welcome — especially for additional model fallbacks, new quality heuristics, or better keyword routing in the slash skills.
