nanobanana
by ReScienceLab

nanobanana is a Python CLI skill for Google Gemini 3 Pro Image that supports text-to-image generation, image editing, aspect ratio control, 2K/4K output, and batch generation through simple local scripts.
This skill scores 78/100, which makes it a solid directory-listing candidate: agents get a clear trigger, concrete commands, and runnable scripts for Gemini-based image generation and editing, though users still need to handle setup and some model-specific uncertainty themselves.
- Strong triggerability: the frontmatter explicitly says to use it for generating or editing images with Gemini image generation.
- Operationally concrete: SKILL.md includes prerequisites, pip install commands, quick-start examples, CLI usage, and output/editing options.
- Real workflow leverage beyond prompting: included `generate.py` and `batch_generate.py` scripts support text-to-image, image editing, aspect ratios, 2K/4K output, and batch generation.
- Adoption requires external setup: users must provide `GEMINI_API_KEY` and install Python 3.10+, `google-genai`, and Pillow.
- Some trust/clarity limits remain because the skill depends on a preview model (`gemini-3-pro-image-preview`) and the provided evidence does not show troubleshooting, error-mode guidance, or install automation inside SKILL.md.
Overview of nanobanana skill
What nanobanana is for
The nanobanana skill is a lightweight wrapper around Google's gemini-3-pro-image-preview model for practical image generation and image editing from the command line. It is best for people who want a repeatable, scriptable way to create images, test prompt variations, or batch-generate outputs without building a full app first.
Who should install nanobanana
The best fit for the nanobanana skill is:
- developers who already use Python and environment variables
- AI operators who want reproducible image generation commands
- prompt testers comparing styles, aspect ratios, and output sizes
- users who need both text-to-image and edit-an-existing-image workflows
If you only want occasional one-off image generation in a chat UI, this may be more setup than you need.
Real job-to-be-done
Most users are not looking for "an image model" in the abstract. They want to turn a rough creative goal into a usable asset: a product shot, a landscape, a mascot, a concept illustration, or an edited version of an existing image. nanobanana for Image Generation is useful because it gives you a direct CLI path for that job, including prompt input, optional source image input, aspect ratio selection, and 2K/4K output options.
What makes nanobanana different from a generic prompt
The main differentiator is not secret prompting. It is workflow reduction:
- a dedicated script for generation and editing
- explicit flags for `--ratio` and `--size`
- environment-based API setup
- batch generation support in `scripts/batch_generate.py`
- a prompt reference file with concrete style patterns in `references/prompts.md`
That makes nanobanana usage more consistent than repeatedly hand-typing ad hoc prompts in a general chat tool.
What matters before you adopt it
The key adoption questions are simple:
- You need a `GEMINI_API_KEY`.
- You need Python 3.10+.
- You need `google-genai` and `pillow`.
- You should be comfortable running local scripts.
- You should expect image quality to depend heavily on prompt specificity.
This is a practical skill, not a no-config web app.
How to Use nanobanana skill
nanobanana install requirements
Before trying nanobanana install, make sure you have:
- Python 3.10+
- a valid `GEMINI_API_KEY`
- network access to Google's API
- the Python packages `google-genai` and `pillow`
Install dependencies:
```bash
pip install google-genai pillow
```
Set your API key:
```bash
export GEMINI_API_KEY="your_api_key_here"
```
Get a key from https://aistudio.google.com/apikey.
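The scripts read the key from the environment at runtime, so a quick preflight check avoids a confusing failure mid-run. The helper below is a hypothetical convenience, not part of the skill:

```python
import os

# Hypothetical preflight helper (not part of the skill): generate.py and
# batch_generate.py expect GEMINI_API_KEY to be exported before they run.
def check_gemini_key(env=None):
    """Return True if a non-empty GEMINI_API_KEY is present."""
    env = os.environ if env is None else env
    return bool(env.get("GEMINI_API_KEY"))
```

Running this (or a one-line shell check) before a long batch job is cheaper than discovering the missing key after the first API call fails.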
Install the skill in your skills environment
If you use the skills system, add the skill with:
```bash
npx skills add ReScienceLab/opc-skills --skill nanobanana
```
After installation, read these files first:
- `skills/nanobanana/SKILL.md`
- `skills/nanobanana/scripts/generate.py`
- `skills/nanobanana/references/prompts.md`
- `skills/nanobanana/scripts/batch_generate.py`
This reading order gives you the fastest path from "Can I use this?" to "What exact commands should I run?"
Basic nanobanana usage for text-to-image
The core command is the generate script with a prompt:
```bash
python3 <skill_dir>/scripts/generate.py "a cute robot mascot, pixel art style" -o robot.png
```
Use this when you are starting from text only. The output path is optional, but setting it avoids hunting for auto-named files later.
Edit an existing image with nanobanana
For image editing, provide both a prompt and an input image:
```bash
python3 <skill_dir>/scripts/generate.py "make the background blue" -i input.jpg -o output.png
```
This is the right workflow when you want to preserve a base image and request a targeted change. The prompt should describe the change, not restate the whole scene unless you want bigger variation.
Choose aspect ratio and output size
The skill supports common ratios including:
1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Example:
```bash
python3 <skill_dir>/scripts/generate.py "cinematic landscape at sunrise" --ratio 21:9 -o landscape.png
```
For higher resolution:
```bash
python3 <skill_dir>/scripts/generate.py "professional product photo of headphones" --size 4K -o product.png
```
Use ratio early in your workflow. It changes composition, not just cropping.
Use batch generation when prompt exploration matters
`scripts/batch_generate.py` is the most decision-relevant file after the main script because it supports multiple generations from one prompt.
Example:
```bash
python3 <skill_dir>/scripts/batch_generate.py "pixel art logo" -n 20 -d ./logos -p logo
```
Parallel generation is supported:
```bash
python3 <skill_dir>/scripts/batch_generate.py "landscape concept art" -n 20 --parallel 5
```
This is especially useful when you are exploring style, not chasing one deterministic output.
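Conceptually, parallel batch generation fans requests out over a small worker pool. The sketch below shows that shape with a stand-in `generate_one` callable; the real logic, including any rate-limit delays, lives in `scripts/batch_generate.py`:

```python
from concurrent.futures import ThreadPoolExecutor

# Simplified sketch of parallel batch generation. `generate_one` stands in
# for a real API call; the actual implementation is scripts/batch_generate.py.
def run_batch(prompt, n, parallel, generate_one, prefix="output"):
    """Generate n images for one prompt using up to `parallel` workers."""
    names = [f"{prefix}_{i:03d}.png" for i in range(1, n + 1)]
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(lambda name: generate_one(prompt, name), names))
```

The tradeoff is the same one noted later in this guide: more workers means faster exploration but more concurrent API usage and rate-limit pressure.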
What input makes nanobanana work well
A rough goal like "make a cool image" is usually too weak. Stronger inputs include:
- clear subject
- intended style
- lighting or camera cues
- composition hints
- quality or output intent
Better prompt:
Professional product photo of wireless headphones on marble surface, soft studio lighting, 85mm lens, sharp focus, minimalist background
Weaker prompt:
headphones advertisement
The stronger version gives the model more control signals and reduces generic outputs.
Turn a rough idea into a complete prompt
A practical nanobanana guide for prompt building is:
- name the subject
- specify the visual mode
- add scene or composition details
- add lighting or mood
- add quality cues only if useful
Template from the repo's prompt reference:
Digital illustration of {subject}, {style} style, {colors} color palette, {mood} atmosphere
Example:
Digital illustration of an underwater research base, retro-futurist style, cyan and amber palette, mysterious atmosphere, detailed windows, glowing marine life
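A template like this is easy to wrap in a tiny helper so prompt variations stay structurally consistent. The function below is illustrative, not part of the skill:

```python
# Illustrative prompt builder around the template from references/prompts.md;
# this helper is an assumption for the example, not part of the skill's API.
TEMPLATE = ("Digital illustration of {subject}, {style} style, "
            "{colors} color palette, {mood} atmosphere")

def build_prompt(subject, style, colors, mood, extras=()):
    """Fill the template, then append any extra detail phrases."""
    prompt = TEMPLATE.format(subject=subject, style=style,
                             colors=colors, mood=mood)
    if extras:
        prompt += ", " + ", ".join(extras)
    return prompt
```

Keeping the structure fixed while swapping individual fields makes it much easier to see which component changed an output.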
Repository files worth reading before serious use
If you want better than surface-level nanobanana usage, review:
- `SKILL.md` for prerequisites and command patterns
- `references/prompts.md` for prompt structures and category examples
- `scripts/generate.py` for supported file types, valid ratios, and sizes
- `scripts/batch_generate.py` for concurrency, delays, and naming behavior
- `.claude-plugin/plugin.json` for packaging context
This is more useful than skimming the repo root because the skill is concentrated in a few files.
Practical constraints and tradeoffs
Important boundaries surfaced by the scripts:
- input image editing depends on local file availability
- unsupported or missing image files will fail before generation
- ratios and sizes are restricted to known valid values
- the workflow depends on Google's preview image model, so behavior may change with model updates
- batch generation adds throughput, but also increases API usage and possible rate-limit pressure
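Because ratios and sizes are restricted to known values, it can help to validate arguments before spending an API call. The allowed values below are taken from this guide; treat `scripts/generate.py` as the authoritative list:

```python
# Client-side validation sketch. These value sets come from this guide;
# scripts/generate.py is the authoritative source and may differ.
VALID_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4",
                "9:16", "16:9", "21:9"}
VALID_SIZES = {"2K", "4K"}

def validate_args(ratio=None, size=None):
    """Raise ValueError for values the script would reject."""
    if ratio is not None and ratio not in VALID_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio}")
    if size is not None and size not in VALID_SIZES:
        raise ValueError(f"unsupported size: {size}")
```

Failing fast on a typo like `--ratio 7:5` is cheaper than waiting on a round trip to the API.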
If you need advanced image pipeline controls, node-based editing, or a full hosted UI, this skill is intentionally narrower.
nanobanana skill FAQ
Is nanobanana good for beginners
Yes, if you are comfortable with basic terminal commands and Python package installation. The nanobanana skill is simpler than building your own API client from scratch, but it is still a developer-oriented tool rather than a consumer app.
When should I use nanobanana instead of a normal chat prompt
Use nanobanana when you need:
- saved output files
- repeatable commands
- image editing from local files
- batch generation
- explicit ratio and size selection
A normal chat prompt is fine for casual experimentation, but this skill is better when output handling and repeatability matter.
Does nanobanana support both generation and editing
Yes. It supports:
- text-to-image generation from a prompt
- image editing with `-i`/`--input`
- aspect ratio control
- `2K` and `4K` output settings
- batch generation via a separate script
That combination is the main reason to install it instead of writing a one-off prompt.
Is nanobanana for Image Generation enough for production work
It can be useful in production-adjacent workflows such as concept generation, asset ideation, prompt exploration, or batch creation experiments. But it is not a complete product pipeline by itself. You still need your own review, selection, storage, and possibly post-processing steps.
When is nanobanana a poor fit
Skip nanobanana install if you need:
- a browser-first no-code experience
- a fully managed GUI workflow
- complex multi-step editing orchestration
- strong guarantees around stable model behavior over time
- image generation without an external API dependency
It is strongest as a thin, practical scripting layer.
How to Improve nanobanana skill
Start with better prompt specificity
The fastest way to improve nanobanana results is to make prompts more concrete. Add subject, style, composition, and lighting instead of relying on adjectives like "cool" or "beautiful."
Weak:
a nice city
Stronger:
Aerial photograph of a dense coastal city at golden hour, dramatic shadows, high dynamic range, realistic urban detail, cinematic composition
Match prompt style to the output type
Use different prompt language for different goals:
- pixel art: mention limited palette, crisp pixels, retro game feel
- photorealistic: mention lens, lighting, focus, material realism
- illustration: mention art style, palette, atmosphere, brush or rendering feel
This alignment is one of the most practical ideas in `references/prompts.md`.
Improve image editing by describing only the intended change
For edit workflows, many users over-prompt. If you already supply an input image, start with the specific modification:
Replace the gray wall with a warm blue studio backdrop while keeping the product position and lighting consistent
This is usually better than rewriting the entire image from scratch unless you actually want a broader reinterpretation.
Use batch generation to explore, then narrow
A good iterative workflow for nanobanana usage is:
- generate 6-20 variations with one prompt theme
- identify what worked in the best outputs
- rewrite the prompt around those winning traits
- rerun with a tighter style description or different ratio
This beats endlessly polishing one abstract prompt before seeing any output.
Common failure modes to watch for
Typical quality problems include:
- prompts that are too vague
- mismatched ratio for the subject
- overstuffed prompts with conflicting styles
- editing prompts that unintentionally request a full scene rewrite
- assuming 4K alone will fix a weak concept
Most bad outputs come from instruction quality, not missing magic keywords.
Use aspect ratio as a creative control, not an afterthought
For better nanobanana for Image Generation results:
- use `1:1` for icons, avatars, product crops
- use `9:16` for vertical mobile-first scenes
- use `16:9` or `21:9` for cinematic landscapes
- use `4:5` for poster-like compositions
Choosing the wrong ratio often causes cramped framing or wasted space.
Improve trust by testing the scripts directly
If a skill feels unclear, run the scripts yourself before judging it. `scripts/generate.py` and `scripts/batch_generate.py` are short enough to inspect, which helps you verify supported options, failure paths, and naming behavior. For this repo, direct script inspection gives more confidence than relying on the high-level description alone.
Best next improvement if your first output is close but not right
Do not fully restart. Change one variable at a time:
- subject detail
- style phrase
- lighting cue
- aspect ratio
- edit instruction scope
That makes it easier to learn what the model is responding to and improves your future nanobanana guide instincts quickly.
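That one-variable-at-a-time discipline can be sketched in code. The component names below are arbitrary illustrations, not part of the skill:

```python
# Illustrative helper: build candidate prompts that differ from the base
# prompt in exactly one component, so each comparison isolates one change.
def one_at_a_time(base_parts, alternatives):
    """Yield (component, candidate, prompt) tuples, varying one part at a time.

    base_parts: mapping of component name -> current phrase (insertion-ordered).
    alternatives: mapping of component name -> candidate phrases to try.
    """
    for component, options in alternatives.items():
        for candidate in options:
            parts = dict(base_parts)
            parts[component] = candidate
            yield component, candidate, ", ".join(parts.values())
```

Each generated variant is then a fair A/B test against the base output, which is exactly what makes the model's behavior legible.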
