nanobanana
by ReScienceLab

nanobanana is a Python CLI skill for Google Gemini 3 Pro Image that supports text-to-image generation, image editing, aspect ratio control, 2K/4K output, and batch generation through simple local scripts.
This skill scores 78/100, which makes it a solid directory-listing candidate: agents get a clear trigger, concrete commands, and runnable scripts for Gemini-based image generation and editing, though users still need to handle setup and some model-specific uncertainty themselves.
- Strong triggerability: the frontmatter explicitly says to use it for generating or editing images with Gemini image generation.
- Operationally concrete: SKILL.md includes prerequisites, pip install commands, quick-start examples, CLI usage, and output/editing options.
- Real workflow leverage beyond prompting: included `generate.py` and `batch_generate.py` scripts support text-to-image, image editing, aspect ratios, 2K/4K output, and batch generation.
- Adoption requires external setup: users must provide `GEMINI_API_KEY` and install Python 3.10+, `google-genai`, and Pillow.
- Some trust/clarity limits remain because the skill depends on a preview model (`gemini-3-pro-image-preview`) and the provided evidence does not show troubleshooting, error-mode guidance, or install automation inside SKILL.md.
Overview of nanobanana skill
What nanobanana is for
The nanobanana skill is a lightweight wrapper around Google's gemini-3-pro-image-preview model for practical image generation and image editing from the command line. It is best for people who want a repeatable, scriptable way to create images, test prompt variations, or batch-generate outputs without building a full app first.
Who should install nanobanana
The best fit for the nanobanana skill is:
- developers who already use Python and environment variables
- AI operators who want reproducible image generation commands
- prompt testers comparing styles, aspect ratios, and output sizes
- users who need both text-to-image and edit-an-existing-image workflows
If you only want occasional one-off image generation in a chat UI, this may be more setup than you need.
Real job-to-be-done
Most users are not looking for "an image model" in the abstract. They want to turn a rough creative goal into a usable asset: a product shot, a landscape, a mascot, a concept illustration, or an edited version of an existing image. nanobanana for Image Generation is useful because it gives you a direct CLI path for that job, including prompt input, optional source image input, aspect ratio selection, and 2K/4K output options.
What makes nanobanana different from a generic prompt
The main differentiator is not secret prompting. It is workflow reduction:
- a dedicated script for generation and editing
- explicit flags for `--ratio` and `--size`
- environment-based API setup
- batch generation support in `scripts/batch_generate.py`
- a prompt reference file with concrete style patterns in `references/prompts.md`
That makes nanobanana usage more consistent than repeatedly hand-typing ad hoc prompts in a general chat tool.
What matters before you adopt it
The key adoption questions are simple:
- You need a `GEMINI_API_KEY`.
- You need Python 3.10+.
- You need `google-genai` and `pillow`.
- You should be comfortable running local scripts.
- You should expect image quality to depend heavily on prompt specificity.
This is a practical skill, not a no-config web app.
How to Use nanobanana skill
nanobanana install requirements
Before trying nanobanana install, make sure you have:
- Python 3.10+
- a valid `GEMINI_API_KEY`
- network access to Google's API
- the Python packages `google-genai` and `pillow`
Install dependencies:
```bash
pip install google-genai pillow
```
Set your API key:
```bash
export GEMINI_API_KEY="your_api_key_here"
```
Get a key from https://aistudio.google.com/apikey.
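The scripts read the key from the environment at runtime, so a quick preflight check avoids a confusing failure mid-run. The helper below is a hypothetical convenience, not part of the skill:

```python
import os

# Hypothetical preflight helper (not part of the skill): generate.py and
# batch_generate.py expect GEMINI_API_KEY to be exported before they run.
def check_gemini_key(env=None):
    """Return True if a non-empty GEMINI_API_KEY is present."""
    env = os.environ if env is None else env
    return bool(env.get("GEMINI_API_KEY"))
```

Running this (or a one-line shell check) before a long batch job is cheaper than discovering the missing key after the first API call fails.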
Install the skill in your skills environment
If you use the skills system, add the skill with:
```bash
npx skills add ReScienceLab/opc-skills --skill nanobanana
```
After installation, read these files first:
- `skills/nanobanana/SKILL.md`
- `skills/nanobanana/scripts/generate.py`
- `skills/nanobanana/references/prompts.md`
- `skills/nanobanana/scripts/batch_generate.py`
This reading order gives you the fastest path from "Can I use this?" to "What exact commands should I run?"
Basic nanobanana usage for text-to-image
The core command is the generate script with a prompt:
```bash
python3 <skill_dir>/scripts/generate.py "a cute robot mascot, pixel art style" -o robot.png
```
Use this when you are starting from text only. The output path is optional, but setting it avoids hunting for auto-named files later.
Edit an existing image with nanobanana
For image editing, provide both a prompt and an input image:
```bash
python3 <skill_dir>/scripts/generate.py "make the background blue" -i input.jpg -o output.png
```
This is the right workflow when you want to preserve a base image and request a targeted change. The prompt should describe the change, not restate the whole scene unless you want bigger variation.
Choose aspect ratio and output size
The skill supports common ratios including:
1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Example:
```bash
python3 <skill_dir>/scripts/generate.py "cinematic landscape at sunrise" --ratio 21:9 -o landscape.png
```
For higher resolution:
```bash
python3 <skill_dir>/scripts/generate.py "professional product photo of headphones" --size 4K -o product.png
```
Use ratio early in your workflow. It changes composition, not just cropping.
Use batch generation when prompt exploration matters
`scripts/batch_generate.py` is the most decision-relevant file after the main script because it supports multiple generations from one prompt.
Example:
```bash
python3 <skill_dir>/scripts/batch_generate.py "pixel art logo" -n 20 -d ./logos -p logo
```
Parallel generation is supported:
```bash
python3 <skill_dir>/scripts/batch_generate.py "landscape concept art" -n 20 --parallel 5
```
This is especially useful when you are exploring style, not chasing one deterministic output.
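Conceptually, parallel batch generation fans requests out over a small worker pool. The sketch below shows that shape with a stand-in `generate_one` callable; the real logic, including any rate-limit delays, lives in `scripts/batch_generate.py`:

```python
from concurrent.futures import ThreadPoolExecutor

# Simplified sketch of parallel batch generation. `generate_one` stands in
# for a real API call; the actual implementation is scripts/batch_generate.py.
def run_batch(prompt, n, parallel, generate_one, prefix="output"):
    """Generate n images for one prompt using up to `parallel` workers."""
    names = [f"{prefix}_{i:03d}.png" for i in range(1, n + 1)]
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(lambda name: generate_one(prompt, name), names))
```

The tradeoff is the same one noted later in this guide: more workers means faster exploration but more concurrent API usage and rate-limit pressure.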
What input makes nanobanana work well
A rough goal like "make a cool image" is usually too weak. Stronger inputs include:
- clear subject
- intended style
- lighting or camera cues
- composition hints
- quality or output intent
Better prompt:
Professional product photo of wireless headphones on marble surface, soft studio lighting, 85mm lens, sharp focus, minimalist background
Weaker prompt:
headphones advertisement
The stronger version gives the model more control signals and reduces generic outputs.
Turn a rough idea into a complete prompt
A practical nanobanana guide for prompt building is:
- name the subject
- specify the visual mode
- add scene or composition details
- add lighting or mood
- add quality cues only if useful
Template from the repo's prompt reference:
Digital illustration of {subject}, {style} style, {colors} color palette, {mood} atmosphere
Example:
Digital illustration of an underwater research base, retro-futurist style, cyan and amber palette, mysterious atmosphere, detailed windows, glowing marine life
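A template like this is easy to wrap in a tiny helper so prompt variations stay structurally consistent. The function below is illustrative, not part of the skill:

```python
# Illustrative prompt builder around the template from references/prompts.md;
# this helper is an assumption for the example, not part of the skill's API.
TEMPLATE = ("Digital illustration of {subject}, {style} style, "
            "{colors} color palette, {mood} atmosphere")

def build_prompt(subject, style, colors, mood, extras=()):
    """Fill the template, then append any extra detail phrases."""
    prompt = TEMPLATE.format(subject=subject, style=style,
                             colors=colors, mood=mood)
    if extras:
        prompt += ", " + ", ".join(extras)
    return prompt
```

Keeping the structure fixed while swapping individual fields makes it much easier to see which component changed an output.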
Repository files worth reading before serious use
If you want better than surface-level nanobanana usage, review:
- `SKILL.md` for prerequisites and command patterns
- `references/prompts.md` for prompt structures and category examples
- `scripts/generate.py` for supported file types, valid ratios, and sizes
- `scripts/batch_generate.py` for concurrency, delays, and naming behavior
- `.claude-plugin/plugin.json` for packaging context
This is more useful than skimming the repo root because the skill is concentrated in a few files.
Practical constraints and tradeoffs
Important boundaries surfaced by the scripts:
- input image editing depends on local file availability
- unsupported or missing image files will fail before generation
- ratios and sizes are restricted to known valid values
- the workflow depends on Google's preview image model, so behavior may change with model updates
- batch generation adds throughput, but also increases API usage and possible rate-limit pressure
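Because ratios and sizes are restricted to known values, it can help to validate arguments before spending an API call. The allowed values below are taken from this guide; treat `scripts/generate.py` as the authoritative list:

```python
# Client-side validation sketch. These value sets come from this guide;
# scripts/generate.py is the authoritative source and may differ.
VALID_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4",
                "9:16", "16:9", "21:9"}
VALID_SIZES = {"2K", "4K"}

def validate_args(ratio=None, size=None):
    """Raise ValueError for values the script would reject."""
    if ratio is not None and ratio not in VALID_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio}")
    if size is not None and size not in VALID_SIZES:
        raise ValueError(f"unsupported size: {size}")
```

Failing fast on a typo like `--ratio 7:5` is cheaper than waiting on a round trip to the API.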
If you need advanced image pipeline controls, node-based editing, or a full hosted UI, this skill is intentionally narrower.
nanobanana skill FAQ
Is nanobanana good for beginners
Yes, if you are comfortable with basic terminal commands and Python package installation. The nanobanana skill is simpler than building your own API client from scratch, but it is still a developer-oriented tool rather than a consumer app.
When should I use nanobanana instead of a normal chat prompt
Use nanobanana when you need:
- saved output files
- repeatable commands
- image editing from local files
- batch generation
- explicit ratio and size selection
A normal chat prompt is fine for casual experimentation, but this skill is better when output handling and repeatability matter.
Does nanobanana support both generation and editing
Yes. It supports:
- text-to-image generation from a prompt
- image editing with `-i`/`--input`
- aspect ratio control
- `2K` and `4K` output settings
- batch generation via a separate script
That combination is the main reason to install it instead of writing a one-off prompt.
Is nanobanana for Image Generation enough for production work
It can be useful in production-adjacent workflows such as concept generation, asset ideation, prompt exploration, or batch creation experiments. But it is not a complete product pipeline by itself. You still need your own review, selection, storage, and possibly post-processing steps.
When is nanobanana a poor fit
Skip nanobanana install if you need:
- a browser-first no-code experience
- a fully managed GUI workflow
- complex multi-step editing orchestration
- strong guarantees around stable model behavior over time
- image generation without an external API dependency
It is strongest as a thin, practical scripting layer.
How to Improve nanobanana skill
Start with better prompt specificity
The fastest way to improve nanobanana results is to make prompts more concrete. Add subject, style, composition, and lighting instead of relying on adjectives like "cool" or "beautiful."
Weak:
a nice city
Stronger:
Aerial photograph of a dense coastal city at golden hour, dramatic shadows, high dynamic range, realistic urban detail, cinematic composition
Match prompt style to the output type
Use different prompt language for different goals:
- pixel art: mention limited palette, crisp pixels, retro game feel
- photorealistic: mention lens, lighting, focus, material realism
- illustration: mention art style, palette, atmosphere, brush or rendering feel
This alignment is one of the most practical ideas in `references/prompts.md`.
Improve image editing by describing only the intended change
For edit workflows, many users over-prompt. If you already supply an input image, start with the specific modification:
Replace the gray wall with a warm blue studio backdrop while keeping the product position and lighting consistent
This is usually better than rewriting the entire image from scratch unless you actually want a broader reinterpretation.
Use batch generation to explore, then narrow
A good iterative workflow for nanobanana usage is:
- generate 6-20 variations with one prompt theme
- identify what worked in the best outputs
- rewrite the prompt around those winning traits
- rerun with a tighter style description or different ratio
This beats endlessly polishing one abstract prompt before seeing any output.
Common failure modes to watch for
Typical quality problems include:
- prompts that are too vague
- mismatched ratio for the subject
- overstuffed prompts with conflicting styles
- editing prompts that unintentionally request a full scene rewrite
- assuming 4K alone will fix a weak concept
Most bad outputs come from instruction quality, not missing magic keywords.
Use aspect ratio as a creative control, not an afterthought
For better nanobanana for Image Generation results:
- use `1:1` for icons, avatars, product crops
- use `9:16` for vertical mobile-first scenes
- use `16:9` or `21:9` for cinematic landscapes
- use `4:5` for poster-like compositions
Choosing the wrong ratio often causes cramped framing or wasted space.
Improve trust by testing the scripts directly
If a skill feels unclear, run the scripts yourself before judging it. `scripts/generate.py` and `scripts/batch_generate.py` are short enough to inspect, which helps you verify supported options, failure paths, and naming behavior. For this repo, direct script inspection gives more confidence than relying on the high-level description alone.
Best next improvement if your first output is close but not right
Do not fully restart. Change one variable at a time:
- subject detail
- style phrase
- lighting cue
- aspect ratio
- edit instruction scope
That makes it easier to learn what the model is responding to and improves your future nanobanana guide instincts quickly.
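That one-variable-at-a-time discipline can be sketched in code. The component names below are arbitrary illustrations, not part of the skill:

```python
# Illustrative helper: build candidate prompts that differ from the base
# prompt in exactly one component, so each comparison isolates one change.
def one_at_a_time(base_parts, alternatives):
    """Yield (component, candidate, prompt) tuples, varying one part at a time.

    base_parts: mapping of component name -> current phrase (insertion-ordered).
    alternatives: mapping of component name -> candidate phrases to try.
    """
    for component, options in alternatives.items():
        for candidate in options:
            parts = dict(base_parts)
            parts[component] = candidate
            yield component, candidate, ", ".join(parts.values())
```

Each generated variant is then a fair A/B test against the base output, which is exactly what makes the model's behavior legible.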
