PFP Product Blueprint — AI Character Avatar Generator
COGNITIVE INTEGRITY PROTOCOL v2.3 — This skill follows the Cognitive Integrity Protocol. All external claims require source verification, confidence disclosure, and temporal validity checks.
Reference: team_members/COGNITIVE-INTEGRITY-PROTOCOL.md
Reference: team_members/_standards/CLAUDE-PROMPT-STANDARDS.md
dependencies:
required:
- team_members/COGNITIVE-INTEGRITY-PROTOCOL.md
Blueprint architect for AI-powered PFP generator products. This skill can design and build a complete end-to-end avatar generator for any memecoin character — from character identity system design through Gemini prompt engineering, Next.js product build, gallery/engagement features, and VPS deployment. The reference implementation is the APED PFP generator at pfp.aped.wtf.
The core value proposition: a PFP generator converts passive token holders into active identity adopters. CoinCLIP (Long et al., arXiv:2412.07591) proved empirically that visual content — mascots and logos — is the #1 predictor of memecoin viability. Every generated PFP is simultaneously a community growth event (user adopts the character as their identity), a distribution event (shared on X/Telegram), and a brand consistency enforcement (every output is guaranteed to look like the character).
Critical Rules:
- NEVER start building before the character identity system is designed — code without an identity spec always produces inconsistent output, which is worse than no generator at all
- NEVER expose `GEMINI_API_KEY` to the client — all generation calls are server-side via API route
- NEVER deploy a single-model pipeline without a fallback — Gemini Pro has ~5% unavailability; the Flash text-only fallback is non-negotiable
- ALWAYS validate `styleId` against the allowed enum before constructing any prompt
- NEVER put face descriptors in `promptSuffix` — promptSuffix is scene/accessories only; character identity is handled by the identity block + reference images
- ALWAYS classify new style scenes into a risk tier before writing the promptSuffix
- NEVER use banned scene archetypes with 2D cartoon style — use `artOverride` to force geometry-breaking render modes
- ALWAYS run `--validate-identity` (identity benchmark) before generating new style previews
- NEVER embed style data in more than one file — the `styles.data.mjs` single-source-of-truth pattern is mandatory for any new build
- VERIFY rate limiting is active before any production deploy — cost runaway at $0.039/image is a real risk
Core Philosophy
"Character identity is the product. The code is just the delivery mechanism. In a closed-model system like Gemini, the character identity block is the only lever for consistency — and that block must be engineered before a single line of application code is written."
Textual Inversion (Gal et al., arXiv:2208.01618) demonstrated that a single learned word embedding can capture unique visual concepts from 3-5 reference images. For closed models like Gemini, the equivalent is a precisely engineered character identity block — natural language that serves as the "learned embedding." The discipline of writing that block — calibrated language that navigates away from both failure modes (the wrong character the model defaults to, and the generic version it produces when identity is underspecified) — determines whether the generator succeeds or fails.
Peng and Bainbridge (arXiv:2409.14659) showed that semantic distinctiveness drives memorability AND virality. A PFP generator that produces an indistinct character (generic ape, generic frog, generic robot) has zero community growth value — it doesn't spread because it's not distinctive enough to adopt as identity. The distinctiveness must come from the character's specific visual DNA, not from filters or backgrounds.
For memecoin communities specifically, every generated PFP is a social signal: "I'm part of this." IP-Adapter (Ye et al., arXiv:2308.06721) demonstrated that image-conditioned generation enables identity-preserving style variation — the same principle applies in natural language with reference images: consistent character across infinite scene variations. The technical challenge is not generation quality — Gemini produces excellent images. The challenge is consistent identity at scale.
VALUE HIERARCHY
+-------------------+
| PRESCRIPTIVE | "Here's the complete identity block for
| (Highest) | your character, the risk tier for each
| | of your 10 planned styles, and the exact
| | styles.data.mjs structure with promptSuffixes
| | tested against the identity benchmark."
+-------------------+
| PREDICTIVE | "This scene (character at computer at
| | night) is CRITICAL risk tier for 2D
| | cartoon — use low-poly 3D artOverride
| | or it will produce [wrong character]
| | 85%+ of the time."
+-------------------+
| DIAGNOSTIC | "The generator drifts to [wrong character]
| | on dark scenes because the identity block
| | lacks explicit skin color anchoring under
| | low-light conditions."
+-------------------+
| DESCRIPTIVE | "The generator has 16 style presets."
| (Lowest) | Never stop here.
+-------------------+
Descriptive-only output is a failure state. "Your character has two failure modes" without the identity block language, risk tier classification, and banned scene list is worthless.
SELF-LEARNING PROTOCOL
Domain Feeds (check weekly)
| Source | URL | What to Monitor |
|--------|-----|-----------------|
| Google AI Blog | ai.google/blog | Gemini model updates, new multimodal capabilities, reference image support changes |
| Black Forest Labs Blog | blackforestlabs.ai/blog | Flux Kontext — reference-based editing enabling two-pass pipeline (character locked in pass 1, scene added in pass 2) |
| Midjourney Documentation | docs.midjourney.com | --cref (character reference) flag techniques — transferable prompt patterns |
| Stability AI Blog | stability.ai/news | SD3.5+ ControlNet — pose conditioning for reference-consistent generation |
| Civitai Trending | civitai.com/models | Community prompt engineering patterns for character consistency in image generation |
arXiv Search Queries (run monthly)
- `cat:cs.CV AND abs:"character consistency" AND abs:"diffusion"` — identity preservation advances
- `cat:cs.CV AND abs:"reference conditioning" AND abs:"text-to-image"` — reference image techniques
- `cat:cs.CV AND (abs:"memecoin" OR abs:"NFT avatar")` — community visual identity research
- `cat:cs.CV AND abs:"prompt engineering" AND abs:"personalization"` — prompting for character identity
COMPANY CONTEXT
| Reference Build | Status | Key Learning |
|----------------|--------|-------------|
| APED PFP Generator (pfp.aped.wtf) | Live — primary reference | Character sits between Pepe (frog) and BAYC gorilla failure modes. Resolved with: 4-layer identity block, 12 curated reference images, risk tier system, artOverride for banned scene archetypes. Full implementation in clients/kenzo-pfp-generator/site/ |
Reference Files (study before any new build):
| File | What It Teaches |
|------|----------------|
| clients/kenzo-pfp-generator/site/lib/styles.data.mjs | Single source of truth pattern for style presets |
| clients/kenzo-pfp-generator/site/lib/gemini.ts | Dual-model orchestration, reference image loading, 4-layer prompt architecture |
| clients/kenzo-pfp-generator/site/scripts/generate-previews.mjs | Preview generation CLI with identity benchmark (--validate-identity) |
| clients/kenzo-pfp-generator/site/lib/rate-limit.ts | 3-tier rate limiting pattern |
| team_members/aped-pfp-prompt-engineer/SKILL.md | Complete Pepe-avoidance doctrine: risk tiers, banned archetypes, anchor prop pattern |
DEEP EXPERT KNOWLEDGE
Step 0: Character Identity System Design — Before Any Code
This is the most critical and most frequently skipped step. Write this before opening an IDE.
1. Define the identity spec:
- Skin color: exact hex value (e.g., `#4a4a5a`, not just "gray")
- Eyes: shape, size, distinctive features, default expression state
- Mouth: shape, width, default position
- Build: proportions, notable physical features
- Outfit: the always-present anchor item(s) — the one thing EVERY generated image must have
2. Map the two failure modes (every character has exactly two):
- Failure Mode A: the character the model defaults to when identity is underspecified (e.g., generic BAYC gorilla for APED)
- Failure Mode B: the character the model produces when certain scene/language combinations are used (e.g., Pepe for APED)
3. Identify training data associations — what meme templates are adjacent to this character?
- Search Twitter/X and Know Your Meme for the character or similar characters
- List the 5-10 most iconic meme scenes associated with this character type
- These become the banned scene archetype list
4. Build the risk tier system for this specific character:
| Tier | Condition | Required Action |
|------|-----------|-----------------|
| LOW | Scene with strong distinctive anchor prop, not in banned list | Standard 2D cartoon promptSuffix |
| MEDIUM | 2D cartoon, no strong prop anchor | Add face descriptors to promptSuffix |
| HIGH | 2D cartoon + Failure Mode B adjacent scene | artOverride to non-2D geometry |
| CRITICAL | Any banned archetype + 2D cartoon | Low-poly 3D or pixel artOverride only |
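The tier table above can be sketched as a small classifier. This is a hypothetical helper for illustration only — all names are assumptions, and the reference build classifies tiers by manual review:

```typescript
// Hypothetical helper mirroring the tier table above. All identifiers are
// illustrative; the reference build assigns tiers during manual review.
type RiskTier = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

interface SceneDraft {
  is2dCartoon: boolean;          // uses the default 2D cartoon art direction
  hasAnchorProp: boolean;        // strong distinctive prop in the scene
  bannedArchetype: boolean;      // scene is on the banned archetype list
  failureModeBAdjacent: boolean; // scene resembles Failure Mode B templates
}

function classifyRiskTier(s: SceneDraft): RiskTier {
  if (s.is2dCartoon && s.bannedArchetype) return "CRITICAL";    // artOverride only
  if (s.is2dCartoon && s.failureModeBAdjacent) return "HIGH";   // force non-2D geometry
  if (s.is2dCartoon && !s.hasAnchorProp) return "MEDIUM";       // add face descriptors
  return "LOW";                                                 // standard promptSuffix
}
```

Order matters: the banned-archetype check must win over the anchor-prop check, since an anchor prop does not neutralize a banned scene.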
5. Curate reference images BEFORE writing the identity block:
- Collect 20-30 candidate images
- Eliminate any where the scene matches a banned archetype (even if the character looks correct)
- Eliminate low-fidelity sketches that dilute the visual prior
- Select 12-14 for maximum coverage: 5 clean portraits, 7 diverse action/scene shots
- Verify: no image activates Failure Mode B's scene templates
Rule: If you cannot clearly define both failure modes and the banned scene list before coding, the character identity is not well enough understood to build a reliable generator.
Product Architecture
The APED build is the proven architecture. Replicate it for new builds, deviate only with clear justification.
Framework: Next.js 15+ (App Router) — API routes for server-side Gemini calls, RSC for fast initial load, sharp for image processing.
Database: SQLite (better-sqlite3) — zero-infra, sufficient for 10K-100K generated images in gallery. Schema: generations (id, style_id, timestamp, image_hash, is_public), challenges (token, expires), rate_limits (ip, count, window).
Image generation: Gemini Pro (reference images + text) as primary → Gemini Flash (text-only) as fallback. Never single-model.
State: No server-side image storage. Return base64 to client. Client stores last 10 in localStorage for history strip. Gallery opt-in stores image hash + metadata in SQLite (not the raw image — use the hash to retrieve from Gemini's response or re-generate if needed).
Deployment: VPS + PM2 cluster + nginx reverse proxy. SSL at nginx. Build with --webpack flag for better-sqlite3 native module compat.
The Prompt Architecture (4 Layers)
This structure is derived from DreamBooth (Ruiz et al., arXiv:2208.12242) — identity tokens must precede scene descriptions:
Layer 1: CHARACTER_IDENTITY — "Study the reference images. This character is [name]."
Layer 2: CHARACTER RULES — skin hex, outfit, how to adapt across contexts
Layer 3: CRITICAL FACE DESCRIPTORS — explicit eye/mouth/build specs with CAPS for emphasis
Layer 4: CRITICAL FAILURES — explicit rejection list (both failure modes named)
─────────────────────────────────────────────────────────────────────────────
+ ART STYLE — default 2D cartoon illustration OR artOverride for the style
+ SCENE — from promptSuffix in styles.data.mjs (scene/accessories only, no face descriptors)
+ OUTPUT CONSTRAINTS — square 1:1, face ≥35% of frame, readable at 64×64, no watermark
+ CUSTOM PROMPT — user's optional context appended LAST
Layer 4 is the most important innovation from the APED build. Standard prompt engineering says "describe what you want." The CRITICAL FAILURES layer explicitly says "do NOT produce X or Y." This is the only technique that reliably prevents both failure modes simultaneously — it directly addresses what the model's training data would produce by default.
The Flash fallback (CHARACTER_IDENTITY_FLASH) expands Layer 3 significantly to compensate for missing reference images. Every physical attribute must be described in full prose: "heavy brow ridge that overhangs the eye area like a shelf, significantly reducing visible upper sclera," not just "heavy brow." Vague Flash prompts produce whatever the model defaults to.
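The assembly order above can be sketched as a pure function. The layer strings here are placeholders (the real identity blocks live in lib/gemini.ts and are far longer); only the ordering rule comes from the text:

```typescript
// Sketch of the layer assembly. Layer contents are placeholders; the
// ordering (identity first, custom prompt last) follows the text above.
interface PromptParts {
  identity: string;      // Layers 1-4, ending with the CRITICAL FAILURES list
  artStyle: string;      // ART_DIRECTION_DEFAULT or the style's artOverride
  promptSuffix: string;  // scene/accessories only, never face descriptors
  customPrompt?: string; // optional user context, appended LAST
}

const OUTPUT_CONSTRAINTS =
  "Square 1:1. Face fills at least 35% of the frame. Readable at 64x64. No watermark.";

function buildPrompt(p: PromptParts): string {
  // Identity first: earlier tokens receive higher attention weight.
  return [p.identity, p.artStyle, p.promptSuffix, OUTPUT_CONSTRAINTS, p.customPrompt]
    .filter((s): s is string => Boolean(s))
    .join("\n\n");
}
```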
The styles.data.mjs Single-Source Pattern
Always implement this pattern on new builds. Having style data in two places (UI and preview script) creates inevitable sync drift and silent inconsistencies.
lib/styles.data.mjs ← Single source of truth (plain ESM, no TypeScript)
├── STYLE_PRESETS_DATA ← Array of all preset objects
└── RANDOM_PROMPTS ← Array of random prompt strings
lib/styles.ts ← TypeScript wrapper
├── StylePreset interface ← Type definitions
├── STYLE_PRESETS ← = STYLE_PRESETS_DATA as StylePreset[]
├── getStyleById() ← Utility function
└── STYLE_IDS ← Set for O(1) validation
scripts/generate-previews.mjs ← Preview regeneration CLI
└── import { STYLE_PRESETS_DATA } from '../lib/styles.data.mjs'
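The wrapper layer of the tree above can be sketched as follows. `STYLE_PRESETS_DATA` is normally imported from styles.data.mjs; it is inlined here so the sketch is self-contained, and the single preset is a placeholder:

```typescript
// Sketch of the lib/styles.ts wrapper. The inlined data array stands in
// for the import from styles.data.mjs; the preset itself is a placeholder.
export interface StylePreset {
  id: string;
  label: string;
  promptSuffix: string;
  artOverride?: string;
}

const STYLE_PRESETS_DATA = [
  { id: "classic", label: "Classic", promptSuffix: "Plain studio background, subtle vignette." },
];

export const STYLE_PRESETS = STYLE_PRESETS_DATA as StylePreset[];

// Set for O(1) validation of incoming styleId values in the API route.
export const STYLE_IDS = new Set(STYLE_PRESETS.map((s) => s.id));

export function getStyleById(id: string): StylePreset | undefined {
  return STYLE_PRESETS.find((s) => s.id === id);
}
```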
Style object structure:
{
id: 'kebab-case-unique-id',
label: 'Display Name',
description: 'One-liner shown in UI tooltip',
emoji: '🎯',
promptSuffix: 'Scene description. Accessories. Props. NO FACE DESCRIPTORS.',
tags: ['category1', 'category2'],
artOverride?: 'Full art style override replacing default ART_DIRECTION_DEFAULT.',
previewImage: '/previews/kebab-case-unique-id.jpg',
}
When to add artOverride: Scene is HIGH or CRITICAL risk tier for the character's failure modes. The override must force a render geometry that is incompatible with the failure mode's anatomy (low-poly 3D, pixel art, anime cel-shaded, bold graphic novel). Never use artOverride as a style novelty — only as a Failure Mode B prevention tool.
Identity Benchmark Protocol
Before generating any new style preset or after any identity block change, run the identity benchmark:
GEMINI_API_KEY=... node scripts/generate-previews.mjs --validate-identity
This generates the classic style 3 times. Review each output against the identity spec:
- [ ] Correct skin color (not Failure Mode A or B's color)
- [ ] Eyes match spec (size, shape, expression state)
- [ ] Mouth matches spec (width, position)
- [ ] Outfit present and correct
- [ ] NOT Failure Mode A
- [ ] NOT Failure Mode B
If 2+ of 3 pass: proceed to generate new styles. Otherwise: do not generate other styles — fix the identity block first.
Why 3 generations of classic? AI generation is stochastic. One good result proves nothing — it could be statistical chance. Three generations establishes a baseline. If classic is passing consistently, the identity block and reference images are providing sufficient signal.
Reference Image Curation
12 images is the production-validated sweet spot for Gemini Pro. Too few: insufficient visual prior. Too many: context window pressure and diminishing returns.
Slot allocation:
- Slot 1 (highest attention weight): canonical character portrait — the single clearest, most on-model image
- Slots 2-5: clean portrait shots from different angles
- Slots 6-12: diverse scene shots showing character in context
Rejection criteria:
- Any image where the scene matches a banned scene archetype → reject, even if character looks correct
- Low-fidelity art (flat sketch, outline-only) → reject — dilutes the visual prior without adding identity signal
- Images with dominant non-character colors that are associated with Failure Mode B → reject (e.g., green-dominant backgrounds for Pepe-adjacent characters)
- Images where the character's distinctive features are obscured (sunglasses covering eyes, helmet covering face) → use sparingly, slots 8-12 only
HyperDreamBooth (Ruiz et al., arXiv:2307.06949) showed that varied reference images (different angles, expressions, contexts) outperform repeated similar images. But diversity must not come at the cost of scene template activation.
Engagement System Design
The PFP generator's engagement loop is: generate → share → community sees → joins → generates → shares. Each step must be frictionless.
Generation: Single click, clear style selector, optional custom prompt. Loading state must communicate progress (AI generation typically takes 2-8s). Never make the user wonder if it's working.
Share: Web Share API (supports image blob on mobile) with fallback to pre-filled X/Twitter URL. Include the token $TICKER in the pre-filled tweet text — every share is free marketing.
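The X/Twitter fallback branch can be sketched as a pure URL builder, used when the Web Share API cannot share an image blob. The copy and function name here are assumptions, not the reference implementation:

```typescript
// Sketch of the pre-filled X/Twitter intent fallback. The tweet copy and
// this function's name are placeholders; only the pattern (include the
// $TICKER in every share) comes from the text above.
export function buildShareFallbackUrl(ticker: string, siteUrl: string): string {
  const text = `Just generated my ${ticker} PFP. Make yours at ${siteUrl}`;
  return "https://twitter.com/intent/tweet?text=" + encodeURIComponent(text);
}
```

On mobile, `navigator.canShare({ files })` gates the richer Web Share path; this URL is the universal fallback.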
Gallery opt-in: NOT forced. Prompt after generation: "Share to community gallery?" Voluntary opt-in produces higher-quality gallery content and better user sentiment. Forcing it creates resentment.
Download: Direct PNG download. Filename includes character name and style ID for brand consistency: aped-military.png, not image_20240223.png.
Analytics: Track style selection distribution (identifies popular presets for development priority), share rate by style (identifies which styles drive virality), download rate (gallery quality proxy), and custom prompt usage rate (identifies if users want more flexibility).
Deployment Blueprint
VPS (Ubuntu 22.04+)
├── Node.js 20+ (LTS)
├── PM2 (process manager)
│ ├── cluster mode, 2-4 workers
│ ├── --max-memory-restart 512M
│ └── auto-restart on crash
├── nginx (reverse proxy)
│ ├── SSL terminates here (certbot)
│ ├── proxy_pass to localhost:<port>
│ ├── proxy_read_timeout 15s (> Gemini 10s timeout)
│ └── keep-alive connections
└── .env.local (never committed to git)
├── GEMINI_API_KEY
├── GEMINI_ENABLED=true
└── rate limit constants
Build command: next build --webpack — webpack flag required for better-sqlite3 native module compatibility. Without this flag, the build will succeed but crash at runtime with a native addon error.
Deploy script pattern (deploy-pfp.sh):
git pull origin main
pnpm install
pnpm rebuild better-sqlite3  # rebuild native addon for the server's Node ABI
pnpm build
pm2 reload <app-name> --update-env
GEMINI_ENABLED kill switch: Check this env var at the top of every API route handler. Set to false to disable all generation instantly without a deploy. Use this when: unexpected content policy triggers, cost spike, Gemini API incident, or any production emergency.
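The kill-switch check can be sketched as follows. Only the `GEMINI_ENABLED` semantics come from the text above; the helper name and the 503 response shape are assumptions:

```typescript
// Sketch of the kill-switch check. GEMINI_ENABLED semantics come from the
// text; the helper name and response shape are assumptions.
export function generationEnabled(
  env: Record<string, string | undefined> = process.env,
): boolean {
  return env.GEMINI_ENABLED === "true";
}

// Usage at the top of app/api/generate/route.ts (illustrative):
// export async function POST(req: Request) {
//   if (!generationEnabled()) {
//     return Response.json({ error: "Generation is temporarily disabled." }, { status: 503 });
//   }
//   // ...validate styleId, rate limit, call Gemini...
// }
```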
SOURCE TIERS
TIER 1 — Primary / Official (cite freely)
| Source | Authority | What It Provides |
|--------|-----------|-----------------|
| Google AI Gemini Documentation | Model developer | API specs, reference image support, multimodal capabilities, pricing ($0.039/image as of 2025) |
| Next.js Documentation | Framework | App Router, API routes, --webpack build flag, image optimization |
| better-sqlite3 Documentation | Library | SQLite integration, native addon rebuild requirements |
| PM2 Documentation | Process manager | Cluster mode, memory limits, log rotation |
| OWASP API Security Top 10 | Security standard | Rate limiting, input validation, API key protection |
TIER 2 — Academic / Peer-Reviewed (cite with context)
| Paper | Authors | Year | ID | Key Finding |
|-------|---------|------|----|-------------|
| Textual Inversion: Personalizing T2I | Gal, Alaluf et al. | 2022 | arXiv:2208.01618 | 3-5 reference images + single token can capture unique visual concepts |
| DreamBooth: Subject-Driven Generation | Ruiz, Li, Jampani et al. | 2023 | arXiv:2208.12242 | Identity tokens must precede scene descriptions for consistency |
| Attend-and-Excite: Attention-Based Guidance | Chefer, Alaluf et al. | 2023 | arXiv:2301.13826 | Earlier tokens receive higher attention weight — front-load identity; models routinely omit subjects from complex prompts |
| P+: Extended Textual Conditioning | Voynov, Chu et al. | 2023 | arXiv:2303.09522 | Layered/structured prompts outperform flat text descriptions |
| HyperDreamBooth: Fast Personalization | Ruiz, Li, Jampani et al. | 2023 | arXiv:2307.06949 | Diverse reference images (angle/expression variety) outperform repeated similar images |
| IP-Adapter: Image Prompt Adapter | Ye, Zhang, Liu et al. | 2023 | arXiv:2308.06721 | Image embeddings enable identity-preserving style variation — same character across infinite scenes |
| PhotoMaker: Customizing Human Photos | Li, Cao, Wang et al. | 2023 | arXiv:2312.04461 | Precision of identity specification directly determines consistency across generations |
| InstantID: Zero-Shot Identity Preservation | Wang, Bai et al. | 2024 | arXiv:2401.07519 | Single-image identity preservation without fine-tuning |
| Character-Adapter: Region Control | Ma, Xu, Tang et al. | 2024 | arXiv:2406.16537 | 3-layer consistency stack required for reliable character preservation |
| CoinCLIP: Memecoin Viability Framework | Long, Li, Cai | 2024 | arXiv:2412.07591 | Visual content (mascot/logo) is #1 predictor of memecoin viability |
| Image Memorability Predicts Virality | Peng, Bainbridge | 2024 | arXiv:2409.14659 | Semantic distinctiveness drives memorability AND viral spread |
TIER 3 — Industry Experts (context-dependent, cross-reference)
| Expert | Affiliation | Domain | Key Contribution |
|--------|------------|--------|-----------------|
| Daniel Cohen-Or | Tel Aviv University | Prompt Conditioning | Co-author Textual Inversion, Attend-and-Excite, P+ — central figure in text-based diffusion control |
| Rinon Gal | Tel Aviv University / NVIDIA | Concept Injection | Lead author Textual Inversion; pioneer in concept personalization for closed models |
| Nataniel Ruiz | Google Research | Subject-Driven Generation | Lead author DreamBooth + HyperDreamBooth; reference for few-shot identity preservation |
| Hila Chefer | Tel Aviv University / Google | Attention Guidance | Lead author Attend-and-Excite; expert in prompt faithfulness for complex character descriptions |
| Xintao Wang | Tencent ARC Lab | Identity Preservation | Co-author PhotoMaker; specialist in identity-preserving style variation |
TIER 4 — Never Cite as Authoritative
- "How to build an AI PFP generator" blog posts without code or research backing
- Discord/Telegram advice on prompt engineering without test results
- Twitter/X threads about "best Gemini prompts" without methodology
- YouTube tutorials about AI avatar generators without published research
- Tool vendor blogs (Canva, Adobe, Midjourney) claiming product capabilities without source
CROSS-SKILL HANDOFF RULES
Outgoing Handoffs
| Trigger | Route To | Pass Along |
|---------|----------|-----------|
| Character identity design for a specific character | aped-pfp-prompt-engineer | Character spec (skin, eyes, mouth, build, outfit), both failure modes, initial banned archetype list |
| Complex Next.js implementation | fullstack-engineer | Product requirements, API contract, component specs |
| Brand alignment for generated outputs | memecoin-website-expert | Style preset samples across all tiers, character identity spec |
| Image optimization / compression | image-guru | Raw preview images, target file size constraints |
| Security audit for the API | api-security-specialist | API route code, rate limiting implementation, current threat model |
| APED-specific maintenance | aped-pfp-generator | Task scope, relevant files changed, test results |
Inbound Handoffs
| From Skill | What They Provide | What This Skill Does With It |
|-----------|-------------------|------------------------------|
| aped-pfp-prompt-engineer | Tested prompt suffixes + artOverride for new styles | Integrates into styles.data.mjs, runs preview generation, verifies benchmark |
| memecoin-website-expert | Brand guidelines, visual DNA for new character | Translates brand spec into character identity system |
| fullstack-engineer | New product feature | Integrates within PFP generator architecture |
| generative-art-orchestrator | Art direction changes | Updates style presets and artOverride logic |
ANTI-PATTERNS
| Anti-Pattern | Why It Fails | Correct Approach |
|-------------|-------------|-----------------|
| Building before defining character identity | Code produces inconsistent output from day 1; retrofitting identity constraints is 5× harder than designing them upfront | Character identity spec + failure mode map + banned archetype list BEFORE writing code |
| Single Gemini model, no fallback | Pro model has ~5% unavailability; site goes down for 5% of attempts with no recovery | Always implement Flash text-only fallback — triggers automatically on 503/UNAVAILABLE |
| Client-side API key | Exposes key, enables unlimited abuse, no rate limiting | Server-side API route only; never NEXT_PUBLIC_GEMINI_* |
| Style data in two files | Preview script and live generator silently diverge; previews show different styles than the generator produces | styles.data.mjs single source of truth, imported by both |
| "Just describe it better" for character drift | Text prompts cannot override deeply embedded meme template associations in training data | artOverride to force non-2D geometry — this is the only reliable fix for CRITICAL tier scenes |
| Forced gallery opt-in | Users share less (resentment vs. choice), gallery quality is lower, community sentiment is damaged | Voluntary post-generation prompt: "Share to gallery?" |
| Pepe-coded reference images | Reference images activate their FULL meme scene context, not just the character — wrong scenes override the identity block | Curate references against banned archetype list; reject any image with a Pepe-coded scene |
| Deploying without PM2 | Next.js process dies silently on crash; site stays down until manual restart | PM2 cluster mode with auto-restart and memory limits is mandatory for VPS deployment |
| Testing new styles with 1-2 generations | Stochastic generation — one good result is statistical noise, not validation | Run --validate-identity (3× classic), then test new style 5-10× before deploying |
| Building from npm create next-app without studying reference | Misses rate limiting, dual-model fallback, artOverride system, challenge tokens — all lessons from the APED build | Study clients/kenzo-pfp-generator/site/ before starting any new PFP build |
I/O CONTRACT
Required Inputs
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| business_question | string | Yes | What to build or design (e.g., "Build a PFP generator for [character]" or "Design the style preset system for [project]") |
| character_brief | string | Conditional | Required for new builds: character name, description, existing visual references |
| scope | enum | Yes | One of: full-build, character-identity, style-presets, gemini-integration, deployment, new-style |
| reference_build | enum | No | Default: aped (pfp.aped.wtf). Reference build to draw patterns from. |
Output Format
For full-build scope: complete specification document covering all 9 build steps (see Playbook 1), ready for implementation.
For character-identity scope: character spec, failure mode table, banned archetype list, risk tier system, reference image curation criteria.
For style-presets scope: complete styles.data.mjs content with all styles classified by risk tier and artOverride rationale.
Handoff Template
## HANDOFF — PFP Product Blueprint -> [Receiving Skill]
**Task completed:** [Design phase / implementation spec / style system]
**Deliverables:** [List of files created, specs written, decisions made]
**Character identity spec:** [Skin hex, eye/mouth spec, outfit anchor]
**Failure modes identified:** [Mode A (generic drift), Mode B (template activation)]
**Banned archetypes:** [List of banned scene types]
**Style tier classification:** [LOW/MEDIUM/HIGH/CRITICAL for each style]
**Open items:** [What receiving skill needs to implement]
**Confidence:** [HIGH / MEDIUM / LOW + justification]
ACTIONABLE PLAYBOOK
Playbook 1: New PFP Generator Product from Scratch
Trigger: "Build a PFP generator for [memecoin project]"
Phase 1: Character Identity (before any code)
- Collect 20-30 reference images of the character from official sources
- Map the character's visual DNA: skin color (hex), eye spec, mouth spec, build/proportions, outfit anchor
- Identify the two failure modes: search X/Know Your Meme for adjacent meme characters
- Build the banned scene archetype list: list 5-10 scenes deeply associated with Failure Mode B
- Build the risk tier system: classify each planned style against the tier table
- Curate 12-14 reference images: reject any with banned scene contexts or low fidelity
- Slot 1 = canonical portrait (highest attention weight)
Phase 2: Prompt Architecture
- Write `CHARACTER_IDENTITY_PRO` (4-layer): study references → character rules → CRITICAL face descriptors → CRITICAL FAILURES
- Write `CHARACTER_IDENTITY_FLASH` (expanded Layer 3 to compensate for no reference images)
- Write `ART_DIRECTION_DEFAULT` (default 2D cartoon style for LOW/MEDIUM risk styles)
- Write `artOverride` blocks for each HIGH/CRITICAL risk style
Phase 3: Style Preset Library
- Write all style presets into `styles.data.mjs` with the risk tier noted in comments
- For each style: check against banned archetypes → add anchor prop → write promptSuffix (scene/accessories only) → add artOverride if HIGH/CRITICAL
Phase 4: Gemini Integration
- Implement `lib/gemini.ts` with the dual-model pattern (Pro + Flash fallback)
- `loadReferenceImages()`: filesystem read, base64, in-memory cache, path from `public/reference/`
- `buildPrompt()`: assemble 4 identity layers + ART_STYLE (default or artOverride) + SCENE + OUTPUT_CONSTRAINTS + custom prompt
- `tryProModel()`: 2 retries at 1.5s intervals, fall back to Flash on 503/UNAVAILABLE
- `extractImage()`: parse the Gemini response for inline image data
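The retry-then-fallback shape of the orchestration can be sketched SDK-independently by injecting the two generate functions. Retry count and delay follow the text (2 retries at 1.5s); the function name is an assumption, and error filtering is simplified — a real build would only fall back on 503/UNAVAILABLE-style errors:

```typescript
// Sketch of the Pro -> Flash orchestration. The generate functions are
// injected so the pattern is independent of any SDK; a real build would
// inspect the error and only fall back on 503/UNAVAILABLE.
export async function generateWithFallback(
  tryPro: () => Promise<string>,
  tryFlash: () => Promise<string>,
  retries = 2,
  delayMs = 1500,
): Promise<string> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await tryPro(); // Pro: reference images + text
    } catch {
      if (attempt < retries) await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  return tryFlash(); // Flash text-only fallback: never ship without it
}
```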
Phase 5: API Route + Rate Limiting
- `app/api/generate/route.ts`: check GEMINI_ENABLED → validate styleId (enum) → sanitize customPrompt → rate limit check → Gemini call → return image
- `lib/rate-limit.ts`: 3-tier (per-IP 15/15min, burst 5/min, global hourly cap), in-memory with TTL, return `X-RateLimit-Remaining` header
- `lib/challenge.ts`: challenge token for bot protection (JS-generated, verified server-side)
- Error handling: all paths return a user-friendly message, no internal exposure
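The per-IP tier of the limiter can be sketched as a fixed window with a TTL. Limits follow the text (15 requests per 15 minutes); the function name is an assumption, and the real lib/rate-limit.ts layers a burst tier and a global hourly cap on top:

```typescript
// Sketch of the per-IP fixed-window tier only. Limits come from the text;
// the burst tier and global hourly cap are omitted for brevity.
const WINDOW_MS = 15 * 60 * 1000;
const LIMIT = 15;

interface Bucket { count: number; resetAt: number }
const buckets = new Map<string, Bucket>();

// Returns remaining quota (surface as X-RateLimit-Remaining), or -1 to reject.
export function checkRateLimit(ip: string, now = Date.now()): number {
  const b = buckets.get(ip);
  if (!b || now >= b.resetAt) {
    buckets.set(ip, { count: 1, resetAt: now + WINDOW_MS }); // fresh window
    return LIMIT - 1;
  }
  if (b.count >= LIMIT) return -1; // over quota: reject
  b.count += 1;
  return LIMIT - b.count;
}
```

Expired buckets are lazily replaced on the next request from that IP, which keeps the map bounded without a sweeper for typical traffic.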
Phase 6: Database + Gallery
- `lib/db.ts`: SQLite schema (generations, challenges, rate_limits)
- `app/api/gallery/`: gallery routes with opt-in write and public read
- `app/api/stats/`: aggregate stats endpoint
- `app/api/track/`: analytics event tracking (style selected, downloaded, shared)
Phase 7: Frontend
- `components/generator/style-selector.tsx`: grid of style tiles with emoji + label + description
- `components/generator/image-preview.tsx`: displays the generated image + loading state
- `components/generator/history-strip.tsx`: last 10 generations from localStorage
- `components/generator/download-button.tsx`: base64 → PNG blob, filename with character + style
- `components/generator/share-button.tsx`: Web Share API (mobile) + X pre-fill fallback
- `components/generator/gallery-opt-in.tsx`: voluntary post-generation prompt
Phase 8: Preview Generation
- `scripts/generate-previews.mjs`: mirror the Gemini pipeline (Flash model, import from styles.data.mjs)
- Verify the `--validate-identity` flag regenerates classic 3× for the benchmark
- Run `--dry-run` to verify the import chain, then run full preview generation
Phase 9: Deployment
- Configure PM2: cluster mode, memory limit, app name
- Configure nginx: proxy_pass to port, proxy_read_timeout 15s, SSL via certbot
- Write `deploy.sh`: git pull → pnpm install → rebuild better-sqlite3 native addon → pnpm build → pm2 reload --update-env
- Set `.env.local`: GEMINI_API_KEY, GEMINI_ENABLED=true, rate limit constants
- First deploy + smoke test: generate 5 images across 3 styles, verify kill switch
Playbook 2: Adding a New Style Preset
Trigger: "Add [style name] style to existing generator"
- Receive style concept — note scene, mood, distinctive elements
- Classify risk tier: check against the character's banned scene archetype list
- If LOW/MEDIUM: design an anchor prop that gives the model a distinctive focal point
- Write `promptSuffix`: scene + accessories + props only. NO face descriptors. NO character identity language.
- If HIGH/CRITICAL: write `artOverride` — select the geometry mode that cannot produce Failure Mode B anatomy
- Add complete style object to `lib/styles.data.mjs`
- Run `--validate-identity` first — verify classic benchmark passes
- Generate preview: `node scripts/generate-previews.mjs --style <id> --force`
- Review: character correct? Style coherent? Works at 64×64?
- If not passing: route to `aped-pfp-prompt-engineer` for iteration
- If passing: generate 5-10 more through live API to verify consistency
- Deploy
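The tiering decision in the playbook can be sketched as a rule over the banned archetype list. The archetype entries and keyword matching below are illustrative assumptions, not APED's actual data:

```typescript
type RiskTier = 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL';

// Per-character banned scene archetype list (hypothetical entries)
const BANNED_ARCHETYPES = ['office desk', 'casino', 'keyboard'];

function classifyTier(sceneDescription: string): RiskTier {
  const scene = sceneDescription.toLowerCase();
  const hits = BANNED_ARCHETYPES.filter((a) => scene.includes(a)).length;
  if (hits >= 2) return 'CRITICAL'; // squarely inside a banned meme template
  if (hits === 1) return 'HIGH';    // adjacent to a banned template
  return 'LOW';                     // MEDIUM needs softer signals than keyword hits
}

// Per the playbook: every HIGH/CRITICAL style must ship an artOverride
// whose geometry cannot produce Failure Mode B anatomy
function requiresArtOverride(tier: RiskTier): boolean {
  return tier === 'HIGH' || tier === 'CRITICAL';
}
```

Real classification is a judgment call against the character's failure mode table; a keyword rule like this only makes the decision procedure explicit.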
Playbook 3: Adapting the Blueprint for a New Character
Trigger: "We're building a PFP generator for [new character]"
- Brief the character: name, origin, visual references, community context
- Identify what the model will produce by default for this character type (Failure Mode A)
- Identify what scene/language combinations activate the wrong output (Failure Mode B) — search X and Know Your Meme for this character type
- Build the failure mode table specific to this character
- Build the banned scene archetype list
- Define the character's visual DNA at spec level (hex codes, proportions, outfit)
- Write the identity block layers 1-4 for this character — test against both failure modes
- Build the risk tier system: classify the planned styles
- Curate 12-14 reference images using the curation criteria
- Proceed to Playbook 1 Phase 3 (style preset library) with this character's specific identity system
Verification Trace Lane (Mandatory)
Meta-lesson: Broad autonomous agents are effective at discovery, but weak at verification. Every run must follow a two-lane workflow and return to evidence-backed truth.
- Discovery lane
  - Generate candidate findings rapidly from code/runtime patterns, diff signals, and known risk checklists.
  - Tag each candidate with `confidence` (LOW/MEDIUM/HIGH), impacted asset, and a reproducibility hypothesis.
  - VERIFY: Candidate list is complete for the explicit scope boundary and does not include unscoped assumptions.
  - IF FAIL → pause and expand scope boundaries, then rerun discovery limited to the missing context.
- Verification lane (mandatory before any PASS/HOLD/FAIL)
  - For each candidate, execute/trace a reproducible path: exact file/route, command(s), input fixtures, observed outputs, and expected/actual deltas.
  - Evidence must be traceable to a source of truth (code, test output, log, config, deployment artifact, or runtime check).
  - Re-test at least once when confidence is HIGH or when a claim affects auth, money, secrets, or data integrity.
  - VERIFY: Each finding either has (a) concrete evidence, (b) an explicit unresolved assumption, or (c) is marked as speculative with a remediation plan.
  - IF FAIL → downgrade severity or mark an unresolved assumption instead of deleting the finding.
- Human-directed trace discipline
  - In non-interactive mode, unresolved context must be emitted as `assumptions_required` (explicitly scoped and prioritized).
  - In interactive mode, unresolved items must request direct user validation before the final recommendation.
  - VERIFY: Output includes a chain of custody linking input artifact → observation → conclusion for every non-speculative finding.
  - IF FAIL → do not finalize output; route to a `SELF-AUDIT-LESSONS`-compliant escalation with an explicit evidence gap list.
- Reporting contract
  - Distinguish `discovery_candidate` from `verified_finding` in reporting.
  - Never mark a candidate as closure-ready without verification evidence or an accepted assumption and owner.
  - VERIFY: Output includes what was verified, what was not verified, and why any gap remains.
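The reporting contract's candidate/finding distinction can be sketched as a data shape. Field names beyond `discovery_candidate` and `verified_finding` are assumptions for illustration:

```typescript
type Confidence = 'LOW' | 'MEDIUM' | 'HIGH';

interface Candidate {
  kind: 'discovery_candidate';
  claim: string;
  confidence: Confidence;
  impactedAsset: string;
}

interface VerifiedFinding extends Omit<Candidate, 'kind'> {
  kind: 'verified_finding';
  evidence: string[]; // chain of custody: artifact → observation → conclusion
}

// A candidate is promoted only when it carries traceable evidence;
// with none, it stays a candidate (downgrade, never silently delete).
function promote(c: Candidate, evidence: string[]): Candidate | VerifiedFinding {
  if (evidence.length === 0) return c;
  return { ...c, kind: 'verified_finding', evidence };
}
```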
SELF-EVALUATION CHECKLIST
Before delivering any PFP product blueprint or build, verify:
- [ ] Character identity spec defined at hex-code precision (not "gray" — exact hex)
- [ ] Both failure modes documented with specific triggers identified
- [ ] Banned scene archetype list covers all Failure Mode B adjacent contexts
- [ ] Risk tier assigned to every planned style preset
- [ ] Reference images curated: no Failure Mode B scenes, 12-14 total, canonical portrait in slot #1
- [ ] Prompt architecture follows 4-layer structure (study references → rules → face descriptors → failures)
- [ ] CRITICAL FAILURES layer explicitly names both failure modes
- [ ] Flash fallback prompt expands Layer 3 to compensate for missing reference images
- [ ] `styles.data.mjs` is single source of truth — not duplicated in any other file
- [ ] Every HIGH/CRITICAL style has `artOverride` with geometry that breaks Failure Mode B anatomy
- [ ] `--validate-identity` implemented in preview script (3× classic benchmark)
- [ ] Dual-model pipeline: Pro with references primary, Flash text-only fallback
- [ ] Rate limiting: 3-tier (per-IP, burst, global)
- [ ] Kill switch: GEMINI_ENABLED env var checked before every generation
- [ ] No sensitive data (API key, full prompt) returned to client
- [ ] Gallery opt-in is voluntary (not forced)
- [ ] Download filename includes character name + style ID
- [ ] PM2 cluster mode with auto-restart configured for deployment
- [ ] Confidence level assigned to all recommendations
FEW-SHOT OUTPUT EXAMPLES
Example 1: Character Identity Brief for New Project
Input: "We need a PFP generator for [MEOW] — a cartoon orange tabby cat with a perpetual smug expression. The token symbol is $MEOW."
Output:
**Character Identity System — $MEOW**
**Visual DNA:**
- Species: Domestic orange tabby cat, cartoon proportions
- Fur color: Vivid orange (#E8671A) with subtle tabby stripes — NO pale/washed out orange
- Eyes: Heavily lidded, perpetual smug half-squint — signature feature. NOT fully open round eyes.
- Mouth: Subtle upward curl at corners — smug, never neutral. Wide enough to span ~60% of face width.
- Build: Chonky. Round body, short neck, rounded head — NOT slim, NOT realistic cat proportions.
- Outfit anchor: A gold "$MEOW" coin around the neck on a thin chain — ALWAYS present in all styles.
**Failure Mode A (underspecified drift):**
Generic Garfield. When "orange cartoon cat" is not further specified, models default to Garfield-adjacent output: wide horizontal body, lazy expression, identical orange. Prevent by: emphasizing the CHONKY round proportions (not flat lazy cat), the SMUG half-squint (not Garfield's sleepy frown), and the $MEOW coin anchor prop.
**Failure Mode B (scene template activation):**
Internet cat memes — "I Can Has Cheezburger", keyboard cat, grumpy cat, business cat. Scenes with: computers, office settings, business suits, cheeseburgers, keyboards. The cat-at-desk and cat-looking-unimpressed-at-human templates are deeply trained. Prevent by: Low-poly 3D artOverride for any office/desk scene, and enforcing the SMUG expression (not grumpy/neutral) throughout.
**Banned Scene Archetypes:**
1. Cat at computer/keyboard → triggers "I Can Has" / keyboard cat template
2. Cat sitting on desk with papers → business cat template
3. Cat looking up at something above frame → grumpy cat looking up template
4. Cat with flat/neutral expression in office → business cat / grumpy cat overlap
5. Cat in front of plain white background → stock photo cat, loses character identity
**Risk Tier Pre-Classification (proposed 8 styles):**
- Boss Mode (golden crown, dark luxury background): LOW — crown is strong anchor
- Degen (Vegas casino): HIGH — cat in casino is deeply meme-coded → artOverride low-poly 3D
- Beach Chill: LOW — tropical scene, sunglasses anchor
- Laser Eyes: LOW — strong distinctive effect
- Cyberpunk Night: MEDIUM — add gold chain as anchor, state smug expression explicitly
- Office (suit + briefcase): CRITICAL — business cat template → low-poly 3D artOverride required
- Space: LOW — space suit breaks template entirely
- Classic: LOW — coin medallion anchor
**Confidence:** HIGH — failure mode analysis based on training data associations observable from standard cat meme distributions. Risk tiers follow the same risk model used in the APED build.
Example 2: artOverride Design for Banned Scene
Input: "I want to add a 'Trading Floor' style for MEOW — cat watching charts and market data."
Output:
**Trading Floor Style — Risk Assessment**
This scene is CRITICAL tier. "Cat watching monitor at night with charts" is the internet's
most embedded cat meme context. Even with gold chain, smug expression, and orange fur —
a 2D cartoon cat at a trading terminal will produce a chart-watcher meme template 80%+ of the time.
**artOverride approach:**
Force low-poly 3D geometry. The polygon mesh cannot produce the smooth 2D cartoon surfaces
that make meme templates recognizable.
**Style object for `styles.data.mjs`:**

```js
{
  id: 'trading-floor',
  label: 'Trading Floor',
  description: 'Market alpha detected',
  emoji: '📈',
  promptSuffix:
    'MEOW in a high-frequency trading floor environment. Multiple holographic ' +
    'screens showing green price charts, data streams, Bloomberg terminal aesthetics. ' +
    'A gold $MEOW coin medallion on a thin chain. Paw raised toward one of the screens. ' +
    'Portrait-bust angle, face dominating the frame.',
  artOverride:
    'Low-poly 3D render with chunky polygonal geometry — early PlayStation 2 era ' +
    'aesthetic. Hard polygon facets visible on the cat body, face, and arms. NOT 2D cartoon ' +
    'illustration. NOT cel-shaded 2D. Polygon geometry only. Trading floor environment also ' +
    'rendered in low-poly 3D with polygon holographic screens.',
  tags: ['crypto', 'trading'],
  previewImage: '/previews/trading-floor.jpg',
}
```
**Why this works:** The polygon mesh forces chunky 3D cat anatomy — cannot produce the smooth
2D cartoon surfaces of the meme templates. The trading terminal context is preserved as scene
dressing, not character-defining context.
**Confidence:** HIGH — same artOverride pattern as APED's degen/vaporwave/y2k styles,
which passed identity benchmark after switching from 2D cartoon to low-poly 3D.