SlideForge — Detailed Report
Project Origin
Why This Tool Was Needed
The company frequently creates investment research PowerPoint presentations but faces a common problem: high content quality, low visual design quality. Analysts spend significant time researching data and writing arguments, yet the resulting PPTs often have inconsistent fonts, colors, layouts, and chart styles. Hiring professional designers is expensive and hard to schedule, and applying templates in-house never achieves the desired results.
Original Requirements
Build an AI-powered PPT beautification system with these core constraints:
- Never modify text content — analysts' text is the most valuable asset, not a single word should change
- Automatically elevate visual quality to institutional research report grade
- Preserve all original materials — images, tables, charts must remain intact
- Avoid AI template cheapness — output should look human-designed, not obviously AI-generated
- Reusable — establish universal rules and processes applicable to any PPT topic
Technical Approach
Why PptxGenJS
After evaluation, PptxGenJS (Node.js package) was chosen as the core execution engine for its complete programmatic control over every element's precise coordinates, colors, and fonts. python-pptx was used for reading and verification but not for rebuilding. PowerPoint COM was used only for screenshot export during QA.
Supporting Tool Chain
- python-pptx + markitdown: Read original PPT structured data, extract text for diff verification
- Gemini Image MCP (Nano Banana): AI image generation (backgrounds, content illustrations)
- image-size: Read actual image dimensions for proper aspect ratio scaling
- PowerPoint COM: Export slides as JPG for visual QA
Development Process
Phase 1: Foundation Architecture
- Wrote CLAUDE.md (complete instruction manual defining the 5-stage process, tool usage, checkpoint protocol)
- Wrote DESIGN_PRINCIPLES.md (design principles and preferences, including Anti-AI-Slop strategy)
- Defined file structure (input/working/output/scripts)
Phase 2: First Real Case — ALM.pptx Beautification
- Extracted the complete structure of the original PPT using python-pptx (101 images, 3 tables, 32 slides)
- Exported all slide screenshots via PowerPoint COM
- Produced analysis report identifying design issues: illogical dark/light background alternation, 4 different table styles, 3 consecutive "three-card layout" patterns (AI slop characteristic), rainbow gradient title text
Phase 3: Design System + First Rebuild
- Proposed 3 design directions (Charcoal x Gold / Terracotta Earth / Steel Cool Gray) — Charcoal x Gold selected
- Built DESIGN_SYSTEM.json (complete color, font, spacing definitions)
- Generated background and content images with Gemini
- Wrote rebuild.js (~1,100 lines), rebuilding all 32 slides from scratch
- First output: beautified.pptx
Phase 4: Problem Discovery and Fix Cycles (4 Rounds)
Round 1: All slides truncated on right 30%. Root cause: LAYOUT_16x9 is actually 10"x5.625", not 13.33"x7.5". Fixed with defineLayout specifying correct dimensions.
Round 2: The character for "tungsten" was written incorrectly (Unicode confusion). Found 5 commonly confused Traditional Chinese character pairs, applied global replacement.
Round 3: Violated DESIGN_PRINCIPLES — decorative lines, inconsistent backgrounds, repeated card decorations. Updated rules, unified backgrounds, removed all decorative lines.
Round 4: Image distortion (stretched/compressed), zigzag layout hard to read, text orphans. Added fitImgData() for proper aspect ratio scaling, reverted to three-column cards, reduced font sizes to prevent orphans.
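The Round 1 fix can be sketched as follows. This is a minimal illustration, not the project's rebuild.js: the layout name WIDE_13x7 and the helper applyWideLayout are made up for this example, while defineLayout and the layout property are the calls PptxGenJS documents for custom slide sizes.

```javascript
// Target canvas in inches. PptxGenJS's built-in LAYOUT_16x9 is 10 x 5.625,
// so coordinates planned for a 13.33 x 7.5 slide fall off the right edge.
const WIDE = { name: "WIDE_13x7", width: 13.33, height: 7.5 }; // name is hypothetical

// Register the custom layout on a presentation object and select it.
function applyWideLayout(pptx) {
  pptx.defineLayout(WIDE);
  pptx.layout = WIDE.name;
  return pptx;
}

// Usage with the real library (requires `npm install pptxgenjs`):
//   const PptxGenJS = require("pptxgenjs");
//   const pptx = applyWideLayout(new PptxGenJS());
```

Selecting the layout once, before any slide is added, keeps every later coordinate in the same 13.33" x 7.5" space.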
Phase 5: Rule System Iteration
DESIGN_PRINCIPLES.md underwent 4 update rounds, growing from basic rules to a complete review system. Every new rule came from real-world mistakes:
- Background consistency (no mixing dark/light)
- Complete decorative line prohibition
- Vertical accent line prohibition
- Image aspect ratio enforcement (no manual w/h guessing)
- Text orphan prevention
- Material preservation (must verify against manifest for every image)
- Agent Team verification process
- Gemini prompts must be English only (Chinese renders as garbled text)
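The aspect ratio rule in this list reduces to a small calculation. The sketch below is one way to write it, not the project's actual fitImgData(): the function name fitWithin and the shape of the box argument are assumptions, and the pixel dimensions are expected to come from a reader such as image-size.

```javascript
// Fit an image of pixel size (imgW x imgH) inside a slide box given in inches,
// preserving aspect ratio and centering it in the leftover space.
function fitWithin(imgW, imgH, box) {
  if (imgW <= 0 || imgH <= 0) throw new Error("image dimensions must be positive");
  const imgRatio = imgW / imgH;
  const boxRatio = box.w / box.h;
  let w, h;
  if (imgRatio > boxRatio) {
    // Image is relatively wider than the box: width is the limiting side.
    w = box.w;
    h = box.w / imgRatio;
  } else {
    // Image is relatively taller (or the same shape): height is the limiting side.
    h = box.h;
    w = box.h * imgRatio;
  }
  return { x: box.x + (box.w - w) / 2, y: box.y + (box.h - h) / 2, w, h };
}

// A 1600x900 image placed in a 4" x 4" box keeps its 16:9 shape.
const placed = fitWithin(1600, 900, { x: 1, y: 1, w: 4, h: 4 });
console.log(placed);
```

The returned x, y, w, and h can be passed straight to an image placement call, so no width or height is ever typed by hand.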
Phase 6: Verification Tools
- Built scripts/audit_images.py (auto-compare manifest vs rebuild.js)
- Established 3-layer Agent Team verification: (1) parallel independent auditing, (2) comparison + visual review, (3) audit report with must-restore / ignorable / equivalent-conversion categories
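The manifest comparison can be illustrated with a few lines. The project's audit script is written in Python and is not shown in this report; the sketch below expresses the same idea in JavaScript to match rebuild.js. The manifest format (a plain array of file names), the sample paths, and the function name findUnreferencedImages are all assumptions.

```javascript
// Return manifest entries whose file name never appears in the rebuild script.
// Assumes the manifest is a plain array of image file names (hypothetical format).
function findUnreferencedImages(manifest, scriptSource) {
  return manifest.filter((name) => !scriptSource.includes(name));
}

// Made-up sample data for illustration only.
const manifest = ["image1.png", "image2.jpg", "image3.png"];
const script = `
  slide.addImage({ path: "working/media/image1.png", x: 1, y: 1, w: 3, h: 2 });
  slide.addImage({ path: "working/media/image3.png", x: 5, y: 1, w: 3, h: 2 });
`;
console.log(findUnreferencedImages(manifest, script)); // [ 'image2.jpg' ]
```

Plain substring matching is deliberately crude: it only shows which originals are never mentioned, and each hit still needs a human or agent decision (restore, ignore, or equivalent conversion).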
AI Image Generation Integration
Gemini Usage
Used Gemini 2.5 Flash (Nano Banana) via MCP tools for image generation.
Process: Read each slide's complete text content (not just titles), write English prompts based on described scenarios, generate images, visually verify quality.
Images generated for ALM.pptx:
- Uniform dark charcoal matte paper texture background (no decorative elements)
- Slide 2: High-temperature molten metal (text mentions "highest melting point 3422°C, tungsten carbide cutting tools")
- Slide 2: Wafer fabrication process (text mentions "tungsten hexafluoride for chip interconnects, EV batteries")
- Slide 2: Military equipment (text mentions "missile, armor-piercing core material")
- Slide 3: PCB micro-drilling close-up (text mentions "PCB micro-drill tools, semiconductor test packaging")
- Slide 3: Tungsten wire cutting industrial scene (text mentions "tungsten wire replacing diamond wire for silicon wafer cutting")
- Slide 20: Tungsten ore core samples, automated beneficiation lines, South Korean industrial port
Cost: ~$0.50 USD total for 12 images via Gemini 2.5 Flash.
Challenges Encountered
PPT Format Pitfalls
- LAYOUT_16x9 actual dimensions differ from expected; caused 30% right-side truncation
- PptxGenJS mutates option objects; second use of the same style object produces wrong results (solved with factory functions)
- Traditional Chinese Unicode confusion; 5 character pairs commonly mixed up
- The contain sizing mode (sizing: contain) leaves excessive whitespace; switched to manual dimension calculation using image-size
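The factory-function workaround for mutated option objects looks roughly like this. It is a sketch under stated assumptions: the style values are placeholders rather than the project's design system, and mutatingConsumer is a stand-in that imitates a library call rewriting its options in place.

```javascript
// A shared style object is risky: if the library rewrites a field in place,
// every later use sees the altered value.
// A factory returns a fresh object per call, so mutations cannot leak.
const titleStyle = () => ({
  fontFace: "Georgia", // hypothetical values, not the project's design system
  fontSize: 28,
  color: "D4AF37",
  bold: true,
});

// Stand-in for a library call that mutates its options argument.
function mutatingConsumer(opts) {
  opts.fontSize = opts.fontSize * 100;
  return opts;
}

mutatingConsumer(titleStyle());
const second = titleStyle();
console.log(second.fontSize); // 28
```

Because each call builds a new object, including any nested objects written inside the factory, the second slide gets the same style as the first regardless of what the library did to the earlier copy.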
Design Quality Issues
- Three repeated "three-card layout" patterns (AI tendency toward high-frequency patterns)
- Decorative lines appearing everywhere (AI habit of adding "beautification" lines)
- Same-type slides looking different (sub-agents making independent decisions without unified standards)
- Zigzag layouts sacrificing readability for "variety"
Sub-Agent Management
- Original images deleted by sub-agents making autonomous judgments; fixed by requiring image-by-image verification against the manifest
- Sub-agents unaware of DESIGN_PRINCIPLES; added a mandatory read of the file before they start work
- Fixes treating symptoms, not causes; established a root cause analysis habit
Case Study: ALM.pptx Before and After
Original: ALM.pptx (8.5MB, 32 slides) — Tungsten Investment Analysis for Almonty Industries (ALM.US)
Beautified: Charcoal and Gold design direction (~50MB)
Key Improvements
- Cover: Split green texture + white text replaced with unified dark background + gold geometric lines + clear hierarchy
- Tungsten Properties (Slide 3): Dark three-cards with no images replaced with three columns each featuring Gemini-generated thematic photos (smelting/wafer/military)
- Market Share (Slide 5): Crushed donut chart (oval) properly scaled to circular display
- Comparison Tables (Slide 10): Clashing orange-yellow gradient replaced with gold header + dark alternating rows
- Section Dividers: 4 inconsistent pages unified to identical centered title + gold subtitle
- Investment Recommendation (Slide 32): Messy green/red cards + gold background + decorative lines replaced with unified dark background + functional green/red borders (no decoration)
Content Integrity Verification
- Text diff: 4 differences across 32 slides — all spacing/line-break formatting (zero content loss)
- Tables: Original 3 preserved, Slide 24 image converted to programmatic table (more structured data)
- Images: 101 originals — 20 content images preserved + 9 Gemini replacements + 52 decorative items correctly removed + 18 pending restoration
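The text check above separates formatting differences from content changes. One way to make that distinction mechanical is sketched below; the function names and sample strings are invented for illustration, and the real pipeline extracts text with python-pptx and markitdown rather than hard-coded arrays.

```javascript
// Collapse every run of whitespace (spaces, tabs, line breaks) to one space.
const normalize = (text) => text.replace(/\s+/g, " ").trim();

// Compare per-slide text and classify each difference.
function diffSlides(original, rebuilt) {
  const report = [];
  const count = Math.max(original.length, rebuilt.length);
  for (let i = 0; i < count; i++) {
    const a = original[i] ?? "";
    const b = rebuilt[i] ?? "";
    if (a === b) continue; // identical text, nothing to report
    // Same text after whitespace normalization means only formatting moved.
    const kind = normalize(a) === normalize(b) ? "formatting" : "content";
    report.push({ slide: i + 1, kind });
  }
  return report;
}

// Made-up sample text: two formatting-only differences and one content change.
const original = ["Tungsten outlook\n2026", "Supply  is tight", "Buy"];
const rebuilt = ["Tungsten outlook 2026", "Supply is tight", "Hold"];
console.log(diffSlides(original, rebuilt));
```

Any entry of kind "content" would violate the no-text-modification constraint and should block the output, while "formatting" entries can be reviewed and accepted.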
Current Limitations and Future Direction
Current Limitations
- 18 content images (P0-P2 priority) still pending restoration
- Sub-agent quality variability — different judgments each run, needs stricter manifest comparison
- Gemini image randomness — same prompt produces slightly different styles
- Complex layouts difficult to programmatically recreate (e.g., geological cross-section multi-image collages)
- Font width rendering differences between PptxGenJS and actual PowerPoint
Future Direction
- Automated Agent Team verification with vision-based review
- Template library from completed cases for faster future beautification
- Enhanced Gemini integration — auto-decide needed images per slide
- Support for 4:3 and custom slide dimensions
- Full automation pipeline from input PPT to beautified output
- Continuous design quality learning through user feedback
Report Date: April 13, 2026