Introduction
Visual QA is one of those activities everyone agrees is important…right up until it becomes the bottleneck.
A page looks “basically right,” you’re under deadline, and that last review pass turns into a game of spot-the-difference: margin tweaks, heading sizes, tiny spacing inconsistencies that are easy to miss and painful to repeat across dozens (or hundreds) of pages.
In a recent Quito Lambda talk at Stack Builders, our team explored a practical approach to reducing manual visual QA time using AI-assisted development and pixel-based visual comparison: pulling a baseline from Figma, capturing the “about to go live” view from Adobe Experience Manager (AEM), and generating a visual diff report that shows exactly where the UI diverges.
Stack Builders works extensively with AEM and is an official Adobe Experience Manager partner, so this kind of workflow is directly aligned with the kind of enterprise-grade content operations we help teams modernize.
The Pain: Manual Visual QA Doesn’t Scale
If you’ve ever reviewed two screenshots that look identical, you know how this goes:
- A paragraph is shifted by ~40px.
- A heading is an H2 instead of an H3—visually “almost the same,” but not quite.
- Spacing changes by a couple of pixels, and nobody notices until a stakeholder does.
Manual checks are:
- Repetitive and tiring
- Time-consuming
- Inconsistent (different reviewers notice different things)
- Risky (small UI regressions slip through and show up in production)
And importantly, you repeat the same effort for every page, every time.
The Real-World Workflow: From Content to “Live”
In many organizations (especially those running AEM), the pipeline often looks like this:
- Content writing (messaging, paragraphs, structure)
- Design in Figma (layouts, tokens, components, specs)
- Authoring in AEM (drag-and-drop components, build pages from templates)
- Visual QA (verify AEM matches Figma)
- Publish (page goes live)
AEM is particularly powerful here because it enables non-developers to assemble pages from controlled templates and components. That is great for scale, but it also means small configuration differences can produce subtle visual drift.
The Goal: Faster QA, More Consistency, Better Evidence
The objective isn’t to “remove QA”; it’s to make QA more reliable and dramatically less manual.
A good automated approach should:
- Reduce the time spent visually inspecting pages
- Increase consistency across reviews
- Produce evidence (diff images + percentage change) that teams can act on quickly
This is where pixel-based visual comparison shines.
Pixel-Based Comparison: Simple Idea, Huge Leverage
At the core is a straightforward method:
- Capture Screenshot A (baseline, e.g., from Figma export)
- Capture Screenshot B (actual UI, e.g., AEM preview)
- Compare pixels (RGB values by position)
- Output:
- Diff image/heatmap
- Percent difference
- Optional: segmented diffs per section (header, hero, etc.)
This is a classic form of visual regression testing, where you compare screenshots to catch unintended UI changes.
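The comparison step above can be sketched in a few lines. This is a minimal illustration, not the talk’s actual implementation: it assumes both screenshots are already decoded into same-sized RGBA byte buffers (4 bytes per pixel) and returns the percent of pixels that differ beyond a small per-channel tolerance.

```typescript
// Minimal pixel-diff sketch (hypothetical): compares two same-sized RGBA
// buffers and returns the percentage of pixels that differ.
function percentDiff(
  a: Uint8Array,
  b: Uint8Array,
  width: number,
  height: number,
  tolerance = 10 // per-channel tolerance to absorb anti-aliasing noise
): number {
  if (a.length !== b.length) throw new Error("Buffers must be the same size");
  let changed = 0;
  for (let i = 0; i < width * height; i++) {
    const o = i * 4; // 4 bytes per pixel: R, G, B, A
    const differs =
      Math.abs(a[o] - b[o]) > tolerance ||
      Math.abs(a[o + 1] - b[o + 1]) > tolerance ||
      Math.abs(a[o + 2] - b[o + 2]) > tolerance;
    if (differs) changed++;
  }
  return (changed / (width * height)) * 100;
}
```

In practice a library like pixelmatch does this (plus anti-aliasing detection and diff-image output), but the core idea really is this simple.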
Where AI Fits: Building the Tool Faster (and Better) with “Vibe Engineering”
A key theme from the talk was the difference between:
- Vibe coding: “Prompt it and ship it.”
- Vibe engineering: Use AI for speed, but keep engineering discipline—security, reliability, maintainability, and real-world scalability.
The AI helped accelerate:
- Rapid prototyping of integrations (Figma + AEM preview capture)
- Refactoring guidance
- Documentation generation
- Security improvements (e.g., safer credential/token handling)
But the takeaway was clear: AI is strongest when paired with experienced engineering judgment that sets constraints, reviews outputs, and enforces standards.
A Practical Architecture: Figma + AEM + Screenshot Diffing
A lightweight architecture for AI-assisted visual QA looks like this:
Inputs
- Figma: design source of truth
- AEM Preview: “view as published” preview before release
Pipeline
- Pull/export the relevant frame from Figma (via API)
- Use browser automation to load AEM preview and capture a screenshot
- Normalize:
- crop / resize
- reduce whitespace
- align viewport
- Compare images (pixel-by-pixel)
- Produce a report: baseline, actual, diff/heatmap, percent change
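The final report step can be as simple as a typed record per page. The shape below is a hypothetical sketch (field names and the 3% default threshold are assumptions, not the tool’s actual output format):

```typescript
// Hypothetical per-page report record produced at the end of the pipeline.
interface VisualDiffReport {
  page: string;         // AEM page path or URL
  baselinePath: string; // Figma export screenshot
  actualPath: string;   // AEM preview screenshot
  diffPath: string;     // generated diff/heatmap image
  percentDiff: number;  // 0–100
  passed: boolean;      // under the agreed threshold?
}

function buildReport(
  page: string,
  baselinePath: string,
  actualPath: string,
  diffPath: string,
  percentDiff: number,
  threshold = 3 // assumed default; tune per project
): VisualDiffReport {
  return {
    page,
    baselinePath,
    actualPath,
    diffPath,
    percentDiff,
    passed: percentDiff <= threshold,
  };
}
```

Keeping the report as structured data (rather than just images) is what makes thresholds, CI gating, and batch summaries possible later.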
Example Tech Stack
- Node.js + TypeScript
- Express for APIs + Helmet for security headers
- Playwright (Chromium) for headless browser automation + screenshot capture
- Sharp for image preprocessing (crop/resize/cleanup)
- pixelmatch for pixel-based diffs
This combination is popular because it’s scriptable, fast, and easy to run locally or in CI.
What the Report Gives You (and Why It Matters)
Instead of “it looks off somewhere,” you get:
- A diff heatmap that pinpoints the UI drift
- A difference percentage that helps establish thresholds
- A repeatable process that’s consistent across reviewers/pages
A “good” page might show ~3% difference (often driven by tiny nav or content mismatches), while subtle layout issues (like heading sizing plus a 40px indentation) can push the diff higher (~5%), with the heatmap immediately highlighting the problem areas.
This is the big win: you can move from subjective review to actionable evidence.
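The heatmap itself can be generated in the same pass as the percentage. As a sketch, assuming the same RGBA buffers as before (libraries like pixelmatch produce a similar diff image for you):

```typescript
// Hypothetical diff-image sketch: copies the actual screenshot and paints
// differing pixels red so drift is easy to spot at a glance.
function diffImage(a: Uint8Array, b: Uint8Array, tolerance = 10): Uint8Array {
  const out = Uint8Array.from(b); // start from the actual screenshot
  for (let o = 0; o < a.length; o += 4) {
    const differs =
      Math.abs(a[o] - b[o]) > tolerance ||
      Math.abs(a[o + 1] - b[o + 1]) > tolerance ||
      Math.abs(a[o + 2] - b[o + 2]) > tolerance;
    if (differs) {
      out[o] = 255;     // R
      out[o + 1] = 0;   // G
      out[o + 2] = 0;   // B
      out[o + 3] = 255; // A (fully opaque)
    }
  }
  return out;
}
```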
Why “AI Image Analysis” Didn’t Fully Replace Pixel Diffs (Yet)
We’ve also experimented with using an AI model to interpret differences more semantically (“this heading should be smaller,” “this padding is off”). That part didn’t work as reliably as hoped.
The likely reason: pure screenshot-based AI analysis can struggle to infer intent and structure unless it’s grounded in the design system and underlying specs.
Which leads to the most important next step…
Roadmap: From Pixel Diffs to Design-System Validation
Pixel diffs are powerful, but the long-term path is even better:
1) Tighten Your Design System Bridge (Figma ↔ Implementation)
If Figma tokens and component structure map cleanly to your code (or CMS components), you can validate:
- typography scales
- spacing rules
- component variants
- layout constraints
This reduces false positives and moves QA closer to “verify intent,” not just pixels.
2) Use Design Tokens Consistently
Define tokens once (e.g., “Small = 14px”) and ensure they’re respected across:
- Figma
- CSS / component library
- AEM component styles
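A token check can be a one-line lookup against a shared definition. The token names and values below are illustrative assumptions, not a real design system:

```typescript
// Hypothetical shared token definitions ("define once").
const tokens: Record<string, string> = {
  "font-size.small": "14px", // e.g., "Small = 14px" from the design system
  "font-size.body": "16px",
  "spacing.md": "24px",
};

// Verify a rendered CSS value matches the design token it should come from.
function matchesToken(tokenName: string, renderedValue: string): boolean {
  const expected = tokens[tokenName];
  if (expected === undefined) throw new Error(`Unknown token: ${tokenName}`);
  return expected === renderedValue.trim();
}
```

Running checks like this against computed styles (from Figma specs on one side and the rendered page on the other) is what turns “pixels look different” into “this heading is using the wrong token.”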
3) Expand Breakpoints
Desktop-only diffs are a start. Add:
- tablet
- mobile
- responsive states
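Extending the runs is mostly a matter of crossing pages with viewports. The breakpoint values here are assumptions; swap in your design system’s actual breakpoints:

```typescript
// Assumed breakpoint set; adjust to match your design system.
interface Viewport { name: string; width: number; height: number; }

const viewports: Viewport[] = [
  { name: "desktop", width: 1440, height: 900 },
  { name: "tablet", width: 768, height: 1024 },
  { name: "mobile", width: 375, height: 812 },
];

// One capture task per page x viewport combination, ready to feed to the
// browser-automation step (e.g., Playwright's page.setViewportSize).
function captureTasks(pages: string[]): { page: string; viewport: Viewport }[] {
  return pages.flatMap(page => viewports.map(viewport => ({ page, viewport })));
}
```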
4) Batch Runs
Instead of page-by-page:
- run an entire path, site section, or folder of pages
- produce a consolidated report for review
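A consolidated report can be a simple aggregation over per-page results. This is a sketch under the same assumed 3% threshold as earlier; field names are hypothetical:

```typescript
// Hypothetical batch summary: aggregate per-page diff percentages.
interface PageDiff { page: string; percentDiff: number; }

function summarize(results: PageDiff[], threshold = 3) {
  const failing = results.filter(r => r.percentDiff > threshold);
  return {
    total: results.length,
    failing: failing.map(r => r.page), // pages needing human review
    worst: results.reduce(            // most-divergent page, reviewed first
      (m, r) => (r.percentDiff > m.percentDiff ? r : m),
      results[0]
    ),
  };
}
```

Sorting reviewers toward the worst offenders first is a small touch that makes the batch report genuinely faster than page-by-page inspection.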
5) Broaden CMS Compatibility
AEM is a great first target, but the concept generalizes to other CMS platforms.
Want to Make Visual QA Faster in Your AEM Pipeline?
If your team is authoring high volumes of pages in AEM and spending too much time on repetitive reviews, this kind of workflow can pay off quickly, especially once it’s wired into CI or editorial release processes.
Stack Builders works with organizations modernizing their AEM implementations and delivery pipelines.