Selenium and Visual Testing: The Complete Guide for 2026
Selenium Visual Testing: "An approach that uses the Selenium WebDriver framework to capture images of web interfaces and compare them to visual references, in order to detect unintentional regressions in the application's appearance."
Selenium is the most widely used automated testing framework in the world. With over 22,000 monthly searches for "selenium webdriver" alone, it's clear that the global QA community still revolves largely around this tool. And for good reason: since 2004, Selenium has defined what it means to "automate a browser."
But here's the problem nobody really wants to hear: Selenium was not designed for visual testing. It excels at functional testing — verifying that a button works, that a form submits data, that a page loads correctly. When it comes to verifying that the page looks the way it should, Selenium stares back at you with the puzzled expression of a robot asked to judge a beauty contest.
This article is not a tutorial that promises Selenium can do everything. It's an honest guide that explores what's possible, what's painful, and what deserves a better solution.
Why visual testing has become essential
For a long time, software testing came down to a binary question: "does it work?" Does the button submit the form? Does the page display the correct data? Does the user journey end with a successful payment?
These questions remain essential. But in 2026, they're no longer enough.
Modern users judge an application in under 50 milliseconds — that's how long the human brain takes to form a first visual impression, according to a study by Carleton University in Canada. A button that works but is offset by 20 pixels, text that overflows its container, a dark theme displaying black text on a gray background — all of this "works" in a functional sense, but destroys the user experience.
Visual testing bridges this gap. It verifies not what the application does, but what it shows. And in a world where interfaces change every sprint, detecting visual regressions automatically is no longer a luxury — it's a necessity.
The problem? Functional testing tools like Selenium were never designed for this.
What Selenium can do (and what it can't)
Selenium WebDriver is an extraordinary tool for what it was designed to do: drive a browser and interact with web elements. It can click, type, navigate, wait, and verify the presence or content of elements in the DOM.
What Selenium can do on the visual side boils down to exactly one thing: taking a screenshot (takeScreenshot() in the JavaScript bindings, getScreenshotAs() in Java, save_screenshot() in Python). This call captures the current browser state as a PNG image. That's it. No comparison, no diffs, no tolerance thresholds, no masking of dynamic areas.
It's a bit like being handed a camera and told "there you go, you're a professional photographer." The capture tool is there, but all the creative — and technical — work remains to be done.
Here's what Selenium cannot do natively:
- Compare two screenshots
- Detect visual differences between two versions of a page
- Manage reference images (baselines)
- Filter visual noise (anti-aliasing, animations, dynamic content)
- Generate visual diff reports
- Automatically update references when a change is intentional
To get all of that, you need to build or integrate. Let's look at the options.
Approach 1: screenshots and external comparison
The most basic approach — and the most painful — is to use Selenium's takeScreenshot() to capture images, then compare them using an image processing library.
The principle
You take a screenshot of your page with Selenium. You store it as a reference. The next time you run the test, you take a new screenshot and compare it pixel by pixel against the reference. If the differences exceed a certain threshold, the test fails.
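The comparison logic described above can be sketched in a few lines. This is a deliberately simplified stand-in for what libraries like pixelmatch or Pillow do: images are represented here as flat lists of (R, G, B) tuples rather than real PNG data, and the threshold and tolerance values are illustrative, not recommendations.

```python
# Toy sketch of the pixel-by-pixel comparison that approach 1 requires.
# In a real setup you would load the PNGs produced by Selenium with a
# library such as Pillow instead of hand-building pixel lists.

def diff_ratio(baseline, candidate, per_channel_tolerance=0):
    """Return the fraction of pixels whose channels differ by more
    than per_channel_tolerance."""
    if len(baseline) != len(candidate):
        raise ValueError("images must have the same dimensions")
    changed = 0
    for old, new in zip(baseline, candidate):
        if any(abs(a - b) > per_channel_tolerance for a, b in zip(old, new)):
            changed += 1
    return changed / len(baseline)

def assert_visually_equal(baseline, candidate, threshold=0.01):
    """Fail the test if more than `threshold` of the pixels changed."""
    ratio = diff_ratio(baseline, candidate)
    if ratio > threshold:
        raise AssertionError(
            f"{ratio:.1%} of pixels differ (threshold {threshold:.0%})")

# Example: a 4-pixel "image" where one pixel shifted slightly in color.
ref = [(255, 255, 255), (0, 0, 0), (10, 10, 10), (200, 200, 200)]
new = [(255, 255, 255), (0, 0, 0), (12, 10, 10), (200, 200, 200)]
print(diff_ratio(ref, new))                           # 0.25
print(diff_ratio(ref, new, per_channel_tolerance=5))  # 0.0
```

Even this toy version exposes the central tuning problem: a tolerance of zero flags an invisible 2-point color shift as a regression, while any nonzero tolerance risks hiding a real one.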
Common comparison tools
Several open-source libraries enable image comparison: pixelmatch (JavaScript), Pillow or scikit-image (Python), ImageMagick (command line). Each has its strengths, but none is specifically designed for web interface testing.
What you have to build yourself
In practice, this approach forces you to become the architect of a mini visual testing framework. You must manage the storage and versioning of reference images, comparison logic with configurable thresholds, masking of dynamic areas (dates, ads, personalized content), handling of different resolutions and viewport sizes, actionable reports that show differences, and a workflow for updating references when a change is intentional.
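One of those pieces, masking of dynamic areas, is worth seeing concretely. The sketch below uses toy 2D grids of grayscale integers; a real pipeline would blank the same rectangles in real screenshots (for instance with Pillow's ImageDraw) before running the comparison. Region coordinates and pixel values here are invented for illustration.

```python
# Sketch of masking a dynamic region (a timestamp banner, an ad slot)
# before comparing two screenshots, so that expected churn in that
# region cannot fail the test.

MASK_COLOR = 0  # paint masked pixels the same fixed value in both images

def mask_region(image, x, y, width, height):
    """Return a copy of `image` with the given rectangle overwritten."""
    masked = [row[:] for row in image]
    for row in range(y, y + height):
        for col in range(x, x + width):
            masked[row][col] = MASK_COLOR
    return masked

def equal_outside_mask(baseline, candidate, region):
    """Compare two images after blanking the same dynamic region in each."""
    x, y, w, h = region
    return mask_region(baseline, x, y, w, h) == \
           mask_region(candidate, x, y, w, h)

# A 3x4 "page" where only the top-left 2x1 strip (a clock, say) changed.
ref = [[5, 5, 9, 9],
       [7, 7, 7, 7],
       [7, 7, 7, 7]]
new = [[6, 4, 9, 9],   # dynamic strip differs, the rest is identical
       [7, 7, 7, 7],
       [7, 7, 7, 7]]
print(equal_outside_mask(ref, new, region=(0, 0, 2, 1)))  # True
print(equal_outside_mask(ref, new, region=(2, 0, 2, 1)))  # False
```

Multiply this by every dynamic region on every page, across every viewport size, and the maintenance burden described above becomes tangible.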
The limitations
This is the approach that offers the most control, but also the one that demands the most investment. You spend more time maintaining your visual testing infrastructure than writing actual tests. And every browser update, every change in font rendering can generate false positives that drown out real regressions.
Imagine an assistant who, instead of helping you find errors in a document, spends their time underlining every comma and asking if it's in the right place. Technically rigorous, practically unbearable.
Approach 2: third-party plugins and libraries
To avoid building everything from scratch, the community has created libraries that add visual testing capabilities on top of Selenium. Among the best known:
aShot (Java)
aShot is a Java library that extends Selenium's screenshot capabilities. It enables full-page capture (including off-screen content, via scrolling), image comparison with difference highlighting, and capture of a specific element only.
It's the most popular option in the Java/Selenium ecosystem, but it remains a low-level tool. You have the bricks, but the house is yours to build.
needle (Python)
needle is a Python library, originally built on nose/unittest, that integrates Selenium with image comparison; a pytest port, pytest-needle, also exists. It compares screenshots of full pages or specific elements and uses Pillow for the comparison.
The project has experienced periods of inactivity, and documentation can be sparse. It's a reasonable choice for modest needs, but shows its limitations on real-world projects.
BackstopJS
BackstopJS is not exactly a Selenium plugin — it's a standalone tool that uses Puppeteer or Playwright under the hood. But it deserves mention because many Selenium teams add it as a complement for visual testing.
BackstopJS offers declarative JSON configuration, multi-viewport captures, an interactive HTML report, and an approve/reject workflow for changes.
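That declarative configuration lives in a backstop.json file. A minimal sketch might look like the following; the URL, selector, and threshold values are placeholders, not recommendations:

```json
{
  "id": "marketing_site",
  "viewports": [
    { "label": "phone",   "width": 375,  "height": 667 },
    { "label": "desktop", "width": 1440, "height": 900 }
  ],
  "scenarios": [
    {
      "label": "Homepage",
      "url": "https://example.com/",
      "hideSelectors": [".ad-banner"],
      "misMatchThreshold": 0.1
    }
  ],
  "paths": {
    "bitmaps_reference": "backstop_data/bitmaps_reference",
    "bitmaps_test": "backstop_data/bitmaps_test",
    "html_report": "backstop_data/html_report"
  },
  "report": ["browser"],
  "engine": "puppeteer"
}
```

You then create references with backstop reference, run comparisons with backstop test, and promote intentional changes with backstop approve.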
It's probably the most mature open-source tool for visual testing. But it doesn't integrate directly with your existing Selenium tests — it's a separate tool with its own pipeline.
Common limitations of plugins
All these tools share fundamental limitations. They require development skills for setup and maintenance. They depend on open-source projects whose longevity is never guaranteed — one maintainer changes jobs and an entire ecosystem slows down. They add complexity to your stack without solving the structural problems of pixel-by-pixel comparison. And above all, they remain developer tools, inaccessible to the rest of the team.
Approach 3: integration with Applitools
The third path is integration with a dedicated visual testing SaaS. Applitools Eyes is the most well-known in this category.
The principle
Applitools provides an SDK that integrates directly with Selenium. Your existing Selenium tests are augmented with Applitools API calls that send screenshots to their cloud for comparison.
What Applitools brings
Applitools uses AI-based comparison technology (their "Visual AI") that is significantly smarter than pixel-by-pixel comparison. It understands page structure, ignores insignificant differences, and detects real regressions with a remarkably low false positive rate.
The cloud dashboard allows the entire team (not just developers) to see results, approve changes, and track visual test status.
The limitations
Comfort comes at a price. Applitools is a paid cloud service whose costs increase with screenshot volume. Your interface images leave your infrastructure to be processed on their servers — a sensitive point for some organizations. And you remain dependent on a third-party service: if Applitools is down, so are your visual tests.
The SDK integration also means you still need to write and maintain Selenium tests. You've simplified comparison, but not the creation of capture scenarios. It's better than doing everything yourself, certainly — like having GPS instead of a road map, the route is the same but you get lost less.
The verdict: Selenium is for functional testing
After exploring all three approaches, one conclusion is clear: Selenium is a functional testing tool, and trying to graft visual testing capabilities onto it is like mounting a bike rack on a motorcycle — it holds, but it wasn't the original idea.
Each approach has its merits, but none solves the fundamental problem: visual testing and functional testing are two different disciplines that deserve different tools.
Functional testing verifies behavior. It asks the question "does it do what it's supposed to do?" The DOM, events, data — that's Selenium's territory, and it excels there.
Visual testing verifies appearance. It asks the question "does it look like what it's supposed to look like?" Pixels, layouts, renders — that's territory Selenium visits as a tourist.
Continue using Selenium for your functional tests. It's the right tool. But for visual testing, seriously consider a tool designed from the ground up for that mission.
The dedicated visual testing alternative
Delta-QA exists precisely because visual testing should not be a cobbled-together byproduct of your functional tests. It's a dedicated tool, designed from the very first line for visual regression detection.
Here's what changes with a dedicated approach:
Truly no-code: you don't need Selenium, WebDriver, or any programming skills whatsoever. You point to your pages, Delta-QA does the rest. Your QA analyst, your designer, your product owner — everyone can launch and interpret visual tests.
Smart comparison: Delta-QA doesn't just compare pixels. It understands significant differences and filters technical noise (anti-aliasing, rendering variations, dynamic content). Result: fewer false positives, more real regressions detected.
Local execution: your screenshots stay on your infrastructure. No data sent to a third-party cloud, no dependency on an external service, no bill that swells with volume.
Free with no artificial limits: no "enterprise" tier to unlock essential features, no screenshot counter that forces you to ration your tests.
Complementary to Selenium: Delta-QA does not replace your Selenium functional tests. It complements them by covering the visual dimension that Selenium cannot address natively.
Visual testing is too important to be treated as a last-minute add-on. Your users see the interface before they interact with it. If that interface is visually broken, they won't stay long enough to discover that all your functional tests pass.
FAQ
Can Selenium do visual testing natively?
No. Selenium WebDriver allows you to take screenshots via takeScreenshot(), but it offers no native functionality for image comparison, visual reference management, or regression detection. Everything must be built or integrated with third-party tools.
What is the best library for visual testing with Selenium?
It depends on your ecosystem. In Java, aShot is the most common choice. In Python, needle (or its pytest port, pytest-needle) is an option. For a more complete solution, BackstopJS (which uses Puppeteer/Playwright rather than Selenium) is often preferred. None of these solutions is as integrated as what Playwright offers natively.
Is Applitools worth the cost for visual testing?
Applitools' Visual AI technology is impressive and significantly reduces false positives. For large enterprises with comfortable budgets and high requirements, it's a solid choice. For smaller teams or those concerned about data privacy, the costs and cloud dependency can be deal-breakers. Local, free alternatives like Delta-QA exist.
Should I abandon Selenium for visual testing?
No. Selenium remains excellent for functional testing and should continue to play that role. The idea is not to replace Selenium, but to complement it with a tool dedicated to visual testing. Both disciplines are complementary, and the best results come from using the right tool for each need.
Is visual testing really necessary if you have comprehensive functional tests?
Absolutely. Functional tests verify behavior (does the button work?) but not appearance (is the button visible, properly positioned, the right color?). According to the Web Almanac from HTTP Archive, layout issues represent a significant share of user-reported bugs — bugs that functional tests never detect.
How does Delta-QA compare to the Selenium + Applitools approach?
Delta-QA is no-code (no Selenium or SDK needed), local (no third-party cloud), and free. The Selenium + Applitools approach requires development skills, sends data to the Applitools cloud, and involves recurring costs. Delta-QA is designed for teams that want visual testing accessible to everyone, with no external dependency.
Ready to separate your visual tests from your functional tests?