Wayground is an EdTech platform for teachers. With 20+ on-demand releases per day and no dedicated QA team, Wayground partnered with Empirical to build an automated testing pipeline that engineers, PMs, and designers all contribute to.
Challenge
Wayground has never had a dedicated QA team. Their philosophy: the people building the software (engineers, product managers, and designers) have the most context and are best positioned to test it. With 20+ on-demand releases per day, that approach demands serious automation.
But as at any fast-moving startup, testing was always the first casualty. Engineers deferred writing tests in favor of shipping features. When the team did invest in automation, maintainability became the bottleneck. Code-based tests required constant upkeep and slowed the very velocity they were meant to protect.
“Just like it's true for any startup, the first casualty is testing. Engineers always say, 'I'll write the tests later.' That became a huge challenge for us.”
Solution
Wayground partnered with Empirical to fundamentally change how their team approaches testing. Instead of writing and maintaining test code, engineers describe test cases as natural-language prompts in a shared spreadsheet. Empirical’s platform turns those prompts into Playwright tests, which a Playwright engineer on Empirical’s team reviews and maintains throughout the life of each test.
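To make the spreadsheet-of-prompts idea concrete, here is a minimal sketch of how such a sheet could be read as structured test cases. Everything in it is an assumption for illustration: the column names (`name`, `author_role`, `prompt`), the `PromptCase` type, the `load_prompts` helper, and the sample rows are hypothetical and do not describe Wayground's actual sheet or Empirical's intake format.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptCase:
    """One spreadsheet row: a natural-language test case (hypothetical schema)."""
    name: str
    author_role: str
    prompt: str


def load_prompts(csv_text: str) -> list[PromptCase]:
    """Parse a CSV export of the sheet, skipping rows whose prompt is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cases = []
    for row in reader:
        prompt = (row.get("prompt") or "").strip()
        if not prompt:
            continue  # an unfinished row has nothing to generate a test from
        cases.append(
            PromptCase(
                name=(row.get("name") or "").strip(),
                author_role=(row.get("author_role") or "").strip(),
                prompt=prompt,
            )
        )
    return cases


# Made-up rows for illustration only; the third row has no prompt yet.
SAMPLE_SHEET = """name,author_role,prompt
create-quiz,engineer,"Log in as a teacher, create a quiz, and confirm it appears in the library."
share-link,designer,"Open a quiz and check that the share dialog shows a copyable link."
draft-row,pm,
"""

if __name__ == "__main__":
    for case in load_prompts(SAMPLE_SHEET):
        print(f"{case.name} ({case.author_role}): {case.prompt}")
```

The point of the sketch is that a row needs no code at all: a name, an author, and a plain description of the behavior are enough to hand off for test generation.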
The rollout happened in three phases:
Reactive Monitoring
Wayground started with scheduled runs against the production site, using their small existing test suite. Even this initial setup caught bugs before users reported them.
Shift to Pre-Production
As confidence grew, Wayground moved Empirical runs into their preview environment. Issues were now caught before any code reached users.
CI-Integrated Quality Gates
Today, every PR triggers an Empirical run. More than 500 tests execute in under 20 minutes, and any failure blocks the merge until the suite is green again.
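The quality-gate rule in the third phase can be sketched in a few lines. This is an illustrative model, not Empirical's implementation or API: `CaseResult`, `gate_decision`, and `exit_code` are hypothetical names, and treating an empty run as blocking is an added assumption (a run that executed nothing should not count as green).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    """Outcome of one test in a PR run (hypothetical shape)."""
    name: str
    passed: bool


def gate_decision(results: list[CaseResult]) -> tuple[bool, list[str]]:
    """Return (allow_merge, failed_names); any single failure blocks the merge."""
    failed = [r.name for r in results if not r.passed]
    # Assumption: an empty run is treated as blocking rather than as a pass.
    allow_merge = bool(results) and not failed
    return allow_merge, failed


def exit_code(results: list[CaseResult]) -> int:
    """Map the decision to a process exit code, which CI systems treat as pass/fail."""
    allow_merge, _ = gate_decision(results)
    return 0 if allow_merge else 1


if __name__ == "__main__":
    run = [CaseResult("login", True), CaseResult("create-quiz", False)]
    allow, failed = gate_decision(run)
    print(allow, failed)  # prints: False ['create-quiz']
```

A CI job would typically exit with the value of `exit_code(...)`, so a nonzero status marks the required check as failed and keeps the merge button disabled until a later run is fully green.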
“We didn't have to write code, we didn't have to maintain the code. We just had to write a prompt and the tests were generated. It was a huge unlock right away.”
Results
Here’s what changed:
Impact
Because test cases are expressed as natural-language descriptions of product use cases, Wayground’s product managers and designers now actively contribute to the test suite. The people with the deepest product context are writing the tests.
Empirical’s FDE team handles test flakiness, infrastructure maintenance, and ongoing optimization, freeing engineering bandwidth that would otherwise be consumed by test operations.
Looking ahead
With the rise of coding agents, Wayground’s engineers are developing and deploying more code than ever. The volume of changes has made automated quality gates not just valuable but essential. Empirical’s test suite serves as the guardrail ensuring increased velocity doesn’t come at the cost of the reliability teachers and students depend on.
“With coding agents, the amount our engineers can develop and deploy has gone up so much. These tests are the biggest guardrails for us in ensuring we deliver on our promise to teachers and students.”
Advice for teams considering Empirical
We asked Sandeep what he’d tell other engineering teams evaluating Empirical. His take was clear: start now, start small, and let the results build your conviction.
“The first few runs are going to be challenging. There's going to be flakiness. But once you get through those, the ceiling just keeps moving. The value unlock has been pretty amazing.”