Welcome to Issue #1!
Big Playwright release this month, some genuinely good migration war stories, and the AI testing space is finally separating into things that work vs things that looked good in the demo.
Glad you’re here.
Build reliable tests. Ship with confidence. 🙂
NEWS
Playwright 1.60 Is Out and It’s a Solid Release
locator.drop() finally makes drag-and-drop clean across browsers. tracing.startHar() turns HAR recording into a proper tracing API. ARIA snapshots now carry bounding boxes for AI agents. And test.abort() lets you kill a test from inside a fixture or route handler, which is genuinely useful for guardrail enforcement. Heads-up: Currents had a brief compatibility break on 1.60. It’s patched now, but check before upgrading.
State of the Playwright AI Ecosystem in 2026
I found this one worth bookmarking. Currents put together a clear map of where things actually stand: the MCP server; the planner, generator, healer agent loop; artifact-first debugging. Cuts through a lot of vendor noise. It also flags something important: full MCP runs burn ~114K tokens per task. The new @playwright/cli brings that to ~27K, about a quarter of the cost.
Playwright at 34M Weekly Downloads: the Gap Is Structural Now
Selenium-webdriver sits at ~2.1M on npm. ThoughtWorks moved Playwright to “Adopt.” Adobe, ING, NASA, all documented migrations. What I find more interesting than the download numbers: teams are now sharing what actually breaks during migration, not just why to bother.
AUTOMATION
Stop Hardcoding Your Shard Count in GitHub Actions
A setup job counts tests, computes the shard count, and outputs JSON that downstream jobs consume. No more “we added 80 tests and nobody updated the matrix.” One thing to remember: unless fullyParallel is on, sharding splits by file, not by individual test, so uneven file sizes will eat your speedup.
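Here is a minimal sketch of the "compute shards" step, assuming the setup job already has a test count (for example, parsed from `npx playwright test --list`). The helper names, the 20-tests-per-shard target, and the cap of 8 shards are all hypothetical, so tune them to your runners.

```typescript
// Hypothetical helper for a CI "setup" job: turn a test count into a shard matrix.
// testsPerShard and maxShards are illustrative tuning knobs, not Playwright options.
interface ShardMatrix {
  shardIndex: number[]; // 1-based, matching Playwright's --shard=<index>/<total>
  shardTotal: number[]; // single-element array so the matrix stays a simple cross product
}

function computeShardCount(testCount: number, testsPerShard: number, maxShards: number): number {
  if (testsPerShard <= 0 || maxShards <= 0) {
    throw new Error("testsPerShard and maxShards must be positive");
  }
  const wanted = Math.ceil(testCount / testsPerShard);
  // Always run at least one shard, and never exceed the runner budget.
  return Math.min(Math.max(wanted, 1), maxShards);
}

function buildShardMatrix(testCount: number, testsPerShard = 20, maxShards = 8): ShardMatrix {
  const total = computeShardCount(testCount, testsPerShard, maxShards);
  const shardIndex = Array.from({ length: total }, (_, i) => i + 1);
  return { shardIndex, shardTotal: [total] };
}

// The setup job would print this and expose it as a job output.
const matrixJson = JSON.stringify(buildShardMatrix(80));
console.log(matrixJson);
```

Downstream jobs could read that output with fromJSON in their matrix and pass --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }} to Playwright. The count only balances shards well if test files are roughly the same size.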
Enterprise Selenium Migration: It’s Not a Rewrite, It’s an Operating Model Change
Teams fail not because Playwright is weak but because they swap frameworks without changing how QA and eng collaborate or how CI feedback flows. And writing “Selenium-style Playwright” with explicit waits and PageFactory patterns leaves you with a worse suite than what you started with. I’ve seen this happen more than once.
Fix Flaky Tests Without Leaving Your Editor
CircleCI’s MCP server has a find_flaky_tests tool that pulls historical CI failure patterns into your AI assistant. Run a few builds, ask what’s flaky and why, get fix suggestions in context. Less tab-switching, faster diagnosis.
TOOLS & GITHUB
microsoft/playwright: Official MCP Server Now in the Main Repo
40+ browser tools via accessibility tree snapshots. LLMs read ~200 tokens per page state instead of processing screenshots. Works with VS Code Copilot, Cursor, Claude Desktop, Claude Code. One config block and it’s running.
playwright-mcp-demo: Generate, Run, and Self-Heal via AI
Kudos to Jay Yeluru for putting this together. Copilot (or Claude) inspects a live DOM through MCP, writes tests from scratch, self-heals broken locators. Comes with POM, typed test data, and a working GitHub Actions pipeline. Good scaffold even if you swap the AI.
playwright-wizard-mcp: Five-Step AI Scaffold for VS Code Copilot
Analyze → plan → configure → model → implement, chained as five MCP tools. The decomposition pattern is the useful part, cleaner than prompt-and-paste for anything beyond a single test file.
COMMUNITY INSIGHT
Flaky Tests: The Infrastructure Problem Teams Keep Blaming on Code
A recurring theme this month: test infra gets treated as second-class. Per EdgeDelta’s data, reproducible environments plus monitoring cut false failures by roughly 20%. GitLab’s public handbook has an actual triage model worth borrowing: it auto-classifies failures as Flaky, Master-broken, or Unclear and routes them to EMs automatically.
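To make the three-bucket idea concrete, here is a rough sketch of what such a classifier could look like. Only the three labels come from the handbook. The rules and field names below are my assumptions for illustration, not GitLab’s actual logic.

```typescript
// Hypothetical triage sketch. The three labels come from the text above;
// the rules and field names are assumptions, not GitLab's actual logic.
type Triage = "Flaky" | "Master-broken" | "Unclear";

interface FailureRecord {
  testId: string;
  passedOnRetry: boolean;          // failed first, then passed in the same pipeline
  failingOnDefaultBranch: boolean; // same test is currently red on the default branch
}

function classifyFailure(f: FailureRecord): Triage {
  // A failure that also reproduces on the default branch is not this change's fault.
  if (f.failingOnDefaultBranch) return "Master-broken";
  // Passing on retry without a code change is the usual flaky signature.
  if (f.passedOnRetry) return "Flaky";
  // Everything else needs a human to look.
  return "Unclear";
}

function groupByTriage(failures: FailureRecord[]): Record<Triage, string[]> {
  const groups: Record<Triage, string[]> = { "Flaky": [], "Master-broken": [], "Unclear": [] };
  for (const f of failures) groups[classifyFailure(f)].push(f.testId);
  return groups;
}
```

Checking the default branch first is deliberate: a test that is red everywhere should not be written off as flaky just because one retry happened to pass.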
Top 8 Automation Trends for 2026, Joe Colantonio
Built on 40K+ survey responses and 50+ interviews. Better signal than most year-preview stuff, especially the part on where AI testing ROI is real vs still aspirational. Worth your time if you’re making H2 tooling decisions.
18+ Years in QA: The Mindset Shift That Actually Made Me Better
Started this discussion on r/QualityAssurance this week after a conversation with a connection growing her QA career. The core point: testing features is not the same as testing products. Replies from mid-level testers reflecting on their own turning points are worth reading. If you’ve had a similar moment, drop it in the thread.
→ Join the discussion on Reddit
PRACTICAL TIP
Two Lines That Turn Flaky Detection Into a Hard Gate
Add failOnFlakyTests: !!process.env.CI to playwright.config.ts. Playwright marks a test flaky when it passes on retry after an initial failure. With this flag on, a flaky test fails the build instead of letting it quietly go green. Pair it with retries: 2 on CI, since without retries there is nothing to detect. Shipped in v1.52. Almost nobody has turned it on, which I still find surprising.
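In context, the two lines look roughly like this. It’s a minimal sketch: your real config will have projects, reporters, and so on, and the zero-retries-locally choice is my assumption, not a requirement.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retries are what let Playwright label a test "flaky" in the first place.
  retries: process.env.CI ? 2 : 0,
  // On CI, any test that only passed on retry fails the whole run.
  failOnFlakyTests: !!process.env.CI,
});
```

Keying both settings off CI keeps local runs fast and forgiving while the pipeline stays strict.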
VIDEOS
Quality Engineering in the Age of AI, TestGuild IRL Miami (Mar 2026)
Live panel, practitioners not vendors. The part on where AI maintenance ROI is actually showing up vs where it’s still a demo is worth it alone. Skip to ~18 min past the opener.
Mobile Test Automation Is Broken: What Actually Fixes It
Good counterpoint to the AI-fixes-everything wave. Aditya Challa explains why LLM-generated tests fail on mobile specifically: deterministic reliability issues that most AI tooling wasn’t built to handle.
And Finally…
Spent time this week reviewing a team’s “AI-generated” test suite. Half the tests had hardcoded waits and no assertions worth trusting. AI is only as good as the engineer reviewing its output. We’re not at the “just ship it” stage yet. Maybe we never will be. 😅