Why Your CI Test Suite Keeps Getting Slower
Is your CI test suite getting slower? Dennis Martinez explains why this happens, which metrics to track and how to fix the worst offenders before execution time increases further.
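As a rough illustration of what finding the worst offenders can look like, here is a minimal Python sketch. It is not taken from the article: the `RUNS` data, the test names and the `slowest_tests` helper are all hypothetical, and in practice the durations would come from your CI reports.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (test name, duration in seconds), one per test per CI run.
# Real data would be read from your CI's test reports.
RUNS = [
    ("test_login", 1.2), ("test_checkout", 14.0), ("test_search", 3.1),
    ("test_login", 1.4), ("test_checkout", 16.0), ("test_search", 2.9),
]


def slowest_tests(records, top_n=2):
    """Return the top_n tests by mean duration, slowest first."""
    durations = defaultdict(list)
    for name, seconds in records:
        durations[name].append(seconds)
    # Average each test across runs so a single noisy run does not dominate.
    averages = {name: mean(values) for name, values in durations.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    for name, avg in slowest_tests(RUNS):
        print(f"{name}: {avg:.1f}s on average")
```

Averaging over several runs is only one possible choice; tracking a percentile or the trend over time may suit a flaky suite better.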
Playprom: Turn your Playwright test run into a time-series data
Want to see your Playwright test results over time? Lutfi Fitroh Hadi built Playprom — a reporter that emits test runs as metrics into Prometheus and Grafana.
Similarly, Vitali Haradkou created a handy solution for Playwright test reports that actually look good in email.
Testing LLM Outputs: A Hands-On Guide to DeepEval Metrics
Serhii Smetanskyi shares what each DeepEval metric actually does, what surprised him, and what he wishes someone had told him before starting. Great reference if you're testing LLM features.
Similarly, Katja Obring points out that The Hard Part of AI Evals Isn't the Tooling.
The Death of Determinism: How AI Forces Us to Rethink Testing
Padget Avery explains how testing AI differs from the traditional approach: pipelines have to move from strict assertions to thresholds, metrics, and human review.
Moreover, Katja Obring shares thoughts on What I don't understand about AI evals (yet) and how The AI evals field chose a flawed tool and stuck with it.
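The move from strict assertions to thresholds mentioned above can be sketched in a few lines of Python. This is an illustrative example, not code from any of the linked articles: the scores are made-up placeholders for whatever evaluation metric you use, and `pass_rate`, `check_quality` and both threshold values are assumptions.

```python
def pass_rate(scores, min_score):
    """Fraction of scores that reach min_score."""
    if not scores:
        raise ValueError("scores must not be empty")
    passed = sum(1 for score in scores if score >= min_score)
    return passed / len(scores)


def check_quality(scores, min_score=0.7, min_pass_rate=0.8):
    """Threshold-style check: tolerate some weak outputs, fail on too many."""
    rate = pass_rate(scores, min_score)
    assert rate >= min_pass_rate, (
        f"only {rate:.0%} of outputs scored >= {min_score}, "
        f"expected at least {min_pass_rate:.0%}"
    )
    return rate


if __name__ == "__main__":
    # Hypothetical scores in [0, 1] for five runs of the same prompt.
    sample_scores = [0.9, 0.75, 0.8, 0.6, 0.95]
    # A strict assertion would demand every output be identical or perfect;
    # here one weak output out of five is acceptable.
    print(f"pass rate: {check_quality(sample_scores):.0%}")
```

Borderline runs, those that only just clear the pass rate, are natural candidates for the human review step.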
AI and Testing: Using Local Models for Testing
Jeff Nyman continues his detailed series of articles on AI in testing and shows how to set up and run a model on your machine to create automated tests with evaluation metrics. You can also read a follow-up article about Using Model Pipelines for Testing.
Build a Test Metrics Dashboard with Elasticsearch and Kibana
If you're looking for a way to track your automated test results over time, Oleksii Shamrai wrote a detailed guide to setting up a dashboard in Kibana.
Keeping tests valuable: Are the code coverage metrics reliable?
"When a metric becomes a goal, it stops being a good metric." — Goodhart's law.
Rafael Miguel explains how to use metrics such as code coverage the right way.
This is a great, thought-provoking article by Vernon Richards about why DORA metrics might not be perfect and an alternative approach to metrics.
Furthermore, Veronika Moran shares how they implemented QA metrics from scratch and Abhishek Verma gives practical advice on How I Use Data Analytics to Find Gaps in Test Coverage.
Beyond MTTR: 7 incident metrics that matter and 3 that don't
Have you heard of the DORA metrics? In this article, Ashley Sawatsky goes a step further, describing several other metrics and explaining why to use them.
What's more, Rahul Parwal wrote about Testing & Quality Metrics — Pitfalls to Avoid.
How to apply DORA metrics for mobile development
The famous DORA metrics are now used by many engineering teams. Roldán Galán explains how they can work for mobile development, too.
Furthermore, Venkat Ramakrishnan shares some interesting insights from the report DORA Metrics 2022: Reliability And Delivery.