A flaky test is a quality assurance (QA) test that fails to produce
consistent results: run repeatedly against the same software code with the
same configuration, it will produce both passing and failing results.
Whenever new code is written to develop or update computer software, a web
page or an app, it needs to be tested for quality assurance. Ideally, each time
the code is tested, the results are consistent. The code will either work as
expected and pass the test, or not work as expected and fail the test.
Sometimes, however, QA tests on the exact same code, using the exact same
configurations, will produce inconsistent results. When this happens, the test
is labeled "flaky." Unfortunately, flaky tests are not uncommon --
Google, for example, reports that 16 percent of its tests show some level of
flakiness.
Flaky tests can be caused by various factors, including:
- an issue with the newly written code
- an issue with the test itself
- some external factor that compromises the test results
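To make the external-factor case concrete, here is a minimal sketch of a test made flaky by machine load rather than by the code under test. The function and test names are hypothetical, invented only for illustration:

```python
import time

def process(items):
    # Hypothetical function under test: doubles each item.
    return [x * 2 for x in items]

def test_process_is_fast():
    start = time.perf_counter()
    result = process(range(100_000))
    elapsed = time.perf_counter() - start
    assert result[0] == 0    # deterministic: always holds
    assert elapsed < 0.01    # flaky: depends on how busy the machine is
```

The first assertion exercises the code and always behaves the same way; the second depends on wall-clock timing, an external factor, so the identical test can pass on a quiet machine and fail on a loaded one.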
Once a test is deemed flaky, there are different approaches to dealing
with the muddled results. Some developers will ignore the flakiness entirely,
assuming that the issue is with the test and not with the newly written code.
Others will rerun their test multiple times and only go back to investigate
further if the test fails a certain number of times in a row, indicating to
them a true failure.
However, the safest approach -- the only way to truly find out whether
there is a bug in the code -- is to halt the development of the application,
fully investigate the cause of the flaky test and resolve it. If the
flakiness is left unresolved and there truly is an issue with the code, one
problem can compound into another as more is built on top of the faulty code.
When investigating the cause of a flaky test, the developer will need to
gather data to try to discover differences within the seemingly random results
in order to isolate the cause of the failed tests. The code should be
re-examined, as should the test itself; if no issues are found in either,
external factors will need to be examined to see if they are at the core of
the problem. The developer might look at, for example:
- whether the tests that passed were run at a certain time of day while the
ones that failed were run at a different time
- whether certain programs were running on the developer's computer during
the failed tests but not during the passing ones
- whether the failed tests all failed at the same point in the test or at
different points
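One way to gather that kind of data is to record the context of every test run and then compare the entries for passes against those for failures. A minimal sketch, assuming a plain Python harness (both function names are hypothetical):

```python
import datetime
import json
import platform

def run_context():
    # Facts worth comparing between passing and failing runs: when the
    # test ran and on what machine and Python version.
    return {
        "timestamp": datetime.datetime.now().isoformat(),
        "machine": platform.node(),
        "python": platform.python_version(),
    }

def record_result(test_name, passed, log):
    # Appends one JSON line of evidence per run; diffing the entries
    # for passes against those for failures can expose the pattern
    # behind the flakiness.
    entry = {"test": test_name, "passed": passed, **run_context()}
    log.append(json.dumps(entry))
```

Each line is self-contained, so the log can be filtered by pass/fail status and scanned for whatever differs between the two groups.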
Sometimes, the cause of the flaky test is simple to diagnose and can be
quickly fixed. That's the best-case scenario. Other times, there is no easy
fix, and the developer may need to delete the test and rewrite it from
scratch, a potentially costly and time-consuming step, to ensure the
accuracy of the test results.