Flaky test


A flaky test is a quality assurance (QA) test that fails to produce consistent results. Typically, a flaky test using the same software code and the same configuration will produce both a passing result and a failing result.

Whenever new code is written to develop or update computer software, a web page or an app, it needs to be tested for quality assurance. Ideally, each time the code is tested, the results are consistent. The code will either work as expected and pass the test, or not work as expected and fail the test. Sometimes, however, QA tests on the exact same code, using the exact same configurations, will produce inconsistent results. When this happens, the test is labeled "flaky." Unfortunately, flaky tests are not uncommon -- Google, for example, reports that 16 percent of its tests show some level of flakiness.

Flaky tests can be caused by various factors, including:
  • an issue with the newly written code
  • an issue with the test itself
  • some external factor that compromises the test results
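To illustrate the second cause in the list above, here is a minimal Python sketch of a test that is flaky because of the test itself. Everything in it is hypothetical and invented for illustration: `pick_discount` stands in for the code under test, and `flaky_check` and `stable_check` stand in for two ways of testing it. The code never changes between runs, yet one check both passes and fails.

```python
import random


def pick_discount(rng: random.Random) -> int:
    """Hypothetical code under test: a discount from 0 to 30 percent, inclusive."""
    return rng.randint(0, 30)


def flaky_check(rng: random.Random) -> bool:
    # Issue with the test itself: it assumes an upper bound of 25 that the
    # code never promised, so the outcome depends on the random draw.
    return pick_discount(rng) <= 25


def stable_check(rng: random.Random) -> bool:
    # Asserts only the documented range, so it passes on every draw.
    return 0 <= pick_discount(rng) <= 30


def pass_rate(check, runs: int = 1000, seed: int = 0) -> float:
    """Run a check repeatedly against unchanged code and return the share of passes."""
    rng = random.Random(seed)
    return sum(check(rng) for _ in range(runs)) / runs
```

Calling `pass_rate(flaky_check)` returns a value strictly between 0 and 1, while `pass_rate(stable_check)` returns 1.0. In this sketch the fix belongs in the test, not in the code it exercises.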

Once a test is deemed flaky, there are different approaches to dealing with the muddled results. Some developers ignore the flakiness entirely, assuming that the issue is with the test and not with the newly written code. Others rerun the test multiple times and investigate further only if it fails a certain number of times in a row, which they take to indicate a true failure.
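The rerun approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `needs_investigation`, the threshold of three consecutive failures and the cap of ten runs are all hypothetical choices, not values taken from any particular team or tool.

```python
from typing import Callable


def needs_investigation(
    test: Callable[[], bool],
    consecutive_failures: int = 3,  # assumed threshold, for illustration only
    max_runs: int = 10,             # assumed cap on reruns, for illustration only
) -> bool:
    """Rerun `test` and report whether it failed enough times in a row.

    `test` is any callable that returns True on a pass and False on a failure.
    """
    streak = 0
    for _ in range(max_runs):
        if test():
            streak = 0  # a pass resets the failure streak
        else:
            streak += 1
            if streak >= consecutive_failures:
                return True
    return False
```

A policy like this treats any failure streak shorter than the threshold as noise, so a real bug that fails only intermittently can slip through it.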

However, the safest approach -- the only way to truly find out whether there is a bug in the code -- is to halt the development of the application, fully investigate the cause of the flaky test and resolve it. If the flakiness is left unresolved and there truly is an issue with the code, one problem can lead to another as more is built on top of the faulty code.

When investigating the cause of a flaky test, the developer needs to gather data and look for differences within the seemingly random results in order to isolate the cause of the failures. The code should be re-examined, as should the test itself. If no issues are found in either, external factors need to be examined to see whether they are at the core of the problem. For example, the developer might look at:
  • whether the tests that passed were run at one time of day and the tests that failed at another
  • whether certain programs were running on the developer's computer during the failed tests that were not running when the tests passed
  • whether the failed tests failed at the same point in the test or at different points
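One way to look for such differences is to record a few facts about every run and then compare failure rates across each recorded factor. The Python sketch below assumes a made-up record format: the keys `passed`, `time_of_day` and `backup_running`, and the sample runs themselves, are hypothetical and exist only to show the comparison.

```python
from collections import defaultdict


def failure_rate_by(runs, factor):
    """Group run records by one factor and return {factor value: failure rate}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for run in runs:
        value = run[factor]
        totals[value] += 1
        if not run["passed"]:
            failures[value] += 1
    return {value: failures[value] / totals[value] for value in totals}


# Hypothetical run log, invented for illustration.
runs = [
    {"passed": True, "time_of_day": "morning", "backup_running": False},
    {"passed": False, "time_of_day": "morning", "backup_running": True},
    {"passed": True, "time_of_day": "evening", "backup_running": False},
    {"passed": False, "time_of_day": "evening", "backup_running": True},
]

for factor in ("time_of_day", "backup_running"):
    print(factor, failure_rate_by(runs, factor))
```

In this invented data, the failure rate is the same for both times of day but differs completely depending on whether the hypothetical backup program was running, which would point the investigation toward that program. A correlation like this is a lead to follow up, not proof of the cause.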

Sometimes, the cause of the flaky test is simple to diagnose and can be quickly fixed. That's the best-case scenario. Other times, there is no easy fix, and though potentially costly and time-consuming, the developer may need to delete the test and rewrite it from scratch in order to ensure the accuracy of the test results.
