
Testing


Tests should be coupled to the behavior of code and decoupled from the structure of code. (Kent Beck)

Program testing can be used to show the presence of bugs, but never to show their absence! (Edsger Dijkstra)

  • Testing is the nature of scientific theories (vs. mathematical) - it's falsifiable but not provable.

Programmers don’t get paid to write tests, they get paid to write code that works. (Kent Beck)


Properties of a valuable test

  • Isolated — tests should return the same results regardless of the order in which they are run.
  • Composable — if tests are isolated, then I can run 1 or 10 or 100 or 1,000,000 and get the same results.
  • Fast — tests should run quickly.
  • Inspiring — passing the tests should inspire confidence.
  • Writable — tests should be cheap to write relative to the cost of the code being tested.
  • Readable — tests should be comprehensible to the reader, conveying the motivation for writing this particular test.
  • Behavioral — tests should be sensitive to changes in the behavior of the code under test. If the behavior changes, the test result should change.
  • Structure-insensitive — tests should not change their result if the structure of the code changes.
  • Automated — tests should run without human intervention.
  • Specific — if a test fails, the cause of the failure should be obvious.
  • Deterministic — if nothing changes, the test result shouldn’t change.
  • Predictive — if the tests all pass, then the code under test should be suitable for production.

If you have a robust testing practice, you needn't fear change - you can embrace it as an essential quality of developing software.

  • Test only APIs; never couple the test with the implementation.
  • API defines the unit for a unit test.
  • Tests are very detailed and concrete; and they always depend inward towards the code being tested.
  • You can think of tests as the outermost circle in the architecture.
  • Nothing within the system depends on the tests, and the tests always depend inward on the components of the system.
  • Testing is a cross-functional activity that involves the whole team and should be done continuously from the beginning of the project.
  • A good automated test suite should give you the confidence necessary to perform refactorings and even rearchitecting of your application knowing that if the tests pass, your application’s behavior really hasn’t been affected.
  • Unit tests, component tests, and deployment tests are written and maintained exclusively by developers.
  • Test the code that you change.
  • The ideal test should be atomic. Having atomic tests means that the order in which they execute does not matter, eliminating a major cause of hard-to-track bugs.
  • The unit of isolation is the test: tests mustn't impact each other and must be able to run in isolation. Don't mock components from the rest of the system (except very expensive resources).
  • Organize tests so that each test’s data is only visible to that test.
  • Readability matters. Duplication is okay if it improves readability.
  • A good test should stay inside itself, no references or trips into the environment.
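
A minimal JUnit 5 sketch of these points (the ShoppingCart class is made up and defined inline to keep it self-contained): each test builds its own fixture in @BeforeEach, so no test's data is visible to another and the execution order doesn't matter.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.List;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

class ShoppingCartTest {

    // Tiny class under test, defined inline just to keep the sketch self-contained.
    static class ShoppingCart {
        private final List<String> items = new ArrayList<>();
        void addItem(String name) { items.add(name); }
        int itemCount() { return items.size(); }
    }

    private ShoppingCart cart;

    // A fresh fixture per test: no test can see or depend on another test's data,
    // so the tests are isolated, atomic, and can run in any order.
    @BeforeEach
    void createFreshFixture() {
        cart = new ShoppingCart();
    }

    @Test
    void newCartIsEmpty() {
        assertEquals(0, cart.itemCount());
    }

    @Test
    void addingAnItemIncreasesTheCount() {
        cart.addItem("book");
        assertEquals(1, cart.itemCount());
    }
}
```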

TDD rules imply an order to the tasks of programming:

  1. Red — write a little test that doesn’t work, perhaps doesn’t even compile at first
  2. Green — make the test work quickly, committing whatever sins necessary in the process
  3. Refactor — eliminate all the duplication created in just getting the test to work

The more stress you feel, the less testing you will do. The less testing you do, the more errors you will make. The more errors you make, the more stress you feel. Rinse and repeat.

  • Testing is the first form of reuse.

Refactoring (without automated tools) is likely to result in errors, errors that you won’t catch because you don’t have the tests -- write tests for the whole thing and refactor the whole thing.

  • A test's name should summarize the behavior it is testing.
  • A good name describes both the actions that are being taken on a system and the expected outcome.
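
For example, in a hypothetical banking domain, a name that states both the action taken on the system and the expected outcome reads like a sentence:

```java
import org.junit.jupiter.api.Test;

class AccountTest {

    // Vague: says nothing about the action or the expected outcome.
    @Test
    void testWithdraw() { /* ... */ }

    // Descriptive: the action on the system plus the expected outcome.
    @Test
    void withdrawalIsRejectedWhenBalanceIsInsufficient() { /* ... */ }
}
```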

General TDD Cycle

  1. Write a test. Think about how you would like the operation in your mind to appear in your code. You are writing a story. Invent the interface you wish you had. Include all the elements in the story that you imagine will be necessary to calculate the right answers.
  2. Make it run. Quickly getting that bar green dominates everything else. If a clean, simple solution is obvious, type it in. If the clean, simple solution is obvious but it will take you a minute, make a note of it and get back to the main problem, which is getting the bar green in seconds. This shift in aesthetics is hard for some experienced software engineers. They only know how to follow the rules of good engineering. Quick green excuses all sins. But only for a moment.
  3. Make it right. Now that the system is behaving, put the sinful ways of the recent past behind you. Step back onto the straight and narrow path of software righteousness. Remove the duplication that you have introduced to get to quick green.
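
A minimal sketch of one pass through this cycle, using a made-up Money example (Java 16+ assumed for records): the test is written first against the interface we wish we had, the simplest implementation makes the bar green, and the refactoring step removes any duplication introduced along the way.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class MoneyTest {

    // Step 1 (red): write the story first, inventing the interface we wish we had.
    @Test
    void multiplyingFiveDollarsByTwoGivesTenDollars() {
        Money five = Money.dollars(5);
        assertEquals(Money.dollars(10), five.times(2));
    }

    // Step 2 (green): the quickest implementation that makes the bar green.
    // Step 3 (refactor): remove the duplication introduced to get there.
    record Money(int amount) {
        static Money dollars(int amount) { return new Money(amount); }
        Money times(int multiplier) { return new Money(amount * multiplier); }
    }
}
```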

Agile Testing Principles

  • Provide Continuous Feedback
  • Deliver Value to the Customer
  • Enable Face-to-Face Communication
  • Have Courage
  • Keep It Simple
  • Practice Continuous Improvement
  • Respond to Change
  • Self-Organize
  • Focus on People
  • Enjoy

Whole-Team Approach

One of the biggest differences in agile development versus traditional development is the agile “whole-team” approach. With agile, it’s not only the testers or a quality assurance team who feel responsible for quality. We don’t think of “departments,” we just think of the skills and resources we need to deliver the best possible product. The focus of agile development is producing high-quality software in a time frame that maximizes its value to the business. This is the job of the whole team, not just testers or designated quality assurance professionals.

  • the whole team thinks constantly about designing code for testability

The whole-team approach involves constant collaboration. Testers collaborate with programmers, the customer team, and other team specialists—and not just for testing tasks, but other tasks related to testing, such as building infrastructure and designing for testability.

The Art of Software Testing

  • Testing is the process of executing a program with the intent of finding errors.
  • Don't test a program to show that it works; rather, start with the assumption that the program contains errors and test the program to find as many errors as possible.
  • Testing is a destructive, even sadistic, process, which explains why most people find it difficult.
  • The concept of a program without errors is basically unrealistic.
  • It is impractical, often impossible, to find all the errors in a program.
  • A test case that finds a new error can hardly be considered unsuccessful; rather, it has proven to be a valuable investment.
  • An error is clearly present if a program does not do what it is supposed to do; but errors are also present if a program does what it is not supposed to do.

Black-Box Testing

Since a program is a black box, the criterion is exhaustive input testing, making use of every possible input condition as a test case.

  • Exhaustive testing is impossible.

Even though black-box testing is preferable when writing tests, you can still use the white-box method when analyzing the tests.

White-Box Testing

The ultimate white-box test is the execution of every path in the program, but complete path testing is not a realistic goal for a program with loops.

  • First, an exhaustive path test in no way guarantees that a program matches its specification.
  • Second, a program could be incorrect because of missing paths.
  • Third, an exhaustive path test might not uncover data-sensitivity errors.

In conclusion, although exhaustive input testing is superior to exhaustive path testing, neither proves to be useful because both are infeasible. Perhaps, then, there are ways of combining elements of black-box and white-box testing to derive a reasonable, but not airtight, testing strategy.

You can develop a reasonably rigorous test by using certain black-box-oriented test-case design methodologies and then supplementing these test cases by examining the logic of the program, using white-box methods.

Test Pyramid

  • If a higher-level test spots an error and there's no lower-level test failing, you need to write a lower-level test.
  • Push your tests as far down the test pyramid as you can.
  • Delete high-level tests that are already covered on a lower level (given they don't provide extra value).
  • Replace higher-level tests with lower-level tests if possible.

If all your application does is basic CRUD operations with very few business rules or any other complexity, your test "pyramid" will most likely look like a rectangle with an equal number of unit and integration tests and no end-to-end tests.

Unit Testing

A unit test exercises the smallest piece of testable software in the application to determine whether it behaves as expected.

A test should tell a story about the problem your code helps to solve, and this story should be cohesive and meaningful to a non-programmer.

  • Unit tests shouldn't verify units of code. Rather, they should verify units of behavior: something meaningful for the problem domain.
  • The number of classes needed to implement such a unit of behavior is irrelevant. It could span multiple classes, only one class, or even take up just a tiny method.
  • Aiming at finer code granularity isn't helpful. As long as the test checks a single unit of behavior, it's a good test. Targeting something less than that can in fact damage your unit tests, as it becomes harder to understand exactly what these tests verify (see the sketch below).
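
As a sketch of a unit of behavior (all names are hypothetical and the classes are inlined to keep it self-contained), the test below verifies one meaningful behavior, "placing an order reserves stock", even though it happens to exercise two classes:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.Map;

import org.junit.jupiter.api.Test;

class OrderPlacementTest {

    // Two tiny collaborating classes, defined inline to keep the sketch self-contained.
    static class Inventory {
        private final Map<String, Integer> stock = new HashMap<>();
        void add(String sku, int quantity) { stock.merge(sku, quantity, Integer::sum); }
        void reserve(String sku, int quantity) { stock.merge(sku, -quantity, Integer::sum); }
        int available(String sku) { return stock.getOrDefault(sku, 0); }
    }

    static class Order {
        void place(String sku, int quantity, Inventory inventory) {
            inventory.reserve(sku, quantity);
        }
    }

    // One unit of behavior ("placing an order reserves stock"), spanning two classes.
    // There is no separate per-class test for this behavior.
    @Test
    void placingAnOrderReservesStock() {
        Inventory inventory = new Inventory();
        inventory.add("book-42", 5);

        new Order().place("book-42", 2, inventory);

        assertEquals(3, inventory.available("book-42"));
    }
}
```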


  • Unit testing alone doesn't provide guarantees about the behavior of the system.
  • Avoid asynchrony in unit tests - Asynchronous behaviors within the scope of a single test case make systems difficult to test.
  • Unit tests shouldn't be tied to your implementation too closely.
  • Private methods should generally be considered an implementation detail. That's why you shouldn't even have the urge to test them.
  • Unit tests are less useful in a setting without business complexity - they quickly descend into trivial tests. At the same time, integration tests retain their value - it's still important to verify how code works in integration with other subsystems.
  • Don't test trivial code.
  • Test only concrete classes; don’t test abstract classes directly.
    • Abstract classes are implementation details.

There are three types of unit testing:

  1. Output-based
  2. State-based
  3. Communication-based
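
A compact sketch of the three styles; the classes are made up and inlined, and Mockito is assumed for the communication-based example:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import java.util.ArrayList;
import java.util.List;

import org.junit.jupiter.api.Test;

class UnitTestStylesTest {

    // Minimal collaborators, defined inline to keep the sketch self-contained.
    static class PriceCalculator {
        int total(int unitPrice, int quantity) { return unitPrice * quantity; }
    }

    static class Basket {
        private final List<String> items = new ArrayList<>();
        void add(String item) { items.add(item); }
        int size() { return items.size(); }
    }

    interface EmailGateway {
        void sendReceipt(String address);
    }

    static class Checkout {
        void complete(String address, EmailGateway gateway) { gateway.sendReceipt(address); }
    }

    // 1. Output-based: feed input in, assert only on the returned value.
    @Test
    void outputBased() {
        assertEquals(30, new PriceCalculator().total(10, 3));
    }

    // 2. State-based: exercise the SUT, then assert on its resulting state.
    @Test
    void stateBased() {
        Basket basket = new Basket();
        basket.add("book");
        assertEquals(1, basket.size());
    }

    // 3. Communication-based: assert on the interaction with a collaborator (Mockito assumed).
    @Test
    void communicationBased() {
        EmailGateway gateway = mock(EmailGateway.class);
        new Checkout().complete("user@example.com", gateway);
        verify(gateway).sendReceipt("user@example.com");
    }
}
```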

Integration Testing

An integration test verifies the communication paths and interactions between components to detect interface defects.


  • Gateway integration tests (protocol level errors)

  • Persistence integration tests (assurance between the schema and the code)

  • Isolate access to the external system.

  • Have a configuration setting in your application that makes it talk to a simulated version of the external system.

  • If there's no way to run a third-party service locally you should opt for running a dedicated test instance and point at this test instance when running your integration tests.

  • Avoid integrating with the real production system in your automated tests.

  • Cover the longest happy path and any edge cases that can't be exercised by unit tests. The longest happy path is the one going through all out-of-process dependencies.

Communication with managed dependencies is an implementation detail; use such dependencies as is in integration tests. Communication with unmanaged dependencies are part of your system's observable behavior; mock such dependencies out.
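
A sketch of that guideline with hypothetical names: the managed dependency (the application's own repository, represented here by an in-memory stand-in purely to keep the sketch self-contained; a real integration test would use the actual database) is used as is and verified through state, while the unmanaged dependency (a message bus other systems can observe) is mocked and verified through its interaction (Mockito assumed).

```java
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import java.util.HashSet;
import java.util.Set;

import org.junit.jupiter.api.Test;

class PlaceOrderIntegrationTest {

    // Unmanaged dependency: other systems observe its messages, so the
    // communication is part of observable behavior and gets mocked.
    interface MessageBus {
        void publish(String event);
    }

    // Managed dependency: only our application talks to it. In a real
    // integration test this would be the actual database; an in-memory
    // stand-in is used here only to keep the sketch self-contained.
    static class OrderRepository {
        private final Set<String> orders = new HashSet<>();
        void save(String orderId) { orders.add(orderId); }
        boolean exists(String orderId) { return orders.contains(orderId); }
    }

    static class PlaceOrderService {
        void placeOrder(String orderId, OrderRepository repository, MessageBus bus) {
            repository.save(orderId);
            bus.publish("OrderPlaced:" + orderId);
        }
    }

    @Test
    void placingAnOrderPersistsItAndNotifiesOtherSystems() {
        OrderRepository repository = new OrderRepository();
        MessageBus bus = mock(MessageBus.class);

        new PlaceOrderService().placeOrder("order-1", repository, bus);

        // Managed dependency verified through its state (used as is).
        assertTrue(repository.exists("order-1"));
        // Unmanaged dependency verified through the mocked interaction.
        verify(bus).publish("OrderPlaced:order-1");
    }
}
```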

Component Testing

A component test limits the scope of the exercised software to a portion of the system under test, manipulating the system through internal code interfaces and using test doubles to isolate the code under test from other components.


Contract Testing

An integration contract test is a test at the boundary of an external service verifying that it meets the contract expected by a consuming service.


  • The sum of all consumer contract tests defines the overall service contract.

End-to-End Testing (GUI Tests)

An end-to-end test verifies that a system meets external requirements and achieves its goals, testing the entire system, from end to end.


  • For many systems, some form of manual testing is desirable before release, even when you have a comprehensive set of automated tests.
  • Writing and maintaining end-to-end tests can be very difficult.
  • Write as few end-to-end tests as possible.
  • Rely on infrastructure-as-code for repeatability.
  • Make tests data-independent.
  • Testing your user interface doesn't have to be done in an end-to-end fashion.

Don't depend on volatile things. GUIs are volatile. Therefore, test suites that operate the system through the GUI must be fragile.
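
One way to follow the advice above about not testing the user interface end to end (all names hypothetical): keep the GUI a thin, humble layer behind a view interface and test the presentation logic directly, without any browser or widget toolkit involved.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class GreetingPresenterTest {

    // The real GUI implements this interface; the test plugs in a trivial
    // stand-in instead, so the volatile GUI layer stays out of the test.
    interface GreetingView {
        void showGreeting(String text);
    }

    static class GreetingPresenter {
        private final GreetingView view;
        GreetingPresenter(GreetingView view) { this.view = view; }
        void userArrived(String name) { view.showGreeting("Hello, " + name + "!"); }
    }

    @Test
    void greetsTheUserByName() {
        StringBuilder rendered = new StringBuilder();
        GreetingPresenter presenter = new GreetingPresenter(rendered::append);

        presenter.userArrived("Ada");

        assertEquals("Hello, Ada!", rendered.toString());
    }
}
```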

Acceptance Testing (given-when-then)

Acceptance tests are executable specifications of the behavior of the software being developed. They should verify that your application delivers value to its users.

For each story or requirement there is a single canonical path through the application in terms of the actions that the user will perform. This is known as the happy path. This is often expressed using the form Given [a few important characteristics of the state of the system when testing begins], when [the user performs some set of actions], then [a few important characteristics of the new state of the system] will result.
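
A sketch of that given/when/then shape in plain JUnit, with a made-up account domain; in practice the steps are often written in a business-readable tool, but the structure is the same.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class AccountWithdrawalAcceptanceTest {

    // Minimal domain object, inlined so the sketch is self-contained.
    static class Account {
        private int balance;
        Account(int openingBalance) { this.balance = openingBalance; }
        void withdraw(int amount) { if (amount <= balance) balance -= amount; }
        int balance() { return balance; }
    }

    @Test
    void customerWithdrawsCashWithinTheirBalance() {
        // Given an account with a balance of 100
        Account account = new Account(100);

        // When the customer withdraws 30
        account.withdraw(30);

        // Then the remaining balance is 70
        assertEquals(70, account.balance());
    }
}
```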

  • The goal of the acceptance test stage is to assert that the system delivers the value the customer is expecting and that it meets the acceptance criteria.
  • Acceptance tests should be expressed in the language of the business (ubiquitous language).
  • Acceptance tests should be run when your system is in a production-like mode.

Testing microservices

  • Run the tested version of a microservice against the production infrastructure (don't duplicate/simulate all the dependencies for the test).

Test Doubles

  • Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
  • Fake objects actually have working implementations, but usually take some shortcut that makes them not suitable for production. A good example of this is the in-memory database.
  • Stubs provide canned answers to the calls made during the test, usually not responding at all to anything outside what’s programmed in for the test.
  • Spies are stubs that also record some information based on how they were called. One form of this might be an email service that records how many messages it was sent.
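
A compact, hand-rolled sketch of these four kinds of doubles around a hypothetical MailService port:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical port that the doubles below stand in for.
interface MailService {
    boolean send(String to, String body);
}

// Dummy: passed around to satisfy a parameter list, never actually used.
class DummyMailService implements MailService {
    public boolean send(String to, String body) {
        throw new AssertionError("dummy should never be called");
    }
}

// Stub: returns a canned answer, nothing more.
class AlwaysSucceedsMailService implements MailService {
    public boolean send(String to, String body) { return true; }
}

// Spy: a stub that also records how it was called.
class SpyMailService implements MailService {
    final List<String> recipients = new ArrayList<>();
    public boolean send(String to, String body) {
        recipients.add(to);
        return true;
    }
}

// Fake: a working implementation with a shortcut (in memory, no real SMTP).
class InMemoryMailService implements MailService {
    final List<String> outbox = new ArrayList<>();
    public boolean send(String to, String body) {
        outbox.add(to + ": " + body);
        return true;
    }
}
```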

Only mock types that you own. Write your own adapters on top of third-party libraries that provide access to unmanaged dependencies. Mock those adapters instead of the underlying types.

Mocks help to emulate and examine outcoming interactions. These interactions are calls the system under test makes to its dependencies to change their state.

Stubs help to emulate incoming interactions. These interactions are calls the system under test makes to its dependencies to get input data.

A test that utilizes test doubles isolates the system under test from its dependencies so that the test does not execute code in the dependencies of the system under test.

  • Prefer realism over isolation.

State Testing

Classic style of running the code with fake objects and testing the expected results.

  • assertThat("should be one", result, is(1));

Mock Testing

Behavior style of testing, in which a mock object verifies whether an expected life-cycle process was run.

  • verify(mockedList).add(1);

Only unmanaged out-of-process mutable dependencies should be mocked (message bus, SMTP server, etc.).

JUnit 5

  • @Test methods don't have to be public anymore.
  • assertThrows checks an exception is thrown from a call.
  • assertTimeout checks the timeout was not reached by a call.
  • @Tag annotation to compose custom annotations.
  • @DisplayName to name a test.
  • @DisabledIf disables the test under a condition.
  • TestInfo parameter to get information about the test within the test itself.
  • @RepeatedTest for repeating the test.
  • @ParameterizedTest for running the test with different parameters (see the sketch after this list).
  • Extensions support - Mockito, Spring, Cucumber, Selenium, Docker, Android - @ExtendWith(SpringExtension.class)
  • Extension for the Spring framework: @SpringJUnitConfig allows using Spring autowiring, etc., in the unit test.
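
A small sketch exercising a few of the features above (the calculator and its values are made up):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

// Note: the test class and methods are package-private; JUnit 5 no longer requires public.
class CalculatorTest {

    static int divide(int a, int b) { return a / b; }

    @Test
    @DisplayName("dividing by zero is reported as an error")
    void divisionByZeroThrows() {
        assertThrows(ArithmeticException.class, () -> divide(1, 0));
    }

    @ParameterizedTest
    @ValueSource(ints = {1, 2, 4})
    void anyNumberDividedByItselfIsOne(int value) {
        assertEquals(1, divide(value, value));
    }
}
```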

Test anti-patterns

  • Second class citizens: Test code containing a lot of duplicated code, making it hard to maintain.
  • The free ride (also known as Piggyback): Instead of writing a new method to verify another feature/requirement, a new assertion is added to an existing test.
  • Happy path: It only verifies expected results without testing for boundaries and exceptions.
  • The local hero: A test dependent on some specific local environment. This anti-pattern can be summarized by the phrase It works on my machine.
  • The hidden dependency: A test that requires some existing data populated somewhere before the test runs.
  • Chain gang: Tests that must be run in a certain order, for example, changing the SUT to a state expected by the next one.
  • The mockery: A unit test that contains so many test doubles that the SUT is not really tested at all; instead, the test merely verifies data returned from the test doubles.
  • The silent catcher: A test that passes even if an unintended exception actually occurs.
  • The inspector: A test that violates encapsulation, so that any refactoring of the SUT requires reflecting those changes in the test.
  • Excessive setup: A test that requires a huge setup in order to start the exercise stage.
  • Anal probe: A test which has to use unhealthy ways to perform its task, such as reading private fields using reflection.
  • The test with no name: A test method whose name gives no clear indication of what is being tested (for example, an identifier from a bug-tracking tool).
  • The slowpoke: A unit test that takes more than a few seconds to run.
  • The flickering test: A test that contains race conditions within the test itself, making it fail from time to time.
  • Wait and see: A test that needs to wait a specific amount of time (for example, Thread.sleep()) before it can verify some expected behavior.
  • Inappropriately shared fixture: Tests that use a shared test fixture even though they don't need its setup/teardown.
  • The giant: A test class that contains a huge number of test methods (God Object).
  • Wet floor: A test that creates persisted data but does not clean it up when finished.
  • The cuckoo: A unit test that establishes some kind of fixture before the actual test, but then somehow discards that fixture.
  • The secret catcher: A test that makes no assertions, relying on an exception to be thrown and reported by the testing framework as a failure.
  • The environmental vandal: A test that requires the use of specific environment variables or resources (for instance, a free port number) to allow simultaneous executions.
  • Doppelganger: Copying parts of the code under test into a new class to make them visible to the test.
  • The mother hen: A fixture which does more than the test needs.
  • The test it all: A test that tries to verify everything at once, breaking the Single Responsibility Principle.
  • Line hitter: A test without any kind of real verification of the SUT.
  • The conjoined twins: Tests that are called unit tests but are really integration tests since there is no isolation between the SUT and the DOC(s).
  • The liar: A test that does not test what it is supposed to test.

References