Automatic Testing with ROS

This page lays out the rationale, best practices, and policies for writing and running unit tests and integration tests. It is aimed primarily at developers who have not used automatic regression tests, unit tests, or integration tests before. However, even if you have experience from other projects, it is useful to read it for exposure to the ROS process. You should read it regardless of whether you want to contribute code to a ROS distribution or just want to build a robot application using ROS.

If you are in a rush and already know all the ins and outs of automated test writing from other projects, here are some shortcuts to the language-specific guidelines on writing tests:

gtest (C++) on writing tests for ROS packages using the Google Test project
unittest (Python) on writing tests for ROS packages using Python's unit testing framework
rostest on integrating your tests into ROS build and test setup. In most scenarios, you will need this, and one of the two tutorials above.

All the others, please continue. We will bring you to these details anyway, when the time is right.

Why automatic tests?

Consider the following scenario. You are fixing a bug in a core ROS package (let's say roscpp). You have created a correction and tested the patch on your own computer. If you automate the test and submit it together with the patch, you will make it much easier for other developers to avoid reintroducing the bug, and the continuous integration service will be able to detect such regressions automatically. So automatic tests can actually make your life and other developers' lives easier. They require some time to set up, but they will save you plenty of time in the future.

Here are some of the many good reasons why we should have automated tests:

You can make incremental updates to your code more quickly. We have hundreds of packages with many interdependencies, so it's hard to anticipate the problems a small change might cause. If your change passes the unit tests, you can be confident that you haven't introduced problems — or at least the problems aren't your fault.
You can refactor your code with greater confidence. Passing the unit tests verifies that you haven't introduced any bugs while refactoring. This gives you a wonderful freedom from the fear of change! You can actually improve the quality of the code!
It leads to better designed code. Unit tests force you to write your code so that it can be more easily tested. This often means keeping your underlying functions and framework separate, which is one of our design goals with ROS code.
They prevent recurring bugs (bug regressions). It's a good practice to write a unit test for every bug you fix. In fact, write the unit test before you fix the bug. This will help you to reproduce the bug precisely, or even deterministically, and to understand much more precisely what the problem is. As a result, you will also create a better patch, which you can then test with your regression test to verify that the bug is fixed. That way the bug won't accidentally get reintroduced if the code gets modified later on. Also, whoever has to accept your patch in a pull request will be much easier to convince that the problem is solved and that the contribution is of high quality.
They let you blame other people (contract-based development). A unit test is documentation that your code meets its expected contract. If your contribution passes the tests written by others, you can claim that you did your job right. If someone else's code fails tests, you can reject it as being not of sufficient quality.
Other people can work on your code more easily (an automatic form of documentation). It's hard to figure out whether or not you've broken someone else's code when you make a change. The unit tests are a tool for other developers to validate their changes. Automatic tests document your coding decisions and automatically tell other developers when they violate them. Thus tests become documentation for your code: documentation that does not need to be read most of the time, and when it does need to be inspected, the test system indicates precisely what to read (which tests fail). By writing automatic tests you make other contributors faster. This improves the entire ROS project.
It is much easier to become a contributor to ROS if we have automated unit tests. Without tests, it is very difficult for new external developers to contribute to your components. When they make changes to code, they are often working blind, driven by a lot of guesswork. By providing a harness of automated tests, you help them in the task. They get immediate feedback on their changes. It becomes easier to contribute to a project, and easier for new contributors to join. Their first contributions are also of higher quality, which decreases the workload on maintainers. A win-win!
Automatic tests simplify maintainership. Especially for mature packages, which change more slowly and mostly need to be updated to new dependencies, an automatic test suite helps to establish very quickly whether the package still works. This makes it much easier to decide whether the package is still supported or not.
Automatic tests amplify the value of continuous integration. Regression tests, along with normal scenario-based requirements tests, contribute to the overall body of automated tests for your component. This increases the effectiveness of the build system and of continuous integration (CI). Your component is better tested against the evolution of the other APIs that it depends on (CI servers will tell you sooner and more precisely what problems develop in your code).
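The advice above to write the test before fixing the bug can be illustrated with a minimal sketch using Python's unittest. Everything here is hypothetical: normalize_angle is an invented library function, and the bug (returning 360.0 instead of 0.0 for a full turn) is imagined for the sake of the example. The test is written first, fails against the buggy code, and passes once the fix is in place.

```python
import unittest


def normalize_angle(angle_deg):
    """Wrap an angle in degrees into the half-open range [0, 360).

    Hypothetical library function. An earlier, buggy version is imagined
    to have returned 360.0 when given exactly 360.0.
    """
    return angle_deg % 360.0


class NormalizeAngleRegressionTest(unittest.TestCase):
    """Regression tests that pin down the imagined bug report."""

    def test_full_turn_wraps_to_zero(self):
        # Reproduces the reported input deterministically.
        self.assertEqual(normalize_angle(360.0), 0.0)

    def test_negative_angle_wraps_into_range(self):
        # A closely related input that the fix must not break.
        self.assertEqual(normalize_angle(-90.0), 270.0)
```

Saved in a file of its own, such a test can be run with python -m unittest. Once submitted together with the patch, it keeps guarding against the same bug in later modifications.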

Perhaps the most important benefit of writing tests is that tests make you a good man (or woman!). Tests influence quality in the long term. Writing them is a well-accepted practice in many open-source projects. By writing regression tests, you contribute to the long-term quality of the ROS ecosystem. See how great you are!

Who should care?

Test automation is beneficial across the board: for core ROS contributors, for component and driver developers, and for application developers. Any ROS project (including application and component projects) that decides to use continuous integration should seriously consider also using regression testing.

Tests and contributing code. Automated testing is particularly important for smoothing the collaboration of many developers on a distributed open source project such as ROS. The problem is that bugs are often (re-)introduced by developers who do not know the context sufficiently well, and bugs keep returning to the project (these are known as regression bugs) because the rationale for a certain decision in the code is forgotten. The goal is to prevent future developers from reintroducing the bug. This is why, when you are contributing code to ROS, whether bug fixes or new features, you have to include automatic unit tests. Unit tests are a required step in our QA process. Without them, your code is not considered cleared for release or code review. In fact, a code review reviews your unit tests as well as your code. Unit tests are an important part of any quality software product, but they are especially important for ROS as we are supporting a very diverse code base with many, many components. It simply isn't possible to debug without them.

Is this all coming for free?

Of course, there is no free lunch. To get the benefits, some investment is necessary. By accumulating a set of automatic tests, you are creating a test harness that is instrumental in keeping the main code in order, but that also costs effort to create and maintain.

Development cost. You need to develop a test, which may sometimes be difficult or costly. It may also be nontrivial, because the test has to be automatic. Things get particularly hairy if your tests involve special hardware (they should not: try to use simulation, mock the hardware, or narrow down the test to a smaller software problem) or require an external environment, for instance human operators.
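As a minimal sketch of the "mock the hardware" suggestion, the following uses Python's standard unittest.mock module. The BatteryMonitor class and the driver's read_voltage method are hypothetical names invented for this illustration and are not part of any ROS package. The point is only that the logic under test receives its driver as an argument, so a test can substitute a mock for the real device.

```python
import unittest
from unittest import mock


class BatteryMonitor:
    """Hypothetical component that reads a voltage from a hardware driver."""

    def __init__(self, driver, low_voltage_threshold=11.0):
        # The driver is injected so that tests can replace it with a mock.
        self._driver = driver
        self._threshold = low_voltage_threshold

    def is_low(self):
        return self._driver.read_voltage() < self._threshold


class BatteryMonitorTest(unittest.TestCase):
    def test_reports_low_battery(self):
        driver = mock.Mock()
        driver.read_voltage.return_value = 10.5  # simulated reading
        self.assertTrue(BatteryMonitor(driver).is_low())
        driver.read_voltage.assert_called_once_with()

    def test_reports_healthy_battery(self):
        driver = mock.Mock()
        driver.read_voltage.return_value = 12.4  # simulated reading
        self.assertFalse(BatteryMonitor(driver).is_low())
```

Because no real device is touched, tests of this kind can run on any developer machine and on a continuous integration server.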

Maintenance cost. Regression tests and other automatic tests need to be maintained. When the design of the component changes, a lot of tests become invalid (for instance, they no longer compile, or they throw runtime exceptions related to the API design). These tests fail not only because the redesign re-introduced bugs but also because they need to be updated to the new design. Occasionally, with bigger redesigns, old regression tests should be dropped.

An example

Let's analyze an example of a bug that has been fixed in the ros_comm package in the past: fix race condition that lead to miss first message #1054. This particular example involves Google's gtest framework (without rostest).

A submitter identified a problem in a core package (ros_comm / roscpp).
He wrote a patch and tested it locally on his hardware.
In this particular case, another developer submitted a regression test exhibiting the problem.
The submitter created a Pull Request against indigo-devel (second-to-last LTS).
Maintainer #1 identified small issues with the submitted patch and proposed fixes. In particular, he requested that the regression test be incorporated into the patch.
Maintainer #2 reviewed the new PR by maintainer #1 and waited for the Continuous Integration server to complete the test run.
Eventually (on another branch, see https://github.com/ros/ros_comm/pull/1058) the fix for the problem was merged together with the test. You can see in this pull request that the results of the continuous integration tests have been checked.
The test remains active for future modifications of locks in this package.

Sometimes regression tests will involve large amounts of data (for instance ROS bags storing data from the failing execution). It is possible to make tests conditional, so that they are not run if this data is not available. This allows the other developers on your package to avoid running expensive tests if they do not wish to.
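One possible way to make a test conditional, sketched here with Python's unittest, is the skipUnless decorator. The environment variable name EXAMPLE_TEST_BAG is an assumption made up for this sketch; a real package would choose its own convention for locating large data files. The check on the file size is only a stand-in for whatever the real test would do with the recording.

```python
import os
import unittest

# Hypothetical environment variable pointing at a large recorded data file.
# If it is unset or the file is missing, the expensive test is skipped.
BAG_PATH = os.environ.get("EXAMPLE_TEST_BAG", "")


class RecordedDataTest(unittest.TestCase):
    @unittest.skipUnless(os.path.isfile(BAG_PATH),
                         "recorded data not available")
    def test_recording_is_not_empty(self):
        # Placeholder check; a real test would replay or parse the data.
        self.assertGreater(os.path.getsize(BAG_PATH), 0)
```

A skipped test is reported as skipped rather than failed, so developers without the data still get a passing test run.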

Main testing tools in ROS

For testing Python code at the library level (at the Python API level), ROS projects should use Python’s unittest framework. For testing C++ code at the library level (at the C++ API level), the Google Test framework (gtest) should be used. For testing at the ROS node level, involving ROS as a communication middleware, rostest is used together with unittest or gtest. This applies both to single-node tests and to tests that require integrating several nodes (technically known as integration tests, not unit tests).

It is key that the tests are not only automatic but also integrated into the project scripts, so that they are run by the build and test infrastructure whenever the project is being tested. To run the tests, you will need catkin/roslaunch integration (see the tutorial Integrate tests in Catkin and rostest/Writing). This may involve introducing a build dependency on rostest in package.xml and including a launcher for the test in the test file.

Test nodes should be introduced into launch files using the test tag. The rostest tool interprets these tags to start the nodes responsible for running node-level tests (see [roslaunch/XML/test]). Regarding submission, please refer to the pattern Submit a patch, where the git infrastructure for submission is discussed.

Levels of testing

Level 1. Library unit test (unittest, gtest): a library unit test should test your code sans ROS (if you are having a hard time testing sans ROS, you probably need to refactor your library). Make sure that your underlying functions are separated out and well tested with common, edge, and error inputs. Make sure that the pre- and post-conditions of your methods are well tested. Also make sure that the behavior of your code under these various conditions is well documented.
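A level 1 test might look like the following minimal sketch, which uses only Python's unittest and no ROS at all. The function distance_2d is a hypothetical piece of library code invented for this illustration; the tests cover a common input, an edge input, an error input, and the documented post-condition.

```python
import math
import unittest


def distance_2d(p, q):
    """Euclidean distance between two (x, y) points; hypothetical library code.

    Precondition: both arguments have exactly two coordinates.
    Postcondition: the result is a non-negative float.
    """
    if len(p) != 2 or len(q) != 2:
        raise ValueError("points must have exactly two coordinates")
    return math.hypot(q[0] - p[0], q[1] - p[1])


class Distance2DTest(unittest.TestCase):
    def test_common_input(self):
        self.assertAlmostEqual(distance_2d((0.0, 0.0), (3.0, 4.0)), 5.0)

    def test_edge_input_identical_points(self):
        self.assertEqual(distance_2d((1.5, -2.0), (1.5, -2.0)), 0.0)

    def test_error_input_wrong_dimension(self):
        # Violating the precondition must raise, not return garbage.
        with self.assertRaises(ValueError):
            distance_2d((0.0, 0.0, 0.0), (1.0, 1.0))

    def test_postcondition_symmetric_and_non_negative(self):
        a, b = (-1.0, 2.0), (4.0, -3.0)
        self.assertGreaterEqual(distance_2d(a, b), 0.0)
        self.assertEqual(distance_2d(a, b), distance_2d(b, a))
```

The docstring records the pre- and post-conditions, and each of them has a test that would fail if the behavior changed.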

Level 2. ROS node unit test (rostest + unittest/gtest): node unit tests start up your node and test its external API, i.e. published topics, subscribed topics, and services.

Level 3. ROS nodes integration / regression test (rostest + unittest/gtest): integration tests start up multiple nodes and test that they all work together as expected. Debugging a collection of ROS nodes is like debugging multi-threaded code: it can deadlock, there can be race conditions, and so on. An integration test is often the best way of uncovering such bugs.

The levels that your code needs to be tested at will depend on the type of code you are writing. Here are some examples:

ROS-free library (e.g. some math routines): this will likely need only level 1 testing.
ROS service: most ROS services will only need to be tested at the levels 1 and 2. The level 2 testing should hammer the API as much as possible to make sure there are no underlying threading or performance issues.
ROS node publisher/subscriber: most ROS nodes will need to be tested at all three levels. Level 1 will test underlying functionality. Level 2 tests the functionality at the ROS level and level 3 attempts to uncover race and deadlock conditions.

For regression tests, it is always best to write them at the lowest possible level at which the problem is exhibited. If the problem is local to a library function, it is beneficial to write the test at the API level of this library. If the problem involves communication on a ROS topic, the test is probably best written at the ROS node level. The reason for this is three-fold. First, lower-level tests involve less ROS infrastructure and are thus more efficient to execute. Fast execution of tests is beneficial both offline (on your machine) and in continuous integration. Second, lower-level tests localize the problem better, so when they fail, it is easier for new developers to diagnose what is going on. Third, the most localized tests introduce the least maintenance cost, as they have the fewest dependencies.