Tests

Revision as of 12:54, 17 March 2022 by Nrjank (talk | contribs) (→‎Self-test Developer Best Practices: typo/grammar fixes)

A thorough test suite is very important but often overlooked. It is an incredible help in preventing regression bugs and in quickly assessing the status of old code. For example, many packages in Octave Forge become deprecated after losing their maintainer simply because they have no test suite.

GNU Octave has multiple tools that help in creating a comprehensive test suite, accessible to both developers and end users, as detailed in the Octave manual. Basically, tests are written in %!test comment blocks, typically at the end of a source file, which are ignored by the Octave interpreter and read only by the test function.

Running tests

To run all the tests of a specific function, simply use the test command at the Octave prompt. For example, to run the tests of the Octave function mean, type:

>> test mean
PASSES 17 out of 17 tests

These tests are written in the Octave language at the bottom of mean.m, the file that defines the mean function. It is important that these tests are also available to end users so they can test the status of their installation. The whole Octave test suite can be run with:

>> __run_test_suite__

Integrated test scripts:

[...]

Summary:

  PASS     11556
  FAIL         3
  XFAIL        6
  SKIPPED     38

See the file test/fntests.log for additional details.

The summary indicates that most tests have passed as expected, 3 tests failed that were expected to pass, 6 tests failed that were expected to fail, and 38 tests were skipped due to Octave settings or configuration.

To run tests in a specific file, one can simply specify the path instead of a function name:

 test /full/path/to/file.m
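
The test function can also be called with output arguments, which is convenient in scripts. The following is a minimal sketch, assuming the "quiet" option and the [n, nmax] and single-output forms described in the Octave manual; the variable names are only illustrative, and the counts depend on the installed Octave version:

```octave
## Run the tests embedded in mean.m, reporting only failures.
[n, nmax] = test ("mean", "quiet");
printf ("mean: %d of %d tests passed\n", n, nmax);

## With a single output, test returns true only if every test passed.
all_ok = test ("mean", "quiet");
```

This form makes it possible to check the result of a test run programmatically instead of reading the printed summary.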

Writing tests

(Please see the latest version of the Octave manual for complete documentation on Test and Demo Functions.)

Tests appear as %! blocks at the bottom of the source file, together with %!demo blocks. A typical m function file has the following structure:

## Copyright
##
## A block with the copyright notice

## -*- texinfo -*-
##
## A block with the help text

function [x, y, z] = foo (bar)
  ## some amazing code
endfunction

%!assert (foo (1))
%!assert (foo (1:10))
%!assert (foo ("on"), "off")
%!error <must be positive integer> foo (-1)
%!error <must be positive integer> foo (1.5)

%!demo
%! ## see how cool foo() is:
%! foo([1:100])

Tests can be added to oct functions in the C++ sources just as easily; see find.cc for an example. The syntax is exactly the same, but the tests are placed within C comment blocks. During installation, these lines are automatically extracted from the sources and special test scripts are generated. A typical C++ source file has the following structure:

// Copyright
//
// A block with the copyright notice
 
DEFUN_DLD (foo, args, ,
"-*- texinfo -*-\n\
A block with the help text")
{
  // some amazing code
}
 
/*
%!assert (foo (1))
%!assert (foo (1:10))
%!assert (foo ("on"), "off")
%!error <must be positive integer> foo (-1)
%!error <must be positive integer> foo (1.5)
*/

Assert

%!assert lines are the simplest one-line tests to write and also the most common:

%!assert (foo (bar))      # test fails if "foo (bar)" returns false
%!assert (foo (bar), qux) # test fails if "foo (bar)" is different from "qux"

These are actually shorthand for %!test assert (foo (bar)); assert is simply an Octave function that throws an error when its condition is false or when its two arguments do not compare equal. A tolerance can be added to assert to accept results that are not exactly equal numerically:

%!assert (pi, 3.14159)          # test fails as "Abs err 2.6536e-06 exceeds tol 0 by 3e-06"
%!assert (pi, 3.14159, 1e-5)    # test passes
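
The sign of the tolerance selects how it is applied. The sketch below calls assert directly at the prompt (the same arguments work inside %!assert lines) and assumes the convention documented for assert: a positive tolerance is absolute and a negative tolerance is relative. The specific values are only illustrative:

```octave
## Absolute tolerance (tol > 0): passes when |observed - expected| <= tol.
assert (pi, 3.14159, 1e-5);

## Relative tolerance (tol < 0): passes when
## |observed - expected| / |expected| <= |tol|.
assert (1e6 * pi, 3141590, -1e-5);

## Rounding error: 0.1 + 0.2 is not exactly 0.3 in binary floating point,
## so an exact comparison would fail; a tolerance of eps is enough here.
assert (0.1 + 0.2, 0.3, eps);
```

A relative tolerance is usually the better choice when the expected values span several orders of magnitude.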

Error / Warning

It is also important to test that a function performs its checks correctly and throws errors (or warnings) when it receives garbage. This can be done with error (or warning) blocks:

%!error foo ()  # test that causes any error
%!error <BAR must be a positive integer> foo (-1.5)  # test that throws specific error message
%!error id=Octave:invalid-fun-call foo ()  # test that throws specific error id

%!warning foo ()  # test that causes any warning
%!warning <negative values might give inaccurate results> foo (-1.5)  # test that triggers a specific warning message
%!warning id=BAR:possibly-inaccurate-result foo (-1.5)  # test that triggers a specific warning id

These are actually shorthand versions of %!test fail ("foo()", "error message") and %!test fail ("foo()", "warning", "warning message"), where the fail function returns true if the supplied code raises the expected error or warning.
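
The fail function can be tried directly at the prompt to see what the shorthand does. This is a small sketch under the assumption that the pattern argument is matched as a regular expression against the message; the warning id "demo:id" and its text are made up for illustration:

```octave
## Adding a 2x3 and a 3x2 matrix raises a "nonconformant arguments" error,
## so fail returns true.
ok_error = fail ("ones (2, 3) + ones (3, 2)", "nonconformant");

## The same check for a warning; "demo:id" is a hypothetical warning id.
ok_warning = fail ("warning ('demo:id', 'value looks suspicious')", ...
                   "warning", "suspicious");
```

If the code does not raise a matching error or warning, fail itself throws an error, which is what makes the enclosing test fail.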

Test Blocks

While single %!assert, %!error, and %!warning lines are the most commonly used tests, %!test blocks offer more features and flexibility. The code within %!test blocks is simply processed through the Octave interpreter. If the code generates an error, the test is said to fail. Often %!test blocks end with a call to assert:

%!test
%! a = [0 1 0 0 3 0 0 5 0 2 1];
%! b = [2 5 8 10 11];
%! for i = 1:5
%!   assert (find (a, i), b(1:i))
%! endfor

Test for no failure

In a few cases a function returns nothing, and the only thing to test is that it causes no error. This can be tested simply with:

%!test foo (bar)

Test for failure

If a warning or error message cannot be tested with one of the single-line tests mentioned above, the fail function can be used within a test block to verify expected error and warning functionality such as:

%!test
%! a = [1 2 3];
%! b = [1 2];
%! fail ("a + b", "nonconformant arguments")

%!test
%! a = 111;
%! b = 112;
%! fail ("['foo', a, b]", "warning", "implicit conversion from numeric to char")

The tests above pass if the errors/warnings occur as expected.

Shared functions

It is often useful to share a function among multiple tests. Sometimes these are only small helper functions, but more often they are simpler, low-performance implementations of the function being tested. They are created in %!function blocks:

%!function x = slow_foo (bar)
%!  ## a simple implementation of foo, definitely correct, but
%!  ## unfortunately too slow for anything other than tests.
%!endfunction

%!assert (foo (bar), slow_foo (bar))

%!test
%! for i = -100:100
%!   bar = qux (i);
%!   assert (foo (bar), slow_foo (bar))
%! endfor
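
As a concrete illustration of the pattern above, here is a complete, hypothetical function file. The names sumsq_fast and slow_sumsq are invented for this sketch and are not part of Octave:

```octave
## sumsq_fast.m (hypothetical example file)
function y = sumsq_fast (x)
  ## vectorized sum of squares of all elements of x
  y = sum (x(:) .^ 2);
endfunction

%!function y = slow_sumsq (x)
%!  ## straightforward loop, easy to verify by inspection
%!  y = 0;
%!  for k = 1:numel (x)
%!    y += x(k)^2;
%!  endfor
%!endfunction

%!assert (sumsq_fast ([1 2 3]), slow_sumsq ([1 2 3]))

%!test
%! for n = 0:5
%!   x = (1:n) / 2;
%!   assert (sumsq_fast (x), slow_sumsq (x), 4 * eps);
%! endfor
```

Saved as sumsq_fast.m on the load path, the embedded tests would be run with test sumsq_fast. The loop starts at n = 0 so that the empty input is also compared against the reference implementation.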

Expected Failures and Known Bugs

Sometimes a test that should pass is known to fail because of a bug that cannot be fixed immediately. Such a test can still be included to document the failure and the expected behavior for future correction, using either %!test or %!xtest as described below.

An %!xtest block is a test that is expected to fail. It is written and interpreted identically to a %!test block, with all of the same options available, but failure does not interrupt testing as a failing test would. Total %!xtest failures are counted in the XFAIL total of the test summary as shown above.

The following test block:

%!test
%!  assert (1+1, 2)
%!xtest 
%!  assert (1+1, 3)

will evaluate as:

***** xtest
 assert(1+1, 3)
!!!!! known failure
ASSERT errors for:  assert (1 + 1,3)

  Location  |  Observed  |  Expected  |  Reason
     ()           2            3         Abs err 1 exceeds tol 0 by 1
PASSES 1 out of 2 tests (1 known failure)

An optional message in angle brackets can be added after %!test or %!xtest; it is displayed if the test fails. For example:

%!test <good math>
%!  assert (1+1, 2)
%!xtest <bad math>
%!  assert (1+1, 3)

These can also be written as single-line tests:

%!test <good math> assert (1+1, 2)
%!xtest <bad math> assert (1+1, 3)

For the %!xtest block, adding the message replaces the line:

!!!!! known failure

with the message:

!!!!! known bug: bad math

The message attached to a %!test block is displayed only if that test fails:

***** test <good math> assert (1+1, 3)
!!!!! known bug: good math
ASSERT errors for:  assert (1 + 1,3)

  Location  |  Observed  |  Expected  |  Reason
     ()           2            3         Abs err 1 exceeds tol 0 by 1
***** xtest <bad math> assert (1+1, 3)
!!!!! known bug: bad math
ASSERT errors for:  assert (1 + 1,3)

  Location  |  Observed  |  Expected  |  Reason
     ()           2            3         Abs err 1 exceeds tol 0 by 1
PASSES 0 out of 2 tests (2 known bugs)

If the <Message> is just an integer, Octave interprets it as the ID number of a bug report for a known failure, and that %!test block is treated the same as an %!xtest block:

%!test <12345> assert (1+1, 3)
%!xtest <12345> assert (1+1, 3)

produces:

***** test <12345> assert (1+1, 3)
!!!!! known bug: https://octave.org/testfailure/?12345
ASSERT errors for:  assert (1 + 1,3)

  Location  |  Observed  |  Expected  |  Reason
     ()           2            3         Abs err 1 exceeds tol 0 by 1
***** xtest <12345> assert (1+1, 3)
!!!!! known bug: https://octave.org/testfailure/?12345
ASSERT errors for:  assert (1 + 1,3)

  Location  |  Observed  |  Expected  |  Reason
     ()           2            3         Abs err 1 exceeds tol 0 by 1
PASSES 0 out of 2 tests (2 known bugs)

A %!test block with a <Message> that is an integer preceded by an asterisk (*) is interpreted as the ID number of a bug report for a bug that has been fixed. Octave's build process automatically checks the status of bug reports and adds the "*" in all source files that contain tests tagged with bug numbers. If such a block fails in a later test run, it is flagged as a regression:

%!test <*12345> assert (1+1, 3)

produces:

***** test <*12345> assert (1+1, 3)
!!!!! regression: https://octave.org/testfailure/?12345
ASSERT errors for:  assert (1 + 1,3)

  Location  |  Observed  |  Expected  |  Reason
     ()           2            3         Abs err 1 exceeds tol 0 by 1
PASSES 0 out of 1 test

Self-test Developer Best Practices

  • "Too many tests" is rarely a problem.
  • Input Validation:
    • When developing a function, it is good practice to include an "Input Validation tests" section that includes %!error tests for every input combination expected to produce an error, including the expected <error message> when possible.
    • Input validation should test the number of inputs/outputs, input type/class, and any special handling (valid option names, values, etc.), including the possibility of "empty" and "NaN" inputs.
  • Error coverage: Ideally every call to error() in the function would have a corresponding test to ensure it is correctly reached and executed when the condition occurs.
  • Code path coverage: Include tests for every primary code function and major combination of inputs to reduce the number of future bugs reported by users.
  • Tests should verify proper function output (or appropriately informative error message) produced by:
    • different input shapes - scalars, vectors, arrays, or empty ([], ones(0,1), and ones(1,0) are not necessarily the same!)
    • types - numbers, booleans, strings, cells, multi-level cells, cell strings, structs, etc.
    • contents - real or complex, NaN, Inf, etc.
  • Functions that primarily rely on calling other functions with their own tests do not necessarily need to repeat all of the same tests. However, it may still be beneficial to do so if changes to the called function could produce errors that would not otherwise be caught by other tests.
  • Floating point calculations can cause failure of %!assert tests expecting exact equality. Adding a tolerance on the order of eps or eps(variable) should rarely be a problem and is often sufficient to circumvent the issue. Depending on the mathematical operations, it may be appropriate to use a tolerance several orders of magnitude larger than eps, but care should be taken in setting arbitrarily large tolerances that could hide actual calculation errors.
  • Tests that ensure code compatibility with Matlab are very valuable for reducing future bug reports for incompatible behavior. Because the behavior of Matlab functions can and often does change, and those changes often go unnoticed until user bug reports appear, it can be useful to note with a comment which version of Matlab the test was verified against (or what the latest release was if the compatibility test is based on the public-facing documentation). As always, please ensure that test code avoids the use of any copyrighted material.
  • Because of the automatic processing and regression tracking, %!xtest should only be used when there is an expected failure that has no related bug report. It is preferred that all known bugs that cause function/test failures be reported at bugs.octave.org so that a bug ID number is generated and a %!test <12345> format block can be used instead. Tests added to confirm bug fixes should use a %!test <*12345> block for automated regression tracking.
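
Several of the practices above can be sketched in one hypothetical function file. The function scale_vec and its error messages are invented for this illustration; the point is only the layout, with one %!error test per call to error() and functional tests covering different shapes, empty inputs, and special values:

```octave
## scale_vec.m (hypothetical example file)
function y = scale_vec (x, k)
  if (nargin != 2)
    error ("scale_vec: two input arguments are required");
  endif
  if (! isnumeric (x))
    error ("scale_vec: X must be numeric");
  endif
  if (! (isscalar (k) && isreal (k) && k > 0))
    error ("scale_vec: K must be a positive real scalar");
  endif
  y = k * x;
endfunction

## Functional tests: shapes, empty inputs, special values, rounding
%!assert (scale_vec (2, 3), 6)
%!assert (scale_vec ([1 2; 3 4], 2), [2 4; 6 8])
%!assert (scale_vec ([], 2), [])
%!assert (scale_vec (ones (0, 1), 2), ones (0, 1))
%!assert (scale_vec ([NaN Inf], 2), [NaN Inf])
%!assert (scale_vec (0.1, 3), 0.3, eps)

## Input validation tests: one per call to error() and per failing condition
%!error <two input arguments are required> scale_vec ()
%!error <two input arguments are required> scale_vec (1)
%!error <X must be numeric> scale_vec ("abc", 2)
%!error <K must be a positive real scalar> scale_vec (1, -1)
%!error <K must be a positive real scalar> scale_vec (1, [1 2])
%!error <K must be a positive real scalar> scale_vec (1, NaN)
```

The last functional test uses a tolerance of eps because 0.1 * 3 is not exactly 0.3 in binary floating point.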