Tests
Latest revision as of 12:54, 17 March 2022
Having a thorough test suite is very important, yet it is often overlooked. It is an incredible help in preventing regression bugs and in quickly assessing the status of old code. For example, many packages in Octave Forge become deprecated after losing their maintainer simply because they have no test suite.
GNU Octave has multiple tools that help in creating a comprehensive test suite, accessible to both developers and end users, as detailed in the Octave manual. Basically, test blocks are %!test comment blocks, typically at the end of a source file, which are ignored by the Octave interpreter and only read by the test function.
Running tests
To run all the tests of a specific function, simply use the test command at the Octave prompt. For example, to run the tests of the Octave function mean, type:
>> test mean
PASSES 17 out of 17 tests
These tests are written in the Octave language at the bottom of mean.m, which defines the mean function. It is important that these tests are also available to end users so they can test the status of their installation. The whole Octave test suite can be run with:
>> __run_test_suite__
Integrated test scripts:
[...]
Summary:
  PASS    11556
  FAIL        3
  XFAIL       6
  SKIPPED    38
See the file test/fntests.log for additional details.
The summary indicates that most tests have passed as expected, 3 tests failed that were expected to pass, 6 tests failed that were expected to fail, and 38 tests were skipped due to Octave settings or configuration.
To run tests in a specific file, one can simply specify the path instead of a function name:
test /full/path/to/file.m
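The test function can also be called with output arguments, which may be convenient in scripts. The following is a minimal sketch, assuming the documented call form with a "quiet" option and two outputs; the variable names n and nmax are illustrative only.

```octave
## Run the tests of "mean" without printing the usual report.
[n, nmax] = test ("mean", "quiet");

## n is the number of passing tests, nmax the number of tests that were run.
printf ("mean: %d of %d tests passed\n", n, nmax);
if (n < nmax)
  warning ("some tests of mean did not pass; run 'test mean' for details");
endif
```

Running test mean without the "quiet" option prints the full report for any test that did not pass.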
Writing tests
(Please see the latest version of the Octave manual for complete documentation on Test and Demo Functions.)
Tests appear as %! blocks at the bottom of the source file, together with %!demo blocks. A typical m function file will have the following structure:
## Copyright
##
## A block with the copyright notice
## -*- texinfo -*-
##
## A block with the help text
function [x, y, z] = foo (bar)
## some amazing code
endfunction
%!assert (foo (1))
%!assert (foo (1:10))
%!assert (foo ("on"), "off")
%!error <must be positive integer> foo (-1)
%!error <must be positive integer> foo (1.5)
%!demo
%! ## see how cool foo() is:
%! foo([1:100])
Tests can be added to oct functions in the C++ sources just as easily; see find.cc for an example. The syntax is exactly the same, but the tests are placed within C comment blocks. During installation, these lines are automatically extracted from the sources and special test scripts are generated. A typical C++ source file has the following structure:
// Copyright
//
// A block with the copyright notice
DEFUN_DLD (foo, args, ,
"-*- texinfo -*-\n\
A block with the help text")
{
// some amazing code
}
/*
%!assert (foo (1))
%!assert (foo (1:10))
%!assert (foo ("on"), "off")
%!error <must be positive integer> foo (-1)
%!error <must be positive integer> foo (1.5)
*/
Assert
%!assert lines are the simplest one-line tests to write and also the most common:
%!assert (foo (bar)) # test fails if "foo (bar)" returns false
%!assert (foo (bar), qux) # test fails if "foo (bar)" is different from "qux"
These are actually a shorthand version of %!test assert (foo (bar)), and assert is simply an Octave function that throws an error when two arguments fail to compare. A tolerance can be added to assert to pass results that are numerically not exactly equal:
%!assert (pi, 3.14159) # test fails as "Abs err 2.6536e-06 exceeds tol 0 by 3e-06"
%!assert (pi, 3.14159, 1e-5) # test passes
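The tolerance passed to assert is treated as absolute when it is positive and as relative when it is negative. The sketch below illustrates both forms with plain calls to assert, followed by the equivalent one-line tests; the chosen values are only examples.

```octave
## Floating-point rounding: 0.1 + 0.2 is not exactly 0.3.
assert (0.1 + 0.2 != 0.3)

## Absolute tolerance (positive): passes if the absolute error is within tol.
assert (0.1 + 0.2, 0.3, eps)

## Relative tolerance (negative): passes if the error is within
## abs (tol) * abs (expected), which is useful when values are large.
assert (1e10 + 1, 1e10, -1e-9)

## The same checks written as one-line tests:
%!assert (0.1 + 0.2, 0.3, eps)
%!assert (1e10 + 1, 1e10, -1e-9)
```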
Error / Warning
It is also important to test that a function performs its checks correctly and throws errors (or warnings) when it receives garbage. This can be done with %!error (or %!warning) blocks:
%!error foo () # test that causes any error
%!error <BAR must be a positive integer> foo (-1.5) # test that throws specific error message
%!error id=Octave:invalid-fun-call foo () # test that throws specific error id
%!warning foo () # test that causes any warning
%!warning <negative values might give inaccurate results> foo (-1.5) # test that triggers a specific warning message
%!warning id=BAR:possibly-inaccurate-result foo (-1.5) # test that triggers a specific warning id
These are actually shorthand versions of %!test fail ("foo()", "error message") and %!test fail ("foo()", "warning", "warning message"), where the fail function returns true if the supplied code raises the expected error or warning.
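When writing such a test, it can help to first look at the message and identifier that Octave actually reports. The following sketch triggers a built-in error on purpose and prints both; the variable name err is arbitrary, and the identifier used in the last line is the one this snippet is expected to print for this particular error.

```octave
## Trigger the error on purpose and inspect what Octave reports.
try
  [1 2 3] + [1 2];
catch err
  printf ("identifier: %s\n", err.identifier);
  printf ("message:    %s\n", err.message);
end_try_catch

## The pattern between < and > is matched against the message as a regular
## expression, so a distinctive fragment of the message is enough:
%!error <nonconformant arguments> [1 2 3] + [1 2]
%!error id=Octave:nonconformant-args [1 2 3] + [1 2]
```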
Test Blocks
While single %!assert, %!error, and %!warning lines are the most commonly used tests, %!test blocks offer more features and flexibility. The code within %!test blocks is simply processed through the Octave interpreter. If the code generates an error, the test is said to fail. Often %!test blocks end with a call to assert:
%!test
%! a = [0 1 0 0 3 0 0 5 0 2 1];
%! b = [2 5 8 10 11];
%! for i = 1:5
%! assert (find (a, i), b(1:i))
%! endfor
Test for no failure
In a few cases, a function returns nothing, and the only thing to test is that it causes no error. This can be tested simply with:
%!test foo (bar)
Test for failure
If a warning or error message cannot be tested with one of the single-line tests mentioned above, the fail function can be used within a test block to verify expected error and warning functionality such as:
%!test
%! a = [1 2 3];
%! b = [1 2];
%! fail ("a + b", "nonconformant arguments")
%!test
%! a = 111;
%! b = 112;
%! fail ("['foo', a, b]", "warning", "implicit conversion from numeric to char")
The tests above pass if the errors/warnings occur as expected.
Shared functions
It is often useful to share a function among multiple tests. Sometimes these are only small helper functions, but more often these are just simpler low-performance implementations of the function being tested. These are created in %!function blocks:
%!function x = slow_foo (bar)
%! ## a simple implementation of foo, definitely correct, but
%! ## unfortunately too slow for anything other than tests.
%!endfunction
%!assert (foo (bar), slow_foo (bar))
%!test
%! for i = -100:100
%! bar = qux (i);
%! assert (foo (bar), slow_foo (bar))
%! endfor
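As a concrete illustration of the pattern above, the following sketch checks the built-in cumsum against a deliberately simple loop-based reference. The helper name slow_cumsum is hypothetical and exists only inside the test blocks.

```octave
%!function y = slow_cumsum (x)
%!  ## Straightforward loop-based reference implementation.
%!  y = zeros (size (x));
%!  total = 0;
%!  for k = 1:numel (x)
%!    total += x(k);
%!    y(k) = total;
%!  endfor
%!endfunction

%!assert (cumsum (1:5), slow_cumsum (1:5))

%!test
%! for n = 1:10
%!   x = (1:n) .^ 2;
%!   assert (cumsum (x), slow_cumsum (x))
%! endfor
```

The inputs are small integers, so both implementations produce exactly the same values and no tolerance is needed.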
Expected Failures and Known Bugs
It is often the case that a test is developed that should pass for proper function, but is known to fail and cannot be immediately fixed. In this case, the test can be included to provide documentation of the failure and expected behavior for future correction using either %!test or %!xtest as described below.
An %!xtest block is a test that is expected to fail. It is written and interpreted identically to a %!test block, with all of the same options available, but its failure does not interrupt testing as a failing %!test would. Total %!xtest failures are counted in the XFAIL total of the test summary as shown above.
The following test block:
%!test
%! assert (1+1, 2)
%!xtest
%! assert (1+1, 3)
will evaluate as:
***** xtest
assert(1+1, 3)
!!!!! known failure
ASSERT errors for: assert (1 + 1,3)
Location | Observed | Expected | Reason
() 2 3 Abs err 1 exceeds tol 0 by 1
PASSES 1 out of 2 test (1 known failure)
An optional message can be added after %!test and %!xtest blocks that will be displayed if the test fails. For example:
%!test <good math>
%! assert (1+1, 2)
%!xtest <bad math>
%! assert (1+1, 3)
which can also be written as single line tests as:
%!test <good math> assert (1+1, 2)
%!xtest <bad math> assert (1+1, 3)
Adding the message to the %!xtest block replaces the message:
!!!!! known failure
with the message:
!!!!! known bug: bad math
The message attached to a %!test block is displayed only if that test fails:
***** test <good math> assert (1+1, 3)
!!!!! known bug: good math
ASSERT errors for: assert (1 + 1,3)
Location | Observed | Expected | Reason
() 2 3 Abs err 1 exceeds tol 0 by 1
***** xtest <bad math> assert (1+1, 3)
!!!!! known bug: bad math
ASSERT errors for: assert (1 + 1,3)
Location | Observed | Expected | Reason
() 2 3 Abs err 1 exceeds tol 0 by 1
PASSES 0 out of 2 tests (2 known bugs)
If the <Message> is just an integer, Octave interprets this as a bug report id number that is expected to fail, and that %!test block is treated the same as an %!xtest block:
%!test <12345> assert (1+1, 3)
%!xtest <12345> assert (1+1, 3)
produces:
***** test <12345> assert (1+1, 3)
!!!!! known bug: https://octave.org/testfailure/?12345
ASSERT errors for: assert (1 + 1,3)
Location | Observed | Expected | Reason
() 2 3 Abs err 1 exceeds tol 0 by 1
***** xtest <12345> assert (1+1, 3)
!!!!! known bug: https://octave.org/testfailure/?12345
ASSERT errors for: assert (1 + 1,3)
Location | Observed | Expected | Reason
() 2 3 Abs err 1 exceeds tol 0 by 1
PASSES 0 out of 2 tests (2 known bugs)
A %!test block with a <Message> that is an integer preceded by an asterisk (*) is interpreted as a bug report id number where the bug has been fixed. Octave's build process automatically checks the status of bug reports and adds the "*" in all source files that contain tests tagged with bug numbers. If such a block fails in a later test run, it is flagged as a regression:
%!test <*12345> assert (1+1, 3)
produces:
***** test <*12345> assert (1+1, 3)
!!!!! regression: https://octave.org/testfailure/?12345
ASSERT errors for: assert (1 + 1,3)
Location | Observed | Expected | Reason
() 2 3 Abs err 1 exceeds tol 0 by 1
PASSES 0 out of 1 test
Self-test Developer Best Practices
- "Too many tests" is rarely a problem.
- Input Validation:
  - When developing a function, it is good practice to include an "Input Validation tests" section that includes %!error tests for every input combination expected to produce an error, including the expected <error message> when possible.
  - Input validation should test the number of inputs/outputs, input type/class, and any special handling (valid option names, values, etc.), including the possibility of "empty" and "NaN" inputs.
- Error coverage: Ideally every call to error() in the function would include a test to ensure it is correctly reached and executed when the condition occurs.
- Code path coverage - Include tests for every primary code function and major combination of inputs to reduce the number of future bugs reported by users.
- Tests should verify proper function output (or appropriately informative error message) produced by:
  - different input shapes - scalars, vectors, arrays, or empty ([], ones(0,1), and ones(1,0) are not necessarily the same!)
  - types - numbers, booleans, strings, cells, multi-level cells, cell strings, structs, etc.
  - contents - real or complex, NaN, Inf, etc.
- Functions that primarily rely on calling other functions with their own tests do not necessarily need to repeat all of the same tests. However, it may still be beneficial to do so if changes to the called function could produce errors that would not otherwise be caught by other tests.
- Floating point calculations can cause failure of %!assert tests expecting exact equality. Adding a tolerance on the order of eps or eps(variable) should rarely be a problem and is often sufficient to circumvent the issue. Depending on the mathematical operations, it may be appropriate to use a tolerance several orders of magnitude larger than eps, but care should be taken in setting arbitrarily large tolerances that could hide actual calculation errors.
- Tests that ensure code compatibility with Matlab are very valuable for reducing future bug reports for incompatible behavior. Because the behavior of Matlab functions can and often does change, and those changes often go unnoticed until user bug reports appear, it can be useful to note with a comment which version of Matlab the test was verified against (or what the latest release was if the compatibility test is based on the public-facing documentation). As always, please ensure that test code avoids the use of any copyrighted material.
- Because of the automatic processing and regression tracking, %!xtest should only be used when there is an expected failure that has no related bug report. It is preferred that all known bugs that cause function/test failures be reported at bugs.octave.org so that a bug ID# is generated and a %!test <12345> format block can be used instead. Tests added to confirm bug fixes should use a %!test <*12345> block for automated regression tracking.
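Several of the practices above can be sketched in one small function file. The function clamp01 below is hypothetical (it is not part of Octave) and is kept deliberately simple; it only illustrates how tests for different shapes, empty inputs, special values, and input validation might be laid out.

```octave
## clamp01.m: hypothetical example function, not part of Octave.
## Limits every element of X to the range [0, 1].
function y = clamp01 (x)
  if (nargin != 1)
    error ("Invalid call to clamp01");
  endif
  if (! isnumeric (x) || ! isreal (x))
    error ("clamp01: X must be a real numeric array");
  endif
  y = min (max (x, 0), 1);
endfunction

## Different shapes, empty inputs, and special values
%!assert (clamp01 (0.5), 0.5)
%!assert (clamp01 ([-1 0.5 2]), [0 0.5 1])
%!assert (clamp01 ([-1; 2]), [0; 1])
%!assert (clamp01 ([]), [])
%!assert (clamp01 (zeros (0, 3)), zeros (0, 3))
%!assert (clamp01 ([-Inf Inf]), [0 1])

## Input validation tests
%!error clamp01 ()
%!error clamp01 (1, 2)
%!error <X must be a real numeric array> clamp01 ("a")
%!error <X must be a real numeric array> clamp01 ({1})
%!error <X must be a real numeric array> clamp01 (1i)
```

Saved as clamp01.m on the load path, the tests can be run with test clamp01.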