Would you drive a car that has not successfully passed its quality control?
Like any engineered product, software should be subject to quality assurance.
Quality assurance (in SE) is the set of activities and practices aimed at ensuring that a software product works and is of good quality.
Insight: software works when it meets the requirements
Insight: software is good when it is easy for developers to evolve or maintain
Recall that good software should have many quality attributes
How to translate these attributes into quality assurance practices?
Verify that the software meets quality criteria.
Running an application manually is a form of testing: exploratory testing.
If there is a plan that can be followed step-by-step, then there is a program that can do it for you
Like any engineered product, software can be tested at different levels of abstraction
Unit testing: test single software components. Is the class (or function, or module) behaviour the expected one?
Integration testing: test an entire subsystem, i.e. the interplay among multiple components. Class A uses classes B and C. Are they working together as expected?
End-to-end (or acceptance) testing: test an entire system (may involve aesthetics/usability criteria)
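For instance, a minimal sketch of the first two levels in Python (the Cart and Checkout classes are hypothetical, defined inline just for illustration):
import unittest

# hypothetical components, defined inline for illustration
class Cart:
    def __init__(self):
        self.items = []
    def add(self, price, quantity):
        self.items.append(price * quantity)
    def total(self):
        return sum(self.items)

class Checkout:
    def __init__(self, cart):
        self.cart = cart
    def receipt(self):
        return f"TOTAL: {self.cart.total()}"

class TestCartUnit(unittest.TestCase):
    # unit test: exercises a single component (Cart) in isolation
    def test_total(self):
        cart = Cart()
        cart.add(price=10, quantity=2)
        self.assertEqual(20, cart.total())

class TestCheckoutIntegration(unittest.TestCase):
    # integration test: exercises the interplay between Checkout and Cart
    def test_receipt(self):
        cart = Cart()
        cart.add(price=10, quantity=2)
        self.assertEqual("TOTAL: 20", Checkout(cart).receipt())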
A well-maintained engineering product must have tests at all granularity levels
Creating automated test procedures makes the activity of testing very cheap (in terms of effort)
Being cheap, automated tests can serve as canaries in coal mines
Test failures are precious during development
The more granular the tests, the easier it is to spot and fix problems
Would you be comfortable with a car that passes the crash test 99.9% of the time, but in the remaining 0.1% of cases fails inexplicably?
Reproducibility is central for testing
(true for any engineering, but in particular for software)
(we will focus on Python, but the concepts are general)
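For instance (a minimal sketch, with hypothetical names): a test whose outcome depends on uncontrolled randomness or on the current time is not reproducible; making the source of non-determinism explicit and controllable (e.g. by fixing the seed) restores reproducibility.
import random
import unittest

class TestShuffleReproducibility(unittest.TestCase):
    def shuffled(self, seed):
        rng = random.Random(seed)  # explicit, controllable source of randomness
        data = [1, 2, 3, 4, 5]
        rng.shuffle(data)
        return data

    def test_same_seed_same_outcome(self):
        # the same seed yields the same permutation, run after run:
        # the test behaves identically on every machine, every time
        self.assertEqual(self.shuffled(42), self.shuffled(42))
        # had we used the global, unseeded random.shuffle instead,
        # the outcome would vary across runs, making failures hard to reproduce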
the source code can now be conceived as composed of two parts: the main code and the test code
the test code is usually placed in a separate folder, usually named tests/ (or test/)
the dependencies of the project are now of two sorts: the main dependencies and the development (dev) dependencies
developers may now want to launch not only the software, but also the tests
root_directory/
├── main_package/            # main package (i.e. directory for the main code)
│   ├── __init__.py
│   ├── sub_module.py
│   └── sub_package/
│       ├── __init__.py
│       └── sub_sub_module.py
├── tests/                   # directory for the test code
│   ├── test_module_1.py
│   ├── ...
│   └── test_module_N.py
├── .python-version
├── README.md
├── requirements-dev.txt     # file to list *development* dependencies
└── requirements.txt         # file to list *main* dependencies
Important conventions:
all the test code should be placed in a directory named tests/ (or test/)
the test code should be put into .py files whose names start with test_
requirements.txt is for the main dependencies, requirements-dev.txt is for the dev dependencies
requirements.txt example:
Kivy>=2.3.0
requirements-dev.txt example:
-r requirements.txt
pytest>=8.1.0
System under test (SUT): the component of the software that is being tested (e.g. a class, a function, a module)
Test case: a class that contains the test functions for a specific SUT
Test suite: a collection of test cases, commonly related to similar SUTs (e.g. one test_*.py file)
Assertion: a boolean (i.e. either True or False) check about the SUT
if True, the assertion passes, and the test proceeds
if False, the test fails, and it is interrupted
Test procedure: a sequence of actions and assertions about some SUT
a test procedure passes if all its assertions are True and no unexpected error occurs
We adopt unittest, a built-in library for writing tests in Python
it is inspired by the JUnit library for Java
pytest is a popular alternative (but it needs to be installed)
unittest
Let’s assume this is the test_my_system.py test suite (full code here)
⬇️
import unittest
from my_system import MySystem  # hypothetical import: MySystem is the example SUT

# first test case
class TestMySystemUnderOrdinaryConditions(unittest.TestCase):
    # initialization activities (most commonly, just initialises the SUT)
    def setUp(self):
        # activities to be performed BEFORE EACH test procedure
        self.sut = MySystem()  # SUT instantiation

    # test procedure 1
    def test_initial_condition(self):
        self.assertEqual(self.sut.my_attribute, 123)  # assertion (my_attribute is initially 123)
        self.assertEqual(self.sut.other_attribute, "foo")  # assertion (other_attribute is initially "foo")
        self.assertTrue(self.sut.is_ready())  # assertion (function is_ready returns True)

    # test procedure 2
    def test_do_something(self):
        self.sut.do_something()  # legitimate action
        self.assertEqual(self.sut.my_attribute, 124)  # assertion (my_attribute is 124 after do_something)
        self.assertEqual(self.sut.other_attribute, "bar")  # assertion (other_attribute is "bar" after do_something)
        self.assertFalse(self.sut.is_ready())  # assertion (function is_ready returns False after do_something)

    # test procedure 3
    def test_do_something_bad(self):
        with self.assertRaises(ValueError):  # assertion (do_something_bad raises ValueError)
            self.sut.do_something_bad()  # illegitimate action

    # you can put as many test procedures as you want

    # cleaning-up activities (most commonly omitted, i.e. nothing to do)
    def tearDown(self):
        # activities to be performed AFTER EACH test procedure
        self.sut.shutdown()  # legitimate action

# second test case
class TestMySystemUnderSpecialConditions(unittest.TestCase):
    ...  # put other test procedures here

# you can put as many test cases as you want
unittest test suites
Many assertion functions, cf.: https://docs.python.org/3/library/unittest.html#assert-methods
Many options to customise/parametrise your test suites, cf. https://docs.python.org/3/library/unittest.html
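For instance, a few of the most commonly used assertion methods (a minimal sketch; cf. the documentation above for the full list):
import unittest

class TestAssertionShowcase(unittest.TestCase):
    def test_common_assertions(self):
        self.assertEqual(2 + 2, 4)                  # equality
        self.assertNotEqual(2 + 2, 5)               # inequality
        self.assertTrue("abc".startswith("a"))      # truthiness
        self.assertIn(3, [1, 2, 3])                 # membership
        self.assertIsNone(None)                     # identity with None
        self.assertAlmostEqual(0.1 + 0.2, 0.3)      # float equality, up to rounding
        with self.assertRaises(ZeroDivisionError):  # expected exception
            1 / 0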
How to run tests:
python -m unittest discover -v -s tests
-v stands for verbose (i.e. more detailed output)
-s stands for start directory (i.e. the directory where the tests are, in this case tests)
Effect of running all tests with subcommand discover:
all the test_*.py files in the tests/ directory (and its sub-directories) are loaded
all the classes extending unittest.TestCase in those files are instantiated
all the functions whose name starts with test_ are executed
the setUp function is executed before each test function
the tearDown function is executed after each test function
unittest
Fork the following repository: https://github.com/unibo-dtm-se/testable-calculator
Clone the forked repository on your machine
Open VS Code in the testable-calculator directory
Restore both dependencies and dev-dependencies
pip install -r requirements-dev.txt
unittest
Minimalistic: python -m unittest discover -s tests
..............
----------------------------------------------------------------------
Ran 14 tests in 0.478s
OK
(each dot represents a successful test procedure… not really clear, right?)
Verbose: python -m unittest discover -v -s tests
(notice option -v)
test_cli_with_invalid_expression (test_cli.TestCalculatorCli.test_cli_with_invalid_expression) ... ok
test_cli_with_single_expression (test_cli.TestCalculatorCli.test_cli_with_single_expression) ... ok
test_cli_with_sliced_expression (test_cli.TestCalculatorCli.test_cli_with_sliced_expression) ... ok
[...]
test_expression_insertion (test_model.TestCalculatorUsage.test_expression_insertion) ... ok
----------------------------------------------------------------------
Ran 14 tests in 0.447s
OK
(one test per line: clearer)
unittest
Before:
After:
(if you cannot find the Test section, look at the next slide)
You probably have an old version of VS Code, and you should update it
⬇️ Meanwhile, you can follow this workaround ⬇️
Go to the Extensions section of VS Code
In the search bar of the Extensions section, type python tests
the first result should be the Python extension by Microsoft
the second result should be the Python Test Explorer extension by Little Fox Team
while installing, VS Code may look like this
Once the installation is complete, you should see the Test section in the Activity Bar on the side of the window
unittest
Have a look at the tests/test_model.py file and listen to the teacher explanation
this test suite tests the Calculator class ⬇️
import unittest
from calculator import Calculator

# test case testing what the effect of each method of the Calculator class is
# when executed on a fresh new Calculator instance
class TestCalculatorMethods(unittest.TestCase):
    def setUp(self):
        # here we create one "virgin" instance of the Calculator class (our SUT)
        self.calculator = Calculator()

    def test_initial_expression_is_empty(self):
        # here we ensure the expression of a virgin Calculator is empty
        self.assertEqual("", self.calculator.expression)

    def test_digit(self):
        # here we ensure that the digit method effectively appends one digit to the Calculator expression
        self.calculator.digit(1)
        self.assertEqual("1", self.calculator.expression)

    def test_plus(self):
        # here we ensure that the plus method effectively appends one "+" symbol to the Calculator expression
        self.calculator.plus()
        self.assertEqual("+", self.calculator.expression)

    def test_minus(self):
        # here we ensure that the minus method effectively appends one "-" symbol to the Calculator expression
        self.calculator.minus()
        self.assertEqual("-", self.calculator.expression)

    def test_multiply(self):
        # here we ensure that the multiply method effectively appends one "*" symbol to the Calculator expression
        self.calculator.multiply()
        self.assertEqual("*", self.calculator.expression)

    def test_divide(self):
        # here we ensure that the divide method effectively appends one "/" symbol to the Calculator expression
        self.calculator.divide()
        self.assertEqual("/", self.calculator.expression)

# test case testing the usage of the Calculator class
class TestCalculatorUsage(unittest.TestCase):
    def setUp(self):
        # here we create one "virgin" instance of the Calculator class (our SUT)
        self.calculator = Calculator()

    def test_expression_insertion(self):
        # here we simulate the insertion of a simple expression, one symbol at a time...
        self.calculator.digit(1)
        self.calculator.plus()
        self.calculator.digit(2)
        # ... and we ensure the expression is as expected
        self.assertEqual("1+2", self.calculator.expression)

    def test_compute_result(self):
        # here we simulate the insertion of an expression "as a whole",
        # by setting the expression attribute of a virgin Calculator
        self.calculator.expression = "1+2"
        # ... and we ensure the compute_result method evaluates the expression as expected
        self.assertEqual(3, self.calculator.compute_result())

    def test_compute_result_with_invalid_expression(self):
        # here we simulate the insertion of an invalid expression "as a whole"...
        self.calculator.expression = "1+"
        with self.assertRaises(ValueError) as context:
            # ... and we ensure the compute_result method raises a ValueError in such situation
            self.calculator.compute_result()
        # ... and we also ensure that the exception message carries useful information
        self.assertEqual("Invalid expression: 1+", str(context.exception))
unittest
Try to run tests via the terminal and via VS Code
Let’s now simulate the scenario where tests are failing (e.g. due to buggy code)
Modify the Calculator class in file calculator/__init__.py to introduce a bug, e.g. by changing the __init__ function as follows:
def __init__(self):
    self.expression = "0"  # bug: the expression is not initially empty
Run the tests again: many tests should now fail
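The verbose output should contain failure blocks similar to the following (exact paths and line numbers will differ):
FAIL: test_initial_expression_is_empty (test_model.TestCalculatorMethods.test_initial_expression_is_empty)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test_model.py", line ..., in test_initial_expression_is_empty
    self.assertEqual("", self.calculator.expression)
AssertionError: '' != '0'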
unittest
Have a look at the tests/test_gui.py file and listen to the teacher explanation
this test suite tests the CalculatorApp class
notice that tests are based on a custom base class (namely CalculatorGUITestCase), which adds:
custom actions (e.g. press_button(button_name))
custom assertions (e.g. assert_display(expected_text)) ⬇️
import unittest
from calculator.ui.gui import CalculatorApp

# this is not a test case!
# it is a way to add custom actions, assertions, initialisation/clean-up activities to other test cases
class CalculatorGUITestCase(unittest.TestCase):
    # default initialization activity (create & start the GUI, i.e. our SUT)
    def setUp(self):
        self.app = CalculatorApp()  # create the GUI
        self.app._run_prepare()  # start the GUI

    # re-usable action: presses a button on the GUI, given the button's text
    def press_button(self, button_text):
        self.app.find_button_by(button_text).trigger_action()

    # re-usable assertion: checks the text displayed on the GUI is equal to the provided one
    def assert_display(self, expected_text):
        self.assertEqual(self.app.display.text, expected_text)

    # default cleaning-up activity (stop the GUI)
    def tearDown(self):
        self.app.stop()
unittest
Have a look at the tests/test_gui.py file and listen to the teacher explanation
this test suite tests the CalculatorApp class
Notice that tests are based on a custom base class (namely CalculatorGUITestCase), which adds:
custom actions (e.g. press_button(button_name))
custom assertions (e.g. assert_display(expected_text))
In particular, have a look at the TestExpressions test case, and listen to the teacher explanation
⬇️
# this is a test case! (based upon the aforementioned base class)
class TestExpressions(CalculatorGUITestCase):
    # test procedure: inserting and evaluating a simple integer expression
    def test_integer_expression(self):
        # insert symbols "1", "+", "2"
        self.press_button("1")
        self.press_button("+")
        self.press_button("2")
        # check the display shows "1+2"
        self.assert_display("1+2")
        # press the "=" button
        self.press_button("=")
        # check the display shows "3"
        self.assert_display("3")

    # test procedure: inserting and evaluating a simple float expression
    def test_float_expression(self):
        self.press_button("1")
        self.press_button(".")
        self.press_button("2")
        self.press_button("+")
        self.press_button("2")
        self.assert_display("1.2+2")
        self.press_button("=")
        self.assert_display("3.2")
The CalculatorApp class’s public API has been extended with further functionalities:
find_button_by(text): a function returning the button widget with the given text
display: an attribute referencing the display widget (it is now public)
Before:
After:
find_button_by(text: str): necessary to simulate button presses in the tests
_browse_children(container): a private functionality, necessary to implement find_button_by
display: necessary to make assertions about the displayed text in the tests
How these novel functionalities are implemented in practice is not that relevant, but here it is:
class CalculatorApp(App):
    # yields all the widgets directly or indirectly contained in the given container
    def _browse_children(self, container):
        yield container
        if hasattr(container, 'children'):
            for child in container.children:
                yield from self._browse_children(child)

    # returns the first widget in the GUI which 1. is a button and 2. whose text is equal to the given one
    def find_button_by(self, text) -> Button:
        for widget in self._browse_children(self.root):
            if isinstance(widget, Button) and widget.text == text:
                return widget

    def build(self):
        # ... (unchanged)
        self.display = Label(text="0", font_size=24, size_hint=(3, 1))
        # ... (unchanged)

# the rest of the class is unchanged
Take-away: when writing post-hoc tests (i.e., after the main code has already been written), it is often necessary to extend the public API of the SUT to make its internal state and functioning observable and controllable from the outside, and therefore testable
If you read them at the adequate abstraction level, each test case is telling a story about the SUT
e.g. TestCalculatorMethods is telling the story of the Calculator class:
what a Calculator object looks like when it is freshly instantiated
how a Calculator object behaves when it is used to build an expression
how a Calculator object behaves when it is used to evaluate an expression
e.g. TestExpressions is telling the story of the CalculatorApp class (i.e. the GUI)
Take-away: the story you can picture in your mind when reading a test is a way to describe the test plan that the designer of the test suite was envisioning when writing the tests
Focus on the test_gui.py file
Add one more test case for the GUI, say TestLayout, which ensures that:
the display initially shows 0
there is a button for each of the following symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, +, -, *, /, =, C, .
Hints:
exploit the CalculatorGUITestCase class and its functionalities
consider adding a custom assertion, say assert_button_exists, to CalculatorGUITestCase
One possible solution is in the next slide
⬇️ (please resist the temptation, and try to solve the exercise before looking at the solution) ⬇️
(also available on branch exercises/01-test-layout of the testable-calculator repository)
class CalculatorGUITestCase(unittest.TestCase):
    # rest of the class is unchanged

    def assert_button_exists(self, button_text):
        self.assertIsNotNone(self.app.find_button_by(button_text))

class TestLayout(CalculatorGUITestCase):
    buttons_to_test = {
        'C',
        '7', '8', '9', '/',
        '4', '5', '6', '*',
        '1', '2', '3', '-',
        '.', '0', '=', '+',
    }

    def test_initial_display(self):
        self.assert_display("0")

    def test_buttons(self):
        for button_text in self.buttons_to_test:
            with self.subTest(button=button_text):
                self.assert_button_exists(button_text)
What is the purpose of subTest?
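A hint (a minimal sketch, deliberately failing): without subTest, the first failing assertion aborts the whole test procedure, so only one failure is reported; with subTest, every iteration of the loop is checked independently, and each failure is reported separately.
import unittest

class TestSubTestDemo(unittest.TestCase):
    def test_without_subtest(self):
        for n in [0, 1, 2, 3]:
            self.assertLess(n, 2)  # stops at n == 2: one failure reported, n == 3 never checked

    def test_with_subtest(self):
        for n in [0, 1, 2, 3]:
            with self.subTest(n=n):
                self.assertLess(n, 2)  # keeps going: failures reported for both n == 2 and n == 3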
Testing should be planned for in advance
A good test plan can guide the development, and should be ready early in the project
To plan tests, one might try to convert the requirements’ acceptance criteria into test cases
To plan unit tests, one might try to create test cases covering each aspect of the public API of the SUT
When designing cars, the crash testing procedure, the engine test bench, and so on are prepared well before the car prototype is ready!
Test-driven development (TDD) is the practice of writing tests before (and while) writing the code they are testing
Key-point: in TDD, tests are not only a form of validation, but also a form of specification
Customers ask for new features in the calculator:
- possibility to write expressions with parentheses (e.g. (1+2)*3)
- possibility to write expressions with the square root function (e.g. sqrt(4))
- possibility to write expressions with the power function (e.g. 2**3)
Extend the model’s test suite (i.e. file test_model.py, which aims at testing the Calculator class)
any extension of the Calculator class’s public API should be envisioned (not realised)
Extend the GUI’s test suite (i.e. file test_gui.py, which aims at testing the CalculatorApp class, i.e. the GUI)
Launch your tests: it’s OK if novel tests fail at this stage
(also available on branch exercises/02-tdd-before-impl of the testable-calculator repository)
Test suite for the model (i.e. test_model.py)
# other test cases are unchanged

class TestComplexExpressions(unittest.TestCase):
    def setUp(self):
        self.calculator = Calculator()

    def test_expression_with_parentheses(self):
        self.calculator.open_parenthesis()
        self.calculator.digit(1)
        self.calculator.plus()
        self.calculator.digit(2)
        self.calculator.close_parenthesis()
        self.calculator.multiply()
        self.calculator.digit(3)
        self.assertEqual("(1+2)*3", self.calculator.expression)
        self.assertEqual(9, self.calculator.compute_result())

    def test_expression_with_sqrt(self):
        self.calculator.digit(1)
        self.calculator.plus()
        self.calculator.square_root()
        self.calculator.open_parenthesis()
        self.calculator.digit(1)
        self.calculator.digit(1)
        self.calculator.minus()
        self.calculator.digit(2)
        self.calculator.close_parenthesis()
        self.assertEqual("1+sqrt(11-2)", self.calculator.expression)
        self.assertEqual(4, self.calculator.compute_result())

    def test_expression_with_pow(self):
        self.calculator.open_parenthesis()
        self.calculator.digit(1)
        self.calculator.plus()
        self.calculator.digit(1)
        self.calculator.close_parenthesis()
        self.calculator.power()
        self.calculator.digit(3)
        self.assertEqual("(1+1)**3", self.calculator.expression)
        self.assertEqual(8, self.calculator.compute_result())
(also available on branch exercises/02-tdd-before-impl of the testable-calculator repository)
Test suite for the GUI (i.e. test_gui.py)
class TestExpressions(CalculatorGUITestCase):
    # other test methods are unchanged

    def test_expression_with_parentheses(self):
        self.press_button("(")
        self.press_button("1")
        self.press_button("+")
        self.press_button("2")
        self.press_button(")")
        self.press_button("*")
        self.press_button("3")
        self.assert_display("(1+2)*3")
        self.press_button("=")
        self.assert_display("9")

    def test_expression_with_sqrt(self):
        self.press_button("sqrt")
        self.press_button("4")
        self.press_button(")")
        self.assert_display("sqrt(4)")
        self.press_button("=")
        self.assert_display("2.0")

    def test_expression_with_pow(self):
        self.press_button("2")
        self.press_button("**")
        self.press_button("3")
        self.assert_display("2**3")
        self.press_button("=")
        self.assert_display("8")
Now it’s time to implement the new features
In any case, once you are done, commit & push
One possible solution is on the exercises/02-tdd-after-impl branch of the testable-calculator repository
Developing without testing is unsustainable
Yet many software projects have no or minimal tests, as:
Common misconception: We do not have time (or money) for testing
Beware: testing saves time in the long run; not testing is a cost!
Technical debt is a concept in software development that reflects the implied cost of additional rework caused by choosing an easy (limited) solution now instead of using a better approach that would take longer.
Not writing tests ASAP greatly increases the technical debt
Development is never really finished in real software projects
TDD practices help in keeping the technical debt under control
we never have the money to do it right but somehow we always have the fucking money to do it twice
— UserInputSucks (@UserInputSucks) May 27, 2019
Decreasing preference order:
Ideal situation: always writing tests during design, before implementation
Common situation: design and implement, then write tests
Barely tolerable situation: design and implement, only add tests upon bugs
Very bad situation: never write tests
When a new bug (or a regression, namely a feature that was working and is now compromised) is discovered, resist the temptation to “fix” the issue right away
A more robust approach: first write a test case that reproduces the issue, then fix the code until that test passes
Motivations:
- the new test case prevents the issue from being mistakenly re-introduced in the future
- developing the test case before the fix helps the debugging process
Problem: how is it possible to test code that does not exist?
Clean boundaries: the component must have a well-defined interface with the rest of the world.
Clear scope: well engineered (software) components usually do one thing well.
How to test a new suspension system if the “surrounding” car is not ready (not even fully designed) yet?
How to test that our new rocket engine works as expected with no rocket?
How to test that our multi-engine rocket works as expected without payload?
The trick: simulate components that are not ready yet!
When writing software, components required for the execution that are not ready yet can be simulated if their API has been clearly defined
The simulated components are called test doubles
dummy: a (usually unimplemented) placeholder (e.g., unused mandatory argument)
stub: partly implemented dummy
spy: a stub that tracks information of the way it is being used
mock: a spy that expects to be used in a certain way, and fails if the expectation is unmet
fake: a fully implemented version of the component unsuitable for production
Why should the team “waste” time creating doubles instead of just writing the thing?
doubles are cheaper: dedicated libraries make implementing doubles extremely quick
in Python, unittest.mock is included in the standard distribution, and Doubles is a valid alternative
doubles are simpler: they only encode the behaviour required to check some part of the SUT’s behaviour
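For instance, a minimal sketch with unittest.mock (the Dashboard class and the weather-service API are hypothetical):
import unittest
from unittest.mock import Mock

# hypothetical SUT: depends on a weather service that may not be implemented yet
class Dashboard:
    def __init__(self, service):
        self.service = service

    def headline(self, city):
        return f"{city}: {self.service.temperature(city)}°C"

class TestDashboard(unittest.TestCase):
    def test_headline_queries_the_service(self):
        service = Mock()                        # the test double
        service.temperature.return_value = 21   # stubbed behaviour
        sut = Dashboard(service)
        self.assertEqual("Bologna: 21°C", sut.headline("Bologna"))
        # spy/mock aspect: verify how the double was used
        service.temperature.assert_called_once_with("Bologna")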
Test-driven development is a practice that can help in keeping the technical debt under control
Tests act as a form of validation (ex-post), specification (ex-ante), and as sentinels (along the way)
Designing and implementing tests is a project-in-the-project
Patterns and strategies exist to design / implement tests, e.g. test doubles
Time for testing should be allocated in the project plan
Code coverage is a set of metrics that measure how much of the source code of a program has been executed when testing.
Common metrics: statement (or line) coverage, branch coverage, function coverage
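For instance (a minimal sketch): the single test below executes every line of absolute, so line coverage is 100%, but the branch where the condition is false is never taken, so branch coverage is not.
import unittest

def absolute(x):
    if x < 0:
        x = -x
    return x

class TestAbsolute(unittest.TestCase):
    def test_negative(self):
        # executes all lines of absolute (100% line coverage), yet the
        # "condition is False" path (e.g. absolute(5)) is never exercised:
        # branch coverage would report it as missing
        self.assertEqual(5, absolute(-5))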
Notice the coverage (dev) dependency in the requirements-dev.txt file
pip install -r requirements-dev.txt
Run the tests while measuring coverage: coverage run -m unittest discover -v -s tests/
notice the usage of coverage run instead of python
equivalently: python -m coverage run -m unittest discover -v -s tests/
Check the coverage report in the terminal: coverage report -m
the output should be similar to the one below:
Name Stmts Miss Cover Missing
------------------------------------------------------
calculator\__init__.py 38 3 92% 13, 39, 48
calculator\ui\cli.py 21 3 86% 16-17, 27
calculator\ui\gui.py 57 8 86% 55-57, 61, 63, 65, 69, 76
tests\test_cli.py 15 0 100%
tests\test_gui.py 30 0 100%
tests\test_model.py 38 0 100%
------------------------------------------------------
TOTAL 199 14 93%
pretty obscure, isn’t it?
Let’s try to create a more pleasant report, in HTML format: coverage html
this creates an htmlcov folder in the current directory
the htmlcov/index.html file is a static Web page, reporting the coverage of your project
Open the htmlcov/index.html file in your browser (any of the following may work)
on Windows: start .\htmlcov\index.html
on macOS: open htmlcov/index.html
on Linux: xdg-open htmlcov/index.html
alternatively, double-click the htmlcov/index.html file in your file manager (Explorer / Finder / Dolphin / Nautilus, etc.)
You should see an overview similar to the terminal one
If you click on a file, you may get a line-by-line report of test coverage
the actual information coverage provides is which code is partly tested or untested!
we know nothing about the quality of the tests on the covered part, only that control flow goes through it
Useful metric, but it cannot be the only metric to evaluate testing
Use coverage as a hint for reasoning about what to test next
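For instance (a minimal sketch): the test below executes every line of divide, achieving full coverage, yet it asserts nothing, so the bug goes unnoticed.
import unittest

def divide(a, b):
    return a * b  # bug: multiplies instead of dividing

class TestDivide(unittest.TestCase):
    def test_divide_is_covered_but_unchecked(self):
        divide(10, 2)  # full coverage of divide, but no assertion: the test passes
        # a meaningful test would assert on the result, and fail:
        # self.assertEqual(5, divide(10, 2))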
Use coverage to spot the untested parts of the testable-calculator project
Add tests which cover the untested parts
After you reach 100% coverage (or close), ask yourself:
“It works” is not good enough
(besides, the very notion of “it works” is debatable)
Syntactical correctness is the first level of quality assurance:
“Is the code well-formed, i.e. readable for a computer?”
In compiled languages, the compiler checks for syntactical correctness
Python is an interpreted language, so there is no compiler checking the code ahead of time
Syntactical correctness can be checked in Python by means of:
the compileall standard (≈ included in Python by default) module
python -m compileall CODE_DIRECTORY_1 CODE_DIRECTORY_2 ...
In the testable-calculator project, run python -m compileall calculator tests
you may notice an output similar to the following one:
Listing 'calculator'...
Listing 'calculator\\ui'...
Listing 'tests'...
which means that all the .py files in those directories were checked, and they are syntactically correct
Try to artificially add some syntax error to some Python file
Run python -m compileall calculator tests again
Code analysis without execution is called static analysis.
Static analysis tools are often referred to as linters (especially those providing auto-formatting tools)
Idiomatic and standardized code:
Identification and reporting of patterns known to be problematic
mypy
Notice the mypy (dev) dependency in the requirements-dev.txt file
pip install -r requirements-dev.txt
Run mypy on the testable-calculator project
mypy calculator tests
calculator\__init__.py:13: error: Unsupported operand types for + ("str" and "int") [operator]
calculator\__init__.py:44: error: Parameterized generics cannot be used with class or instance checks [misc]
calculator\__init__.py:44: error: Argument 2 to "isinstance" has incompatible type "<typing special form>"; expected "_ClassInfo" [arg-type]
calculator\__init__.py:46: error: Returning Any from function declared to return "int | float" [no-any-return]
Found 4 errors in 1 file (checked 6 source files)
mypy
What are those errors?
Listen to the teacher explanation about the meaning of those errors
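For instance, the first error above is typically produced by code shaped like the following (a reconstruction for illustration, not the actual project code):
def error_message(expression: str, position: int) -> str:
    # mypy: Unsupported operand types for + ("str" and "int")  [operator]
    return "invalid symbol in " + expression + " at " + position
    # fix: convert the int explicitly, e.g.
    # return "invalid symbol in " + expression + " at " + str(position)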
DRY: Don’t Repeat Yourself
General advice: never copy-paste your code
Instead of copy-pasting code, write a parametric function/class/module which can be re-used
Before (reliance on copy-paste):
def test_my_gui(self):
    self.sut.find_button_by("1").trigger_action()
    self.sut.find_button_by("2").trigger_action()
    self.sut.find_button_by("3").trigger_action()
    self.sut.find_button_by("4").trigger_action()
    self.sut.find_button_by("5").trigger_action()
After refactor (no more duplication):
def press_button(self, text):
    self.sut.find_button_by(text).trigger_action()

def test_my_gui(self):
    for i in range(1, 6):
        self.press_button(str(i))
Multi-language tool: Copy/Paste Detector (CPD) (part of PMD)
There exist a number of recommended services that provide additional QA and reports.
Non-exhaustive list: