Graphics tests
**************

The only practical way of testing plotting functionality is to check actual
output plots.
For this, a basic 'graphics test' assertion operation is provided in the
method :meth:`iris.tests.IrisTest.check_graphic`: this tests plotted output
for a match against a stored reference.
A "graphics test" is any test which employs this.
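
As a concrete illustration, a graphics test is an ordinary test method which
draws a plot and then calls ``check_graphic``. The following is a minimal
sketch only: the cube built here is invented for illustration and is not a
real fixture from the Iris test suite.

.. code-block:: python

    import numpy as np

    import iris
    import iris.plot as iplt
    import iris.tests as tests
    from iris.coords import DimCoord


    class TestSimplePlot(tests.IrisTest):
        def test_contourf(self):
            # Build a small 2D cube, purely for illustration.
            lat = DimCoord(np.linspace(-45, 45, 3),
                           standard_name='latitude', units='degrees')
            lon = DimCoord(np.linspace(0, 90, 4),
                           standard_name='longitude', units='degrees')
            cube = iris.cube.Cube(np.arange(12.0).reshape(3, 4),
                                  standard_name='air_temperature', units='K',
                                  dim_coords_and_dims=[(lat, 0), (lon, 1)])
            # Draw the plot, then check the resulting figure against the
            # stored reference result.
            iplt.contourf(cube)
            self.check_graphic()
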
At present (Iris version 1.10), such tests include the tests for the modules
`iris.tests.test_plot` and `iris.tests.test_quickplot`, and also some other
'legacy' style tests (as described in :ref:`developer_tests`).
New graphics tests of this sort can still be added.
However, as graphics tests are inherently "integration" style rather than
true unit tests, results can differ with the installed versions of dependent
libraries (see below), so adding them is not recommended except where no
alternative is practical.

Testing actual plot results introduces some significant difficulties:

 * Graphics tests are inherently 'integration' style tests, so results will
   often vary with the versions of key dependencies, i.e. the exact versions
   of third-party modules which are installed: obviously, results will
   depend on the matplotlib version, but they can also depend on numpy and
   other installed packages.
 * Although it seems possible in principle to accommodate 'small' result
   changes by distinguishing plots which are 'nearly the same' from those
   which are 'significantly different', in practice no *automatic* scheme
   for this can be perfect: any calculated tolerance in output matching
   will allow some changes which a human would judge as a significant
   error.
 * Storing a variety of alternative 'acceptable' results as reference images
   can easily lead to uncontrolled increases in the size of the repository,
   given multiple independent sources of variation.


Graphics Testing Strategy
=========================

Prior to Iris 1.10, all graphics tests compared against a stored reference
image, with a small tolerance on pixel values.

From Iris v1.11 onward, we want to support testing Iris against multiple
versions of matplotlib (and some other dependencies).
To make this manageable, we have now rewritten ``check_graphic`` to allow
multiple alternative 'correct' results without including many more images
in the Iris repository.
This consists of:

 * using a perceptual 'image hash' of the outputs (see
   https://github.com/JohannesBuchner/imagehash) as the basis for checking
   test results, as sketched after this list.
 * storing the hashes of 'known accepted results' for each test in a
   database in the repo (stored in
   ``lib/iris/tests/results/imagerepo.json``).
 * storing associated reference images for each hash value in a separate
   public repository, currently
   https://github.com/SciTools/test-images-scitools, allowing human-eye
   judgement of 'valid equivalent' results.
 * a new version of the script ``iris/tests/idiff.py``, which assists in
   comparing proposed new 'correct' result images with the existing
   accepted ones.
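
The following is a minimal sketch of the kind of image-hash comparison this
implies, calling the ``imagehash`` library directly; the tolerance value is
an illustrative assumption, not the one used by ``check_graphic`` itself.

.. code-block:: python

    import imagehash
    from PIL import Image


    def images_match(result_png, accepted_hash_hex, max_distance=2):
        """Compare a result image against one accepted perceptual hash."""
        # Perceptual hash of the newly-produced plot image.
        result_hash = imagehash.phash(Image.open(result_png))
        # Rebuild the stored hash from its hexadecimal string form.
        accepted_hash = imagehash.hex_to_hash(accepted_hash_hex)
        # Subtracting two hashes gives a Hamming distance: a small
        # distance means the two images are perceptually similar.
        return (result_hash - accepted_hash) <= max_distance
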

The remainder of this document describes the workflow in enough detail to
allow an Iris developer to:

 * understand the new graphics test process
 * understand the steps to take, and the tools to use, to add a new
   graphics test
 * understand the steps to take, and the tools to use, to diagnose and fix
   a graphics test failure


Basic workflow
==============

If you notice that a graphics test in the Iris testing suite has failed
following changes in Iris or any of its dependencies, this is the process
you now need to follow:

#. Create a directory in ``iris/lib/iris/tests`` called
   ``result_image_comparison``.
#. From your Iris root directory, run the tests with the command
   ``python setup.py test``.
#. Navigate to ``iris/lib/iris/tests`` and run the command
   ``python idiff.py``.
   This will open a window for you to visually inspect the changes to the
   graphic, and then either accept or reject the new result.
#. Upon acceptance of a change or a new image, a copy of the output PNG
   file is added to the reference image repository in
   https://github.com/SciTools/test-images-scitools. The file is named
   according to the image hash value, as ``<hash>.png``.
#. The hash value of the new result is added into the relevant set of
   'valid result hashes' in the image result database file,
   ``tests/results/imagerepo.json`` (see the sketch following these steps).
#. The tests must now be re-run, and the 'new' result should be accepted.
   Occasionally there are several graphics checks in a single test, only
   the first of which will be run should it fail. If this is the case, you
   may well encounter further graphics test failures in your next runs,
   and you must repeat the process until all the graphics tests pass.
#. To add your changes to Iris, you need to make two pull requests. The
   first should be made to the test-images-scitools repository, and should
   contain all the newly-generated PNG files copied into the folder named
   ``image_files``.
#. The second pull request should be created in the Iris repository, and
   should only include the change to the image results database
   (``tests/results/imagerepo.json``).
   This pull request must contain a reference to the matching one in
   test-images-scitools.
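
As a sketch of what the database update amounts to, the snippet below
inspects ``imagerepo.json``; the key and layout shown in the comment are
assumptions for illustration, not a guaranteed file format.

.. code-block:: python

    import json

    with open('lib/iris/tests/results/imagerepo.json') as fin:
        repo = json.load(fin)

    # Each entry maps a test's identity to its accepted result hashes,
    # e.g. (values invented):
    #   "iris.tests.test_plot.TestContourf.test_simple":
    #       ["fa91c2e03c5d...", "8e30b17a42f9..."]
    # Accepting a new result appends one more hash to the relevant list.
    for test_id, accepted in sorted(repo.items()):
        print('{}: {} accepted result(s)'.format(test_id, len(accepted)))
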

Note: the Iris pull request will not pass its Travis tests until the
test-images-scitools pull request has been merged. This is because there
is an Iris test which ensures the existence of the reference images (their
URIs) for all the targets in the image results database.
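
A rough sketch of the kind of consistency check that note describes is
shown below: every hash in the database should have a matching reference
image in the test-images-scitools repository. The URL template here is an
assumption for illustration only.

.. code-block:: python

    import json

    import requests

    # Assumed location of the reference images, named '<hash>.png'.
    IMAGE_URL = ('https://raw.githubusercontent.com/SciTools/'
                 'test-images-scitools/master/image_files/{}.png')

    with open('lib/iris/tests/results/imagerepo.json') as fin:
        repo = json.load(fin)

    for test_id, accepted_hashes in repo.items():
        for hash_value in accepted_hashes:
            # A HEAD request is enough to confirm the image exists.
            response = requests.head(IMAGE_URL.format(hash_value))
            assert response.ok, 'missing image for {}'.format(hash_value)
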

Fixing a failing graphics test
==============================

The steps required are essentially those of the basic workflow above: run
the failing test, inspect and accept (or reject) the changed result with
``idiff.py``, and then submit the paired pull requests to the
test-images-scitools and Iris repositories.


Adding a new graphics test
==========================

A new graphics test is simply a test which calls ``check_graphic`` after
producing a plot, as in the first example above. On its first run there is
no accepted result for the test, so the basic workflow above is then used
to review the new output and store its reference hash and image.