# Scripture Forge End-to-End Tests

## Testing philosophy

### The testing pyramid

The greater focus on integration tests rather than E2E tests in this version of Scripture Forge came from this Google
developer blog post: https://testing.googleblog.com/2015/04/just-say-no-to-more-end-to-end-tests.html

The main point is to use unit tests as much as possible, use integration tests for what unit tests can't cover, and use
E2E tests for what only E2E tests can cover. This is mainly because unit tests are faster, more reliable, and pinpoint
the source of the problem more accurately.

It's not that E2E tests are bad, but that they come with trade-offs that should be considered.

### A pyramid approach to the E2E tests themselves

While the article above focuses on different types of tests, the same principle can be applied to the E2E tests
themselves. Once a test is written, it may be possible to run it across multiple browsers, viewport sizes, localization
languages, and (depending on the test type) different user roles. Testing every possible permutation of these variables
is extremely powerful, but running them all takes much longer, and the probability of failure due to flaky tests
increases.

Rather than always running either a ton of tests or only a few, we can scale the number of tests to the situation. We
should run as many tests as possible without sacrificing efficiency. In practice this means:

1. Pull requests should run as many E2E tests as can be run without slowing down the process (i.e. they need to be
   reliable and take no longer than the other checks that run on pull requests).
2. Pull requests that make major changes should run more tests (this could be a manual step, or configured in CI by
   setting a tag on the pull request).
3. Release candidates should run as many tests as possible.

### Other goals

Instrumentation for E2E tests can be used for more than automated testing. We can also use it to create screenshots for
visual regression testing, and to keep screenshots in documentation up to date and localized. Some of this would incur
additional effort to implement, but the instrumentation should be designed with this future in mind.

## Implementation

Playwright is being used for the E2E tests. It comes with both a library for driving the browser and a test runner. For
the most part, I have avoided using the test runner, opting instead to use the library directly. This gives a lot more
flexibility in controlling which tests are run. The Playwright test runner is powerful, allowing permutations of tests
to be run and multiple browsers to run in parallel. However, there are scenarios where more flexibility is needed, such
as when running smoke tests for each user role: the admin user needs to create the share links that are then used for
invitations, so one test consumes the output of another.

I opted to use Deno rather than Node for the E2E tests, though this comes with some drawbacks (see the "Working with
Deno" section below).

There are two types of tests that have been created so far:

- Smoke tests: The tests log in as each user role, navigate to each page, and take a screenshot of each.
- Workflow tests: A specific workflow that a user may perform is tested from start to finish.

## Running tests

A test plan is defined in `e2e-globals.ts` as a "Preset", which serves as the run sheet for a test run. It defines which
locales to test, which browser engines, which user roles, whether to take screenshots, etc. It also defines which
categories of tests should be run (e.g. smoke tests, generating a draft, community checking). When the tests are
executed, the run sheet should be followed to the degree possible. For example, the smoke tests should test only the
user roles specified in the run sheet, but certain tests are specific to a given role (for example, you have to be an
admin to set up a community checking project) and won't need to consider the specified roles.

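As a rough illustration (the field names here are hypothetical, not the actual definitions in `e2e-globals.ts`), a
preset might look something like this:

```typescript
// Hypothetical sketch of a test plan; the real Preset type in
// e2e-globals.ts may use different names and fields.
interface Preset {
  locales: string[]; // localization languages to test
  browsers: string[]; // browser engines, e.g. chromium, firefox, webkit
  roles: string[]; // user roles the smoke tests should cover
  screenshots: boolean; // whether to capture screenshots
  categories: string[]; // test categories to run, e.g. smoke tests
}

// A small, fast preset of the kind that could suit pull requests; a
// release-candidate preset would list every locale, engine, and role.
const quickCheck: Preset = {
  locales: ["en"],
  browsers: ["chromium"],
  roles: ["admin"],
  screenshots: true,
  categories: ["smoke"],
};
```
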
To run the tests, make any necessary edits to the run sheet, then run `e2e.ts`.

Screenshots are saved in the `screenshots` directory, in a subfolder specified by the run sheet. The default subfolder
name is the timestamp when the tests started.

A file named `run_log.json` is saved to the same directory, containing information about the test run and metadata for
each of the screenshots.

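The shape of that file might be sketched as follows (hypothetical field names; the actual `run_log.json` structure may
differ):

```typescript
// Hypothetical shape for run_log.json; the real file may differ.
interface ScreenshotMeta {
  file: string; // file name within the screenshot subfolder
  locale: string; // locale active when the screenshot was taken
  role: string; // user role that was logged in
  page: string; // the page that was captured
}

interface RunLog {
  startedAt: string; // when the test run began
  screenshots: ScreenshotMeta[];
}

const exampleLog: RunLog = {
  startedAt: "2024-05-01T12:30:00Z",
  screenshots: [
    { file: "home.png", locale: "en", role: "admin", page: "home" },
  ],
};
```
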
## Other notes

### Working with Deno

Unfortunately, I have not found a good way to make Deno play nicely with Node and Angular. In VS Code, I always run the
`Deno: Enable` command when working with files that will be run by Deno, and then run `Deno: Disable` when switching to
other TypeScript files. When Deno is disabled the language server complains about problems in the files intended to be
run by Deno, and when Deno is enabled the language server complains about problems in the other files.

Hopefully a better solution is available.

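One setting that may be worth evaluating is the VS Code Deno extension's `deno.enablePaths`, which scopes the Deno
language server to specific folders so the rest of the workspace stays on the regular TypeScript tooling (the path
below is illustrative, not the actual location of the E2E files):

```jsonc
// .vscode/settings.json (illustrative path; point it at the folder
// containing the Deno-run E2E files)
{
  "deno.enablePaths": ["./e2e"]
}
```
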
### Making utility functions wait for completion

A utility function that performs an action should also wait for any side effects of the action to complete before
returning. For example, a function that deletes the current project should wait until the user is redirected to the My
Projects page before returning. This can be done by waiting for the URL to change. This has two main benefits:

1. Whatever action runs next does not need to wait for the side effects of the previous action to complete.
2. When failures occur (such as if the redirect following the deletion doesn't happen), it's much easier to determine
   where things went wrong, because the failure will occur in the function where the problem originated.

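In Playwright this usually means awaiting something like `page.waitForURL(...)` after performing the action; the
generic helper below is an illustrative stand-in for the pattern, with a simulated action and side effect rather than a
real browser:

```typescript
// Illustrative pattern: perform an action, then wait for its observable
// side effect before returning, so failures surface where they originate.
async function performAndAwait<T>(
  action: () => Promise<void>,
  sideEffect: () => Promise<T>,
): Promise<T> {
  const effect = sideEffect(); // start watching before acting, to avoid races
  await action();
  return await effect;
}

// Simulated example: "deleting the project" triggers a "redirect" shortly
// after (the URLs are made up for the demonstration).
let url = "/projects/abc123";
const deleteProject = async () => {
  setTimeout(() => (url = "/my-projects"), 10);
};
const waitForRedirect = (): Promise<string> =>
  new Promise((resolve) => {
    const timer = setInterval(() => {
      if (url === "/my-projects") {
        clearInterval(timer);
        resolve(url);
      }
    }, 5);
  });

const finalUrl = await performAndAwait(deleteProject, waitForRedirect);
```
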
### Recording tests

In general it does not work well to just record a test with Playwright and then consider it finished. However, it can
be much quicker to have Playwright record a test and use that as a starting point. You can record a test by running
`npx playwright codegen`, or by calling `await page.pause()` in a test, which stops execution and opens an inspector
window that allows recording actions and using Playwright's locator tool.

## Future plans

Workflow tests that should be created:

- Community checking
- Editing, including simultaneous editing and change in network status
- Serval admins
- Site admins

Other use-cases for the E2E tests:

- Automated screenshot comparison
- Localized screenshots for documentation
- Help videos