There was a time when a ‘grown-up’ database, capable of running a medium-to-large enterprise, took months to test, and bugs were still being found towards the end of that process. Nowadays, we expect testing to be done more quickly but more often: this means that it has to be automated.
Database testing, and especially the automation of database tests, is one of the most technically-challenging aspects of database development, and too often is both remarkably unpopular and poorly executed, even in Agile development environments.
There is even a tendency amongst inexperienced acolytes of DevOps techniques to believe that testing is less important. The contrary is true. The burden of testing is shifting ‘left’ into development, and it is being done more frequently, and in more parts of the database lifecycle.
The misconception that the database is now less important has spawned some spectacular recent failures, most notably in the UK’s TSB Banking disaster. Whatever your development methodology or techniques, database testing remains as complex as database development and deserves the necessary resources.
Below we’ll have a look at four database testing types, namely user acceptance testing, integration testing, regression testing, and unit testing, and how they relate to MongoDB where applicable.
Integration and regression tests run on every identifiable process, rather than individual units. It is assumed, at this stage, that each unit works correctly in isolation.
Integration testing aims to ensure that, when units of functionality are integrated, no errors are introduced, whereas regression testing is done after every change to ensure that these changes did not break units that were already tested.
Integration tests are necessary in the same way that the running of a mechanical clock is tested in addition to the clock’s individual components. However splendid a clock’s mechanism may be, it must still always tell the correct time.
These tests are best devised by people who are not actively involved in development, as they require a different mindset, and they should be directed by the stated business requirements.
Integration tests are prepared together with the intended users to conform with the business model and must not change without sign-off. They will validate that a set of units works together, and the interfaces between them are correctly configured and deployed, so that they always perform the required process and produce the expected result.
For example, when a new user registers for your service, does a user object get saved to the database, along with the user’s default settings, and a ‘welcome’ email sent to the user offering a 10% discount on their first purchase?
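As a minimal sketch of how such a check might be automated: the `register_user` service, the in-memory lists standing in for MongoDB collections, and the email ‘outbox’ below are all hypothetical stand-ins for your real application code.

```python
# Integration-test sketch for user registration. The in-memory lists
# stand in for MongoDB collections; `register_user` is a hypothetical
# stand-in for the real registration service.

def register_user(email, users, settings, outbox):
    """Toy registration service: saves the user document, their
    default settings, and records a welcome email."""
    user = {"_id": len(users) + 1, "email": email}
    users.append(user)
    settings.append({"user_id": user["_id"], "theme": "default"})
    outbox.append({"to": email,
                   "subject": "Welcome - 10% off your first purchase"})
    return user

def test_registration_creates_all_records():
    users, settings, outbox = [], [], []
    user = register_user("ada@example.com", users, settings, outbox)

    # All three side effects of registration must be present.
    assert any(u["email"] == "ada@example.com" for u in users)
    assert any(s["user_id"] == user["_id"] for s in settings)
    assert any("10%" in m["subject"] for m in outbox)

test_registration_creates_all_records()
```

The same shape of test works against a real test database: replace the lists with collection queries and the outbox with a recorded mail transport.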
Integration testing and regression testing are usually performed on completion of every build. Where builds are regular and frequent, these tests must be scripted.
Normally, test staff will also do manual tests, many of them exploratory, that, if successful, will be automated and added to the test bank. These will include checks on any new processes that are introduced in the build.
Like unit tests, both integration and regression tests use a standard input and must check the output against the ‘correct’ output.
For example, when testing the purchase of an item in a ‘basket’, you will agree with the business what should happen, then set up integration tests to prove that every part of the purchasing process works as defined by the business, and all appropriate tables are updated as expected. This requires table values to be compared with the validated version.
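One way to compare stored state against a business-validated version is a snapshot check that ignores volatile fields. The documents and field names below are hypothetical; in practice you would read the real collections after the purchase completes.

```python
# Sketch: compare the post-purchase state of a collection against a
# business-validated snapshot. The documents here are hypothetical.

def normalise(docs, ignore=("_id",)):
    """Strip volatile fields (e.g. generated ObjectIds) so that
    snapshots compare stably, regardless of document order."""
    return sorted(
        tuple(sorted((k, v) for k, v in d.items() if k not in ignore))
        for d in docs
    )

actual_orders = [{"_id": 7, "sku": "A-100", "qty": 2, "status": "paid"}]
expected_orders = [{"sku": "A-100", "qty": 2, "status": "paid"}]

assert normalise(actual_orders) == normalise(expected_orders)
```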
The test data we need for integration testing should generally be realistic business data that conforms to experience, but it should change as little as possible, because it has to be cross-checked by the business to confirm the validity of the results.
Whichever way one automates the integration tests, the results of the tests should be quickly and easily summarized and reported so that developers can quickly be alerted to any issues.
Unit testing is performed on each routine that accesses the database, in isolation, to ensure that it returns, in a timely manner, a predictable result for each specific set of inputs that is used.
What is a ‘unit’ of testing in MongoDB? It varies, but effectively you will be testing the routines that access the database in logical groups.
Typically, there will be CRUD operations that, if done in a particular order with predictable data, in a collection, or on a group of related collections, will return a predictable result.
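A minimal sketch of such a test follows; the in-memory list and the document fields are hypothetical stand-ins, and in practice you would run the same fixed sequence against a test collection.

```python
# Sketch: a unit test that runs CRUD operations in a fixed order on
# predictable data and checks the state after each step. The list
# `coll` is a hypothetical stand-in for a MongoDB collection.

def test_crud_sequence():
    coll = []                                              # stands in for db.widgets
    coll.append({"_id": 1, "name": "widget", "stock": 5})  # create
    doc = next(d for d in coll if d["_id"] == 1)           # read
    assert doc["stock"] == 5
    doc["stock"] -= 2                                      # update
    assert doc["stock"] == 3
    coll[:] = [d for d in coll if d["_id"] != 1]           # delete
    assert coll == []

test_crud_sequence()
```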
Will it scale? Will it still work under the pressure of intensive multi-user access? If a collection has JSON validation on insertion or alteration, this must be tested with a variety of inputs.
We need to be confident that it has the necessary validation constraints to ensure that only appropriate data can be inserted into it. Once the test is devised, then it should be possible to automate it for the longer term.
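One way to exercise such constraints in isolation is to drive a validation function with both good and bad documents. The `validate_customer` helper and its rules below are hypothetical, mirroring the kind of checks a MongoDB `$jsonSchema` validator would enforce on the collection itself.

```python
# Sketch: unit-testing document validation with a range of inputs.
# `validate_customer` is a hypothetical helper; a real collection
# would enforce equivalent rules via a $jsonSchema validator.

def validate_customer(doc):
    if not isinstance(doc.get("name"), str) or not doc["name"].strip():
        return False
    if not isinstance(doc.get("balance"), (int, float)):
        return False
    return doc["balance"] >= 0

# Documents the collection should accept...
assert validate_customer({"name": "Mrs O'Brien", "balance": 0})
# ...and documents it must reject.
assert not validate_customer({"name": "", "balance": 10})
assert not validate_customer({"name": "Mr Null", "balance": None})
assert not validate_customer({"name": "X", "balance": -5})
```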
Middleware functions that take parameters and return results are also easy to test; the unit test will verify that the function always returns the expected data, in the expected structure, for the whole range of possible data or parameter values.
Other unit tests might, for example, ensure that aggregations perform rounding of figures in a way that conforms with industry standards. Unit tests will also perform basic resilience testing, to ensure that the module continues to respond correctly even if it encounters ‘dirty’, ‘unexpected’ or ‘difficult’ data, such as the names Mr Null or Mrs O’Brien, or negative currency items.
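A rounding check of this kind might look as follows. Banker’s rounding (`ROUND_HALF_EVEN`) is used here as an example of an industry convention; your own standard may differ, and `round_money` is a hypothetical helper.

```python
# Sketch: unit test that monetary rounding follows banker's rounding
# (ROUND_HALF_EVEN), a common convention for financial aggregates.
from decimal import Decimal, ROUND_HALF_EVEN

def round_money(value):
    """Round a monetary value to two places, half-to-even."""
    return Decimal(str(value)).quantize(Decimal("0.01"),
                                        rounding=ROUND_HALF_EVEN)

# Half-way cases round towards the even neighbour...
assert round_money("2.675") == Decimal("2.68")
assert round_money("2.665") == Decimal("2.66")
# ...including negative currency amounts.
assert round_money("-2.665") == Decimal("-2.66")
```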
The easiest tests to automate are those that have no dependencies. The more references made by the object under test, and the more objects to which it refers, the more complicated the unit test, but its objective must remain a simple test of the serviceability of the object.
Unit tests are run throughout development whenever a logical ‘unit’ of the database application is created, and they are rerun every time a change is made to it, and before the change is checked in. No untested code should ever reach the regular build.
This might seem unnecessarily restrictive, but the discipline of developing the test before or alongside the code can bring clarity to the work. In test-driven development, the test harness precedes the development and is enhanced alongside the development of the object. It often checks for performance of the unit as well as its accuracy.
You wouldn’t want to use real data for unit testing. It is better done on small, unchanging, datasets.
Nowadays, I generally use data from open-access databases and keep the datasets with the database build code in source control. The developers will want these datasets to include many of the edge-case values that the resilience/limit testers generally add to theirs, in order to avoid any such errors in production.
User-acceptance tests ensure that the database is aligned with the business objectives.
They are conducted by the customers of the database, usually people from the business who would use the system, developers of applications or downstream reporting systems, as well as training staff.
Operations staff will also check that the database has met all the requirements for maintainability.
There is a misconception that UAT can be left until the point of deployment.
The first problem with this is that a mistake in the business model, or a sudden change in the business, such as a merger or a change in business strategy, must be matched by a change in the development as soon as possible.
The other problem is that materials for staff training often must be prepared well in advance of the release of a new or changed system. This means that user-acceptance tests should be part of the development process from early prototypes to ensure that the database is aligned as closely as possible to the business model.
Although the data should be realistic, it need not be derived from production data because a difference in execution plans is unlikely to be considered relevant. The test data sets and test scenarios are usually created in collaboration with the business, especially those people likely to be involved in subsequent staff training.
Because they are used for checking business processes, these tests are most effective if created by the business with technical help from the testers. The scenarios and test data are likely to be reused for subsequent training and usability tests.