Generating Java Test Data With Instancio

Last Updated:  July 16, 2024 | Published: July 23, 2024

During unit and integration testing, the majority of our test cases begin with manually generating test data with static values, which leads to boilerplate code in our test suite:
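As a minimal sketch of that kind of manual setup, assume two hypothetical records, Instructor and Course (the field names here are illustrative and not taken from the article's repository):

```java
import java.util.List;
import java.util.UUID;

// Hypothetical domain types used only for illustration.
record Course(UUID id, String title, int durationInHours) {}

record Instructor(UUID id, String firstName, String lastName, String email, List<Course> courses) {}

class ManualTestData {

    // Every field is filled in by hand with a static value.
    static Instructor createInstructor() {
        Course java = new Course(UUID.randomUUID(), "Java Fundamentals", 40);
        Course spring = new Course(UUID.randomUUID(), "Spring Boot Basics", 25);
        return new Instructor(
            UUID.randomUUID(),
            "John",
            "Doe",
            "john.doe@example.com",
            List.of(java, spring));
    }
}
```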

The above code snippet might be a familiar sight for many of us. We often find ourselves manually creating test data, setting properties individually, and dealing with the verbosity that comes with it. This process becomes more annoying when dealing with complex objects with multiple fields, relationships, and collections.

This is where Instancio comes into play. Instancio is a Java library that simplifies and automates the process of test data generation. It provides a fluent API for creating instances of Java classes with ease:
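A minimal sketch of the equivalent setup with Instancio, again assuming hypothetical Instructor and Course records; the only Instancio call involved is Instancio.create():

```java
import java.util.List;
import org.instancio.Instancio;

// Hypothetical domain types used only for illustration.
record Course(String title, int durationInHours) {}

record Instructor(String firstName, String lastName, String email, List<Course> courses) {}

class InstructorFactory {

    static Instructor randomInstructor() {
        // One call populates every field, including the nested list of courses.
        return Instancio.create(Instructor.class);
    }
}
```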

That's it! With just a single line of code, Instancio generates a fully populated Instructor object with random values, including the associated Course object collection.

Instancio is simple enough that we could end the article right here. Instead, let's explore the additional features, extensions, and customization options it provides, which allow us to control and fine-tune the test data generation process in our test suite.

The working code referenced in this article can be found on GitHub.

Benefits of Random Test Data

Using static, manually created data, we often end up testing only a limited set of scenarios. Randomized test data helps uncover edge cases that we might've missed.

By default, each time a test is executed, Instancio uses randomly generated values. This allows us to test our code with a wide range of inputs, increasing the chances of catching unexpected behavior and uncovering hidden bugs.

Additionally, by incorporating random test data generation into our test suite, we feed our code a diverse range of data combinations, which helps increase our test coverage! And who doesn't like that?

By embracing randomized test data generation, we become more confident in the reliability of our codebase. It acts as an additional safety shield, complementing our existing test cases and ensuring that our application handles the variety of inputs we give it gracefully.

Getting Started with Instancio in Java Projects

To get started with Instancio, we first need to add it as a dependency to our project.

If we're using Maven, here's what our pom.xml file would look like:
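A sketch of the relevant dependencies section, assuming the instancio-junit artifact, a Spring Boot parent that manages the starter's version, and an instancio.version property that we define ourselves (a placeholder here, not a real version number):

```xml
<dependencies>
    <dependency>
        <groupId>org.instancio</groupId>
        <artifactId>instancio-junit</artifactId>
        <!-- Placeholder property: set it to the latest release on Maven Central -->
        <version>${instancio.version}</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
```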

The latest version of Instancio can be fetched from Maven Central.

We also include the Spring Boot Starter Test dependency, a.k.a. the Swiss Army knife for testing, to get the basic testing toolbox: it transitively includes JUnit and other utility libraries that we'll need to write assertions and run our tests.

However, Instancio is not limited to JUnit 5 or Spring Boot. When using a different testing framework or running tests standalone, we can include the instancio-core dependency instead.

Ignoring Specific Fields with Instancio

In some test cases, we may want to exclude certain fields from being populated. This can be useful when dealing with conditionally required fields in our test scenarios.

Instancio provides a handy method to specify the fields that should be left untouched during the data generation process:
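A minimal sketch of ignoring fields, assuming a hypothetical UserCreationRequest type (only middleName and referralCode are named in this article; the other fields are made up for illustration):

```java
import static org.instancio.Select.field;

import org.instancio.Instancio;

// Hypothetical request type used only for illustration.
record UserCreationRequest(String firstName, String middleName, String lastName, String referralCode) {}

class IgnoreFieldsExample {

    static UserCreationRequest requestWithoutOptionalFields() {
        return Instancio.of(UserCreationRequest.class)
            // Ignored fields are skipped during generation and stay null.
            .ignore(field(UserCreationRequest.class, "middleName"))
            .ignore(field(UserCreationRequest.class, "referralCode"))
            .create();
    }
}
```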

In the above example, we use the field() method to select the middleName and referralCode fields of our UserCreationRequest class and pass them to the ignore() method.

Instancio will create an instance of UserCreationRequest with all fields populated except for the two fields we've specified, which will be left as null.

Alternatively, we might want to test our code's behaviour when certain fields are randomly populated or left as null:
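A sketch of such a test, assuming the same hypothetical UserCreationRequest type and an arbitrary repetition count of 10:

```java
import static org.instancio.Select.field;
import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.instancio.Instancio;
import org.junit.jupiter.api.RepeatedTest;

// Hypothetical request type used only for illustration.
record UserCreationRequest(String firstName, String middleName, String lastName, String referralCode) {}

class NullableFieldsTest {

    // Repeating the test gives both the populated and the null outcome a chance to occur.
    @RepeatedTest(10)
    void shouldHandleOptionalFields() {
        UserCreationRequest request = Instancio.of(UserCreationRequest.class)
            .withNullable(field(UserCreationRequest.class, "middleName"))
            .withNullable(field(UserCreationRequest.class, "referralCode"))
            .create();

        // Fields not marked as nullable are always populated.
        assertNotNull(request.firstName());
        assertNotNull(request.lastName());
        // The code under test would be invoked with the request here.
    }
}
```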

When we use the withNullable() method, Instancio randomly decides whether to populate the specified fields with a value or set them to null during the data generation process.

It's important to note that we've used the @RepeatedTest annotation instead of the generic @Test annotation in order to execute our test case multiple times and verify the behaviour of our application code with different combinations of populated and null fields.

Customizing Specific Fields

We've discussed how to ignore fields, but a more common use case is customizing fields based on the business logic that our application runs on. This can mean either setting a static value for certain fields to test a scenario, or changing the manner in which a random field is generated.

Instancio supports these use cases and allows customizing the values of fields during data generation. Let's take a look at how we can configure a static value to a field:
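A minimal sketch, assuming a hypothetical UserCreationRequest type that has a hatedLanguage field:

```java
import static org.instancio.Select.field;

import org.instancio.Instancio;

// Hypothetical request type used only for illustration.
record UserCreationRequest(String firstName, String lastName, String hatedLanguage) {}

class SetFieldExample {

    static UserCreationRequest requestFromJavaHater() {
        return Instancio.of(UserCreationRequest.class)
            // Pin a single field to a static value; everything else stays random.
            .set(field(UserCreationRequest.class, "hatedLanguage"), "Java")
            .create();
    }
}
```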

In this example, we use the set() method to explicitly set the hatedLanguage field of our UserCreationRequest object to “Java”. The remaining fields will still be populated with random values.

But, in scenarios where we still want the generated data to be random but adhere to some of our custom constraints, we can use Instancio's generate() method. Let's take a look at a more complex example:
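A sketch of what this could look like, assuming a hypothetical User record; the field names, the 18 to 65 range, and the membership options are illustrative choices, not values from the article's repository:

```java
import static org.instancio.Select.field;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.time.LocalDate;
import java.util.Set;
import org.instancio.Instancio;
import org.junit.jupiter.api.Test;

// Hypothetical record used only for illustration.
record User(String bio, int age, LocalDate dateOfBirth, String creditCardNumber, String membership) {}

class UserGenerationTest {

    @Test
    void shouldGenerateUserWithinConstraints() {
        User user = Instancio.of(User.class)
            .generate(field(User.class, "bio"), gen -> gen.text().loremIpsum())
            .generate(field(User.class, "age"), gen -> gen.ints().range(18, 65))
            .generate(field(User.class, "dateOfBirth"), gen -> gen.temporal().localDate().past())
            .generate(field(User.class, "creditCardNumber"), gen -> gen.finance().creditCard())
            .generate(field(User.class, "membership"), gen -> gen.oneOf("FREE", "PREMIUM"))
            .create();

        // The values are random, but each one respects its constraint.
        assertFalse(user.bio().isBlank());
        assertTrue(user.age() >= 18 && user.age() <= 65);
        assertFalse(user.dateOfBirth().isAfter(LocalDate.now()));
        assertNotNull(user.creditCardNumber());
        assertTrue(Set.of("FREE", "PREMIUM").contains(user.membership()));
    }
}
```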

In this example, we use the generate() method to customize the generation of specific fields of our User record. We provide expressions to define the generation logic for each field using Instancio's fluent API, which allows us to generate values such as lorem ipsum text, random integers within a range, dates in the past, credit card numbers, and random values selected from predefined options.

We then use assertions to verify that the generated data meets the criteria we've defined.

By combining static values and custom generators, we can create test data that closely mimics our application's business logic and thoroughly test its behavior.

Reproducing Failed Tests

We've already discussed that for every test execution, Instancio injects random values into our variables. However, when a test fails, we need a way to reproduce the failure using the same generated data for debugging. Instancio addresses this concern by providing a way to reproduce failed tests using seed values.

A seed value is a long number that's used as an input to generate random values by Instancio. By using the same seed value, we can ensure that identical values are generated across multiple test runs.
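The idea of a seed is not specific to Instancio. As a plain-Java illustration that uses java.util.Random (an analogy only, not Instancio's own generator), two generators created with the same seed produce the same sequence:

```java
import java.util.Random;

class SeedDemo {

    // Returns the first three ints produced by a generator created with the given seed.
    static int[] firstThree(long seed) {
        Random random = new Random(seed);
        return new int[] { random.nextInt(), random.nextInt(), random.nextInt() };
    }
}
```

Calling SeedDemo.firstThree(42L) twice yields equal arrays, which is the property that makes a failing run reproducible.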

To use this feature, we'll need to extend our test class with the InstancioExtension:

If our test fails, Instancio will log the seed value used to generate the test data:

To reproduce and fix our failed test, we can annotate our test method with the @Seed annotation and provide the logged seed value:
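A sketch of a test class that registers the extension and pins a seed. The seed value below is hypothetical; in practice we'd copy the one reported for the failed run. The UserCreationRequest type is also an assumption made for illustration:

```java
import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.instancio.Instancio;
import org.instancio.junit.InstancioExtension;
import org.instancio.junit.Seed;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

// Hypothetical request type used only for illustration.
record UserCreationRequest(String firstName, String lastName) {}

@ExtendWith(InstancioExtension.class)
class UserCreationTest {

    @Test
    @Seed(1234567890L) // hypothetical value: replace with the seed reported for the failure
    void shouldCreateUser() {
        // With the extension active, the pinned seed makes the generated data repeatable.
        UserCreationRequest request = Instancio.create(UserCreationRequest.class);
        assertNotNull(request.firstName());
        assertNotNull(request.lastName());
    }
}
```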


However, it's important to remember to remove the @Seed annotation before pushing our code. Leaving the seed value in our test code makes it run with the same data each time, which defeats the whole purpose. This feature should be leveraged for local debugging and reproduction of failed tests only.

Overriding Default Settings for Test Data Generation

Instancio defines a list of default settings that are used during the random data generation process. Let's look at a few of these:

  • The generated Strings are alphabetical
  • The generated Strings do not exceed 10 characters
  • A generated list does not contain more than 6 elements

However, there might be situations where we need to update these settings to fit our specific requirements. Instancio provides the flexibility to do so, by allowing us to override the default settings at different levels.

Global Override

If we want to apply certain settings across our entire test suite, we can define them globally by creating an instancio.properties file and placing it in our src/test/resources directory. Instancio will automatically pick up the defined settings and use them instead of the defaults.

For example, to override the default settings mentioned earlier, we can add the following lines to our instancio.properties file:
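A sketch of such a file. The property keys below are assumed to mirror the corresponding Keys constants (STRING_TYPE, STRING_MAX_LENGTH, COLLECTION_MAX_SIZE), so it's worth checking them against the settings table in the User Guide:

```properties
# Assumed key names; verify against the Instancio User Guide
string.type=ALPHANUMERIC
string.max.length=200
collection.max.size=50
```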

With these global settings in place, Instancio will generate alphanumeric strings up to 200 characters long and limit generated collections to a maximum of 50 elements:

This approach is particularly useful when we have common configurations that we want to apply consistently across our entire test suite. It provides a centralized way to manage settings and externalizes them from our code.

Per-Class Override

In scenarios where we want to apply custom settings to a specific test class, we can use the @WithSettings annotation in combination with the InstancioExtension. Let's take a look at how this can be done:
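A minimal sketch, assuming a hypothetical User record and the same limits used in the global example (200 characters, 50 elements):

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.List;
import org.instancio.Instancio;
import org.instancio.junit.InstancioExtension;
import org.instancio.junit.WithSettings;
import org.instancio.settings.Keys;
import org.instancio.settings.Settings;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

// Hypothetical record used only for illustration.
record User(String bio, List<String> tags) {}

@ExtendWith(InstancioExtension.class)
class UserProfileTest {

    // Picked up by the extension and applied to every object created in this class.
    @WithSettings
    private final Settings settings = Settings.create()
        .set(Keys.STRING_MAX_LENGTH, 200)
        .set(Keys.COLLECTION_MAX_SIZE, 50);

    @Test
    void shouldRespectClassLevelSettings() {
        User user = Instancio.create(User.class);

        assertTrue(user.bio().length() <= 200);
        assertTrue(user.tags().size() <= 50);
    }
}
```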

When we annotate a Settings field with @WithSettings, Instancio uses the specified settings when running the test cases inside our class.

This approach takes priority over both the defaults and the global settings defined in the instancio.properties file, and allows us to customize the behavior for a single test class or a group of test classes without affecting the rest of our test suite.

Per-Object Override

If we want even more granular control over the generated data for a specific object in a test case, Instancio provides the withSetting() method that can be invoked within the fluent API. This allows us to customize the settings on a per-object basis, taking precedence over the default, global, or class-level settings:
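A sketch of a per-object override, assuming a hypothetical User record with a bio field:

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.instancio.Instancio;
import org.instancio.settings.Keys;
import org.instancio.settings.StringType;
import org.junit.jupiter.api.Test;

// Hypothetical record used only for illustration.
record User(String name, String bio) {}

class PerObjectSettingsTest {

    @Test
    void shouldGenerateDigitOnlyStrings() {
        User user = Instancio.of(User.class)
            // Applies only to this object; other objects keep their settings.
            .withSetting(Keys.STRING_TYPE, StringType.DIGITS)
            .create();

        assertTrue(user.bio().chars().allMatch(Character::isDigit));
    }
}
```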

In our test case, we use the withSetting() method to specify that the STRING_TYPE should be set to DIGITS for the User record being created and then assert this behavior on the bio field.

It's worth noting that we can chain multiple withSetting() calls to override multiple settings for an object.

Alternatively, we can use the withSettings() method to pass a Settings object defined at class level:

Bean Validation with Instancio

Bean Validation is the widely adopted standard to implement validation logic in the Java and Spring ecosystem. Instancio also provides support for generating valid test data based on the declared Bean Validation annotations.

This allows us to ensure that the generated data adheres to the validation constraints defined on our domain objects.

To enable data to be generated based on Bean Validation annotations, we need to set the key BEAN_VALIDATION_ENABLED to true. We can do this by using any of the override approaches we've discussed earlier. For example, we can enable it globally in our instancio.properties file:
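A one-line sketch of that entry, assuming the property key mirrors the BEAN_VALIDATION_ENABLED constant (worth confirming in the User Guide):

```properties
# Assumed key name; verify against the Instancio User Guide
bean.validation.enabled=true
```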

We'll also add the spring-boot-starter-validation dependency to our project, since Instancio does not provide it transitively:

This dependency will allow us to use both Jakarta and Hibernate validation annotations.

Now, let's look at an example of how this feature can be used in our test:
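A sketch of such a test, assuming a hypothetical User record and an illustrative choice of constraints; it also assumes that a Bean Validation provider such as Hibernate Validator is on the test classpath:

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import jakarta.validation.ConstraintViolation;
import jakarta.validation.Validation;
import jakarta.validation.Validator;
import jakarta.validation.constraints.Email;
import jakarta.validation.constraints.Max;
import jakarta.validation.constraints.Min;
import jakarta.validation.constraints.NotBlank;
import java.util.Set;
import org.instancio.Instancio;
import org.instancio.junit.InstancioExtension;
import org.instancio.junit.WithSettings;
import org.instancio.settings.Keys;
import org.instancio.settings.Settings;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

// Hypothetical record; the constraints are illustrative.
record User(@NotBlank String name, @Email String email, @Min(18) @Max(65) int age) {}

@ExtendWith(InstancioExtension.class)
class UserValidationTest {

    // Enables constraint-aware generation for this test class only.
    @WithSettings
    private final Settings settings = Settings.create()
        .set(Keys.BEAN_VALIDATION_ENABLED, true);

    private final Validator validator = Validation.buildDefaultValidatorFactory().getValidator();

    @Test
    void shouldGenerateValidUser() {
        User user = Instancio.create(User.class);

        Set<ConstraintViolation<User>> violations = validator.validate(user);

        assertTrue(violations.isEmpty());
        assertTrue(user.age() >= 18 && user.age() <= 65);
    }
}
```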

Cool, right? We enable the Bean Validation feature using the @WithSettings annotation and create a User record using Instancio. We then assert that the created object does not have any constraint violations and the field values are valid as per the validation annotations we've used.

By using this approach, we can ensure that the generated test data is valid according to our domain constraints, without having to customize the required fields for every test case.

Parameterized Tests with Instancio

Instancio also integrates smoothly with JUnit 5's parameterized tests by providing the @InstancioSource annotation. This allows us to further reduce the boilerplate code in our tests if we don't want to apply any customizations to the default test data that Instancio generates.

To use this feature, we'll again have to extend our test class using InstancioExtension and declare one or more arguments, depending on the requirement:
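A sketch of such a test, assuming a hypothetical UserCreationRequest type and a recent Instancio version in which @InstancioSource infers the argument types from the method parameters (older versions may require listing the classes in the annotation):

```java
import static org.junit.jupiter.api.Assertions.assertNotNull;

import java.util.UUID;
import org.instancio.junit.InstancioExtension;
import org.instancio.junit.InstancioSource;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.params.ParameterizedTest;

// Hypothetical request type used only for illustration.
record UserCreationRequest(String firstName, String lastName, String email) {}

@ExtendWith(InstancioExtension.class)
class UserRegistrationTest {

    @ParameterizedTest
    @InstancioSource
    void shouldRegisterUser(UUID userId, UserCreationRequest request) {
        // Both arguments are supplied by Instancio; no setup code is needed.
        assertNotNull(userId);
        assertNotNull(request.firstName());
        assertNotNull(request.email());
    }
}
```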

In the test case above, Instancio will provide a random UUID and a fully populated instance of our UserCreationRequest class that we can use in our test code without having to create it manually using the Instancio API.

Creating JSON Request Bodies for @WebMvcTest

When writing controller tests using Spring's @WebMvcTest annotation, we often find ourselves manually converting our request POJOs to JSON strings to use as the request body in our test cases. This typically involves autowiring an ObjectMapper or any other JSON conversion library into our test class and then using it to serialize our objects:

While this approach works, it adds unnecessary boilerplate code to our tests. Wouldn't it be nice if we could generate the request object and convert it to JSON in a single step?

This is where we can use Instancio's as() method, which allows us to specify a function that'll be applied to the generated object before it is returned. For our case, we'll use it to convert the generated object directly into a JSON string.

Let's see how we can refactor our previous example:

In our refactored version, we first generate an instance of our UserCreationRequest class using the Instancio API as we've been doing throughout this article. We then chain the as() method and pass a reference to a JsonUtil::convert method, which is responsible for converting the generated object to JSON.

The JsonUtil class can be created inside the test directory itself. Here's what it might look like:
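A minimal sketch of such a helper together with the as() call described above. It assumes Jackson's ObjectMapper is on the classpath; the UserCreationRequest type and the RequestBodies class are hypothetical names introduced for illustration:

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.instancio.Instancio;

// Hypothetical request type used only for illustration.
record UserCreationRequest(String firstName, String lastName, String email) {}

class JsonUtil {

    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();

    private JsonUtil() {
    }

    // Serializes any object to JSON, wrapping the checked exception so that
    // the method can be passed around as a plain method reference.
    static String convert(Object value) {
        try {
            return OBJECT_MAPPER.writeValueAsString(value);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException("Failed to serialize test data to JSON", e);
        }
    }
}

class RequestBodies {

    // Generates a random request and converts it to JSON in a single chain.
    static String randomUserCreationRequestJson() {
        return Instancio.of(UserCreationRequest.class).as(JsonUtil::convert);
    }
}
```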

With this approach, we've eliminated the need to manually convert our request objects to JSON in every test case. Instancio takes care of both generating the test data and converting it to our desired format in a single, concise step.

Creating Reusable Test Data Templates

As we write more test cases, we may find ourselves repeating similar data setup code across different tests. This is where an Instancio Model can be used. A Model acts as a template for creating objects with predefined properties and settings.

By defining a reusable Model, we can define common field configurations and generation logic, making our test code more concise and maintainable. Let's take a look at an example to understand it better:
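A sketch of a model and a test that builds on it, assuming a hypothetical User record; the email pattern and the role field are illustrative choices:

```java
import static org.instancio.Select.field;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.time.LocalDate;
import org.instancio.Instancio;
import org.instancio.Model;
import org.junit.jupiter.api.Test;

// Hypothetical record used only for illustration.
record User(String name, String email, LocalDate dateOfBirth, String role) {}

class UserModelTest {

    // Shared template: five lowercase letters and two digits before the domain,
    // plus a date of birth in the past.
    private final Model<User> userModel = Instancio.of(User.class)
        .generate(field(User.class, "email"), gen -> gen.text().pattern("#c#c#c#c#c#d#d@example.com"))
        .generate(field(User.class, "dateOfBirth"), gen -> gen.temporal().localDate().past())
        .toModel();

    @Test
    void shouldCreateAdminFromModel() {
        // Inherits the model's configuration and customizes one more field.
        User admin = Instancio.of(userModel)
            .set(field(User.class, "role"), "ADMIN")
            .create();

        assertTrue(admin.email().endsWith("@example.com"));
        assertFalse(admin.dateOfBirth().isAfter(LocalDate.now()));
        assertEquals("ADMIN", admin.role());
    }
}
```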

In our example, we define a Model to hold the common field generation logic of our User record. We specify that a generated User object should have an email ID with a specific pattern and a date of birth in the past. We then call the toModel() method to create an immutable Model instance.

Now, in our test methods, we can create User objects based on the defined userModel template. We use Instancio.of(userModel) to inherit the predefined configurations from the model and then customize specific fields as needed for each test case.

By using models, we reduce code duplication and keep our test code focused on the specific variations needed for each test case.

Additionally, models can serve as templates for creating other models, which allows for further customization and flexibility in our test data setup.

Conclusion

In this article, we've explored how Instancio, a simple yet powerful Java library, can significantly reduce boilerplate test data generation code in our unit and integration tests. In short, it makes the “Arrange” step of the Arrange-Act-Assert testing pattern easier.

We've successfully learnt to make our test code more readable and maintainable, in addition to benefiting from random data inputs for each test run, helping uncover edge cases that static test data might've missed.

We also discussed several features of Instancio to tailor the test data generation process to our requirements, such as ignoring specific fields, customizing field values, and overriding default settings at different levels.

In addition, we explored how Instancio helps in reproducing failed tests using seed values and integrates with Bean Validation annotations to generate valid test data based on constraint validations.

And lastly, we looked at how we can further reduce the test data creation process by using Instancio's @InstancioSource annotation that complements parameterized testing, and by creating reusable test data templates in our test classes.

We'd encourage you to explore Instancio and incorporate it into your own testing workflows. For more details on Instancio's features and usage, refer to the official User Guide.

As always, the complete source code demonstrated throughout this article is available on GitHub.

Joyful testing,

Hardik Singh Behl
Github | LinkedIn | Twitter
