Testing Shape of Data
Last time I wrote about integration testing on different levels, and in this post I want to explore a topic that’s somewhat related to integration testing.
When testing code that integrates with DBs and JSON APIs I often find myself only caring about (and wanting to test) a subset of the data. To give a concrete example, let’s say we’re integration testing a simple Rails controller with an /items endpoint.
The simplest possible test for that could be just:
def test_create
  Item.create!(name: "Item 1")
  Item.create!(name: "Item 2")
  get "/items"
  assert_equal [
    {"id" => 1, "name" => "Item 1"},
    {"id" => 2, "name" => "Item 2"},
  ], JSON(response.body)
end
There’s a problem with this test: the hardcoded ids (this is even more painful with things like created_at). This can be solved pretty easily by introducing local variables:
def test_create
  item1 = Item.create!(name: "Item 1")
  item2 = Item.create!(name: "Item 2")
  get "/items"
  assert_equal [
    {"id" => item1.id, "name" => "Item 1"},
    {"id" => item2.id, "name" => "Item 2"},
  ], JSON(response.body)
end
However, I think this introduces some complexity into the test code and decreases readability a bit. Also, I don’t really care which ids were auto-generated by my DB. My intent is to test that the name attribute was correctly saved; the id attribute is just an accidental detail.
Another way of addressing this problem is introducing even more complexity into the test code, by extracting only the fields I care about:
def test_create
  Item.create!(name: "Item 1")
  Item.create!(name: "Item 2")
  get "/items"
  assert_equal [
    "Item 1", "Item 2",
  ], JSON(response.body).map { |h| h['name'] }
end
However, I think this is really hard to read. Moreover, as this test grows I’d be concerned about the possibility of introducing a bug in my test code.
As mentioned above, my intent is to test the name attribute and I don’t really care what the id attribute is - it can be any integer, as far as I’m concerned in this test. This is the test I’d like to write:
def test_create
  Item.create!(name: "Item 1")
  Item.create!(name: "Item 2")
  get "/items"
  assert_equal [
    {"id" => any_integer, "name" => "Item 1"},
    {"id" => any_integer, "name" => "Item 2"},
  ], JSON(response.body)
end
Let me show you how this can be achieved.
Building any_integer
Let’s break down the test above into the simplest possible code we could discuss. It boils down to:
assert_equal [any_integer, any_integer], [1, 2]
Minitest’s assert_equal uses Ruby’s equality operator == to do the comparison. In this case, we’re comparing two Arrays, and per the documentation, “Two arrays are equal if they contain the same number of elements and if each element is equal to (according to Object#==)” the corresponding element in the other array.
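To see that delegation in action, here is a small sketch. The Spy class is a made-up helper for illustration only: it records every object it is compared against, which shows that Array#== calls == on the elements of the left-hand array.

```ruby
# Hypothetical helper: remembers what it was compared with and always
# claims to be equal.
class Spy
  attr_reader :seen

  def initialize
    @seen = []
  end

  def ==(other)
    @seen << other
    true
  end
end

spy = Spy.new
result = ([spy, spy] == [1, 2])

result   # => true
spy.seen # => [1, 2]
```

Since assert_equal takes the expected value first and compares it against the actual one, it is the elements of the expected array that receive the == call.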
So, all we need to do is implement the == method on the any_integer object:
class AnyInteger
  def ==(other)
    other.is_a?(Integer)
  end
end

def any_integer
  AnyInteger.new
end

[any_integer, any_integer] == [1, 2] # => true
[any_integer, any_integer] == [1, :bad] # => false
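The same trick carries over to the hashes from the original test, because Hash#== also compares values with ==. Below is a minimal sketch without Rails: the body string is a made-up stand-in for response.body, and the inspect method is my own addition (an assumption that it helps, since Minitest builds its failure messages from inspect as far as I know).

```ruby
require "json"

class AnyInteger
  def ==(other)
    other.is_a?(Integer)
  end

  # Keeps printed expectations readable instead of showing
  # something like #<AnyInteger:0x000...>.
  def inspect
    "any_integer"
  end
end

def any_integer
  AnyInteger.new
end

# Stand-in for response.body; the ids are whatever the DB assigned.
body = '[{"id": 17, "name": "Item 1"}, {"id": 42, "name": "Item 2"}]'

expected = [
  {"id" => any_integer, "name" => "Item 1"},
  {"id" => any_integer, "name" => "Item 2"},
]

matches = (expected == JSON.parse(body)) # => true

# A string id no longer matches the expected shape.
bad_body = '[{"id": "17", "name": "Item 1"}, {"id": 42, "name": "Item 2"}]'
bad_matches = (expected == JSON.parse(bad_body)) # => false
```

Note that the expectation sits on the left-hand side of ==, which is also where assert_equal puts its first argument.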
any_integer and many other methods (as well as ways to compose them) are available on GitHub: https://github.com/wojtekmach/anything. For usage examples, see the tests: https://github.com/wojtekmach/anything/blob/master/test/anything_test.rb
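To give a rough idea of how such matchers can be generalized, here is a sketch of a type-parameterized version. The names AnyInstanceOf and any_of_type are hypothetical and chosen for this example only; this is not the API of the anything library linked above.

```ruby
# Hypothetical generalization of AnyInteger: equal to any instance of klass.
class AnyInstanceOf
  def initialize(klass)
    @klass = klass
  end

  def ==(other)
    other.is_a?(@klass)
  end

  def inspect
    "any_of_type(#{@klass})"
  end
end

def any_of_type(klass)
  AnyInstanceOf.new(klass)
end

# Matchers nest inside hashes and arrays because both compare with ==.
expected = {
  "id" => any_of_type(Integer),
  "name" => any_of_type(String),
  "tags" => [any_of_type(String)],
}

good = {"id" => 7, "name" => "Item 1", "tags" => ["new"]}
bad  = {"id" => "7", "name" => "Item 1", "tags" => ["new"]}

expected == good # => true
expected == bad  # => false
```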
Conclusion
In this post I described a different approach to testing data structures: in some cases the exact value matters less than some property of that value. One could say that testing for type (and not for the actual value) is kind of weak, and that languages with static types solve this “by design”, and I tend to agree. However, what I’m trying to show here is testing the “shape” of data (which is very useful for JSON APIs) without adding the complexity to test code that comes from extracting only the subset of attributes that matter. I was inspired by ideas from property-based testing when writing this post, but I’m not sure people would say these two techniques are related. If anything, one could say that instead of bringing the best of both worlds (exact values from example-based testing and checking properties against random data from property-based testing) I’m showing the worst of both worlds :-) Let me know what you think in the comments!