Testing Shape of Data

Last time I wrote about integration testing on different levels. In this post I want to explore a topic that’s somewhat related to integration testing.

When testing code that integrates with DBs and JSON APIs, I often find myself caring about (and wanting to test) only a subset of the data. To give a concrete example, let’s say we’re integration testing a simple Rails controller with an /items endpoint.

The simplest possible test for that could be just:

def test_create
  Item.create!(name: "Item 1")
  Item.create!(name: "Item 2")

  get "/items"

  assert_equal [
    {"id" => 1, "name" => "Item 1"},
    {"id" => 2, "name" => "Item 2"},
  ], JSON(response.body)
end

There’s a problem with this test: the hardcoded ids (this is even more painful with things like created_at). This can be solved pretty easily by introducing local variables:

def test_create
  item1 = Item.create!(name: "Item 1")
  item2 = Item.create!(name: "Item 2")

  get "/items"

  assert_equal [
    {"id" => item1.id, "name" => "Item 1"},
    {"id" => item2.id, "name" => "Item 2"},
  ], JSON(response.body)
end

However, I think this introduces some complexity into the test code and decreases readability a bit. Also, I don’t really care which ids were auto-generated by my DB. My intent is to test that the name attribute was correctly saved; the id attribute is just an incidental detail.

Another way of addressing this problem is to extract only the fields I care about, at the cost of even more complexity in the test code:

def test_create
  item1 = Item.create!(name: "Item 1")
  item2 = Item.create!(name: "Item 2")

  get "/items"

  assert_equal [
    "Item 1", "Item 2",
  ], JSON(response.body).map { |h| h['name'] }
end

However, I think this is really hard to read. Moreover, as this test grows I’d be concerned about introducing a bug in the test code itself.

As mentioned above, my intent is to test the name attribute, and I don’t really care what the id attribute is: as far as this test is concerned, it can be any integer. This is the test I’d like to write:

def test_create
  Item.create!(name: "Item 1")
  Item.create!(name: "Item 2")

  get "/items"

  assert_equal [
    {"id" => any_integer, "name" => "Item 1"},
    {"id" => any_integer, "name" => "Item 2"},
  ], JSON(response.body)
end

Let me show you how this can be achieved.

Building `any_integer`

Let’s break down the test above into the simplest possible code we could discuss. It boils down to:

assert_equal [any_integer, any_integer], [1, 2]

Minitest’s assert_equal uses Ruby’s equality operator == to do the comparison, calling it on the expected value (the first argument). In this case, we’re comparing two Arrays, and per the documentation, “Two arrays are equal if they contain the same number of elements and if each element is equal to (according to Object#==) the corresponding element in other_ary.”

So, all we need to do is implement the == method on the any_integer object:
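To see that delegation in action, here is a small sketch. The Recorder class is a hypothetical helper made up for this illustration (it is not part of Minitest or any library); it only remembers what it gets compared against.

```ruby
# Hypothetical helper: remembers every object it is compared against.
class Recorder
  attr_reader :seen

  def initialize
    @seen = []
  end

  # Called by Array#== once per element of the left-hand array.
  def ==(other)
    @seen << other
    true
  end
end

recorder = Recorder.new

same_length = [recorder, recorder] == [1, 2]
different_length = [recorder] == [1, 2, 3]

puts same_length       # prints "true"
puts different_length  # prints "false"
p recorder.seen        # prints [1, 2]
```

The length check happens first, so the second comparison never reaches the elements. When the lengths match, each element of the left-hand array decides for itself whether it equals its counterpart.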

class AnyInteger
  def ==(other)
    other.is_a? Integer
  end
end

def any_integer
  AnyInteger.new
end

[any_integer, any_integer] == [1, 2]    # => true
[any_integer, any_integer] == [1, :bad] # => false
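The same trick carries over to the Hashes we get from parsed JSON, because Hash#== also compares values with ==. Below is a minimal sketch that runs without Rails: body and bad_body are hard-coded stand-ins for response.body, and the inspect method is my own optional addition, on the assumption that a readable name is nicer in failure messages (Minitest builds them from inspect).

```ruby
require "json"

class AnyInteger
  def ==(other)
    other.is_a?(Integer)
  end

  # Optional addition (not in the snippet above): a readable name
  # for failure messages, which are built from #inspect.
  def inspect
    "any_integer"
  end
end

def any_integer
  AnyInteger.new
end

# Stand-in for response.body; the ids are arbitrary.
body = '[{"id": 17, "name": "Item 1"}, {"id": 42, "name": "Item 2"}]'

expected = [
  {"id" => any_integer, "name" => "Item 1"},
  {"id" => any_integer, "name" => "Item 2"},
]

puts expected == JSON.parse(body)  # prints "true"

# A non-integer id no longer matches.
bad_body = '[{"id": "17", "name": "Item 1"}, {"id": 42, "name": "Item 2"}]'
puts expected == JSON.parse(bad_body)  # prints "false"
```

Note that the expected structure has to be on the left-hand side of ==, which is where assert_equal puts its first argument anyway.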

any_integer and many other methods (as well as ways to compose them) are available on GitHub: https://github.com/wojtekmach/anything, https://github.com/wojtekmach/anything/blob/master/test/anything_test.rb
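To give a rough idea of how the approach could generalize (for example to the created_at case mentioned earlier), here is a sketch. The Matching class and the matching helper are hypothetical names I made up for this example; they are not the API of the anything library, so check the repository for what it actually provides.

```ruby
# Hypothetical generalization; the names are illustrative only and are
# not taken from the anything library.
class Matching
  def initialize(pattern)
    @pattern = pattern
  end

  # Case equality (===) lets one class handle Classes, Regexps and Ranges.
  def ==(other)
    @pattern === other
  end

  def inspect
    "matching(#{@pattern.inspect})"
  end
end

def matching(pattern)
  Matching.new(pattern)
end

expected = {
  "id"         => matching(Integer),
  "name"       => "Item 1",
  "created_at" => matching(/\A\d{4}-\d{2}-\d{2}T/),
}

actual = {"id" => 7, "name" => "Item 1", "created_at" => "2024-01-31T12:00:00Z"}

puts expected == actual                             # prints "true"
puts expected == actual.merge("created_at" => nil)  # prints "false"
```

Leaning on === keeps the matcher tiny, at the cost of being a bit implicit about what kind of pattern is expected.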

Conclusion

In this post I described a different approach to testing data structures, where in some cases the exact value isn’t important as long as some property of that value is being checked. One could say that testing for type (and not for the actual value) is kind of weak, and that languages with static types solve this “by design”, and I tend to agree. However, what I’m trying to show here is testing for the “shape” of data (which is very useful with JSON APIs) while avoiding the complexity that extracting only the attributes that matter adds to test code.

I was inspired by ideas from property-based testing to write this blog post, but I’m not sure people would say these two techniques are related. If anything, one could say that instead of bringing the best of both worlds (exact values from example-based testing and checking properties against random data from property-based testing) I’m showing the worst of both worlds :-) Let me know what you think in the comments!
