test.check

Rerunning a particular failure is difficult.

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None

Description

Test.check has the concept of a seed that is used when a property is tested. The seed can be supplied or generated by the current time. Either way, the single seed is used to generate all the trials for the test. If a test fails, the only way to try to re-test the property on the same inputs is to run all of the tests again using the same seed. If the tests were running for hours before a failure was found, this is prohibitive.

I would like instead to be able to check the property for exactly the input that failed. One step in that direction is to have an explicit different seed for each trial (perhaps generated from an initial metaseed), and on failure to report a [seed size] pair that can be used to run a test a single time.

The way this interacts with shrinking is interesting though, since it's just as likely that I'll want to re-run on the shrunk value from a failure. Since the shrunk value was never explicitly generated from a known [seed size], just that information is insufficient. We could perhaps report [seed size shrink-path] where the third element is a list of indexes to walk through the shrink tree with. Probably the worst part about that is it might be rather long.

In any case I think something like this would be more useful than the current setup. I'll probably be hacking up something similar for a fork branch I'm maintaining, but if we can settle on a specific design I'd be happy to cook up a patch.

Activity

Hide
Reid Draper added a comment - - edited

When a test fails, we do return the arguments that originally failed, and the arguments of the shrunk test. Is this insufficient?

For example:

{:result false,
       :failing-size 45,
       :num-tests 46,
       :fail [[10 1 28 40 11 -33 42 -42 39 -13 13 -44 -36 11 27 -42 4 21 -39]],
       :shrunk {:total-nodes-visited 38,
                :depth 18,
                :result false,
                :smallest [[42]]}}

see :fail and [:shrunk :smallest].

Show
Reid Draper added a comment - - edited When a test fails, we do return the arguments that originally failed, and the arguments of the shrunk test. Is this insufficient? For example:
{:result false,
       :failing-size 45,
       :num-tests 46,
       :fail [[10 1 28 40 11 -33 42 -42 39 -13 13 -44 -36 11 27 -42 4 21 -39]],
       :shrunk {:total-nodes-visited 38,
                :depth 18,
                :result false,
                :smallest [[42]]}}
see :fail and [:shrunk :smallest].
Hide
Gary Fredericks added a comment - - edited

It's certainly adequate to reproduce in theory, but I don't know of any built in mechanism that makes this easy. Here's what I end up doing:

Given this:

(defspec foo
  (prop/for-all [x gen]
    (f x)))

and after getting a failure with a 50-line shrunk value (e.g. :bar), I manually pretty-print it and paste it back into my file like so:

(def shrank
  ':bar)

(defspec foo
  (prop/for-all [x #_gen (gen/return shrank)]
    (f x)))

And now I can run (foo 1) to re-run with the failing value.

This awkwardness is partly due to three facts:

  1. My generators create large unwieldy values
  2. My property is a large body of code rather than a single function
  3. There's no easy way to test a property on a specific value

I could mitigate some of the pain by fixing #2 – having each one of my tests split into a defspec / prop/for-all and a plain function. But #1 is not easy to fix, and having a small tuple is rather nicer for me.

I've already written a POC for this and it's working rather well. I used the term :key to refer to the tuple, so the term doesn't conflict with :seed. I end up calling the test like (foo 0 :key [329489249329323 19 []]), but I'll probably think up something else that doesn't require passing a dummy first arg.

Show
Gary Fredericks added a comment - - edited It's certainly adequate to reproduce in theory, but I don't know of any built in mechanism that makes this easy. Here's what I end up doing: Given this:
(defspec foo
  (prop/for-all [x gen]
    (f x)))
and after getting a failure with a 50-line shrunk value (e.g. :bar), I manually pretty-print it and paste it back into my file like so:
(def shrank
  ':bar)

(defspec foo
  (prop/for-all [x #_gen (gen/return shrank)]
    (f x)))
And now I can run (foo 1) to re-run with the failing value. This awkwardness is partly due to three facts:
  1. My generators create large unwieldy values
  2. My property is a large body of code rather than a single function
  3. There's no easy way to test a property on a specific value
I could mitigate some of the pain by fixing #2 – having each one of my tests split into a defspec / prop/for-all and a plain function. But #1 is not easy to fix, and having a small tuple is rather nicer for me. I've already written a POC for this and it's working rather well. I used the term :key to refer to the tuple, so the term doesn't conflict with :seed. I end up calling the test like (foo 0 :key [329489249329323 19 []]), but I'll probably think up something else that doesn't require passing a dummy first arg.

People

Vote (0)
Watch (0)

Dates

  • Created:
    Updated: