test.check

Doc improvements: how to separate "size" from "range"

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None

Description

Size is discussed mainly in terms of recursive generators. However, it is crucial to understanding other aspects such as vector length and integer ranges.

Some examples like these would be helpful:

  • How to get a small number of samples with a large range.
  • How to get a large number of samples with a small range.

Activity

Hide
Reid Draper added a comment -

Agree. I'll try and get a patch up in the next couple of days. Do you just have something in mind going in the doc/ dir?

Show
Reid Draper added a comment - Agree. I'll try and get a patch up in the next couple of days. Do you just have something in mind going in the doc/ dir?
Hide
Brian Craft added a comment -

I have exactly this problem. Are there any examples from projects using test.check?

Show
Brian Craft added a comment - I have exactly this problem. Are there any examples from projects using test.check?
Hide
Reid Draper added a comment - - edited

Brian, can you be more specific about your problem? Some helpful functions regarding sizing and domain-generation are documented here. Take a look specifically at gen/choose, gen/resize, and gen/sized.

Show
Reid Draper added a comment - - edited Brian, can you be more specific about your problem? Some helpful functions regarding sizing and domain-generation are documented here. Take a look specifically at gen/choose, gen/resize, and gen/sized.
Hide
Brian Craft added a comment -

I don't need to test, for example, strings that are 500 chars long, but do need to test lists of 500 strings. If I have a generator that builds lists of strings & size it larger, both the lists and strings get larger. Then I start hitting irrelevant resource limits, like db field widths, or jvm memory. It seems like I need a way to specify how different data elements grow. Just setting the range, like (gen/vector ... 0 5000), seems wrong, because then it always picks up to 5000, while I suspect for shrinking it needs to respond to "size". Something like this?

(gen/sized (fn [size] (gen/resize (* size size) (gen/vector (gen/resize size gen/int)))))

forcing the size of the vector to increase more quickly than the size of the elements in the vector.

I gave this a try, but hit the vector stack overflow problem. I will try the fmap/flatten workaround, but am wondering if this is the right approach. Also wondering about the range of "size". From the code it looks like it goes to 100 due to sample-seq. So functions of size passed to gen/sized should expect range from 0-99, and return the largest meaningful data when given size parameter 99?

Show
Brian Craft added a comment - I don't need to test, for example, strings that are 500 chars long, but do need to test lists of 500 strings. If I have a generator that builds lists of strings & size it larger, both the lists and strings get larger. Then I start hitting irrelevant resource limits, like db field widths, or jvm memory. It seems like I need a way to specify how different data elements grow. Just setting the range, like (gen/vector ... 0 5000), seems wrong, because then it always picks up to 5000, while I suspect for shrinking it needs to respond to "size". Something like this? (gen/sized (fn [size] (gen/resize (* size size) (gen/vector (gen/resize size gen/int))))) forcing the size of the vector to increase more quickly than the size of the elements in the vector. I gave this a try, but hit the vector stack overflow problem. I will try the fmap/flatten workaround, but am wondering if this is the right approach. Also wondering about the range of "size". From the code it looks like it goes to 100 due to sample-seq. So functions of size passed to gen/sized should expect range from 0-99, and return the largest meaningful data when given size parameter 99?
Hide
Reid Draper added a comment -

I don't need to test, for example, strings that are 500 chars long, but do need to test lists of 500 strings. If I have a generator that builds lists of strings & size it larger, both the lists and strings get larger.

Indeed, by default, all generators will 'grow' together. And as you've correctly noticed, you can use gen/sized and gen/resize to further control this.

Just setting the range, like (gen/vector ... 0 5000), seems wrong, because then it always picks up to 5000, while I suspect for shrinking it needs to respond to "size". Something like this?

Shrinking is completely separate from the size parameter. You can safely use (gen/vector ... 0 n) without negatively affecting shrinking.

(gen/sized (fn [size] (gen/resize (* size size) (gen/vector (gen/resize size gen/int)))))

forcing the size of the vector to increase more quickly than the size of the elements in the vector.

I gave this a try, but hit the vector stack overflow problem. I will try the fmap/flatten workaround, but am wondering if this is the right approach.

Your example looks fine. And you can avoid the stack overflow problem by simply not allowing size to grow above a certain limit, say 5000: (gen/resize (min 5000 (* size size)) ...).

(gen/sized (fn [size] (gen/resize (* size size) (gen/vector (gen/resize size gen/int)))))

forcing the size of the vector to increase more quickly than the size of the elements in the vector.

Also wondering about the range of "size". From the code it looks like it goes to 100 due to sample-seq. So functions of size passed to gen/sized should expect range from 0-99, and return the largest meaningful data when given size parameter 99?

When testing, the default max size is 200. However, you can override this simply by passing in a :max-size n argument to tc/quick-check. For example: (tc/quick-check 100 my-prop :max-size 75).

Hope this helps, and don't hesitate to ask anything else.

Show
Reid Draper added a comment -
I don't need to test, for example, strings that are 500 chars long, but do need to test lists of 500 strings. If I have a generator that builds lists of strings & size it larger, both the lists and strings get larger.
Indeed, by default, all generators will 'grow' together. And as you've correctly noticed, you can use gen/sized and gen/resize to further control this.
Just setting the range, like (gen/vector ... 0 5000), seems wrong, because then it always picks up to 5000, while I suspect for shrinking it needs to respond to "size". Something like this?
Shrinking is completely separate from the size parameter. You can safely use (gen/vector ... 0 n) without negatively affecting shrinking.
(gen/sized (fn [size] (gen/resize (* size size) (gen/vector (gen/resize size gen/int))))) forcing the size of the vector to increase more quickly than the size of the elements in the vector. I gave this a try, but hit the vector stack overflow problem. I will try the fmap/flatten workaround, but am wondering if this is the right approach.
Your example looks fine. And you can avoid the stack overflow problem by simply not allowing size to grow above a certain limit, say 5000: (gen/resize (min 5000 (* size size)) ...).
(gen/sized (fn [size] (gen/resize (* size size) (gen/vector (gen/resize size gen/int))))) forcing the size of the vector to increase more quickly than the size of the elements in the vector. Also wondering about the range of "size". From the code it looks like it goes to 100 due to sample-seq. So functions of size passed to gen/sized should expect range from 0-99, and return the largest meaningful data when given size parameter 99?
When testing, the default max size is 200. However, you can override this simply by passing in a :max-size n argument to tc/quick-check. For example: (tc/quick-check 100 my-prop :max-size 75). Hope this helps, and don't hesitate to ask anything else.
Hide
Brian Craft added a comment -

At the moment the hardest bit for me is understanding the shrinking. Composing generators, I inadvertently build things that won't shrink. E.g. to generate a 2d matrix of numbers with row & column names, I make use of bind so I can size vectors of names to match the matrix size, but then the resulting matrix/vectors will not shrink.

Here's a small example. Generate a vector of short strings, and test that they are all unique.

(tc/quick-check 40 (prop/for-all [f (gen/bind gen/int (fn [i] (gen/vector (gen/resize 5 (gen/such-that not-empty gen/string-ascii)) i)))] (do (println "f" f "count" (count f)) (= (count f) (count (set f))))))

Run this a few times & it will hit a duplicate string, failing the test. It is then unable to shrink the vector.

Here's a case generating two vectors of the same size by generating one vector, then using bind to generate a second of the same size. Testing uniqueness on the second vector, it is able to shrink, however it does so by regenerating the 2nd vector for every test.

(tc/quick-check 40 (prop/for-all [f (gen/bind (gen/vector (gen/resize 5 (gen/such-that not-empty gen/string-ascii))) (fn [v] (gen/hash-map :a (gen/return v) :b (gen/vector gen/int (count v)))))] (do (println "f" f "count" (count (:a f))) (= (count (:b f)) (count (set (:b f)))))))

I expect this will shrink poorly in many cases, because regenerating the second vector destroys the (possibly very rare) failure condition. I suspect this is why my 2d matrices shrink poorly. One really wants to be able to shrink by dropping the same positional element from both vectors. In the case of a fixed dimension (two), you can rewrite it to generate a vector of tuples. For a 2d matrix, though, I'm not sure what to do. Ideally it would shrink by dropping either a column or a row, and the corresponding column or row label. Perhaps I need to write a recursive generator like vector?

Are there any docs/papers describing how the shrinking works?

Show
Brian Craft added a comment - At the moment the hardest bit for me is understanding the shrinking. Composing generators, I inadvertently build things that won't shrink. E.g. to generate a 2d matrix of numbers with row & column names, I make use of bind so I can size vectors of names to match the matrix size, but then the resulting matrix/vectors will not shrink. Here's a small example. Generate a vector of short strings, and test that they are all unique. (tc/quick-check 40 (prop/for-all [f (gen/bind gen/int (fn [i] (gen/vector (gen/resize 5 (gen/such-that not-empty gen/string-ascii)) i)))] (do (println "f" f "count" (count f)) (= (count f) (count (set f)))))) Run this a few times & it will hit a duplicate string, failing the test. It is then unable to shrink the vector. Here's a case generating two vectors of the same size by generating one vector, then using bind to generate a second of the same size. Testing uniqueness on the second vector, it is able to shrink, however it does so by regenerating the 2nd vector for every test. (tc/quick-check 40 (prop/for-all [f (gen/bind (gen/vector (gen/resize 5 (gen/such-that not-empty gen/string-ascii))) (fn [v] (gen/hash-map :a (gen/return v) :b (gen/vector gen/int (count v)))))] (do (println "f" f "count" (count (:a f))) (= (count (:b f)) (count (set (:b f))))))) I expect this will shrink poorly in many cases, because regenerating the second vector destroys the (possibly very rare) failure condition. I suspect this is why my 2d matrices shrink poorly. One really wants to be able to shrink by dropping the same positional element from both vectors. In the case of a fixed dimension (two), you can rewrite it to generate a vector of tuples. For a 2d matrix, though, I'm not sure what to do. Ideally it would shrink by dropping either a column or a row, and the corresponding column or row label. Perhaps I need to write a recursive generator like vector? Are there any docs/papers describing how the shrinking works?
Hide
Gary Fredericks added a comment -

I've included a generator called cap-size in the test.chuck library.

Show
Gary Fredericks added a comment - I've included a generator called cap-size in the test.chuck library.

People

Vote (1)
Watch (3)

Dates

  • Created:
    Updated: