test.check

Default sizing on gen/any needs re-evaluation

Details

  • Type: Defect
  • Status: Open
  • Priority: Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels: None

Description

The following innocuous-looking test blows the heap and then crashes a 4GB JVM with an out-of-memory error:

(tc/defspec merge-is-idempotent
  100
  (prop/for-all [m (gen/map gen/any gen/any)]
    (= m (merge m m))))

I understand how this happens, and how to fix it by adjusting the size parameters.

However, it would be great if using the defaults did not have the potential for such a nasty failure mode (particularly as the unwary user will have trouble determining that the fault is not in their code).
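
The size adjustment mentioned above could look something like the following. This is only a sketch: the caps of 10 and 30 are arbitrary illustrative values, and the names small-any and merge-is-idempotent-prop are made up for this example.

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; Assumption: 10 is an arbitrary cap chosen for illustration.
;; gen/resize pins the size seen by gen/any, whatever the test run asks for.
(def small-any (gen/resize 10 gen/any))

(def merge-is-idempotent-prop
  (prop/for-all [m (gen/map small-any small-any)]
    (= m (merge m m))))

;; :max-size caps the size passed to the outer map generator as well,
;; which bounds the number of entries. 30 is again an arbitrary choice.
(def result
  (tc/quick-check 100 merge-is-idempotent-prop :max-size 30))
```

Capping the element generators and the run's max-size separately keeps both the number of entries and the size of each entry small, at the cost of exploring less of the input space.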

Activity

Philip Potter added a comment -

I was going to raise a separate ticket for this, but it seems like it might be related: gen/any (and any-printable) seem to take exponential time in the size of the input. See for example this session:

user> (time (dorun (gen/sample (gen/resize 50 gen/any-printable))))
"Elapsed time: 2204.284643 msecs"
nil
user> (time (dorun (gen/sample (gen/resize 55 gen/any-printable))))
"Elapsed time: 2620.717337 msecs"
nil
user> (time (dorun (gen/sample (gen/resize 60 gen/any-printable))))
"Elapsed time: 5923.636336 msecs"
nil
user> (time (dorun (gen/sample (gen/resize 65 gen/any-printable))))
"Elapsed time: 9035.762191 msecs"
nil
user> (time (dorun (gen/sample (gen/resize 70 gen/any-printable))))
"Elapsed time: 15393.687184 msecs"
nil
user> (time (dorun (gen/sample (gen/resize 75 gen/any-printable))))
"Elapsed time: 9510.571668 msecs"
nil
user> (time (dorun (gen/sample (gen/resize 80 gen/any-printable))))
"Elapsed time: 39591.543565 msecs"
nil

Apart from the anomaly at 75, adding 10 to the size seems to increase runtime by almost 3x.

Reid Draper added a comment -

I believe I should have a fix for this today, as well as an easier way to write custom, recursive generators.

Reid Draper added a comment -

This doesn't fix the OOM yet, but it makes a big difference: in a recursive generator, the number of elements generated now has a linear relationship with the size. Here's the branch: https://github.com/clojure/test.check/compare/feature;recursive-generator-helpers.

Reid Draper added a comment -

Fixed in https://github.com/clojure/test.check/commit/19ca756c95141af3fb9caa5e053b9d01120e5d7e. I'll try to get a snapshot build up tonight; it will be 0.5.9-SNAPSHOT. Check out the commit message for a full explanation of how this now works.

Philip Potter added a comment -

Awesome work! The new recursive-gen fn is great, too: I can immediately see a use case for generating arbitrary JSON-compatible data (i.e. vectors, maps, strings, numbers, bools, but no rationals, characters, symbols...).
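
A minimal sketch of such a JSON-compatible generator, assuming the gen/recursive-gen signature of a container-building function followed by a scalar generator. The names json-scalar, json-like and json-compatible? are hypothetical, and the choice of scalars (integers, ASCII strings, booleans, nil) is only one reasonable reading of "JSON-compatible".

```clojure
(require '[clojure.test.check.generators :as gen])

;; Leaves of the tree: the scalar values JSON can represent.
(def json-scalar
  (gen/one-of [gen/int gen/string-ascii gen/boolean (gen/return nil)]))

;; Containers: vectors of inner values, and maps with string keys.
(def json-like
  (gen/recursive-gen
    (fn [inner]
      (gen/one-of [(gen/vector inner)
                   (gen/map gen/string-ascii inner)]))
    json-scalar))

;; Hypothetical checker used to confirm the generated shape.
(defn json-compatible? [x]
  (cond
    (map? x)    (and (every? string? (keys x))
                     (every? json-compatible? (vals x)))
    (vector? x) (every? json-compatible? x)
    :else       (or (nil? x) (string? x) (integer? x)
                    (instance? Boolean x))))
```

Calling (gen/sample json-like) at the REPL is a quick way to eyeball the output.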

Philip Potter added a comment -

Just re-tested on my machine; performance is vastly improved though still superlinear:

clojure.test.mode> (time (dorun (gen/sample (gen/resize 100 gen/any-printable))))
"Elapsed time: 101.907628 msecs"
nil
clojure.test.mode> (time (dorun (gen/sample (gen/resize 200 gen/any-printable))))
"Elapsed time: 302.341697 msecs"
nil
clojure.test.mode> (time (dorun (gen/sample (gen/resize 400 gen/any-printable))))
"Elapsed time: 1154.098163 msecs"
nil
clojure.test.mode> (time (dorun (gen/sample (gen/resize 800 gen/any-printable))))
"Elapsed time: 2954.889396 msecs"
nil
clojure.test.mode> (time (dorun (gen/sample (gen/resize 1600 gen/any-printable))))
"Elapsed time: 22335.200578 msecs"
nil

although since the default max-size is 200, this is very much no big deal.

Reid Draper added a comment -

I'm not too surprised that performance is still superlinear. What should be linear is the number of leaf nodes in the generated tree. We may eventually be able to do something a little more complex and have the total number of nodes be linear (including internal nodes). For now, however, it should be considered a bug if the number of leaf nodes doesn't grow linearly with the size parameter.
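
One way to check the leaf-node relationship empirically is to count leaves in sampled values at several sizes. The sketch below is an assumption about how one might measure it, not part of test.check: leaf-count and mean-leaves are hypothetical helpers, and empty collections are counted as zero leaves.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Count non-collection values in a nested structure.
;; Map entries are collections too, so keys and values are both visited.
(defn leaf-count [x]
  (if (coll? x)
    (reduce + 0 (map leaf-count x))
    1))

;; Average number of leaves over n samples generated at a fixed size.
(defn mean-leaves [size n]
  (let [samples (gen/sample (gen/resize size gen/any-printable) n)]
    (double (/ (reduce + 0 (map leaf-count samples)) n))))

(comment
  ;; Compare averages at a few sizes; roughly proportional growth
  ;; would be consistent with the linear relationship described above.
  (doseq [size [25 50 100 200]]
    (println size (mean-leaves size 50))))
```

Averages from small sample counts are noisy, so a larger n gives a steadier picture.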

Reid Draper added a comment -

I'm still seeing the original example OOM. I'm not sure of the best way to solve that. For a single layer of nested generators, the above patch seems to make a big difference. But when you make a map of gen/any, map doesn't know that its arguments are themselves recursive generators. This means you might create 100 keys and values, each of which is itself a large recursive structure. There's a balance to be struck between making the default recursive generators 'large enough' by themselves, but not too large when used like this. Hmm... I think I'll go ahead and release 0.5.9, and keep thinking of ways to make this even better.
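
Until the defaults change, one possible user-side workaround is to shrink the size budget before it reaches the map generator, so that the entry count and the size of each entry multiply out to roughly the requested size. This is a sketch of that idea only, not the library's fix; budgeted-any-map is a hypothetical name and the square-root split is an arbitrary heuristic.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Run the whole map generator at about sqrt(size). Both the number of
;; entries and the size seen by each gen/any key and value are then
;; bounded by roughly sqrt(size), instead of each growing with size.
(def budgeted-any-map
  (gen/sized
    (fn [size]
      (gen/resize (long (Math/sqrt size))
                  (gen/map gen/any gen/any)))))
```

The trade-off is the one described above: each individual key or value is much smaller than gen/any would produce on its own at the same size.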

