
[TCHECK-74] gen-int generates very small ints Created: 02/Jul/15  Updated: 02/Jul/15

Status: Open
Project: test.check
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Jonathan Leonard Assignee: Reid Draper
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Sorry for creating an issue for this, but I couldn't find any other way to communicate with the developers of this project. Can someone explain why taking 100,000 ints never yields an int larger than 100? Is it due to the 'size' parameter? Does the runtime just adjust 'size' on the fly to reach 2.7 billion or whatnot?

If manually sampling a generator, how can one adjust the size?



 Comments   
Comment by Gary Fredericks [ 02/Jul/15 3:22 PM ]

This is a pretty major gotcha currently, and one that definitely would have been fixed by now if there were an obvious best way to fix it.

See the "Numerics" section of this page if you're interested in my ideas for fixing it.

Fortunately there's nothing blocking you from generating what you want once you understand how the pieces work together.

By default the size parameter ranges from 0-200 in normal testing (even smaller if you run your tests fewer than 200 times at once), which means that gen/int and similar generators, whose output range is bounded by size, stay quite small.

There are at least a couple of options for opting in to larger numbers currently:

  • use the new gen/scale function, e.g. (gen/scale #(* % 10000) gen/int)
  • use the gen/choose function, with the main downside being it ignores the size parameter and generates uniformly within the given range (but it does shrink in the normal way)
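
A rough sketch of the two options above, assuming a test.check version that includes gen/scale and the conventional gen alias for clojure.test.check.generators. The names wide-int and bounded-int are made up for illustration:

```clojure
(require '[clojure.test.check.generators :as gen])

;; Option 1: multiply the size seen by gen/int, so its range grows
;; with size but about 10000x faster than the default.
(def wide-int (gen/scale #(* % 10000) gen/int))

;; Option 2: ignore size entirely and draw uniformly from a fixed,
;; inclusive range (the bounds here are arbitrary).
(def bounded-int (gen/choose -1000000 1000000))

;; Draw a few values from each; results vary from run to run.
(gen/sample wide-int 5)
(gen/sample bounded-int 5)
```

The scaled generator still starts small and grows as size grows, while the gen/choose version can return a value anywhere in its range from the very first sample.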

As for setting the size when sampling, I recommend the new gen/generate function, which takes an optional size argument.
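
For example, a minimal sketch of sampling at a chosen size, assuming a test.check version that includes gen/generate. The helper sample-at-size is hypothetical and not part of the library:

```clojure
(require '[clojure.test.check.generators :as gen])

;; One value at whatever size gen/generate uses by default.
(gen/generate gen/int)

;; One value at an explicit size of 5000, so gen/int may return
;; anything between -5000 and 5000.
(gen/generate gen/int 5000)

;; Hypothetical helper: draw n values from generator g, all at one size.
(defn sample-at-size [g size n]
  (vec (repeatedly n #(gen/generate g size))))

(sample-at-size gen/int 5000 10)
```

Unlike gen/sample, which steps through small sizes, this holds the size fixed for every draw.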

Comment by Gary Fredericks [ 02/Jul/15 3:23 PM ]

I was looking for another ticket on this topic since I figured one must exist, but I didn't find it. So I'll probably leave this ticket open as a representative of the broader numerics issue.

Comment by Jonathan Leonard [ 02/Jul/15 4:24 PM ]

Thanks for the reply/explanation.

However, doesn't the `gen/scale` solution still have the drawback that it will be biased towards larger numbers, i.e., that it will be impossible to return integers between 1 and 9999 in your example code?

Comment by Gary Fredericks [ 02/Jul/15 8:23 PM ]

The distribution should be a uniform range of size roughly (* 10000 size), so small numbers are definitely possible:

user> (def g (gen/scale #(* % 10000) gen/int))
#'user/g
user> (gen/sample g)
(0 -3423 -14429 788 -16910 49010 4363 56297 -1776 58167)




Generated at Sat Jul 04 21:51:07 CDT 2015 using JIRA 4.4#649-r158309.