s/coll-of and s/every gen is very slow if :kind specified without :into

Description

An s/coll-of with :kind but without :into takes a very long time to generate. You will mostly notice this when using stest/check but here is an example that replicates the important bits:
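
The ticket's original example is not preserved above. A minimal sketch of the kind of comparison it describes follows. The spec names `::slow` and `::fast` and the helper `sample-all` are hypothetical. It assumes test.check is on the classpath and uses the `clojure.spec.alpha` namespaces (at alpha14 these were `clojure.spec` and `clojure.spec.gen`). On versions that include the fix, both timings should come out similar.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

;; Hypothetical specs for illustration; not taken from the ticket.
;; ::slow leaves :into to be derived from :kind, ::fast states it explicitly.
(s/def ::slow (s/coll-of int? :kind vector?))
(s/def ::fast (s/coll-of int? :kind vector? :into []))

(defn sample-all
  "Realizes n samples from the generator of spec."
  [spec n]
  (doall (gen/sample (s/gen spec) n)))

;; Compare wall-clock times of the two variants.
(time (sample-all ::slow 100))
(time (sample-all ::fast 100))
```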

Cause: If :into is not provided, s/every and s/coll-of have to generate a collection (via gen of list?) in order to call empty on it and then fill the result up. This is quite clearly documented in the docstring of `s/every`:

Presumably the gen of list? produces quite large collections at a certain point, which significantly slows down generation. The responsible code is in gen* of every-impl.

Proposed: Use standard empty coll for persistent collection :kind preds (list? vector? set? map?).

In the patch:

  • Create lookup table `empty-coll` for kind-form (resolved pred) to empty coll.

  • When possible (if :into coll is specified or kind-form is one of these special cases), determine the empty gen-into collection once at the beginning.

  • If empty gen-into collection is available, start with that in the generator. Otherwise, fall back to prior behavior for kind.
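
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of clj-2103.patch: the function name `gen-into-init` and its argument names are hypothetical, and only the table name `empty-coll` comes from the description.

```clojure
;; Lookup table from resolved :kind predicate symbol to an empty collection.
;; Syntax-quote resolves each symbol to clojure.core/...
(def empty-coll
  {`vector? []
   `set?    #{}
   `list?   ()
   `map?    {}})

(defn gen-into-init
  "Hypothetical helper: returns the empty collection to generate into,
  or nil when it cannot be determined up front."
  [into-coll kind-form]
  (cond
    ;; An explicit :into always wins.
    (some? into-coll) (empty into-coll)
    ;; A known persistent-collection predicate maps to its empty coll.
    (contains? empty-coll kind-form) (get empty-coll kind-form)
    ;; Otherwise the caller falls back to generating a value of :kind.
    :else nil))
```

A nil result here stands for the fallback case, where the generator would still have to produce a value from the :kind gen and call empty on it.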

Performance-wise, the times after the patch are comparable to times using :into.

Patch: clj-2103.patch

Environment

alpha14

Activity

Leon Grapenthin February 14, 2017 at 9:03 PM

I see two approaches to improve this behavior:

1. The gen uses the gen of :kind to generate one value at the smallest size and calls empty on it to determine :into. This would lead to a surprise when your :kind is, e.g., (s/or :a-vec vector? :a-list list?) (which currently throws anyway).
2. We use an internal lookup table to assume :into: {clojure.core/vector? [], clojure.core/set? #{} ...}
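
For comparison with the lookup table, approach 1 might look roughly like the sketch below. The name `infer-into` is hypothetical. It assumes test.check is on the classpath, uses the `clojure.spec.alpha` namespaces, and does not attempt to handle the s/or case mentioned above.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

(defn infer-into
  "Hypothetical helper: generates one value of kind-pred at size 0
  and returns an empty collection of the same type."
  [kind-pred]
  (empty (gen/generate (s/gen kind-pred) 0)))

;; Works for predicates with a built-in generator, such as vector? or set?.
(infer-into vector?)
(infer-into set?)
```

This still pays for one generation per spec, but at the smallest size, so the cost should not grow with the requested sample size.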

Completed

Created January 28, 2017 at 11:50 PM
Updated October 6, 2017 at 7:52 PM
Resolved October 6, 2017 at 7:52 PM