Better support for sets

Description

It seems reasonable to want to use a clojure set as the items parameter for the combinatoric functions. However, in the case of (combinations some-set 1), an exception is raised because Clojure's distinct function (called when (= t 1)) does not support sets. This problem can be worked around by converting the set to a seq before calling combinations.

For all the other functions, it appears extending all-different? with a set? check would make the functions slightly more efficient (by avoiding the linear distinct? scan).

Environment

None

Activity

Show:

Mark Engelberg May 27, 2017 at 8:58 AM

For the time being, I've added a note to the top of the usage notes in the README. The only thing that really breaks down for sets is the call to `distinct`, so if Clojure ever modifies the code for `distinct` so that it works on sets (which seems like it would be a trivial change), then this note can be removed.

import March 18, 2017 at 11:13 PM

Comment made by: nathan

I see some of the issue with combinations. Mathematical combinations are order-agnostic; but these are lists. For equality-testing you'd use: (set (map set (combinations some-set n)))

import March 18, 2017 at 10:52 PM

Comment made by: nathan

Thinking about #1, for anything but size-1 combinations, the behavior wouldn't be any less pure than it is now. It would also have the exact same purity as seq, vec, map or any other order-preserving function. (map inc some-set) has the same caveat for order. Since the issue of equality on order-hiding data-structures and order-preserving (revealing?) functions is a lot bigger than combinatorics functions, a suitable equality-warning really belongs with order-hiding objects themselves.

Thinking about the discussion on the mailing list, one of the reasons people use sets is to deliberately ignore order. I think the best advice would be to convert sequences back to sets before testing. (set (map inc some-set))

I think if we wouldn't leave map broken on sets, it's worth fixing here.

Mark Engelberg March 17, 2017 at 11:08 PM

I agree that the current behavior is unfriendly. Possible solutions are:

1) Implicitly call seq on all inputs (disadvantage: no longer pure functions)
2) Throw errors on sets (disadvantage: adds runtime check)
3) Add info to docstring that inputs must be a sequential (disadvantage: need to read docstring to understand constraint, does everyone know what a "sequential" means, and that sets and maps do not qualify?)
4) Add discussion on this point to README (disdavantage: does anyone read the README?)

I'd lean towards 3 or 4. I'm happy to make a change along those lines.

import March 17, 2017 at 10:46 PM

Comment made by: nathan

Fix or no fix, the least that needs to happen is an edit to the docstring.

As it is, combinations is going to work just fine until the size-1 corner-case hits. Hidden land-mines like this need some kind of fix.
Forcing action on ordering issues would entail throwing an exception, popping a warning, or making functions error on sets consistently.

Details

Assignee

Reporter

Priority

Created March 12, 2017 at 7:50 PM
Updated May 27, 2017 at 8:58 AM