JIRA ticket CLJ-1065 has been created for this.

See this thread on the Clojure Google group: https://groups.google.com/forum/?fromgroups=#!topic/clojure/AG667ACBd3I

Note especially Chas Emerick's detailed analysis of how we arrived at the current state, posted Aug 5, 2012, and Mark Engelberg's argument of Sep 4, 2012 in favor of reverting to the older behavior that never throws on duplicates.  Both should now be duplicated below.

 


RH Feedback Zone

Bugs

Set 'Problems'

Map 'Problems'

Opinions

Recommendations

  1. hash/sorted-set/map should make an explicit as-if-by-repeated-conj/assoc promise
    1. thus will never throw, and be consistent
    2. if you don't know that you have unique keys, use these!
  2. Document "Duplicate keys in map/set literals, evident or not, are user errors"
    1. saying that is not the same as guaranteeing they will generate exceptions!
    2. generate exceptions for now
    3. eventually move the (non-reader) check into debug mode, or otherwise provide runtime control
  3. Restore the fastest path possible for those cases where the keys are compile-time detectable unique constants
    1. high perf for known correct programs at least

end RH Feedback Zone
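
One way to read the "as-if-by-repeated-conj/assoc" promise in recommendation 1 is as the equivalence sketched below. This is only an illustration: set-by-conj and map-by-assoc are hypothetical names, not functions in clojure.core, and they say nothing about how the real constructors are or would be implemented.

```clojure
;; Hypothetical helpers, for illustration only (not part of clojure.core).

;; Builds a set as if by repeated conj: a repeated element is absorbed.
(defn set-by-conj [& xs]
  (reduce conj #{} xs))

;; Builds a map as if by repeated assoc: for a repeated key, the last
;; value wins. A trailing key without a value is ignored by partition.
(defn map-by-assoc [& kvs]
  (reduce (fn [m [k v]] (assoc m k v))
          {}
          (partition 2 kvs)))

(def a 28)
(def b 28)

(set-by-conj a b)        ; never throws
(map-by-assoc a 5 b 7)   ; never throws, keeps the value 7 for key 28
```

Under this reading, a constructor function never throws on duplicates and always gives the same answer as the equivalent chain of conj or assoc calls.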



Current behavior of Clojure 1.4.0

;; Sets
user=> #{28 28}
IllegalArgumentException Duplicate key: 28  clojure.lang.PersistentHashSet.createWithCheck (PersistentHashSet.java:68)

;; The contested case is a set literal whose elements are variables
;; that turn out to be equal at runtime; some do not want an exception here.
user=> (def a 28)
#'user/a
user=> (def b 28)
#'user/b
user=> #{a b}
IllegalArgumentException Duplicate key: 28  clojure.lang.PersistentHashSet.createWithCheck (PersistentHashSet.java:68)

;; set is one way to build a set from input that may contain duplicates.
user=> (set [a b])
#{28}


;; Maps

;; Similar to sets, except that only keys must be distinct.
;; However, in this case the construction functions array-map
;; and hash-map also disallow duplicate keys, whereas
;; sorted-map permits them.

user=> {a 5 b 7}
IllegalArgumentException Duplicate key: 28  clojure.lang.PersistentArrayMap.createWithCheck (PersistentArrayMap.java:70)
user=> (array-map a 5 b 7)
IllegalArgumentException Duplicate key: 28  clojure.lang.PersistentArrayMap.createWithCheck (PersistentArrayMap.java:70)
user=> (hash-map a 5 b 7)
IllegalArgumentException Duplicate key: 28  clojure.lang.PersistentHashMap.createWithCheck (PersistentHashMap.java:92)
user=> (sorted-map a 5 b 7)
{28 7}

;; assoc is one way to create a map that silently eliminates duplicate keys
user=> (assoc {} a 5 b 7)
{28 7}

 

Arguments for changing it back to never throwing exceptions on duplicates

1. "It's a bug that should be fixed."  The change to throw-on-duplicate behavior for sets in 1.3 was a breaking change that causes a runtime error in previously working, legitimate code.

Looking through the history of the issue, one can see that no one was directly asking for throw-on-duplicate behavior.  The underlying problem was that array-maps with duplicate keys returned nonsensical objects; surely it would be more user-friendly to just block people from creating such nonsense by throwing an error.  This logic was extended to other types of maps and sets.

It's not entirely clear to what degree the consequences of these changes were considered, but it seems likely that there was an implicit assumption that throw-on-duplicate behavior would only come into play in programs with some sort of syntactic error, when in fact it has semantic implications for working programs.  When a new "feature" causes unintentional breakage in working code, this is arguably a bug and needs to be reconsidered.

2. "The current way of doing things is internally inconsistent and therefore complex."

(def a 1)

(def b 1)

(set [a b]) -> good

(hash-set a b) -> error

#{a b} -> error

(sorted-set a b) -> good

(into #{} [a b]) -> good

The cognitive load from having to remember which constructors do what is a bad thing.  
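
The constructors marked "good" in the listing above can be exercised directly. The sketch below deliberately leaves out hash-set and the #{a b} literal, since whether those throw is exactly the behavior under discussion; the names from-set, from-into and from-sorted are introduced here only for illustration.

```clojure
(def a 1)
(def b 1)

;; Each of these absorbs the repeated value instead of rejecting it.
(def from-set    (set [a b]))
(def from-into   (into #{} [a b]))   ; note: into takes a collection
(def from-sorted (sorted-set a b))

;; All three contain the single element 1.
(= from-set from-into from-sorted)
```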

3. "Current behavior conflicts with the mathematical and intuitive notion of a set."

In math, {1, 1} = {1}.  In programming, sets are used as a means to eliminate duplicates.

Arguments for leaving things as is

Now let's summarize the arguments that have been raised here in support of the status quo.

1. "Changing everything to throw-on-duplicate would be just as logically consistent as changing everything to use-last-in."

True, but that doesn't mean that both approaches would be equally useful.  A core idea of sets is that they absorb duplicates gracefully, so at least one way of constructing them that does so is essential.  On the other hand, we can get along just fine without sets throwing errors in the event of a duplicate value.  So if you're looking for consistency, there's really only one practical option.

2.  "I like the idea that Clojure will protect me from accidentally making this kind of syntax error."

Clojure, as a dynamically typed language, is unable to protect you from the vast majority of data-entry syntax errors you're likely to make.

Let's say you want to type in {:apple 1, :banana 2}.  Even if Clojure can catch your mistake if you type {:apple 1, :apple 2}, there's no way it's ever going to catch you if you type {:apple 1, :banano 2}, and frankly, the latter error is one you're far more likely to make.
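
To make the point concrete, the misspelled map is accepted without complaint and the mistake only shows up later as a missing value. Catching it takes an explicit check against the keys you expected. The sketch below is one possible check; unknown-keys is a hypothetical helper written for this example, not an existing Clojure function.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helper: returns the keys of m that are not in expected.
(defn unknown-keys [expected m]
  (set/difference (set (keys m)) expected))

(def fruit {:apple 1, :banano 2})   ; misspelled key, no error raised

(get fruit :banana)                         ; nil, the typo goes unnoticed
(unknown-keys #{:apple :banana} fruit)      ; #{:banano}
```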

This is precisely why there's little evidence that anyone was asking for this kind of syntax error protection, and little evidence that anyone has benefited significantly from its addition -- its real-world utility is fairly minimal and dwarfed by the other kinds of errors one is likely to make.

3.  "Maybe we can do it both ways."

It's laudable to want to make everyone happy.  The danger, of course, is that such sentiment paints a picture that it would be a massive amount of work to please everyone, and therefore, we should do nothing.  Let's be practical about what is easily doable here with the greatest net benefit.  The current system has awkward and inconsistent semantics with little benefit.  Let's focus on fixing it.  The easiest patch: revert to 1.2 behavior, but bring array-map's semantics into alignment with the other associative collections.