...
- sorted-set duplicate handling behavior differs from hash-set (which throws)
- sorted-map duplicate handling behavior differs from hash-map (which throws)
Set 'Problems'
- set literals throw on duplicate keys
- Is it a user error?
- open question
- is there a purpose to writing #{42 42}?
- must every reader deal with that?
- if yes, then checking for dupes might be penalizing correct programs (perf-wise)
- not checking means maybe creating an invalid object
- if hash-set didn't throw, you would have an alternative when entries are not known to be unique
- arguably there should be no problem, since conflict free
- yet, a user seeing
- #{a b}
- should expect a set with 2 entries
- this behavior is just an artifact of sharing implementation with map
- hash-set throws on duplicate keys
- same reasons
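The set cases above can be sketched at a REPL. This is a minimal illustration, not a statement of what any particular release guarantees: `read-string` stands in for the reader, and `reads-ok?` is a hypothetical helper name, not part of Clojure. `hash-set` itself is deliberately not exercised, since whether it throws is exactly the behavior under discussion.

```clojure
;; Hypothetical helper: true if the reader accepts the text, false if it throws.
(defn reads-ok? [text]
  (try
    (read-string text)
    true
    (catch Exception _ false)))

;; An evident duplicate in a set literal is rejected by the reader.
(def evident-dupe-ok? (reads-ok? "#{42 42}"))

;; Distinct elements read fine.
(def distinct-ok? (reads-ok? "#{42 43}"))

;; Building by repeated conj is conflict-free: a duplicate is simply absorbed.
(def a 42)
(def b 42)
(def by-conj (reduce conj #{} [a b]))
```

Here `by-conj` ends up with one element, which is the "conflict free" argument; the counter-argument is that someone reading `#{a b}` expects two.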
Map 'Problems'
- map literals throw on duplicate keys
- this is not conflict-free, as values for same key might differ
- is it a user error?
- Yes
- This is an inarguably, and evidently, bad map:
- {:a 1 :b 2 :a 3}
- Using keys not known to be unique in a literal is bad form
- a user seeing this:
- {a 1 b 2 c 3}
- should expect a map with 3 entries
- if hash-map did not throw, you would have an alternative when keys are not known to be unique
- auto-resolving implies an order-of-consideration for map literal entries
- and there should not be one
- a complete semantic mess
- non-resolving alternatives:
- throw
- checking for dupes might be penalizing correct programs
- not checking means maybe creating an invalid object
- hash-map throws on duplicate keys
- here there might be an implicit order due to argument order
- could make an as-if-by-repeated-assoc promise
- that's the behavior of sorted-map
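The "repeated assoc" idea for maps can be sketched as follows. `assoc-all` is a hypothetical helper written for illustration only; it is not a Clojure function and not a proposed implementation. It resolves duplicates by argument order, with later values winning, which is the implicit order that function arguments have and literal entries should not.

```clojure
;; Hypothetical helper, not part of Clojure: build a map as if by repeated
;; assoc over the key/value arguments, left to right. Later values win.
(defn assoc-all [& kvs]
  (reduce (fn [m [k v]] (assoc m k v))
          {}
          (partition 2 kvs)))

(def resolved (assoc-all :a 1 :b 2 :a 3))

;; sorted-map already resolves duplicate keys this way.
(def sorted (sorted-map :a 1 :b 2 :a 3))

;; The literal form of the same entries is rejected by the reader.
(def literal-result
  (try
    (read-string "{:a 1 :b 2 :a 3}")
    (catch Exception _ :threw)))
```

Both `resolved` and `sorted` keep `:a 3`, while the literal never becomes a map at all.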
Opinions
- I don't think there is any merit whatsoever to supporting duplicates, evident or not, in literal sets and maps
- such programs are at worst broken and at best anti-social
- so, what should happen?
- there are read-time and runtime considerations
- first step - declare such things are user errors
- second step - decide on a reporting strategy
- Don't penalize correct programs!
- unchecked array-based map constructors are a critical way to competitive perf for object-like use of maps
- I think hash-set and hash-map should not throw on dupes
- and that hash/sorted-set/map should make an explicit as-if-by-repeated-conj/assoc promise
- If you think a month is too long to get a response to your needs, from a bunch of very busy volunteers, you need to chill out
- just because you decided to bring it up doesn't mean everyone else needs to drop what they are doing
- This page was useful, thanks.
Recommendations
- hash/sorted-set/map should make an explicit as-if-by-repeated-conj/assoc promise
- thus will never throw, and be consistent
- if you don't know that you have unique keys, use these!
- Document "Duplicate keys in map/set literals, evident or not, are user errors"
- saying that is not the same as guaranteeing they will generate exceptions!
- generate exceptions for now
- eventually move the (non-reader) check into debug mode, or otherwise provide runtime control
- Restore the fastest path possible for those cases where the keys are compile-time detectable unique constants
- high perf for known correct programs at least
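One way to picture "runtime control" over the duplicate check is a dynamic switch. Everything named here is an assumption made for illustration: `*check-duplicate-keys*` and `checked-map` do not exist in Clojure, and this sketch does not model the unchecked array-based fast path. It only shows a check that can be turned off so correct programs do not pay for it.

```clojure
;; Hypothetical switch, not part of Clojure: when true, the constructor
;; verifies key uniqueness; when false, it skips the check entirely.
(def ^:dynamic *check-duplicate-keys* true)

;; Hypothetical constructor taking alternating keys and values.
(defn checked-map [& kvs]
  (let [ks (take-nth 2 kvs)]
    (when (and *check-duplicate-keys*
               (seq ks)
               (not (apply distinct? ks)))
      (throw (IllegalArgumentException. "Duplicate key in map constructor")))
    ;; With the check off, duplicates resolve as if by repeated assoc.
    (reduce (fn [m [k v]] (assoc m k v)) {} (partition 2 kvs))))

;; Check enabled (the default here): the duplicate is reported.
(def with-check
  (try
    (checked-map :a 1 :a 3)
    (catch IllegalArgumentException _ :threw)))

;; Check disabled for this call only.
(def without-check
  (binding [*check-duplicate-keys* false]
    (checked-map :a 1 :a 3)))
```

In this sketch, turning the check off falls back to last-value-wins rather than producing an invalid object; a real unchecked constructor would make no such promise.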
end RH Feedback Zone
...
Current behavior of Clojure 1.4.0
...