empty? relies on seqability so doesn't work with non-seqable collections
Description
Environment
Attachments
Activity
Alex Miller June 30, 2022 at 9:32 PM
Released in 1.12.0-alpha1

Michael Fogus May 10, 2022 at 4:34 PM
The patch code looks good but I think that a doctoring change is is order. Consider something like:

Alex Miller May 13, 2019 at 7:27 PM
I think there are good reasons for transient collections not to be Seqable - seqs imply caching, caching hurts perf, and the whole reason to be using transients is for batch load perf. So that seems counter-productive. Iterators are stateful and again, I suspect that is probably a bad thing to add just for the sake of checking empty?.
An explicit check for emptiness of counted? colls would cover all the transient colls and anything else counted without making a seq. That might be faster for all those cases, and doesn't require anything new of anybody in the impl.
Another option would be to have an IEmptyable interface and/or protocol to indicate explicit empty? check support. Probably overkill.

Devin Walters May 9, 2019 at 11:47 PM
First things first, the original description brings up `(empty? (transient ()))`. Per the documentation at https://clojure.org/reference/transients, there is no benefit to be had for supporting transients on lists.
Current behavior for java collections:
The same behavior is true of java arrays.
Over in CLJS-2802, the current patch's approach is to `cond` around the problem in `empty?` by explicitly checking whether it's a TransientCollection, and if so, using `(zero? (count coll))` as the original description mentions as a workaround.
Currently, transient collections do not implement Iterable as the persistent ones do. If Iterable were implemented, I believe RT.seqFrom would work, and by extension, `empty?`.

Devin Walters May 9, 2019 at 8:52 PM
As mentioned in CLJ-700, this is a different issue.
Details
Assignee
UnassignedUnassignedReporter
importimportLabels
Approval
OkPatch
Code and TestPriority
MajorAffects versions
Fix versions
Details
Details
Assignee
Reporter

Labels
Approval
Patch
Priority

This was originally filed with respect to transient collections, which are countable, but not seqable, but I’ve widened it as it’s really an issue with any non-seqable collection.
There are good reasons for transient collections not to be Seqable - seqs imply caching, caching hurts perf, and the whole reason to be using transients is for batch load perf. Similarly, iterators are stateful and transients are mutable so that does not seem like a good path for transients. But it is quite reasonable to want to know whether a transient you are building has anything in it yet via
empty?
.Separately, needing to convert a Counted collection to a seq to check emptiness is usually slower than just checking the count.
Original example that drove this ticket (obviously the check would probably be used in the midst of an algorithm, not in this way, this is just a small example):
A workaround is to use
(zero? (count (transient ...)))
check instead.Cause:
empty?
is based on seqability, which transients don't implement.Proposed Add a branch to
empty?
for counted? colls. Transients implement Counted so gain support via this branch. Other colls that are counted are also faster. Seq branch continues to work for seqable non-Counted colls as before.Perf test:
Results:
coll
before
after
result
p
0.72 ms
0.08 ms
much faster when empty
p1
0.15 ms
0.13 ms
slightly faster when not empty
t
error
0.19 ms
no longer errors
t1
error
0.20 ms
no longer errors
Not sure if doc string should be tweaked to be more generic, particularly the "same as (not (seq coll))" which is now only true for Seqable but not Counted. I think the advise to use (seq coll) for seq checks is still good there.
I did a skim for other types that are Counted but not seqs/Seqable and didn't find much other than internal things like ChunkBuffer. Many are both and would thus use the counted path instead (all the persistent colls for example and any kind of IndexedSeq).
I guess another option would be just to fully switch empty? to be about (zero? (bounded-count 1 coll)) and lean on count's polymorphism completely.
Patch: clj-1872-2.patch
Screened by: fogus