1) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L101 - do we need the (o instanceof CacheKey) check? If the usage of this is constrained then we might be able to skip it (and it will blow up on the next line if something is wrong).
I'm usually wary of violating equality/hashCode contracts, and this doesn't even appear as a blip on the radar in profiling. I think we could drop it
2) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L110 - shouldn't we precompute and save the hash code!? The only thing we're making this for is fast hash comparisons. That hash computation is string length dependent - it ain't cheap.
It's less memoizable than you might think; each CacheKey is only indexed a few times, and only at query time; it also doesn't help us for equality checks, since those only occur after hashing. I can add a memoizing field for it at the cost of another 32 bits/kw; we'll see how it impacts performance.
3) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L126 - have you tested with other values here? Should have some justification for this.
I experimented with several values on the Clojure test suite, benchmarks, and some real-world hadoop code. Diminishing returns, as you'd expect. 0.1 and 0.5 are essentially identical in runtime tradeoff. We could drop to 0.01 if desired; it'll only make a difference in large (10-100K) extant keyword benchmarks, as far as I can tell.
4) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L126 - have you tested with other values here? Should have some justification that this is a reasonable number.
Same question as #3?
5) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L169 - there is a race here (actually more than one if you include getting the tableSize):
Th1: orphansCount = orphans.get()
Th2: orphansCount = orphans.get()
Th2: orphansNew = orphans.getAndSet(0)
Th2: orphansNew > orphansCount -> start cleaning
<huge gc, 10 zillion orphans are created>
Th1: orphansNew = orphans.getAndSet(0)
Th1: orphansNew > orphansCount -> start cleaning
but I guess this is "safe"; we just have multiple threads cleaning in that case.
Yep. This check is only there as an optimization--and note that if a huge GC occurs, it's likely we want to schedule a followup traversal of the table anyway, because the thread that's already cleaned some of the table has probably missed some subsequently GC'ed elements. The number of concurrently cleaning threads is bounded by the rate of GC churn, and in the most pathological case (sadly, I haven't been able to produce this race experimentally), this degenerates to the old Clojure behavior of every thread doing a full scan.
6) In general it seems pretty sloppy (I don't mean that pejoratively) how the orphans, rq, and cleaning thread are working together and may be out of sync. I get it and I even believe it will work (usually) but I worry about timing issues that I am too dumb to think of.
Is there a simpler strategy? What if every Nth call to intern() cleaned M entries (or up to M% of table)?
Every nth call is just fine, but it degrades more poorly for large tables. In general, I try to lean towards scale-invariant solutions, which is why I aimed to reclaim roughly a tenth of the entries in the map every time. Maybe more, maybe less, depending on CAS retries, delayed threads resetting the counter to zero, etc.
Doing M entries or M% is more tricky; gotta figure out which threads collect what fraction when, how you efficiently access that subsection of the hash, make sure elements don't fall through the cracks, etc.
7) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L177 - if you made the iterator explicit in this loop, it would be safe to call iterator.remove() instead of table.remove() and I believe that would be faster on CHM.
I agree. I figured Rich had a good reason for doing it this way, but if you concur I'll change it.
8) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L198 - could have two versions of this with/without the symbol. Not sure if that would be faster but they might both inline better into their callers then?
I agree. We can do that dispatch statically and cut down on branch misprediction, too.
9) https://gist.github.com/aphyr/f72e72992dade4578232#file-gistfile1-diff-L242 - what's the use case for finding an external CacheKey? Fast re-lookup for specialized use? Do we want to commit to this in the API?
Keep forgetting Java's obsession with encapsulation. I'll privatize.