<< Back to previous view

[CTYP-99] Checking an ns becomes significantly slower as the number of optional keys in a HMap increases Created: 03/Dec/13  Updated: 14/Feb/14  Resolved: 14/Feb/14

Status: Resolved
Project: core.typed
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Blocker
Reporter: Gordon Syme Assignee: Ambrose Bonnaire-Sergeant
Resolution: Completed Votes: 1
Labels: None

Attachments: File core.clj     File core.clj     File project.clj     GZip Archive test-hmap.tar.gz    

 Description   

I'm using clojure.core.typed 0.2.19 with the slim classifier but I've observed the same without slim.

My suspicion is that the latent filters associated with functions grow in size exponentially with each extra optional key to a HMap (based on the output when you have a type error). I think it's generating all combinations of present and absent keys for the HMap when calculating latent filters for a function.

I've attached a tarball with a lein project with ten namespaces that all contain the same ten simple functions in the form

(t/ann f [(HMap :optional {:a String}) -> (t/Option String)])
(defn f [m]
  (:a m))

The type annotations vary in the number of optional keywords.

(test-hmap.core/go) checks all the namespaces. The time to check each namespace grows non-linearly. The first namespace gets penalised by core.typed initialisation the first time it's run.

E.g. on my local machine:

test-hmap.one
Start collecting test-hmap.one
Finished collecting test-hmap.one
Collected 1 namespaces in 20.743 msecs
Start checking test-hmap.one
Checked test-hmap.one in 46.231 msecs
Checked 1 namespaces (approx. 34 lines) in 67.672 msecs


test-hmap.two
Start collecting test-hmap.two
Finished collecting test-hmap.two
Collected 1 namespaces in 18.751 msecs
Start checking test-hmap.two
Checked test-hmap.two in 45.525 msecs
Checked 1 namespaces (approx. 35 lines) in 64.814 msecs


test-hmap.three
Start collecting test-hmap.three
Finished collecting test-hmap.three
Collected 1 namespaces in 19.106 msecs
Start checking test-hmap.three
Checked test-hmap.three in 57.346 msecs
Checked 1 namespaces (approx. 35 lines) in 77.055 msecs


test-hmap.four
Start collecting test-hmap.four
Finished collecting test-hmap.four
Collected 1 namespaces in 21.222 msecs
Start checking test-hmap.four
Checked test-hmap.four in 51.339 msecs
Checked 1 namespaces (approx. 35 lines) in 73.155 msecs


test-hmap.five
Start collecting test-hmap.five
Finished collecting test-hmap.five
Collected 1 namespaces in 22.215 msecs
Start checking test-hmap.five
Checked test-hmap.five in 63.415 msecs
Checked 1 namespaces (approx. 35 lines) in 86.309 msecs


test-hmap.six
Start collecting test-hmap.six
Finished collecting test-hmap.six
Collected 1 namespaces in 23.76 msecs
Start checking test-hmap.six
Checked test-hmap.six in 99.407 msecs
Checked 1 namespaces (approx. 35 lines) in 123.881 msecs


test-hmap.seven
Start collecting test-hmap.seven
Finished collecting test-hmap.seven
Collected 1 namespaces in 26.519 msecs
Start checking test-hmap.seven
Checked test-hmap.seven in 213.515 msecs
Checked 1 namespaces (approx. 35 lines) in 240.639 msecs


test-hmap.eight
Start collecting test-hmap.eight
Finished collecting test-hmap.eight
Collected 1 namespaces in 32.581 msecs
Start checking test-hmap.eight
Checked test-hmap.eight in 626.76 msecs
Checked 1 namespaces (approx. 35 lines) in 659.975 msecs


test-hmap.nine
Start collecting test-hmap.nine
Finished collecting test-hmap.nine
Collected 1 namespaces in 43.478 msecs
Start checking test-hmap.nine
Checked test-hmap.nine in 2634.468 msecs
Checked 1 namespaces (approx. 35 lines) in 2678.716 msecs


test-hmap.ten
Start collecting test-hmap.ten
Finished collecting test-hmap.ten
Collected 1 namespaces in 84.394 msecs
Start checking test-hmap.ten
Checked test-hmap.ten in 9277.54 msecs
Checked 1 namespaces (approx. 35 lines) in 9362.613 msecs


 Comments   
Comment by Ambrose Bonnaire-Sergeant [ 03/Dec/13 12:50 PM ]

Yes, the primitives are :mandatory, :absent-keys and :complete; :optional expands to be in terms of those.

Thanks for the report, I haven't done performance testing on this strategy. It will probably need to be reconsidered.

Comment by Cees van Kemenade [ 26/Jan/14 1:04 PM ]

code of project ctcrash

Comment by Cees van Kemenade [ 26/Jan/14 1:05 PM ]

+1 For this issue.

I've spend quite some time to learn core.typed on a real use case, and I have to say that I'm amazed by the power of the analysis and the potential to prevent errors that otherwise would be detected to late! Apart from that the type-annotations are valuable and precise information about the interface when doing maintenance at code you did not see for some time.
Two big reasons to use core.typed, however, as I turned more and more code into typed code the number of optional keys increased in my core data-record increase, and core.typed (check-ns) slowed down to the extreme (using version 0.2.25 (current stable)). I refactored the code to get a single line of code that triggers the issue on my core data structure a (Seqable DbMsg). Attached you find the project.clj and code (core.clj vs 8:09pm).

When I reduce the number of optional keys to 4-5 the (check-ns) runs in a few seconds. But having 9-11 optional keys kills the (check-ns) process.

I hope this one-liner helps making the issues easy to reproduce.
Currently I don't know a work-around for this issue,
so I can not use core.typed to check my project.

I guess that many people/projects should run into the same issues,
as the pattern is returning quite often in clojure:

1. start with data-set with small Hashmaps
2. the hashmaps evolve in a number steps (meaning more keys are added and some of the existing keys are removed)
I guess the workaround would be to use limited or no optional keys, and prepare a custom HMap definition for each stage of analysis. Although this would work, the amount of bookkeeping already significant for a all records are running through the same stages (linear process), and will be a showstopper if it is a non-linear process (not all records follow the same path (sequence of analysis stages)).
I will investigate whether splitting in multiple types (for different stages) will rescue my day.

Comment by Ambrose Bonnaire-Sergeant [ 26/Jan/14 7:51 PM ]

Upgraded to 'Blocker'.

Comment by Ambrose Bonnaire-Sergeant [ 27/Jan/14 6:34 AM ]

This is theoretically trivial to implement, and just involves shuffling around where optional keys are handled. However, this is spread out all over the code base, so I may not complete the patch for a week.

Great to hear core.typed is working out for you!

Comment by Cees van Kemenade [ 27/Jan/14 6:44 AM ]

Thanks for picking up this issue fast.
I'll wait patiently.

Comment by Ambrose Bonnaire-Sergeant [ 14/Feb/14 2:26 AM ]

This is fixed in 0.2.31.

I've added both of your tests in the test suite - thanks! https://github.com/typedclojure/core.typed-test-suite/tree/master/src/hmap

Please confirm and I'll close the issue.

Comment by Cees van Kemenade [ 14/Feb/14 4:11 AM ]

This resolve the issue and also resolves CT-102.
Thanks!

The code in hmap/big-options.clj (from your test-suite) still contains a workaround to make core.typed proceed without errors.
The (if (string? notifs) ... on line 107 is introduced to prevent a type-error (core.typed infers it that notifs might als be a (sequable String). However, as this is the else-branch of line 104 (if (or (seq? notifs) (vector notifs)) ... the case that notifs is a (Seqable string) can be excluded already).
Also using on line 104 the more general (if (sequential? notifs) ... does not resolve this issues.

Is this inference error already on the list, or would you recommend posting a seperate JIRA-case for this issue?

Comment by Gordon Syme [ 14/Feb/14 5:56 AM ]

Our core.typed checks run an order of magnitude quicker now (~400s --> ~80s). And they keep that speed with all the extra optionals now. That's really awesome, thanks Ambrose!

Generated at Sun Sep 14 23:09:40 CDT 2014 using JIRA 4.4#649-r158309.