tools.nrepl

Middleware linearization is nondeterministically wrong

Details

  • Type: Defect Defect
  • Status: Open Open
  • Priority: Critical Critical
  • Resolution: Unresolved
  • Affects Version/s: 0.2.3
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None
  • Environment:
    clojure 1.6.0

Description

I've had trouble getting my middleware ordered correctly, and have wittled it down to the following example which seems to only fail in some processes. I.e., it is deterministic within a particular jvm, but across jvms the result can change. My guess is that this has to do with hashing vars.

This code prints true (signifying a correct result) roughly 1/3 of the time and false otherwise.

(ns chas.core
  (:require [clojure.tools.nrepl.middleware :as mid]
            [clojure.tools.nrepl.middleware.pr-values]
            [clojure.tools.nrepl.middleware.session]))

(def wrap-foo identity)
(mid/set-descriptor! #'wrap-foo
                     {:expects #{#'clojure.tools.nrepl.middleware.session/session "eval"}})

(def wrap-bar identity)
(mid/set-descriptor! #'wrap-bar
                     {:requires #{#'clojure.tools.nrepl.middleware.session/session},
                      :expects #{"describe" #'clojure.tools.nrepl.middleware.pr-values/pr-values},
                      :handles {"stacktrace" {}}})

(defn -main
  []
  (println
   ;; should return true, since #'wrap-foo belongs later in the stack than
   ;; #'clojure.tools.nrepl.middleware.session/session due to its :expects
   ;; clause
   (->> [#'wrap-foo
         #'wrap-bar
         #'clojure.tools.nrepl.middleware.session/session]
        (mid/linearize-middleware-stack)
        (filter #{#'clojure.tools.nrepl.middleware.session/session #'wrap-foo})
        (= [#'clojure.tools.nrepl.middleware.session/session #'wrap-foo]))))

Activity

Gary Fredericks made changes -
Field Original Value New Value
Description I've had trouble getting my middleware ordered correctly, and have wittled it down to the following example which seems to only fail in some processes. I.e., it is deterministic within a particular jvm, but across jvms the result can change. My guess is that this has to do with hashing vars.

This code prints {{true}} (signifying a correct result) roughly 1/3 of the time and {{false}} otherwise.

{code}
(ns chas.core
  (:require [clojure.tools.nrepl.middleware :as mid]
            [clojure.tools.nrepl.middleware.pr-values]
            [clojure.tools.nrepl.middleware.session]))

(def wrap-foo identity)
(mid/set-descriptor! #'wrap-foo
                     {:expects #{#'clojure.tools.nrepl.middleware.session/session "eval"}})

(def wrap-bar identity)
(mid/set-descriptor! #'wrap-bar
                     {:requires #{#'clojure.tools.nrepl.middleware.session/session},
                      :expects #{"describe" #'clojure.tools.nrepl.middleware.pr-values/pr-values},
                      :handles {"stacktrace" {}}})

(defn -main
  []
  (println
   ;; should be true
   (= #'wrap-foo
      (last
       (mid/linearize-middleware-stack
        [#'wrap-foo
         #'wrap-bar
         #'clojure.tools.nrepl.middleware.session/session])))))
{code}
I've had trouble getting my middleware ordered correctly, and have wittled it down to the following example which seems to only fail in some processes. I.e., it is deterministic within a particular jvm, but across jvms the result can change. My guess is that this has to do with hashing vars.

This code prints {{true}} (signifying a correct result) roughly 1/3 of the time and {{false}} otherwise.

{code}
(ns chas.core
  (:require [clojure.tools.nrepl.middleware :as mid]
            [clojure.tools.nrepl.middleware.pr-values]
            [clojure.tools.nrepl.middleware.session]))

(def wrap-foo identity)
(mid/set-descriptor! #'wrap-foo
                     {:expects #{#'clojure.tools.nrepl.middleware.session/session "eval"}})

(def wrap-bar identity)
(mid/set-descriptor! #'wrap-bar
                     {:requires #{#'clojure.tools.nrepl.middleware.session/session},
                      :expects #{"describe" #'clojure.tools.nrepl.middleware.pr-values/pr-values},
                      :handles {"stacktrace" {}}})

(defn -main
  []
  (println
   ;; should return true, since #'wrap-foo belongs later in the stack than
   ;; #'clojure.tools.nrepl.middleware.session/session due to its :expects
   ;; clause
   (->> [#'wrap-foo
         #'wrap-bar
         #'clojure.tools.nrepl.middleware.session/session]
        (mid/linearize-middleware-stack)
        (filter #{#'clojure.tools.nrepl.middleware.session/session #'wrap-foo})
        (= [#'clojure.tools.nrepl.middleware.session/session #'wrap-foo]))))
{code}
Environment clojure 1.6.0
Affects Version/s 0.2.3 [ 10251 ]
Hide
Gary Fredericks added a comment -

Interesting followup observation I made with Alan Malloy in IRC: if we remove "eval" from the :expects clause, it seems to return consistently true. On the other hand if we instead add #'clojure.tools.nrepl.middleware.interruptible-eval/interruptible-eval to the middleware list, suddenly it returns consistently false.

Show
Gary Fredericks added a comment - Interesting followup observation I made with Alan Malloy in IRC: if we remove "eval" from the :expects clause, it seems to return consistently true. On the other hand if we instead add #'clojure.tools.nrepl.middleware.interruptible-eval/interruptible-eval to the middleware list, suddenly it returns consistently false.
Hide
Gary Trakhman added a comment - - edited

I should add my own experience report with this, hope to dig in to it someday soon: https://groups.google.com/d/msg/clojure/nUBBbYZHuTE/ScLBH-A2HkoJ

In my case, it was incorrectness, not indeterminism.

Show
Gary Trakhman added a comment - - edited I should add my own experience report with this, hope to dig in to it someday soon: https://groups.google.com/d/msg/clojure/nUBBbYZHuTE/ScLBH-A2HkoJ In my case, it was incorrectness, not indeterminism.
Chas Emerick made changes -
Fix Version/s 0.2.4 [ 10160 ]
Hide
Chas Emerick added a comment -

Half of the middleware linearization stuff is too clever, the other half is really dumb. :-/

I'd love a patch for this. Absent one, I'll have to dig into it, but I'm not going to hold up 0.2.4 any further for that opportunity to come around.

Show
Chas Emerick added a comment - Half of the middleware linearization stuff is too clever, the other half is really dumb. :-/ I'd love a patch for this. Absent one, I'll have to dig into it, but I'm not going to hold up 0.2.4 any further for that opportunity to come around.
Chas Emerick made changes -
Fix Version/s 0.2.4 [ 10160 ]
Chas Emerick made changes -
Priority Major [ 3 ] Critical [ 2 ]
Hide
Gary Fredericks added a comment -

Is this a fair summary of the requirements?

1) Should be deterministic
2) Should only return a result that respects the :expects and :requires clauses – if this is impossible it should throw an exception explaining why

Show
Gary Fredericks added a comment - Is this a fair summary of the requirements? 1) Should be deterministic 2) Should only return a result that respects the :expects and :requires clauses – if this is impossible it should throw an exception explaining why
Hide
Chas Emerick added a comment -

Yes, that's fair. I might add (3) try to not break anything significant that might rely upon undocumented behaviour, but I doubt that's helpful.

Show
Chas Emerick added a comment - Yes, that's fair. I might add (3) try to not break anything significant that might rely upon undocumented behaviour, but I doubt that's helpful.
Hide
Gary Fredericks added a comment -

I've written generative tests for this, but test.check requires clojure 1.4 (while nrepl tests run on 1.2).

If I wrap them in code that NOOPs for old clojures, I imagine they will they get run in CI? Is that a reasonable approach?

Show
Gary Fredericks added a comment - I've written generative tests for this, but test.check requires clojure 1.4 (while nrepl tests run on 1.2). If I wrap them in code that NOOPs for old clojures, I imagine they will they get run in CI? Is that a reasonable approach?
Hide
Colin Jones added a comment -

That approach sounds great to me.

Show
Colin Jones added a comment - That approach sounds great to me.
Hide
Gary Fredericks added a comment -

Given I've coded it that way, is there an easy way for me to run the tests without tweaking the pom.xml to specify 1.4?

Show
Gary Fredericks added a comment - Given I've coded it that way, is there an easy way for me to run the tests without tweaking the pom.xml to specify 1.4?
Hide
Gary Fredericks added a comment -

Okay I think I finally tracked this down.

Basically the conj-sorted function was not a valid way of doing a topological sort.

I replaced it with an actual topological sort, that still pushes independent middlewares toward the end.

The generative tests are now passing, as are the rest of the tests.

At some point I'll circle back to my original problem and see if that's solved.

I still don't fully understand all of the linearization code, so it's possible that the problem is more subtle than this.

The fix is on a github branch here. I might want to wait on another release of test.check before making a patch, since it should have some new features that make the generative tests better (in particular I'd be able to remove the dependency on my test.chuck utility library, which obviously shouldn't be in the tools.nrepl codebase).

Show
Gary Fredericks added a comment - Okay I think I finally tracked this down. Basically the conj-sorted function was not a valid way of doing a topological sort. I replaced it with an actual topological sort, that still pushes independent middlewares toward the end. The generative tests are now passing, as are the rest of the tests. At some point I'll circle back to my original problem and see if that's solved. I still don't fully understand all of the linearization code, so it's possible that the problem is more subtle than this. The fix is on a github branch here. I might want to wait on another release of test.check before making a patch, since it should have some new features that make the generative tests better (in particular I'd be able to remove the dependency on my test.chuck utility library, which obviously shouldn't be in the tools.nrepl codebase).
Hide
Gary Fredericks added a comment -

Attached a patch that contains the aforementioned topological sort, with a single regression test (and no generative tests).

Things to note:

  • I created a wrapper for clojure.core/ex-info so that in the case of a dependency cycle we can include with the exception message an example cycle, but don't have to assume clojure 1.4. If assuming clojure 1.4 is okay I can take that out and use clojure.core/ex-info directly.
  • I tried adding a test to verify that a dependency cycle creates an exception, but the test failed, and when I dug into it I discovered that the extend-deps function was doing strange things I did not expect, which means either it is buggy or I do not understand what it is supposed to do. See this commit for the failing test.
  • The topological sort now explicitly pushes middlewares with no dependencies to the end of the stack, instead of the more implicit fashion in the older code.
  • This patch does fix the problem I described in the description
Show
Gary Fredericks added a comment - Attached a patch that contains the aforementioned topological sort, with a single regression test (and no generative tests). Things to note:
  • I created a wrapper for clojure.core/ex-info so that in the case of a dependency cycle we can include with the exception message an example cycle, but don't have to assume clojure 1.4. If assuming clojure 1.4 is okay I can take that out and use clojure.core/ex-info directly.
  • I tried adding a test to verify that a dependency cycle creates an exception, but the test failed, and when I dug into it I discovered that the extend-deps function was doing strange things I did not expect, which means either it is buggy or I do not understand what it is supposed to do. See this commit for the failing test.
  • The topological sort now explicitly pushes middlewares with no dependencies to the end of the stack, instead of the more implicit fashion in the older code.
  • This patch does fix the problem I described in the description
Gary Fredericks made changes -
Attachment NREPL-53.patch [ 13389 ]

People

Vote (0)
Watch (1)

Dates

  • Created:
    Updated: