
Rationale

It can be difficult to manage resource lifetimes, especially when laziness is involved. Simple local scoping can be inadequate. Resource scopes can decouple resource lifetimes from the creating scope.

Plan

  • Generalize from resource scopes to general scopes (succeed/fail/exit)
  • with-open et al make scopes
  • Some design work and code has been done

Use Cases

  • REPL usage: open stream and it closes before I lazily consume it. Doh!
  • Start an activity, and then have nested resource lifetime automatically tie to it
    • Task A calls lib B, which allocates resources
  • Prevent resource lifetimes from nesting
    • lib C knows it doesn't want its lifetime governed by Task A
  • All works across threads
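The REPL use case can be reproduced with a minimal sketch. It assumes an in-memory StringReader standing in for a real file; the names lazy-lines and consume-result are hypothetical. Note that line-seq reads the first line eagerly, so the failure only appears when the rest of the seq is realized after with-open has already closed the reader.

```clojure
(import '(java.io BufferedReader StringReader IOException))

(defn lazy-lines
  "Returns a lazy seq of lines, but with-open closes the reader
  before the caller gets a chance to consume the seq."
  [text]
  (with-open [rdr (BufferedReader. (StringReader. text))]
    (line-seq rdr)))

(defn consume-result
  "Realizes the whole seq outside with-open. Reading past the first
  line hits the closed reader, so this returns :stream-closed."
  []
  (try
    (doall (lazy-lines "a\nb\nc"))
    (catch IOException _
      :stream-closed)))
```

Calling (first (lazy-lines "a\nb\nc")) appears to work, which is part of what makes the problem easy to miss.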

Problems

  • Other than memory allocation, the proper freeing of resources (e.g. closing open files) is on the user
    • thus, easy to get wrong
  • 'He who acquires releases' is one strategy
    • can be automated, e.g. with-xxx
      • but with-xxx puts an envelope around the availability of the resource, tied to the acquirer
      • we trip over this when we create and return lazy seqs that reference resources
      • anywhere else?, could we limit the scope of the solution to lazy seqs?
        • even then, needs care
        • e.g. closing on seq exhaustion not triggered by partial consumption
  • piggybacking on GC gets around this
    • i.e. finalizers
    • but is bad, because we can't know when/if they will run
      • non-option?
  • reference counting works in some langs
    • but not on the JVM where aliasing is common and undetectable
    • is it possible to wrap resources with a counter/tracker?
      • open question
  • can tag things as closable and call close everywhere defensively
    • but everywhere is a big and dangerous place
  • Fundamentally - how do you know when something is (or contains) a 'resource'
    • and how do you know when no one cares about it anymore?
      • and what does it mean to care?
      • could just fold down to: care means would be unhappy if closed
    • and what needs to be done about it when that is the case?
      • who knows what that is?
  • Is the problem as simple as closing?
    • are there ever two paths?
      • e.g. happy path saves and unhappy path discards
  • are there any similarities to exception handling?
    • one part of code has problem, another deals with it

Issues

  • Consider threads as well as nesting
  • How do scopes interact with io! and stm
  • Can we hydrate the fork/join tree where we need it?
  • How to prevent mess?
    • (when-scope :fails ...) et al are very declarative
    • but are setting up non-local effects
    • potentially leading to wtf when triggered
    • is this the right external API?
      • any way to ensure correctness
      • will we be beset with requests for order control
  • Is the nearest enclosing scope always the right one?
    • what to do if not
      • named scopes
      • scopes as args

Proposal

  • dynamic binding *scope* holds collection of things needing cleanup
  • with-open binds a new *scope*, cleans up *scope* at end
  • lib APIs can call a new fn (scope thing) to add a resource to the current bound scope
    • throws exception if no scope available
  • REPL cleans up scope before every loop
  • *scope* is an atom, so N "child" threads can add to it
    • binding conveyance makes this work
    • still up to you to make sure "parent" thread outlives children
  • this patch demonstrates the idea
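A minimal sketch of the mechanism described in this list, not the referenced patch itself. It assumes a hypothetical with-scope macro standing in for the modified with-open, and a hypothetical cleanup! helper that handles Closeable and plain functions with an if, as the TBD notes suggest. Cleanup runs in reverse registration order here; that ordering is an assumption of the sketch.

```clojure
(import '(java.io Closeable))

;; nil means "no scope available"
(def ^:dynamic *scope* nil)

(defn scope
  "Adds thing to the currently bound scope and returns it.
  Throws if no scope is available."
  [thing]
  (when-not *scope*
    (throw (IllegalStateException. "No scope available")))
  (swap! *scope* conj thing)
  thing)

(defn cleanup!
  "Hypothetical helper: closes Closeables, calls everything else as a fn."
  [things]
  (doseq [t things]
    (if (instance? Closeable t)
      (.close ^Closeable t)
      (t))))

(defmacro with-scope
  "Hypothetical stand-in for with-open: binds a new *scope* (an atom
  holding a list) and cleans it up when the body exits."
  [& body]
  `(binding [*scope* (atom ())]
     (try
       ~@body
       (finally (cleanup! @*scope*)))))

(defn demo []
  (let [log (atom [])]
    (with-scope
      (scope #(swap! log conj :first))
      (scope #(swap! log conj :second)))
    @log))
```

Because the scope is a list, conj prepends, so (demo) returns [:second :first]. Calling (scope x) with no enclosing with-scope throws, which matches the "throws exception if no scope available" bullet.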

IMO this nails all the use cases and problems above, with a few
caveats that I think can be subsumed into a single example as follows:

Caller A creates a scope. Lib B combines several resources to do some
work for A, some of which will be lazy/incomplete when returning to
A. The challenge is that B wants to clean up some, but not all, of the
scoped things it used. Some possibilities:

  • You are on your own in this case. Don't use scopes.
  • Scopes are more queryable/manipulable to help lib B author, e.g. named scopes.
    • complicated!
  • Scope push: Some way to opt back out of a scope (push me to the enclosing scope).
    • complicated and difficult to reason about
  • Make resources aware of their own cleanup logic, so that B can say "I am done with C but still need D and E".
    • non-starter, pollutes every API in the universe with bookkeeping to propagate cleanup logic.
  • Leverage close-on-consume
    • This is an antipattern today, but a potentially useful optimization with scopes in play
  • Scope grab
    • Give Lib B a way to grab the scope associated with a single resource and clean it up:
    • Easier to compose than named scopes
      • C doesn't have to do anything special to be used in this way
      • B doesn't have to know anything about C's use of scopes
    • Easier to compose than scope push
      • simple inversion: clean up what you are done with, instead of enumerating the things still in play
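One way "scope grab" might look, as a sketch under stated assumptions. The scope here is a map from resource to cleanup fn rather than a plain collection, and the names grab! and with-scope are hypothetical, as is the two-argument form of scope. Keywords stand in for real resources.

```clojure
(def ^:dynamic *scope* nil)

(defn scope
  "Registers cleanup-fn for resource in the current scope; returns resource."
  [resource cleanup-fn]
  (swap! *scope* assoc resource cleanup-fn)
  resource)

(defn grab!
  "Hypothetical 'scope grab': cleans up one resource now and removes it
  from the scope so it is not cleaned up again later."
  [resource]
  (when-let [cleanup (get @*scope* resource)]
    (swap! *scope* dissoc resource)
    (cleanup)))

(defmacro with-scope
  "Hypothetical: binds a fresh scope and cleans up whatever is left in it."
  [& body]
  `(binding [*scope* (atom {})]
     (try
       ~@body
       (finally (doseq [[_# cleanup#] @*scope*] (cleanup#))))))

(defn demo []
  (let [events (atom [])]
    (with-scope                       ; caller A's scope
      (let [c (scope :c #(swap! events conj :closed-c))
            _ (scope :d #(swap! events conj :closed-d))]
        (grab! c)                     ; lib B is done with C
        (swap! events conj :b-returns)))
    @events))
```

(demo) returns [:closed-c :b-returns :closed-d]: B names only what it is done with, and everything else stays with A's scope.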

I believe that 95% of the current problems people experience are solved with the simple mechanism proposed here. My preference is to start with "you're on your own" and evolve to "scope grab" if necessary. The latter is an easy, non-breaking extension.

TBD to flesh out this approach

  • investigate interaction with fork/join
    • think this reduces to "make binding conveyance work with fork/join"
  • new protocol or IFn for bits of cleanup logic placed in a scope
    • handling Closeable and IFn in an if statement for now
    • code in core happens before protocols are available
  • verify that change to with-open is non-breaking for existing code
    • but note no benefit either, until you start calling (scope ...)
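The "new protocol" option could be sketched as follows. The protocol name Cleanup and its method cleanup are hypothetical, and since code in core runs before protocols are available, this is only what it might look like outside core. An object that is both Closeable and IFn would be ambiguous under this sketch.

```clojure
(defprotocol Cleanup
  "Hypothetical protocol for a bit of cleanup logic placed in a scope."
  (cleanup [this]))

(extend-protocol Cleanup
  java.io.Closeable
  (cleanup [this] (.close this))
  clojure.lang.IFn
  (cleanup [this] (this)))

(defn demo []
  (let [called (atom false)
        rdr    (java.io.StringReader. "x")]
    (cleanup rdr)                       ; dispatches to .close
    (cleanup #(reset! called true))     ; dispatches to invocation
    {:fn-called     @called
     :reader-closed (try (.read rdr) false
                         (catch java.io.IOException _ true))}))
```

(demo) returns {:fn-called true, :reader-closed true}.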

Scenarios

  1. Creating a resource whose lifetime is bound to yours: use with-open, as you do today
  2. Creating a resource that will be passed back to a caller unconsumed: call (scope res)
    1. caller must have made a scope
    2. mandatory scope is not a limitation here, it is a sensible requirement
  3. Returning >1 resources from a function with different lifecycles and cleanup rules: not supported
    1. not going to worry about this without realistic example
  4. Create resources from child threads
    1. works if thread creation conveys bindings
    2. you are responsible for waiting on child threads
  5. Consuming one resource, pass another through to caller: put with-open around the one you are consuming
  6. Decide on the fly whether to clean up or let parent do it
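Scenario 4 can be sketched with future, which conveys dynamic bindings to the child thread. The function run-children is hypothetical, and a bare binding with try/finally stands in for the modified with-open. The parent dereferences every future before leaving the scope, which is the "you are responsible for waiting" part.

```clojure
(def ^:dynamic *scope* nil)

(defn scope
  "Adds a cleanup fn to the currently bound scope and returns it."
  [thing]
  (when-not *scope*
    (throw (IllegalStateException. "No scope available")))
  (swap! *scope* conj thing)
  thing)

(defn run-children
  "Starts n child threads that each register a cleanup fn in the
  parent's scope, waits for them, then cleans up the scope."
  [n]
  (let [closed (atom #{})]
    (binding [*scope* (atom ())]
      (try
        (let [children (doall (for [i (range n)]
                                (future (scope #(swap! closed conj i)))))]
          ;; the parent must outlive its children
          (doseq [c children] @c))
        (finally
          (doseq [cleanup @*scope*] (cleanup)))))
    @closed))
```

(run-children 3) returns #{0 1 2}: all three children added to the same atom because the binding was conveyed to their threads.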
  1. Dec 28, 2010

    Thoughts:

    • Explicitly named scopes are less confusing than nearest-enclosing-scope
    • Are agent-send semantics (hold until transaction commits) sufficient for STM support?

    Sample implementation (updated): source and tests.

    1. Jan 28, 2011

      There's a discussion on clojure-dev about with-open related to dropping exceptions. I suggest each of the handlers in your implementation, and Rich's, should be called in a try/catch so exceptions aren't silently dropped. There needs to be a decision about what to do in the catch: propagate one of the two exceptions and handle the other another way or swallow it, or throw a new exception that holds both. Another key in your scope object could hold a function that throws, which the user could override. Scopes should be consistent with an improved with-open in this regard.
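One of the options above, sketched under assumptions: every handler runs inside its own try/catch, the first failure is rethrown, and later failures are attached to it with Throwable.addSuppressed (Java 7 and later). The name run-cleanups! is hypothetical, and this is only one of the policies the comment lists, not a decision.

```clojure
(defn run-cleanups!
  "Runs every cleanup fn even if some throw. Rethrows the first
  failure with any later failures attached as suppressed exceptions."
  [cleanups]
  (let [errors (reduce (fn [errs f]
                         (try
                           (f)
                           errs
                           (catch Throwable t
                             (conj errs t))))
                       []
                       cleanups)]
    (when-let [[primary & others] (seq errors)]
      (doseq [o others]
        (.addSuppressed ^Throwable primary o))
      (throw primary))))

(defn demo []
  (let [ran (atom [])]
    (try
      (run-cleanups! [#(swap! ran conj :a)
                      #(throw (ex-info "first" {}))
                      #(swap! ran conj :c)
                      #(throw (ex-info "second" {}))])
      (catch Exception e
        {:ran        @ran
         :message    (.getMessage e)
         :suppressed (mapv #(.getMessage ^Throwable %) (.getSuppressed e))}))))
```

(demo) returns {:ran [:a :c], :message "first", :suppressed ["second"]}: no handler is skipped and no exception is silently dropped.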

    2. Jan 28, 2011

      Concerning named scopes... I don't see how this would be used across libraries. Say I write a db library (I have) that opens lots of resources deeply nested (the fns return objects that hold resources). Does the user use declare-scope and the library takes scope as a param to its fns?

      With enclosing scopes (like Rich's), everyone would refer to the single *scope* var, and if the user wants multiple independent scopes, use binding (not tested):

      In an implementation I did, the enclosing scope style worked nicely, but I also had a macro scope1 that created a scope if there was not already one, otherwise, added nothing so the body could return more resources to the calling scope.
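A sketch of what a scope1-style macro might look like; this is not the commenter's implementation. It assumes *scope* is an atom holding a list of zero-argument cleanup fns, and open-thing is a hypothetical stand-in for a library function that registers a resource.

```clojure
(def ^:dynamic *scope* nil)

(defmacro scope1
  "Creates a scope only if none is bound. Otherwise runs the body
  as-is, so resources it registers belong to the calling scope."
  [& body]
  `(if *scope*
     (do ~@body)
     (binding [*scope* (atom ())]
       (try
         ~@body
         (finally (doseq [f# @*scope*] (f#)))))))

(defn open-thing
  "Hypothetical library fn: registers a cleanup for label and returns label."
  [log label]
  (swap! *scope* conj #(swap! log conj [:closed label]))
  label)

(defn demo []
  (let [log (atom [])]
    (scope1
      (scope1 (open-thing log :inner))   ; joins the outer scope
      (swap! log conj :inner-returned)
      (open-thing log :outer))
    @log))
```

(demo) returns [:inner-returned [:closed :outer] [:closed :inner]]: the inner scope1 does not close :inner on exit, because the resource was handed to the enclosing scope.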

    3. Oct 12, 2013

      (I'm here via https://groups.google.com/d/topic/clojure/qPUd3AEVxT8/discussion)

      In this code, knowing the name of a scope grants full privileges over the scope. Alternatively, unique scopes could be allocated like gensyms and their protocol could offer a function for creating a handle to the scope with reduced privileges. This would allow you to pass a scope pointer around while retaining authority higher up in the stack. Additionally, "stack" can be generalized to "process tree", since you're no longer tied to a thread local: you could create a scope and pass the privilege object around to transfer ownership. Although, I guess you can do that by manually fiddling with the vars anyway.

  2. Jan 14, 2011

    I think the goal is to hold precious resources long enough to do the required processing -- no more, no less -- then let them go. This problem is not restricted to particular known Java types or their uses (e.g., accessing a File object via a lazy sequence). It could be any Java type referred to from anywhere (e.g., a database connection held in a closure tied to an anonymous function). So you have to assume that anything could own a precious resource.

    1. Jan 14, 2011

      Yes, in all the above files are just an example resource, and 'closing' is a metaphor for releasing it

  3. Jan 14, 2011

    Is it necessary to address these two cases the same way:

    • he who acquires releases
    • one acquires, another releases

    That is, with-open (like using in C# and try in Java 7) is useful in the case that the bound resource will be consumed entirely. Returning it wrapped in a lazy seq or any other construct is a different case. Does it have to work with with-open? Certainly in C#, you use using when you'll consume and dispose the resource, and you don't use using in cases where you're storing the resource, handing it back to someone, etc.

    1. Jan 14, 2011

      No, but the problem we are contending with is when 'he who acquires releases' is insufficient. Also, it can be difficult to know sometimes when someone has captured your resource in a nested context and then leaked it out of your with/using. This is much more likely to happen when HOFs and laziness are involved.

      1. Jan 14, 2011

        Agreed. Certainly we pass references to precious resources managed by a with/using into functions all the time w/o really knowing what might happen to them.

  4. Jan 14, 2011

    I think there is a strong relationship to exception handling.

    EH allows a callee to trigger a caller's cleanup code, scopes allow a callee to be able to trigger its own cleanup code
    In either case, the caller defines when/where the cleanup happens.

    One thing EH has that I haven't seen mentioned beyond the comment about supporting nesting, is the need for some form of propagation. For instance A calls B, B calls C and D. B gets constructs back from both C and D. Both constructs encapsulate references to precious resources. B wants to clean up the resource from C when it's done. B wants to return the construct returned from D back to A. A will clean it up when it's done with it. So:

    • C and D need a way to indicate "someone else has to clean this up"
    • B needs a way to indicate "I clean up the resource from C, but not D"
    • A needs a way to indicate "I clean up the resource from D"

    In the EH case, A and B would indicate what cases they handle by catching a particular type and/or rethrowing. In the resource case, all you have is the construct the callee returned, which may (or may not) encapsulate a resource. If cleanup code were associated with the construct that encapsulates the resource, and there were nested scopes, functions could hand off responsibility for cleaning up a resource by "popping scope" for a given construct. That is, B would have a scope, but also a call to say "push responsibility for cleaning up the resource encapsulated by the construct returned from D up to the enclosing scope".

    Hopefully that makes sense - the main point is that some way to have a hierarchy for processing (like EH) is key.
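The A/B/C/D example can be sketched as nested scopes with a "push up" call. Everything here is hypothetical: each scope is a map holding its parent and an atom of cleanups keyed by resource, push-up! moves one entry to the enclosing scope, and keywords stand in for the constructs returned by C and D.

```clojure
(def ^:dynamic *scope* nil)

(defn new-scope [parent]
  {:parent parent :cleanups (atom {})})

(defn scope
  "Registers cleanup-fn for resource in the current scope; returns resource."
  [resource cleanup-fn]
  (swap! (:cleanups *scope*) assoc resource cleanup-fn)
  resource)

(defn push-up!
  "Hypothetical: moves responsibility for resource to the enclosing scope."
  [resource]
  (let [{:keys [parent cleanups]} *scope*
        cleanup-fn (get @cleanups resource)]
    (swap! cleanups dissoc resource)
    (swap! (:cleanups parent) assoc resource cleanup-fn)
    resource))

(defmacro with-scope
  "Hypothetical: binds a new scope nested inside the current one."
  [& body]
  `(binding [*scope* (new-scope *scope*)]
     (try
       ~@body
       (finally (doseq [[_# f#] @(:cleanups *scope*)] (f#))))))

(defn demo []
  (let [events (atom [])]
    (with-scope                                    ; A's scope
      (let [d (with-scope                          ; B's scope
                (scope :c #(swap! events conj :closed-c))
                (push-up! (scope :d #(swap! events conj :closed-d))))]
        (swap! events conj [:a-uses d])))
    @events))
```

(demo) returns [:closed-c [:a-uses :d] :closed-d]: B's scope cleans up the resource from C when B exits, while the resource from D survives until A's scope ends.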

    1. Oct 12, 2013

      > I think there is a strong relationship to exception handling.

      There's a sound theoretical basis for this relationship too!

      I've found myself linking folks to this paper about Eff more and more frequently.

  5. Jan 28, 2011

    The link under Plan to github is broken because of a trailing paren.

  6. Jul 04, 2011

    "Scope grab" is an interesting addition. Earlier on in these discussions, I was afraid that having only one scope active at a time would be too limiting. But now I'm not so sure. How often does one need to keep two or more resources open at the same time and close them at different times? I expect not often, so "you're on your own" is an acceptable answer.

    This still requires that users be cognizant of how they are consuming lazy sequences, and that libraries which create resources (e.g. DBs, files, sockets) be redesigned to support scopes.

    1. Jul 05, 2011

      Users will have to be cognizant in the same way they have to be cognizant of transactions when using the STM: if you do it wrong you will get an exception.

      Libraries won't be required to change, i.e. old code isn't broken. They will have to change to take advantage of the feature, but I think that is inevitable. A big goal for me is to keep the means of combination clean, e.g. the sequence library doesn't have to change, only some input sources to it.

      I hope to post some examples later this week. If you have specific examples you would like to see let me know.

  7. Jul 05, 2011

    It doesn't feel complete to me (but last time we talked about something more complete, a bunch of people in the room weren't even sure there was a problem, so what do I know... :-).

    It isn't clear in your proposal whether scopes nest or not. If not, what happens when a library attempts to make a scope when one already exists? Does it make a new one, reuse the old one, throw an exception, or do something else?

    This doesn't feel any simpler to reason about than scope push, but I agree it is essentially an inversion of that idea.

    In any case - this, scope push, or reifying a close protocol onto things - you have to chain cleanup calls if you encapsulate things AND want (or require in the close protocol case) a caller to be able to selectively control when some things get closed.

    I'd like to work through a complex example with you when you have time.

    1. Jul 12, 2011

      Scopes nest, e.g. with-open will make a new one. Most "selective close" scenarios are not supported. I am going to add some examples

  8. Nov 21, 2011

    At the Clojure/conj, Rich Hickey said that this was still an open problem. It turns out that it's difficult to know what the user code has done with a resource, so it's not clear that there's a general solution.