Problem

There are Clojure dialects (Clojure, ClojureScript, ClojureCLR) on several different platforms. We wish to write a single source file that runs on more than one of these platforms, retaining all of the common code and factoring out only the platform-specific bits.

Use cases for platform-specific functionality:

  • Platform-specific require/import (the most common case)
  • Exception handling (catch "all" is platform-specific - see http://dev.clojure.org/jira/browse/CLJ-1293 about this specific case)
  • Platform-specific calls for strings, dates, random numbers, URIs, reflection warnings (hosty things)
  • Protocol methods have a leading dash in cljs, so their names differ between platforms
  • Extending protocol to implementation-specific classes (clojure.lang.IPersistentVector)
  • Math - casting or other math stuff specific to ClojureScript

Proposal: Feature expressions

Common Lisp approaches this problem using feature expressions.

Each platform has a variable called *features* that is a set of keywords to indicate supported features. The initial set of well-known features will indicate the platform: :clj, :cljs, :clr. Users may introduce a broader set of features or the well-known set may be expanded at a later date. User-supplied features should be namespaced.

The Reader understands a new kind of "feature expression". The reader macros #+ and #- are used to include or skip a form based on a feature expression:

  • #+feature-expr form - include next form if feature-expr evaluates to true
  • #-feature-expr form - skip next form if feature-expr evaluates to true

Feature expressions evaluate as booleans and are defined as follows:

  • feature - a symbol, evaluated as a membership test: true if the corresponding keyword (e.g. :clj for clj) is in *features*

  • (not feature-expr) - returns logical not of feature-expr
  • (and feature-expr*) - returns true if all feature-expr are true, otherwise false.
  • (or feature-expr*) - returns true if any of its feature-expr are true, otherwise false.

Skipping in the reader is performed by binding *read-suppressed* to true for the next form.
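
The evaluation rules above can be sketched as a small function. This is only an illustration under stated assumptions, not the proposed reader implementation: feature-true? is a hypothetical name, the feature set is passed explicitly rather than read from *features*, and a feature symbol is assumed to match the keyword of the same name.

```clojure
;; Hypothetical helper (not part of any Clojure reader): evaluates a
;; feature expression against an explicit set of feature keywords.
(defn feature-true?
  [features expr]
  (cond
    ;; A bare symbol such as clj is true when :clj is in the set (assumption).
    (symbol? expr) (contains? features (keyword expr))

    ;; (not e), (and e*), (or e*) combine nested feature expressions.
    (seq? expr)
    (let [[op & args] expr]
      (case op
        not (not (feature-true? features (first args)))
        and (every? #(feature-true? features %) args)
        or  (boolean (some #(feature-true? features %) args))
        (throw (ex-info "Unknown feature operator" {:op op}))))

    :else (throw (ex-info "Invalid feature expression" {:expr expr}))))

;; #+ includes the next form when the expression is true; #- when it is false.
(defn include-form?
  [features marker expr]
  (case marker
    :plus  (feature-true? features expr)
    :minus (not (feature-true? features expr))))
```

For example, with the feature set #{:clj}, the expression (or cljs clr) is false, so a form guarded by #-(or cljs clr) would be kept and one guarded by #+(or cljs clr) would be skipped.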

Example:

(ns feature.expressions
  #+cljs (:require [goog.string :as gstring]))

(defn my-trim [s]
  #+clj (.. s toString trim)
  #+cljs (gstring/trim s))

(my-trim " Hello CL? ")

 

Reading Unreadable Things

There may be tagged literals or classes that are not known or available on all platforms. The reader must be able to read but avoid constructing these entities.

Example:

The #js tagged literal is not known on the Clojure platform, but the reader should read and skip it without failure.
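
As an illustration of reading an unknown tag without constructing anything, the sketch below uses *default-data-reader-fn* and tagged-literal from clojure.core. It assumes Clojure 1.7 or later, which postdates parts of this discussion; read-leniently is a hypothetical name, and this is not the mechanism the proposal itself specifies.

```clojure
;; Sketch, assuming Clojure 1.7+: unknown tags are captured as inert
;; TaggedLiteral values instead of causing a reader error.
(defn read-leniently
  [s]
  (binding [*default-data-reader-fn* tagged-literal]
    (read-string s)))

(def form (read-leniently "#js {:a 1}"))

(:tag form)   ; the symbol js
(:form form)  ; the map {:a 1}, read but not turned into a JavaScript object
```

A reader that skips a suppressed form could take the same approach: read the tag and its form as plain data, then discard the result.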

Potential issue: Clojure CLR needed to expand the valid set of symbols and uses the #| reader extension from Common Lisp to support this in the CLR. The corresponding symbols are not currently readable by Clojure or ClojureScript.  The #| extension support is described here: https://github.com/clojure/clojure-clr/wiki/Specifying-types. The #| reader extension may need to be supported in Clojure and ClojureScript to support the full set of valid ClojureCLR symbols in feature expressions.

Loading and File Extensions

When code needs to be loaded based on a namespace:

  • Clojure will continue to read only .clj files.
  • ClojureScript will first look for .cljs files, then for .clj files. The .clj file will be read by the reader as a ClojureScript file.
  • ClojureCLR - ???
Potential ClojureScript issue: existing projects may have macro .clj files in the same namespace.
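
The lookup order above can be sketched as data plus a small function. The names extension-preference and ns->candidate-paths are hypothetical, and ClojureCLR is left out because its behavior is still open in the list above.

```clojure
(require '[clojure.string :as str])

;; Assumed preference order per platform, taken from the list above.
;; ClojureCLR is omitted on purpose: its lookup order is undecided.
(def extension-preference
  {:clj  [".clj"]
   :cljs [".cljs" ".clj"]})

;; Hypothetical helper: the resource paths a platform would try, in order,
;; when asked to load a namespace.
(defn ns->candidate-paths
  [platform ns-sym]
  (let [base (-> (name ns-sym)
                 (str/replace "-" "_")
                 (str/replace "." "/"))]
    (mapv #(str base %) (get extension-preference platform))))

(ns->candidate-paths :clj 'my.thing)   ; ["my/thing.clj"]
(ns->candidate-paths :cljs 'my.thing)  ; ["my/thing.cljs" "my/thing.clj"]
```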
 

Open Extension

One consequence of feature expressions is that the provided library code must encode solutions for all of the supported platforms in the source code at packaging time (in other words, extension is closed to external users).
 
Sometimes it is preferable to isolate the platform specific code in one or more namespaces and allow others to provide implementations for each platform (if, for example, the library author lacked the expertise to do so). This strategy is orthogonal to feature expressions but made possible by the extension preference checking as specified in the previous section.
 

 

Library files:
 src/my/core.clj - ns requires my.thing
 src/my/thing.clj - Clojure implementation of my.thing (loaded by Clojure)
 src/my/thing.cljs - ClojureScript implementation of my.thing (loaded by ClojureScript)

 

Alternate Approaches

Copy / paste

One approach is to maintain two versions of the same file that are largely the same but modify the platform-specific parts in each copy. This obviously works but is gross.

cljx

cljx is an implementation of feature expressions that:

  • rewrites .cljx files containing feature expression-annotated code into external files based on well-known tags
    • The most common use is to use clj and cljs tags and write .clj and .cljs files for consumption by other tools/compilers/etc
  • optionally applies the same transformation interactively via installation of a REPL extension

It has been used successfully by a number of projects (see the cljx README for a partial list).  cljx's limitations include:

  • It does not address portability of macros at all; it is strictly a source-to-source transformation.  Macros continue to be written in Clojure, and must be rewritten or implemented conditionally on the contents of &env.
  • It does not provide any runtime customization of the "features" you can use; these are set either within build configuration (via cljx' Leiningen integration), or via the configuration of the cljx REPL extension.  The latter technically is available for modification, but is not in practical use.
  • The set of provided "features" is limited to one for Clojure (#+clj) and one for ClojureScript (#+cljs).  Further discrimination based on target runtime (e.g. rhino vs. node vs. v8 vs. CLR) would be trivial, but has not been implemented to date. 
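
The first limitation above mentions macros implemented conditionally on the contents of &env. A minimal sketch of that idiom follows. It assumes the ClojureScript compiler convention that &env carries an :ns key during ClojureScript macroexpansion, while in Clojure &env is nil or a map of locals keyed by symbols; cljs-env? and now-ms are hypothetical names.

```clojure
;; Assumption: a ClojureScript macroexpansion environment contains :ns.
;; In Clojure, &env is nil or a map keyed by local symbols, so :ns is absent.
(defn cljs-env?
  [env]
  (boolean (:ns env)))

;; A macro written once in Clojure that emits platform-specific code.
(defmacro now-ms
  []
  (if (cljs-env? &env)
    '(.getTime (js/Date.))          ; emitted when compiling ClojureScript
    '(System/currentTimeMillis)))   ; emitted when compiling Clojure

(now-ms) ; on the JVM this expands to (System/currentTimeMillis)
```

The macro itself stays a plain .clj file; only the emitted form differs per target, which is why cljx leaves macros out of its source-to-source transformation.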

cljx expressions are typically applied:

  • Inside ns macro
  • Top-level forms
  • Occasionally internal forms where it's concise
    • (.getTime #+clj (java.util.Date.) #+cljs (js/Date.))

lein-cljsbuild "crossovers"

lein-cljsbuild provides a (deprecated, to be removed) feature called "crossovers" that performs very limited preprocessing of certain files during the cljsbuild build process; a special comment string is removed, allowing one to work around the -macros declarations required in ClojureScript ns forms.  Crossover files must otherwise be fully portable.  Language/runtime-specific code must be maintained in separate files.  However, (my) experience shows that this can quickly lead to a situation where one has to think hard about which file a given function should go in, just to get it through the preprocessing machinery. Functions end up split into namespaces because of conditional compilation, not because they belong to the same part or module of the program.

Tagged Literal

Define a custom tagged literal that implements conditional read-time expressions:

#feature/condf [ (and jdk-1.6+ clj-1.5.*) 
 (call-my-fast-reducer-code) 
 else (some-old-fashioned-code) ]

Proof of concept here:  https://github.com/miner/wilkins

References

The Common Lisp Hyperspec about the Sharp Sign macros:

Examples of Common Lisp's Feature Expressions:

Maintaining Portable Lisp Programs:

Crossover files in lein-cljsbuild:

ClojureScript JIRA Tickets and patches with a proof of concept implementation of CL's feature expressions:

  1. Jul 22, 2012

    I am not wild about the conditional stuff happening at read time, it means the conditionals have no data representation and cannot be manipulated as such

    1. Jul 22, 2012

      Kevin,

      Can you elaborate with an example?

      1. Jul 27, 2012

        with this design if you read

        #+clojure (+ 1 2)
        #+clojurescript (+ 2 3)

        on clojure you get the list (+ 1 2) and on clojurescript the list (+ 2 3) and there is no indication that a compile time conditional was read. this means the conditionals will not be visible to anything that operates on datastructures rather than text.

        how will macros be shared? I guess they would check `*features*` directly to decide if they want to emit something different for clojure or clojurescript

        another thing is there is no indication that the above is basically a cond, besides the forms being written next to each other. compare to something like:

        (feature-cond
          clojure (+ 1 2)
          clojurescript (+ 2 3))

        it is possible that we have to have conditionals at read time because some forms are readable on one host but not another (vars? syntax quote in the reader does some class lookups I think; on the other hand, syntax quote should be rewritten as a macro)

        if issues with readable/unreadable forms are addressed between hosts, feature-cond above could be a macro that checks `*features*` so no change to the compiler or reader

         

        1. Mar 05, 2013

          At first, I wanted to agree with you that it would be nicer to just have a Macro rather than introducing a new feature into the reader. However, after more thought, I think that this absolutely needs to occur at read time. The main reason is that, if it's a macro, then it will be visible to outer macro forms when it shouldn't be. Consider some macro which expects a symbol, but instead gets a (feature-cond ...) form. It will (assert (symbol? ...)) and fail, even though the feature-cond form might expand to a symbol, the outer macro has no way of knowing.

          It seems like the expected thing is for this to happen almost equivalent to the C preprocessor, so the reader is the right place to do that.

      2. Jul 27, 2012

        The complaint I have with this feature is that now all the separate implementations of a given expression are housed in a single file. This closes off the system, disallowing future expansion (without getting a patch submitted upstream to the library provider). This gets to be a bigger issue when we start porting libraries to other platforms.

        What I'd like to see is some sort of "patch" based system. I'd like to see some way where we can put JVM specific code into a single source file, and then allow users of the library to replace those JVM versions with CLJS versions, CLR, Python, or C versions as needed.

        The common lisp method shown here is simple, but I have to say, it feels like a hack.

        1. Jul 27, 2012

          load-file is an expression, so any kind of feature expressions would allow for providing implementations in different files

          1. Mar 05, 2013

            That assumes you have a runtime compiler, which ClojureScript does not. You'd need to run load-file in the context of the compiler, but ClojureScript (currently) emits top-level forms as JavaScript, rather than running them in the compiler's environment.

  2. Jul 23, 2012

    I received an email from Kevin Lynagh. He could not write to the wiki, so I post it here:

    Hi Roman,

     
    I can't figure out how to leave a comment on the JIRA page, but I hope an email is okay.
     
    Your potential solution looks a lot like my cljx leiningen plugin: https://github.com/lynaghk/cljx
    You write platform-sharable code in .cljx files, optionally annotating toplevel forms with metadata to mark them as platform specific.
    Running the plugin copies code into .clj or .cljs files as appropriate, stripping out forms that are specific to other platforms.
    It also rewrites references to clojure/core functionality; e.g., clojure.lang.Atom becomes cljs.core.Atom.
     
    The plugin is based on kibit (which is based on core.logic), and it's simple to add new rules.
    I'm using it for my C2 library, but outside of that I have no idea if anyone else has used it or run into issues.
    1. Jan 30, 2014

      Just an update re: cljx, which I've been maintaining for some months now.  It was rewritten for the 0.3.0 release to perform all of its transformations statically, i.e. it does not use runtime metadata and does not use the Clojure reader (or tools.reader for that matter).  (The previous usage of runtime metadata and Clojure reader necessitated some compromises in the quality of the transformed output and in preventing the use of some language features tied to the compiler state/runtime, e.g. alias-namespaced symbols and keywords.)

      I've also added some REPL integration, so that namespaces defined in .cljx files are transformed and loaded/compiled as appropriate for the current REPL session (either Clojure or ClojureScript).

      These enhancements have made cljx a workable (and pleasant, IMO) solution to the problem of targeting multiple languages/runtimes from a single Clojure codebase.

      I'll update this wiki page to include a link to cljx under "current solutions".

  3. Jul 27, 2012

    • I like the keywords :clojure and :clojurescript, that is how the dialects are known.
    • I don't think it makes sense to add e.g. :rhino until/unless there is something that can be read only there.
    • AFAIAC namespace support means just "allow namespaced names", which the attached patch handles just fine.
    • I think compilation process and naming questions can be considered separately.
  4. Jul 28, 2012

    > Are those keywords ok? Is :jvm for Clojure and :js for ClojureScript better? 

    > Should ClojureScript add something like :rhino, :v8 or :browser as well?

    :jvm isn't a good idea for Clojure, since :rhino is also on the JVM. If you were writing some ClojureScript code, and wanted to test for a Java package, you'd want to check for :jvm. Similarly, you'd want a :gclosure member as well, if there was ever an implementation that targeted JS without it.

     

    I'd imagine the Clojure set could look like #{:clojure :java :jvm}

    And the ClojureScript one could be #{:clojurescript :js :gclosure} potentially conj :jvm for Rhino

    Because it will soon be possible to have #{:clojurescript :lua}

     

    This raises an interesting point: Why a set? Why not a map? Here's an example Clojure map:

    {:dialect :clojure, :clojure {:version "1.2.3"}, :target :java, :runtime :jvm, :jvm {}}

    And one for ClojureScript:

    {:dialect :clojurescript, :clojurescript {:version "2.3.4"}, :target :js, :js {}, :runtime :browser} ; maybe something for :gclosure version

    This way the #+clojurescript and #-js things will still work like set memberships. We could come up with some way to include an arbitrary predicate against map values, or we could leave that to be feature not available to the reader, only to macros against *features*.

     

    A few issues: Since this happens at read time, it may be too late to decide :v8 vs :rhino, etc. Although, some JavaScript shops/frameworks/teams/whatever do multi-compile sources for different browsers. ie. Do conditional logic by browser version, server side, rather than client side, serve up the browser-specific javascript. In that case, you even could support a :browser key with the various information there.

     

    I feel like this needs some kind of cond macro. I should be able to check for "do I have a runtime that optimizes foo? if so, do this, otherwise, do it the slow way like this" I realize that I could use #+foo, then #-foo, but what if I had two possible optimizations? I'd need #+foo(x) #+bar(y) #-foo(#-bar(y)) ;; seems like I need some kind of "else" expression.

     

    RE: read-time vs macro-expansion-time. Seems like the right thing for this is read-time because of symbol resolution, but we should also consider some compile-time macro support against the same dynamic var. In theory, it should be workable with all the standard macros and stuff, since it's just a set or map, but we should still explore it a bit, so we know we're not missing something.

  5. Aug 01, 2012

    I asked the Common Lispers for some input over here:

    https://groups.google.com/forum/?hl=en&fromgroups#!topic/comp.lang.lisp/x2k3XbW3LmA

    Elias Mårtenson responded to one of my questions on
    comp.lang.lisp. Here's what he wrote:

    I can address one of the questions, whether they are better
    implemented as a macro. The answer to this is clearly no. In
    fact, it couldn't be implemented as a macro, because at the time
    the macro is evaluated, the symbols making up the expression has
    already been interned by the reader. Thus, if the form references
    any packages that does not exist in your environment, you would
    get an error in the reader, before your conditional expression
    has had the chance to run.

    A somewhat related thread I found on this topic is here:

    https://groups.google.com/forum/?hl=en&fromgroups#!topic/comp.lang.lisp/CBB4hzqRCS8

    What are the next steps?

    1. Aug 01, 2012

      I would be careful conflating those issues with common lisp with clojure. clojure's reader is not the same as common lisp's and clojure's symbols are not the same as common lisp's. I think the only part of the clojure reader that has those kinds of issues is syntax quote, which tries to resolve classes.

      1. Aug 25, 2012

        Kevin is right. Try this out, it works:

         

        (def ^:dynamic *features* #{'clojure})

        (defmacro feature-cond [& exprs]
          `~(-> (filter (fn [[feature expr]]
                          (contains? *features* feature))
                        (partition 2 exprs))
                first
                second))

        (feature-cond
          clojure (def x 1)
          clojurescript (def y 2))

        (feature-cond
          clojurescript (* y 3)
          clojure (* x 3))

  6. Aug 02, 2012

    I would like to see the feature system being open to use by libraries. In pallet, we recently introduced features to be able to write providers that work with multiple versions of pallet, and I think it would make sense to replace this with whatever feature system ends up in clojure.

    The current proposal seems to support this, but it would be good to clarify this as a requirement of the feature system.

    1. Aug 05, 2012

      Again, if 'features (or whatever it is called) is a map, then you might be able to do something like (load-file (str "foo/" (name (:dialect features)) ".clj"))... That wouldn't be possible as a bit-flag set.

  7. Aug 23, 2012

    What problems, if any, could this change cause for existing code? Obviously pre-feature-expression code will be unable to read the forms. In particular:

    • 1.4 Clojure and ClojureScript fail to read feature expressions, thinking they are tagged literals with no handler
    • pre-1.4 Clojure fails with a ClassNotFoundException
    1. Aug 26, 2012

      feature-cond above would be readable in previous versions of clojure, and even other lisps for whatever that is worth.

  8. Mar 06, 2013

    What if we provided read-time variants of unquote and unquote-splicing?

    Here's a splicing example:

    The code inside the #~ and #~@ forms would be run at read-time and subject to *read-eval*. This gives you the full power of Clojure for conditional compilation at read time. We could provide some standard function and macro utilities for conditional compilation that are available both at read and macro expansion time. Allowing for something like:

    This proposal gives you the full power of Clojure for controlling conditional reading and compilation without adding dramatically new ideas (it's basically just syntax-quote at read-time).

    1. May 06, 2013

      I love the consistency and minimalism of this approach.

      One Issue however: It probably won't be possible to read-splice on the toplevel. Consider: What should (read-str "#~@(range 5)") yield?

      I'm still +1 on this one and just forbid toplevel splices.

    2. May 08, 2013

      have you looked at what it would take to allow the reader to splice in forms at any point? with syntax quote it is a little easier because the expression has to be syntax quoted to begin with before you can splice in to it.

      1. May 09, 2013

        I guess at the top level, a splice would just assume an implicit "do".

        1. May 09, 2013

          the top level is the easiest case though, the gnarly cases are splicing in to nested data structures, that may be spliced together themselves.

          1. May 09, 2013

            Oh, I thought you were asking with respect to Herwig's comment.

            Doesn't syntax quote already have this problem? Try `{1 ~@(list 2 3 4)} for example. You can do that, but you can't do `{1 2 ~@(list 3 4)}

            Seems like a separate issue all together.

            1. May 09, 2013

              syntax quote at least can limit the scope of splicing and unsplicing to syntax quoted forms. if any recursive call to read can (some how?) return multiple values I think the entire reader would be much more complex. I have no evidence of that, which is why I asked if you had looked in to what it would take.

              I have no doubt it is possible to do, I just don't think I'd care for the result

              1. Jun 22, 2013

                Syntax quote also has this same problem:

                It would be trivial to mimic that behavior with reader-level syntax quoting.

                Alternatively, you could just return the splice form unmodified, as if quoted. Consider the Sequence form in Mathematica, which is basically clojure.core/unquote-splicing:

                Since expansion is resolved from the inside out, returning a splice form would be a sensible thing to do in a nested quoting scenario.

  9. May 09, 2013

    pushing conditionals in to the reader does not save us from requiring the code to be readable on all platforms. it has to be readable for the reader to even recognize it as a form to throw away

  10. May 11, 2013

    > if any recursive call to read can (some how?) return multiple values I think the entire reader would be much more complex.

    I guess recursive invocations of read could be replaced with read-seq-of-forms + concat et al.

    That would buy us read-splice, but: Is it worth the performance cost? If yes, should read-seq-of-forms be made public?

    > I guess at the top level, a splice would just assume an implicit "do".

    I'm kind of meh on (= (read-str "#~@(range 5)") '(do 0 1 2 3 4))

    If I want a do, I'll write (a regular form generating) it. `~@ sets a nice precedent.

    Those complications make me wonder if #~@ serves any conceivable purpose that #~ together with `(~@) can't (or shouldn't) serve.

    > pushing conditionals in to the reader does not save us from requiring the code to be readable on all platforms. it has to be readable for the reader to even recognize it as a form to throw away

    True, this should be an intended design constraint. We want to have the same basic clojure syntax on every platform, i.e. a very small superset of edn, right?

    The reason we even need conditionals in the reader is that the reader already does some possibly platform-dependent side-effects, i.e. reader tags, defrecord instances and friends.

    So let's just KISS the syntactic considerations goodbye for a moment:

    That gets us to the real meat of the problem:

    Say we have:

    Under current reader rules, would the #platform tag have a chance to remove the form before the reader tried to apply the #asm tag?

    I suspect that's not the case, which makes me wonder if it's worth changing the reader to allow reader tags to omit reader tags from their child form. This would imply the ability to generate new reader tags (and other reader side effects) in code a reader tag emits.

    I also suspect that if we try to just allocate a couple of new reader table characters  and work downwards from there, we will end up with pretty much the same code it would take to apply reader tags in a second pass, except it would be less general purpose.

    We still might not want to expose it through reader tags, since it would alter their semantics on the input side. Opinions on this one?