Problem

There are Clojure dialects (Clojure, ClojureScript, ClojureCLR) hosted on several different platforms. We wish to write libraries that can share as much portable code as possible while leaving flexibility to provide platform-specific bits as needed, then have this code run on all of them.

Use cases for platform-specific functionality:

Proposal: Feature expressions

Common Lisp approaches this problem using feature expressions.

Each platform has a variable called *features* that is a set of keywords to indicate supported features. 

The Reader understands a new kind of "feature expression". The reader macros #+ and #- are used to include or skip a form based on a feature expression:

  • #+feature-expr form - include next form if feature-expr evaluates to true
  • #-feature-expr form - skip next form if feature-expr evaluates to true

Feature expressions evaluate as booleans and are defined as follows:

  • feature - a symbol, evaluated as if:

  • (not feature-expr) - returns logical not of feature-expr
  • (and feature-expr*) - returns true if all feature-expr are true, otherwise false.
  • (or feature-expr*) - returns true if any of its feature-expr are true, otherwise false.
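
The rules above can be sketched as a small evaluator. This is only an illustration, not the proposed reader implementation: the function name `feature-true?` is hypothetical, and it assumes that a bare feature symbol is true when the keyword with the same namespace and name is a member of the feature set.

```clojure
;; Hypothetical sketch: `feature-true?` is not part of any Clojure release.
(defn feature-true?
  "Evaluates the feature expression `expr` against `features`,
  a set of keywords such as #{:clj :my.app/prod}."
  [features expr]
  (cond
    ;; Assumption: a bare symbol is true when the matching keyword is in the set.
    (symbol? expr)
    (contains? features (keyword (namespace expr) (name expr)))

    ;; Compound expressions: (not e), (and e*), (or e*).
    (seq? expr)
    (let [[op & args] expr]
      (case op
        not (not (feature-true? features (first args)))
        and (every? #(feature-true? features %) args)
        or  (boolean (some #(feature-true? features %) args))
        (throw (ex-info "Unknown feature operator" {:op op}))))

    :else
    (throw (ex-info "Invalid feature expression" {:expr expr}))))

(feature-true? #{:clj} 'clj)                   ; true
(feature-true? #{:clj} '(and clj (not cljs)))  ; true
(feature-true? #{:cljs} '(or clj clr))         ; false
```

Under this sketch, #+ would include the next form when the expression is true and #- would include it when the expression is false.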

Skipping in the reader is performed by binding *read-suppressed* to true for the next form.

Example:

(ns feature.expressions
  #+cljs (:require [goog.string :as gstring]))

(defn my-trim [s]
  #+clj (.. s toString trim)
  #+cljs (gstring/trim s))

(my-trim " Hello CL? ")

Features

The platform feature will be one of: clj, cljs, or clr.  Users may supply their own features, which should, by convention, use namespaces. Platform-specified features will not be namespaced.

In Clojure, the initial feature set may be specified at start time with the system property clojure.features, which is a comma-delimited list of symbols to add as features. The platform feature will always be added, regardless of whether it is set in the system property. 

For example:

will yield the runtime feature set: #{:clj :arch/osx :my.app/prod :my.app/strictmath}
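
As a rough sketch of how such a property value could be turned into the feature set, consider the helper below. The name `initial-features` is hypothetical, and the property value used in the usage line is inferred from the resulting set shown above rather than taken from the proposal.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper; not part of Clojure.
(defn initial-features
  "Builds a feature set from a comma-delimited string of symbols.
  The platform feature is always included."
  [platform property-value]
  (into #{platform}
        (comp (map str/trim)
              (remove str/blank?)
              (map keyword))
        (str/split (or property-value "") #",")))

;; Assumed property value, inferred from the feature set above.
(initial-features :clj "arch/osx,my.app/prod,my.app/strictmath")
```

On the JVM the second argument could come from `(System/getProperty "clojure.features")`, which returns nil when the property is unset; in that case the sketch yields just the platform feature.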

In ClojureScript, there is a new build option with key :features that takes a set of keywords defining the features. The platform feature :cljs is always added to this set, as in Clojure.

In addition to setting the features initially, users may bind *features* around explicit calls to the reader.

Reading Unreadable Things

There may be tagged literals or classes that are not known or available on all platforms. The reader must be able to read but avoid constructing these entities.

Example:

The #js tagged literal is not known on the Clojure platform, but the reader should read and skip it without failure.

CLR extended symbol problem: ClojureCLR uses an expanded set of valid symbols in the reader. The #| reader extension from Common Lisp was implemented in ClojureCLR to delimit symbols containing otherwise invalid characters. The corresponding symbols are not currently readable by Clojure or ClojureScript. The #| extension support is described here: https://github.com/clojure/clojure-clr/wiki/Specifying-types. The #| reader extension would need to be supported in the Clojure and ClojureScript readers to support the full set of valid ClojureCLR symbols in feature expressions.

Loading and File Extensions

When code needs to be loaded based on a namespace:

  • Clojure will continue to read only .clj files. The file will be read by the Clojure reader and feature expressions will be applied.
  • ClojureScript will first look for .cljs files, then for .clj files. The .clj file will be read by the ClojureScript reader and feature expressions will be applied.
  • ClojureCLR - ???

CLJS source extension problem: ClojureScript currently expects source files to end (only) in .cljs. There are a number of places in the code where this assumption is made:
  • clj.cljs.compiler/rename-to-js - renames .cljs file name to .js file name. 
    • regex is easily fixed for this to cover both .clj and .cljs
  • clj.cljs.compiler/cljs-files-in - given a directory, finds all .cljs files to compile
    • used by clj.cljs.repl/analyze-source, which is used by clj.cljs.repl.browser/-setup and clj.cljs.repl/repl
    • can grab .clj too, but may include macro files not previously included - how do we distinguish here?? see next question.
  • clj.cljs.analyzer/ns->relpath - given a namespace, find the resource path
    • can check first for .cljs path, then .clj path (adds resource check where prior was simple string manipulation)
  • clj.cljs.closure/cljs-source-for-namespace - given a ns, find path and url of ns
    • can check first for .cljs resource, then .clj resource
  • clj.cljs.repl.rhino/goog-require - given ns, makes cljs resource path
    • can check first for .cljs resource, then .clj resource
  • clj.cljs.repl.browser/send-static - serves static resources
    • add support for serving .clj as well as .cljs
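
The "check first for .cljs, then .clj" lookup that recurs in the list above can be sketched as one shared helper. The names `extension-preference`, `candidate-paths` and `find-source` are hypothetical and do not exist in the ClojureScript code base; ClojureCLR is left out because its lookup order is still open.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Assumed preference order per platform, taken from the loading rules above.
(def extension-preference
  {:clj  [".clj"]
   :cljs [".cljs" ".clj"]})

(defn candidate-paths
  "Returns the resource paths to try for `ns-sym`, most preferred first."
  [platform ns-sym]
  (let [base (-> (str ns-sym)
                 (str/replace "-" "_")
                 (str/replace "." "/"))]
    (mapv #(str base %) (extension-preference platform))))

(defn find-source
  "Returns the URL of the first candidate found on the classpath, or nil."
  [platform ns-sym]
  (some io/resource (candidate-paths platform ns-sym)))

(candidate-paths :cljs 'my.thing)   ; ["my/thing.cljs" "my/thing.clj"]
(candidate-paths :clj 'my-lib.core) ; ["my_lib/core.clj"]
```

Keeping the order in one table means each call site listed above would only differ in what it does with the resolved resource.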
 
CLJS accidental macro compilation problem: when we collect all the ClojureScript code to compile, we will (accidentally) include .clj macro files that should only be read for macro expansion.

We use a single classpath (often merged from several directories within a library) to load several kinds of ClojureScript resources:
  • ClojureScript-only files: .cljs - compiled as CLJS
  • Mixed source files: .clj (presumably with feature expressions) - compiled as CLJS (NEW)
  • ClojureScript macro files: .clj (loaded as needed for macro support in CLJS)

Because mixed and macro files have the same extension, we need a way to indicate that macro files should not be compiled as ClojureScript. Possible solutions:
  1. Modify the way ClojureScript loads code to specify separate paths for ClojureScript code and macro code  NO
    1. Likely breaks many existing projects which mingle these into the same source path.
    2. Also breaks library publishing as a single jar.
  2. Add build option to specify names to skip during compilation (the macro namespaces).   NO
    1. require-macros would ignore this directive so require-macros would still load the namespaces
    2. Existing CLJS projects would still work if the code in the macro files happened to be compile-able by ClojureScript
    3. This capability would also make it easier to mingle Clojure and ClojureScript projects in the same project in a single source tree (with capability to ignore certain files for CLJS for reasons other than avoiding compilation of macro files)
    4. Main problem is that if published as a library, there is no way to get this ns exclusion list to users of the published library. Would have to also invent some way to specify that list of namespaces in jar metadata etc.
  3. Add marker in macro namespaces to identify them as being skipped for compilation  INVESTIGATING
    1. Add namespace meta or other indicator that this .clj file is only included for macros not for compilation - this is significantly more 
    2. Could automatically be picked up by downstream users of code published as a library
  4. Use a new mixed mode file extension (other than .clj or .cljs) to make finding files to be compiled as ClojureScript unambiguous.
    1. Fits well with existing strategies for finding source in ClojureScript
    2. Requires change in Clojure and ClojureCLR to look for and read additional extension
    3. May cause issues in wider range of tooling
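
Option 3 could look roughly like the sketch below, which reads only the leading ns form of a file and checks for marker metadata on the namespace name. The metadata key `:cljs/macros-only` and the function name `macros-only-ns?` are assumptions made for illustration; the proposal does not fix either.

```clojure
;; Hypothetical marker key; the proposal does not name one.
(def macros-only-key :cljs/macros-only)

(defn macros-only-ns?
  "True when the first form in `source` is an ns form whose name
  carries the marker metadata."
  [source]
  (let [form (binding [*read-eval* false]
               (read-string source))]
    (and (seq? form)
         (= 'ns (first form))
         (boolean (get (meta (second form)) macros-only-key)))))

(macros-only-ns? "(ns ^:cljs/macros-only my.macros)")   ; true
(macros-only-ns? "(ns my.core (:require [my.thing]))")  ; false
```

Because the marker travels inside the source file, it would reach downstream users of a published jar without extra build configuration.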

Open Extension

One consequence of feature expressions is that a library must encode solutions for all of the supported platforms in the source code at packaging time (in other words, extension is closed to external users).
 
An alternate solution to specifying all variants in a single file with feature expressions is to swap whole namespaces in and out for different platforms. This allows others to provide implementations for different platforms (if, for example, the library author lacked the expertise to do so). This strategy is orthogonal to feature expressions but made possible by the extension preference checking as specified in the previous section.

Example:

Library files:
 src/my/core.clj - ns requires my.thing
 src/my/thing.clj - Clojure implementation of my.thing (loaded by Clojure)
 src/my/thing.cljs - ClojureScript implementation of my.thing (loaded by ClojureScript)

ClojureScript Tooling Support

ClojureScript tooling needs to be aware of the change in supported file extension names and possibly new build options. Need to ensure tools are able to support the new mixed-language projects well:

  • lein-cljsbuild - most ClojureScript projects today are built with the lein-cljsbuild plugin. 
    • compiler.clj - needs to properly account for ClojureScript .clj files
    • features - might need to add support for specifying non-default feature set when testing
  • austin?
  • what else?

 

FAQ

  1. What about non-boolean expressions for things like Clojure version, JDK version, etc?  Out of scope. The "compile-if" trick covers many of those (relatively rare) cases already.
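
For readers unfamiliar with the "compile-if" trick, a minimal sketch follows. A macro along these lines is used privately in clojure.core.reducers; the version below is written for illustration and is JVM-specific because it catches Throwable and probes for a class by name.

```clojure
(defmacro compile-if
  "Evaluates `exp` at macroexpansion time. Expands to `then` when the
  result is truthy, and to `else` when it is falsey or throws."
  [exp then else]
  (if (try (eval exp)
           (catch Throwable _ false))
    `(do ~then)
    `(do ~else)))

;; Pick a branch depending on whether a class is present at compile time.
(def has-fork-join?
  (compile-if (Class/forName "java.util.concurrent.ForkJoinPool")
    true
    false))
```

Only the chosen branch is compiled, so the other branch may refer to classes that do not exist on the running JDK.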

Patches

JIRA Tickets and patches:

Alternate Approaches

Copy / paste

One approach is to maintain two copies of the same file that are largely identical and differ only in the platform-specific parts. This works, but every shared change must be made twice and the copies tend to drift apart.

cljx

cljx is an implementation of feature expressions that:

  • rewrites .cljx files containing feature expression-annotated code into external files based on well-known tags
    • The most common use is to use clj and cljs tags and write .clj and .cljs files for consumption by other tools/compilers/etc
  • optionally applies the same transformation interactively via installation of a REPL extension

It has been used successfully by a number of projects (see the cljx README for a partial list).  cljx's limitations include:

  • It does not address portability of macros at all; it is strictly a source-to-source transformation.  Macros continue to be written in Clojure, and must be rewritten or implemented conditionally on the contents of &env.
  • It does not provide any runtime customization of the "features" you can use; these are set either within build configuration (via cljx's Leiningen integration), or via the configuration of the cljx REPL extension. The latter is technically available for modification, but this is not done in practice.
  • The set of provided "features" is limited to one for Clojure (#+clj) and one for ClojureScript (#+cljs).  Further discrimination based on target runtime (e.g. rhino vs. node vs. v8 vs. CLR) would be trivial, but has not been implemented to date. 

cljx expressions are typically applied:

  • Inside ns macro
  • Top-level forms
  • Occasionally internal forms where it's concise
    • (.getTime #+clj (java.util.Date.) #+cljs (js/Date.))

lein-cljsbuild "crossovers"

lein-cljsbuild provides a (deprecated, to be removed) feature called "crossovers" that provides a very limited preprocessing of certain files during the cljsbuild build process; a special comment string is removed, allowing one to work around the -macros declarations required in ClojureScript ns forms.  Crossover files must otherwise be fully portable.  Language/runtime-specific code must be maintained in separate files.  However, (my) experience shows that this quickly leads to a situation where one has to think hard about which file a specific function belongs in so that it goes through the preprocessing machinery. Functions end up split into namespaces because of conditional compilation, not because they belong to the same part or module of the program.

Tagged Literal

Define a custom tagged literal that implements conditional read-time expressions:

#feature/condf [ (and jdk-1.6+ clj-1.5.*) 
 (call-my-fast-reducer-code) 
 else (some-old-fashioned-code) ]

Proof of concept here:  https://github.com/miner/wilkins

References

The Common Lisp Hyperspec about the Sharp Sign macros:

Examples of Common Lisp's Feature Expressions:

Maintaining Portable Lisp Programs:

Crossover files in lein-cljsbuild:

  1. Jul 22, 2012

    I am not wild about the conditional stuff happening at read time, it means the conditionals have no data representation and cannot be manipulated as such

    1. Jul 22, 2012

      Kevin,

      Can you elaborate with an example?

      1. Jul 27, 2012

        with this design if you read

        #+clojure (+ 1 2)
        #+clojurescript (+ 2 3)

        on clojure you get the list (+ 1 2) and on clojurescript the list (+ 2 3) and there is no indication that a compile time conditional was read. this means the conditionals will not be visible to anything that operates on datastructures rather than text.

        how will macros be shared? I guess they would check `*features*` directly to decide if they want to emit something different for clojure or clojurescript

        another thing is there is no indication that the above is basically a cond, besides the forms being written next to each other. compare to something like:

        (feature-cond
          clojure (+ 1 2)
          clojurescript (+ 2 3))

        it is possible that we have to have conditionals at read time because some forms are readable on one host but not another (vars? syntax quote in the reader does some class lookups I think, on the other hand, syntax quote should be rewritten as a macro)

        if issues with readable/unreadable forms are addressed between hosts, feature-cond above could be a macro that checks `*features*` so no change to the compiler or reader

         

        1. Mar 05, 2013

          At first, I wanted to agree with you that it would be nicer to just have a Macro rather than introducing a new feature into the reader. However, after more thought, I think that this absolutely needs to occur at read time. The main reason is that, if it's a macro, then it will be visible to outer macro forms when it shouldn't be. Consider some macro which expects a symbol, but instead gets a (feature-cond ...) form. It will (assert (symbol? ...)) and fail, even though the feature-cond form might expand to a symbol, the outer macro has no way of knowing.

          It seems like the expected thing is for this to happen almost equivalent to the C preprocessor, so the reader is the right place to do that.

      2. Jul 27, 2012

        The complaint I have with this feature is that now all the separate implementations of a given expression are housed in a single file. This closes off the system, disallowing future expansion (without getting a patch submitted upstream to the library provider). This gets to be a bigger issue when we start porting libraries to other platforms.

        What I'd like to see is some sort of "patch" based system. I'd like to see some way where we can put JVM specific code into a single source file, and then allow users of the library to replace those JVM versions with CLJS versions, CLR, Python, or C versions as needed.

        The common lisp method shown here is simple, but I have to say, it feels like a hack.

        1. Jul 27, 2012

          load-file is an expression, so any kind of feature expressions would allow for providing implementations in different files

          1. Mar 05, 2013

            That assumes you have a runtime compiler, which ClojureScript does not. You'd need to run load-file in the context of the compiler, but ClojureScript (currently) emits top-level forms as JavaScript, rather than running them in the compiler's environment.

  2. Jul 23, 2012

    I received an email from Kevin Lynagh. He could not write to the wiki, so I post it here:

    Hi Roman,

     
    I can't figure out how to leave a comment on the JIRA page, but I hope an email is okay.
     
    Your potential solution looks a lot like my cljx leiningen plugin: https://github.com/lynaghk/cljx
    You write platform-sharable code in .cljx files, optionally annotating toplevel forms with metadata to mark them as platform specific.
    Running the plugin copies code into .clj or .cljs files as appropriate, stripping out forms that are specific to other platforms.
    It also rewrites references to clojure/core functionality; e.g., clojure.lang.Atom becomes cljs.core.Atom.
     
    The plugin is based on kibit (which is based on core.logic), and it's simple to add new rules.
    I'm using it for my C2 library, but outside of that I have no idea if anyone else has used it or run into issues.
    1. Jan 30, 2014

      Just an update re: cljx, which I've been maintaining for some months now.  It was rewritten for the 0.3.0 release to perform all of its transformations statically, i.e. it does not use runtime metadata and does not use the Clojure reader (or tools.reader for that matter).  (The previous usage of runtime metadata and Clojure reader necessitated some compromises in the quality of the transformed output and in preventing the use of some language features tied to the compiler state/runtime, e.g. alias-namespaced symbols and keywords.)

      I've also added some REPL integration, so that namespaces defined in .cljx files are transformed and loaded/compiled as appropriate for the current REPL session (either Clojure or ClojureScript).

      These enhancements have made cljx a workable (and pleasant, IMO) solution to the problem of targeting multiple languages/runtimes from a single Clojure codebase.

      I'll update this wiki page to include a link to cljx under "current solutions".

  3. Jul 27, 2012

    • I like the keywords :clojure and :clojurescript, that is how the dialects are known.
    • I don't think it makes sense to add e.g. :rhino until/unless there is something that can be read only there.
    • AFAIAC namespace support means just "allow namespaced names", which the attached patch handles just fine.
    • I think compilation process and naming questions can be considered separately.
  4. Jul 28, 2012

    > Are those keywords ok? Is :jvm for Clojure and :js for ClojureScript better? 

    > Should ClojureScript add something like :rhino, :v8 or :browser as well?

    :jvm isn't a good idea for Clojure, since :rhino is also on the JVM. If you were writing some ClojureScript code, and wanted to test for a Java package, you'd want to check for :jvm. Similarly, you'd want a :gclosure member as well, if there was ever an implementation that targeted JS without it.

     

    I'd imagine the Clojure set could look like #{:clojure :java :jvm}

    And the ClojureScript one could be #{:clojurescript :js :gclosure} potentially conj :jvm for Rhino

    Because it will soon be possible to have #{:clojurescript :lua}

     

    This raises an interesting point: Why a set? Why not a map? Here's an example Clojure map:

    {:dialect :clojure, :clojure {:version "1.2.3"}, :target :java, :runtime :jvm, :jvm {}}

    And one for ClojureScript:

    {:dialect :clojurescript, :clojurescript {:version "2.3.4"}, :target :js, :js {}, :runtime :browser} ; maybe something for :gclosure version

    This way the #+clojurescript and #-js things will still work like set memberships. We could come up with some way to include an arbitrary predicate against map values, or we could leave that as a feature not available to the reader, only to macros against *features*.

     

    A few issues: Since this happens at read time, it may be too late to decide :v8 vs :rhino, etc. Although, some JavaScript shops/frameworks/teams/whatever do multi-compile sources for different browsers. ie. Do conditional logic by browser version, server side, rather than client side, serve up the browser-specific javascript. In that case, you even could support a :browser key with the various information there.

     

    I feel like this needs some kind of cond macro. I should be able to check for "do I have a runtime that optimizes foo? if so, do this, otherwise, do it the slow way like this" I realize that I could use #+foo, then #-foo, but what if I had two possible optimizations? I'd need #+foo(x) #+bar(y) #-foo(#-bar(y)) ;; seems like I need some kind of "else" expression.

     

    RE: read-time vs macro-expansion-time. Seems like the right thing for this is read-time because of symbol resolution, but we should also consider some compile-time macro support against the same dynamic var. In theory, it should be workable with all the standard macros and stuff, since it's just a set or map, but we should still explore it a bit, so we know we're not missing something.

  5. Aug 01, 2012

    I asked the Common Lispers for some input over here:

    https://groups.google.com/forum/?hl=en&fromgroups#!topic/comp.lang.lisp/x2k3XbW3LmA

    Elias Mårtenson responded to one of my questions on
    comp.alt.lisp. Here's what he wrote:

    I can address one of the questions, whether they are better
    implemented as a macro. The answer to this is clearly no. In
    fact, it couldn't be implemented as a macro, because at the time
    the macro is evaluated, the symbols making up the expression has
    already been interned by the reader. Thus, if the form references
    any packages that does not exist in your environment, you would
    get an error in the reader, before your conditional expression
    has had the chance to run.

    A somehow related thread I found on this topic is here:

    https://groups.google.com/forum/?hl=en&fromgroups#!topic/comp.lang.lisp/CBB4hzqRCS8

    What are the next steps?

    1. Aug 01, 2012

      I would be careful conflating those issues with common lisp with clojure. clojure's reader is not the same as common lisp's and clojure's symbols are not the same as common lisp's. I think the only part of the clojure reader that has those kinds of issues is syntax quote, which tries to resolve classes.

      1. Aug 25, 2012

        Kevin is right. Try this out, it works:

         

        (def ^:dynamic *features* #{'clojure})

        (defmacro feature-cond [& exprs]
          `~(-> (filter (fn [[feature expr]]
                          (contains? *features* feature))
                        (partition 2 exprs))
                first
                second))

        (feature-cond
          clojure (def x 1)
          clojurescript (def y 2))

        (feature-cond
          clojurescript (* y 3)
          clojure (* x 3))

  6. Aug 02, 2012

    I would like to see the feature system being open to use by libraries. In pallet, we recently introduced features to be able to write providers that work with multiple versions of pallet, and I think it would make sense to replace this with whatever feature system ends up in clojure.

    The current proposal seems to support this, but it would be good to clarify this as a requirement of the feature system.

    1. Aug 05, 2012

      Again, if 'features (or whatever it is called) is a map, then you might be able to do something like (load-file (str "foo/" (:dialect features) ".clj")) or something like that... That wouldn't be possible as a bit-flag set.

  7. Aug 23, 2012

    What problems, if any, could this change cause for existing code? Obviously pre-feature-expression code will be unable to read the forms. In particular:

    • 1.4 Clojure and ClojureScript fail to read feature expressions, thinking they are tagged literals with no handler
    • pre-1.4 Clojure fails with a ClassNotFoundException
    1. Aug 26, 2012

      feature-cond above would be readable in previous versions of clojure, and even other lisps for whatever that is worth.

  8. Mar 06, 2013

    What if we provided read-time variants of unquote and unquote-splicing?

    Here's a splicing example:

    The code inside the #~ and #~@ forms would be run at read-time and subject to *read-eval*. This gives you the full power of Clojure for conditional compilation at read time. We could provide some standard function and macro utilities for conditional compilation that are available both at read and macro expansion time. Allowing for something like:

    This proposal gives you the full power of Clojure for controlling conditional reading and compilation without adding dramatically new ideas (it's basically just syntax-quote at read-time).

    1. May 06, 2013

      I love the consistency and minimalism of this approach.

      One Issue however: It probably won't be possible to read-splice on the toplevel. Consider: What should (read-str "#~@(range 5)") yield?

      I'm still +1 on this one and just forbid toplevel splices.

    2. May 08, 2013

      have you looked at what it would take to allow the reader to splice in forms at any point? with syntax quote it is a little easier because the expression has to be syntax quoted to begin with before you can splice in to it.

      1. May 09, 2013

        I guess at the top level, a splice would just assume an implicit "do".

        1. May 09, 2013

          the top level is the easiest case though, the gnarly cases are splicing in to nested data structures, that may be spliced together themselves.

          1. May 09, 2013

            Oh, I thought you were asking with respect to Herwig's comment.

            Doesn't syntax quote already have this problem? Try `{1 ~@(list 2 3 4)} for example. You can do that, but you can't do `{1 2 ~@(list 3 4)}

            Seems like a separate issue all together.

            1. May 09, 2013

              syntax quote at least can limit the scope of splicing and unsplicing to syntax quoted forms. if any recursive call to read can (somehow?) return multiple values I think the entire reader would be much more complex. I have no evidence of that, which is why I asked if you had looked in to what it would take.

              I have no doubt it is possible to do, I just don't think I'd care for the result

              1. Jun 22, 2013

                Syntax quote also has this same problem:

                It would be trivial to mimic that behavior with reader-level syntax quoting.

                Alternatively, you could just return the splice form unmodified, as if quoted. Consider the Sequence form in Mathematica, which is basically clojure.core/unquote-splicing:

                Since expansion is resolved from the inside out, returning a splice form would be a sensible thing to do in a nested quoting scenario.

  9. May 09, 2013

    pushing conditionals in to the reader does not save us from requiring the code to be readable on all platforms. it has to be readable for the reader to even recognize it as a form to throw away

  10. May 11, 2013

    > if any recursive call to read can (somehow?) return multiple values I think the entire reader would be much more complex.

    I guess recursive invocations of read could be replaced with read-seq-of-forms + concat et al.

    That would buy us read-splice, but: Is it worth the performance cost? If yes, should read-seq-of-forms be made public?

    > I guess at the top level, a splice would just assume an implicit "do".

    I'm kind of meh on (= (read-str "#~@(range 5)") '(do 1 2 3 4 5))

    If I want a do, I'll write (a regular form generating) it. `~@ sets a nice precedent.

    Those complications make me wonder if #~@ serves any conceivable purpose, that #~ together `(~@) can't (or shouldn't) serve.

    > pushing conditionals in to the reader does not save us from requiring the code to be readable on all platforms. it has to be readable for the reader to even recognize it as a form to throw away

    True, this should be an intended design constraint. We want to have the same basic clojure syntax on every platform, i.e. a very small superset of edn, right?

    The reason we even need conditionals in the reader is that the reader already does some possibly platform-dependent side-effects, i.e. reader tags, defrecord instances and friends.

    So let's just KISS the syntactic considerations goodbye for a moment:

    That gets us to the real meat of the problem:

    Say we have:

    Under current reader rules, would the #platform tag have a chance to remove the form before the reader tried to apply the #asm tag?

    I suspect that's not the case, which makes me wonder if it's worth changing the reader to allow reader tags to omit reader tags from their child form. This would imply the ability to generate new reader tags (and other reader side effects) in code a reader tag emits.

    I also suspect that if we try to just allocate a couple of new reader table characters  and work downwards from there, we will end up with pretty much the same code it would take to apply reader tags in a second pass, except it would be less general purpose.

    We still might not want to expose it through reader tags, since it would alter their semantics on the input side. Opinions on this one?