This page is work-in-progress. Until finalization, normative language in this page should be considered a proposal.



[r1] 04082016 Encode xml namespaces directly into keyword namespaces

The runtime-global registry for xmlns <-> cljns mappings, provided by declare-ns and alias-ns, poses a problem of governance: if two libraries want to be composable (a very basic requirement), they need to agree on their declare-ns clauses. Even worse, user source code is hardcoded against whatever mapping their library chose to provide. The only "fix" for that would be to maintain a mapping of "known" uris to clojure namespaces within data.xml, but that is unsatisfactory: the work of assigning unique names is already done by various registries, such as IANA, and the list of xml uris that people might want to use with data.xml is quite large.

At the same time, we want to keep using ::xmlns-alias/keywords. This can be achieved by encoding the uri directly into the keyword, after substituting clojure's syntax characters. Percent-encoding, along with the substitution rules given in!topic/clojure/Txj3suj2B3s

Runtime data structures

canonical representation

Even if the emitter accepts a slightly larger set of representations, the parser should produce a very uniform data structure, which should map xml infoset equality to clojure equality and match what a user would write by hand.

Unfortunately, percent-encoded uri namespaces don't quite fit the bill on user-friendliness. Outside of clojure's kw-aliasing facilities, however, this can still be addressed by using reader tags.

xml elements

Elements are represented as maps with keys #{:tag :attrs :content}. The canonical representation is a defrecord, exposed through the constructors element and element*.
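The record-plus-constructors shape described above can be sketched as follows. This is a minimal illustration, not the data.xml source: the constructor names element and element* come from the text, while the record name Element, the arities and the function bodies are assumptions.

```clojure
;; Illustrative sketch only; bodies and arities are assumed, not taken
;; from data.xml.
(defrecord Element [tag attrs content])

(defn element*
  "Builds an Element from a tag, an attribute map and a seq of children.
  A nil attrs map becomes {}, nil children are dropped."
  [tag attrs content]
  (->Element tag (or attrs {}) (remove nil? content)))

(defn element
  "Variadic convenience wrapper around element*."
  ([tag] (element* tag {} nil))
  ([tag attrs] (element* tag attrs nil))
  ([tag attrs & content] (element* tag attrs content)))

;; A record still behaves like a map with the three documented keys.
(def example-element (element :foo {:id "1"} "text"))
```

Because the record is map-like, (:tag example-element) and (:attrs example-element) work exactly as they would on a plain map with keys #{:tag :attrs :content}.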

xml names

̶I̶n̶ ̶t̶h̶e̶ ̶g̶e̶n̶e̶r̶a̶l̶ ̶c̶a̶s̶e̶,̶ ̶x̶m̶l̶ ̶n̶a̶m̶e̶s̶ ̶a̶r̶e̶ ̶r̶e̶p̶r̶e̶s̶e̶n̶t̶e̶d̶ ̶a̶s̶ ̶(̶Q̶N̶a̶m̶e̶s̶)̶[̶h̶t̶t̶p̶:̶/̶/̶d̶o̶c̶s̶.̶o̶r̶a̶c̶l̶e̶.̶c̶o̶m̶/̶j̶a̶v̶a̶e̶e̶/̶1̶.̶4̶/̶a̶p̶i̶/̶j̶a̶v̶a̶x̶/̶x̶m̶l̶/̶n̶a̶m̶e̶s̶p̶a̶c̶e̶/̶Q̶N̶a̶m̶e̶.̶h̶t̶m̶l̶]̶ ̶o̶r̶,̶ ̶i̶f̶ ̶t̶h̶e̶y̶ ̶h̶a̶v̶e̶ ̶n̶o̶ ̶n̶a̶m̶e̶s̶p̶a̶c̶e̶ ̶u̶r̶i̶,̶ ̶a̶s̶ ̶k̶e̶y̶w̶o̶r̶d̶.̶
̶d̶a̶t̶a̶.̶x̶m̶l̶ ̶h̶a̶s̶ ̶a̶ ̶f̶a̶c̶i̶l̶i̶t̶y̶ ̶t̶o̶ ̶a̶s̶s̶o̶c̶i̶a̶t̶e̶ ̶c̶l̶o̶j̶u̶r̶e̶ ̶n̶a̶m̶e̶s̶p̶a̶c̶e̶s̶ ̶w̶i̶t̶h̶ ̶x̶m̶l̶ ̶n̶a̶m̶e̶s̶p̶a̶c̶e̶ ̶u̶r̶i̶s̶.̶ ̶W̶h̶i̶c̶h̶ ̶a̶l̶l̶o̶w̶s̶ ̶c̶l̶o̶j̶u̶r̶e̶'̶s̶ ̶s̶h̶o̶r̶t̶h̶a̶n̶d̶-̶s̶y̶n̶t̶a̶x̶ ̶f̶o̶r̶ ̶n̶a̶m̶e̶s̶p̶a̶c̶e̶d̶ ̶k̶e̶y̶w̶o̶r̶d̶s̶ ̶t̶o̶ ̶b̶e̶ ̶u̶s̶e̶d̶:̶

Xml qnames are uniformly encoded into keywords by urlencoding the xmlns uri into the keyword namespace. For names in the empty namespace, non-namespaced keywords are used.

<foo/> => {:tag :foo}

<n:foo xmlns:n="NO:NO/NO" /> => {:tag :xmlns.NO%3ANO%2FNO/foo}
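The two examples above can be reproduced with a small sketch. The helper names uri->kw-ns, qname->kw and kw->uri are hypothetical; only the xmlns. prefix and the percent-encoding are taken from the example. java.net.URLEncoder is used as a stand-in for the encoding step, so any further substitution rules for clojure's syntax characters are not covered here.

```clojure
;; Sketch of the uri -> keyword-namespace encoding; helper names are
;; hypothetical and URLEncoder is an assumed stand-in for the encoder.

(defn uri->kw-ns
  "Percent-encodes an xmlns uri and prefixes it with \"xmlns.\"."
  [uri]
  (str "xmlns." (java.net.URLEncoder/encode ^String uri "UTF-8")))

(defn qname->kw
  "Empty namespace => plain keyword, otherwise a namespaced keyword."
  [uri local-name]
  (if (empty? uri)
    (keyword local-name)
    (keyword (uri->kw-ns uri) local-name)))

(defn kw->uri
  "Inverse direction. Assumes the keyword namespace, if present, was
  produced by uri->kw-ns; returns \"\" for non-namespaced keywords."
  [kw]
  (if-let [kw-ns (namespace kw)]
    (java.net.URLDecoder/decode ^String (subs kw-ns (count "xmlns.")) "UTF-8")
    ""))
```

With these definitions, (qname->kw "NO:NO/NO" "foo") yields a keyword whose namespace is xmlns.NO%3ANO%2FNO and whose name is foo, and kw->uri recovers NO:NO/NO from it.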

Similar to xml serialization, the kw-ns :xmlns/... and :xml/... are given special treatment: even though you can still emit them by giving their full namespace uri, their canonical representation is the short form.
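One way to picture the special treatment is a canonicalizing constructor that maps the two reserved namespace uris to their short forms. The function name canonical-kw is hypothetical; the two uris are the ones reserved by the xml namespaces specification, and URLEncoder is again an assumed stand-in for the encoding step.

```clojure
;; Sketch of short-form canonicalization; canonical-kw is a hypothetical
;; helper, not part of data.xml.
(def xml-uri   "http://www.w3.org/XML/1998/namespace")
(def xmlns-uri "http://www.w3.org/2000/xmlns/")

(defn canonical-kw
  "Returns the canonical keyword for a (uri, local-name) pair."
  [uri local-name]
  (cond
    (empty? uri)      (keyword local-name)          ; empty namespace
    (= uri xml-uri)   (keyword "xml" local-name)    ; short form :xml/...
    (= uri xmlns-uri) (keyword "xmlns" local-name)  ; short form :xmlns/...
    :else (keyword (str "xmlns."
                        (java.net.URLEncoder/encode ^String uri "UTF-8"))
                   local-name)))
```

Under this sketch, an emitter would accept both the full-uri form and the short form as input, while a parser would only ever produce the short form.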

Additionally accepted qname types in the emitter:

xml attributes

Attributes are stored in hash-maps. The parser removes xmlns attributes from the attr hash and stores them in metadata (accessible via

The namespace environment can be augmented by associating :xmlns and :xmlns/<prefix> attributes.
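The split between ordinary attributes and namespace declarations might look like the following sketch. The function names and the metadata key :example/xmlns-decls are hypothetical; the text does not say under which key or through which accessor data.xml exposes the declarations.

```clojure
;; Sketch only: the metadata key and helper names are assumptions.

(defn xmlns-attr?
  "True for :xmlns (default namespace) and :xmlns/<prefix> keys."
  [k]
  (and (keyword? k)
       (or (= :xmlns k)
           (= "xmlns" (namespace k)))))

(defn split-attrs
  "Returns the attr map without xmlns declarations. The declarations
  are attached as metadata under the hypothetical key
  :example/xmlns-decls."
  [attrs]
  (let [decls (into {} (filter (comp xmlns-attr? key)) attrs)
        plain (into {} (remove (comp xmlns-attr? key)) attrs)]
    (with-meta plain {:example/xmlns-decls decls})))

(def example-attrs
  (split-attrs {:xmlns "urn:default" :xmlns/n "NO:NO/NO" :id "1"}))
```

Since metadata does not take part in clojure equality, two attribute maps that differ only in their namespace declarations still compare equal, which fits the goal of mapping xml infoset equality to clojure equality.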