
This design document covers a superset of the clojure.xml schema.

All metadata is optional: parsers SHOULD provide it and serializers MAY leverage it. Metadata is not required to be kept in sync with the data it annotates, so it is up to each serializer's heuristics to use or ignore it.


A namespace-aware XML parser produces a data structure following the representations below, which are designed as a superset of the clojure.xml representation:

^{::xml/ns {"" "http://..."
            "x" "http://..."}
  ::xml/prefix "x"}
{:tag :foo
 :uri "http://..."
 :attrs {...}
 :content [...]}

An element is a map with keys :tag (the local name, keywordized), :attrs (the attribute map), :content (nil, or a sequence or sequential collection of nodes) and :uri (the namespace URI, as a string).

Metadata: under the ::xml/ns key, a map of prefixes (strings) to URIs (strings); under the ::xml/prefix key, the original prefix as a string.

(where xml is an alias for data.xml)
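
As an illustration, an element in this shape could be built by hand as below. This is only a sketch: the example.org URIs are placeholders, and the metadata keys are spelled out as :data.xml/ns and :data.xml/prefix on the assumption that the xml alias points at a namespace literally named data.xml, so that the snippet does not depend on any library being loaded.

```clojure
;; Placeholder URIs. :data.xml/... spells out ::xml/... under the assumption
;; that the xml alias points at a namespace named data.xml.
(def element
  (with-meta
    {:tag     :foo
     :uri     "http://example.org/x"
     :attrs   {}
     :content ["hello"]}
    {:data.xml/ns     {""  "http://example.org/default"
                       "x" "http://example.org/x"}
     :data.xml/prefix "x"}))

;; The same element without any metadata.
(def plain
  {:tag :foo :uri "http://example.org/x" :attrs {} :content ["hello"]})

;; Metadata does not take part in equality, so consumers written against
;; the plain map shape are unaffected by it.
(= element plain)                 ; true

;; The original prefix is only reachable through the metadata.
(:data.xml/prefix (meta element)) ; "x"
```

Because the namespace context lives in metadata, a serializer that ignores it still has everything it strictly needs in :tag and :uri.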

^{::xml/ns {"" "http://..."
            "x" "http://..."}
  ::xml/prefix "x"}
[:href "http://..."]

An attribute is a map entry whose key is either a keyword (no prefix) or a [:kw "uri"] pair; the value is a string.

In the case of [:kw "uri"] keys, ::xml/prefix SHOULD be present.

The URI MUST NOT be nil or the empty string.
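
A small sketch of an attribute map mixing both key shapes may help. The attribute values are made up, the metadata key is again written as :data.xml/prefix (assuming the data.xml alias target), and valid-attr-key? is a hypothetical helper that is not part of any library.

```clojure
;; The XLink namespace URI is real; the attribute values are made up.
(def xlink "http://www.w3.org/1999/xlink")

;; A namespaced attribute key carrying its original prefix as metadata.
(def href-key
  (with-meta [:href xlink] {:data.xml/prefix "xlink"}))

(def attrs
  {:id      "a1"                          ; attribute without a namespace
   href-key "http://example.org/target"}) ; namespaced attribute

;; Vector equality ignores metadata, so lookup needs only the [name uri] pair.
(get attrs [:href xlink])

;; find returns the entry as stored, so the key's metadata stays reachable.
(meta (key (find attrs [:href xlink])))

(defn valid-attr-key?
  "Hypothetical check: true for a plain keyword, or for a [keyword uri]
  pair whose uri is a non-empty string (the MUST NOT rule above)."
  [k]
  (or (keyword? k)
      (and (vector? k)
           (= 2 (count k))
           (keyword? (first k))
           (string? (second k))
           (not= "" (second k)))))
```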

Text nodes
 {:comment "text"}
... TBD

Default serialization strategy suggestion

The default serialization strategy is conservative.

When serializing an element: add missing namespace declarations.

For each name (element or attribute name):

  1. If the ::xml/prefix is currently mapped to the :uri, use it.
  2. Otherwise, if the URI is mapped to another alias, use that alias. (Open question: should we rather map the URI to the new alias?)
  3. Otherwise the URI is not mapped: map it under the specified alias (if present) or under a gensymed alias.

For elements:

Emit xmlns and xmlns:* attributes for new mappings (those generated by attribute serialization and by new entries under the ::xml/ns key, where "new" is relative to the state maintained by the serializer).
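
The three resolution steps and the declaration diff could be sketched roughly as follows. The function names resolve-prefix and new-declarations are hypothetical, the serializer state is assumed to be a plain map of prefix to URI, and the sketch ignores the rule that the default namespace does not apply to attributes.

```clojure
(defn resolve-prefix
  "Returns [prefix scope'] for a name in namespace uri, where scope maps
  prefixes to URIs and preferred is the ::xml/prefix hint (may be nil)."
  [scope uri preferred]
  (let [bound (some (fn [[p u]] (when (= u uri) p)) scope)]
    (cond
      ;; 1. the preferred prefix is already mapped to this URI
      (and preferred (= uri (get scope preferred)))
      [preferred scope]

      ;; 2. the URI is already mapped to some other prefix
      bound
      [bound scope]

      ;; 3. the URI is unmapped: bind it to the hint or to a generated prefix
      :else
      (let [p (or preferred (name (gensym "ns")))]
        [p (assoc scope p uri)]))))

(defn new-declarations
  "xmlns attributes for mappings present in new-scope but not in old-scope."
  [old-scope new-scope]
  (into {}
        (for [[p u] new-scope
              :when (not= u (get old-scope p))]
          [(if (= p "") "xmlns" (str "xmlns:" p)) u])))

;; Usage with placeholder URIs.
(def scope {""  "http://example.org/default"
            "x" "http://example.org/x"})

(resolve-prefix scope "http://example.org/x" "x") ; step 1, keeps "x"
(resolve-prefix scope "http://example.org/x" "y") ; step 2, reuses "x"
(let [[_ scope'] (resolve-prefix scope "http://example.org/z" "z")]
  (new-declarations scope scope'))                ; {"xmlns:z" "http://example.org/z"}
```

In step 3 the hint is used even when that prefix is already bound to a different URI, which shadows the outer binding for the element being emitted; a more conservative variant might fall back to a generated prefix in that case.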


Custom serialization strategies

To please broken consumers, XML serialization sometimes has to be tweaked. It may be worthwhile to make emit, or something like *xml-emitter*, a dynamic var.
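
A minimal sketch of the dynamic-var idea, assuming the emitter is simply a function from element to string. The emitters here are toys (they handle only :tag and string children, with no attributes, escaping or namespaces), and this emit is a local stand-in, not the library's function.

```clojure
(defn default-emitter
  "Toy emitter: always writes an explicit closing tag."
  [{:keys [tag content]}]
  (str "<" (name tag) ">" (apply str content) "</" (name tag) ">"))

;; The emitter in effect; callers can rebind it per thread with binding.
(def ^:dynamic *xml-emitter* default-emitter)

(defn emit
  "Serializes element with whatever emitter is currently bound."
  [element]
  (*xml-emitter* element))

(defn self-closing-emitter
  "Toy variant for a consumer that insists on <tag/> for empty elements."
  [{:keys [tag content] :as element}]
  (if (empty? content)
    (str "<" (name tag) "/>")
    (default-emitter element)))

(emit {:tag :br :content nil})    ; "<br></br>"

(binding [*xml-emitter* self-closing-emitter]
  (emit {:tag :br :content nil})) ; "<br/>"
```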


Less than half-baked idea:

Defining a full serializer is tedious, so it may be worth providing a factory.
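
Purely as an illustration of what such a factory might look like: make-emitter and its two options are invented here, and the produced emitters are toys that handle only :tag and string children.

```clojure
(defn make-emitter
  "Hypothetical factory: returns an emitter function configured by options.
  :self-close-empty? writes <tag/> for elements without content;
  :tag-fn turns the :tag keyword into the string to write."
  [{:keys [self-close-empty? tag-fn]
    :or   {self-close-empty? true
           tag-fn            name}}]
  (fn [{:keys [tag content]}]
    (let [t (tag-fn tag)]
      (if (and self-close-empty? (empty? content))
        (str "<" t "/>")
        (str "<" t ">" (apply str content) "</" t ">")))))

;; Defaults only.
(def strict-emit (make-emitter {}))

;; For an imaginary consumer wanting upper-case tags and explicit closing tags.
(def legacy-emit
  (make-emitter {:self-close-empty? false
                 :tag-fn (fn [k] (.toUpperCase ^String (name k)))}))

(strict-emit {:tag :br}) ; "<br/>"
(legacy-emit {:tag :br}) ; "<BR></BR>"
```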


Not every serialization quirk can be solved by emitters produced by such a factory, but if it covers a good share of them it may be worthwhile.

There must be a better abstraction.

  1. Apr 21, 2013

    In case someone would like to see some of the discussion that led to this proposal:

  2. Jan 24, 2014

    A better link is!msg/clojure-dev/3_jkBrdQKgs/dUwtevWqlwkJ

    Is there a clojure xml lib around implementing the proposed schema?

    As I'm trying to implement this, I'm having second thoughts about the direction of the mapping: it's important to preserve the fact that one URI may be mapped to several prefixes, but using prefixes as keys makes serialization inconvenient.

    So I propose to have a map of URIs to sets of prefixes.

    A map of URIs to sets of prefixes points in the right direction when emitting (and that is the only moment where we care about the mapping) while still allowing a cheap check of whether a namespace prefix and a URI are mapped together.
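
A rough sketch of what that could look like, assuming the existing prefix-to-URI map as input. The function names are made up for illustration and the URIs are placeholders.

```clojure
(defn uri->prefixes
  "Inverts a map of prefix -> URI into a map of URI -> set of prefixes."
  [prefix->uri]
  (reduce-kv (fn [m prefix uri]
               (update m uri (fnil conj #{}) prefix))
             {}
             prefix->uri))

(defn mapped?
  "True when prefix is one of the prefixes recorded for uri."
  [index prefix uri]
  (contains? (get index uri #{}) prefix))

(def index
  (uri->prefixes {""  "http://example.org/a"
                  "a" "http://example.org/a"
                  "b" "http://example.org/b"}))

(get index "http://example.org/a")         ; #{"" "a"}
(mapped? index "a" "http://example.org/a") ; true
(mapped? index "b" "http://example.org/a") ; false
```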