This design document covers a superset of the clojure.xml schema.
All metadata is optional: parsers SHOULD provide it and serializers MAY leverage it. Metadata is not required to be kept in sync with the data it annotates, so it is up to each serializer's heuristics to use or ignore it.
A namespace-aware XML parser produces a data structure following the representations below, which are designed as a superset of the clojure.xml representation:
| XML | Clojure | Notes |
|---|---|---|
| Element | ^{::xml/ns {"" "http://..." | A map with keys: :tag (local name, keywordized), :attrs (attribute map), :content (nil, a sequence, or a sequential collection of nodes), :uri (namespace URI as a string). Metadata: under the ::xml/ns key, a map of prefixes (strings) to URIs (strings); under the ::xml/prefix key, the original prefix as a string. (Here xml is an alias for data.xml.) |
| Attribute | ^{::xml/ns {"" "http://..." | A map entry whose key is either a keyword (no prefix) or a [:kw "uri"] pair; the value is a string. For [:kw "uri"] keys, ::xml/prefix SHOULD be present, and the URI MUST NOT be nil or the empty string. |
| Text nodes | "text" | |
| Comment | {:comment "text"} | |
| ... | TBD | |
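As a rough illustration of the table above, here is how one parsed element might look. This is a sketch under assumptions, not the definitive output of any parser: the Atom and XLink URIs and the sample markup are illustrative, the keywords are written out as :data.xml/ns and :data.xml/prefix (what ::xml/ns and ::xml/prefix would read as with xml aliased to data.xml), and attaching ::xml/prefix as metadata on the [:kw "uri"] key is one possible reading of the Attribute row.

```clojure
;; Assumption: with `xml` aliased to `data.xml`, ::xml/ns reads as :data.xml/ns
;; and ::xml/prefix as :data.xml/prefix. They are spelled out here so the
;; snippet loads without the alias being defined.

;; Illustrative input:
;; <atom:link xmlns:atom="http://www.w3.org/2005/Atom"
;;            xmlns:xlink="http://www.w3.org/1999/xlink"
;;            href="/feed" xlink:type="simple">self</atom:link>
(def link
  (with-meta
    {:tag     :link                           ; local name, keywordized
     :uri     "http://www.w3.org/2005/Atom"   ; namespace URI as a string
     :attrs   {:href "/feed"                  ; unprefixed attribute: plain keyword
               ;; prefixed attribute: [:kw "uri"] key; carrying the original
               ;; prefix as metadata on the key is an assumption
               (with-meta [:type "http://www.w3.org/1999/xlink"]
                 {:data.xml/prefix "xlink"}) "simple"}
     :content ["self"]}                       ; text nodes are plain strings
    {:data.xml/ns     {"atom"  "http://www.w3.org/2005/Atom"
                       "xlink" "http://www.w3.org/1999/xlink"}
     :data.xml/prefix "atom"}))
```

Because metadata does not take part in equality, this element compares equal to the same map without metadata, and the prefixed attribute can still be looked up with a plain [:type "http://www.w3.org/1999/xlink"] vector.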
The default serialization strategy is conservative.
When serializing an element, add any missing namespace declarations.
For each name (element or attribute name):
For elements:
Emit xmlns and xmlns:* attributes for new mappings, that is, mappings generated by attribute serialization or found as new entries under the ::xml/ns key. A mapping is new when it is absent from the state maintained by the serializer.
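The "new mappings" step can be sketched as a diff between the mappings already in scope and those an element declares. This is a minimal sketch that assumes the serializer state is a plain prefix-to-URI map; the function names new-mappings and xmlns-attrs are hypothetical, not part of any existing API.

```clojure
(defn new-mappings
  "Returns the prefix->uri entries of `declared` that are absent from, or
  bound to a different URI in, `in-scope`. Both arguments are maps of
  prefix strings to URI strings; `in-scope` stands in for the serializer state."
  [in-scope declared]
  (into {}
        (remove (fn [[prefix uri]] (= uri (get in-scope prefix))))
        declared))

(defn xmlns-attrs
  "Turns prefix->uri mappings into xmlns / xmlns:* attribute names and values.
  The empty prefix denotes the default namespace."
  [mappings]
  (into (sorted-map)
        (map (fn [[prefix uri]]
               [(if (= "" prefix) "xmlns" (str "xmlns:" prefix)) uri]))
        mappings))
```

For example, with {"" "http://a"} already in scope, an element declaring {"" "http://a", "x" "http://x"} would only need an xmlns:x attribute to be emitted.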
To please broken consumers, XML serialization has to be tweaked. It may be interesting to make emit, or *xml-emitter*, or something similar a dynamic var.
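A dynamic var would let callers swap the emitter for a given scope. The sketch below assumes a toy emitter that only handles empty elements; default-emitter, expanded-emitter and this emit are hypothetical names chosen for illustration, and the "broken consumer" is imagined to reject self-closing tags.

```clojure
(defn default-emitter
  "Emits an empty element as a self-closing tag."
  [{:keys [tag]}]
  (str "<" (name tag) "/>"))

(defn expanded-emitter
  "Emits an empty element as an explicit open/close pair, for consumers
  that mishandle self-closing tags."
  [{:keys [tag]}]
  (str "<" (name tag) "></" (name tag) ">"))

;; The emitter in effect; rebind it with `binding` to tweak serialization.
(def ^:dynamic *xml-emitter* default-emitter)

(defn emit
  "Serializes `element` with whatever emitter is currently bound."
  [element]
  (*xml-emitter* element))

;; Usage: the tweak applies only within the dynamic scope of `binding`.
(def plain   (emit {:tag :br}))
(def tweaked (binding [*xml-emitter* expanded-emitter]
               (emit {:tag :br})))
```

The trade-off of a dynamic var is that the binding is thread-local and implicit, so lazily produced output realized outside the binding form would fall back to the default emitter.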
Less than half-baked idea:
Defining a full serializer is tiresome so it may be interesting to provide a factory.
    (serializer init                               ; internal state of the serializer, initial value
      (fn [state xmlns] state')                    ; fn to update internal state given a new xmlns map
      (fn [state local-name uri prefix] prefix'))  ; fn which decides which prefix to use for a given name
Not all serialization quirks can be solved by emitters produced by such a factory, but if it covers a good chunk of them it may be worthwhile.
There must be a better abstraction.
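To make the factory idea more concrete, here is one way it could be sketched. Everything beyond the three arguments named above is an assumption: the returned map shape, the helpers enter-element and qname, and the example policy (state as a URI-to-prefix map, preferring a prefix already in scope over the original one) are all hypothetical.

```clojure
(defn serializer
  "Hypothetical factory: bundles the initial state with the two policy fns.
  update-state:  (fn [state xmlns] state')
  choose-prefix: (fn [state local-name uri prefix] prefix')"
  [init update-state choose-prefix]
  {:state init
   :update-state update-state
   :choose-prefix choose-prefix})

(defn enter-element
  "Returns the serializer with its state updated from an element's
  prefix->uri map (the ::xml/ns metadata)."
  [{:keys [update-state] :as s} xmlns]
  (update s :state update-state xmlns))

(defn qname
  "Returns the qualified name to write for a name, letting the policy
  pick the prefix. A nil or empty prefix yields the bare local name."
  [{:keys [state choose-prefix]} local-name uri preferred-prefix]
  (let [prefix (choose-prefix state local-name uri preferred-prefix)]
    (if (seq prefix)
      (str prefix ":" local-name)
      local-name)))

;; Example policy: the state maps each URI to the prefix currently bound to it.
(def reuse-in-scope-prefixes
  (serializer
    {}
    ;; record every declared mapping as uri -> prefix
    (fn [state xmlns]
      (reduce-kv (fn [m prefix uri] (assoc m uri prefix)) state xmlns))
    ;; reuse the in-scope prefix for the URI, else keep the original prefix
    (fn [state _local-name uri prefix]
      (get state uri prefix))))
```

With this policy, after entering an element that binds "a" to the Atom URI, a name that was originally written with the "atom" prefix would be written as a:link, since that prefix is already declared.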