data.xml

Support for preserving whitespace between tags

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Major Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None

Description

XML parsers can support preserving white space nodes, but clojure.data.xml does not seem to support this functionality.

For example, the following should be able to return true (perhaps with an option to parse-str):

Desired Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             " "
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true

This is the current behavior:

Current Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true

Activity

Hide
Jason Gilman added a comment -

I'm running into this problem as well. Can this be fixed without using the contributed patch?

Show
Jason Gilman added a comment - I'm running into this problem as well. Can this be fixed without using the contributed patch?
Hide
Ryan Senior added a comment -

Hi Aron,

Thanks for the patch. Have sent in a contributor agreement? I didn't see you name here: http://clojure.org/contributing. Submitting patches to Clojure contrib libraries requires this.

Show
Ryan Senior added a comment - Hi Aron, Thanks for the patch. Have sent in a contributor agreement? I didn't see you name here: http://clojure.org/contributing. Submitting patches to Clojure contrib libraries requires this.
Hide
Aron Nopanen added a comment -

I have attached a patch to support a :maintain-whitespace property to parse and parse-str. If set to 'true', whitespace-only nodes will not be stripped during the parsing process.

Show
Aron Nopanen added a comment - I have attached a patch to support a :maintain-whitespace property to parse and parse-str. If set to 'true', whitespace-only nodes will not be stripped during the parsing process.
Hide
Aron Nopanen added a comment -

Seconded.

The issue lies with the '.isWhiteSpace' check in this section of function pull-seq:

XMLStreamConstants/CHARACTERS
(if-let [text (and (not (.isWhiteSpace sreader))
(.getText sreader))]
(cons (event :characters nil nil text)
(pull-seq sreader))
(recur))

While the 'props' argument to parse/parse-str currently only holds XMLInputFactory options, perhaps a ':maintain-whitespace' option could be added that affects this behavior? It would be straightforward to pass the props into pull-seq to conditionally perform the .isWhiteSpace check.

Show
Aron Nopanen added a comment - Seconded. The issue lies with the '.isWhiteSpace' check in this section of function pull-seq: XMLStreamConstants/CHARACTERS (if-let [text (and (not (.isWhiteSpace sreader)) (.getText sreader))] (cons (event :characters nil nil text) (pull-seq sreader)) (recur)) While the 'props' argument to parse/parse-str currently only holds XMLInputFactory options, perhaps a ':maintain-whitespace' option could be added that affects this behavior? It would be straightforward to pass the props into pull-seq to conditionally perform the .isWhiteSpace check.

People

Vote (3)
Watch (1)

Dates

  • Created:
    Updated: