<< Back to previous view

[DXML-13] Support for preserving whitespace between tags Created: 10/Feb/13  Updated: 08/Apr/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Kevin Albrecht Assignee: Ryan Senior
Resolution: Unresolved Votes: 3
Labels: None

Attachments: Text File DXML-13.patch    

 Description   

XML parsers can support preserving white space nodes, but clojure.data.xml does not seem to support this functionality.

For example, the following should be able to return true (perhaps with an option to parse-str):

Desired Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             " "
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true

This is the current behavior:

Current Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true


 Comments   
Comment by Aron Nopanen [ 18/Aug/13 3:49 PM ]

Seconded.

The issue lies with the '.isWhiteSpace' check in this section of function pull-seq:

XMLStreamConstants/CHARACTERS
(if-let [text (and (not (.isWhiteSpace sreader))
(.getText sreader))]
(cons (event :characters nil nil text)
(pull-seq sreader))
(recur))

While the 'props' argument to parse/parse-str currently only holds XMLInputFactory options, perhaps a ':maintain-whitespace' option could be added that affects this behavior? It would be straightforward to pass the props into pull-seq to conditionally perform the .isWhiteSpace check.

Comment by Aron Nopanen [ 20/Aug/13 12:47 AM ]

I have attached a patch to support a :maintain-whitespace property to parse and parse-str. If set to 'true', whitespace-only nodes will not be stripped during the parsing process.

Comment by Ryan Senior [ 10/Nov/13 10:38 PM ]

Hi Aron,

Thanks for the patch. Have sent in a contributor agreement? I didn't see you name here: http://clojure.org/contributing. Submitting patches to Clojure contrib libraries requires this.

Comment by Jason Gilman [ 08/Apr/14 6:51 AM ]

I'm running into this problem as well. Can this be fixed without using the contributed patch?

Generated at Tue Oct 21 09:05:15 CDT 2014 using JIRA 4.4#649-r158309.