data.xml

Support for preserving whitespace between tags

Details

  • Type: Enhancement
  • Status: Open
  • Priority: Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels: None

Description

XML parsers can preserve whitespace-only text nodes, but clojure.data.xml does not seem to support this.

For example, the following should be able to return true (perhaps with an option to parse-str):

Desired Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             " "
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true

This is the current behavior:

Current Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true

Activity

Aron Nopanen added a comment -

Seconded.

The issue lies with the '.isWhiteSpace' check in this section of function pull-seq:

XMLStreamConstants/CHARACTERS
(if-let [text (and (not (.isWhiteSpace sreader))
                   (.getText sreader))]
  (cons (event :characters nil nil text)
        (pull-seq sreader))
  (recur))

While the 'props' argument to parse/parse-str currently only holds XMLInputFactory options, perhaps a ':maintain-whitespace' option could be added that affects this behavior? It would be straightforward to pass the props into pull-seq to conditionally perform the .isWhiteSpace check.
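
The conditional check suggested above can be illustrated outside the library. The following is only a minimal sketch against the JDK's javax.xml.stream API, not the attached patch and not clojure.data.xml's own code: the helper name text-events and its maintain-whitespace? argument are hypothetical, chosen to mirror the proposed option.

```clojure
(import '(javax.xml.stream XMLInputFactory XMLStreamConstants XMLStreamReader)
        '(java.io StringReader))

(defn text-events
  "Hypothetical helper, not part of clojure.data.xml. Returns a vector of the
  character data found in the XML string s. Whitespace-only text is skipped
  unless maintain-whitespace? is true."
  [^String s maintain-whitespace?]
  (let [^XMLInputFactory factory (doto (XMLInputFactory/newInstance)
                                   ;; Report adjacent character data as one event.
                                   (.setProperty XMLInputFactory/IS_COALESCING true))
        ^XMLStreamReader sreader (.createXMLStreamReader factory (StringReader. s))]
    (loop [acc []]
      (if (.hasNext sreader)
        (let [event-type (.next sreader)]
          (if (and (= event-type XMLStreamConstants/CHARACTERS)
                   ;; The .isWhiteSpace test is applied only when the
                   ;; caller has not asked to keep whitespace.
                   (or maintain-whitespace?
                       (not (.isWhiteSpace sreader))))
            (recur (conj acc (.getText sreader)))
            (recur acc)))
        acc))))

(def sample "<x><a>foo</a> <a>bar</a></x>")

(text-events sample false) ; whitespace-only text between the <a> elements is dropped
(text-events sample true)  ; the single space between the <a> elements is kept
```

In pull-seq itself, the equivalent change would presumably be to guard the existing (not (.isWhiteSpace sreader)) test with the new option in the same way.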

Aron Nopanen added a comment -

I have attached a patch to support a :maintain-whitespace property to parse and parse-str. If set to 'true', whitespace-only nodes will not be stripped during the parsing process.

Aron Nopanen made changes -
  Field: Attachment
  Original Value: (none)
  New Value: DXML-13.patch [ 12195 ]
Ryan Senior added a comment -

Hi Aron,

Thanks for the patch. Have you sent in a contributor agreement? I didn't see your name here: http://clojure.org/contributing. Submitting patches to Clojure contrib libraries requires this.

Jason Gilman added a comment -

I'm running into this problem as well. Can this be fixed without using the contributed patch?

Jan-Paul Bultmann added a comment -

Aron Nopanen, the author of the patch, is now in the contributors list after signing the agreement.
Is the patch still applicable, or does it have to be adapted due to its age?

