
[DXML-15] data.xml can't parse own output if there's a colon in an attribute name Created: 03/Apr/13  Updated: 03/Apr/13

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: ben wolfson Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: None
Environment:

data.xml 0.0.7



 Description   

Observe:

> (x/emit-str (x/element :NC {"xmlns" "http://example.com" "xmlns:xsi" "http://www.w3.org/2001/XMLSchema-instance" "xsi:schemaLocation" "http://www.example.com/schema.xsd"} (x/element :Foo {} "bar")))
"<?xml version=\"1.0\" encoding=\"UTF-8\"?><NC xsi:schemaLocation=\"http://www.example.com/schema.xsd\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns=\"http://example.com\"><Foo>bar</Foo></NC>"
> (x/parse-str *1)
#clojure.data.xml.Element{:tag :NC, :attrs {:xsi/schemaLocation "http://www.example.com/schema.xsd"}, :content (#clojure.data.xml.Element{:tag :Foo, :attrs {}, :content ("bar")})}
> (x/emit-str *1)
XMLStreamException Prefix cannot be null com.sun.xml.internal.stream.writers.XMLStreamWriterImpl.writeAttribute (XMLStreamWriterImpl.java:574)
app.services.external.experian.internal.test-data>

(a) the xmlns and xmlns:xsi attributes have disappeared. Not the point of this issue but worth pointing out.
(b) "xsi:schemaLocation" has become :xsi/schemaLocation
(c) emitting a string blows up.
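
The failure in (c) looks consistent with the namespace part of the keyword ("xsi") being handed to the StAX writer where a namespace URI is expected. The following is a minimal sketch of that situation using only the JDK's javax.xml.stream API, not data.xml itself; write-schema-location and xsi-uri are hypothetical names introduced for illustration, and the behavior described is that of the JDK's built-in writer, which other StAX implementations may not share.

```clojure
(import '(javax.xml.stream XMLOutputFactory XMLStreamWriter XMLStreamException)
        '(java.io StringWriter))

;; Namespace URI conventionally bound to the xsi prefix.
(def xsi-uri "http://www.w3.org/2001/XMLSchema-instance")

(defn write-schema-location
  "Hypothetical helper: writes <NC> with a schemaLocation attribute, passing
  ns-arg as the namespaceURI argument of the three-argument writeAttribute.
  When declare-xsi? is true, the xsi prefix is bound to xsi-uri first.
  Returns {:xml string} on success or {:error message} on failure."
  [ns-arg declare-xsi?]
  (let [sw (StringWriter.)
        ^XMLStreamWriter w (.createXMLStreamWriter (XMLOutputFactory/newInstance) sw)]
    (try
      (.writeStartElement w "NC")
      (when declare-xsi?
        (.writeNamespace w "xsi" xsi-uri))
      ;; The writer must map ns-arg back to a prefix that is in scope.
      (.writeAttribute w ns-arg "schemaLocation" "http://www.example.com/schema.xsd")
      (.writeEndElement w)
      (.flush w)
      (.close w)
      {:xml (str sw)}
      (catch XMLStreamException e
        {:error (.getMessage e)}))))

;; Prefix used in place of a URI, nothing declared: expected to fail.
(write-schema-location "xsi" false)

;; Real URI with the prefix declared: expected to succeed.
(write-schema-location xsi-uri true)
```

If this reading is right, a fix would need the emitter to know the URI for each prefix, which is the bookkeeping discussed in DXML-4.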






[DXML-13] Support for preserving whitespace between tags Created: 10/Feb/13  Updated: 08/Apr/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Kevin Albrecht Assignee: Ryan Senior
Resolution: Unresolved Votes: 2
Labels: None

Attachments: Text File DXML-13.patch    

 Description   

XML parsers can support preserving white space nodes, but clojure.data.xml does not seem to support this functionality.

For example, the following should be able to return true (perhaps with an option to parse-str):

Desired Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             " "
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true

This is the current behavior:

Current Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true


 Comments   
Comment by Aron Nopanen [ 18/Aug/13 3:49 PM ]

Seconded.

The issue lies with the '.isWhiteSpace' check in this section of function pull-seq:

XMLStreamConstants/CHARACTERS
(if-let [text (and (not (.isWhiteSpace sreader))
                   (.getText sreader))]
  (cons (event :characters nil nil text)
        (pull-seq sreader))
  (recur))

While the 'props' argument to parse/parse-str currently only holds XMLInputFactory options, perhaps a ':maintain-whitespace' option could be added that affects this behavior? It would be straightforward to pass the props into pull-seq to conditionally perform the .isWhiteSpace check.
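
As a rough illustration of making that check conditional, here is a minimal sketch that reads character events straight from a JDK XMLStreamReader. It is not the attached patch and does not touch pull-seq; character-events and its keep-whitespace? flag are hypothetical names chosen for this example.

```clojure
(import '(javax.xml.stream XMLInputFactory XMLStreamConstants XMLStreamReader)
        '(java.io StringReader))

(defn character-events
  "Hypothetical helper: returns a vector of the text of every CHARACTERS
  event in xml. Whitespace-only text is dropped unless keep-whitespace?
  is true."
  [^String xml keep-whitespace?]
  (let [^XMLInputFactory factory (XMLInputFactory/newInstance)
        ^XMLStreamReader r (.createXMLStreamReader factory (StringReader. xml))]
    (loop [acc []]
      (if (.hasNext r)
        (let [ev (.next r)]
          (if (and (= ev XMLStreamConstants/CHARACTERS)
                   ;; The .isWhiteSpace test only applies when stripping.
                   (or keep-whitespace? (not (.isWhiteSpace r))))
            (recur (conj acc (.getText r)))
            (recur acc)))
        acc))))

(def sample "<x><a>foo</a> <a>bar</a></x>")

;; Stripping, as in the current behavior.
(character-events sample false)

;; Keeping whitespace-only text, as the proposed option would.
(character-events sample true)
```

Defaulting the flag to false would keep existing callers unchanged, which is presumably why an opt-in option was suggested.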

Comment by Aron Nopanen [ 20/Aug/13 12:47 AM ]

I have attached a patch to support a :maintain-whitespace property to parse and parse-str. If set to 'true', whitespace-only nodes will not be stripped during the parsing process.

Comment by Ryan Senior [ 10/Nov/13 10:38 PM ]

Hi Aron,

Thanks for the patch. Have you sent in a contributor agreement? I didn't see your name here: http://clojure.org/contributing. Submitting patches to Clojure contrib libraries requires this.

Comment by Jason Gilman [ 08/Apr/14 6:51 AM ]

I'm running into this problem as well. Can this be fixed without using the contributed patch?





[DXML-10] Support for DOCTYPE when emitting XML Created: 14/Nov/12  Updated: 14/Nov/12

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Thomas Greve Kristensen Assignee: Ryan Senior
Resolution: Unresolved Votes: 1
Labels: None

Attachments: XML File web.xml    

 Description   

Some consumers of XML files require an explicit DOCTYPE to accept an XML file. data.xml does not currently support specifying a DOCTYPE when emitting XML. When XML is parsed, I believe DOCTYPEs are silently ignored, so there is no representation for them in the data model. The best design is possibly a :doctype option in clojure.data.xml/emit?

I've attached a web.xml as an example.
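
For reference, the underlying StAX writer already has a writeDTD method, so an option like this might amount to one extra call before the root element. The sketch below uses the JDK API directly rather than data.xml; emit-with-doctype and the DOCTYPE string are hypothetical and are not taken from the attached web.xml.

```clojure
(import '(javax.xml.stream XMLOutputFactory XMLStreamWriter)
        '(java.io StringWriter))

(defn emit-with-doctype
  "Hypothetical helper: writes an XML declaration, the given DOCTYPE
  declaration verbatim, and an empty root element named root-tag."
  [doctype root-tag]
  (let [sw (StringWriter.)
        ^XMLStreamWriter w (.createXMLStreamWriter (XMLOutputFactory/newInstance) sw)]
    (.writeStartDocument w)
    ;; writeDTD emits its argument as-is, so it must be a complete declaration.
    (.writeDTD w doctype)
    (.writeStartElement w root-tag)
    (.writeEndElement w)
    (.writeEndDocument w)
    (.flush w)
    (.close w)
    (str sw)))

;; Illustrative DOCTYPE only.
(emit-with-doctype "<!DOCTYPE web-app SYSTEM \"web-app.dtd\">" "web-app")
```

Because the string is written unchecked, a real :doctype option would probably want to validate or construct the declaration rather than accept arbitrary text.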






[DXML-4] Namespaces support Created: 27/Mar/12  Updated: 21/May/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Carlo Sciolla Assignee: Ryan Senior
Resolution: Unresolved Votes: 4
Labels: None

Attachments: Text File add_namespaces.patch     Text File add-namespace-support.patch     Text File roundtrip-documents.patch    
Patch: Code and Test

 Description   

Add support for both parsing and emitting namespace-qualified tags and namespace URI declarations.
It basically follows the underlying Java XML API in giving xmlns:foo attributes "special" treatment.



 Comments   
Comment by Ryan Senior [ 22/May/12 10:26 AM ]

I don't see a contributor agreement for you Carlo. Have you signed one? http://clojure.org/contributing

Comment by Gary Trakhman [ 19/Jun/12 6:09 PM ]

ping, is the patch still waiting for a signed CA?

Comment by Ryan Senior [ 26/Jun/12 12:14 PM ]

Yes

Comment by Robert Onslow [ 01/Dec/12 5:07 AM ]

Is this patch due reasonably soon?

Comment by Andy Fingerhut [ 21/Apr/13 7:04 PM ]

Link to a design page with some ideas for XML namespace support in Clojure: http://dev.clojure.org/display/DXML/Fuller+XML+support

Comment by Herwig Hochleitner [ 26/Mar/14 9:20 AM ]

I've taken another stab at this. Attached roundtrip-documents.patch implements roundtripping, which means reading and writing xmlns attributes and namespaces as is.

Further improvements fall within the scope of this ticket but should be built on top of correct roundtripping, so a separate ticket might be in order:

  • functionality for normalizing prefixes
  • rewriting prefixes
  • finding a minimal set of prefix names and/or default namespace, for given fragment

Comment by Steve Suehs [ 26/Mar/14 4:04 PM ]

I could really use this. I'm tweaking poms and the xml headers with schema locations cause grief. If you are in Austin I'll buy you a beer.

Comment by Herwig Hochleitner [ 01/Apr/14 4:41 AM ]

Good to hear that. I've implemented a walker to resolve names in namespaced xml and have the emitter assign the prefix of a resolved name. You can review / use at your own peril from here: https://github.com/bendlas/data.xml

Right now, I'm doing cleanup passes and trying to get feedback from the before pushing for change.

Comment by Paul Gearon [ 21/May/14 12:53 AM ]

I stupidly did this myself before realizing it was already done.
What is the current status? Still waiting on Carlo (since he's submitted a patch), looking to use Herwig's, or something else?

I wasn't totally happy with how I did it, since I used a binding for a parallel stack containing the current prefix->URI mappings. This was because QName prefixes are kept in the namespace of an element's keyword, but .writeStartElement and .writeAttribute need the URIs the prefixes map to, which weren't being kept. It'd be nice to see if there's a better way.
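
To make the idea of scoped prefix->URI mappings concrete, here is a minimal sketch in plain Clojure. It is not the submitted patch; *prefix->uri*, xmlns-declarations, with-element-scope and uri-for are hypothetical names, and the sketch assumes declarations arrive as string attribute names of the form "xmlns:prefix".

```clojure
(require '[clojure.string :as str])

;; Prefix -> URI mappings in scope for the element currently being emitted.
(def ^:dynamic *prefix->uri* {})

(defn xmlns-declarations
  "Extracts {prefix uri} from attributes named \"xmlns:prefix\"."
  [attrs]
  (into {}
        (for [[k v] attrs
              :let [n (name k)]
              :when (str/starts-with? n "xmlns:")]
          [(subs n (count "xmlns:")) v])))

(defn with-element-scope
  "Calls f with the element's own declarations layered over the inherited
  ones. The binding is restored on exit, which gives the stack behavior."
  [attrs f]
  (binding [*prefix->uri* (merge *prefix->uri* (xmlns-declarations attrs))]
    (f)))

(defn uri-for
  "Looks up the URI for the prefix stored in the keyword's namespace part."
  [kw]
  (get *prefix->uri* (namespace kw)))

(with-element-scope {"xmlns:xsi" "http://www.w3.org/2001/XMLSchema-instance"}
  #(uri-for :xsi/schemaLocation))
```

One known cost of a dynamic binding is that it does not follow lazily realized sequences, so an emitter built this way would need to consume its events inside the binding.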

Comment by Paul Gearon [ 21/May/14 10:36 AM ]

Submitting this patch, since the process requires a patch file. Carlo has not responded about the contributor agreement for 2 years, and none of the other attempts have been submitted as patches (edit: I've now seen Herwig's emails and realize that this is active).

Other implementations may be better, but I need to get the ball rolling on this.





[DXML-24] parse can be extremely slow for certain input data Created: 07/Jun/14  Updated: 07/Jun/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Sean Corfield Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: None


 Description   

I'm still doing some experiments, but parse seems to take a very long time to deal with this URL http://www.cybletechnologies.com/?feed=rss2 and I wonder if it's due to a huge CDATA section containing JS code?

I'll do some more experimentation to narrow it down but wanted to get at least a placeholder bug in play in case this was a known issue.






[DXML-23] Prefix is null in Inkscape SVG Created: 26/May/14  Updated: 13/Jun/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Christian Weilbach Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Ubuntu 14.04 amd64, openjdk-7, clojure 1.6.0, data.xml 0.0.7.



 Description   

When loading a fairly basic inkscape XML (1) in data.xml with (emit-str (parse (io/reader ".../minimal.svg"))), I get:

XMLStreamException Prefix cannot be null
com.sun.xml.internal.stream.writers.XMLStreamWriterImpl.writeAttribute (XMLStreamWriterImpl.java:575)
clojure.data.xml/write-attributes (xml.clj:39)
clojure.data.xml/emit-start-tag (xml.clj:50)
clojure.data.xml/emit-event (xml.clj:67)
clojure.data.xml/emit (xml.clj:367)
clojure.data.xml/emit-str (xml.clj:375)
xml-test.core/eval1512 (form-init517397699703209853.clj:1)
clojure.lang.Compiler.eval (Compiler.java:6703)
clojure.lang.Compiler.eval (Compiler.java:6666)
clojure.core/eval (core.clj:2927)
clojure.main/repl/read-eval-print--6625/fn--6628 (main.clj:239)
clojure.main/repl/read-eval-print--6625 (main.clj:239)

(1) https://gist.github.com/ghubber/34dbc54a9cf30ce68b8a



 Comments   
Comment by john walker [ 13/Jun/14 9:26 PM ]

What is your system encoding?
Edit: Never mind. It's specified in emit. Looks to be related to http://dev.clojure.org/jira/browse/DXML-4





[DXML-22] Adding hiccup generation function for elements Created: 24/Feb/14  Updated: 28/Mar/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Chris Zheng Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: None
Environment:

N/a



 Description   

This is for completeness really. See pull request https://github.com/clojure/data.xml/pull/10

I would like to:

  • generate an element using hiccup (already exists)
  • generate hiccup using an element (proposed)
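
As a rough idea of the proposed direction, the sketch below converts an element-shaped map into a hiccup-style vector. It is not the code from the pull request; element->hiccup is a hypothetical name, it assumes nodes are maps (or records) with :tag, :attrs and :content keys, and it makes no attempt to handle comments or CDATA.

```clojure
(defn element->hiccup
  "Hypothetical helper: turns {:tag .. :attrs .. :content ..} into
  [tag attrs? & children]. Non-map nodes such as strings pass through
  unchanged. The attribute map is omitted when it is empty."
  [node]
  (if (map? node)
    (let [{:keys [tag attrs content]} node]
      (into (if (seq attrs) [tag attrs] [tag])
            (map element->hiccup content)))
    node))

(element->hiccup
 {:tag :x
  :attrs {}
  :content [{:tag :a :attrs {:id "1"} :content ["foo"]}
            " "
            {:tag :a :attrs {} :content ["bar"]}]})
```

Omitting empty attribute maps is a design choice made here for brevity; a round-trip test against the element-from-hiccup direction would show whether it is the right one.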


 Comments   
Comment by Chris Zheng [ 28/Mar/14 7:22 AM ]

I'm hoping someone can at least give some feedback to this ticket.

Comment by Ryan Senior [ 28/Mar/14 7:53 AM ]

Hi Chris,

Thanks for the reminder on this. I'll have more time to dig in this weekend, but off the top of my head I think more will need to be done on this, both on implementation and on testing. I think what you have now won't work with comments or cdata. One way to flesh some of that out is to create round trip types of tests in src/test/clojure/clojure/data/xml/test_sexp.clj.





[DXML-25] Emit Empty Elements using EmptyElementTag Created: 09/Jul/14  Updated: 09/Jul/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Alexander Kiel Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: enhancement, patch
Environment:

Does not apply


Patch: Code and Test

 Description   

Currently data.xml emits empty elements (elements without content) using start and end tags. The XML spec also allows special empty tags like <foo/>.

I need to serialize XML using such empty tags because a device I want to communicate with requires them. The device is simply not able to parse XML messages that use start and end tags for empty elements.

I created a branch on GitHub where I implemented empty tags in the emit function. I'm not familiar with how to create a patch, so for now here is the link to the compare view.

As I wrote in my commit message, we should discuss whether an option to the emit function would be a better solution.
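
For context, the StAX writer exposes writeEmptyElement alongside writeStartElement, so the choice can be made per element. The sketch below shows that choice with the JDK API only and is not the code from the GitHub branch; emit-simple is a hypothetical name, and the exact serialization is that of the JDK's built-in writer.

```clojure
(import '(javax.xml.stream XMLOutputFactory XMLStreamWriter)
        '(java.io StringWriter))

(defn emit-simple
  "Hypothetical helper: writes one element named tag. With nil or empty
  text it uses writeEmptyElement, otherwise a start tag, the text, and
  an end tag."
  [tag text]
  (let [sw (StringWriter.)
        ^XMLStreamWriter w (.createXMLStreamWriter (XMLOutputFactory/newInstance) sw)]
    (if (empty? text)
      (.writeEmptyElement w tag)
      (do (.writeStartElement w tag)
          (.writeCharacters w text)
          (.writeEndElement w)))
    ;; writeEndDocument closes any tag that is still open.
    (.writeEndDocument w)
    (.flush w)
    (.close w)
    (str sw)))

;; No content: empty-element tag.
(emit-simple "foo" nil)

;; With content: start and end tags.
(emit-simple "foo" "bar")
```

Whether this should be unconditional or controlled by an emit option is the open question raised in the ticket; some consumers treat the two forms differently even though XML considers them equivalent.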






Generated at Thu Jul 24 10:36:23 CDT 2014 using JIRA 4.4#649-r158309.