
[DXML-4] Namespaces support Created: 27/Mar/12  Updated: 03/Nov/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Carlo Sciolla Assignee: Ryan Senior
Resolution: Unresolved Votes: 7
Labels: None

Attachments: Text File add_namespaces.patch     Text File add-namespace-support.patch     Text File roundtrip-documents.patch    
Patch: Code and Test

 Description   

Add support for both parsing and emitting namespace-qualified tags and namespace URI declarations.
It basically follows the underlying Java XML API in giving xmlns:foo attributes "special" treatment.
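
To illustrate the "special" treatment mentioned above, here is a minimal sketch that uses the JDK's StAX reader directly through Clojure interop. It is not taken from any of the attached patches, and `root-declarations` is a hypothetical helper name. It shows that a namespace-aware reader reports xmlns:foo declarations separately from ordinary attributes.

```clojure
(import '(javax.xml.stream XMLInputFactory XMLStreamReader XMLStreamConstants)
        '(java.io StringReader))

(defn root-declarations
  "Hypothetical helper: returns the namespace declarations and the ordinary
  attributes found on the root element of the given XML string."
  [^String xml]
  (let [^XMLInputFactory factory (XMLInputFactory/newInstance)
        ^XMLStreamReader r (.createXMLStreamReader factory (StringReader. xml))]
    ;; Advance the cursor to the first START_ELEMENT event.
    (while (not= XMLStreamConstants/START_ELEMENT (.getEventType r))
      (.next r))
    {;; xmlns declarations are exposed through the getNamespace* methods.
     :namespaces (into {} (for [i (range (.getNamespaceCount r))]
                            [(.getNamespacePrefix r (int i))
                             (.getNamespaceURI r (int i))]))
     ;; Ordinary attributes are exposed through the getAttribute* methods.
     ;; QName prints as {uri}local when the attribute is namespace-qualified.
     :attributes (into {} (for [i (range (.getAttributeCount r))]
                            [(str (.getAttributeName r (int i)))
                             (.getAttributeValue r (int i))]))}))

(root-declarations "<a xmlns:foo=\"urn:x\" foo:b=\"1\" c=\"2\"/>")
```

A parser that only copies the attribute list into :attrs would therefore lose the declarations unless it also reads the namespace list.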



 Comments   
Comment by Ryan Senior [ 22/May/12 10:26 AM ]

I don't see a contributor agreement for you Carlo. Have you signed one? http://clojure.org/contributing

Comment by Gary Trakhman [ 19/Jun/12 6:09 PM ]

ping, is the patch still waiting for a signed CA?

Comment by Ryan Senior [ 26/Jun/12 12:14 PM ]

Yes

Comment by Robert Onslow [ 01/Dec/12 5:07 AM ]

Is this patch due reasonably soon?

Comment by Andy Fingerhut [ 21/Apr/13 7:04 PM ]

Link to a design page with some ideas for XML namespace support in Clojure: http://dev.clojure.org/display/DXML/Fuller+XML+support

Comment by Herwig Hochleitner [ 26/Mar/14 9:20 AM ]

I've taken another stab at this. Attached roundtrip-documents.patch implements roundtripping, which means reading and writing xmlns attributes and namespaces as is.

Further improvements would fall within the scope of this ticket but should be implemented on top of correct roundtripping, so another ticket might be in order:

  • functionality for normalizing prefixes
  • rewriting prefixes
  • finding a minimal set of prefix names and/or default namespace for a given fragment

Comment by Steve Suehs [ 26/Mar/14 4:04 PM ]

I could really use this. I'm tweaking poms and the xml headers with schema locations cause grief. If you are in Austin I'll buy you a beer.

Comment by Herwig Hochleitner [ 01/Apr/14 4:41 AM ]

Good to hear that. I've implemented a walker to resolve names in namespaced xml and have the emitter assign the prefix of a resolved name. You can review / use at your own peril from here: https://github.com/bendlas/data.xml

Right now, I'm doing cleanup passes and trying to get feedback from the before pushing for change.

Comment by Paul Gearon [ 21/May/14 12:53 AM ]

I stupidly did this myself before realizing it was already done.
What is the current status? Still waiting on Carlo (since he's submitted a patch), looking to use Herwig's, or something else?

I wasn't totally happy with how I did it, since I used a binding for a parallel stack containing the current prefix->URI mappings. This was because QName prefixes are kept in the namespace of an element's keyword, but .writeStartElement and .writeAttribute need the URIs the prefix maps to, which wasn't being kept. It'd be nice to see if there's a better way.
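
As a rough illustration of the problem described above, the sketch below passes an explicit prefix->URI map alongside the keyword, so that the writer can be given both the prefix (from the keyword's namespace) and the URI it maps to. This is an assumption-laden sketch, not the code from any attached patch; `emit-qualified` and the map argument are hypothetical.

```clojure
(import '(javax.xml.stream XMLOutputFactory XMLStreamWriter)
        '(java.io StringWriter))

(defn emit-qualified
  "Hypothetical helper: writes a single element whose tag is a namespaced
  keyword such as :svg/rect, resolving the prefix through prefix->uri."
  [prefix->uri tag]
  (let [^String prefix (namespace tag)
        ^String uri (get prefix->uri prefix)
        sw (StringWriter.)
        ^XMLOutputFactory factory (XMLOutputFactory/newInstance)
        ^XMLStreamWriter w (.createXMLStreamWriter factory sw)]
    (when (nil? uri)
      (throw (ex-info "No URI known for prefix" {:prefix prefix})))
    ;; The three-argument form takes prefix, local name and namespace URI.
    (.writeStartElement w prefix (name tag) uri)
    ;; Without namespace repairing, the declaration must be written explicitly.
    (.writeNamespace w prefix uri)
    (.writeEndElement w)
    (.writeEndDocument w)
    (.close w)
    (str sw)))

(emit-qualified {"svg" "http://www.w3.org/2000/svg"} :svg/rect)
```

A real emitter would need a stack of such maps, one per open element, which is the parallel structure the comment above describes.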

Comment by Paul Gearon [ 21/May/14 10:36 AM ]

Submitting this patch, since the process requires a patch file. Carlo has not responded about the contributor agreement for 2 years, and none of the other attempts have been submitted as patches (edit: I've now seen Herwig's emails and realize that this is active).

Other implementations may be better, but I need to get the ball rolling on this.

Comment by Martin Clausen [ 27/Aug/14 2:35 PM ]

Carlo has signed a CA and is on the contributor list. Hope this means this much-needed patch can be accepted.

Comment by Thomas Engelschmidt [ 01/Nov/14 8:48 AM ]

Are there any plans to apply the patches?

Comment by Ryan Senior [ 03/Nov/14 6:42 AM ]

We will be merging in Herwig's implementation soon. We're still waiting on getting him commit rights to the data.xml repo, so for now you can find his implementation here: https://github.com/bendlas/data.xml. We're planning on merging that into a namespaces branch where we'll hopefully have a beta release soon.





[DXML-26] Disable external entities resolution in the default XML parser to prevent XXE attacks Created: 27/Aug/14  Updated: 28/Sep/14  Resolved: 28/Sep/14

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Critical
Reporter: Carlo Sciolla Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: security

Attachments: Text File 0001-Prevent-XXE-attacks-by-disabling-external-entities-r.patch    
Patch: Code and Test

 Description   

The default behavior of Java XML parsers is to happily resolve external XML entities, which exposes any application that processes untrusted XML to XXE vulnerabilities.

By default, data.xml should initialize its XML parsers with external entity processing disabled.
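
For readers unfamiliar with the relevant StAX switches, the following is a minimal sketch of how such a parser factory could be configured through Clojure interop. It is not the attached patch, and `hardened-input-factory` is a hypothetical name; only the two standard XMLInputFactory property constants are used.

```clojure
(import '(javax.xml.stream XMLInputFactory))

(defn hardened-input-factory
  "Hypothetical helper: returns an XMLInputFactory that neither resolves
  external entities nor processes DTDs."
  ^XMLInputFactory []
  (doto (XMLInputFactory/newInstance)
    ;; Do not fetch or expand entities that point outside the document.
    (.setProperty XMLInputFactory/IS_SUPPORTING_EXTERNAL_ENTITIES false)
    ;; Stricter still: ignore DTDs altogether.
    (.setProperty XMLInputFactory/SUPPORT_DTD false)))
```

Disabling DTD support as well is a stricter choice than the ticket asks for; whether it is appropriate depends on whether callers rely on entities declared in an internal DTD.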



 Comments   
Comment by Ryan Senior [ 28/Sep/14 7:23 AM ]

Patch looks good, I've applied it. Thanks Carlo

Comment by Carlo Sciolla [ 28/Sep/14 11:51 AM ]

Great, thanks!





[DXML-24] parse can be extremely slow for certain input data Created: 07/Jun/14  Updated: 28/Sep/14  Resolved: 28/Sep/14

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Sean Corfield Assignee: Ryan Senior
Resolution: Declined Votes: 0
Labels: None


 Description   

I'm still doing some experiments, but parse seems to take a very long time to deal with this URL http://www.cybletechnologies.com/?feed=rss2 and I wonder if it's due to a huge CDATA section containing JS code?

I'll do some more experimentation to narrow it down but wanted to get at least a placeholder bug in play in case this was a known issue.



 Comments   
Comment by Ryan Senior [ 28/Sep/14 10:09 AM ]

I profiled this. The problem looks to be with the DTD calls. I've not done a lot of stuff with DTD, but it looks like the StAX parser is making a bunch of HTTP calls for things referenced by the DTD. First it pulls in http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd, my guess is it's resolving a bunch of stuff referenced from that DTD. The parse did eventually finish but took around 10 minutes on my laptop.

If I pass in :support-dtd false to the parse call, it returns very quickly for me, around 4 milliseconds.

(parse (java.io.FileInputStream. "path/to/file.html") :support-dtd false)

I'm going to close this as the behavior seems to be correct from a StAX perspective, and :support-dtd false seems to be a pretty reasonable workaround.





[DXML-25] Emit Empty Elements using EmptyElementTag Created: 09/Jul/14  Updated: 09/Jul/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Alexander Kiel Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: enhancement, patch
Environment:

Does not apply


Patch: Code and Test

 Description   

Currently data.xml emits empty elements (elements without content) using start and end tags. The XML spec also allows special empty tags like <foo/>.

I need to serialize XML using such empty tags because a device I want to communicate with requires them. The device is simply unable to parse XML messages in which empty elements are written with start and end tags.

I created a branch on GitHub where I implemented empty tags in the emit function. I'm not familiar with how to create a patch, so for now here is the link to the compare view.

As I wrote in my commit message, we should discuss whether an option to the emit function would be a better solution.
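
The difference between the two forms can be seen at the StAX level. The sketch below is not the code from the GitHub branch; `emit-element` and its `empty-tag?` flag are hypothetical and only illustrate what an emit option might switch between.

```clojure
(import '(javax.xml.stream XMLOutputFactory XMLStreamWriter)
        '(java.io StringWriter))

(defn emit-element
  "Hypothetical helper: writes one content-free element, either as an
  empty-element tag or as a start tag followed by an end tag."
  [^String tag empty-tag?]
  (let [sw (StringWriter.)
        ^XMLOutputFactory factory (XMLOutputFactory/newInstance)
        ^XMLStreamWriter w (.createXMLStreamWriter factory sw)]
    (if empty-tag?
      ;; writeEmptyElement produces the <foo/> form.
      (.writeEmptyElement w tag)
      ;; writeStartElement plus writeEndElement produces <foo></foo>.
      (do (.writeStartElement w tag)
          (.writeEndElement w)))
    (.writeEndDocument w)
    (.close w)
    (str sw)))

(emit-element "foo" false) ; start and end tags
(emit-element "foo" true)  ; empty-element tag
```

Both forms are equivalent to a conforming XML parser, so an option would only matter for consumers like the device described above.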






[DXML-23] Prefix is null in Inkscape SVG Created: 26/May/14  Updated: 13/Jun/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Christian Weilbach Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Ubuntu 14.04 amd64, openjdk-7, clojure 1.6.0, data.xml 0.0.7.



 Description   

When loading a fairly basic inkscape XML (1) in data.xml with (emit-str (parse (io/reader ".../minimal.svg"))), I get:

XMLStreamException Prefix cannot be null
com.sun.xml.internal.stream.writers.XMLStreamWriterImpl.writeAttribute (XMLStreamWriterImpl.java:575)
clojure.data.xml/write-attributes (xml.clj:39)
clojure.data.xml/emit-start-tag (xml.clj:50)
clojure.data.xml/emit-event (xml.clj:67)
clojure.data.xml/emit (xml.clj:367)
clojure.data.xml/emit-str (xml.clj:375)
xml-test.core/eval1512 (form-init517397699703209853.clj:1)
clojure.lang.Compiler.eval (Compiler.java:6703)
clojure.lang.Compiler.eval (Compiler.java:6666)
clojure.core/eval (core.clj:2927)
clojure.main/repl/read-eval-print--6625/fn--6628 (main.clj:239)
clojure.main/repl/read-eval-print--6625 (main.clj:239)

(1) https://gist.github.com/ghubber/34dbc54a9cf30ce68b8a



 Comments   
Comment by John Walker [ 13/Jun/14 9:26 PM ]

What is your system encoding?
Edit: Never mind. It's specified in emit. Looks to be related to http://dev.clojure.org/jira/browse/DXML-4





[DXML-21] Some unit tests are never run because their names are the same Created: 23/Dec/13  Updated: 16/Apr/14  Resolved: 16/Apr/14

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Andy Fingerhut Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None

Attachments: File dxml-21-v1.diff    

 Description   

Whenever two deftest statements have the same name, the first is ignored and its tests are never run.

In namespace clojure.data.xml.test-emit, there are two deftests with name defaults, and two with the name test-indent.

Found using pre-release version of Eastwood Clojure lint tool.



 Comments   
Comment by Andy Fingerhut [ 23/Dec/13 5:50 PM ]

Patch dxml-21-v1.diff makes all deftest names unique. I have not attempted to correct any new failing tests, if any.

Comment by Ryan Senior [ 16/Apr/14 4:18 PM ]

Patch applied





[DXML-13] Support for preserving whitespace between tags Created: 10/Feb/13  Updated: 08/Apr/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Kevin Albrecht Assignee: Ryan Senior
Resolution: Unresolved Votes: 3
Labels: None

Attachments: Text File DXML-13.patch    

 Description   

XML parsers can support preserving white space nodes, but clojure.data.xml does not seem to support this functionality.

For example, the following should be able to return true (perhaps with an option to parse-str):

Desired Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             " "
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true

This is the current behavior:

Current Behavior
(= (clojure.data.xml/element :x {}
                             (clojure.data.xml/element :a {} "foo")
                             (clojure.data.xml/element :a {} "bar"))
   (clojure.data.xml/parse-str
     (str "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          "<x>"
          "<a>foo</a>"
          " "
          "<a>bar</a>"
          "</x>")))
;=> true


 Comments   
Comment by Aron Nopanen [ 18/Aug/13 3:49 PM ]

Seconded.

The issue lies with the '.isWhiteSpace' check in this section of function pull-seq:

XMLStreamConstants/CHARACTERS
(if-let [text (and (not (.isWhiteSpace sreader))
                   (.getText sreader))]
  (cons (event :characters nil nil text)
        (pull-seq sreader))
  (recur))

While the 'props' argument to parse/parse-str currently only holds XMLInputFactory options, perhaps a ':maintain-whitespace' option could be added that affects this behavior? It would be straightforward to pass the props into pull-seq to conditionally perform the .isWhiteSpace check.
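
To make the effect of the .isWhiteSpace check concrete, here is a small sketch that talks to the StAX reader directly rather than going through data.xml. `text-events` is a hypothetical helper; it lists every character event together with the flag that pull-seq consults.

```clojure
(import '(javax.xml.stream XMLInputFactory XMLStreamReader XMLStreamConstants)
        '(java.io StringReader))

(defn text-events
  "Hypothetical helper: returns [text whitespace-only?] pairs for every
  CHARACTERS event in the given XML string."
  [^String xml]
  (let [^XMLInputFactory factory
        (doto (XMLInputFactory/newInstance)
          ;; Coalescing keeps adjacent character data in a single event.
          (.setProperty XMLInputFactory/IS_COALESCING true))
        ^XMLStreamReader r (.createXMLStreamReader factory (StringReader. xml))]
    (loop [acc []]
      (if (.hasNext r)
        (if (= XMLStreamConstants/CHARACTERS (.next r))
          (recur (conj acc [(.getText r) (.isWhiteSpace r)]))
          (recur acc))
        acc))))

(text-events "<x><a>foo</a> <a>bar</a></x>")
```

The single space between the two a elements arrives as its own event with the whitespace flag set, which is exactly the event the current check discards.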

Comment by Aron Nopanen [ 20/Aug/13 12:47 AM ]

I have attached a patch to support a :maintain-whitespace property to parse and parse-str. If set to 'true', whitespace-only nodes will not be stripped during the parsing process.

Comment by Ryan Senior [ 10/Nov/13 10:38 PM ]

Hi Aron,

Thanks for the patch. Have you sent in a contributor agreement? I didn't see your name here: http://clojure.org/contributing. Submitting patches to Clojure contrib libraries requires this.

Comment by Jason Gilman [ 08/Apr/14 6:51 AM ]

I'm running into this problem as well. Can this be fixed without using the contributed patch?





[DXML-22] Adding hiccup generation function for elements Created: 24/Feb/14  Updated: 28/Mar/14

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Chris Zheng Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: None
Environment:

N/a



 Description   

This is for completeness really. See pull request https://github.com/clojure/data.xml/pull/10

I would like to:

  • generate an element using hiccup (already exists)
  • generate hiccup using an element (proposed)


 Comments   
Comment by Chris Zheng [ 28/Mar/14 7:22 AM ]

I'm hoping someone can at least give some feedback to this ticket.

Comment by Ryan Senior [ 28/Mar/14 7:53 AM ]

Hi Chris,

Thanks for the reminder on this. I'll have more time to dig in this weekend, but off the top of my head I think more will need to be done on this, both on implementation and on testing. I think what you have now won't work with comments or cdata. One way to flesh some of that out is to create round trip types of tests in src/test/clojure/clojure/data/xml/test_sexp.clj.





[DXML-14] IllegalArgumentException when trying to emit a boolean value Created: 07/Mar/13  Updated: 10/Nov/13  Resolved: 10/Nov/13

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Ed O'Loughlin Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None
Environment:

JRE 1.7, OS X 10.7.5, Clojure 1.4 & 1.5, data.xml 0.0.7



 Description   

I can create an element with a boolean value but I can't emit it...

user=> (emit-str (element :something {} false))
IllegalArgumentException No implementation of method: :gen-event of protocol: #'clojure.data.xml/EventGeneration found for class: java.lang.Boolean clojure.core/-cache-protocol-fn (core_deftype.clj:541)



 Comments   
Comment by Ryan Senior [ 10/Nov/13 10:41 PM ]

Thanks for the bug report. The fix is in ffd6957baa0cf752fa0678be7f2a3393eab16739 and should be released with 0.0.8.





[DXML-18] Not parsing multiple top-level elements Created: 24/Jun/13  Updated: 10/Nov/13  Resolved: 10/Nov/13

Status: Closed
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Alan Busby Assignee: Ryan Senior
Resolution: Declined Votes: 0
Labels: None
Environment:

[org.clojure/clojure "1.5.1"] and [org.clojure/data.xml "0.0.7"]



 Description   

(xml/parse-str "<a>1</a><b>2</b>")
Emits
{:tag :a, :attrs {}, :content ("1")}

Where did "b" go?



 Comments   
Comment by Alan Busby [ 24/Jun/13 7:43 AM ]

Sorry, feel free to close this.
Reviewing the code it appears that parse-str only accepts full XML documents and can't handle fragments, or is that incorrect?

Comment by Ryan Senior [ 10/Nov/13 10:39 PM ]

That's correct. Closing this as it's by design.





[DXML-20] Odd behaviour when using lein uberjar Created: 25/Sep/13  Updated: 10/Nov/13  Resolved: 10/Nov/13

Status: Closed
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Neil Laurance Assignee: Ryan Senior
Resolution: Declined Votes: 0
Labels: None
Environment:

Leiningen 2.3.2 on Java 1.7.0_11 Java HotSpot(TM) 64-Bit Server VM



 Description   

(For original query posted to leiningen project, see: https://github.com/technomancy/leiningen/issues/1334)

I have a trivial app:

(ns test-app.core
  (:require [clojure.data.xml :as dx])
  (:gen-class))

(def xml
  (dx/emit-str
   (dx/sexp-as-element
    [:hello
     [:world]])))

(defn -main
  [& args]
  (println xml))

And a lein project definition of:

(defproject test-app "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [
    [org.clojure/clojure "1.5.1"]
    [org.clojure/data.xml "0.0.7"]]
  :main test-app.core
  :profiles {:uberjar {:aot :all}})

Attempting to run lein uberjar throws a stacktrace.

Attempting to run a second time without cleaning first succeeds.

Workaround is to change (def xml) to (defn xml []) and then invoke it.

Stacktrace is:

Warning: specified :main without including it in :aot. 
Implicit AOT of :main will be removed in Leiningen 3.0.0. 
If you only need AOT for your uberjar, consider adding :aot :all into your
:uberjar profile instead.
Compiling test-app.core
Exception in thread "main" java.lang.IllegalArgumentException: No implementation of method: :gen-event of protocol: #'clojure.data.xml/EventGeneration found for class: clojure.data.xml.Element, compiling:(core.clj:6:3)
    at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3463)
    at clojure.lang.Compiler$DefExpr.eval(Compiler.java:408)
    at clojure.lang.Compiler.compile1(Compiler.java:7153)
    at clojure.lang.Compiler.compile(Compiler.java:7219)
    at clojure.lang.RT.compile(RT.java:398)
    at clojure.lang.RT.load(RT.java:438)
    at clojure.lang.RT.load(RT.java:411)
    at clojure.core$load$fn__5018.invoke(core.clj:5530)
    at clojure.core$load.doInvoke(core.clj:5529)
    at clojure.lang.RestFn.invoke(RestFn.java:408)
    at clojure.core$load_one.invoke(core.clj:5336)
    at clojure.core$compile$fn__5023.invoke(core.clj:5541)
    at clojure.core$compile.invoke(core.clj:5540)
    at user$eval9.invoke(form-init6653146504522592512.clj:1)
    at clojure.lang.Compiler.eval(Compiler.java:6619)
    at clojure.lang.Compiler.eval(Compiler.java:6609)
    at clojure.lang.Compiler.load(Compiler.java:7064)
    at clojure.lang.Compiler.loadFile(Compiler.java:7020)
    at clojure.main$load_script.invoke(main.clj:294)
    at clojure.main$init_opt.invoke(main.clj:299)
    at clojure.main$initialize.invoke(main.clj:327)
    at clojure.main$null_opt.invoke(main.clj:362)
    at clojure.main$main.doInvoke(main.clj:440)
    at clojure.lang.RestFn.invoke(RestFn.java:421)
    at clojure.lang.Var.invoke(Var.java:419)
    at clojure.lang.AFn.applyToHelper(AFn.java:163)
    at clojure.lang.Var.applyTo(Var.java:532)
    at clojure.main.main(main.java:37)
Caused by: java.lang.IllegalArgumentException: No implementation of method: :gen-event of protocol: #'clojure.data.xml/EventGeneration found for class: clojure.data.xml.Element
    at clojure.core$_cache_protocol_fn.invoke(core_deftype.clj:541)
    at clojure.data.xml$fn__136$G__131__141.invoke(xml.clj:73)
    at clojure.data.xml$flatten_elements$fn__189.invoke(xml.clj:129)
    at clojure.lang.LazySeq.sval(LazySeq.java:42)
    at clojure.lang.LazySeq.seq(LazySeq.java:60)
    at clojure.lang.RT.seq(RT.java:484)
    at clojure.core$seq.invoke(core.clj:133)
    at clojure.data.xml$emit.doInvoke(xml.clj:366)
    at clojure.lang.RestFn.invoke(RestFn.java:425)
    at clojure.data.xml$emit_str.invoke(xml.clj:375)
    at clojure.lang.AFn.applyToHelper(AFn.java:161)
    at clojure.lang.AFn.applyTo(AFn.java:151)
    at clojure.lang.Compiler$InvokeExpr.eval(Compiler.java:3458)
    ... 27 more
Compilation failed: Subprocess failed


 Comments   
Comment by Ryan Senior [ 10/Nov/13 10:35 PM ]

I think this is a broader issue with AOT. I'm thinking it's probably related to this: http://dev.clojure.org/jira/browse/CLJ-979. I've seen this in other code, but it's the first time I've seen it reported for data.xml. Wrapping it in a function is a good idea. I think avoiding AOT would also fix your issue.





[DXML-16] Eliminate reflection in emit-cdata Created: 25/Apr/13  Updated: 14/Aug/13  Resolved: 14/Aug/13

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Andy Fingerhut Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File dxml-16-eliminate-relfection-in-emit-cdata-patch-v1.txt    

 Description   

Solvable with a type hint on emit-cdata arg 'writer'



 Comments   
Comment by Andy Fingerhut [ 25/Apr/13 1:30 PM ]

Patch dxml-16-eliminate-relfection-in-emit-cdata-patch-v1.txt dated Apr 25 2013 eliminates a couple of uses of reflection in function emit-cdata.

Comment by Ryan Senior [ 14/Aug/13 11:50 PM ]

Thanks Andy, just pushed up your patch.

Comment by Ryan Senior [ 14/Aug/13 11:53 PM ]

Accidentally marked as closed





[DXML-17] Embedded CDATA end tags are not properly handled Created: 19/Jun/13  Updated: 14/Aug/13  Resolved: 14/Aug/13

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Jeff Weiss Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File dxml17.patch    

 Description   

user> (xml/indent-str (xml/sexp-as-element [:pre [:-cdata "foo]]>bar"]]))
"<?xml version=\"1.0\" encoding=\"UTF-8\"?><pre><![CDATA[foo]]><![CDATA[bar]]></pre>\n"

What's being emitted here is cdata for foobar not foo]]>bar.

What needs to be done is break up the embedded ]]> so that the first two characters are in one cdata block, and the last character is in the next block.
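
The splitting rule just described can be written as a plain string transformation. This is a minimal sketch, not the attached patch; `cdata-str` is a hypothetical name. Every embedded ]]> is replaced so that ]] closes one CDATA block and > opens the next.

```clojure
(require '[clojure.string :as str])

(defn cdata-str
  "Hypothetical helper: serializes s as one or more CDATA sections such that
  no section contains the end marker ]]>."
  [s]
  (str "<![CDATA["
       ;; End the current section after ]] and start a new one before >.
       (str/replace s "]]>" "]]]]><![CDATA[>")
       "]]>"))

(cdata-str "foo]]>bar")
;; => "<![CDATA[foo]]]]><![CDATA[>bar]]>"
```

A parser concatenates the two sections back into foo]]>bar, so no characters are lost.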

The tests are wrong, as far as I can tell. I think I have fixed the code and the tests, I just need to figure out how to run the tests and submit a patch.



 Comments   
Comment by Jeff Weiss [ 20/Jun/13 8:16 AM ]

Patch that fixes issue and tests

Comment by Jeff Weiss [ 20/Jun/13 8:18 AM ]

And just to clear up what the issue is, currently if the cdata contains the cdata end tag "]]>" it is just dropped and when the xml is read in those characters are gone.

That is not correct behavior, the cdata should be able to contain any arbitrary characters without any loss of data, and the attached patch will allow this.

Comment by Ryan Senior [ 14/Aug/13 11:52 PM ]

Thanks for the patch! It's been pushed up and will be in the next release.





[DXML-19] data.xml should ship a copy of the EPL license in LICENSE Created: 29/Jul/13  Updated: 14/Aug/13  Resolved: 14/Aug/13

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Wolodja Wentland Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None


 Description   

One requirement for licensing code under the EPL is that "a copy of this Agreement [the EPL]
must be included with each copy of the Program." [0]. Unfortunately data.xml does not comply
with this requirement even though its README.md claims that it is licensed under the EPL.

Please fix this issue and release a new version of data.xml as it is not legally distributable
in its current form.

[0] http://www.eclipse.org/legal/epl-v10.html → 3. REQUIREMENTS



 Comments   
Comment by Ryan Senior [ 14/Aug/13 11:51 PM ]

Good catch. I've added the EPL file, it will be in the next release.





[DXML-8] Cannot pass strings when keywords are expected Created: 27/Sep/12  Updated: 26/Jul/13  Resolved: 14/Nov/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Brian Siebert Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None
Environment:

Windows 7


Patch: Code

 Description   

This error does not appear until you attempt to emit XML that has a string where the element function is expecting a keyword. This is doubly hard to figure out at first because the error message is vague. I am requesting that either the element function be allowed to accept strings instead of keywords, or the error message be cleaned up so that the "user" error is clear.



 Comments   
Comment by Ryan Senior [ 14/Nov/12 7:28 AM ]

I have added this. Supporting keywords and strings seems to be common in some of the other contrib libraries. Now you can use the keyword :foo or the string "foo" for tags and attributes.





[DXML-9] Remove some use of reflection in data.xml Created: 28/Oct/12  Updated: 26/Jul/13  Resolved: 14/Nov/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Minor
Reporter: Andy Fingerhut Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File dxml-9-remove-reflection-v1.txt    
Patch: Code and Test

 Description   

There are a couple of occurrences of reflection in the data.xml library



 Comments   
Comment by Andy Fingerhut [ 28/Oct/12 6:10 PM ]

dxml-9-remove-reflection-v1.txt dated Oct 28 2012 removes one use of reflection in data.xml. There is still one remaining, to which I have added a comment explaining why it cannot be removed with a single type hint.

Comment by Ryan Senior [ 14/Nov/12 7:26 AM ]

Thanks Andy. Will be in the next release.





[DXML-15] data.xml can't parse own output if there's a colon in an attribute name Created: 03/Apr/13  Updated: 03/Apr/13

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: ben wolfson Assignee: Ryan Senior
Resolution: Unresolved Votes: 0
Labels: None
Environment:

data.xml 0.0.7



 Description   

Observe:

> (x/emit-str (x/element :NC {"xmlns" "http://example.com" "xmlns:xsi" "http://www.w3.org/2001/XMLSchema-instance" "xsi:schemaLocation" "http://www.example.com/schema.xsd"} (x/element :Foo {} "bar")))
"<?xml version=\"1.0\" encoding=\"UTF-8\"?><NC xsi:schemaLocation=\"http://www.example.com/schema.xsd\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns=\"http://example.com\"><Foo>bar</Foo></NC>"
> (x/parse-str *1)
#clojure.data.xml.Element{:tag :NC, :attrs {:xsi/schemaLocation "http://www.example.com/schema.xsd"}, :content (#clojure.data.xml.Element{:tag :Foo, :attrs {}, :content ("bar")})}
> (x/emit-str *1)
XMLStreamException Prefix cannot be null com.sun.xml.internal.stream.writers.XMLStreamWriterImpl.writeAttribute (XMLStreamWriterImpl.java:574)

(a) the xmlns and xmlns:xsi attributes have disappeared. Not the point of this issue but worth pointing out.
(b) "xsi:schemaLocation" has become :xsi/schemaLocation
(c) emitting a string blows up.






[DXML-12] Do the right thing if cdata content contains the cdata end-tag "]]>" Created: 21/Nov/12  Updated: 08/Jan/13  Resolved: 08/Jan/13

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Jeff Weiss Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None


 Description   

(xml/emit-str (xml/cdata "fooo]]>bar"))

"<?xml version=\"1.0\" encoding=\"UTF-8\"?><![CDATA[fooo]]>bar]]>"

This is invalid xml. The contract for cdata states that it cannot contain the end tag "]]>", so if the cdata function gets passed content that contains it, it should do the right thing, which is probably this:

http://stackoverflow.com/questions/223652/is-there-a-way-to-escape-a-cdata-end-token-in-xml

(split the content so it is emitted as multiple cdata blocks, none of which contain the entire end-tag "]]>").

This is not a purely academic bug report - I actually hit this problem in prxml and fixed it on my fork.



 Comments   
Comment by Ryan Senior [ 08/Jan/13 10:07 PM ]

Fixed, released in 0.0.7





[DXML-11] Support cdata with sexp-as-element Created: 21/Nov/12  Updated: 08/Jan/13  Resolved: 08/Jan/13

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Jeff Weiss Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None


 Description   

prxml allowed something like this:

(prxml [:foo [:cdata! "all my cdata"]])

It doesn't look like that is currently allowed in data.xml. It looked like maybe I could extend the AsElements protocol to get this behavior, but I couldn't quite figure it out. It seems I'd need access to the XMLStreamWriter to get the string representation of the cdata.



 Comments   
Comment by Ryan Senior [ 08/Jan/13 10:07 PM ]

Added, released in 0.0.7





[DXML-10] Support for DOCTYPE when emitting XML Created: 14/Nov/12  Updated: 14/Nov/12

Status: Open
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Thomas Greve Kristensen Assignee: Ryan Senior
Resolution: Unresolved Votes: 1
Labels: None

Attachments: XML File web.xml    

 Description   

Some consumers of XML files require an explicit DOCTYPE to accept an XML file. data.xml does not currently support specifying doctypes when emitting XML. When XML is parsed, I believe DOCTYPEs are silently ignored, so there is no representation in the data model for them. Perhaps the best design is a :doctype option in clojure.data.xml/emit?

I've attached a web.xml as example.






[DXML-7] cannot change encoding when using the indent function Created: 27/Sep/12  Updated: 09/Oct/12  Resolved: 09/Oct/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Brian Siebert Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None
Environment:

Windows 7



 Description   

When using the indent function and trying to change the encoding, an exception is thrown.

java.lang.IllegalArgumentException: No value supplied for key: [:encoding "UTF-8"]

This seems to be because the options are not being passed from indent to emit correctly.



 Comments   
Comment by Ryan Senior [ 09/Oct/12 10:46 PM ]

Thanks for finding the bug. It's fixed in the repo and will be included in the next release.





[DXML-5] OutOfMemory errors when emitting large XML documents Created: 27/Apr/12  Updated: 26/Jun/12  Resolved: 26/Jun/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Ryan Senior Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None


 Description   

Emitting large XML documents fed from lazy-seqs does not work in data.xml. Currently, the lazy-seq is held in a defrecord, which holds onto the head of the lazy-seq and forces all of it into memory (eventually consuming all available memory). Example code to reproduce the issue below:

(with-open [fw (java.io.FileWriter. "/tmp/lots-of-foo.xml")]
    (xml/emit
       (Element. :some-tags
           {}
           (map #(Element. :foo {} [(str "foo" %)])
                (range 0 10000000)))
       fw))


 Comments   
Comment by Ryan Senior [ 22/May/12 10:57 AM ]

Fixed

Comment by Ryan Senior [ 26/Jun/12 12:18 PM ]

Found this to be fixed only in the simplest case. If you have a large lazy-seq nested below 2+ tags it will hold onto the head of the lazy-seq and consume memory.

Comment by Ryan Senior [ 26/Jun/12 1:37 PM ]

Added an intermediate step to emitting elements to the stream writer. Now elements get flattened to a stream of events that get written to the stream writer.
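
The flattening idea can be sketched in a few lines over plain maps. This is an illustration only, not the code that went into 0.0.5; `element->events` and the event vectors are hypothetical, and the sketch makes no claim about memory behavior on very large inputs.

```clojure
(defn element->events
  "Hypothetical helper: turns a nested {:tag :attrs :content} tree into a
  lazy, flat sequence of [:start ...], [:chars ...] and [:end ...] events."
  [node]
  (if (map? node)
    (lazy-seq
      (cons [:start (:tag node) (:attrs node)]
            (concat (mapcat element->events (:content node))
                    [[:end (:tag node)]])))
    ;; Anything that is not an element is treated as character data.
    [[:chars (str node)]]))

(element->events {:tag :a :attrs {}
                  :content ["x" {:tag :b :attrs {} :content []}]})
;; => ([:start :a {}] [:chars "x"] [:start :b {}] [:end :b] [:end :a])
```

A writer can then consume the events one at a time without holding a reference to the enclosing element.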

Comment by Ryan Senior [ 26/Jun/12 1:37 PM ]

Not sure how to set a "Fix Version" in Jira, but this was fixed in 0.0.5





[DXML-6] data.xml tests fail on clojure 1.2.0 and 1.2.1 Created: 22/May/12  Updated: 22/May/12  Resolved: 22/May/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Ryan Senior Assignee: Ryan Senior
Resolution: Completed Votes: 0
Labels: None


 Description   

See the test matrix here: http://build.clojure.org/job/data.xml-test-matrix/. Looks like the mixed-quotes test is to blame, just a reordering of attributes when they are emitted to a string.



 Comments   
Comment by Ryan Senior [ 22/May/12 12:54 PM ]

Tests now run successfully on 1.2.0 and 1.2.1





[DXML-2] lein deps fails Created: 17/Feb/12  Updated: 22/May/12  Resolved: 22/May/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Ralph Möritz Assignee: Ryan Senior
Resolution: Declined Votes: 0
Labels: None
Environment:

Leiningen


Attachments: File project.clj    

 Description   

C:\Users\ralphm\workspace\dbxml-env>lein version
Leiningen 1.6.2 on Java 1.7.0_02 Java HotSpot(TM) Client VM
C:\Users\ralphm\workspace\dbxml-env>lein deps
Downloading: org/clojure/data.xml/0.0.2-SNAPSHOT/data.xml-0.0.2-SNAPSHOT.pom from repository clojars at http://clojars.org/repo/
Unable to locate resource in repository
[INFO] Unable to find resource 'org.clojure:data.xml:pom:0.0.2-SNAPSHOT' in repository clojars (http://clojars.org/repo/)
Downloading: org/clojure/data.xml/0.0.2-SNAPSHOT/data.xml-0.0.2-SNAPSHOT.jar from repository clojars at http://clojars.org/repo/
Unable to locate resource in repository
[INFO] Unable to find resource 'org.clojure:data.xml:jar:0.0.2-SNAPSHOT' in repository clojars (http://clojars.org/repo/)
An error has occurred while processing the Maven artifact tasks.
Diagnosis:

Unable to resolve artifact: Missing:
----------
1) org.clojure:data.xml:jar:0.0.2-SNAPSHOT

Try downloading the file manually from the project website.

Then, install it using the command:
mvn install:install-file -DgroupId=org.clojure -DartifactId=data.xml -Dversion=0.0.2-SNAPSHOT
-Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=org.clojure -DartifactId=data.xml -Dversion=0.0.2-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

Path to dependency:
1) org.apache.maven:super-pom:pom:2.0
2) org.clojure:data.xml:jar:0.0.2-SNAPSHOT

----------
1 required artifact is missing.

for artifact:
org.apache.maven:super-pom:pom:2.0

from the specified remote repositories:
central (http://repo1.maven.org/maven2),
clojars (http://clojars.org/repo/)



 Comments   
Comment by Ryan Senior [ 20/Feb/12 10:03 PM ]

As far as I know, there haven't been any releases of data.xml (SNAPSHOT or regular) to the maven repositories. I'm working on this and will hopefully have something out soon.

Comment by Ryan Senior [ 22/May/12 10:56 AM ]

data.xml doesn't get deployed to clojars. Look for it in maven central: http://search.maven.org/#search|ga|1|data.xml . 0.0.3 is the most recent version released, but 0.0.4 will be released soon.





[DXML-1] Stack overflow when parsing huge XML file Created: 10/Feb/12  Updated: 20/Mar/12  Resolved: 20/Mar/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Justin Kramer Assignee: Ryan Senior
Resolution: Completed Votes: 1
Labels: patch
Environment:

OS X


Attachments: Text File data-xml-kwopts.patch    
Patch: Code and Test

 Description   

This is using Ryan Senior's new 0.0.3-SNAPSHOT.

While trying to parse a huge XML file (7.5 GB compressed, a dump of Wikipedia), I got a stack overflow error. Some digging turned up this bug:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6440214

Modifying clojure.data.xml/source-seq to disable the IS_COALESCING property got rid of the error.

The old lazy-xml contrib code worked (although it used far more memory).

Attached is a patch that adds keyword options to source-seq, parse, and parse-str, allowing the consumer to disable coalescing and sidestep the upstream bug.
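
For reference, a minimal sketch of disabling coalescing directly on the underlying javax.xml.stream factory through Clojure interop. The helpers new-factory and count-text-events are hypothetical names for this example. The keyword options added by the attached patch are not shown, since their exact names are not given here.

```clojure
(import '(javax.xml.stream XMLInputFactory XMLStreamConstants XMLStreamReader)
        '(java.io StringReader))

(defn new-factory
  "Creates a StAX input factory with IS_COALESCING set as requested."
  [coalescing?]
  (doto (XMLInputFactory/newInstance)
    (.setProperty XMLInputFactory/IS_COALESCING (boolean coalescing?))))

(defn count-text-events
  "Counts the CHARACTERS events the parser reports for xml-string."
  [^XMLInputFactory factory ^String xml-string]
  (let [^XMLStreamReader reader
        (.createXMLStreamReader factory (StringReader. xml-string))]
    (loop [n 0]
      (if (.hasNext reader)
        (recur (if (= XMLStreamConstants/CHARACTERS (.next reader)) (inc n) n))
        n))))
```

With coalescing off, the parser is free to report adjacent character data as several events instead of merging it into one.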



 Comments   
Comment by Ryan Senior [ 20/Mar/12 8:05 AM ]

Thanks Justin!





[DXML-3] Build release on JDK 1.6 Created: 17/Feb/12  Updated: 24/Feb/12  Resolved: 24/Feb/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Stuart Sierra Assignee: Alan Malloy
Resolution: Completed Votes: 0
Labels: None

Attachments: Text File jdk_16_jobs.patch    
Approval: Vetted

 Description   

See https://groups.google.com/d/topic/clojure-dev/Z-wrRTcUs6U/discussion



 Comments   
Comment by Ryan Senior [ 20/Feb/12 9:58 PM ]

Patch for adding JDK version to a Hudson job config

Comment by Stuart Sierra [ 24/Feb/12 3:13 PM ]

Patch applied to build.ci. Rebuilding Hudson configs now.





Generated at Sat Nov 29 01:33:29 CST 2014 using JIRA 4.4#649-r158309.