[DXML-1] Stack overflow when parsing huge XML file Created: 10/Feb/12 Updated: 20/Mar/12 Resolved: 20/Mar/12
|Reporter:||Justin Kramer||Assignee:||Ryan Senior|
|Patch:||Code and Test|
This is using Ryan Senior's new 0.0.3-SNAPSHOT.
While trying to parse a huge XML file (7.5 GB compressed, a dump of Wikipedia), I got a stack overflow error. Some digging turned up this bug:
Modifying clojure.data.xml/source-seq to disable the IS_COALESCING property got rid of the error.
The old lazy-xml contrib code worked (although it used far more memory).
Attached is a patch that adds keyword options to source-seq, parse, and parse-str, allowing the consumer to disable coalescing and sidestep the upstream bug.
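For reference, here is a minimal sketch of what disabling coalescing looks like at the StAX level. It is in Java because IS_COALESCING is a property of javax.xml.stream.XMLInputFactory. The class and method names (CoalescingDemo, newFactory, readText) are made up for illustration and are not part of the attached patch or of clojure.data.xml; the keyword options in the patch presumably end up setting this same factory property.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CoalescingDemo {
    // Hypothetical helper: builds a StAX input factory with coalescing on or off.
    static XMLInputFactory newFactory(boolean coalescing) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // IS_COALESCING controls whether adjacent character data is merged
        // into a single event before being handed to the consumer.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
        return factory;
    }

    // Hypothetical helper: concatenates all character data found in the document.
    static String readText(String xml, boolean coalescing) throws XMLStreamException {
        XMLStreamReader reader =
            newFactory(coalescing).createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                // Without coalescing, text may arrive as several separate events,
                // so each chunk is appended as it comes.
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<page><text>Hello, <![CDATA[wiki]]> dump</text></page>";
        System.out.println(readText(xml, false)); // coalescing disabled
        System.out.println(readText(xml, true));  // coalescing enabled
    }
}
```

With coalescing disabled, the parser may report a run of text as several smaller events instead of one merged event, so the consumer has to join the pieces itself, as readText does here.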
|Comment by Ryan Senior [ 20/Mar/12 8:05 AM ]|