[DXML-1] Stack overflow when parsing huge XML file Created: 10/Feb/12 Updated: 20/Mar/12 Resolved: 20/Mar/12 |
|
| Status: | Resolved |
| Project: | data.xml |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Defect | Priority: | Major |
| Reporter: | Justin Kramer | Assignee: | Ryan Senior |
| Resolution: | Completed | Votes: | 1 |
| Labels: | patch |
| Environment: | OS X |
| Attachments: |
|
| Patch: | Code and Test |
| Description |
|
This is using Ryan Senior's new 0.0.3-SNAPSHOT. While trying to parse a huge XML file (a 7.5 GB compressed dump of Wikipedia), I got a stack overflow error. Some digging turned up this upstream bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6440214 Modifying clojure.data.xml/source-seq to disable the IS_COALESCING property got rid of the error. The old lazy-xml contrib code parsed the same file without the error, although it used far more memory. The attached patch adds keyword options to source-seq, parse, and parse-str, allowing the consumer to disable coalescing and sidestep the upstream bug. |
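For readers unfamiliar with the property, here is a minimal sketch of the underlying StAX setting in plain Java. It is not the attached patch and does not show the data.xml keyword options; the class name `CoalescingSketch` and the helper `textChunks` are hypothetical names used only for illustration. It assumes the standard `javax.xml.stream` API and simply toggles `XMLInputFactory.IS_COALESCING` before creating the reader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class, not part of data.xml or the patch.
final class CoalescingSketch {

    // Returns the text chunks the StAX reader reports, one list entry per
    // CHARACTERS or CDATA event, with coalescing switched on or off.
    static List<String> textChunks(String xml, boolean coalescing) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // The property discussed in this ticket; false disables coalescing.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);

        List<String> chunks = new ArrayList<>();
        try {
            XMLStreamReader reader =
                factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    chunks.add(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrap the checked exception so callers need no throws clause.
            throw new IllegalStateException(e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String xml = "<a>x<![CDATA[y]]>z</a>";
        System.out.println(textChunks(xml, false));
        System.out.println(textChunks(xml, true));
    }
}
```

With coalescing disabled, a consumer should expect the text of a single element to arrive in more than one chunk and join the pieces itself if it needs the full string; the concatenated content is the same either way.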
| Comments |
| Comment by Ryan Senior [ 20/Mar/12 8:05 AM ] |
|
Thanks Justin! |