<< Back to previous view

[DXML-1] Stack overflow when parsing huge XML file Created: 10/Feb/12  Updated: 20/Mar/12  Resolved: 20/Mar/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Justin Kramer Assignee: Ryan Senior
Resolution: Completed Votes: 1
Labels: patch,


Attachments: Text File data-xml-kwopts.patch    
Patch: Code and Test


This is using Ryan Senior's new 0.0.3-SNAPSHOT.

While trying to parse a huge XML file (7.5 GB compressed, a dump of Wikipedia), got a stack overflow error. Some digging turned up this bug:


Modifying clojure.data.xml/source-seq to disable the IS_COALESCING property got rid of the error.

The old lazy-xml contrib code worked (although used up tons more memory).

Attached is a patch that adds keyword options to source-seq, parse, and parse-str, allowing the consumer to disable coalescing and sidestep the upstream bug.

Comment by Ryan Senior [ 20/Mar/12 8:05 AM ]

Thanks Justin!

Generated at Sat Oct 10 13:12:21 CDT 2015 using JIRA 4.4#649-r158309.