
[DXML-1] Stack overflow when parsing huge XML file
Created: 10/Feb/12  Updated: 20/Mar/12  Resolved: 20/Mar/12

Status: Resolved
Project: data.xml
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect
Priority: Major
Reporter: Justin Kramer
Assignee: Ryan Senior
Resolution: Completed
Votes: 1
Labels: patch
Environment: OS X

Attachments: Text File data-xml-kwopts.patch    
Patch: Code and Test

 Description   

This is using Ryan Senior's new 0.0.3-SNAPSHOT.

While trying to parse a huge XML file (a dump of Wikipedia, 7.5 GB compressed), I got a stack overflow error. Some digging turned up this bug:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6440214

Modifying clojure.data.xml/source-seq to disable the IS_COALESCING property got rid of the error.

The old lazy-xml contrib code worked (although it used far more memory).

Attached is a patch that adds keyword options to source-seq, parse, and parse-str, allowing the consumer to disable coalescing and sidestep the upstream bug.
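
For reference, here is a minimal Java sketch of what disabling coalescing means at the javax.xml.stream level, where the IS_COALESCING property lives. It is not the data.xml code or the attached patch; the class name CoalescingDemo and the method textChunks are hypothetical, made up for illustration. With coalescing on, the parser must merge adjacent character data into a single event; with it off, the same text may arrive as several smaller events, which the caller has to join itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of data.xml.
class CoalescingDemo {

    // Collects the text of every character-data event in the document.
    static List<String> textChunks(String xml, boolean coalescing) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // The property named in this ticket: true merges adjacent character
        // data into one event, false lets the parser report it piecewise.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);

        List<String> chunks = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.CHARACTERS
                            || event == XMLStreamConstants.CDATA) {
                        chunks.add(reader.getText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String xml = "<a>x<![CDATA[y]]>z</a>";
        System.out.println("coalescing on:  " + textChunks(xml, true));
        System.out.println("coalescing off: " + textChunks(xml, false));
    }
}
```

How many chunks come back with coalescing off depends on the parser implementation, so consumers should concatenate adjacent text themselves rather than rely on a particular split.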



 Comments   
Comment by Ryan Senior [ 20/Mar/12 8:05 AM ]

Thanks Justin!

Generated at Fri Oct 24 23:45:26 CDT 2014 using JIRA 4.4#649-r158309.