[DFRS-2] Make writing footer checksums less expensive or optional Created: 17/Dec/13  Updated: 18/Dec/13

Status: Open
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement
Priority: Major
Reporter: Ghadi Shayban
Assignee: Stuart Halloway
Resolution: Unresolved
Votes: 0
Labels: None

Approval: Incomplete

 Description   

Problem:
A JVM profiler indicates that checksums, as currently implemented, are a significant bottleneck when writing.

Cause:
impl.RawOutput wraps the provided OutputStream in a CheckedOutputStream. Every time a rawInt is written, the CheckedOutputStream calls update on its checksum.

Adler32's update method happens to be native, which may not be germane to the problem.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/zip/Adler32.java#91
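
To illustrate the cause, here is a minimal sketch that counts checksum updates when an int is written through a CheckedOutputStream. CountingChecksum, PerWriteChecksumDemo, and writeInt are hypothetical names, not Fressian code, and the decomposition of an int into four single-byte writes is an assumption about how a raw int write behaves.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Adler32;
import java.util.zip.CheckedOutputStream;
import java.util.zip.Checksum;

// Hypothetical helper (not part of Fressian): delegates to Adler32
// and counts how many times an update method is invoked.
class CountingChecksum implements Checksum {
    private final Checksum delegate = new Adler32();
    long updateCalls = 0;

    @Override public void update(int b) { updateCalls++; delegate.update(b); }
    @Override public void update(byte[] b, int off, int len) { updateCalls++; delegate.update(b, off, len); }
    @Override public long getValue() { return delegate.getValue(); }
    @Override public void reset() { delegate.reset(); updateCalls = 0; }
}

class PerWriteChecksumDemo {
    // Assumption: a raw int is emitted as four single-byte writes.
    static void writeInt(OutputStream out, int v) throws IOException {
        out.write((v >>> 24) & 0xFF);
        out.write((v >>> 16) & 0xFF);
        out.write((v >>> 8) & 0xFF);
        out.write(v & 0xFF);
    }

    // Returns the number of checksum updates triggered by writing `count` ints.
    static long updatesForInts(int count) throws IOException {
        CountingChecksum sum = new CountingChecksum();
        OutputStream out = new CheckedOutputStream(new ByteArrayOutputStream(), sum);
        for (int i = 0; i < count; i++) {
            writeInt(out, i);
        }
        return sum.updateCalls;
    }
}
```

Under these assumptions every single-byte write reaches the checksum as its own update call, so the number of updates grows with the number of bytes written rather than with the number of flushes.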

The read side of data.fressian already exposes a knob for checksums to be ignored in RawInput. No such knob exists on the write side.

Checksums are used in the footer methods. They may be extremely useful for data at rest, but may be redundant with other out-of-band mechanisms.

Possible solutions:
Buffering writes, so that the checksum is updated in bulk rather than on every small write.
Exposing a knob to control whether write checksums are enabled. This would potentially involve changes to the footer.
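
Both options can be sketched together, as a rough illustration rather than a proposed patch. FooterChecksumSketch, wrap, and the checksumEnabled flag are hypothetical names and not part of the Fressian API. The sketch places a BufferedOutputStream in front of the CheckedOutputStream, so the checksum sees one bulk update per buffer flush, and skips the CheckedOutputStream entirely when the flag is off. What a footer should contain when checksums are disabled is left open here.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Adler32;
import java.util.zip.CheckedOutputStream;
import java.util.zip.Checksum;

class FooterChecksumSketch {
    // Hypothetical factory: buffers small writes, and only attaches the
    // checksum when the (assumed) checksumEnabled knob is set.
    static OutputStream wrap(OutputStream sink, Checksum sum, boolean checksumEnabled) {
        OutputStream target = checksumEnabled ? new CheckedOutputStream(sink, sum) : sink;
        return new BufferedOutputStream(target, 8192);
    }

    // Writes the payload one byte at a time and returns the checksum value
    // that a footer would record afterwards.
    static long checksumAfterWriting(byte[] payload, boolean checksumEnabled) throws IOException {
        Adler32 sum = new Adler32();
        OutputStream out = wrap(new ByteArrayOutputStream(), sum, checksumEnabled);
        for (byte b : payload) {
            out.write(b);
        }
        // The buffer must be flushed before the checksum is read;
        // otherwise pending bytes have not reached the checksum yet.
        out.flush();
        return sum.getValue();
    }
}
```

The trade-off in this arrangement is that any code reading the checksum, such as a footer writer, has to flush first. With the flag off, the Adler32 instance is never updated and keeps its initial value.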



 Comments   
Comment by Stuart Halloway [ 18/Dec/13 8:33 AM ]

It is definitely possible that the checksum calculation dings perf. (And if so, another possible solution is just removing checksums entirely from Fressian.)

That said, I don't want to trust a profiler. To move this forward, I would like to see a benchmark of a real-world use case without the profiler in play.





Generated at Fri Oct 31 18:00:26 CDT 2014 using JIRA 4.4#649-r158309.