[DFRS-5] Add c.l.PersistentHashSet ReadHandler Created: 12/Nov/14  Updated: 15/Nov/14

Status: Open
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Daniel Compton Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None
Environment:

data.fressian 0.2.0



 Description   

clojure.data.fressian doesn't define a ReadHandler for "set". This means that when you serialise and deserialise a Clojure PersistentHashSet, it comes back as a java.util.HashSet. Is there a reason why we wouldn't define a ReadHandler for "set" in clojure-read-handlers?
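
A minimal sketch of what such a handler could look like when supplied from user code. It assumes data.fressian 0.2.0 is on the classpath, that a Fressian "set" is encoded as a tag wrapping a single list of members, and that handlers passed via :handlers are consulted before Fressian's built-in ones. The names set-read-handler, read-handlers and roundtrip are made up for illustration and are not part of the library.

```clojure
(ns example.set-handler
  (:require [clojure.data.fressian :as fress])
  (:import [org.fressian.handlers ReadHandler]))

;; Assumption: the "set" tag carries one component, a list of the members.
(def set-read-handler
  (reify ReadHandler
    (read [_ rdr tag component-count]
      (into #{} (.readObject rdr)))))

;; The "set" entry is merged last so it wins over any existing entry.
(def read-handlers
  (-> (merge fress/clojure-read-handlers {"set" set-read-handler})
      fress/associative-lookup))

(defn roundtrip
  "Writes x to a Fressian byte buffer and reads it back with read-handlers."
  [x]
  (fress/read (fress/write x) :handlers read-handlers))
```

If those assumptions hold, (roundtrip #{1 2}) should come back as a persistent set rather than a java.util.HashSet, and nested sets should be converted as well because each one passes through the same handler.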



 Comments   
Comment by Daniel Compton [ 15/Nov/14 12:35 AM ]

@stu is it intentional that c.d.f doesn't have a ReadHandler for reading fressian sets into persistent sets, or vectors into persistent vectors?





[DFRS-7] Roundtrip encoding of values are unequal under Clojure 1.6 Created: 12/Nov/14  Updated: 13/Nov/14

Status: Open
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Daniel Compton Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Clojure 1.6



 Description   

While running the tests on https://github.com/danielcompton/data.fressian-test I found this bug:

(let [mys {[3] [] :a nil, :b nil :c nil :d nil :e nil :f nil :g nil}
      myf (fr/read (fr/write mys))]
  (println (type mys))
  (println (type (key (first mys))))
  (println (type myf))
  (println (type (key (first myf))))
  (= mys myf))
clojure.lang.PersistentArrayMap
clojure.lang.PersistentVector
clojure.lang.PersistentHashMap
java.util.Arrays$ArrayList
=> false

Changing the key [3] to :x

{:x [] :a nil, :b nil :c nil :d nil :e nil :f nil :g nil}

or removing another kv pair

{[3] [] :b nil :c nil :d nil :e nil :f nil :g nil}

will make the roundtrip encoding of Fressian values equal.

This is caused by the hashing changes introduced in Clojure 1.6 and relates to CLJ-1372. I think there may also be some interaction with Fressian's creation of ArrayMaps for maps with fewer than 8 kv pairs.

One fix would be to make sure data.fressian always returns persistent Clojure data structures, so that it no longer trips over Java collections failing to match their Clojure counterparts as map keys or set members under Clojure 1.6.
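
A rough sketch of that idea applied after the fact, using only clojure.core and java.util. The function ->persistent and the java-side and clojure-side values are hypothetical, and this is a post-read normalisation step rather than a change to data.fressian's handlers. Note that it also turns Clojure lists and seqs into vectors and records into plain maps, since those implement the same Java interfaces.

```clojure
(ns example.normalize)

(defn ->persistent
  "Recursively rebuilds java.util maps, sets and lists as Clojure persistent
   maps, sets and vectors. Anything else is returned unchanged."
  [x]
  (cond
    (instance? java.util.Map x)
    (into {} (for [e x] [(->persistent (key e)) (->persistent (val e))]))

    (instance? java.util.Set x)
    (into #{} (map ->persistent x))

    (instance? java.util.List x)
    (into [] (map ->persistent x))

    :else x))

;; A stand-in for what a reader might hand back: Java collections as key and value.
(def java-side
  (doto (java.util.HashMap.)
    (.put (java.util.ArrayList. [3]) (java.util.ArrayList.))))

(def clojure-side {[3] []})

;; Element-wise equality between a vector and an ArrayList.
(println (= [3] (java.util.ArrayList. [3])))
;; Whether the two hash alike, which is what hash-based lookup depends on.
(println (= (hash [3]) (hash (java.util.ArrayList. [3]))))
;; Comparison after normalising the Java-side value.
(println (= clojure-side (->persistent java-side)))
```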



 Comments   
Comment by Daniel Compton [ 13/Nov/14 3:06 PM ]

Clarify issue.





[DFRS-6] Use a testing framework which creates nested collections Created: 12/Nov/14  Updated: 12/Nov/14

Status: Open
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Daniel Compton Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None


 Description   

When you upgrade the Clojure version used with Fressian from 1.5.1 to 1.6, the tests will pass, but the equality behaviour in 1.6 is different. For example:

(let [val #{}]
  (= (set [val]) (set [(fressian/read (fressian/write val))])))

This is because data.fressian 0.2.0 reads a Fressian set back off the wire as a java.util.HashSet (cf. DFRS-5), and a PersistentHashSet containing an empty PersistentHashSet is no longer equal to a PersistentHashSet containing an empty java.util.HashSet.

To catch these changes, you will need a generative tester that creates nested values, including sets. test.check can generate arbitrarily deep nested values but doesn't have a set generator yet (TCHECK-51).
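
As a stopgap, the kind of check described above can be approximated without test.check. The sketch below is a hand-rolled stand-in, not a test.check generator: rand-value, samples and failing-samples are hypothetical names, and the round-trip function is passed in so the harness itself has no Fressian dependency.

```clojure
(ns example.nested-check)

(defn rand-value
  "Returns a pseudo-random value nested at most depth levels deep.
   Leaves are longs, keywords or nil; containers are vectors, sets or maps."
  [^java.util.Random rng depth]
  (let [leaf  (fn []
                (case (.nextInt rng 3)
                  0 (long (.nextInt rng 100))
                  1 (keyword (str "k" (.nextInt rng 10)))
                  2 nil))
        items (fn []
                (vec (repeatedly (.nextInt rng 4)
                                 #(rand-value rng (dec depth)))))]
    (if (zero? depth)
      (leaf)
      (case (.nextInt rng 4)
        0 (leaf)
        1 (items)
        2 (set (items))
        3 (zipmap (items) (items))))))

(defn samples
  "Returns n pseudo-random nested values from a fixed seed, for repeatable runs."
  [n depth seed]
  (let [rng (java.util.Random. (long seed))]
    (vec (repeatedly n #(rand-value rng depth)))))

(defn failing-samples
  "Returns the candidates whose round-tripped form is not interchangeable with
   the original as a set member, mirroring the (set [val]) comparison."
  [roundtrip candidates]
  (remove (fn [v] (= (hash-set v) (hash-set (roundtrip v)))) candidates))
```

A call such as (failing-samples #(fressian/read (fressian/write %)) (samples 200 3 42)) would then list the values that do not survive the round trip, assuming clojure.data.fressian is loaded under the fressian alias used in the example above.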

I have put a sample project on GitHub showing this in action: https://github.com/danielcompton/data.fressian-test






[DFRS-3] Lists do not round trip Created: 06/Apr/14  Updated: 29/Jul/14  Resolved: 29/Jul/14

Status: Closed
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Alex Miller Assignee: Stuart Halloway
Resolution: Declined Votes: 0
Labels: None


 Description   
(fress/read (fress/write '())) -> []

Note: Moved from https://github.com/clojure/data.fressian/issues/3



 Comments   
Comment by Stuart Halloway [ 29/Jul/14 9:51 AM ]

Roundtripping is not an objective. User control, however, is an objective, and the real issue here is Fressian handling of list.

See comments on http://dev.clojure.org/jira/browse/DFRS-1.





[DFRS-1] add clojure.lang.PersistentVector encoding/decoding Created: 03/Dec/13  Updated: 29/Jul/14  Resolved: 29/Jul/14

Status: Closed
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Max Countryman Assignee: Stuart Halloway
Resolution: Declined Votes: 0
Labels: None

Attachments: Text File 0001-add-clojure.lang.PersistentVector-encoding-decoding.patch    
Patch: Code and Test

 Description   

Previously, clojure.lang.PersistentVector was not special-cased on writing
and reading. As a result, Clojure vectors in fressian-encoded streams were
coerced on read into java.util.ArrayList. Coercion here is a bit surprising
and can cause unexpected errors.

Here we explicitly mark clojure.lang.PersistentVector instances with the
vec tag on writes, and on reads we decode these back to Clojure's vector
type by converting them with a call to vec.
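
For illustration, the approach described above could be approximated from user code roughly as follows. This is a sketch, not the attached patch: it assumes data.fressian 0.2.0 on the classpath and the handler-map shapes that clojure-write-handlers and clojure-read-handlers use (class to tag to WriteHandler, and tag to ReadHandler). The names vec-write-handler, vec-read-handler, write-handlers, read-handlers and roundtrip are hypothetical.

```clojure
(ns example.vec-handler
  (:require [clojure.data.fressian :as fress])
  (:import [org.fressian.handlers ReadHandler WriteHandler]))

;; Writes a vector as the tag "vec" with one component: its elements as a list.
(def vec-write-handler
  (reify WriteHandler
    (write [_ w v]
      (.writeTag w "vec" 1)
      (.writeList w v))))

;; Reads the single list component back and converts it with vec.
(def vec-read-handler
  (reify ReadHandler
    (read [_ rdr tag component-count]
      (vec (.readObject rdr)))))

(def write-handlers
  (-> (merge fress/clojure-write-handlers
             {clojure.lang.PersistentVector {"vec" vec-write-handler}})
      fress/associative-lookup
      fress/inheritance-lookup))

(def read-handlers
  (-> (merge fress/clojure-read-handlers {"vec" vec-read-handler})
      fress/associative-lookup))

(defn roundtrip
  "Writes x with write-handlers and reads it back with read-handlers."
  [x]
  (fress/read (fress/write x :handlers write-handlers)
              :handlers read-handlers))
```

Only exact clojure.lang.PersistentVector instances are tagged this way; other vector-like types such as subvectors would still fall through to the default list handling.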



 Comments   
Comment by Stuart Halloway [ 29/Jul/14 9:48 AM ]

The real problem here is that List extensibility is not exposed in Fressian itself. We would need to look into either

  1. user specified handlers where FressianReader calls getHandler
  2. documenting the ConvertList interface

or

  1. making the list extension point feel more like the other collection types for customization

Comment by Stuart Halloway [ 29/Jul/14 9:48 AM ]

Needs Fressian enhancement instead.





[DFRS-4] Better document how to extend with custom readers and writers Created: 05/May/14  Updated: 06/May/14  Resolved: 06/May/14

Status: Resolved
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Alex Miller Assignee: Alex Miller
Resolution: Completed Votes: 0
Labels: docstring, documentation

Approval: Ok

 Description   

This was requested over email as it was not particularly clear how to integrate custom handlers with the existing handler maps when creating readers and writers.

In particular, it was confusing how to properly create the handler maps using the provided utilities.



 Comments   
Comment by Alex Miller [ 05/May/14 8:38 PM ]

Created wiki page with an example and more info:

https://github.com/clojure/data.fressian/wiki/Creating-custom-handlers

Comment by Alex Miller [ 06/May/14 11:36 AM ]

docstrings updated for create-reader and create-writer.





[DFRS-2] Make writing footer checksums less expensive or optional Created: 17/Dec/13  Updated: 18/Dec/13

Status: Open
Project: data.fressian
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Enhancement Priority: Major
Reporter: Ghadi Shayban Assignee: Stuart Halloway
Resolution: Unresolved Votes: 0
Labels: None

Approval: Incomplete

 Description   

Problem:
JVM profiler indicates checksums as implemented are a significant bottleneck.

Cause:
impl.RawOutput wraps the provided OutputStream with a CheckedOutputStream. Every time a rawInt is written, CheckedOutputStream calls on its checksum to update itself.

Adler32's update method happens to be native, which may not be germane to the problem.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/zip/Adler32.java#91

The read side of data.fressian already exposes a knob for checksums to be ignored in RawInput. No such knob exists on the write side.

Checksums are used in the footer methods. They may be extremely useful for data at rest, but may be redundant with other out-of-band mechanisms.

Possible solutions:

  1. Buffering so that checksums don't recalculate frequently.
  2. Exposing a knob to control whether write checksums are enabled. This would potentially involve changes with the footer.
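
The buffering option can be sketched with plain JDK classes. The example below does not touch Fressian's RawOutput, which creates its CheckedOutputStream internally, so applying the idea there would presumably need a change in Fressian itself. The names adler-after-writes, unbuffered and buffered are hypothetical, and the 8192-byte buffer size is an arbitrary choice.

```clojure
(ns example.buffered-checksum
  (:import [java.io BufferedOutputStream ByteArrayOutputStream OutputStream]
           [java.util.zip Adler32 CheckedOutputStream]))

(defn adler-after-writes
  "Writes n single bytes through (wrap checked-stream) and returns the Adler32
   value seen by the CheckedOutputStream once everything has been flushed."
  [wrap n]
  (let [checked (CheckedOutputStream. (ByteArrayOutputStream.) (Adler32.))
        ^OutputStream out (wrap checked)]
    (dotimes [i n]
      (.write out (int (bit-and i 0xff))))
    ;; The buffer must be flushed before the checksum is read, otherwise
    ;; bytes still sitting in the buffer are not yet included.
    (.flush out)
    (.getValue (.getChecksum checked))))

;; Every single-byte write reaches the CheckedOutputStream directly,
;; so the checksum is updated once per byte.
(defn unbuffered [^OutputStream checked] checked)

;; Bytes are collected first and handed over in chunks,
;; so the checksum is updated once per flushed chunk.
(defn buffered [^OutputStream checked] (BufferedOutputStream. checked 8192))
```

Both (adler-after-writes unbuffered n) and (adler-after-writes buffered n) should yield the same checksum value; the difference lies only in how often the checksum is updated. Whether that removes the bottleneck would still need the kind of benchmark requested in the comments.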



 Comments   
Comment by Stuart Halloway [ 18/Dec/13 8:33 AM ]

It is definitely possible that the checksum calculation dings perf. (And if so, another possible solution is just removing checksums entirely from Fressian.)

That said, I don't want to trust a profiler. To move this forward, I would like to see a benchmark of a real-world use case without the profiler in play.





Generated at Sat Nov 22 10:48:21 CST 2014 using JIRA 4.4#649-r158309.