Clojure-Contrib

append-spit should only write out an encoding marker once

Details

  • Type: Defect
  • Status: Closed
  • Resolution: Declined
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels: None

Description

In clojure.contrib.duck-streams, append-spit writes an encoding
marker (a byte order mark; for UnicodeLittle, for example, this is
U+FEFF) each time it appends to a file. The marker should be written
only when the file is initially created.

Test case for reproducing this behaviour:

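(use 'clojure.contrib.duck-streams)

;; Two separate appends to the same file, each with a marker-writing encoding.
(binding [*default-encoding* "UnicodeLittle"]
  (append-spit "c:/foo.txt" "Line 1\n"))
(binding [*default-encoding* "UnicodeLittle"]
  (append-spit "c:/foo.txt" "Line 2\n"))

;; Read the file back with the same encoding.
(slurp "c:/foo.txt" "UnicodeLittle")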

The slurp returns:
"Line 1\n?Line 2\n"
(the ? is the second encoding marker, read back as a character)
The expected output is:
"Line 1\nLine 2\n"
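
The same effect can be seen at the byte level without clojure.contrib. The following is a minimal sketch, not the duck-streams implementation: append-text is a hypothetical stand-in for append-spit that, as an assumption about how the marker gets repeated, opens a fresh writer in append mode on every call. It relies only on java.io and on the JVM accepting the "UnicodeLittle" encoding name.

```clojure
(import '(java.io File FileOutputStream OutputStreamWriter))

(defn append-text
  "Hypothetical stand-in for append-spit: opens f in append mode with the
  given encoding, writes s, and closes the writer."
  [^File f ^String encoding ^String s]
  (with-open [w (OutputStreamWriter. (FileOutputStream. f true) encoding)]
    (.write w s)))

(defn unsigned-bytes
  "Returns the contents of f as a vector of integers in the range 0-255."
  [^File f]
  (vec (map #(bit-and % 0xff)
            (java.nio.file.Files/readAllBytes (.toPath f)))))

;; A temporary file is used here instead of c:/foo.txt.
(def tmp (doto (File/createTempFile "bom-demo" ".txt") (.deleteOnExit)))

(append-text tmp "UnicodeLittle" "Line 1\n")
(append-text tmp "UnicodeLittle" "Line 2\n")

(def bs (unsigned-bytes tmp))

;; Each call starts a new encoder, so each call emits its own marker
;; (bytes FF FE) before the 14 bytes of UTF-16LE text.
(println (subvec bs 0 2) (subvec bs 16 18))
```

Because every call constructs a new encoder, the second marker lands in the middle of the file, which is what the slurp above reads back as an extra character.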

Activity

Assembla Importer added a comment - Converted from http://www.assembla.com/spaces/clojure/tickets/30
Assembla Importer added a comment -

stuart.sierra said: Updating tickets (#1, #2, #3, #4, #6, #20, #23, #25, #30, #31, #33, #34, #35, #37, #38, #52, #55, #58, #59, #60, #61, #62, #63, #64)

Assembla Importer added a comment -

stu said: I am not sure there is a good answer here. The code above chooses an encoding with an explicit marker, and gets what it asks for.

One proposed solution (http://github.com/sergey-miryanov/clojure-contrib/commits/bug-30) tries to detect this scenario, and recover via a hard-coded mapping between encodings-with-markers and similar-encodings-without. But I don't think this can work in general, because the set of possible encodings is open and the Charset API doesn't provide a mapping between the with-markers and without-markers versions.

Sorry, and please feel free to reopen this if I am missing an obvious approach.
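
To make the limitation concrete, here is a rough sketch of what a hard-coded mapping could look like. It is not the code from the linked branch; marker-free-encoding, append-encoding, and append-text-once-marked are hypothetical names, and the table is deliberately incomplete. Any encoding missing from the table falls through unchanged and still writes a marker on every append, which is the open-set problem described above.

```clojure
(import '(java.io File FileOutputStream OutputStreamWriter))

;; Hypothetical, hand-maintained table: marker-writing encoding name ->
;; similar encoding that writes no marker. The Charset API offers no way
;; to derive these pairs, so the table can never be known to be complete.
(def marker-free-encoding
  {"UnicodeLittle" "UTF-16LE"
   "UTF-16"        "UTF-16BE"})

(defn append-encoding
  "Picks the encoding for the next append: the requested one for an empty
  or missing file, otherwise its marker-free counterpart when one is known."
  [^File f encoding]
  (if (pos? (.length f))
    (get marker-free-encoding encoding encoding)
    encoding))

(defn append-text-once-marked
  "Appends s to f, writing the encoding marker only on the first write."
  [^File f ^String encoding ^String s]
  (let [^String enc (append-encoding f encoding)]
    (with-open [w (OutputStreamWriter. (FileOutputStream. f true) enc)]
      (.write w s))))

(def tmp2 (doto (File/createTempFile "bom-fix" ".txt") (.deleteOnExit)))

(append-text-once-marked tmp2 "UnicodeLittle" "Line 1\n")
(append-text-once-marked tmp2 "UnicodeLittle" "Line 2\n")

;; Decode the whole file with the originally requested encoding.
(def decoded
  (String. (java.nio.file.Files/readAllBytes (.toPath tmp2)) "UnicodeLittle"))
```

For the two encodings in the table this yields the expected text, but the approach depends entirely on someone keeping the table up to date for every encoding a caller might pass.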

