Clojure

java.io/do-copy can garble multibyte characters

Details

  • Type: Defect
  • Status: Closed
  • Priority: Major
  • Resolution: Completed
  • Affects Version/s: Release 1.3
  • Fix Version/s: Release 1.4
  • Component/s: None
  • Labels:
  • Environment: all
  • Patch: Code and Test
  • Approval: Ok

Description

See comments in fix:

(defmethod do-copy [InputStream Writer] [#^InputStream input #^Writer output opts]
  ;; WRONG! If the buffer boundary falls in the middle of a multibyte character,
  ;; each chunk is decoded on its own and we get garbled results.
  ;; (The #_ reader macro below discards this old form; it is kept for reference only.)
  #_
  (let [#^"[B" buffer (make-array Byte/TYPE (buffer-size opts))]
    (loop []
      (let [size (.read input buffer)]
        (when (pos? size)
          (let [chars (.toCharArray (String. buffer 0 size (encoding opts)))]
            (do (.write output chars)
                (recur)))))))
  ;; Here we decode the characters before stuffing them into the buffer.
  (let [#^"[C" buffer (make-array Character/TYPE (buffer-size opts))
        in (InputStreamReader. input (encoding opts))]
    (loop []
      (let [size (.read in buffer 0 (alength buffer))]
        (if (pos? size)
          (do (.write output buffer 0 size)
              (recur)))))))
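
The failure mode described in the comments can be reproduced outside clojure.java.io with a small sketch. The names copy-bytewise and copy-decoded, the sample string, and the chunk size of 2 are all hypothetical and chosen only for illustration; they are not part of the patch. The sketch assumes UTF-8, where "é" occupies two bytes, so a 2-byte buffer splits it across two reads.

```clojure
(import '(java.io ByteArrayInputStream InputStreamReader StringWriter))

;; "h\u00e9llo" is "héllo"; in UTF-8 the é is the two bytes 0xC3 0xA9.
(def text "h\u00e9llo")
(def utf8-bytes (.getBytes ^String text "UTF-8"))

(defn copy-bytewise
  "Hypothetical helper: decodes every chunk of raw bytes on its own,
  mirroring the flawed approach."
  [^bytes bs chunk-size ^String encoding]
  (let [in (ByteArrayInputStream. bs)
        out (StringWriter.)
        ^bytes buffer (byte-array chunk-size)]
    (loop []
      (let [size (.read in buffer)]
        (when (pos? size)
          ;; A character split across two chunks cannot be decoded here.
          (.write out (String. buffer 0 size encoding))
          (recur))))
    (str out)))

(defn copy-decoded
  "Hypothetical helper: lets InputStreamReader decode first and buffers
  chars, mirroring the approach taken in the fix."
  [^bytes bs chunk-size ^String encoding]
  (let [in (InputStreamReader. (ByteArrayInputStream. bs) encoding)
        out (StringWriter.)
        ^chars buffer (char-array chunk-size)]
    (loop []
      (let [size (.read in buffer 0 (alength buffer))]
        (when (pos? size)
          (.write out buffer 0 size)
          (recur))))
    (str out)))

(= text (copy-bytewise utf8-bytes 2 "UTF-8")) ; => false
(= text (copy-decoded utf8-bytes 2 "UTF-8"))  ; => true
```

With a chunk size of 2 the first read returns the bytes for "h" plus the first half of "é", so the byte-wise version cannot reconstruct the character. The reader-based version keeps the partial sequence inside the decoder until the remaining byte arrives.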

  1. clj-886.diff
    05/Dec/11 11:23 AM
    2 kB
    Jeff Palmucci
  2. CLJ-886-fix2.patch
    09/Feb/12 8:13 PM
    7 kB
    Andy Fingerhut
