clojure.string/trim uses a different definition of whitespace than triml and trimr


  • Type: Defect
  • Status: Closed
  • Priority: Minor
  • Resolution: Completed
  • Affects Version/s: Release 1.6
  • Fix Version/s: Release 1.6
  • Component/s: None
  • Labels:
  • Patch:
    Code and Test
  • Approval:


clojure.string/triml and trimr use Character/isWhitespace to decide whether a character is whitespace, but trim uses a different definition. For example:

user=> (use 'clojure.string)
user=> (def s "  \u2002  foo")
user=> (trim s)
"?  foo"
user=> (triml s)
"foo"

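Cause: triml and trimr use Character/isWhitespace. trim uses String/trim, which treats any character less than or equal to '\u0020' as whitespace. The isWhitespace() definition differs: it also accepts Unicode space characters such as '\u2002' (EN SPACE), which String/trim leaves in place, and it rejects some control characters below '\u0020' (for example '\u0000') that String/trim removes.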
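The difference between the two definitions can be checked one character at a time. The sketch below is illustrative only: string-trim-ws? and compare-ws are hypothetical helpers that are not part of clojure.string, and string-trim-ws? simply restates the "less than or equal to '\u0020'" rule described above.

```clojure
;; Hypothetical helper: the rule String/trim applies to leading and
;; trailing characters (code point <= U+0020).
(defn string-trim-ws? [ch]
  (<= (int ch) 0x20))

;; Hypothetical helper: report both definitions for one character.
(defn compare-ws [ch]
  {:code         (format "U+%04X" (int ch))
   :string-trim  (string-trim-ws? ch)
   :isWhitespace (Character/isWhitespace (char ch))})

;; Space and tab agree; EN SPACE and NUL do not.
(mapv compare-ws [\space \tab \u2002 \u0000])
```

For '\u2002' the two entries disagree in one direction (only isWhitespace accepts it), and for '\u0000' they disagree in the other.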

Approach: The attached patch changes trim to use Character/isWhitespace. The isWhitespace definition is newer and more Unicode-aware, so it was chosen over changing triml and trimr to match trim.
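As a rough illustration of the approach, a trim built on Character/isWhitespace might look like the following. This is a minimal sketch under stated assumptions, not the code in the attached patch; trim-ws is a hypothetical name chosen to avoid clashing with clojure.string/trim.

```clojure
;; Hypothetical sketch: strip leading and trailing characters for which
;; Character/isWhitespace returns true.
(defn trim-ws
  ^String [^CharSequence s]
  (let [len   (.length s)
        ;; index of the first non-whitespace char, or len if there is none
        start (loop [i 0]
                (if (and (< i len)
                         (Character/isWhitespace (.charAt s i)))
                  (recur (inc i))
                  i))
        ;; one past the last non-whitespace char, never below start
        end   (loop [i len]
                (if (and (> i start)
                         (Character/isWhitespace (.charAt s (dec i))))
                  (recur (dec i))
                  i))]
    (.toString (.subSequence s start end))))

(trim-ws "  \u2002  foo")
```

With this definition the EN SPACE in the example string is stripped along with the ordinary spaces, which matches what triml already does.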

A few alternative implementations were considered with respect to longs, ints, etc. The patch opts to use the simplest possible code, eschewing any extreme performance measures. See the comments for more info if desired.

The patch also changes triml to only call .length on s once.
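The single .length call can be pictured as binding the length once before the scanning loop. The following is a hedged sketch of that shape, assuming a hypothetical function triml-ws; it is not the patched source.

```clojure
;; Hypothetical sketch: left-trim with the length read once, outside the loop.
(defn triml-ws
  ^String [^CharSequence s]
  (let [len (.length s)]
    (loop [index 0]
      (if (= len index)
        ""                                    ; every character was whitespace
        (if (Character/isWhitespace (.charAt s index))
          (recur (inc index))
          (.toString (.subSequence s index len)))))))

(triml-ws "  \u2002  foo  ")
```

Only leading whitespace is removed, so the trailing spaces in the example survive.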

Patch: clj935-3.patch

Screened by: Stuart Sierra

  1. clj935-2.patch
    30/Aug/13 3:47 PM
    3 kB
    Alex Miller
  2. clj935-3.patch
    02/Dec/13 11:07 PM
    3 kB
    Alex Miller
  3. fix-trim-fns-different-whitespace-patch.txt
    21/Feb/12 1:29 PM
    3 kB
    Andy Fingerhut





  • Created: