<< Back to previous view

[CLJ-1393] clojure.string/trim doesn't trim null character Created: 28/Mar/14  Updated: 28/Mar/14  Resolved: 28/Mar/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.6
Fix Version/s: None

Type: Defect Priority: Minor
Reporter: Ryan Fowler Assignee: Unassigned
Resolution: Declined Votes: 0
Labels: string


 Description   

CLJ-935 changed clojure.string/trim to not trim all the characters less than or equal to \u0020 as Java does.

I noticed this because base64 uses null characters to pad the end of encoding blocks.

Clojure 1.6.0's trim leaves the null character in:
user=> (.length (clojure.string/trim "\u0000"))
1

java.lang.String's trim takes it out:
user=> (.length (.trim "\u0000"))
0

Here are the first 21 unicode characters and what Character/isWhitespace says about them.

(dotimes [n 0x20] (printf "
u%04x - %b\n" n (Character/isWhitespace n)))
\u0000 - false
\u0001 - false
\u0002 - false
\u0003 - false
\u0004 - false
\u0005 - false
\u0006 - false
\u0007 - false
\u0008 - false
\u0009 - true
\u000a - true
\u000b - true
\u000c - true
\u000d - true
\u000e - false
\u000f - false
\u0010 - false
\u0011 - false
\u0012 - false
\u0013 - false
\u0014 - false
\u0015 - false
\u0016 - false
\u0017 - false
\u0018 - false
\u0019 - false
\u001a - false
\u001b - false
\u001c - true
\u001d - true
\u001e - true
\u001f - true



 Comments   
Comment by Alex Miller [ 28/Mar/14 12:27 PM ]

The choice was made in CLJ-935 to consistently define whitespace as Character.isWhitespace() across trim, triml, and trimr. There are many possible ways to define "space" (at least two as we see here). If your trimming needs differ from the standard library, then you'll probably need to define your own functions to trim your data. You can still use Java interop to call String.trim() directly if that happens to match your needs.

Comment by Ryan Fowler [ 28/Mar/14 1:03 PM ]

Indeed, it's an easy workaround to use Java interop once you figure out what your problem is.

It's just unintuitive that the character generally used for string termination isn't trimmed by clojure.string/trim.

Generated at Sat Sep 20 21:07:32 CDT 2014 using JIRA 4.4#649-r158309.