[CLJ-1360] clojure.string/split strips trailing delimiters Created: 18/Feb/14 Updated: 21/May/15
clojure.string/split and clojure.string/split-lines inherit the bizarre default behavior of java.lang.String#split(String,int) in stripping trailing consecutive delimiters:
In the case of split-lines, processing a file line by line and rejoining results in truncation of trailing newlines in the file. In both cases, the behavior is surprising and cannot be inferred from the docstrings.
This behavior should either be fixed or documented.
|Comment by Andy Fingerhut [ 18/Feb/14 10:51 AM ]|
Probably documenting would be safer than changing the behavior at this point, given that some people may actually rely on the current behavior after testing, deploying, etc.
I don't currently have a suggestion for a modified doc string, but note that there are examples of this behavior and how one can use an extra "-1" limit argument at the end to get all split strings: http://clojuredocs.org/clojure_core/clojure.string/split
|Comment by Crispin Wellington [ 21/May/15 10:46 PM ]|
This bug just bit me. +1 to be fixed. If we just document and leave the behavior as is, then we have a surprising and inconsistent behaving split (why are inner empty values kept, but outer ones stripped?) that is different to every other split you've ever used. The optional -1 limit argument looks hacky but a fix could keep this -1 argument working.
EDIT: this looks to be java's string class behavior: http://stackoverflow.com/questions/2170557/split-method-of-string-class-does-not-include-trailing-empty-strings