<< Back to previous view

[CLJ-1360] clojure.string/split strips trailing delimiters Created: 18/Feb/14  Updated: 18/Feb/14

Status: Open
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Tim McCormack Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None


clojure.string/split and clojure.string/split-lines inherit the bizarre default behavior of java.lang.String#split(String,int) in stripping trailing consecutive delimiters:

(clojure.string/split "banana" #"an")
⇒ ["b" "" "a"]
(clojure.string/split "banana" #"na")
⇒ ["ba"]
(clojure.string/split "nanabanana" #"na")
⇒ ["" "" "ba"]

In the case of split-lines, processing a file line by line and rejoining results in truncation of trailing newlines in the file. In both cases, the behavior is surprising and cannot be inferred from the docstrings.

This behavior should either be fixed or documented.

Comment by Andy Fingerhut [ 18/Feb/14 10:51 AM ]

Probably documenting would be safer than changing the behavior at this point, given that some people may actually rely on the current behavior after testing, deploying, etc.

I don't currently have a suggestion for a modified doc string, but note that there are examples of this behavior and how one can use an extra "-1" limit argument at the end to get all split strings: http://clojuredocs.org/clojure_core/clojure.string/split

Generated at Mon Apr 20 20:36:20 CDT 2015 using JIRA 4.4#649-r158309.