Clojure

clojure.string/split strips trailing delimiters

Details

  • Type: Defect Defect
  • Status: Open Open
  • Priority: Minor Minor
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
  • Patch:
    Code
  • Approval:
    Prescreened

Description

clojure.string/split and clojure.string/split-lines inherit the bizarre default behavior of java.lang.String#split(String,int) in stripping trailing consecutive delimiters:

(clojure.string/split "banana" #"an")
⇒ ["b" "" "a"]
(clojure.string/split "banana" #"na")
⇒ ["ba"]
(clojure.string/split "nanabanana" #"na")
⇒ ["" "" "ba"]

In the case of split-lines, processing a file line by line and rejoining results in truncation of trailing newlines in the file. In both cases, the behavior is surprising and cannot be inferred from the docstrings. A workaround for split is to pass a limit of -1.

Proposed: As current users may be relying on the current behavior, the attached merely updates the docstring to warn of this behavior and suggest use of -1 as a limit to workaround.

Patch: clj-1360.patch

Activity

People

Vote (2)
Watch (3)

Dates

  • Created:
    Updated: