[CLJ-870] clojure.string/replace behaves unexpectedly when \ or $ are part of the result string Created: 04/Nov/11 Updated: 18/Aug/12 Resolved: 18/Aug/12 |
|
| Status: | Closed |
| Project: | Clojure |
| Component/s: | None |
| Affects Version/s: | Release 1.2, Release 1.3 |
| Fix Version/s: | Release 1.5 |
| Type: | Defect | Priority: | Minor |
| Reporter: | Heinz N. Gies | Assignee: | Stuart Sierra |
| Resolution: | Completed | Votes: | 2 |
| Labels: | None | ||
| Environment: |
HW/OS/SW indipendant - issue is part of java interface |
||
| Attachments: |
|
| Patch: | Code and Test |
| Approval: | Ok |
| Waiting On: | Rich Hickey |
| Description |
|
clojure.string/replace uses javas replace function to do it's work, the replace function has the tricky habit of 'double evaluating' the replacement string (third argument). This means that every \ in the string (so i.e. " Since this behavior isn't documented it can lead to confusing errors, for example (made up semi real world example to explain the issue): (clojure.string/replace "c:/windows/" #"/" "\\") This should replace all unix folder separators with windows separators, this crashes with a index out of bound exemption since java tries to evaluate " or (println (str "\"" (clojure.string/replace "my string with \\ and \" in it" #"[\"\\]" (fn [x] (str "\\" x))) "\"")) The expected result would be: "my string with \\ and \" in it" the actual result is: "my string with \ and " in it" This should return an 'escaped' string, it does not since the 'double evaluation' of the return string results in \\\\ being reduced to |
| Comments |
| Comment by Heinz N. Gies [ 04/Nov/11 5:31 AM ] |
|
A working code for the replace example is: (println (str "\"" (clojure.string/replace "my string with \\ and \" in it" #"[\"\\]" (fn [x] (str "\\\\" (if (= x "\\") "\\") x))) "\"")) which is horribly ugly since you need to hand escape the backslash or it crashes. |
| Comment by Stuart Sierra [ 09/Dec/11 3:31 PM ] |
|
Vetted on 1.4.0 master. Needs patch. |
| Comment by Alan Malloy [ 05/Jan/12 8:30 PM ] |
|
No hand-escaping is necessary, just use #(java.util.regex.Matcher/quoteReplacement %). Edit: I see, we're talking about when you use a function as a replacement, not a replacement string like " |
| Comment by Alan Malloy [ 05/Jan/12 8:53 PM ] |
|
Patch and test for this issue. |
| Comment by Andy Fingerhut [ 06/Jan/12 12:25 AM ] |
|
|
| Comment by Andy Fingerhut [ 06/Jan/12 12:32 AM ] |
|
If there is any interest in a combined patch for this issue, |
| Comment by Andy Fingerhut [ 12/Jan/12 12:05 AM ] |
|
Proposed combined patch for |
| Comment by Stuart Sierra [ 03/Feb/12 9:15 AM ] |
|
The 1/12/2012 combined patch is not in correct format. Please recreate, see http://clojure.org/patches for correct format. |
| Comment by Andy Fingerhut [ 04/Feb/12 10:51 AM ] |
|
|
| Comment by Andy Fingerhut [ 28/Feb/12 12:38 PM ] |
|
Attaching slightly improved patch |
| Comment by Aaron Bedra [ 14/Aug/12 8:44 PM ] |
|
This is a small nit pick, but re-qr is a pretty non-descriptive name. Perhaps we could change it to re-quote-replacement? |
| Comment by Andy Fingerhut [ 15/Aug/12 11:19 AM ] |
|
Patches clj-753-870-905-combined-fix4.patch dated Aug 15 2012 and the patch that Aaron reviewed, fix3 uses the name re-qr for a function, where fix4 uses the name re-quote-replacement. I have no strong preference for either of them. |
| Comment by Andy Fingerhut [ 15/Aug/12 11:23 AM ] |
|
Added patch addressing what I think is Aaron's reason for changing state to Incomplete, so as a result I am changing state back to its former Vetted state. If there is something else a non-screener should be doing to bring Incomplete tickets to the attention of screeners, please let me know, and I will document it here: http://dev.clojure.org/display/design/JIRA+workflow |
| Comment by Stuart Sierra [ 15/Aug/12 2:42 PM ] |
|
I will be looking at this. |
| Comment by Stuart Sierra [ 17/Aug/12 7:55 AM ] |
|
Screened. This is an adequate fix for this particular problem, given the current API. However, the long and complicated docstring proves that replace and replace-first should each be four separate functions: code This is how these functions were originally written back in the days of clojure.contrib.str-utils. This would be a minor breaking change. |