Clojure

Add clojure.string functions for portability to ClojureScript

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Critical Critical
  • Resolution: Unresolved
  • Affects Version/s: Release 1.6
  • Fix Version/s: None
  • Component/s: None
  • Labels:
  • Patch:
    Code
  • Approval:
    Triaged

Description

It would be useful if a few common functions from Java's String were available as Clojure functions for the purposes of increasing portability to other Clojure platforms like ClojureScript.

The functions below also cover the vast majority of cases where Clojure users currently drop into Java interop for String calls - this tends to be an issue for discoverability and learning. While the goal of this ticket is increased portability, improving that is a nice secondary benefit.

Proposed clojure.string fn java.lang.String method
index-of indexOf
last-index-of lastIndexOf
starts-with? startsWith
ends-with? endsWith
includes? contains

Patch: Patch clj-1449-more-v1.patch is a good start, but needs the following additional work:

1) Update function names per the description here.
2) Per the instructions at the top of the clojure.string ns, functions should take the broader CharSequence interface instead of String. Similar to existing clojure.string functions, you will need to provide a CharSequence implementation while also calling into the String functions when you actually have a String.
3) Consider return type hints - I'm not sure they're necessary here, but I would examine the bytecode in typical calling situations to see if it would be helpful.
4) The new functions need new tests.
5) Check performance implications of the new versions vs the old with a good tool (like criterium). It would be good to know what the difference in perf actually is - some hit is worth it for a portable target.

Activity

Hide
Andrew Rosa added a comment -

Thanks Andy Fingerhut, didn't saw the answer here. I'm happy that Nola Stowe could take this one.

Show
Andrew Rosa added a comment - Thanks Andy Fingerhut, didn't saw the answer here. I'm happy that Nola Stowe could take this one.
Hide
Nola Stowe added a comment -

Andrew Rosa, I am interested on working on this. Are you currently working it?

Show
Nola Stowe added a comment - Andrew Rosa, I am interested on working on this. Are you currently working it?
Hide
Andy Fingerhut added a comment - - edited

Andrew Rosa, I am perfectly happy if someone else wants to work on a revised patch for this. If you are interested, go for it! And from Alex's comments, I believe my initial patch covers such a small fraction of what he wants, do not worry about keeping my name associated with any patch you submit – if yours is accepted, my patch will not have helped you in getting there.

Show
Andy Fingerhut added a comment - - edited Andrew Rosa, I am perfectly happy if someone else wants to work on a revised patch for this. If you are interested, go for it! And from Alex's comments, I believe my initial patch covers such a small fraction of what he wants, do not worry about keeping my name associated with any patch you submit – if yours is accepted, my patch will not have helped you in getting there.
Hide
Andrew Rosa added a comment -

Andy Fingerhut, since you are the author of the original patch do you desire to continue the work on it? If you prefer (or even need) to focus on different stuff, I would like to tackle this issue.

Show
Andrew Rosa added a comment - Andy Fingerhut, since you are the author of the original patch do you desire to continue the work on it? If you prefer (or even need) to focus on different stuff, I would like to tackle this issue.
Hide
Bozhidar Batsov added a comment -

I'm fine with `includes?`. Ruby is famous for the bad English used in its core library.

Show
Bozhidar Batsov added a comment - I'm fine with `includes?`. Ruby is famous for the bad English used in its core library.
Hide
Devin Walters added a comment -

I agree with Alex's comment about the overlap. Personally, I prefer the way "includes?" reads over "include?", but IMO either one is better than adding to the "contains?" confusion.

Show
Devin Walters added a comment - I agree with Alex's comment about the overlap. Personally, I prefer the way "includes?" reads over "include?", but IMO either one is better than adding to the "contains?" confusion.
Hide
Alex Miller added a comment -

Ruby uses "include?" for this.

Show
Alex Miller added a comment - Ruby uses "include?" for this.
Hide
Alex Miller added a comment -

I'm not concerned about overlapping the name. I'm concerned about overlapping it and meaning something different, particularly vs one of the most confusing functions in core already. clojure.core/contains? is better than linear time key search. clojure.string/whatever will be a linear time subsequence match.

Show
Alex Miller added a comment - I'm not concerned about overlapping the name. I'm concerned about overlapping it and meaning something different, particularly vs one of the most confusing functions in core already. clojure.core/contains? is better than linear time key search. clojure.string/whatever will be a linear time subsequence match.
Hide
Andy Fingerhut added a comment -

clojure.string already has 2 name conflicts with clojure.core, for which most people probably do something like (require '[clojure.string :as str]) to avoid that:

user=> (use 'clojure.string)
WARNING: reverse already refers to: #'clojure.core/reverse in namespace: user, being replaced by: #'clojure.string/reverse
WARNING: replace already refers to: #'clojure.core/replace in namespace: user, being replaced by: #'clojure.string/replace
nil
Show
Andy Fingerhut added a comment - clojure.string already has 2 name conflicts with clojure.core, for which most people probably do something like (require '[clojure.string :as str]) to avoid that:
user=> (use 'clojure.string)
WARNING: reverse already refers to: #'clojure.core/reverse in namespace: user, being replaced by: #'clojure.string/reverse
WARNING: replace already refers to: #'clojure.core/replace in namespace: user, being replaced by: #'clojure.string/replace
nil
Hide
Alex Miller added a comment - - edited

I'd love to have a really high-quality patch on this ticket to consider for 1.8 that took into account my comments above.

Also, it occurred to me that I don't think this should be called "contains?", overlapping the core function with a different meaning (contains value vs contains key). Maybe "includes?"?

Show
Alex Miller added a comment - - edited I'd love to have a really high-quality patch on this ticket to consider for 1.8 that took into account my comments above. Also, it occurred to me that I don't think this should be called "contains?", overlapping the core function with a different meaning (contains value vs contains key). Maybe "includes?"?
Hide
Alex Miller added a comment -

Well nothing is certain of course, but I keep a special report of things I've "screened" prior to vetting that makes possible moving something straight from Triaged all the way through into Screened/Ok when Rich is able to look at them. This is a good candidate if things were in pristine condition.

That said, I don't know whether Rich will approve it or not, so it's up to you. I think the argument for portability is a strong one and complements the feature expression.

Show
Alex Miller added a comment - Well nothing is certain of course, but I keep a special report of things I've "screened" prior to vetting that makes possible moving something straight from Triaged all the way through into Screened/Ok when Rich is able to look at them. This is a good candidate if things were in pristine condition. That said, I don't know whether Rich will approve it or not, so it's up to you. I think the argument for portability is a strong one and complements the feature expression.
Hide
Andy Fingerhut added a comment -

Alex, all your comments make sense.

If you think a ready-and-waiting patch that does those things would improve the odds of the ticket being vetted by Rich, please let us know.

My guess is that his decision will be based upon the description, not any proposed patches. If that is your belief also, I'll wait until he makes that decision before working on a patch. Of course, any other contributor is welcome to work on one if they like.

Show
Andy Fingerhut added a comment - Alex, all your comments make sense. If you think a ready-and-waiting patch that does those things would improve the odds of the ticket being vetted by Rich, please let us know. My guess is that his decision will be based upon the description, not any proposed patches. If that is your belief also, I'll wait until he makes that decision before working on a patch. Of course, any other contributor is welcome to work on one if they like.
Hide
Alex Miller added a comment - - edited

Some things that would be helpful:

1) It would be better to combine the two patches into a single patch - I think changing current uses into new users is a good thing to include. Also, please keep track of the "current" patch in the description.
2) Patch needs tests.
3) Per the instructions at the top of the clojure.string ns (and the rest of the functions), the majority of these functions are implemented to take the broader CharSequence interface. Similar to those implementations, you will need to provide a CharSequence implementation while also calling into the String functions when you actually have a String.
4) Consider return type hints - I'm not sure they're necessary here, but I would examine bytecode for typical calling situations to see if it would be helpful.
5) Check performance implications of the new versions vs the old with a good tool (like criterium). You've put an additional var invocation and (soon) type check in the calling path for these functions. I think providing a portable target is worth a small cost, but it would be good to know what the cost actually is.

I don't expect we will look at this until after 1.7 is released.

Show
Alex Miller added a comment - - edited Some things that would be helpful: 1) It would be better to combine the two patches into a single patch - I think changing current uses into new users is a good thing to include. Also, please keep track of the "current" patch in the description. 2) Patch needs tests. 3) Per the instructions at the top of the clojure.string ns (and the rest of the functions), the majority of these functions are implemented to take the broader CharSequence interface. Similar to those implementations, you will need to provide a CharSequence implementation while also calling into the String functions when you actually have a String. 4) Consider return type hints - I'm not sure they're necessary here, but I would examine bytecode for typical calling situations to see if it would be helpful. 5) Check performance implications of the new versions vs the old with a good tool (like criterium). You've put an additional var invocation and (soon) type check in the calling path for these functions. I think providing a portable target is worth a small cost, but it would be good to know what the cost actually is. I don't expect we will look at this until after 1.7 is released.
Hide
Bozhidar Batsov added a comment -

No idea when/if this will get merged. Upvote the ticket to improve the odds of this happening sooner.
Someone on the ClojureScript team will have to implement this in terms of JavaScript.

Show
Bozhidar Batsov added a comment - No idea when/if this will get merged. Upvote the ticket to improve the odds of this happening sooner. Someone on the ClojureScript team will have to implement this in terms of JavaScript.
Hide
Yehonathan Sharvit added a comment - - edited

Great! Any idea when it will be added to Clojure?
Also, will it be automatically added to Clojurescript or someone will have to write a particular code for it.
The suggested patch relies on Java so I am curious to understand who is going to port the patch to cljs.

Show
Yehonathan Sharvit added a comment - - edited Great! Any idea when it will be added to Clojure? Also, will it be automatically added to Clojurescript or someone will have to write a particular code for it. The suggested patch relies on Java so I am curious to understand who is going to port the patch to cljs.
Hide
Bozhidar Batsov added a comment -

Once this is added to Clojure it will be implemented in ClojureScript as well.

Show
Bozhidar Batsov added a comment - Once this is added to Clojure it will be implemented in ClojureScript as well.
Hide
Yehonathan Sharvit added a comment -

What about an implementation that works also in cljs?

Show
Yehonathan Sharvit added a comment - What about an implementation that works also in cljs?
Hide
Andy Fingerhut added a comment -

Patch clj-1449-basic-v1.patch dated Sep 5 2014 is identical to the patch I added recently called clj-1149-basic-v1.patch. It is simply renamed without the typo'd ticket number in the file name.

Show
Andy Fingerhut added a comment - Patch clj-1449-basic-v1.patch dated Sep 5 2014 is identical to the patch I added recently called clj-1149-basic-v1.patch. It is simply renamed without the typo'd ticket number in the file name.
Hide
Andy Fingerhut added a comment -

Patch clj-1449-basic-v1.patch dated Aug 30 2014 adds starts-with? ends-with? contains? functions to clojure.string.

Patch clj-1449-more-v1.patch is the same, except it also replaces several Java method calls with calls to these Clojure functions.

Show
Andy Fingerhut added a comment - Patch clj-1449-basic-v1.patch dated Aug 30 2014 adds starts-with? ends-with? contains? functions to clojure.string. Patch clj-1449-more-v1.patch is the same, except it also replaces several Java method calls with calls to these Clojure functions.
Hide
Andy Fingerhut added a comment -

Updating summary line and description to add contains? as well. I can back this off if it changes your mind about triaging it, Alex.

Show
Andy Fingerhut added a comment - Updating summary line and description to add contains? as well. I can back this off if it changes your mind about triaging it, Alex.
Hide
Pierre Masci added a comment -

Hi, I was thinking the same about starts-with and .ends-with, as well as (.indexOf s "c") and (.lastIndexOf "c").

I read the whole Java String API recently, and these 4 functions seem to be the only ones that don't have an equivalent in Clojure.
It would be nice to have them.

Andy Fingerhut who maintains the Clojure Cheatsheet told me: "I maintain the cheatsheet, and I put .indexOf and .lastIndexOf on there since they are probably the most common thing I saw asked about that is in the Java API but not the Clojure API, for strings."
Which shows that there is a demand.

Because Clojure is being hosted on several platforms, and might be hosted on more in the future, I think these functions should be part of the de-facto ecosystem.

Show
Pierre Masci added a comment - Hi, I was thinking the same about starts-with and .ends-with, as well as (.indexOf s "c") and (.lastIndexOf "c"). I read the whole Java String API recently, and these 4 functions seem to be the only ones that don't have an equivalent in Clojure. It would be nice to have them. Andy Fingerhut who maintains the Clojure Cheatsheet told me: "I maintain the cheatsheet, and I put .indexOf and .lastIndexOf on there since they are probably the most common thing I saw asked about that is in the Java API but not the Clojure API, for strings." Which shows that there is a demand. Because Clojure is being hosted on several platforms, and might be hosted on more in the future, I think these functions should be part of the de-facto ecosystem.
Hide
Bozhidar Batsov added a comment -

Re substring, I know about subs, but it seems very odd that it's not in the string ns. After all most people will likely look for string-related functionality in clojure.string. I think it'd be best if `subs` was added to clojure.string and clojure.core/subs was deprecated.

Show
Bozhidar Batsov added a comment - Re substring, I know about subs, but it seems very odd that it's not in the string ns. After all most people will likely look for string-related functionality in clojure.string. I think it'd be best if `subs` was added to clojure.string and clojure.core/subs was deprecated.
Hide
Alex Miller added a comment -

Jozef, String.startsWith() checks for a prefix string, not just a prefix char.

Show
Alex Miller added a comment - Jozef, String.startsWith() checks for a prefix string, not just a prefix char.
Hide
Jozef Wagner added a comment -

As strings are collection of characters, you can use Clojure's sequence facilities to achieve such functionality:

user=> (= (first "asdf") \a)
true
user=> (= (last "asdf") \a)
false
Show
Jozef Wagner added a comment - As strings are collection of characters, you can use Clojure's sequence facilities to achieve such functionality:
user=> (= (first "asdf") \a)
true
user=> (= (last "asdf") \a)
false
Hide
Alex Miller added a comment -

Re substring, there is a clojure.core/subs for this (predates the string ns I believe).

clojure.core/subs
([s start] [s start end])
Returns the substring of s beginning at start inclusive, and ending
at end (defaults to length of string), exclusive.

Show
Alex Miller added a comment - Re substring, there is a clojure.core/subs for this (predates the string ns I believe). clojure.core/subs ([s start] [s start end]) Returns the substring of s beginning at start inclusive, and ending at end (defaults to length of string), exclusive.

People

Vote (25)
Watch (4)

Dates

  • Created:
    Updated: