ClojureScript

re-matches is incorrect

Details

  • Type: Defect Defect
  • Status: Open Open
  • Priority: Minor Minor
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None

Description

The re-matches function does not have the correct semantics: it performs a search (not match) against the string and returns nil if the string and matched-string are unequal. This is not the same as true matching, which is like inserting "^" and "$" at the beginning and end of the pattern.

Example in Clojure:

user=> (re-find #"0|[1-9][0-9]+|0[xX][0-9a-zA-Z]+" "0x1")
"0"
user=> (re-matches #"0|[1-9][0-9]+|0[xX][0-9a-zA-Z]+" "0x1")
"0x1"

Compare Clojurescript:

ClojureScript:cljs.user> (re-find  #"0|[1-9][0-9]+|0[xX][0-9a-zA-Z]+" "0x1")
"0"
ClojureScript:cljs.user> (re-matches #"0|[1-9][0-9]+|0[xX][0-9a-zA-Z]+" "0x1")
nil

This bug is (one of the) reasons why CLJS-775.

I'm not completely sure what to do here. My first thought is to have re-matches inspect the -source property of its regex input, wrap the string with "^$", then carefully copy all flags over to a new regexp.

Questions:

  1. Are there any valid patterns where this is not safe? E.g., where we could not put ^ first? Is "^^abc$$" ok?
  2. Can we avoid cloning if ^ and $ are already the first and last chars of the pattern?
  3. How does multiline mode play in to this, if at all?
  4. regexinstance.lastIndex is a piece of mutability on regex instances (or the RegExp global on older browsers) which is used as a string offset for multiple invocations of exec() on the same string. I have no idea what to do if re-* gets a regex with the global flag set. (BTW, this is a very good reason to reject CLJS-150: allowing clojure to accept the global flag makes regular expression objects stateful, and would completely screw up re-seq for example.)

Activity

David Nolen made changes -
Field Original Value New Value
Priority Major [ 3 ] Minor [ 4 ]

People

Vote (0)
Watch (1)

Dates

  • Created:
    Updated: