tools.reader

Reader supports poorly defined regexes that break code

Details

  • Type: Defect
  • Status: Closed
  • Priority: Minor
  • Resolution: Declined
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels: None

Description

I ran into a strange case where CLJS emitted invalid code based on a poorly formatted regex that escaped / incorrectly.

Consider such a regex, alongside two similar but well-formed ones, as read by tools.reader:

(str #"/")   => "/"
(str #"\/")  => "\\/"
(str #"\\/") => "\\\\/"

But what does "\\/" mean here?

Looking at how Clojure executes these regexes against the two-character string \/ (written "\\/" as a string literal):

(re-find #"/"   "\\/") => "/"
(re-find #"\/"  "\\/") => "/"
(re-find #"\\/" "\\/") => "\\/"

i.e. #"\/" behaves exactly like #"/".
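The equivalence can be checked at a Clojure REPL. The snippet below is a minimal sketch that assumes the input is the two-character string \/ (which has to be written "\\/", since "\/" is not a readable string literal); the names input, results and sources are illustrative only.

```clojure
;; Assumption: the input is the two-character string \/ ,
;; written "\\/" as a Clojure string literal.
(def input "\\/")

;; Match results for the three regex literals from the description.
(def results
  {:plain   (re-find #"/"   input)    ; matches the slash only
   :escaped (re-find #"\/"  input)    ; also matches the slash only
   :literal (re-find #"\\/" input)})  ; matches backslash + slash

;; The pattern sources still differ, even though the first two
;; patterns match identically.
(def sources
  [(str #"/") (str #"\/") (str #"\\/")])
```

So the first two literals are indistinguishable by matching behaviour, yet they carry different source strings, and that source string is what an emitter downstream has to work with.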

Things get more unfortunate once CLJS gets involved: it does not expect the "heisen" regex, and the "dangling escape" ends up capturing the forward slash's escape, i.e. a prematurely terminated regex is emitted.
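One way this can go wrong is sketched below. The function naive-emit-js-regex is hypothetical and is not the actual CLJS compiler code; it only assumes an emitter that wraps the pattern source in /.../ and escapes every forward slash it finds.

```clojure
(require '[clojure.string :as string])

;; Hypothetical emitter, NOT the real CLJS implementation: wrap the
;; pattern source in /.../ and put a backslash before every slash so
;; that the slash cannot terminate the JavaScript regex literal.
(defn naive-emit-js-regex [pattern-source]
  (str "/" (string/replace pattern-source "/" "\\/") "/"))

;; Well-formed input: the emitted JS text is /\//
(def emitted-plain (naive-emit-js-regex (str #"/")))

;; "Heisen" input: the emitted JS text is /\\//
;; The slash was already escaped, so the added backslash pairs up with
;; the existing one instead of with the slash.
(def emitted-escaped (naive-emit-js-regex (str #"\/")))
```

In the second case a JavaScript parser reads /\\/ as a complete regex matching a single backslash, followed by a stray /, which is the premature termination described above.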

Despite Clojure's existing "fortuitous" behaviour, perhaps the correct behaviour is to throw a reader exception for such regexes, as it does for the string literal "\/".
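The asymmetry between string and regex literals can be observed with a small sketch. It uses clojure.core/read-string as a stand-in reader; that tools.reader behaves the same way is taken from the description above, and try-read is an illustrative helper name.

```clojure
;; Read a single form from source text, returning :reader-error
;; instead of propagating the exception when the form is rejected.
(defn try-read [source]
  (try
    (read-string source)
    (catch Exception _ :reader-error)))

;; The string literal "\/" is rejected (unsupported escape character).
(def string-result (try-read "\"\\/\""))

;; The regex literal #"\/" is accepted; its pattern source keeps the
;; backslash, so (str ...) yields the two characters \/
(def regex-source (str (try-read "#\"\\/\"")))
```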

Alternatively, if #"\/" remains supported (for familiarity with users used to /.../ syntax), then the reader should emit "/", not "\\/", as the string value of the literal, i.e. this "tolerance" should be part of the reader semantics rather than a concern for emitters.
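A rough sketch of what such reader-side tolerance might look like follows. normalize-escaped-slash is a hypothetical helper, not part of tools.reader; it assumes the normalization would run on the pattern source before the Pattern is compiled, and it consumes escapes in pairs so that an escaped backslash followed by a slash is left alone.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper, NOT part of tools.reader: rewrite \/ to /
;; in a regex source string. The regex consumes a backslash together
;; with the character it escapes, so in \\/ the first backslash pairs
;; with the second one and the slash is never treated as escaped.
(defn normalize-escaped-slash [pattern-source]
  (string/replace pattern-source
                  #"(?s)\\(.)"
                  (fn [[whole escaped]]
                    (if (= escaped "/") "/" whole))))

(def normalized-heisen  (normalize-escaped-slash "\\/"))   ; source \/
(def normalized-literal (normalize-escaped-slash "\\\\/")) ; source \\/
(def normalized-mixed   (normalize-escaped-slash "a\\/b\\d"))
```

With this in place, emitters would only ever see "/" or a genuinely escaped backslash, and could escape slashes for /.../ syntax without special-casing.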

See http://dev.clojure.org/jira/browse/CLJS-1399
