Details
-
Type:
Enhancement
-
Status:
Open
-
Priority:
Minor
-
Resolution: Unresolved
-
Affects Version/s: Release 1.5
-
Fix Version/s: None
-
Component/s: None
-
Labels:
-
Patch:Code
Description
clojure.java.io/resource corrupts path containing UTF-8 characters without issuing warning.
user=> (System/getProperty "java.runtime.version") "1.8.0-ea-b79" user=> (clojure-version) "1.5.0" user=> (System/getProperty "user.dir") "/dir/déf" user=> (clojure.java.io/resource "myfile.txt") #<URL file:/dir/d%c3%a9f/resources/myfile.txt> user=> (slurp (clojure.java.io/resource "myfile.txt") :encoding "UTF-8") FileNotFoundException /dir/déf/resources/myfile.txt (No such file or directory) java.io.FileInputStream.open (FileInputStream.java:-2)
Below is a workaround, at least. I don't know, but perhaps the as-file method for URLs in io.clj of Clojure, the part that converts %hh sequences to a character with code point in the range 0 through 255, is at least partly at fault here. I don't know right now if it is possible to modify that code to handle the general case of whatever character encoding munging is going on here to when .getResource creates the URL object.
clojure.java.io/resource is documented to return a Java object of type java.net.URL, which seems like it does %hh escaping of many characters. Reference [1] to a Java bug from 2001 where a Java user was surprised by the then-recent change in behavior of the getResource method [2].
Doing a little searching I found this StackOverflow question [3], which has what might be a workaround. I tried it on my Mac OS X 10.6 system running JDK 1.6 and it seemed to work:
(slurp (.getContent (clojure.java.io/resource "abcíd/foo.txt")))
That getContent is a method for class java.net.URL [4]
[1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4466485
[2] http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Class.html#getResource%28java.lang.String%29
[3] http://stackoverflow.com/questions/13013629/best-international-alternative-to-javas-getclass-getresource
[4] http://docs.oracle.com/javase/1.5.0/docs/api/java/net/URL.html#getContent%28%29