<< Back to previous view

[CLJ-740] Unnecessary boxing of primitives in case form Created: 17/Feb/11  Updated: 15/Sep/14  Resolved: 15/Sep/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: Release 1.3
Fix Version/s: None

Type: Defect Priority: Major
Reporter: Mikhail Kryshen Assignee: Unassigned
Resolution: Not Reproducible Votes: 0
Labels: None


 Description   

Found this while profiling some performance-critical code.

Consider the following Clojure function:

(defn test-case ^double [^long i ^double d1 ^double d2]
  (case (int i)
    0 d1
    d2))

Current Clojure 1.3 snapshot compiles it to:

public final double invokePrim(long, double, double)   throws java.lang.Exception;
  Code:
   0:	lload_1
   1:	invokestatic	#67; //Method clojure/lang/RT.intCast:(J)I
   4:	istore	7
   6:	iload	7
   8:	i2l
   9:	invokestatic	#73; //Method clojure/lang/Numbers.num:(J)Ljava/lang/Number;
   12:	invokestatic	#79; //Method clojure/lang/Util.hash:(Ljava/lang/Object;)I
   15:	iconst_0
   16:	ishr
   17:	iconst_1
   18:	iand
   19:	tableswitch{ //0 to 0
		0: 36;
		default: 58 }
   36:	iload	7
   38:	i2l
   39:	invokestatic	#73; //Method clojure/lang/Numbers.num:(J)Ljava/lang/Number;
   42:	getstatic	#45; //Field const__3:Ljava/lang/Object;
   45:	invokestatic	#83; //Method clojure/lang/Util.equals:(Ljava/lang/Object;Ljava/lang/Object;)Z
   48:	ifeq	58
   51:	dload_3
   52:	invokestatic	#88; //Method java/lang/Double.valueOf:(D)Ljava/lang/Double;
   55:	goto	63
   58:	dload	5
   60:	invokestatic	#88; //Method java/lang/Double.valueOf:(D)Ljava/lang/Double;
   63:	checkcast	#92; //class java/lang/Number
   66:	invokevirtual	#96; //Method java/lang/Number.doubleValue:()D
   69:	dreturn
}

This bytecode contains boxing of primitives (calls to clojure/lang/Numbers.num and java/lang/Double.valueOf) and calls to clojure/lang/Util.hash and clojure/lang/Util.equals that does not seem necessary.

At 60-66 primitive double is boxed into Double only to be converted back into primitive.

The equivalent Java code compiles to much simpler and faster bytecode:

public double testCase(long, double, double);
  Code:
   0:	lload_1
   1:	l2i
   2:	lookupswitch{ //1
		0: 20;
		default: 22 }
   20:	dload_3
   21:	dreturn
   22:	dload	5
   24:	dreturn
}


 Comments   
Comment by Alexander Taggart [ 28/Feb/11 2:16 PM ]

Improved via patch on CLJ-426.

(defn test-case ^double [^long i ^double d1 ^double d2]
  (case (int i)
    0 d1
    d2))

now emits as

 0  lload_1 [i]
 1  invokestatic clojure.lang.RT.intCast(long) : int [67]
 4  istore 7 [G__7903] // let-bound expression
 6  iload 7 [G__7903]
 8  tableswitch default: 32
      case 0: 28
28  dload_3 [d2]
29  goto 34
32  dload 5 [arg2]
34  dreturn

or if the int cast of the expression is omitted:

 0  lload_1 [i]
 1  lstore 7 [G__7903] // let-bound expression
 3  lload 7 [G__7903]
 5  l2i
 6  tableswitch default: 35
      case 0: 24
24  lconst_0           // match, verify long expr wasn't truncated
25  lload 7 [G__7903]
27  lcmp
28  ifne 35
31  dload_3 [d2]
32  goto 37
35  dload 5 [arg2]
37  dreturn
Comment by Ghadi Shayban [ 14/Sep/14 10:00 PM ]

I can't reproduce this on -master. No boxing here:

  public final double invokePrim(long i, double d2, double arg2);
     0  lload_1 [i]
     1  invokestatic clojure.lang.RT.intCast(long) : int [44]
     4  istore 7 [G__3439]
     6  iload 7 [G__3439]
     8  tableswitch default: 32
          case 0: 28
    28  dload_3 [d2]
    29  goto 34
    32  dload 5 [arg2]
    34  dreturn
      Line numbers:
        [pc: 0, line: 1]
        [pc: 0, line: 2]
        [pc: 6, line: 2]
      Local variable table:
        [pc: 6, pc: 34] local: G__3439 index: 7 type: int
        [pc: 0, pc: 34] local: this index: 0 type: java.lang.Object
        [pc: 0, pc: 34] local: i index: 1 type: long
        [pc: 0, pc: 34] local: d1 index: 2 type: double
        [pc: 0, pc: 34] local: d2 index: 3 type: double
Comment by Alex Miller [ 15/Sep/14 10:54 AM ]

The case stuff has been worked on several times (including in 1.6) since it's release, so it's likely this was fixed as a side effect of prior changes.





[CLJ-1100] Reader literals cannot contain periods Created: 02/Nov/12  Updated: 12/Sep/14  Resolved: 12/Sep/14

Status: Closed
Project: Clojure
Component/s: None
Affects Version/s: None
Fix Version/s: Release 1.7

Type: Defect Priority: Minor
Reporter: Kevin Lynagh Assignee: Unassigned
Resolution: Declined Votes: 1
Labels: reader

Attachments: Text File CLJ-1100-reader-tags-with-periods.patch     Text File clj-1100-v2.patch    
Patch: Code and Test
Approval: Incomplete

 Description   

The reader tries to read a record instead of a literal if the tag contains periods.

user> (binding [*data-readers* {'foo/bar #'identity}] (read-string "#foo/bar 1"))
1
user> (binding [*data-readers* {'foo/bar.x #'identity}] (read-string "#foo/bar.x 1"))
ClassNotFoundException foo/bar.x  java.lang.Class.forName0 (Class.java:-2)

Summary of reader forms:

Kind Example Constraint Status
Record #user.Foo[1] record class name OK
Class #java.lang.String["abc"] class name OK
Clojure reader tag #uuid "c48d7d6e-f3bb-425a-abc5-44bd014a511d" not a class name, no "/" OK
Library reader tag #my/card "5H" not a class name, has "/" OK
  #my.ns/card "5H" not a class name, has "/" OK
  #my/playing.card "5H" not a class name, has "/" BROKEN - read as record

Note: reader tags should not be allowed to override the record reader.

Cause: In LispReader, CtorReader.invoke() decides between record and tagged literal based on whether the tag has a ".".

Proposed: Change the discriminator in CtorReader by doing more string inspection:

  • If name has a "/" -> readTagged (not a legal class name)
  • If name has no "/" or "." -> readTagged (records must have qualified names)
  • Else -> readRecord (also covers Java classes)

Tradeoffs: Clojure-defined data reader tags must not contain periods. Not possible to read a Java class with no package. Avoids unnecessary class loading/construction for all tags.

Patch: CLJ-1100-v2.patch

Screened by: Alex Miller

Alternatives considered:

Using class checks:

  • Attempt readRecord (also covers Java classes)
  • If failed, attempt readTagged

Tradeoffs: Clojure tags could not override Java/record constructors - not sure that's something we'd ever want to do, but this would cut that off. This alternative may attempt classloading when it would not have before.



 Comments   
Comment by Steve Miner [ 06/Nov/12 9:41 AM ]

The suggested patch (clj-1100-reader-literal-periods.patch) will break reading records when *default-data-reader-fn* is set. Try adding a test like this:

(deftest tags-containing-periods-with-default
      ;; we need a predefined record for this test so we (mis)use clojure.reflect.Field for convenience
      (let [v "#clojure.reflect.Field{:name \"fake\" :type :fake :declaring-class \"Fake\" :flags nil}"]
        (binding [*default-data-reader-fn* nil]
          (is (= (read-string v) #clojure.reflect.Field{:name "fake" :type :fake :declaring-class "Fake" :flags nil})))
        (binding [*default-data-reader-fn* (fn [tag val] (assoc val :meaning 42))]
          (is (= (read-string v) #clojure.reflect.Field{:name "fake" :type :fake :declaring-class "Fake" :flags nil})))))
Comment by Rich Hickey [ 29/Nov/12 9:36 AM ]

The problem assessment is ok, but the resolution approach may not be. What happens should be based not upon what is in data-readers but whether or not the name names a class.

Is the intent here to allow readers to circumvent records? I'm not in favor of that.

Comment by Steve Miner [ 29/Nov/12 4:01 PM ]

New patch following Rich's comments. The decision to read a record is now based on the symbol containing periods and not having a namespace. Otherwise, it is considered a data reader tag. User
defined tags are required to be qualified but they may now have periods in the name. Tests added to show that
data readers cannot override record classes. Note: Clojure-defined data reader tags may be unqualified, but they should not contain periods in order to avoid confusion with record classes.

Comment by Steve Miner [ 29/Nov/12 4:17 PM ]

I deleted my old patch and some comments referring to it to avoid confusion.

In Clojure 1.5 beta 1, # followed by a qualified symbol with a period in the name is considered a record and causes an exception for the missing record class. With the patch, only non-qualified symbols containing periods are considered records. That allows user-defined qualified symbols with periods in their names to be used as data reader tags.

Comment by Andy Fingerhut [ 07/Feb/13 9:05 AM ]

clj-1100-periods-in-data-reader-tags-patch-v2.txt dated Feb 7 2013 is identical to CLJ-1100-periods-in-data-reader-tags.patch dated Nov 29 2012, except it applies cleanly to latest master. The only change appears to be in some white space in the context lines.

Comment by Andy Fingerhut [ 07/Feb/13 12:53 PM ]

I've removed clj-1100-periods-in-data-reader-tags-patch-v2.txt mentioned in the previous comment, because I learned that CLJ-1100-periods-in-data-reader-tags.patch dated Nov 29 2012 applies cleanly to latest master and passes all tests if you use this command to apply it.

% git am --keep-cr -s --ignore-whitespace < CLJ-1100-periods-in-data-reader-tags.patch

I've already updated the JIRA workflow and screening patches wiki pages to mention this --ignore-whitespace option.

Comment by Andy Fingerhut [ 13/Feb/13 11:31 AM ]

Both of the current patches, CLJ-1100-periods-in-data-reader-tags.patch dated Nov 29 2012, and clj-1100-reader-literal-periods.patch dated Nov 6 2012, fail to apply cleanly to latest master (1.5.0-RC15) as of today, although they did last week. Given all of the changes around read / read-string and edn recently, they should probably be evaluated by their authors to see how they should be updated.

Comment by Steve Miner [ 14/Feb/13 12:23 PM ]

I deleted my patch: CLJ-1100-periods-in-data-reader-tags.patch. clj-1100-reader-literal-periods.patch is clearly wrong, but the original author or an administrator has to delete that.

Comment by Kevin Lynagh [ 14/Feb/13 1:28 PM ]

I cannot figure out how to remove my attachment (clj-1100-reader-literal-periods.patch) in JIRA.

Comment by Steve Miner [ 14/Feb/13 1:43 PM ]

Downarrow (popup) menu to the right of the "Attachments" section. Choose "manager attachments".

Comment by Kevin Lynagh [ 14/Feb/13 2:02 PM ]

Great, thanks Steve. Are you going to take another pass at this issue, or should I give it a go?

Comment by Steve Miner [ 14/Feb/13 3:04 PM ]

Kevin, I'm not planning to work on this right now as 1.5 is pretty much done. It might be worthwhile discussing the issue a bit on the dev mailing list before working on a patch, but that's up to you. I think my approach was correct, although now changes would have to be applied to both LispReader and EdnReader.

Comment by Alex Miller [ 09/Apr/14 10:29 AM ]

Updated description based on my understanding.

Comment by Steve Miner [ 22/Apr/14 3:30 PM ]

I will resurrect my old patch and update it for the changes since 1.5.

Comment by Steve Miner [ 28/Apr/14 8:21 AM ]

Added patch to allow reader tags to have periods, but only with a namespace. Added tests to confirm that it works, but does not allow overriding a record name with a data-reader.

Comment by Steve Miner [ 28/Apr/14 8:51 AM ]

The patch implements Alex's alternative 1. It's purely lexical. A tag symbol without a namespace and containing periods is handled as a record (Java class). Otherwise, it's a data-reader tag. Of course, unqualified symbols without periods are still data-reader tags.

IMHO, a Java class without a package is a pathological case which Clojure doesn't need to worry about. This patch follows the convention that Java classes are named by unqualified symbols containing dots.

I did try alternative 2, testing for an actual class, but the implementation was more complicated. Also, it would open the possibility of breaking working code by adding a record or Java class that accidentally collided with an unqualified dotted tag that had previously worked fine. It's better to follow a simple rule that unqualified dotted symbols always refer to classes. Maybe the class doesn't actually exist, but that doesn't mean the symbol might be a data-literal tag.

Comment by Alex Miller [ 13/May/14 4:49 PM ]

Added clj-1100-v2.patch - identical, just removes whitespace to simplify change.

Comment by Rich Hickey [ 29/Aug/14 9:16 AM ]

I think we should disallow this rather than enable it. We don't generally support foo/bar.x

Comment by Nicola Mometto [ 29/Aug/14 9:27 AM ]

I created http://dev.clojure.org/jira/browse/CLJ-1516 with a patch that throws an exception on `(def foo.bar)`

Comment by Alex Miller [ 12/Sep/14 4:45 PM ]

Declined per Rich's comment and replaced with alternative on CLJ-1516





Generated at Thu Sep 18 12:48:06 CDT 2014 using JIRA 4.4#649-r158309.