<!-- 
RSS generated by JIRA (4.4#649-r158309) at Wed Jun 19 19:46:55 CDT 2013

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary add field=key&field=summary to the URL of your request.
For example:
http://dev.clojure.org/jira/si/jira.issueviews:issue-xml/CLJ-829/CLJ-829.xml?field=key&field=summary
-->
<rss version="0.92" >
<channel>
    <title>Clojure JIRA</title>
    <link>http://dev.clojure.org/jira</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>4.4</version>
        <build-number>649</build-number>
        <build-date>25-07-2011</build-date>
    </build-info>

<item>
            <title>[CLJ-829] Transient hashmaps mishandle hash collisions</title>
                <link>http://dev.clojure.org/jira/browse/CLJ-829</link>
                <project id="10010" key="CLJ">Clojure</project>
                        <description>&lt;p&gt;Clojure 1.2.1 and 1.3.0-beta1 both exhibit the following behavior:&lt;/p&gt;

&lt;p&gt;user&amp;gt; (let [m (into {} (for [x (range 100000)] &amp;#91;(rand) (rand)&amp;#93;))]&lt;br/&gt;
        (println (count (distinct (map hash (keys m)))))&lt;br/&gt;
        ((juxt count identity) (persistent!&lt;br/&gt;
                (reduce dissoc! (transient m) (keys m)))))&lt;br/&gt;
99999&lt;br/&gt;
&amp;#91;2 {0.42548900739367024 0.8725039567983159}&amp;#93;&lt;/p&gt;

&lt;p&gt;We create a large map with random keys and values, and check how many unique hash codes we get. Then we iterate over all the keys, dissoc!&apos;ing each out of a transient version of the map. The resulting map has one element in it (wrong: it should be empty, since we dissoc&apos;ed all the keys) and reports its count as two (wrong: not sure whether it should be zero or one given the other breakage). As far as I can tell, each duplicated hash value is represented once in the output map, and the map&apos;s count is the number of keys that hashed to something duplicated.&lt;/p&gt;

&lt;p&gt;The problem seems to be restricted to transients: if we remove the transient/persistent! pair and use dissoc instead of dissoc!, the map is always empty.&lt;/p&gt;

&lt;p&gt;Inspired by discussion at &lt;a href=&quot;http://groups.google.com/group/clojure/browse_thread/thread/313ac122667bb4b5/c3e7faa8635403f1&quot;&gt;http://groups.google.com/group/clojure/browse_thread/thread/313ac122667bb4b5/c3e7faa8635403f1&lt;/a&gt;&lt;/p&gt;</description>
                <environment></environment>
            <key id="14579">CLJ-829</key>
            <summary>Transient hashmaps mishandle hash collisions</summary>
                <type id="1" iconUrl="http://dev.clojure.org/jira/images/icons/bug.gif">Defect</type>
                                <priority id="3" iconUrl="http://dev.clojure.org/jira/images/icons/priority_major.gif">Major</priority>
                    <status id="6" iconUrl="http://dev.clojure.org/jira/images/icons/status_closed.gif">Closed</status>
                    <resolution id="1">Completed</resolution>
                                <assignee username="cgrand">Christophe Grand</assignee>
                                <reporter username="amalloy">Alan Malloy</reporter>
                        <labels>
                    </labels>
                <created>Wed, 24 Aug 2011 03:30:21 -0500</created>
                <updated>Fri, 1 Mar 2013 12:47:05 -0600</updated>
                    <resolved>Fri, 7 Oct 2011 09:12:36 -0500</resolved>
                            <version>Release 1.3</version>
                                <fixVersion>Release 1.4</fixVersion>
                                        <due></due>
                    <votes>0</votes>
                        <watches>3</watches>
                        <comments>
                    <comment id="26733" author="amalloy" created="Thu, 25 Aug 2011 12:26:12 -0500"  >&lt;p&gt;By the way, since this involves randomness, on occasion it doesn&apos;t fail. With the input as given it seems to fail around 80% of the time, but if you want to be sure to reproduce you can add another 0 to the input size.&lt;/p&gt;</comment>
                    <comment id="26735" author="aaron" created="Fri, 26 Aug 2011 09:20:43 -0500"  >&lt;p&gt;Thanks for the update.  I was able to reproduce with the extra zero. Moving this ticket to Release.Next so that it will ship with 1.3.&lt;/p&gt;</comment>
                    <comment id="26777" author="cgrand" created="Thu, 8 Sep 2011 13:15:20 -0500"  >&lt;p&gt;Sorry for the delay. Here is a fix and a reduced test case.&lt;/p&gt;</comment>
                    <comment id="26780" author="stu" created="Fri, 9 Sep 2011 15:50:56 -0500"  >&lt;p&gt;I have checked behavior with an independent test&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/clojure/test.generative/commit/b1350235eb219b76f5b7c4cc21c8255f567892b3&quot;&gt;https://github.com/clojure/test.generative/commit/b1350235eb219b76f5b7c4cc21c8255f567892b3&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;which looks good, but don&apos;t have context to evaluate the code (particularly the array allocation) in the time I have available today.&lt;/p&gt;</comment>
                    <comment id="26781" author="richhickey" created="Fri, 9 Sep 2011 16:14:01 -0500"  >&lt;p&gt;The array alloc looks suspicious:&lt;/p&gt;

&lt;p&gt;+		Object[] newArray = new Object&amp;#91;2*(array.length+1)&amp;#93;; // make room for next assoc&lt;br/&gt;
+		System.arraycopy(array, 0, newArray, 0, 2*count);&lt;/p&gt;

&lt;p&gt;Should it not be array.length + 2 (or 2*(count + 1), whichever)?&lt;/p&gt;</comment>
                    <comment id="26783" author="cgrand" created="Sat, 10 Sep 2011 03:48:51 -0500"  >&lt;p&gt;Thanks Rich for spotting this copy&amp;amp;paste error.&lt;/p&gt;

&lt;p&gt;I attached an updated patch.&lt;/p&gt;

&lt;p&gt;The problem reported by Alan was twofold:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;the misplaced removedLeaf.val = removedLeaf was causing the global count to be incorrect&lt;/li&gt;
	&lt;li&gt;the missing array copy in ensureEditable was causing the seq returned by (keys m) to be mutated (shortened), which is why not all of the keys were dissoc&apos;ed.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Mutable code is hard, one should invent a cool language with sane state management &lt;img class=&quot;emoticon&quot; src=&quot;http://dev.clojure.org/jira/images/icons/emoticons/smile.gif&quot; height=&quot;20&quot; width=&quot;20&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                    <comment id="26784" author="stu" created="Sat, 10 Sep 2011 14:54:57 -0500"  >&lt;p&gt;New tests continue to pass.&lt;/p&gt;

&lt;p&gt;It seems to me that the allocation in the second path is too big by two in some cases (e.g. it gets called in the dissoc! path when the array needs to be copied but does not need to grow). But this might be considered innocuous. The same method is called in the assoc! path, where the +2 is needed, so avoiding two objects&apos; worth of allocation would require a more substantial patch.&lt;/p&gt;</comment>
                    <comment id="26787" author="cgrand" created="Mon, 12 Sep 2011 03:39:27 -0500"  >&lt;p&gt;The larger array allocation is similar to the one performed in BitmapIndexedNode.ensureEditable. My line of thought behind this heuristic is that the copying dominates the allocation, and that I prefer one slightly larger array to having to allocate two arrays in case of growth.&lt;/p&gt;

&lt;p&gt;Anyway, it&apos;s on collision nodes, so it&apos;s a rare occurrence: I won&apos;t bother arguing further if switching to a more conservative allocation helps the patch land in 1.3.&lt;/p&gt;</comment>
                </comments>
                    <attachments>
                    <attachment id="10340" name="clj-829-2.diff" size="2531" author="cgrand" created="Sat, 10 Sep 2011 03:48:51 -0500" />
                    <attachment id="10333" name="clj-829.diff" size="2537" author="cgrand" created="Thu, 8 Sep 2011 13:15:20 -0500" />
                </attachments>
            <subtasks>
        </subtasks>
                <customfields>
                                <customfield id="customfield_10002" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                <customfieldname>Approval</customfieldname>
                <customfieldvalues>
                        <customfieldvalue key="10007">Ok</customfieldvalue>

                </customfieldvalues>
            </customfield>
                                                                                    <customfield id="customfield_10010" key="com.pyxis.greenhopper.jira:gh-global-rank">
                <customfieldname>Global Rank</customfieldname>
                <customfieldvalues>
                    
                </customfieldvalues>
            </customfield>
                                            <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                <customfieldname>Patch</customfieldname>
                <customfieldvalues>
                        <customfieldvalue key="10002">Code and Test</customfieldvalue>

                </customfieldvalues>
            </customfield>
                                                                                        </customfields>
    </item>
</channel>
</rss>