Clojure-Contrib

clojure.contrib.sql runs out of memory on very large datasets

Details

  • Type: Defect Defect
  • Status: In Progress In Progress
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None

Description

If you have a query that results in a very large dataset you will run out of memory because by default, with-query-results pulls the entire result set into memory (even though it creates a lazy-seq using resultset-seq).

This issue has been discussed previously here:
http://groups.google.com/group/clojure/browse_thread/thread/7b0c250e0ba6c9eb/fb9001522b49c20a

The fix is simple, just call (.setFetchSize stmt 1). But, with-query-results doesn't give you a way to do that currently. I propose adding an optional parameter after sql-params that is a hash of attributes to set on the PreparedStatement before executing it.

So you can do this:

(with-connection {���} 
    (.setAutoCommit (sql/connection) false) ;; needed for postgres 
    (with-query-results results ["SELECT id, data FROM nodes"] 
      {:fetch-size 1000} 
      (doseq [r results] 
        ���)))

The new code in clojure.contrib.sql is very simple, but it depends on a new contrib namespace called clojure.contrib.bean. Here is more info on my proposed fix:
http://groups.google.com/group/clojure-dev/browse_thread/thread/d8334759f10f3f45

Activity

Hide
Assembla Importer added a comment -

importer said: [file:cFaBqYI38r3565eJe5cbLA]

Show
Assembla Importer added a comment - importer said: [file:cFaBqYI38r3565eJe5cbLA]
Hide
Assembla Importer added a comment -

importer said: [file:cjDxceI4er36sweJe5cbCb]

Show
Assembla Importer added a comment - importer said: [file:cjDxceI4er36sweJe5cbCb]
Hide
Assembla Importer added a comment -

importer said: Hey Steve,

Any progress on this? Another option instead of passing a map of attributes and using bean would be to allow the user to pass an optional function that would be called on the statement object. I've seen code like this in compojure, I think:

(with-connection {���} 
    (.setAutoCommit (sql/connection) false) ;; needed for postgres 
    (with-query-results results ["SELECT id, data FROM nodes"] 
      #(.setFetchSize % 1) 
      (doseq [r results] 
        ���)))

If this sounds better to you, I can submit a patch.

Cheers,
Justin

Show
Assembla Importer added a comment - importer said: Hey Steve, Any progress on this? Another option instead of passing a map of attributes and using bean would be to allow the user to pass an optional function that would be called on the statement object. I've seen code like this in compojure, I think:
(with-connection {���} 
    (.setAutoCommit (sql/connection) false) ;; needed for postgres 
    (with-query-results results ["SELECT id, data FROM nodes"] 
      #(.setFetchSize % 1) 
      (doseq [r results] 
        ���)))
If this sounds better to you, I can submit a patch. Cheers, Justin
Hide
Assembla Importer added a comment -

importer said: Kyle Burton just posted a blog post about this issue: http://asymmetrical-view.com/2010/10/14/clojure-lazy-walk-sql-table.html

His fix is to add a new method called with-query-results-cursor which turns off auto-commit and sets the fetch size.

Steve, would this approach work better for you?

Show
Assembla Importer added a comment - importer said: Kyle Burton just posted a blog post about this issue: http://asymmetrical-view.com/2010/10/14/clojure-lazy-walk-sql-table.html His fix is to add a new method called with-query-results-cursor which turns off auto-commit and sets the fetch size. Steve, would this approach work better for you?

People

Vote (0)
Watch (0)

Dates

  • Created:
    Updated: