[DCSV-5] No option for parsing into maps Created: 21/May/13 Updated: 21/May/13 |
|
| Status: | Open |
| Project: | data.csv |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Enhancement | Priority: | Major |
| Reporter: | Gary Fredericks | Assignee: | Jonas Enlund |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Description |
|
I imagine a very common use case for parsing CSVs is getting the output as a sequence of maps. I'm happy to provide a patch for this, but I wanted to make sure I had the right design first. My initial idea is to add another option to read-csv named :headers, which can be either a sequence of values or a flag such as :first-row. Presumably we ought to also support using the first row as keywords rather than strings, so I'm not sure whether that should be a separate option or a different flag (e.g., :first-row-keywords). |
| Comments |
| Comment by Jonas Enlund [ 21/May/13 1:28 PM ] |
|
I've seen this feature request before, so I think that something like this should be added. One approach would be to provide a helper function:

(defn csv-data->maps [vecs]
  (map zipmap (repeat (first vecs)) (rest vecs)))

(csv-data->maps (read-csv reader))
|
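A minimal sketch of the helper approach discussed in this issue, extended with the keyword-header variant raised in the description. The name `rows->maps` and its `:keywordize?` option are hypothetical and not part of data.csv. The input is literal data in the shape read-csv returns.

```clojure
(defn rows->maps
  "Hypothetical helper: turns a seq of row vectors into a seq of maps,
  using the first row as the keys. With :keywordize? true, the header
  strings are converted to keywords."
  [rows & {:keys [keywordize?]}]
  (let [headers (cond->> (first rows)
                  keywordize? (map keyword))]
    (map #(zipmap headers %) (rest rows))))

;; Literal data in the shape read-csv returns (a seq of vectors of strings).
(def sample-rows [["name" "age"] ["Ann" "31"] ["Bob" "27"]])

(rows->maps sample-rows)
;; => ({"name" "Ann", "age" "31"} {"name" "Bob", "age" "27"})

(rows->maps sample-rows :keywordize? true)
;; => ({:name "Ann", :age "31"} {:name "Bob", :age "27"})
```

Because zipmap stops at the shorter of its two arguments, a row with fewer cells than the header row produces a map with fewer keys rather than an error.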
[DCSV-4] \return as record separator with unquoted fields is read as part of the field Created: 24/Oct/12 Updated: 24/Oct/12 |
|
| Status: | Open |
| Project: | data.csv |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Defect | Priority: | Major |
| Reporter: | John Hume | Assignee: | Jonas Enlund |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Description |
|
This regards the gray area of being "more forgiving." If I understand RFC 4180 correctly, I want to suggest substituting one bit of forgiveness for another: rather than supporting unquoted, multi-line cell values, I suggest supporting CSVs with just \return as the record-separator. Would you accept a patch for that?

A file with \return as record-separator is interpreted by read-csv as a single row like (["Header1" "Header2\rval1" "val2"]).

I believe the RFC only allows fields to contain CR and LF when they're escaped (i.e., surrounded in double quotes). See the ABNF at the end of section 2.

As far as implementation, I believe this would require wrapping any Reader w/o markSupported in one that does, so that the LF following a CR can be consumed when present.

[I've classified this as a major defect because I ran into a \return-delimited file as soon as I passed a CSV from a Linux machine to a Windows machine, so I'm guessing these files are common. Feel free to reclassify.] |
| Comments |
| Comment by Jonas Enlund [ 24/Oct/12 3:00 PM ] |
|
> rather than supporting unquoted, multi-line cell values, I suggest supporting CSVs with just \return as the record-separator. Would you accept a patch for that?

Sounds good to me.

> As far as implementation, I believe this would require wrapping any Reader w/o markSupported in one that does

I think that's OK, since BufferedReader supports it. |
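The mark/reset idea discussed in this issue can be sketched as follows. This is an illustration under stated assumptions, not data.csv's parser: the function names are hypothetical, and the record splitter handles only unquoted text, treating CR, LF, and CRLF each as one record separator.

```clojure
(import '(java.io BufferedReader Reader StringReader))

(defn ensure-mark-supported
  "Hypothetical helper: wraps the reader in a BufferedReader unless it
  already supports mark/reset."
  ^Reader [^Reader r]
  (if (.markSupported r) r (BufferedReader. r)))

(defn skip-lf-after-cr!
  "Call right after a \\return has been read. Consumes one following
  \\newline if present; otherwise restores the reader position."
  [^Reader r]
  (.mark r 1)
  (when-not (= (.read r) (int \newline))
    (.reset r)))

(defn read-unquoted-records
  "Illustration only: splits unquoted text into records on CR, LF, or CRLF."
  [^Reader reader]
  (let [r (ensure-mark-supported reader)]
    (loop [records [] ^StringBuilder sb (StringBuilder.)]
      (let [c (.read r)]
        (cond
          ;; End of input: keep a trailing record only if it is non-empty.
          (= c -1) (if (pos? (.length sb)) (conj records (str sb)) records)
          (= c (int \return)) (do (skip-lf-after-cr! r)
                                  (recur (conj records (str sb)) (StringBuilder.)))
          (= c (int \newline)) (recur (conj records (str sb)) (StringBuilder.))
          :else (do (.append sb (char c))
                    (recur records sb)))))))

(read-unquoted-records (StringReader. "Header1,Header2\rval1,val2"))
;; => ["Header1,Header2" "val1,val2"]
```

A mark limit of 1 is enough here because at most one character is read before deciding whether to reset.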
[DCSV-3] Some minor documentation typos Created: 14/Jun/12 Updated: 15/Jun/12 Resolved: 15/Jun/12 |
|
| Status: | Resolved |
| Project: | data.csv |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Defect | Priority: | Trivial |
| Reporter: | Trent Ogren | Assignee: | Jonas Enlund |
| Resolution: | Completed | Votes: | 0 |
| Labels: | docs, documentation, typo | ||
| Attachments: |
|
| Patch: | Code |
| Description |
|
I found a couple minor typos: one in the README, one in a docstring. I've included a patch. |
[DCSV-2] \return characters do not trigger value quoting Created: 10/Feb/12 Updated: 14/Feb/12 Resolved: 13/Feb/12 |
|
| Status: | Resolved |
| Project: | data.csv |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Defect | Priority: | Major |
| Reporter: | Giorgio Valoti | Assignee: | Jonas Enlund |
| Resolution: | Completed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Apache Maven 3.0.3 (r1075438; 2011-02-28 18:31:09+0100) |
||
| Attachments: |
|
| Patch: | Code and Test |
| Description |
|
If a value written to the CSV file contains \return characters, the value is not quoted. A possible patch is attached. |
| Comments |
| Comment by Jonas Enlund [ 13/Feb/12 11:16 PM ] |
|
This is fixed in version 0.1.1. I couldn't accept your patch though, as I didn't find you on the contributor list at http://clojure.org/contributing |
| Comment by Giorgio Valoti [ 14/Feb/12 12:36 AM ] |
|
Oh, sorry about that. I’d completely forgotten about it because of the problems with JIRA. Glad to hear it was useful, anyway. BTW
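As a rough illustration of the quoting rule this issue concerns, the sketch below quotes a value when it contains the separator, the quote character, \return, or \newline. The names `needs-quoting?` and `quote-value` are hypothetical, and this is not the code that shipped in 0.1.1.

```clojure
(require '[clojure.string :as string])

(defn needs-quoting?
  "Hypothetical predicate: true when value contains the separator,
  the quote character, \\return, or \\newline."
  [value separator quote-char]
  (boolean (some #{separator quote-char \return \newline} value)))

(defn quote-value
  "Wraps value in quote-char when needed, doubling embedded quote characters."
  [value separator quote-char]
  (if (needs-quoting? value separator quote-char)
    (str quote-char
         (string/replace value (str quote-char) (str quote-char quote-char))
         quote-char)
    value))

(quote-value "plain" \, \")    ;; => "plain"
(quote-value "a\rb" \, \")     ;; => "\"a\rb\""
```

Including \return in the set is the point of the report: a bare CR inside an unquoted field would otherwise be ambiguous with a record separator.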
[DCSV-1] pom.xml directives Created: 10/Feb/12 Updated: 10/Feb/12 |
|
| Status: | Open |
| Project: | data.csv |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Enhancement | Priority: | Minor |
| Reporter: | Giorgio Valoti | Assignee: | Jonas Enlund |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Apache Maven 3.0.3 (r1075438; 2011-02-28 18:31:09+0100) |
||
| Attachments: |
|
| Patch: | Fixed |
| Description |
|
If you build data.csv alone with the current pom.xml, you get a couple of warnings and the tests are not executed. With recent versions of Maven, these warnings can break the build. A fixed (I hope!) version is attached. |