
[DCSV-7] data.csv does not handle BOMs
Created: 12/Aug/13  Updated: 25/May/17  Resolved: 25/May/17

Status: Resolved
Project: data.csv
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Defect
Priority: Major
Reporter: John Walker
Assignee: Jonas Enlund
Resolution: Declined
Votes: 0
Labels: None
Environment:

Usually Windows (but also Linux)



 Description   

Files produced in Microsoft Land sometimes start with a byte order mark (BOM). data.csv does not handle this edge case, so the first field in the header of a CSV file comes back with \ufeff prepended and no longer matches the expected column name. This can be hard to detect, since \ufeff is usually invisible.

http://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html
http://www.fileformat.info/info/unicode/char/feff/index.htm
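
To make the symptom concrete, here is a minimal sketch that reproduces it from an in-memory string rather than a real file. The sample data and the name first-header are made up for illustration. It assumes the input has already been decoded as UTF-8, so the BOM shows up as the single character \ufeff.

```clojure
(require '[clojure.data.csv :as csv])
(import '(java.io StringReader))

;; Made-up sample input: a BOM followed by an ordinary header and one row.
(def rows
  (csv/read-csv (StringReader. "\ufeffname,age\nalice,30\n")))

;; The first header field still carries the invisible BOM character.
(def first-header (ffirst rows))

(println (= "name" first-header)) ; prints false
(println (count first-header))    ; prints 5, not 4
```

Printing the field usually looks identical to "name", which is why comparing lengths is a quicker check than inspecting the output by eye.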



 Comments   
Comment by Jonas Enlund [ 12/Aug/13 11:46 PM ]

This isn't really a CSV-specific problem. When I've encountered files with a byte order mark, I have simply called (.skip reader 1) before handing the reader over to read-csv. Is this not a good enough solution?
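
A rough sketch of that workaround follows. Note that a bare (.skip reader 1) drops the first character unconditionally, so it would eat a real character if the file has no BOM. The skip-bom helper below is hypothetical (it is not part of data.csv) and only drops the first character when it actually is \ufeff. It assumes the underlying reader decodes the file as UTF-8.

```clojure
(require '[clojure.data.csv :as csv])
(import '(java.io PushbackReader StringReader))

(defn skip-bom
  "Hypothetical helper, not part of data.csv: wraps rdr in a PushbackReader
  and consumes the first character only if it is the BOM (\\ufeff)."
  [rdr]
  (let [pb (PushbackReader. rdr)
        c  (.read pb)]
    ;; Push the character back unless it is the BOM or end of stream (-1).
    (when-not (or (= c 0xFEFF) (= c -1))
      (.unread pb (int c)))
    pb))

;; Made-up sample inputs, with and without a leading BOM.
(def with-bom    (csv/read-csv (skip-bom (StringReader. "\ufeffname,age\nalice,30\n"))))
(def without-bom (csv/read-csv (skip-bom (StringReader. "name,age\nalice,30\n"))))

(println (= with-bom without-bom)) ; prints true
```

For a real file the same helper could wrap the result of clojure.java.io/reader inside with-open, with the rows realized (for example via doall) before the reader is closed.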

Comment by Jonas Enlund [ 25/May/17 1:26 PM ]

Instead of adding support for this, I added some docs on how to achieve it without changing data.csv.

Generated at Sat Oct 21 18:22:34 CDT 2017 using JIRA 4.4#649-r158309.