CSV pain and suffering with punctuation and non-ASCII characters
am 15.12.2009 03:50:56 von Jan SteinmanI am trying to import a database of notable quotations. It is really
quite simple, but I'm finding importing via phpMyAdmin is (IMHO)
erroneous.
The database came from FileMaker Pro 7, and there are no options for
changing the output. There are two choices:
1) tabs between fields (with no opportunity to include the tab
character in a field), and
2) CSV, which double-quotes fields and has commas in-between, which
apparently doubles embedded quotes.
I am using UTF-8 bin as the character set, and as the encoding for all
fields.
The problem is that numerous punctuation characters appear to
terminate the reading of a field, whether imported as a TAB file or as
a CSV file.
For example, importing a quote with a "Context" field of:
The Hitchhiker's Guide to the Galaxy
results in a field containing:
The Hitchhiker
whether I use TAB or CSV. Note that this in an ASCII single quote
character, not something exotic. I have also noted this on fields that
contain "curly quotes."
I am using phpMyAdmin's "CSV using LOAD DATA" option, checking
"Replace table data with file", terminating fields with a comma,
enclosing fields with double quotes, and blanking (or leaving the '\',
it doesn't matter) the "Fields escaped by" field.
It seems to me that using CSV, with double-quoted, comma separated
fields should tell the import process, "Hands off until you see
another double quote!" And that using TAB format, nothing between TAB
characters should be interpreted. Why are characters inside the double
quotes or tabs being interpreted?
I have not tried mysqlimport from the command line, assuming (perhaps
incorrectly) that phpMyAdmin was simply passing things along, and not
interpreting them.
Ah, one light in the tunnel: my "Quote" field was TEXT. Changing it to
BLOB preserves non-ASCII characters, but I still see the strange
behaviour noted above with a single quote, which was in a VARCHAR(256)
UTF8 field.
Thanks for whatever help you can offer!
:::: My pants just went on a wild rampage through a Long Island
Bowling Alley! -- Zippy the Pinhead
:::: Jan Steinman ::::
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org