PostgreSQL Encoding – UTF8
June 9th, 2008 By: james
Over the years I have been involved in projects requiring localization into many different languages. One recurring issue is how the data is encoded when it is stored in the database.
As we support and help various customers with PostgreSQL database issues, we still find users who encode their databases in SQL_ASCII. Perhaps there is a reason for this; if there is, I haven’t figured it out yet.
In many cases this eventually causes problems, and they end up having to switch from SQL_ASCII, or some other encoding, over to UTF8.
There may be more, but I know of two different ways of converting the encoding of the database to UTF8.
One way is to recode a plain-text dump file with iconv, which writes its result to standard output:
iconv -f iso-8859-1 -t utf-8 dump_file > dump_file_recoded
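Here is a minimal sketch of the iconv route, assuming the dump is a plain-text SQL file whose data really is ISO-8859-1. Plain-text dumps written by pg_dump normally contain a SET client_encoding line, so the sketch rewrites that line along with the bytes. The function name recode_dump and the file names are made up for illustration.

```shell
# Hypothetical helper: recode a plain-text dump from ISO-8859-1 to UTF-8
# and make the SET client_encoding line agree with the new bytes.
# Assumes the source data really is ISO-8859-1; adjust -f otherwise.
recode_dump() {
    src=$1
    dst=$2
    iconv -f ISO-8859-1 -t UTF-8 "$src" |
        sed "s/^SET client_encoding = '[^']*';/SET client_encoding = 'UTF8';/" > "$dst"
}

# Example call (file names are placeholders):
# recode_dump dump_file dump_file_recoded
```

If the recoded file still declared the old encoding, the server would convert the already-converted bytes a second time on restore, which is one way to end up with bad data.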
The preferred way, however, is to use the -E option of the pg_dump command, like this:
pg_dump -U postgres -W -E UTF8 -d pg_bench >pgbench.backup
The -W option forces pg_dump to prompt for a password.
For more details on how to use the pg_dump command, here are the docs:
http://www.postgresql.org/docs/8.3/interactive/app-pg_dump.html
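A dump is only half of a conversion, so here is a rough sketch of the full round trip: dump with -E UTF8, create a new UTF8 database, and load the dump into it. The function name dump_and_reload_utf8 and the database and file names are placeholders, and the sketch assumes a plain-text dump and a postgres user that can connect without extra options.

```shell
# Hypothetical helper: dump a database as UTF8 and load it into a new
# UTF8-encoded database. All names passed in are placeholders.
#   1. pg_dump -E UTF8 writes the dump in UTF8 instead of the database encoding.
#   2. createdb -T template0 allows an encoding that differs from template1.
#   3. psql -f replays the plain-text dump into the new database.
dump_and_reload_utf8() {
    src_db=$1
    dst_db=$2
    dump_file=$3

    pg_dump -U postgres -E UTF8 "$src_db" > "$dump_file" &&
        createdb -U postgres -E UTF8 -T template0 "$dst_db" &&
        psql -U postgres -d "$dst_db" -f "$dump_file"
}

# Example call:
# dump_and_reload_utf8 pg_bench pg_bench_utf8 pgbench.backup
```

One caveat: when the source database is SQL_ASCII, PostgreSQL performs no character set conversion, so -E will not transcode the data. In that case the bytes have to be recoded outside the database, for example with iconv.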

November 20th, 2008 at 3:22 am
How should “pg_dump -E UTF-8” know the input charset?
As I understand it, -E is a no-op for SQL_ASCII.
December 1st, 2008 at 5:27 pm
Well, if you don’t specify -E, pg_dump uses the current database encoding. If you specify -E, you can tell it which encoding you want the dump written in.
November 11th, 2011 at 10:39 pm
Watch out for iconv! There’s usually a line in the SQL file that sets the encoding. If you recode the file without updating that line, you will get bad data.