# postgresql: encoding vs. locale

## pactoo

When creating a database, I can choose to set the locale and the encoding. What is the difference between those two settings, and what effect do they have on day-to-day operation?

Anyone with a bit of insight, please?

----------

## YuriyRusinov

Hello!

The locale option sets the default locale for the database cluster; the encoding option sets the character encoding for the default database. An incorrect locale can cause functions such as LIKE to work incorrectly. For example, if the locale uses KOI8-R but the database encoding is UTF8, then LIKE/ILIKE do not work for Russian letters.
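
The mismatch can be imitated outside the database. The following is only a rough sketch in Python, not PostgreSQL's actual code path: the helper `ci_equal` is a made-up name, and it assumes that a case-insensitive comparison interprets the stored bytes using the character set implied by the locale.

```python
def ci_equal(a: str, b: str, storage_encoding: str, locale_charset: str) -> bool:
    """Hypothetical helper: compare two strings case-insensitively the way a
    byte-oriented library might. The text is stored as bytes in
    storage_encoding, then read back using the locale's character set
    before lowercasing."""
    a_seen = a.encode(storage_encoding).decode(locale_charset, errors="replace")
    b_seen = b.encode(storage_encoding).decode(locale_charset, errors="replace")
    return a_seen.lower() == b_seen.lower()


# Cyrillic capital and small "de".
upper, lower = "Д", "д"

# Storage encoding and locale character set agree: the letters match.
matched = ci_equal(upper, lower, "utf-8", "utf-8")

# UTF-8 bytes read through a KOI8-R locale: the bytes no longer look like
# the same letter in two cases, so the comparison fails.
mismatched = ci_equal(upper, lower, "utf-8", "koi8_r")

print(matched, mismatched)
```

With matching settings the two letters compare equal; with UTF-8 storage read through KOI8-R they do not, which is the kind of failure described for LIKE/ILIKE.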

----------

## pactoo

Thanks for your reply, but that does not make it any clearer to me. And from what I gathered, I am not quite sure whether this is correct at all, since, with two exceptions, you are able to choose the locale for each database individually. The same is true for the encoding.

When creating a cluster, you currently only fix LC_COLLATE and LC_CTYPE, which cannot be changed afterwards. The remaining locale categories can be freely defined for each new database.

That does not, however, explain what practical consequences those decisions have. What do I need the locale for, and what do I need the encoding for?

What is the difference if I define a database with locale=de_DE or locale=C (is locale=C the same as specifying --no-locale during initdb?)? 

And if I define, let's say, locale de_DE, how would the database behaviour change if that same localized database was created with either an ISO8859 or a UNICODE encoding?

The only thing I have found out is that LC_COLLATE affects the ordering of (search?) results. Now, if I only use PHP-based web applications, some English, some translated, does this matter at all? And, more importantly, whether it does or not, why?
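
To make the ordering effect concrete, here is a small Python sketch, not PostgreSQL itself. It assumes that a C/POSIX collation compares raw code values, and `rough_dictionary_key` is a made-up approximation of a language-aware collation; a real de_DE collation follows its own rules and may order things differently.

```python
import unicodedata

# "\u00c4" is the precomposed letter A with diaeresis (as in "Äpfel").
words = ["Zebra", "apfel", "\u00c4pfel", "Apfel"]

# Roughly what a C/POSIX collation does: compare raw code values, so all
# ASCII capitals sort before all ASCII lowercase letters, and accented
# letters come last.
c_order = sorted(words)


def rough_dictionary_key(word: str):
    """Hypothetical stand-in for a language-aware collation: ignore accents
    and case first, then fall back to the original spelling as tie-breaker."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), word)


dictionary_order = sorted(words, key=rough_dictionary_key)

print(c_order)           # ['Apfel', 'Zebra', 'apfel', 'Äpfel']
print(dictionary_order)  # ['Apfel', 'apfel', 'Äpfel', 'Zebra']
```

For an application this matters wherever it relies on the database to sort text, for example an `ORDER BY` on a name column shown to users.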

----------

## YuriyRusinov

If you call something like this:

```
initdb --encoding=<your_encoding>  --locale=<your_locale>
```

then the locale becomes the default locale for the database cluster, and the encoding becomes the encoding of the template database. When you create a database you can define an individual encoding for it, but you cannot redefine the locale of the database cluster.

----------

## pactoo

Well, I know that I cannot change the locale for the cluster. But that does not matter here and was never the question. My question was aimed at the actual databases, since that is what my applications are working with.

How do encoding and locale affect the daily operation of a database?

To put it a different way: I am stumbling over the following sentence in the PostgreSQL documentation:

> One way to use multiple encodings safely is to set the locale to C or POSIX during initdb, thus disabling any real locale awareness.

What is the difference, from an application point of view (be it the psql command or some PHP web app), between running a database with --no-locale or with --locale=xx_XX.UTF8?

And what difference would it make, again from the application point of view, if each of the databases mentioned above is stored in either an ISO8859-15 encoded or a UTF8 encoded database?

----------

## YuriyRusinov

 *Quote:*   

> When creating a database, I can choose to set the locale and the encoding

 

Whether you create a database with the createdb script or with the "CREATE DATABASE" command, you can choose only the encoding.

 *Quote:*   

> And if I define, let's say, locale de_DE, how would the database behaviour change if that same localized database was created with either an ISO8859 or a UNICODE encoding?

 

I have not looked into this deeply, but my guess is that database string functions with ISO8859 can produce incorrect results for characters specific to de_DE, because in general these characters cannot be converted into another locale. The behavior of the database, however, depends on the locale of the database cluster.
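
The encoding side can be illustrated without a database. This is a minimal Python sketch of what an encoding decides: which characters can be stored at all, and as which bytes. The helper `can_store` and the sample strings are made up for illustration; the sketch says nothing about how PostgreSQL converts between client and server encodings.

```python
# German sample text: "Grüße €" written with escapes to avoid ambiguity.
text = "Gr\u00fc\u00dfe \u20ac"

utf8_bytes = text.encode("utf-8")         # ü and ß take 2 bytes, € takes 3
latin9_bytes = text.encode("iso8859-15")  # every character takes 1 byte

print(len(text), len(utf8_bytes), len(latin9_bytes))  # 7 11 7


def can_store(value: str, encoding: str) -> bool:
    """Hypothetical helper: True if every character of value exists in the
    given encoding."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


russian = "Привет"
print(can_store(russian, "utf-8"))       # True
print(can_store(russian, "iso8859-15"))  # False
```

So the same German text fits in both encodings with different byte sequences, while text outside the ISO8859-15 repertoire, such as Cyrillic, can only be represented in UTF-8.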

----------

