» Best collation to use in web development with php and mysql

There are very little common things which developer easily skip and choose defaults options. It will create not a problem immedietly but you have to dig your head someday. Recently i faced a same challange with one of my sql query returning wrong results while comparing some hindi phrases.

After hours of mind juggling i finally found this is a problem of mysql collation. A collation is a set of rules that defines how to compare and sort character strings. Each collation in MySQL belongs to a single character set. Every character set has at least one collation, and most have two or more collations. A collation orders characters based on weights.

Now if you see while creating a Mysql Database there is long list of collations available. on php my admin mostly default selected as
utf8_general_ci, which i had selected while developing my web application. utf8_general_ci is a very simple — and on Unicode, very broken — collation, one that gives incorrect results on general Unicode text. So when i start storing content is unicode (hindi) it start giving wrong results. So i switched to utf8_unicode_ci .

utf8_general_ci is somewhat faster than utf8_unicode_ci, but less accurate (for sorting). The specific language utf8 encoding (such as utf8_swedish_ci) contain additional language rules that make them the most accurate to sort for those languages. Most of the time I use utf8_unicode_ci (I prefer accuracy to small performance improvements), unless I have a good reason to prefer a specific language.

6 thoughts on “Best collation to use in web development with php and mysql”

Georges Jean-Denis April 20, 2021 at 6:19 pm

Why not use “utf8mb4_unicode_ci”, the newer “utf8_unicode_ci” ?
I actually agree with you: “I prefer accuracy to small performance improvements”.

https://www.google.com/search?q=utf8+vs+utf8mb4&oq=utf8+vs+utf8mb4&aqs=chrome..69i57.6752j0j7&sourceid=chrome&ie=UTF-8

https://www.google.com/search?q=utf8_unicode_ci+vs+utf8mb4_unicode_ci&sxsrf=ALeKk03dqwj51ZYHOdf0WgwpxB3gxhi_Rg%3A1618942694131&ei=5hp_YK-zB8Gy5NoP49KO0AI&oq=utf8_unicode_ci+vs+utf8mb4&gs_lcp=Cgdnd3Mtd2l6EAEYADIFCAAQywEyBQgAEMsBMgYIABAIEB4yBggAEAgQHjIGCAAQCBAeMgYIABAIEB4yBggAEAgQHjoHCCMQsAMQJzoHCAAQRxCwAzoHCAAQsAMQQzoICAAQBxAeEBM6CggAEAgQBxAeEBNQhVZYsllg5YsBaAJwAngAgAHfAYgBxwOSAQUwLjIuMZgBAKABAqABAaoBB2d3cy13aXrIAQrAAQE&sclient=gws-wiz

Reply ↓
Georges Jean-Denis April 20, 2021 at 6:30 pm

Why not use “utf8mb4_unicode_ci” instead of “utf8_unicode_ci” ?
Because it is newer, you should guess that there is an advantage.

Reply ↓
Georges Jean-Denis April 20, 2021 at 6:30 pm

Why not use “utf8mb4_unicode_ci” instead of “utf8_unicode_ci” ?

Reply ↓
Georges Jean-Denis April 20, 2021 at 6:31 pm

Why not use “macroman_bin” instead of “utf8_unicode_ci” ?

Reply ↓
Georges Jean-Denis April 20, 2021 at 6:33 pm

Excuse me, I’m a newbie at this. But why not use “gbk_bin” instead of “utf8_unicode_ci” ?

Reply ↓
Georges Jean-Denis April 20, 2021 at 6:34 pm

Excuse me, I’m a newbie at this. But why not use “ascii_bin” instead of “utf8_unicode_ci” ?

Reply ↓

Best collation to use in web development with php and mysql

6 thoughts on “Best collation to use in web development with php and mysql”

Leave a Reply Cancel reply

At Innctech, Performance matters...
We are empowered by Innovation, Excellence, Simplicity, and a great measure of confidence. Engineered for Life...

Best collation to use in web development with php and mysql

6 thoughts on “Best collation to use in web development with php and mysql”

Leave a Reply Cancel reply

At Innctech, Performance matters... We are empowered by Innovation, Excellence, Simplicity, and a great measure of confidence. Engineered for Life...

At Innctech, Performance matters...
We are empowered by Innovation, Excellence, Simplicity, and a great measure of confidence. Engineered for Life...