TiDB 6.0 新特性解读丨 Collation 规则
对数据库而言,合适的字符集和规则能够大大提升使用者运维和分析的效率。TiDB 从 v4.0 开始支持新 collation 规则,并于 TiDB 6.0 版本再次进行了更新。本文将深入解读 Collation 规则在 TiDB 6.0 中的变更和应用。
Ps. 本文已入选 TiDB 6.0 Book Rush,如果你也想分享 6.0 的使用体验,并获得丰厚奖励,欢迎加入我们!
引
默认启用新 Collation 规则
TiDB 从 v4.0 开始支持新 collation 规则,在大小写不敏感、口音不敏感、padding 规则上与 MySQL 行为保持一致。
TiDB?[(none)]?18:45:27>?select?*?from?mysql.tidb?where?variable_name?=?'new_collation_enabled';
+-----------------------+----------------+----------------------------------------------------+
|?VARIABLE_NAME?????????|?VARIABLE_VALUE?|?COMMENT????????????????????????????????????????????|
+-----------------------+----------------+----------------------------------------------------+
|?new_collation_enabled?|?True???????????|?If?the?new?collations?are?enabled.?Do?not?edit?it.?|
+-----------------------+----------------+----------------------------------------------------+
1?row?in?set?(0.003?sec)
TiDB?[(none)]?18:45:51>?select?version()\G
***************************?1.?row?***************************
version():?5.7.25-TiDB-v6.0.0
1?row?in?set?(0.001?sec)
TiDB?[(none)]?18:46:00>?SELECT?*?FROM?information_schema.collations;
+--------------------+--------------------+------+------------+-------------+---------+
|?COLLATION_NAME?????|?CHARACTER_SET_NAME?|?ID???|?IS_DEFAULT?|?IS_COMPILED?|?SORTLEN?|
+--------------------+--------------------+------+------------+-------------+---------+
|?ascii_bin??????????|?ascii??????????????|???65?|?Yes????????|?Yes?????????|???????1?|
|?binary?????????????|?binary?????????????|???63?|?Yes????????|?Yes?????????|???????1?|
|?gbk_bin????????????|?gbk????????????????|???87?|????????????|?Yes?????????|???????1?|?(#28645)
|?gbk_chinese_ci?????|?gbk????????????????|???28?|?Yes????????|?Yes?????????|???????1?|?(#28645)
|?latin1_bin?????????|?latin1?????????????|???47?|?Yes????????|?Yes?????????|???????1?|
|?utf8_bin???????????|?utf8???????????????|???83?|?Yes????????|?Yes?????????|???????1?|
|?utf8_general_ci????|?utf8???????????????|???33?|????????????|?Yes?????????|???????1?|
|?utf8_unicode_ci????|?utf8???????????????|??192?|????????????|?Yes?????????|???????1?|?(#18678)
|?utf8mb4_bin????????|?utf8mb4????????????|???46?|?Yes????????|?Yes?????????|???????1?|
|?utf8mb4_general_ci?|?utf8mb4????????????|???45?|????????????|?Yes?????????|???????1?|
|?utf8mb4_unicode_ci?|?utf8mb4????????????|??224?|????????????|?Yes?????????|???????1?|?(#18678)
+--------------------+--------------------+------+------------+-------------+---------+
11?rows?in?set?(0.001?sec)
TiDB?[test]?19:05:14>?SHOW?CHARACTER?SET;
+---------+-------------------------------------+-------------------+--------+
|?Charset?|?Description?????????????????????????|?Default?collation?|?Maxlen?|
+---------+-------------------------------------+-------------------+--------+
|?ascii???|?US?ASCII????????????????????????????|?ascii_bin?????????|??????1?|
|?binary??|?binary??????????????????????????????|?binary????????????|??????1?|
|?gbk?????|?Chinese?Internal?Code?Specification?|?gbk_chinese_ci????|??????2?|
|?latin1??|?Latin1??????????????????????????????|?latin1_bin????????|??????1?|
|?utf8????|?UTF-8?Unicode???????????????????????|?utf8_bin??????????|??????3?|
|?utf8mb4?|?UTF-8?Unicode???????????????????????|?utf8mb4_bin???????|??????4?|
+---------+-------------------------------------+-------------------+--------+
6?rows?in?set?(0.001?sec)
TiDB-v5.4?[test]?10:17:22>?select?version()\G
***************************?1.?row?***************************
version():?5.7.25-TiDB-v5.4.0
1?row?in?set?(0.001?sec)
TiDB-v5.4?[test]?10:19:39>?SELECT?*?FROM?information_schema.collations;
+----------------+--------------------+------+------------+-------------+---------+
|?COLLATION_NAME?|?CHARACTER_SET_NAME?|?ID???|?IS_DEFAULT?|?IS_COMPILED?|?SORTLEN?|
+----------------+--------------------+------+------------+-------------+---------+
|?utf8mb4_bin????|?utf8mb4????????????|???46?|?Yes????????|?Yes?????????|???????1?|
|?latin1_bin?????|?latin1?????????????|???47?|?Yes????????|?Yes?????????|???????1?|
|?binary?????????|?binary?????????????|???63?|?Yes????????|?Yes?????????|???????1?|
|?ascii_bin??????|?ascii??????????????|???65?|?Yes????????|?Yes?????????|???????1?|
|?utf8_bin???????|?utf8???????????????|???83?|?Yes????????|?Yes?????????|???????1?|
|?gbk_bin????????|?gbk????????????????|???87?|?Yes????????|?Yes?????????|???????1?|
+----------------+--------------------+------+------------+-------------+---------+
6?rows?in?set?(0.001?sec)
新 Collation 注意事项
service_configs:
??tidb:
????new_collations_enabled_on_first_bootstrap:?true
Collation Bug 修复
DROP?TABLE?IF?EXISTS?t1;
CREATE?TABLE?t1?(
c1?char(20)?CHARACTER?SET?utf8?COLLATE?utf8_bin,
c2?char(20)?CHARACTER?SET?utf8mb4?COLLATE?utf8mb4_bin);
INSERT?INTO?t1?VALUES?('UUtJeaV','snRXXCZHBPW');
SET?names?utf8mb4?collate?utf8mb4_bin;
SELECT?greatest(?c1,?c2?)?as?expr1?FROM?t1;
SELECT?least(?c1,?c2?)?as?expr1?FROM?t1;
SET?names?utf8mb4?collate?utf8mb4_general_ci;
SELECT?greatest(?c1,?c2?)?as?expr1?FROM?t1;
SELECT?least(?c1,?c2?)?as?expr1?FROM?t1;

//?The?collation?of?JSON?is?always?utf8mb4_bin?in?builtin-func?which?is?same?as?MySQL
//?see?details?https://github.com/pingcap/tidb/issues/31320#issuecomment-1010599311
if?isJSON?{
???dstCharset,?dstCollation?=?charset.CharsetUTF8MB4,?charset.CollationUTF8MB4
}
DROP?TABLE?IF?EXISTS?t2;
CREATE?TABLE?t2?(c1?json);
INSERT?INTO?t2?VALUES?('{\"测试\":?\"你好\"}');
SELECT?collation(c1),?collation(upper(c1)),?collation(elt(1,?c1,?0x12))?FROM?t2;


新增内置函数 CHARSET()
TiDB?[test]?23:54:42>?select?version()\G
***************************?1.?row?***************************
version():?5.7.25-TiDB-v6.0.0
1?row?in?set?(0.001?sec)
TiDB?[test]?00:03:51>?set?names?utf8mb4;
Query?OK,?0?rows?affected?(0.000?sec)
TiDB?[test]?00:03:58>?select?charset(1);
+------------+
|?charset(1)?|
+------------+
|?binary?????|
+------------+
1?row?in?set?(0.001?sec)
TiDB?[test]?00:04:03>?select?charset('1');
+--------------+
|?charset('1')?|
+--------------+
|?utf8mb4??????|
+--------------+
1?row?in?set?(0.001?sec)
文档拾遗

小结
本文对 Collation 特性在 TiDB 6.0 中的变更进行了汇总阐释及举例说明,当前 TiDB 所提供的几种 Collation 已经可以支撑大部分业务场景和大部分系统迁移需求。建议在项目设计之初,就选用普适类型的字符集和规则,毕竟效率是提升生产力的重要因素之一。
活动预告
新发布的 TiDB 6.0 大幅增强了 TiDB 作为企业级产品的可管理性,具备更成熟的 HTAP 与容灾能力,加入多项云原生数据库所需的基础特性。抢先试用 TiDB 6.0、TiFlash、TiEM 和 PingCAP Clinic 等多款新品与服务,参与 Book Rush 贡献,将有机会享受“TiDB 荣誉体验官”在内的多项权益!
扫描下方二维码了解详情

关注公众号:拾黑(shiheibook)了解更多
[广告]赞助链接:
四季很好,只要有你,文娱排行榜:https://www.yaopaiming.com/
让资讯触达的更精准有趣:https://www.0xu.cn/
关注网络尖刀微信公众号随时掌握互联网精彩
- 1 习近平将发表二〇二六年新年贺词 7904141
- 2 2026年国补政策来了 7808738
- 3 东部战区:开火!开火!全部命中! 7712893
- 4 2026年这些民生政策将惠及百姓 7616985
- 5 小学食堂米线过期2.5小时被罚5万 7519709
- 6 解放军喊话驱离台军 原声曝光 7428214
- 7 为博流量直播踩烈士陵墓?绝不姑息 7327605
- 8 每月最高800元!多地发放养老消费券 7238391
- 9 数字人民币升级 1月1日起将计付利息 7141831
- 10 2026年1月1日起 一批新规将施行 7040675








PingCAP
