doubi9615 2012-08-14 13:41
浏览 103

UTF-8数据存在重复的主键问题

We have a table called site_tags with a primary key made up of 2 fields - id and tag.

Some of the tag fields are in UTF-8 which we want to convert to ISO-8859-1.

In this table there are 2 tags Seany and seány and we want to convert the latter from UTF-8.

When we try to do this using...

DELETE FROM site_tags WHERE id = '1325133476' AND tag = 'seány'

INSERT INTO site_tags (id, tag, active) VALUES ('1325133476', 'seány', '0')

MySQL gives the error on the insert:

Duplicate entry '1325133476-seány' for key 'PRIMARY' 

This seems because it is finding the Seany tag so thinks it's a duplicate, MySQL is not being character set sensitive (even though the delete worked on the correct record).

The database is using the latin1_swedish_ci collation and MySQL is v5.1 (InnoDB)

Any advice on how we can acheive this?

  • 写回答

2条回答 默认 最新

  • dongwo5449 2012-08-14 14:01
    关注

    I would expect an UPDATE statement to work, but I might be missing something. (I don't have to deal with character sets very much.) Alter your foreign keys to cascade updates if you need to. Untested . . .

    update site_tags 
    set tag = 'seány'
    where id = '1325133476' AND tag = 'seány';
    

    But I think the safer approach might be to create a new column that has the right encoding, update it through MySQL's convert() function, then fix up the keys. (Create a new column, create a unique constraint on {id, new_column}--that's in addition to the existing constraints--update the new column, etc.)

    update site_tags
    set new_column = convert(tag using latin2)
    where new_column is null;
    

    I'm not sure latin2 is right; MySQL docs describe it as "ISO 8859-2 Central European".

    If you still get duplicate key errors, I'd expect to find that two different values in UTF8 map to a single value in ISO-8859-1.

    评论

报告相同问题?

悬赏问题

  • ¥15 装 pytorch 的时候出了好多问题,遇到这种情况怎么处理?
  • ¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
  • ¥15 手机接入宽带网线,如何释放宽带全部速度
  • ¥30 关于#r语言#的问题:如何对R语言中mfgarch包中构建的garch-midas模型进行样本内长期波动率预测和样本外长期波动率预测
  • ¥15 ETLCloud 处理json多层级问题
  • ¥15 matlab中使用gurobi时报错
  • ¥15 这个主板怎么能扩出一两个sata口
  • ¥15 不是,这到底错哪儿了😭
  • ¥15 2020长安杯与连接网探
  • ¥15 关于#matlab#的问题:在模糊控制器中选出线路信息,在simulink中根据线路信息生成速度时间目标曲线(初速度为20m/s,15秒后减为0的速度时间图像)我想问线路信息是什么