dongyan4424 2014-07-20 02:06
浏览 29
已采纳

Mysql - 删除表中的重复条目取决于几个列

I want to delete all duplicate entries in my mysql database.

I found a lot of solutions in the web but in every case the query worked only when you are looking up for 1 column.

In my case I need a query for more than 1 row which will be executed as far as possible - the table is 500MB big with more than 6 Million entries.

My Table (exampe)

id  name status   email
----------------------------
x   Mark  1      Mark@trash-mail.com
x   Anna  1      Anna@trash-mail.com
x   Mark  0      Mark@trash-mail.com
x   Mark  1      Mark@trash-mail.com

What should now happen ? I need a query like this:

DELETE * FROM my_table WHERE (name == name, status == status, email == email)

When executing this query the table has to look like this:

id  name status   email
----------------------------
x   Mark  1      Mark@trash-mail.com
x   Anna  1      Anna@trash-mail.com
x   Mark  0      Mark@trash-mail.com

The last Mark-Entry was deleted because the name, status and email field was equal to another. In some cases there are 20 or more with the same data which has to be deleted so that there is just one left.

Currently I have a bad solution with a PHP-Script .. I am going to each row, looking for the same data and delete it. It works like it should but its too slow ... I think maybe 5.000 Items per Hour ...

You know any solution to solve this in a better way ?

  • 写回答

1条回答 默认 最新

  • duananyu9231 2014-07-20 02:08
    关注

    My suggestion would be to use the method where you truncate the table and reinsert the data. Something like:

    create temporary table temp as
        select min(id) as id, name, status, email
        from mytable
        group by name, status, email;
    
    truncate table mytable;
    
    insert into mytable(id, name, status, email)
        select id, name, status, email
        from temp;
    

    Doing the deleting in place can be quite expensive. If you decide to do so, I would create the same temporary table with an index on id, and then use:

    delete m
        from mytable m left join
             temp
             on m.id = temp.id
        where temp.id is null;
    

    You can use a limit clause (such as limit 10000) to run this in batches.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 keil里为什么main.c定义的函数在it.c调用不了
  • ¥50 切换TabTip键盘的输入法
  • ¥15 可否在不同线程中调用封装数据库操作的类
  • ¥15 微带串馈天线阵列每个阵元宽度计算
  • ¥15 keil的map文件中Image component sizes各项意思
  • ¥20 求个正点原子stm32f407开发版的贪吃蛇游戏
  • ¥15 划分vlan后,链路不通了?
  • ¥20 求各位懂行的人,注册表能不能看到usb使用得具体信息,干了什么,传输了什么数据
  • ¥15 Vue3 大型图片数据拖动排序
  • ¥15 Centos / PETGEM