douyabu1528 2013-07-23 22:22

MySQL - Query a column for duplicates and return both the original and the duplicate rows

I have a table that I use to store some systematically chosen "serial numbers" for each product that is bought...

The problem is, a CSV was uploaded that I believe contained some duplicate "serial numbers", which means that when the application tries to modify a row, it may not be modifying the correct one.

I need to be able to query the database and get all rows whose serial_number value appears more than once. It should look something like this:

ID, serial_number, meta1, meta2, meta3
3, 123456, 0, 2, 4
55, 123456, 0, 0, 0
6, 345678, 0, 1, 2
99, 345678, 0, 1, 2

So as you can see, I need to be able to see both the original row and the duplicate row, with all of their columns of data. This is so I can compare them and determine what data is now inconsistent.

2 answers

  • dozabg1616 2013-07-23 22:34

    Some versions of MySQL implement IN with a subquery very inefficiently. A safe alternative is a join:

    select t.*
    from t join
         (select serial_number, count(*) as cnt
          from t
          group by serial_number
         ) tsum
         on tsum.serial_number = t.serial_number and tsum.cnt > 1
    order by t.serial_number;
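
To try the join version without touching the real table, here is a minimal sketch that runs it against the sample rows from the question. It makes several assumptions: Python's built-in sqlite3 stands in for MySQL, the table is named t as in the query, the column types are guessed, and the row with id 7 is made up to show that unique serial numbers are filtered out. `t.id` is added to the ORDER BY only to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer primary key, serial_number text, "
    "meta1 integer, meta2 integer, meta3 integer)"
)

# Sample rows from the question, plus one invented non-duplicated row (id 7).
conn.executemany(
    "insert into t values (?, ?, ?, ?, ?)",
    [
        (3, "123456", 0, 2, 4),
        (55, "123456", 0, 0, 0),
        (6, "345678", 0, 1, 2),
        (99, "345678", 0, 1, 2),
        (7, "999999", 1, 1, 1),
    ],
)

# Join each row to the per-serial_number counts and keep counts above 1.
join_query = """
select t.*
from t join
     (select serial_number, count(*) as cnt
      from t
      group by serial_number
     ) tsum
     on tsum.serial_number = t.serial_number and tsum.cnt > 1
order by t.serial_number, t.id
"""

join_result = conn.execute(join_query).fetchall()
for row in join_result:
    print(row)
```

Only the four rows that share a serial number come back; the invented row with id 7 does not.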
    

    Another alternative is to use an exists clause:

    select t.*
    from t
    where exists (select * from t t2 where t2.serial_number = t.serial_number and t2.id <> t.id)
    order by t.serial_number;
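
The exists version can be checked the same way. Again a sketch under assumptions: sqlite3 in place of MySQL, a table named t with guessed column types, and an invented row (id 7) whose serial number is unique. A row is returned only when some other row, one with a different id, carries the same serial_number.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer primary key, serial_number text, "
    "meta1 integer, meta2 integer, meta3 integer)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?, ?)",
    [
        (3, "123456", 0, 2, 4),
        (55, "123456", 0, 0, 0),
        (6, "345678", 0, 1, 2),
        (99, "345678", 0, 1, 2),
        (7, "999999", 1, 1, 1),  # invented row with a unique serial number
    ],
)

# Keep a row only if another row (different id) has the same serial_number.
exists_query = """
select t.*
from t
where exists (select * from t t2
              where t2.serial_number = t.serial_number and t2.id <> t.id)
order by t.serial_number, t.id
"""

exists_result = conn.execute(exists_query).fetchall()
duplicate_ids = [row[0] for row in exists_result]
print(duplicate_ids)  # [3, 55, 6, 99]
```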
    

    Both these queries (as well as the one proposed by @fthiella) are standard SQL. Both would benefit from an index on (serial_number, id).
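
For the index, a rough sketch: the `create index` statement below uses the column list suggested above, and the same statement text should also be accepted by MySQL. The index name idx_t_serial_id is made up, the table definition is guessed, and sqlite3 again stands in for MySQL, so the printed query plan only illustrates the idea and says nothing about what MySQL's optimizer will actually choose.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer primary key, serial_number text, "
    "meta1 integer, meta2 integer, meta3 integer)"
)

# Hypothetical index name; the columns are the ones suggested in the answer.
conn.execute("create index idx_t_serial_id on t (serial_number, id)")

# List the indexes SQLite now knows about for table t.
index_names = [row[1] for row in conn.execute("pragma index_list('t')")]
print(index_names)

exists_query = """
select t.*
from t
where exists (select * from t t2
              where t2.serial_number = t.serial_number and t2.id <> t.id)
order by t.serial_number
"""

# Print SQLite's plan; the last column of each row is the readable detail.
for step in conn.execute("explain query plan " + exists_query):
    print(step[-1])
```

On MySQL itself, running EXPLAIN on either query before and after creating the index is the way to see whether it is picked up.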

    Selected by the asker as the best answer.