dongpan2788 2013-06-27 18:14
浏览 72
已采纳

删除具有blob text / mediumtext mysql的重复行

I have seen lots of posts on deleting rows using sql commands but i need to filter out rows which have mediumtext.

I keep getting an error Error Code: 1170. BLOB/TEXT column used in key specification without a key length from solution such as:

ALTER IGNORE TABLE foobar ADD UNIQUE (title, SID)


My table is simple, i need to check for duplicates in mytext, id is unique and they are AUTO_INCREMENT.
As a note, the table has about a million rows, and all attempts keep timing out. I would need a solution that performs actions in batches such as WHERE id>0 AND id<100
Also I am using MySQL Workbench on amazons RDS

From a table like this

+---+-----+-----+------+-------+
|id |fname|lname|mytext|morevar|
|---|-----|-----|------|-------|
| 1 | joe | min | abc  | 123   |
| 2 | joe | min | abc  | 123   |
| 3 | mar | kam | def  | 789   |
| 4 | kel | smi | ghi  | 456   |
+------------------------------+ 

I would like to end up with a table like this

+---+-----+-----+------+-------+
|id |fname|lname|mytext|morevar|
|---|-----|-----|------|-------|
| 1 | joe | min | abc  | 123   |
| 3 | mar | kam | def  | 789   |
| 4 | kel | smi | ghi  | 456   |
+------------------------------+    

update forgot to mention this is on amazon RDS using mysql workbench
my table is very large and i keep getting an error Error Code: 1205. Lock wait timeout exceeded from this sql command:

DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name

Also, if anyone else is having issues with MySQL workbench timing out the fix is
Go to Preferences -> SQL Editor and set to a bigger value this parameter: DBMS connection read time out (in seconds)

  • 写回答

2条回答 默认 最新

  • dow57588 2013-06-28 08:57
    关注

    OPTION #1: Delete all duplicates records leaving one of each (e.g. the one with max(id))

    DELETE
    FROM yourTable
    WHERE id NOT IN
    (
       SELECT MAX(id)
       FROM yourTable
       GROUP BY mytext
    )
    

    You could prefer using min(id).

    Depending on the engine used, this won't work and, as it did, give you the Error Code: 1093. You can't specify target table 'yourTable' for update in FROM clause. Why? Because deleting one record may cause something to happen which made the WHERE condition FALSE, i.e. max(id) changes the value.

    In this case, you could try using another subquery as a temporary table:

    DELETE
    FROM yourTable
    WHERE id NOT IN
    (
        SELECT MAXID FROM
        (
            SELECT MAX(id) as MAXID
            FROM yourTable
            GROUP BY mytext
        ) as temp_table
    )
    

    OPTION #2: Use a temporary table like in this example or:

    First, create a temp table with the max ids:

       SELECT MAX(id) AS MAXID
       INTO tmpTable
       FROM yourTable
       GROUP BY mytext;
    

    Then execute the delete:

    DELETE
    FROM yourTable
    WHERE id NOT IN
    (
        SELECT MAXID FROM tmpTable
    );
    
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 如何在scanpy上做差异基因和通路富集?
  • ¥20 关于#硬件工程#的问题,请各位专家解答!
  • ¥15 关于#matlab#的问题:期望的系统闭环传递函数为G(s)=wn^2/s^2+2¢wn+wn^2阻尼系数¢=0.707,使系统具有较小的超调量
  • ¥15 FLUENT如何实现在堆积颗粒的上表面加载高斯热源
  • ¥30 截图中的mathematics程序转换成matlab
  • ¥15 动力学代码报错,维度不匹配
  • ¥15 Power query添加列问题
  • ¥50 Kubernetes&Fission&Eleasticsearch
  • ¥15 報錯:Person is not mapped,如何解決?
  • ¥15 c++头文件不能识别CDialog