perhaps? 2009-08-21 17:04 采纳率: 100%
浏览 466
已采纳

检索每个组中的最后一条记录 -MySQL

There is a table messages that contains data as shown below:

Id   Name   Other_Columns
-------------------------
1    A       A_data_1
2    A       A_data_2
3    A       A_data_3
4    B       B_data_1
5    B       B_data_2
6    C       C_data_1

If I run a query select * from messages group by name, I will get the result as:

1    A       A_data_1
4    B       B_data_1
6    C       C_data_1

What query will return the following result?

3    A       A_data_3
5    B       B_data_2
6    C       C_data_1

That is, the last record in each group should be returned.

At present, this is the query that I use:

SELECT
  *
FROM (SELECT
  *
FROM messages
ORDER BY id DESC) AS x
GROUP BY name

But this looks highly inefficient. Any other ways to achieve the same result?

转载于:https://stackoverflow.com/questions/1313120/retrieving-the-last-record-in-each-group-mysql

  • 写回答

21条回答 默认 最新

  • elliott.david 2009-08-21 17:39
    关注

    MySQL 8.0 now supports windowing functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:

    WITH ranked_messages AS (
      SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
      FROM messages AS m
    )
    SELECT * FROM ranked_messages WHERE rn = 1;
    

    Below is the original answer I wrote for this question in 2009:


    I write the solution this way:

    SELECT m1.*
    FROM messages m1 LEFT JOIN messages m2
     ON (m1.name = m2.name AND m1.id < m2.id)
    WHERE m2.id IS NULL;
    

    Regarding performance, one solution or the other can be better, depending on the nature of your data. So you should test both queries and use the one that is better at performance given your database.

    For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.

    I'll write a query to find the most recent post for a given user ID (mine).

    First using the technique shown by @Eric with the GROUP BY in a subquery:

    SELECT p1.postid
    FROM Posts p1
    INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
                FROM Posts pi GROUP BY pi.owneruserid) p2
      ON (p1.postid = p2.maxpostid)
    WHERE p1.owneruserid = 20860;
    
    1 row in set (1 min 17.89 sec)
    

    Even the EXPLAIN analysis takes over 16 seconds:

    +----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
    | id | select_type | table      | type   | possible_keys              | key         | key_len | ref          | rows    | Extra       |
    +----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
    |  1 | PRIMARY     | <derived2> | ALL    | NULL                       | NULL        | NULL    | NULL         |   76756 |             | 
    |  1 | PRIMARY     | p1         | eq_ref | PRIMARY,PostId,OwnerUserId | PRIMARY     | 8       | p2.maxpostid |       1 | Using where | 
    |  2 | DERIVED     | pi         | index  | NULL                       | OwnerUserId | 8       | NULL         | 1151268 | Using index | 
    +----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
    3 rows in set (16.09 sec)
    

    Now produce the same query result using my technique with LEFT JOIN:

    SELECT p1.postid
    FROM Posts p1 LEFT JOIN posts p2
      ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
    WHERE p2.postid IS NULL AND p1.owneruserid = 20860;
    
    1 row in set (0.28 sec)
    

    The EXPLAIN analysis shows that both tables are able to use their indexes:

    +----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
    | id | select_type | table | type | possible_keys              | key         | key_len | ref   | rows | Extra                                |
    +----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
    |  1 | SIMPLE      | p1    | ref  | OwnerUserId                | OwnerUserId | 8       | const | 1384 | Using index                          | 
    |  1 | SIMPLE      | p2    | ref  | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8       | const | 1384 | Using where; Using index; Not exists | 
    +----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
    2 rows in set (0.00 sec)
    

    Here's the DDL for my Posts table:

    CREATE TABLE `posts` (
      `PostId` bigint(20) unsigned NOT NULL auto_increment,
      `PostTypeId` bigint(20) unsigned NOT NULL,
      `AcceptedAnswerId` bigint(20) unsigned default NULL,
      `ParentId` bigint(20) unsigned default NULL,
      `CreationDate` datetime NOT NULL,
      `Score` int(11) NOT NULL default '0',
      `ViewCount` int(11) NOT NULL default '0',
      `Body` text NOT NULL,
      `OwnerUserId` bigint(20) unsigned NOT NULL,
      `OwnerDisplayName` varchar(40) default NULL,
      `LastEditorUserId` bigint(20) unsigned default NULL,
      `LastEditDate` datetime default NULL,
      `LastActivityDate` datetime default NULL,
      `Title` varchar(250) NOT NULL default '',
      `Tags` varchar(150) NOT NULL default '',
      `AnswerCount` int(11) NOT NULL default '0',
      `CommentCount` int(11) NOT NULL default '0',
      `FavoriteCount` int(11) NOT NULL default '0',
      `ClosedDate` datetime default NULL,
      PRIMARY KEY  (`PostId`),
      UNIQUE KEY `PostId` (`PostId`),
      KEY `PostTypeId` (`PostTypeId`),
      KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
      KEY `OwnerUserId` (`OwnerUserId`),
      KEY `LastEditorUserId` (`LastEditorUserId`),
      KEY `ParentId` (`ParentId`),
      CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
    ) ENGINE=InnoDB;
    
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(20条)

报告相同问题?

悬赏问题

  • ¥15 如何在scanpy上做差异基因和通路富集?
  • ¥20 关于#硬件工程#的问题,请各位专家解答!
  • ¥15 关于#matlab#的问题:期望的系统闭环传递函数为G(s)=wn^2/s^2+2¢wn+wn^2阻尼系数¢=0.707,使系统具有较小的超调量
  • ¥15 FLUENT如何实现在堆积颗粒的上表面加载高斯热源
  • ¥30 截图中的mathematics程序转换成matlab
  • ¥15 动力学代码报错,维度不匹配
  • ¥15 Power query添加列问题
  • ¥50 Kubernetes&Fission&Eleasticsearch
  • ¥15 報錯:Person is not mapped,如何解決?
  • ¥15 c++头文件不能识别CDialog