dtzd65908 2015-02-16 18:06
86 views
Accepted

MySQL: fetch the most recently active users in batches, but prevent duplicate users in the next batch

I'm trying to extract users contributing to a specific topic in a message board.

Each request gets a batch of 10 Unique users.

The problem is that if some users were part of a previous batch, they can appear in the next batch too.

SELECT p.post_id AS id, p.author AS uid, a.name
FROM posts p
INNER JOIN users a ON a.id = p.author
WHERE p.topic_id = __TOPIC_ID__
  AND p.post_id < __OFFSET_POST_ID__
GROUP BY p.author
ORDER BY MAX(p.post_id) DESC
LIMIT 10

My question is: how can I prevent these possible duplicates, or at least get the lowest post_id?

Let's assume a single topic with 100 contributing users and 50000 posts written by them where only one of the first posts was made by the third user.

With a LIMIT of 10 it would be possible to get all 100 users in 10 queries. But this is not the way the above query works:

If posts 10000 through 50000 were made by only ten users, my AJAX queries would return those same users over and over for many, many requests. And even worse:

I could throw away all of those requests, because every time they would contain only duplicates.

What would be the "best" option to reduce the amount of queries?

One possible solution would be to query the next 10 users but return the lowest matching post_id rather than the MAX() id as above. This would reduce the number of requests a bit in some cases, but only in some cases.
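The cursor idea above can be made fully duplicate-free by paging on the aggregated post id itself (keyset pagination with `HAVING MAX(p.post_id) < :cursor`) instead of filtering raw posts. Below is a minimal, runnable sketch of that variant; the schema and the seed data are hypothetical, and SQLite stands in for MySQL purely for demonstration:

```python
import sqlite3

# Hypothetical minimal schema mirroring the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, author INTEGER, topic_id INTEGER);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 26)])
# 500 posts in topic 1, spread round-robin over the 25 users.
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                 [(p, (p % 25) + 1, 1) for p in range(1, 501)])

# Keyset query: the cursor is the aggregated MAX(post_id) of the last
# user in the previous batch, so a user can never appear twice.
QUERY = """
SELECT u.id, MAX(p.post_id) AS last_post
FROM users u
JOIN posts p ON p.author = u.id
WHERE p.topic_id = ?
GROUP BY u.id
HAVING MAX(p.post_id) < ?
ORDER BY last_post DESC
LIMIT 10
"""

def batch(cursor):
    rows = conn.execute(QUERY, (1, cursor)).fetchall()
    next_cursor = rows[-1][1] if rows else None  # smallest last_post in batch
    return [r[0] for r in rows], next_cursor

users1, cur = batch(10**9)  # first request: cursor above any post_id
users2, _ = batch(cur)      # next request reuses the returned cursor
```

Unlike the `p.post_id < __OFFSET_POST_ID__` filter on raw posts, this never re-counts a user: each user occurs exactly once in the grouped result, and the cursor moves strictly downward through that grouped set.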

Another way would be to add:

AND p.author NOT IN( list of all uids queried before )

But I guess this would make the problem even worse, e.g.:

SELECT * FROM X WHERE author_id NOT IN(1..to..4000000)...

1 answer

  • douhuijun3776 2015-02-16 18:28

    You're iterating over posts, not users, while you need to iterate over users. I think this might do the trick:

    SELECT u.id, u.name, max(p.post_id)
    FROM users u
    INNER JOIN posts p ON p.author = u.id
    WHERE p.topic_id = :topic_id
    GROUP BY u.id
    ORDER BY max(p.post_id) DESC
    LIMIT 10 OFFSET :offset;
    

    As you can see, I group on users.id (the primary key), not on posts.author, which is not a primary/unique key but just a foreign key to users. You get duplicates exactly because you group on posts.author.
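    The answer's query can be paged by simply increasing the OFFSET by 10 per request; since each user occurs exactly once in the grouped result, batches cannot overlap. A minimal, runnable demonstration (hypothetical seed data; SQLite standing in for MySQL):

    ```python
    import sqlite3

    # Hypothetical minimal schema mirroring the question's tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, author INTEGER, topic_id INTEGER);
    """)
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(i, f"user{i}") for i in range(1, 26)])
    # 500 posts in topic 1, spread round-robin over the 25 users.
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                     [(p, (p % 25) + 1, 1) for p in range(1, 501)])

    # The accepted answer's query, parameterized on topic and offset.
    QUERY = """
    SELECT u.id, u.name, MAX(p.post_id) AS last_post
    FROM users u
    INNER JOIN posts p ON p.author = u.id
    WHERE p.topic_id = ?
    GROUP BY u.id
    ORDER BY MAX(p.post_id) DESC
    LIMIT 10 OFFSET ?
    """

    def batch(offset):
        return [row[0] for row in conn.execute(QUERY, (1, offset))]

    first, second = batch(0), batch(10)   # two consecutive requests
    ```

    One caveat with OFFSET paging: if users post between two requests, their ordering position can shift and a row may be skipped or repeated across batches; a cursor on the aggregated MAX(post_id) avoids that, at the cost of a slightly more complex query.
    
    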

    This answer was selected as the best answer by the asker.
