dtzd65908 2015-02-16 18:06
Viewed 86 times
Accepted

MySQL: fetch the most recently active users in batches, while preventing duplicate users in the next batch

I'm trying to extract users contributing to a specific topic in a message board.

Each request gets a batch of 10 Unique users.

The problem is that users who were part of a previous batch can appear in the next batch too.

SELECT p.post_id AS id, p.author AS uid, a.name
FROM posts p
INNER JOIN users a
   ON a.id = p.author
  AND p.topic_id = __TOPIC_ID__
  AND p.post_id < __OFFSET_POST_ID__
GROUP BY p.author
ORDER BY MAX(p.post_id) DESC
LIMIT 10

My question is how I can prevent those possible duplicates, or at least get the lowest post_id.

Let's assume a single topic with 100 contributing users and 50,000 posts, where only one of the very first posts was made by the third user.

With a LIMIT of 10 it would be possible to get all 100 users in 10 queries. But that is not how the above query works:

If posts 10,000 through 50,000 were made by only ten users, my AJAX requests would return those same ten users over and over, for many, many requests. And even worse: I could throw all of those requests away, because every one of them would contain nothing but duplicates.

What would be the "best" option to reduce the amount of queries?

One possible solution would be to query the next 10 users but use the lowest matching post_id as the cursor instead of the MAX() id as above. That would reduce the number of requests somewhat, but only in some cases.

Another way would be to use a:

AND p.author NOT IN( list of all uids queried before )

But this would make the problem even worse, I guess. Something like:

SELECT * FROM X WHERE author_id NOT IN(1..to..4000000)...
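To make the failure mode concrete, here is a minimal, self-contained sketch (using SQLite instead of MySQL, with invented sample data: topic id 7, 21 posts, 5 users, batch size 2 are all arbitrary). User 3 wrote only the very first post, while users 1, 2, 4 and 5 wrote everything afterwards; cursoring on post_id keeps returning the same users:

```python
import sqlite3

# Hypothetical sample data: topic 7 has 21 posts. User 3 wrote only the
# very first post; users 1, 2, 4 and 5 wrote everything afterwards.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (post_id INTEGER PRIMARY KEY, author INTEGER, topic_id INTEGER)")
rows = [(1, 3, 7)] + [(pid, [1, 2, 4, 5][(pid - 2) % 4], 7)
                      for pid in range(2, 22)]
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)

def batch(cursor_post_id, limit=2):
    """One 'page': users active before cursor_post_id, newest first."""
    return conn.execute(
        """SELECT p.author, MAX(p.post_id) AS last_post
           FROM posts p
           WHERE p.topic_id = 7 AND p.post_id < ?
           GROUP BY p.author
           ORDER BY last_post DESC
           LIMIT ?""",
        (cursor_post_id, limit)).fetchall()

cursor = 10**9  # start after the newest post
pages = []
for _ in range(3):
    page = batch(cursor)
    pages.append([uid for uid, _ in page])
    cursor = min(last for _, last in page)  # next request's __OFFSET_POST_ID__

print(pages)  # [[5, 4], [2, 1], [5, 4]] -- users 5 and 4 come back again
```

User 3, whose only post is post_id 1, would not surface until every later cursor window had been paged through, which is exactly the redundancy described above.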

1 answer

  • douhuijun3776 2015-02-16 18:28

    You're iterating over posts, but you need to iterate over users. I think this might do the trick:

    SELECT u.id, u.name, max(p.post_id)
    FROM users u
    INNER JOIN posts p ON p.author = u.id
    WHERE p.topic_id = :topic_id
    GROUP BY u.id
    ORDER BY max(p.post_id) DESC
    LIMIT 10 OFFSET :offset;
    

    As you can see, I group on users.id (the primary key) and not on posts.author, which is not a primary/unique key but just a foreign key to users. You get duplicates exactly because you group on posts.author.
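    As a sanity check, here is a minimal sketch of this answer's approach (SQLite instead of MySQL, with invented sample data; the topic id 7 and batch size 2 are arbitrary). Paging with LIMIT/OFFSET over the grouped users yields disjoint batches, and the rarely-posting user simply comes last:

    ```python
    import sqlite3

    # Hypothetical sample data: user 3 wrote only the very first post in
    # topic 7; users 1, 2, 4 and 5 wrote the other 20 posts.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, author INTEGER, topic_id INTEGER);
    """)
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(i, f"user{i}") for i in range(1, 6)])
    rows = [(1, 3, 7)] + [(pid, [1, 2, 4, 5][(pid - 2) % 4], 7)
                          for pid in range(2, 22)]
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)

    def user_batch(offset, limit=2):
        """One page of distinct users, most recently active first."""
        return [row[0] for row in conn.execute(
            """SELECT u.id, u.name, MAX(p.post_id)
               FROM users u
               INNER JOIN posts p ON p.author = u.id
               WHERE p.topic_id = 7
               GROUP BY u.id
               ORDER BY MAX(p.post_id) DESC
               LIMIT ? OFFSET ?""",
            (limit, offset))]

    batches = [user_batch(0), user_batch(2), user_batch(4)]
    print(batches)  # [[5, 4], [2, 1], [3]] -- no user appears twice
    ```

    One caveat worth noting: LIMIT/OFFSET pagination can still repeat or skip a user if new posts arrive between requests and reshuffle the ordering, so for a very active topic the offsets are only stable as long as the posts table is.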

    This answer was accepted by the asker as the best answer.
