duandou1903 2016-02-11 04:40
浏览 308
已采纳

LEFT JOIN中的COUNT返回重复值

I've the following tables (example):

users:

id | user | photo   | joined | country
1  | Igor | abc.jpg | 2015   | Brazil
2  | John | cga.png | 2014   | USA
3  | Lucas| hes.jpg | 2016   | Japan

posts (see that there are two lines with author = Igor and ft = 2 and one line with author = Igor and ft = 3 and Igor have three posts):

id | author | content | date | ft (2 = photos and 3 = videos)
1  | Igor   | hi      | 2016 | 2
2  | Igor   | hello   | 2016 | 3
3  | John   | hehehe  | 2016 | 2
4  | Igor   | huhuhuh | 2016 | 2
5  | Lucas  | lol     | 2016 | 3

friendship (when status = 2 means that they are friends):

id | friend1 | friend2 | status
1  | Igor    | Lucas   | 2
2  | Lucas   | John    | 2
3  | John    | Igor    | 2

And I want to do a COUNT of posts with ft = 2 and a COUNT of friends (status = 2) according to the currently logged user (Igor, in this case).

So, I do (assuming that the current user logged in is Igor):

SELECT photo, joined, country, sum(CASE WHEN ft = 2 THEN 1 ELSE 0 END) AS numPhotos, sum(CASE WHEN ft = 3 THEN 1 ELSE 0 END) AS numVideos
FROM users
LEFT JOIN posts
ON users.user = posts.author
WHERE users.user = 'Igor'
GROUP BY users.user
LIMIT 1

And when I check on a foreach, the data is correct: numPhotos = 2 and numVideos = 1.

But, I want to select too the number of friends, so, I do:

SELECT photo, joined, country, sum(CASE WHEN ft = 2 THEN 1 ELSE 0 END) AS numPhotos, sum(CASE WHEN ft = 3 THEN 1 ELSE 0 END) AS numVideos, count(friendship.status) AS numFriends
FROM users
LEFT JOIN posts
ON users.user = posts.author
LEFT JOIN friendship
ON (users.user = friend1 OR users.user = friend2) AND friendship.status = 2
WHERE users.user = 'Igor'
GROUP BY users.user
LIMIT 1

But, the output is: numPhotos = 4, numVideos = 2 and numFriends = 6.

In other words, he is duplicating all results but in numFriends he's taking the total of posts of Igor (3) and duplicating the value too. And if I change count(friendship.status) to sum(friendship.status) the output is: numPhotos = 4, numVideos = 2 and numFriends = 18 (triples the numFriends).

I tried too with count(distinct friendship.status) and the result is: numPhotos = 4, numVideos = 2 and numFriends = 1 (duplicates the values again as well as return the wrong value 1 for numFriends that should be 2 knowing he has two friends).

So, how I can do this? (I'm using MySQL)

EDIT:

I changed the count(distinct friendship.status) to count(distinct friendship.id) and it worked to select the number of friends. But the rest of values (numPhotos and numVideos) continue duplicated.

I discovered that the problem is in ON (users.user = friend1 OR users.user = friend2), because if I leave only ON (users.user = friend1) or ON (users.user = friend2) the output isn't duplicated. I tried too with ON 'Igor' IN (friend1, friend2) but the result is the same (numPhotosandnumVideos` continue duplicated).

  • 写回答

2条回答 默认 最新

  • duanheye7423 2016-02-14 00:32
    关注

    I think the left join may be joining on a one-to-many relationship, which is causing inflated counts. Since you are only retrieving the counts for 1 user, I suggest using a subquery to retrieve the friendship counts (for retrieving the counts for multiple users, a derived table may be faster than a subquery):

    SELECT 
        sum(ft = 2) AS numPhotos, 
        sum(ft = 3) AS numVideos,
        (select count(*) from friendships f
        where (friend1 = users.user 
        or friend2 = users.user)
        and status = 2) as friendship_count
    FROM users
    LEFT JOIN posts
    ON users.user = posts.author
    WHERE users.user = 'Igor'
    

    Note that I removed the group by because users.user is already in the where clause, which means there is only 1 group.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 Vue3 大型图片数据拖动排序
  • ¥15 划分vlan后不通了
  • ¥15 GDI处理通道视频时总是带有白色锯齿
  • ¥20 用雷电模拟器安装百达屋apk一直闪退
  • ¥15 算能科技20240506咨询(拒绝大模型回答)
  • ¥15 自适应 AR 模型 参数估计Matlab程序
  • ¥100 角动量包络面如何用MATLAB绘制
  • ¥15 merge函数占用内存过大
  • ¥15 使用EMD去噪处理RML2016数据集时候的原理
  • ¥15 神经网络预测均方误差很小 但是图像上看着差别太大