dongpeng7744 2014-09-11 02:32 采纳率: 100%
浏览 38
已采纳

MySQL问题与组合的SELECT语句

I have two primary SELECT statements that return the desired results when run individually, but when combined, don't return the desired result.

Query 1

This works fine, returning the expected result.

SELECT feed_mode_id FROM user WHERE id=2;
+--------------+
| feed_mode_id |
+--------------+
|            1 |
+--------------+

Query 2

This is also fine. Sometimes the result will be empty, sometimes not.

SELECT
        answer.id AS answer_id
    FROM
        answer
    WHERE
        answer.question_id = (
            SELECT
                question.id
            FROM
                question
            ORDER BY
                datetime_added_utc DESC
            LIMIT 1
        )
        AND answer.user_id = 2;

Empty set (0.00 sec)

Query 1 and 2 combined

When combining these into two sub-SELECT statements as shown below, feed_mode_id is NULL, but the result for x.feed_mode_id should be as shown in Query 1. This is my lack of understanding as to how these kind of combined statements work.

SELECT
    x.feed_mode_id,
    IF (COUNT(y.answer_id) < 1, 0, 1) AS answered_question
FROM
    (SELECT
        user.feed_mode_id
    FROM
        user
    WHERE
        user.id = 2) AS x,
    (SELECT
        answer.id AS answer_id
    FROM
        answer
    WHERE
        answer.question_id = (
            SELECT
                question.id
            FROM
                question
            ORDER BY
                datetime_added_utc DESC
            LIMIT 1
        )
        AND answer.user_id = 2) AS y

+--------------+-------------------+
| feed_mode_id | answered_question |
+--------------+-------------------+
|         NULL |                 0 |
+--------------+-------------------+

Why is feed_mode_id producing NULL and not 1? I'm open to different approaches to re-writing the query altogether as well. The desired result would be:

+--------------+-------------------+
| feed_mode_id | answered_question |
+--------------+-------------------+
|            1 |                 0 |
+--------------+-------------------+

This is somehow related to the fact that Query 2's result is empty for this case. For cases where Query 2 returns a value (not empty) then the combined query works as desired.

  • 写回答

3条回答 默认 最新

  • drwo2014 2014-09-11 03:45
    关注

    You have a Cartesian product between x and y. As long as each of those rowsources return only one row, the query will return a single row.

    I recommend you ditch the old-school comma syntax for the join operation, and use the JOIN keyword instead.

    Also, it's not clear why you need inline views and subqueries, apart from the subquery that returns that "most recently added" question.

    I'm not entirely clear what resultset you are actually attempting to return, but it looks like you are determining whether a particular user (uniquely identified by the "id" column of the "user" table), has provided an "answer" to the "most recently added question".

    If that's the result you are trying to return, I believe this query will return that result:

    SELECT u.feed_mode_id
         , IF( COUNT(a.id) < 1, 0, 1) AS answered_question
      FROM ( SELECT q.id
               FROM question q
              ORDER BY q.datetime_added_utc DESC, q.id DESC
              LIMIT 1
           ) r
      JOIN user u
        ON u.id = 2
      LEFT
      JOIN answer a
        ON a.user_id = u.id
       AND a.question_id = r.id
     GROUP BY u.id
    

    NOTES: The inline view aliased as r return the id of the "most recently added" question. (In the original query, if two or more question have the same datetime_added_utc, it's indeterminate which row will be returned. This query makes it determinate by adding another expression to the ORDER BY clause. (The inline view query could be pulled out and run separately, to verify it's returning the expected result.)

    The row returned from r (if there is one) is then joined to a row retrieved from the "user" table u. We're assuming here that the id column in the "user" table is a unique identifier, and likely the primary key of the "user" table.

    If we have a "most recently added" question (i.e. a row from r), and there's a row from "user" that matches the u.id=2 predicate in the ON clause, then so far, we are guaranteed that the query is going to return a single row.

    Next, we perform an "outer join" operation, to find matching rows from the "answer" table. The predicates in the ON clause restrict the rows returned to only those that have a user_id matching the id from u (in this example, equivalent to specifying a.user_id=2, and having a question_id that matches the id of the "most recently added" question r.

    The LEFT keyword identifies this as an "outer join"; if there are no matching rows from the "answer" table, then the query will still return the row from r and u. (If this were an inner join, that is, if we removed the LEFT keyword, then if there were no matching rows from the "answer" table, then the query would not return a row.)

    We add a GROUP BY u.id clause, in case we get more than one matching row from answer; the GROUP BY causes all rows that have the same value of u.id to be collapsed into a single row.

    The COUNT() aggregate counts the number of non-null occurrences of the id from the "answer" table. If there were no matching rows found, then a.id will be NULL, so COUNT(a.id) will return 0.


    This same query would also work if we were looking for multipler users, if we specified multiple values for user.id to match, e.g.

     ON u.id IN (2,3,5,7)
    

    or also if we left that predicate off entirely, so we got a row back for every user. This query would still work.

    But in either of those cases, we'd also want to add u.id AS user_id to the SELECT list of the query, so we would know which row was for which user.

    If we wanted to return the two most recently added questions, we could change the LIMIT clause in r, and then add r.id to the GROUP BY clause. Again, we'd then likely also want to return r.id AS question_id in the SELECT list, so we knew which row was for which question.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(2条)

报告相同问题?

悬赏问题

  • ¥15 stm32代码移植没反应
  • ¥15 matlab基于pde算法图像修复,为什么只能对示例图像有效
  • ¥100 连续两帧图像高速减法
  • ¥15 组策略中的计算机配置策略无法下发
  • ¥15 如何绘制动力学系统的相图
  • ¥15 对接wps接口实现获取元数据
  • ¥20 给自己本科IT专业毕业的妹m找个实习工作
  • ¥15 用友U8:向一个无法连接的网络尝试了一个套接字操作,如何解决?
  • ¥30 我的代码按理说完成了模型的搭建、训练、验证测试等工作(标签-网络|关键词-变化检测)
  • ¥50 mac mini外接显示器 画质字体模糊