unlbaise asked on 2014.12.10 10:54

How should I write this MySQL query?

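I am developing a market-research application. There is a survey table, responses (about one million rows), that holds the raw survey data. Its columns include survey_id (survey type), response_no (respondent), interview_date (survey date), question_label (question), value (answer), section_unique_id (department), and others.
Each row holds one respondent's answer to one question. A respondent answers 30 questions per interview, so each interview produces 30 rows.
There is also a formula table (40 rows) holding the analysis formulas that are applied to the survey results; each formula is an SQL statement.
Applying the formulas produces a results table, results.
The problem now is to write SQL statements that count how many respondents gave a combination of answers like the following:
1. Answered question Q1 with 1, 8 or 9
and
2. Answered question Q2 with 1, 8 or 9
and
...
Ideally the query should use GROUP BY on the section.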

Take the following example.
The formula is specified like this:
((Q2A = 1 OR Q2A = 8 OR Q2A = 9) AND (Q2B = 1 OR Q2B = 8 OR Q2B = 9) AND (Q2C = 1 OR Q2C = 8 OR Q2C = 9) AND (Q2D = 1 OR Q2D = 8 OR Q2D = 9) AND (Q2E = 1 OR Q2E = 8 OR Q2E = 9) AND (Q2F = 1 OR Q2F = 8 OR Q2F = 9) AND (Q2G = 1 OR Q2G = 8 OR Q2G = 9) AND (Q2H = 1 OR Q2H = 8 OR Q2H = 9) AND (Q2I = 1 OR Q2I = 8 OR Q2I = 9) AND (Q5 = 1 OR Q5 = 8 OR Q5 = 9) AND (Q6 = 1 OR Q6 = 8 OR Q6 = 9))

This is the MySQL statement I have written so far:
SELECT section_unique_id as "section_unique_id", COUNT(*) as "hit" FROM responses WHERE
question_label = "Q2A" AND value IN (1,8,9)
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2B" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2C" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2D" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2E" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2F" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2G" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2H" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q2I" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q5" AND value IN (1,8,9))
AND
(response_no, survey_id, interview_date) IN (SELECT DISTINCT response_no, survey_id, interview_date FROM responses WHERE question_label = "Q6" AND value IN (1,8,9))

A single run took 12 seconds, which is too slow.
Could a MySQL expert tell me whether there is a way to speed this up?

4 answers

purcjame   2014.12.10 11:41

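You want to count how many respondents gave a combination of answers like the following:
1. Answered question Q1 with 1, 8 or 9
and
2. Answered question Q2 with 1, 8 or 9
and
...
Do you mean counting over all of the data, for example how many people chose an answer in the range (1, 8 or 9) for every one of Q1 ... Q20?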

purcjame   2014.12.10 11:46

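Let me check my understanding: one surveyed customer answers 30 questions, and if all 30 answers fall in the range (1, 8, 9), the count goes up by 1?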

unlbaise   2014.12.10 11:50

Yes. Out of the 30 questions each respondent answers, 8 are selected; if the answers to all 8 of them are 1, 8 or 9, the count goes up by 1.

unlbaise   2014.12.10 12:03

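For example, respondent A answered 30 questions, of which the 8 questions Q2A through Q2H are the important ones. If the answers to all 8 are 1, 8 or 9, the count goes up by 1.
After surveying 10000 people this way, we want to see how many of them meet this condition.
Because these 10000 people are associated with different departments, we want to use GROUP BY section_unique_id to get the number of respondents who meet the condition in each department.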
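
One way the per-department count described above might be written without a chain of IN subqueries is a single pass with GROUP BY and HAVING: keep only the rows whose question is in the formula and whose answer is 1, 8 or 9, group them per respondent, and keep the groups that cover every question. The sketch below runs that query against an in-memory SQLite database through Python's sqlite3 module so it can be tried in isolation. The query text is plain SQL and should be usable in MySQL as well, but it has not been timed against the one-million-row table from the question. It assumes each respondent has at most one row per question and a single section_unique_id per interview. The sample rows, the shortened question list, and the index name idx_label_value are made up for illustration.

```python
import sqlite3

# Shortened for the demo; the formula in the question lists 11 labels
# (Q2A..Q2I, Q5, Q6).
QUESTIONS = ["Q2A", "Q2B", "Q2C"]
ACCEPTED = (1, 8, 9)


def hits_per_section(conn, questions, accepted):
    """Count, per section, the respondents whose answers to ALL of the
    given questions fall in the accepted set."""
    q_marks = ",".join("?" * len(questions))
    v_marks = ",".join("?" * len(accepted))
    sql = f"""
        SELECT section_unique_id, COUNT(*) AS hit
        FROM (
            -- one group per respondent interview, built only from matching rows
            SELECT section_unique_id, response_no, survey_id, interview_date
            FROM responses
            WHERE question_label IN ({q_marks})
              AND value IN ({v_marks})
            GROUP BY section_unique_id, response_no, survey_id, interview_date
            -- keep the group only if every listed question matched
            HAVING COUNT(DISTINCT question_label) = ?
        ) AS matched
        GROUP BY section_unique_id
        ORDER BY section_unique_id
    """
    params = [*questions, *accepted, len(questions)]
    return conn.execute(sql, params).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE responses (
               survey_id INTEGER, response_no INTEGER, interview_date TEXT,
               question_label TEXT, value INTEGER, section_unique_id TEXT)"""
    )
    # Hypothetical index; whether it helps on the real table is untested.
    conn.execute(
        "CREATE INDEX idx_label_value ON responses (question_label, value)"
    )
    rows = [
        # respondent 1, section S1: all three answers accepted
        (1, 1, "2014-12-01", "Q2A", 1, "S1"),
        (1, 1, "2014-12-01", "Q2B", 8, "S1"),
        (1, 1, "2014-12-01", "Q2C", 9, "S1"),
        # respondent 2, section S1: Q2B answer is outside (1, 8, 9)
        (1, 2, "2014-12-01", "Q2A", 1, "S1"),
        (1, 2, "2014-12-01", "Q2B", 2, "S1"),
        (1, 2, "2014-12-01", "Q2C", 9, "S1"),
        # respondent 3, section S2: all three answers accepted
        (1, 3, "2014-12-02", "Q2A", 9, "S2"),
        (1, 3, "2014-12-02", "Q2B", 9, "S2"),
        (1, 3, "2014-12-02", "Q2C", 1, "S2"),
        # respondent 4, section S2: no row for Q2C
        (1, 4, "2014-12-02", "Q2A", 8, "S2"),
        (1, 4, "2014-12-02", "Q2B", 8, "S2"),
    ]
    conn.executemany("INSERT INTO responses VALUES (?,?,?,?,?,?)", rows)
    return hits_per_section(conn, QUESTIONS, ACCEPTED)


if __name__ == "__main__":
    print(demo())
```

Note that this counts respondents, while the original statement counts matching Q2A rows; the two agree only under the one-row-per-question assumption stated above. If different questions in a formula accept different answer sets, the WHERE clause would need one (question_label, value) condition per question joined with OR instead of the two IN lists.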
