dou44481 2017-10-27 15:44


I want to show the percentage of each value (0, 1, 2) in the field_couleurs_badge_value column, based on the timestamps in the revision_timestamp column, per day, per week and per month, counting only rows between 08:00 and 19:00 and excluding weekends.

An example of the result that I want is:

Yesterday value 0 has 50%
          value 1 has 30%
          value 2 has 20%

Last week value 0 has 40%
          value 1 has 40%
          value 2 has 20%

Last month value 0 has 30%
           value 1 has 30%
           value 2 has 40%

I have a MySQL table with three relevant columns (I added the dots for clarity):

revision_id, field_couleurs_badge_value, revision_timestamp
3724......., 0........................., 1491581853
4025......., 1........................., 1506418602
4026......., 2........................., 1506523134
4028......., 1........................., 1508232519
4029......., 2........................., 1508232886

Here is the code to create the table:

CREATE TABLE `badge_stats` (
`id` INT NOT NULL,  -- assumed type; this column was missing from the posted statement
`entity_id` VARCHAR(45) NULL,
`revision_id` VARCHAR(45) NULL,
`field_couleurs_badge_value` VARCHAR(45) NULL,
`revision_timestamp` VARCHAR(45) NULL,
PRIMARY KEY (`id`));
INSERT INTO badge_stats VALUES('1', '3491', '3724', '1', '1491581853');
INSERT INTO badge_stats VALUES('2', '3491', '4026', '2', '1506523134');
INSERT INTO badge_stats VALUES('3', '3491', '4028', '1', '1508232519');
INSERT INTO badge_stats VALUES('4', '3491', '4029', '0', '1508232886');

What's the best way to do it?

Here is my code to display the values with PHP:

// $conn is assumed to be an open mysqli connection
$sql = "SELECT id, entity_id, revision_id, field_couleurs_badge_value, revision_timestamp FROM badge_stats";
$result = $conn->query($sql);

// Create table to display values
echo "<table border='1'>";
echo "<tr>";
echo "<th>id</th><th>entity_id</th><th>revision_id</th><th>field_couleurs_badge_value</th><th>revision_timestamp</th>";
echo "</tr>";

if ($result && $result->num_rows > 0) {

    // Output one table row per result row
    while ($row = $result->fetch_assoc()) {

        echo "<tr>";
        echo "<td>" . $row["id"] . "</td>";
        echo "<td>" . $row["entity_id"] . "</td>";
        echo "<td>" . $row["revision_id"] . "</td>";
        echo "<td>" . $row["field_couleurs_badge_value"] . "</td>";
        // revision_timestamp is stored as VARCHAR, so cast it to int before formatting
        echo "<td>" . date("Y-m-d H:i:s", (int) $row["revision_timestamp"]) . "</td>";
        echo "</tr>";
    }
} else {

    echo "Error !";
}

echo "</table>";


1 answer

  • dongqichang7988 2017-10-27 18:34

    To get percentages, we first need some counts, and from those we can calculate the percentages. A query of this form would get us the result we are after:

      SELECT v.cnt_zero / v.cnt_all * 100.0    AS pct_zero
           , v.cnt_one  / v.cnt_all * 100.0    AS pct_one
           , v.cnt_two  / v.cnt_all * 100.0    AS pct_two
        FROM (
             ) v

    To get the counts, that looks like a job for conditional aggregation.

    We can use an expression to check a condition, and return a 1 or 0.

      SELECT IF( t.field_couleurs_badge_value = '0'            ,1,0)
           , IF( t.field_couleurs_badge_value = '1'            ,1,0)
           , IF( t.field_couleurs_badge_value = '2'            ,1,0)
           , IF( t.field_couleurs_badge_value IN ('0','1','2') ,1,0)
        FROM badge_stats t
       WHERE ... 

    And we can use SUM aggregate on those expressions to add up the 1s and 0s.

    If we want the "all" count to cover every row, whatever the value in the field_couleurs_badge_value column (including rows that don't have a '0', '1' or '2' in that column), we could do that as well.
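
    As a cross-check outside the database, here is a minimal Python sketch of the same conditional aggregation. It is not part of the MySQL solution; the list `values` and the helper `pct` are names made up for this illustration, and the data is taken from the INSERT statements in the question.

```python
# Values of field_couleurs_badge_value from the question's INSERT statements
# (kept as strings, like the VARCHAR column).
values = ["1", "2", "1", "0"]

# Each comparison yields True/False, which sum() adds up as 1/0.
# This mirrors SUM(IF(condition, 1, 0)) in the query.
cnt_zero = sum(v == "0" for v in values)
cnt_one = sum(v == "1" for v in values)
cnt_two = sum(v == "2" for v in values)
cnt_012 = sum(v in ("0", "1", "2") for v in values)
cnt_all = len(values)


def pct(part, whole):
    """Return part/whole as a percentage, or None when whole is 0."""
    return None if whole == 0 else part / whole * 100.0


percentages = {
    "pct_zero": pct(cnt_zero, cnt_012),
    "pct_one": pct(cnt_one, cnt_012),
    "pct_two": pct(cnt_two, cnt_012),
}
print(percentages)
```

    Returning None for an empty denominator is a choice made for this sketch; it is meant to resemble the NULL a SELECT gives when dividing by zero.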

    For the timestamp value, we can convert that into a MySQL TIMESTAMP or DATETIME datatype, and use the convenient built-in functions.

    The FROM_UNIXTIME function will convert a unix-style timestamp value (integer seconds since the Unix epoch, 1970-01-01 00:00:00 UTC) into a DATETIME in the current MySQL session time zone (SHOW VARIABLES LIKE 'time_zone'):

      FROM_UNIXTIME( 1491581853 )
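
    To see why the session time zone matters, the conversion can be reproduced in Python. This is only an illustrative sketch: `from_unixtime` is a hypothetical helper named after the MySQL function, and UTC and UTC+2 are arbitrary zones picked for the example.

```python
from datetime import datetime, timedelta, timezone


def from_unixtime(ts, tz=timezone.utc):
    """Convert unix seconds (int or numeric string) to an aware datetime in tz."""
    return datetime.fromtimestamp(int(ts), tz)


# The column is VARCHAR, so the helper accepts strings as well as ints.
in_utc = from_unixtime("1491581853")
in_utc_plus_2 = from_unixtime("1491581853", timezone(timedelta(hours=2)))

print(in_utc.isoformat())         # 2017-04-07T16:17:33+00:00
print(in_utc_plus_2.isoformat())  # 2017-04-07T18:17:33+02:00
```

    The same instant lands on a different wall-clock hour in each zone, so the "08:00 to 19:00" check depends on which zone the session uses.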

    We can check if the time is "between 08:00 and 19:00". We'll take this to mean that the check should be inclusive of '08:00:00.000' up through and including '19:00:00.000'.

    Normally when we work with time ranges, we avoid overlaps by doing checks >= start and < end, so that we never have a time on a boundary that gets included in two separate ranges. For example, does 19:00:00 belong in the range 8am-7pm or in the range 7pm-11pm? (When we're counting rows in ranges, we typically don't want to count a row in two distinct ranges.)

    We can use the HOUR function or the DATE_FORMAT function to extract the hours, or the time. My preference would be to pull the whole time as hh:mm:ss:

      DATE_FORMAT( FROM_UNIXTIME(1491581853), '%T')

    The string returned by that can be compared to literals to see if the time falls into a time range:

         DATE_FORMAT( FROM_UNIXTIME(1491581853), '%T') >= '08:00:00'
     AND DATE_FORMAT( FROM_UNIXTIME(1491581853), '%T') <= '19:00:00'

    Similarly, to "exclude the weekends", we can use the WEEKDAY function (0 = Monday ... 6 = Sunday) or the DATE_FORMAT function to extract the day of the week:

     AND WEEKDAY( FROM_UNIXTIME(1491581853) ) < 5
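
    The two filters (time of day and weekday) can be checked against known timestamps with a small Python sketch. `in_reporting_window` is a hypothetical helper, and UTC is assumed here in place of the MySQL session time zone. Python's `weekday()` uses the same numbering as MySQL's WEEKDAY (Monday is 0).

```python
from datetime import datetime, time, timezone


def in_reporting_window(ts, tz=timezone.utc):
    """True if ts falls on Monday-Friday between 08:00:00 and 19:00:00 inclusive."""
    dt = datetime.fromtimestamp(int(ts), tz)
    in_hours = time(8, 0, 0) <= dt.time() <= time(19, 0, 0)
    is_weekday = dt.weekday() < 5  # 0 = Monday ... 6 = Sunday
    return in_hours and is_weekday


# 1491581853 is Friday 2017-04-07 16:17:33 UTC, so it passes both checks.
print(in_reporting_window("1491581853"))
```

    Both boundaries are inclusive, matching the >= '08:00:00' and <= '19:00:00' comparisons above.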

    Putting all of that together, we get something like this:

      SELECT v.cnt_zero / v.cnt_012 * 100.0    AS pct_zero
           , v.cnt_one  / v.cnt_012 * 100.0    AS pct_one
           , v.cnt_two  / v.cnt_012 * 100.0    AS pct_two
           , v.cnt_012  / v.cnt_all * 100.0    AS pct_012
        FROM (
               SELECT SUM(IF( t.field_couleurs_badge_value = '0'            ,1,0)) AS cnt_zero
                    , SUM(IF( t.field_couleurs_badge_value = '1'            ,1,0)) AS cnt_one
                    , SUM(IF( t.field_couleurs_badge_value = '2'            ,1,0)) AS cnt_two
                    , SUM(IF( t.field_couleurs_badge_value IN ('0','1','2') ,1,0)) AS cnt_012
                    , SUM(1)                                                       AS cnt_all
                 FROM badge_stats t
                WHERE DATE_FORMAT(FROM_UNIXTIME( t.revision_timestamp +0),'%T') >= '08:00:00'
                  AND DATE_FORMAT(FROM_UNIXTIME( t.revision_timestamp +0),'%T') <= '19:00:00'
                  AND WEEKDAY(    FROM_UNIXTIME( t.revision_timestamp +0)     ) < 5
             ) v

    The specification is a little unclear about what we are supposed to do with rows that have values other than '0', '1' and '2', that is, whether those are supposed to be included in the count of "all". As the specification is refined, we can tweak the query.

    To get the percentages for various date ranges, those conditions can be included in the conditionals in the aggregate expressions.

    First we need to have expressions that return us the begin and end of the datetime ranges we want to check. These definitions may not suit your use case, but as an example...

     SELECT DATE(NOW()) - INTERVAL WEEKDAY(DATE(NOW()))+8 DAY AS blw  -- begin last week
          , DATE(NOW()) - INTERVAL WEEKDAY(DATE(NOW()))+1 DAY AS elw  -- end last week (same as begin this week)
          , DATE_FORMAT(NOW(),'%Y-%m-01') - INTERVAL 1 MONTH  AS blm  -- begin last month
          , DATE_FORMAT(NOW(),'%Y-%m-01') - INTERVAL 0 MONTH  AS elm  -- end last month (same as begin this month)

    We can test those expressions (using a user-defined variable in place of NOW(), so we can test the behavior with various datetimes, e.g. a Sunday, the first of the month, etc.) and tweak the expressions until we get the behavior we want.

      blw         elw         blm         elm
      ----------  ----------  ----------  ----------
      2017-10-15  2017-10-22  2017-09-01  2017-10-01  
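
    The boundary expressions can also be tried out with arbitrary dates in Python. The sketch below mirrors the four SQL expressions above; `period_bounds` is a hypothetical helper, and the date 2017-10-27 is an assumption chosen because it reproduces the sample output.

```python
from datetime import date, timedelta


def period_bounds(today):
    """Begin/end dates of last week and last month, following the SQL expressions."""
    blw = today - timedelta(days=today.weekday() + 8)  # begin last week
    elw = today - timedelta(days=today.weekday() + 1)  # end last week
    elm = today.replace(day=1)                         # end last month
    blm = (elm - timedelta(days=1)).replace(day=1)     # begin last month
    return {"blw": blw, "elw": elw, "blm": blm, "elm": elm}


for name, value in period_bounds(date(2017, 10, 27)).items():
    print(name, value.isoformat())
```

    Trying a Sunday such as 2017-10-22 gives 2017-10-08 to 2017-10-15 for "last week", which is the kind of edge case worth checking against what you actually want.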

    We can use a query like that as an inline view, so we can reference the values by the assigned alias. Given that the datetimes in the table are unix-style timestamps (why are these stored as character, and not int?), we can convert MySQL DATETIME/TIMESTAMP into unix-style timestamp integers with UNIX_TIMESTAMP function.

    Shortening the column names for the purposes of this demonstration, revision_timestamp to ts, and field_couleurs_badge_value to fcbv, something like this:

    SELECT v.lw_cnt_zero / v.lw_cnt_012 * 100.0    AS lw_pct_zero
         , v.lw_cnt_one  / v.lw_cnt_012 * 100.0    AS lw_pct_one
         , v.lw_cnt_two  / v.lw_cnt_012 * 100.0    AS lw_pct_two
         , v.lm_cnt_zero / v.lm_cnt_012 * 100.0    AS lm_pct_zero
         , v.lm_cnt_one  / v.lm_cnt_012 * 100.0    AS lm_pct_one
         , v.lm_cnt_two  / v.lm_cnt_012 * 100.0    AS lm_pct_two
      FROM (               -- last week  
            SELECT SUM(IF( t.ts+0 >= d.blw AND t.ts+0 < d.elw AND t.fcbv = '0'           ,1,0)) AS lw_cnt_zero
                 , SUM(IF( t.ts+0 >= d.blw AND t.ts+0 < d.elw AND t.fcbv = '1'           ,1,0)) AS lw_cnt_one
                 , SUM(IF( t.ts+0 >= d.blw AND t.ts+0 < d.elw AND t.fcbv = '2'           ,1,0)) AS lw_cnt_two
                 , SUM(IF( t.ts+0 >= d.blw AND t.ts+0 < d.elw AND t.fcbv IN ('0','1','2'),1,0)) AS lw_cnt_012
                           -- last month 
                 , SUM(IF( t.ts+0 >= d.blm AND t.ts+0 < d.elm AND t.fcbv = '0'           ,1,0)) AS lm_cnt_zero
                 , SUM(IF( t.ts+0 >= d.blm AND t.ts+0 < d.elm AND t.fcbv = '1'           ,1,0)) AS lm_cnt_one
                 , SUM(IF( t.ts+0 >= d.blm AND t.ts+0 < d.elm AND t.fcbv = '2'           ,1,0)) AS lm_cnt_two
                 , SUM(IF( t.ts+0 >= d.blm AND t.ts+0 < d.elm AND t.fcbv IN ('0','1','2'),1,0)) AS lm_cnt_012
              FROM ( SELECT UNIX_TIMESTAMP(DATE(NOW()) - INTERVAL WEEKDAY(DATE(NOW()))+8 DAY) AS blw -- begin last week
                          , UNIX_TIMESTAMP(DATE(NOW()) - INTERVAL WEEKDAY(DATE(NOW()))+1 DAY) AS elw -- end last week
                          , UNIX_TIMESTAMP(DATE_FORMAT(NOW(),'%Y-%m-01') - INTERVAL 1 MONTH)  AS blm -- begin last month
                          , UNIX_TIMESTAMP(DATE_FORMAT(NOW(),'%Y-%m-01') - INTERVAL 0 MONTH)  AS elm -- end last month
                   ) d
              JOIN badge_stats t
                ON t.ts+0 >= d.blm
             WHERE DATE_FORMAT(FROM_UNIXTIME( t.ts +0),'%T') >= '08:00:00'
               AND DATE_FORMAT(FROM_UNIXTIME( t.ts +0),'%T') <= '19:00:00'
               AND WEEKDAY(    FROM_UNIXTIME( t.ts +0)     ) < 5
           ) v

    Note that the comparison to the "end" of each range is "less than". The value we're comparing to is actually the beginning datetime of the next range. We want values up to (but not including) the beginning of the next range.

    Note that we included a predicate in the ON clause of the JOIN to exclude rows from badge_stats that are earlier than the earliest datetime we're interested in, in this case, the beginning of last month.

    We are also adding a zero (+0) to the references to the varchar revision_timestamp column, to convert to numeric, so we can compare to numerics. (We don't want to do character comparisons on integer values.)
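
    The reason for avoiding character comparisons can be shown with a tiny Python example. This only illustrates the general pitfall of comparing digit strings; it is not a statement about how any particular MySQL collation orders strings. The value "999999999" is made up for the demonstration.

```python
# Compared as text, strings are ordered character by character,
# so "9..." sorts after "1..." even though the number is smaller.
as_text = "999999999" < "1491581853"

# Compared as numbers, the nine-digit value is smaller than the ten-digit one.
as_number = int("999999999") < int("1491581853")

print(as_text, as_number)
```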

    This can be extended to also generate percentages for "yesterday". Add expressions to the inline view d to return the beginning of yesterday and the start of today.

    And repeat the same pattern we used for "last week" and "last month", with expressions referencing "beginning of yesterday" and "end of yesterday".
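
    To sanity-check the whole approach, including the "yesterday" extension, here is a hedged end-to-end sketch in Python. It is not the MySQL query: `bounds`, `in_window` and `report` are hypothetical helpers, UTC stands in for the session time zone, the rows come from the INSERT statements in the question, and the "now" value of 2017-10-18 12:00 is an assumption picked so that some rows fall into "yesterday".

```python
from datetime import datetime, time, timedelta, timezone

UTC = timezone.utc

# (revision_timestamp, field_couleurs_badge_value) from the question's INSERTs
rows = [("1491581853", "1"), ("1506523134", "2"),
        ("1508232519", "1"), ("1508232886", "0")]


def bounds(now):
    """Half-open [begin, end) ranges as unix timestamps."""
    midnight = datetime.combine(now.date(), time(0), tzinfo=now.tzinfo)
    first = midnight.replace(day=1)
    ranges = {
        "yesterday": (midnight - timedelta(days=1), midnight),
        "last_week": (midnight - timedelta(days=now.weekday() + 8),
                      midnight - timedelta(days=now.weekday() + 1)),
        "last_month": ((first - timedelta(days=1)).replace(day=1), first),
    }
    return {k: (b.timestamp(), e.timestamp()) for k, (b, e) in ranges.items()}


def in_window(dt):
    """Monday-Friday, 08:00:00 to 19:00:00 inclusive."""
    return time(8) <= dt.time() <= time(19) and dt.weekday() < 5


def report(rows, now):
    """Percentage of values '0', '1', '2' per period; None if no rows qualify."""
    out = {}
    for name, (begin, end) in bounds(now).items():
        counts = {"0": 0, "1": 0, "2": 0}
        for ts, value in rows:
            ts = int(ts)  # same purpose as the +0 in the query
            dt = datetime.fromtimestamp(ts, now.tzinfo)
            if begin <= ts < end and in_window(dt) and value in counts:
                counts[value] += 1
        total = sum(counts.values())
        out[name] = {v: (c / total * 100.0 if total else None)
                     for v, c in counts.items()}
    return out


result = report(rows, datetime(2017, 10, 18, 12, 0, tzinfo=UTC))
for period, pcts in result.items():
    print(period, pcts)
```

    With that assumed "now", the two rows from 2017-10-17 make up "yesterday" (50% each for values 0 and 1), no rows fall into "last week", and the single September row gives 100% for value 2 in "last month".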

    This answer was selected by the asker as the best answer.
