doufei2007
doufei2007
2018-05-15 19:03

一个函数中的多个SQL查询

I need to return multiple value in a single SQL query on my postgresql database. Here is my query so far:

SELECT AVG("percent"), MIN("percent"), MAX("percent"), AVG("profit"), MIN("profit"), MAX("profit")
FROM public.log_analyticss 
WHERE "buyPlatform" = 'platA' AND
"date" >= '1526356073.6126819'

Data

date             sellPlatform     profit      percent    
----------       ------------     ----------  ----------
1526356073.61    platA            0           10.1
1526356073.62    platA            22          11
1526356073.63    platA            3           7
1526356073.64    platA            1           8
1526356073.65    platA            11          9
1526356073.66    platA            12          10
1526356073.67    platA            13          15

Desired Result

date             sellPlatform     profit      percent    
----------       ------------     ----------  ----------
1526356073.61    platA            0           10.1         //MIN Profit
1526356073.62    platA            22          11           //MAX Profit
1526356073.63    platA            3           7            //MIN Perc
1526356073.67    platA            13          15           //MAX Perc

//Then somehow I want it return AVG as well if that is every possible. Otherwise, I don't mind running another query to do that.

The issue is that I don't just want the MIN and MAX values. I want the entire row of data from which the MIN and MAX value is taken.

I understand that I am asking for an AVG and a MIN/MAX value which is going to return data in two different formats. I know this is might not be possible. But any help on how to most efficiently do this would be extremely helpful.

Right now I am just pulling the entire dataset into my code and calculating the mean, min and max within my code which I know if very bad and also very very slow. The table has about 8 million rows and the dataset that I am grabbing is about 9000 rows so it is very slow the way I am doing it now.

  • 点赞
  • 写回答
  • 关注问题
  • 收藏
  • 复制链接分享
  • 邀请回答

3条回答

  • doucheng5705 doucheng5705 3年前

    The most efficient way to retrieve the row associated with the minimum / maximum generally doesn't involve the MIN()/MAX() aggregates at all; instead, you can just attach an ORDER BY to your query, and add a LIMIT 1 to fetch only the first record.

    This means you need four SELECT statements with four different orderings, but you can factor out the expensive part (the fetch from log_analyticss) into a temp table or a CTE, e.g.:

    WITH Data AS (
      SELECT *
      FROM public.log_analyticss 
      WHERE "buyPlatform" = 'platA' AND
        "date" >= '1526356073.6126819'
    )
    (SELECT 'Min percent', * FROM Data ORDER BY "percent" ASC LIMIT 1)
    UNION ALL
    (SELECT 'Max percent', * FROM Data ORDER BY "percent" DESC LIMIT 1)
    UNION ALL
    (SELECT 'Min profit', * FROM Data ORDER BY "profit" ASC LIMIT 1)
    UNION ALL
    (SELECT 'Max profit', * FROM Data ORDER BY "profit" DESC LIMIT 1)
    

    In your case, a temp table might be better than a CTE, as you can re-use it to compute the averages.

    Note that if one of these maximum/minimum values is shared by two different rows, this query would only return one of them. The chosen row is selected effectively at random, but you can attach more fields to the ORDER BY clause to serve as a tie-breaker.

    If you actually want both records in this case, you'll need something more like Auston's or Radim's answer, i.e. compute the aggregates first, then join back to the data on the profit and percent columns. You can still make use of a temp table / CTE here to avoid hitting log_analyticss more than once.

    点赞 评论 复制链接分享
  • douge7771 douge7771 3年前

    You need something along these lines:

    SELECT a.*
    FROM public.log_analyticss  a
    JOIN
    (
        SELECT 
            MIN("percent") min_percent, 
            MAX("percent") max_percent, 
            MIN("profit") min_profit, 
            MAX("profit") max_profit
        FROM public.log_analyticss 
    ) t ON a.date = t.date AND
           a.sellPlatform = t.sellPlatform AND
           (a.profit = minprofit OR
            a.profit = maxprofit OR
            a.percent = minpercent OR
            a.percent = maxpercent)
    
    点赞 评论 复制链接分享
  • dongshuogai2343 dongshuogai2343 3年前

    I think the best way to do this is by two queries: The first retrieve the metrics, like you have done; The second query retrieve the sample registers.

    Or you can try run over a temporary table (automatic dropped after end session):

    CREATE TEMP TABLE statistics AS 
        SELECT AVG(percent) as perc_avg, MIN(percent) as perc_avg, MAX(percent) as perc_max, AVG(profit) as pro_avg, MIN(profit) as pro_min, MAX(profit) as pro_max
        FROM public.log_analyticss 
        WHERE buyPlatform = 'platA' AND
        sellPlatform = 'platB' AND 
        productId = '183948' AND
        date >= '1526356073.6126819'
    ;
        SELECT date, sellPlatform, profit, percent
        FROM public.log_analyticss a join statistics s
         on (a.profit = s.pro_max or a.profit = s.pro_min or
             a.percent = s.perc_max or a.percent = s.perc_min)
        WHERE buyPlatform = 'platA' AND
        sellPlatform = 'platB' AND 
        productId = '183948' AND
        date >= '1526356073.6126819';
    

    Reference to temporary tables: http://www.postgresql.org/docs/9.2/static/sql-createtable.html

    点赞 评论 复制链接分享

相关推荐