donglin6313 2017-12-21 01:49
72 views · Accepted

How can I make Golang-to-Postgres queries faster? Are there any alternatives?

I am using Golang and Postgres to filter some financial data. I have a Postgres database with a single table containing a single stock market (if that's the correct term). This table has columns for id, symbol, date, open, high, low, close and volume. The total number of rows is 6,610,598 and the number of distinct stocks (symbols) is 2,174.

Now what I want to do is to filter the data from that table and save the results to another table, so the first table contains the raw data and the second contains the cleaned data.

We have three parameters: a date (EVALDATE) and two integers (MINCTD and MINDP). First, we have to select only those stocks that pass our minimum-calendar-trading-days parameter. That selection is done as follows (note: we use Golang for this):

// symbols []string was loaded with: SELECT DISTINCT symbol FROM table_name;
// EVALDATE, MINCTD and db (*sql.DB) are in scope;
// "database/sql" and "log" are imported.
var filteredSymbols []string
for _, symbol := range symbols {
    var count int
    // $1/$2 placeholders avoid SQL injection and string formatting
    row := db.QueryRow(
        "SELECT count(*) FROM table_name WHERE symbol = $1 AND date >= $2",
        symbol, EVALDATE)
    if err := row.Scan(&count); err != nil {
        log.Fatal(err)
    }
    if count >= MINCTD {
        filteredSymbols = append(filteredSymbols, symbol)
    }
}

Basically, the loop above keeps only those symbols that have enough rows from EVALDATE up to the current date (the latest date in the data) to satisfy MINCTD. This took 30 minutes.

If a symbol satisfies the first filter above, it undergoes a second filter that tests whether, within that period (EVALDATE to LATEST_DATE), it has enough rows with complete data (no zero OHLC values). So the query below is run for each symbol that passed the filter above:

SELECT count(*) FROM table_name WHERE symbol = 'symbol' AND date >= 'EVALDATE' AND open != 0 AND high != 0 AND low != 0 AND close != 0;

This query took 36 minutes.

After getting the slice of symbols that passed both filters, I grab their data again with a Postgres query and then bulk-insert it into the other table.
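Since both tables live in the same database, the copy step can stay entirely inside Postgres with INSERT ... SELECT instead of round-tripping the rows through Go. A minimal sketch, assuming the github.com/lib/pq driver (pq.Array is lib/pq-specific) and a hypothetical clean_table with the same columns:

// Copy all rows for the surviving symbols server-side in one statement.
// clean_table is a hypothetical name for the cleaned-data table.
_, err := db.Exec(`
    INSERT INTO clean_table (symbol, date, open, high, low, close, volume)
    SELECT symbol, date, open, high, low, close, volume
    FROM table_name
    WHERE symbol = ANY($1) AND date >= $2`,
    pq.Array(filteredSymbols), EVALDATE)
if err != nil {
    log.Fatal(err)
}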

So 1 hour and 6 minutes is not acceptable. What should I do then? Grab all the data and filter it in memory with Golang?


2 Answers

  • douzhaobo6488 2017-12-21 02:42

    A couple of things I note from the question.

    Try to avoid scanning 6 million+ rows to arrive at 2,174 values (i.e. avoid SELECT DISTINCT symbol FROM table_name;). Do you not have (or can you build) a "master table" of symbols, with the symbol as its primary key? (A sketch follows below.)
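    A minimal sketch of that one-time setup, assuming the same *sql.DB handle from the question and PostgreSQL 9.5+ for ON CONFLICT (table name is an assumption):

    // Build a master table of symbols once, then read the 2,174 symbols
    // from it instead of scanning 6 million rows with SELECT DISTINCT.
    _, err := db.Exec(`CREATE TABLE IF NOT EXISTS symbols (symbol text PRIMARY KEY)`)
    if err != nil {
        log.Fatal(err)
    }
    _, err = db.Exec(`INSERT INTO symbols
                      SELECT DISTINCT symbol FROM table_name
                      ON CONFLICT (symbol) DO NOTHING`)
    if err != nil {
        log.Fatal(err)
    }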

    Combine the two queries that test the data, such as the following:

    SELECT
           count(*) AS c1
         , count(CASE WHEN open != 0 AND high != 0 AND low != 0 AND close != 0 THEN 1 END) AS c2
    FROM table_name
    WHERE symbol = 'symbol'
      AND date >= 'EVALDATE';
    
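    From Go, both counts then come back in a single round trip per symbol. A minimal sketch of the loop body, assuming MINDP is the threshold for the completeness filter (the question names that parameter but does not show its code):

    // One query per symbol replaces the two separate count queries.
    var total, complete int
    row := db.QueryRow(`
        SELECT count(*),
               count(CASE WHEN open != 0 AND high != 0 AND low != 0 AND close != 0 THEN 1 END)
        FROM table_name
        WHERE symbol = $1 AND date >= $2`, symbol, EVALDATE)
    if err := row.Scan(&total, &complete); err != nil {
        log.Fatal(err)
    }
    if total >= MINCTD && complete >= MINDP {
        filteredSymbols = append(filteredSymbols, symbol)
    }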

    An index on (symbol, date) would assist performance.
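    For example (the index name is an assumption; CREATE INDEX IF NOT EXISTS needs PostgreSQL 9.5+):

    _, err := db.Exec(`CREATE INDEX IF NOT EXISTS table_name_symbol_date_idx
                       ON table_name (symbol, date)`)
    if err != nil {
        log.Fatal(err)
    }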

    This answer was accepted by the asker.
