doudouba4520 asked on 2019-05-22 03:04
Viewed 638 times
Answer accepted

BigQuery: fetching 1,000,000 records with Go and processing the data

I have 1,000,000 records in BigQuery. What is the best way to fetch the data from the DB and process it in Go? I get a timeout if I fetch all the data without a limit. I already increased the deadline to 5 minutes, but the query takes longer than that. I want to implement some kind of streaming call or pagination, but I don't know how to do that in Go.

var FetchCustomerRecords = func(req *http.Request) *bigquery.RowIterator {
    ctx := appengine.NewContext(req)
    // The cancel func is deliberately dropped: the returned iterator keeps
    // using this context after we return, so cancelling here would break it.
    ctxWithDeadline, _ := context.WithTimeout(ctx, 5*time.Minute)
    log.Infof(ctx, "Fetch Customer records from BigQuery")
    client, err := bigquery.NewClient(ctxWithDeadline, "ddddd-crm")
    if err != nil {
        log.Errorf(ctx, "creating BigQuery client: %v", err)
        return nil
    }
    q := client.Query("SELECT * FROM Something")
    q.Location = "US"
    // Use the deadline-bound context for every call, not the bare ctx.
    job, err := q.Run(ctxWithDeadline)
    if err != nil {
        log.Errorf(ctx, "starting query: %v", err)
        return nil
    }
    status, err := job.Wait(ctxWithDeadline)
    if err != nil {
        log.Errorf(ctx, "waiting for query: %v", err)
        return nil
    }
    if err := status.Err(); err != nil {
        log.Errorf(ctx, "query failed: %v", err)
        return nil
    }
    it, err := job.Read(ctxWithDeadline)
    if err != nil {
        log.Errorf(ctx, "reading results: %v", err)
        return nil
    }
    return it
}

2 answers

  • dtz33344 2019-05-22 08:27

    You can read the table contents directly without issuing a query. This doesn't incur query charges, and provides the same row iterator as you would get from a query.

    For small results, this is fine. For large tables, I would suggest checking out the new Storage API, and the code sample on the samples page.

    For a small table or simply reading a small subset of rows, you can do something like this (reads up to 10k rows from one of the public dataset tables):

    import (
        "context"
        "fmt"
        "testing"

        "cloud.google.com/go/bigquery"
        "google.golang.org/api/iterator"
    )

    func TestTableRead(t *testing.T) {
        ctx := context.Background()
        client, err := bigquery.NewClient(ctx, "my-project-id")
        if err != nil {
            t.Fatal(err)
        }

        table := client.DatasetInProject("bigquery-public-data", "stackoverflow").Table("badges")
        it := table.Read(ctx)

        rowLimit := 10000
        var rowsRead int
        // Check the limit before calling Next so we don't fetch and discard an extra row.
        for rowsRead < rowLimit {
            var row []bigquery.Value
            err := it.Next(&row)
            if err == iterator.Done {
                break
            }
            if err != nil {
                t.Fatalf("error reading row offset %d: %v", rowsRead, err)
            }
            rowsRead++
            fmt.Println(row)
        }
    }
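Once a batch of rows is in memory, splitting it into fixed-size chunks keeps each processing step bounded. A minimal generic sketch (the `chunk` helper and the sizes are illustrative, not part of the BigQuery client):

```go
package main

import "fmt"

// chunk splits items into consecutive batches of at most size elements.
// The batches share the backing array of items (no copying).
func chunk[T any](items []T, size int) [][]T {
	var batches [][]T
	for len(items) > 0 {
		n := size
		if n > len(items) {
			n = len(items)
		}
		batches = append(batches, items[:n])
		items = items[n:]
	}
	return batches
}

func main() {
	rows := []string{"a", "b", "c", "d", "e"}
	for i, b := range chunk(rows, 2) {
		fmt.Printf("batch %d: %v\n", i, b) // batches of sizes 2, 2, 1
	}
}
```

In the row-reading loop above, you would append each fetched `[]bigquery.Value` to a slice and hand `chunk`-sized batches to your processing code, instead of handling rows one at a time.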
    
    Accepted as the best answer by the asker.