doudouba4520 2019-05-22 03:04
Views: 636
Accepted

BigQuery: fetching 1,000,000 records with Go and processing the data

I have 1,000,000 records in BigQuery. What is the best way to fetch the data and process it in Go? If I fetch everything without a limit I hit a timeout. I already raised the deadline to 5 minutes, but the fetch takes longer than that. I would like to stream the results or implement pagination, but I don't know how to do that in Go.

var FetchCustomerRecords = func(req *http.Request) *bigquery.RowIterator {
    ctx := appengine.NewContext(req)
    // Use the deadline context for every BigQuery call, not just NewClient;
    // the original code created it and then ignored it.
    ctxWithDeadline, cancel := context.WithTimeout(ctx, 5*time.Minute)
    // cancel is deliberately not deferred: the returned iterator keeps
    // using ctxWithDeadline, and the deadline itself bounds the work.
    _ = cancel
    log.Infof(ctx, "Fetch Customer records from BigQuery")
    client, err := bigquery.NewClient(ctxWithDeadline, "ddddd-crm")
    if err != nil {
        log.Errorf(ctx, "creating BigQuery client: %v", err)
        return nil
    }
    q := client.Query("SELECT * FROM Something")
    q.Location = "US"
    job, err := q.Run(ctxWithDeadline)
    if err != nil {
        log.Errorf(ctx, "starting query job: %v", err)
        return nil
    }
    status, err := job.Wait(ctxWithDeadline)
    if err != nil {
        log.Errorf(ctx, "waiting for job: %v", err)
        return nil
    }
    if err := status.Err(); err != nil {
        log.Errorf(ctx, "job completed with error: %v", err)
        return nil
    }
    it, err := job.Read(ctxWithDeadline)
    if err != nil {
        log.Errorf(ctx, "reading results: %v", err)
        return nil
    }
    return it
}

2 Answers

  • dtz33344 2019-05-22 08:27

    You can read the table contents directly without issuing a query. This doesn't incur query charges, and provides the same row iterator as you would get from a query.

For small results, this is fine. For large tables, I would suggest checking out the new BigQuery Storage Read API, and the code sample on the Google Cloud samples page.
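As a concrete pointer (a sketch, not part of the original answer): recent versions of the Go client (cloud.google.com/go/bigquery v1.32+) can opt the row iterator into the Storage Read API via `EnableStorageReadClient`, so large reads are served over parallel gRPC streams rather than paged REST calls. The project, dataset, and table names below are placeholders.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"cloud.google.com/go/bigquery"
	"google.golang.org/api/iterator"
)

func main() {
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, "my-project-id") // placeholder project
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Opt in to the Storage Read API; subsequent table.Read / query result
	// downloads use it where possible.
	if err := client.EnableStorageReadClient(ctx); err != nil {
		log.Fatal(err)
	}

	it := client.Dataset("my_dataset").Table("my_table").Read(ctx)
	for {
		var row []bigquery.Value
		err := it.Next(&row)
		if err == iterator.Done {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(row) // process each row here
	}
}
```

This requires credentials and the `bigquery.readsessions.create` permission, so treat it as a template rather than something to run as-is.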

    For a small table or simply reading a small subset of rows, you can do something like this (reads up to 10k rows from one of the public dataset tables):

// Requires cloud.google.com/go/bigquery and google.golang.org/api/iterator.
func TestTableRead(t *testing.T) {
    ctx := context.Background()
    client, err := bigquery.NewClient(ctx, "my-project-id")
    if err != nil {
        t.Fatal(err)
    }

    // Read the table directly; no query job is created and no query
    // charges are incurred.
    table := client.DatasetInProject("bigquery-public-data", "stackoverflow").Table("badges")
    it := table.Read(ctx)

    rowLimit := 10000
    var rowsRead int
    for rowsRead < rowLimit {
        var row []bigquery.Value
        err := it.Next(&row)
        if err == iterator.Done {
            break
        }
        if err != nil {
            t.Fatalf("error reading row offset %d: %v", rowsRead, err)
        }
        rowsRead++
        fmt.Println(row)
    }
}
    
This answer was selected by the asker as the best answer.
