dongzhenyin2001 2017-09-26 13:14

Reloading a TensorFlow model in a Golang app server

I have a Golang app server in which I reload a saved TensorFlow model every 15 minutes. Every API call that uses the model takes a read lock on a mutex, and whenever I reload the model I take the write lock. Functionally this works fine, but during the model load my API response time increases because the request threads keep waiting for the write lock to be released. Could you please suggest a better approach to keeping the loaded model up to date?

Edit: code updated.

Model Load Code:

    tags := []string{"serve"}

    // load from updated saved model
    var m *tensorflow.SavedModel
    var err error
    m, err = tensorflow.LoadSavedModel("/path/to/model", tags, nil)
    if err != nil {
        log.Errorf("Exception caught while reloading saved model %v", err)
        destroyTFModel(m)
    }

    if err == nil {
        ModelLoadMutex.Lock()
        defer ModelLoadMutex.Unlock()

        // destroy existing model
        destroyTFModel(TensorModel)
        TensorModel = m
    }

Model Use Code (part of the API request):

    config.ModelLoadMutex.RLock()
    defer config.ModelLoadMutex.RUnlock()

    scoreTensorList, err = TensorModel.Session.Run(map[tensorflow.Output]*tensorflow.Tensor{
        UserOp.Output(0): uT,
        DataOp.Output(0): nT},
        []tensorflow.Output{config.SumOp.Output(0)},
        nil,
    )

1 answer

  • douhuxi4145 2017-09-26 15:37

    Presumably destroyTFModel takes a long time. You could try this:

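    // m is the newly loaded model.
    ModelLoadMutex.Lock()
    old := TensorModel
    TensorModel = m
    ModelLoadMutex.Unlock()

    // Readers are unblocked now; clean up the old model off the critical path.
    go destroyTFModel(old)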
    

    So destroy the old model after assigning the new one, and/or destroy it on another goroutine if it needs to clean up resources and that cleanup is what blocks your responses. I'd still look into what destroyTFModel does and why it is slow: does it make network requests to the db or touch the file system? Are you sure there isn't another lock, external to your app, that you're not aware of (for example, if it had to open a file and locked it for reads while destroying this model)?
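
To make the swap-then-destroy idea concrete, here is a minimal runnable sketch. It does not use the TensorFlow bindings: `model`, `store`, and `destroy` are hypothetical stand-ins for `*tensorflow.SavedModel`, the `TensorModel`/`ModelLoadMutex` pair, and `destroyTFModel`, and the slow cleanup is simulated with a sleep. The write lock is held only for the pointer swap, and the cleanup runs on its own goroutine.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// model is a hypothetical stand-in for *tensorflow.SavedModel.
type model struct {
	version int
}

// store pairs the current model with the RWMutex that guards it.
type store struct {
	mu      sync.RWMutex
	current *model
}

// destroy simulates a slow cleanup (a stand-in for destroyTFModel) and
// reports which version it released.
func destroy(m *model, done chan<- int) {
	time.Sleep(50 * time.Millisecond)
	done <- m.version
}

// swap installs next while holding the write lock only for the pointer
// assignment, then destroys the previous model on another goroutine.
func (s *store) swap(next *model, done chan<- int) {
	s.mu.Lock()
	old := s.current
	s.current = next
	s.mu.Unlock()

	go destroy(old, done)
}

// version is the read path: the read lock is held while the model is in use.
func (s *store) version() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.current.version
}

func main() {
	s := &store{current: &model{version: 1}}
	done := make(chan int, 1)

	s.swap(&model{version: 2}, done)
	fmt.Println("serving version:", s.version()) // serving version: 2
	fmt.Println("destroyed version:", <-done)    // destroyed version: 1
}
```

This relies on readers using the model only while they hold the read lock. The write lock cannot be acquired until in-flight readers finish, so once the swap completes nothing can still reach the old model and it can be cleaned up in the background.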

    Instead of wrapping the swap in if err == nil { ... }, consider returning early on error.
