I have a Go app server that reloads a saved TensorFlow model every 15 minutes. Every API call that uses the model takes a read lock on a mutex, and the reload takes the write lock. Functionally this works fine, but during a model reload my API response time increases because the request goroutines wait for the write lock to be released. Could you please suggest a better approach to keeping the loaded model up to date?
Edit: code updated.
Model Load Code:
    tags := []string{"serve"}

    // Load the updated saved model before taking any lock.
    var m *tensorflow.SavedModel
    var err error
    m, err = tensorflow.LoadSavedModel("/path/to/model", tags, nil)
    if err != nil {
        log.Errorf("Exception caught while reloading saved model %v", err)
        destroyTFModel(m)
    } else {
        ModelLoadMutex.Lock()
        defer ModelLoadMutex.Unlock()
        // Destroy the existing model, then publish the new one.
        destroyTFModel(TensorModel)
        TensorModel = m
    }
Model Use Code (part of the API request):
    config.ModelLoadMutex.RLock()
    defer config.ModelLoadMutex.RUnlock()
    scoreTensorList, err = TensorModel.Session.Run(
        map[tensorflow.Output]*tensorflow.Tensor{
            UserOp.Output(0): uT,
            DataOp.Output(0): nT,
        },
        []tensorflow.Output{config.SumOp.Output(0)},
        nil,
    )
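For discussion, here is a minimal, self-contained sketch of the same read/write-lock pattern with the write lock narrowed to the pointer swap only. It assumes (this is not confirmed above) that tearing down the old model is part of what keeps the write lock held. The `model` type, its `Close` method, and the `holder` type are hypothetical stand-ins for `*tensorflow.SavedModel`, `destroyTFModel`, and the `ModelLoadMutex`/`TensorModel` pair; no TensorFlow calls are made.

```go
package main

import (
	"fmt"
	"sync"
)

// model is a hypothetical stand-in for *tensorflow.SavedModel.
type model struct {
	version int
	closed  bool
}

// Close stands in for destroyTFModel, which is assumed to be slow.
func (m *model) Close() { m.closed = true }

// holder pairs the current model with the RWMutex that guards it.
type holder struct {
	mu  sync.RWMutex
	cur *model
}

// Swap installs next and returns the previous model.
// The write lock is held only for the pointer exchange.
func (h *holder) Swap(next *model) *model {
	h.mu.Lock()
	old := h.cur
	h.cur = next
	h.mu.Unlock()
	return old
}

// Use runs fn against the current model while holding the read lock,
// so the model cannot be swapped out while fn is running.
func (h *holder) Use(fn func(*model)) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	fn(h.cur)
}

func main() {
	h := &holder{cur: &model{version: 1}}

	// Reload: the new model is built first, then swapped in.
	old := h.Swap(&model{version: 2})

	// Lock acquires only after every reader has released its read lock,
	// and readers only touch the model inside Use, so no request can
	// still hold old here. It is torn down outside the lock.
	if old != nil {
		old.Close()
	}

	h.Use(func(m *model) { fmt.Println(m.version, m.closed) })
}
```

Requests still wait while the writer waits for in-flight readers to finish, so this only helps if the teardown, rather than the wait for readers, is the slow part.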