dongyao5186 2017-06-21 12:19
Views: 126
Accepted

Redigo: failing fast when the Redis server goes down

I'm struggling to get Go to fail fast when the Redis server I'm connected to goes down, and I want a robust solution.

I'm using redigo and I'm setting up a connection pool like so:

// State holds other things in the real code; we use it as a
// central repository for things we want in memory.
type State struct {
    redisPool *redis.Pool
}

// Assumed package-level instance; its declaration is not shown in the original.
var state State

// GetRedisConn returns a connection from the pool.
func (state *State) GetRedisConn() redis.Conn {
    return state.redisPool.Get()
}

func main() {
    // redisAddress is assumed to be a *string (e.g. a command-line flag) defined elsewhere.
    state.redisPool = &redis.Pool{
        MaxIdle:     200,
        MaxActive:   9000,
        IdleTimeout: time.Minute,
        Dial: func() (redis.Conn, error) {
            return redis.Dial("tcp", *redisAddress,
                redis.DialConnectTimeout(1*time.Second),
                redis.DialReadTimeout(100*time.Millisecond),
                redis.DialWriteTimeout(100*time.Millisecond),
            )
        },
    }
}

I request new connections and use them like so:

t0 := time.Now()

conn := state.GetRedisConn()
if conn != nil && conn.Err() == nil {
    defer conn.Close()
    // Do stuff
} else {
    log.Printf("no redis probably")
}

// Log the elapsed time in seconds.
log.Println(time.Since(t0).Seconds())

While Redis is up, this works great and operations complete in milliseconds. The moment I take Redis down, my 75th percentile latency rises to 7+ seconds and my 99th percentile to 10 seconds (I can see this in Prometheus).

What am I doing wrong? Why does this not time out faster? I was under the impression that redis.DialConnectTimeout(1*time.Second) would cap the delay at 1 second, but that doesn't seem to be the case.
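One way to check whether a connect timeout really caps the delay, independently of any metrics pipeline, is to time a single dial attempt directly. The sketch below uses only the standard library: net.DialTimeout stands in for redis.Dial with redis.DialConnectTimeout, and the helpers closedLocalAddr and timeDial are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// closedLocalAddr returns a loopback address that nothing listens on,
// obtained by opening a listener on a free port and closing it again.
func closedLocalAddr() string {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "127.0.0.1:1" // fallback: a port that is normally unused
	}
	addr := ln.Addr().String()
	ln.Close()
	return addr
}

// timeDial makes one connection attempt and reports how long it took.
func timeDial(addr string, timeout time.Duration) (time.Duration, error) {
	start := time.Now()
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err == nil {
		conn.Close()
	}
	return time.Since(start), err
}

func main() {
	// Stand-in for a Redis server that has gone down.
	elapsed, err := timeDial(closedLocalAddr(), time.Second)
	fmt.Printf("dial took %v, err = %v\n", elapsed, err)
}
```

A closed local port is usually refused almost immediately, while an unreachable host tends to run into the timeout itself. In both cases the measured duration should stay close to or below the configured timeout; if it does, larger numbers seen elsewhere are probably coming from somewhere other than the dial.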

EDIT: It turns out this was due to a mistake in my Prometheus setup: the histogram buckets were too coarse. Redis was timing out fine after a second, but I had a 1s bucket followed by a 10s bucket, so my requests (which took just over 1s) landed in the 10s bucket and skewed the results. I'm sure this discussion will be useful to someone at some point, though.
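To illustrate how coarse buckets can produce numbers like these, here is a simplified model of estimating a quantile from histogram buckets by linear interpolation inside the bucket that contains the target rank, which is roughly how Prometheus's histogram_quantile works. The function estimateQuantile, the bucket layouts, and the 1.05s latency are assumptions chosen for illustration, not the asker's actual configuration.

```go
package main

import (
	"fmt"
	"sort"
)

// estimateQuantile sorts observations into buckets with the given ascending
// upper bounds, then interpolates linearly inside the bucket holding rank q*N.
// The lower edge of the first bucket is taken to be 0.
func estimateQuantile(q float64, bounds, observations []float64) float64 {
	counts := make([]float64, len(bounds))
	for _, v := range observations {
		// Index of the first bucket whose upper bound is >= v.
		if i := sort.SearchFloat64s(bounds, v); i < len(bounds) {
			counts[i]++
		}
	}
	rank := q * float64(len(observations))
	cum, lower := 0.0, 0.0
	for i, upper := range bounds {
		if counts[i] > 0 && cum+counts[i] >= rank {
			return lower + (upper-lower)*(rank-cum)/counts[i]
		}
		cum += counts[i]
		lower = upper
	}
	// Rank lies beyond the last finite bucket: report its upper bound.
	return bounds[len(bounds)-1]
}

func main() {
	// 100 requests that each took just over one second (assumed value).
	obs := make([]float64, 100)
	for i := range obs {
		obs[i] = 1.05
	}
	coarse := []float64{0.1, 1, 10}
	fine := []float64{0.1, 0.5, 1, 1.1, 1.25, 1.5, 2, 5, 10}

	fmt.Printf("coarse p75=%.3f p99=%.3f\n",
		estimateQuantile(0.75, coarse, obs), estimateQuantile(0.99, coarse, obs))
	fmt.Printf("fine   p75=%.3f p99=%.3f\n",
		estimateQuantile(0.75, fine, obs), estimateQuantile(0.99, fine, obs))
}
```

With the coarse layout every observation falls into the bucket between 1s and 10s, so the estimates come out near 7.75s and 9.91s even though each request took 1.05s. With an extra bound at 1.1s the same data yields roughly 1.075s and 1.099s.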

1 answer

  • dongshuo8756 2017-07-01 20:55

    Rate limit dial attempts after a failure:

    func main() {
        var (
            nextDial time.Time
            mu       sync.Mutex
        )
        state.redisPool = &redis.Pool{
            MaxIdle:     200,
            MaxActive:   9000,
            IdleTimeout: time.Minute,
            Dial: func() (redis.Conn, error) {
                mu.Lock() // Dial can be called concurrently
                defer mu.Unlock()
                if time.Now().Before(nextDial) {
                    return nil, errors.New("waiting for dial")
                }
                c, err := redis.Dial("tcp", *redisAddress,
                    redis.DialConnectTimeout(1*time.Second),
                    redis.DialReadTimeout(100*time.Millisecond),
                    redis.DialWriteTimeout(100*time.Millisecond),
                )
                if err == nil {
                    nextDial = time.Time{}
                } else {
                    nextDial = time.Now().Add(time.Second) // don't attempt dial for one second
                }
                return c, err
            },
        }
    }
    
    The asker selected this answer as the best answer.
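The rate-limiting idea in the answer can also be tried out without a Redis server. The following is a minimal standard-library sketch of a variation on it, not the answer's exact code: DialGate, NewDialGate, ErrBackoff, errDown and manualClock are hypothetical names, and unlike the answer this version releases the mutex while the dial runs, so concurrent dials are not serialized.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrBackoff is returned while dial attempts are suppressed after a failure.
var ErrBackoff = errors.New("waiting for dial")

// errDown simulates the error from dialing a server that is down.
var errDown = errors.New("connection refused")

// DialGate suppresses dial attempts for a backoff period after a failure.
type DialGate struct {
	mu       sync.Mutex
	nextDial time.Time
	backoff  time.Duration
	now      func() time.Time // injectable clock; time.Now in real use
}

func NewDialGate(backoff time.Duration, now func() time.Time) *DialGate {
	return &DialGate{backoff: backoff, now: now}
}

// Do runs dial unless a recent failure put the gate into backoff.
func (g *DialGate) Do(dial func() error) error {
	g.mu.Lock()
	blocked := g.now().Before(g.nextDial)
	g.mu.Unlock()
	if blocked {
		return ErrBackoff
	}
	err := dial() // the lock is not held here, so dials can overlap
	g.mu.Lock()
	defer g.mu.Unlock()
	if err != nil {
		g.nextDial = g.now().Add(g.backoff)
	} else {
		g.nextDial = time.Time{}
	}
	return err
}

// manualClock is a hand-advanced clock used only for this demonstration.
type manualClock struct{ t time.Time }

func (c *manualClock) Now() time.Time          { return c.t }
func (c *manualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

func main() {
	clk := &manualClock{}
	gate := NewDialGate(time.Second, clk.Now)
	down := func() error { return errDown }

	fmt.Println(gate.Do(down)) // the dial runs and fails
	fmt.Println(gate.Do(down)) // suppressed: fails at once without dialing
	clk.Advance(time.Second)
	fmt.Println(gate.Do(func() error { return nil })) // backoff over, dial runs
}
```

The trade-off is that callers which pass the check before the first failure is recorded will all attempt a dial, whereas holding the lock across the dial (as in the answer) lets only one attempt through at a time at the cost of making every other caller wait for it.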
