weixin_39548805
2020-11-29 09:52

pgsql-check Does Not Cleanly Disconnect from PostgreSQL

While using the pgsql-check module to verify a Postgres server is online, the check does not properly tear down the connection to Postgres. This results in a chatty error in the Postgres log for every check, on every server polled.

This issue was originally identified in a thread at Server Fault five years ago, and later raised in a thread at Stack Overflow about three years ago. Apparently no bug was ever filed for it.

Output of haproxy -vv and uname -a


HA-Proxy version 1.6.3 2015/12/25
Copyright 2000-2015 Willy Tarreau <willy@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU     = generic
  CC      = gcc
  CFLAGS  = -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2
  OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
Running on OpenSSL version : OpenSSL 1.0.2g  1 Mar 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.38 2015-11-23
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Linux tpa-node7 4.15.18-12-pve #1 SMP PVE 4.15.18-35 (Wed, 13 Mar 2019 08:24:42 +0100) x86_64 x86_64 x86_64 GNU/Linux

What's the configuration?


global
    log 127.0.0.1 local0 notice
    stats socket /var/lib/haproxy/stats level admin

defaults
    timeout client 30s
    timeout server 30s
    timeout connect 5s
    option tcplog
    log global

frontend fe
    bind 127.0.0.1:5432

    maxconn 1000
    timeout client 1h

    default_backend be

backend be
    mode tcp

    stick-table type ip size 1
    stick on dst

    option pgsql-check user haproxy

    timeout server 1h

    server zaniest zaniest:5432 maxconn 225 check inter 1500 downinter 6s rise 5 fall 3 
    server koala koala:5432 maxconn 225 check inter 1500 downinter 6s rise 5 fall 3 backup

Steps to reproduce the behavior

There are actually two different sets of erroneous behavior. The first occurs when no user is specified for pgsql-check, and can be reproduced as follows:

  1. Set up any HAProxy configuration that uses pgsql-check
  2. Use pgsql-check to verify a Postgres server
  3. Examine the Postgres logs.

The second occurs when using the user parameter:

  1. Set up any HAProxy configuration that uses pgsql-check
  2. Specify the user parameter with a Postgres user that exists and has a password.
  3. Use pgsql-check to verify a Postgres server
  4. Examine the Postgres logs.

Actual behavior

In the first scenario, the Postgres logs will contain two errors of this description:


LOG:  incomplete startup packet
LOG:  could not receive data from client: Connection reset by peer

If the user parameter to pgsql-check is provided, only one error is emitted:


LOG:  could not receive data from client: Connection reset by peer

This indicates that a proper connection termination as expected by the Postgres service was not provided.
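
To tell which of the two scenarios a given server is hitting, the log lines above can be tallied. The following is a minimal sketch, assuming the log text contains the messages exactly as quoted in this report; the function name `tally_check_noise` and the counter keys are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Message fragments copied from the log excerpts in this report.
INCOMPLETE = "incomplete startup packet"
RESET = "could not receive data from client: Connection reset by peer"


def tally_check_noise(log_text: str) -> Counter:
    """Count the two health-check related messages in a Postgres log."""
    counts = Counter()
    for line in log_text.splitlines():
        if INCOMPLETE in line:
            counts["incomplete_startup"] += 1
        elif RESET in line:
            counts["connection_reset"] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "LOG:  incomplete startup packet\n"
        "LOG:  could not receive data from client: Connection reset by peer\n"
    )
    print(tally_check_noise(sample))
```

A log that shows both counters growing together matches the first scenario (no user parameter); a log with only `connection_reset` entries matches the second.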

Expected behavior

Postgres logs should not produce any errors during a pgsql-check invocation.

Do you have any idea what may have caused this?

This is caused either by not using the Postgres libpq client library to handle communication with Postgres, or by incomplete adherence to the Postgres connection-handling protocol.

Do you have an idea how to solve the issue?

Postgres documents the connection protocol flow and has a section specifically for Termination of connections. One or more of these expected components is not being implemented, and only fully following the specification will prevent these log messages.
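
For reference, here is a minimal sketch of the two messages involved, assuming version 3.0 of the Postgres frontend/backend protocol: a StartupMessage (int32 length, int32 protocol version 196608, then null-terminated parameter pairs ending with an extra null byte) and a Terminate message (the byte `X` followed by an int32 length of 4). The function names are hypothetical, and this is not HAProxy's implementation; it only builds the bytes and does not open a connection. Whether sending Terminate in the middle of authentication silences the log message is not confirmed by this report.

```python
import struct

# Protocol version 3.0 is encoded as (major << 16) | minor.
PROTOCOL_VERSION_3_0 = (3 << 16) | 0  # 196608


def build_startup_message(user: str) -> bytes:
    """Build a StartupMessage carrying only the 'user' parameter."""
    # Parameters are name/value pairs, each null-terminated,
    # followed by one extra null byte that ends the list.
    params = b"user\x00" + user.encode("utf-8") + b"\x00" + b"\x00"
    body = struct.pack("!I", PROTOCOL_VERSION_3_0) + params
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!I", len(body) + 4) + body


def build_terminate_message() -> bytes:
    """Build a Terminate message: type byte 'X' and a length of 4."""
    return b"X" + struct.pack("!I", 4)


if __name__ == "__main__":
    print(build_startup_message("haproxy").hex())
    print(build_terminate_message().hex())
```

A client that closes cleanly sends the Terminate bytes before closing the socket, rather than resetting the connection.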

This question originates from the open-source project: haproxy/haproxy


6 answers

  • weixin_39548805, 5 months ago

    I can verify that updating to the latest version removes these errors. Unfortunately I don't know which patch actually corrected the problem. I'll check for that and close once I can identify a commit.

  • weixin_39625586, 5 months ago

    Hello, I think it could be anything, including something related to a basic bug in the connection layer. 1.6.3 was quite old and was missing almost 400 bug fixes. That said, the close on checks will always remain simple (connection reset), so there will not be many changes in this area anyway.

  • weixin_39548805, 5 months ago

    Yeah, I can't track past the huge code reorg in 2018. I'll just assume that's when this was fixed, since I can't find any commit message or code change mentioning pgsql-check in the project's history that explains why it started working properly. Closing.

  • weixin_39941859, 5 months ago

    It's commit a48c141f44 (BUG/MAJOR: connection: refine the situations where we don't send shutw()), which is in 1.9 as well as 1.8.2 and later.

  • weixin_39625586, 5 months ago

    On Thu, Mar 28, 2019 at 03:26:47PM -0700, Lukas Tribus wrote:

    It's commit a48c141f44 (BUG/MAJOR: connection: refine the situations where we don't send shutw()), which is in 1.9 as well as 1.8.2 and later.

    Good catch! I dug and failed to spot it.

    Willy

  • weixin_39956558, 5 months ago

    Trying on 1.9 and still seeing the same error. Can someone help with this?

    LOG: could not receive data from client: Connection reset by peer
