douqiao7958 2018-07-13 05:04

Writing to a closed net.Conn returns a nil error

Talk is cheap, so here is the simple code:

package main

import (
    "fmt"
    "time"
    "net"
)

func main() {
    addr := "127.0.0.1:8999"

    // Server
    go func() {
        tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
        if err != nil {
            panic(err)
        }
        listen, err := net.ListenTCP("tcp", tcpaddr)
        if err != nil {
            panic(err)
        }
        for  {
            if conn, err := listen.Accept(); err != nil {
                panic(err)
            } else if conn != nil {
                go func(conn net.Conn) {
                    buffer := make([]byte, 1024)
                    n, err := conn.Read(buffer)
                    if err != nil {
                        fmt.Println(err)
                    } else {
                        fmt.Println(">", string(buffer[0 : n]))
                    }
                    conn.Close()
                }(conn)
            }
        }
    }()

    time.Sleep(time.Second)

    // Client
    if conn, err := net.Dial("tcp", addr); err == nil {
        for i := 0; i < 2; i++ {
            _, err := conn.Write([]byte("hello"))
            if err != nil {
                fmt.Println(err)
                conn.Close()
                break
            } else {
                fmt.Println("ok")
            }
            // sleep 10 seconds and re-send
            time.Sleep(10*time.Second)
        }
    } else {
        panic(err)
    }

}

Output:

> hello
ok
ok

The Client writes to the Server twice. After the first read, the Server closes the connection immediately, but the Client sleeps for 10 seconds and then writes to the Server again through the same connection object (conn), which the Server has already closed.

Why can the second write succeed (returned error is nil)?

Can anyone help?

PS:

In order to check if the buffering feature of the system affects the result of the second write, I edited the Client like this, but it still succeeds:

// Client
if conn, err := net.Dial("tcp", addr); err == nil {
    _, err := conn.Write([]byte("hello"))
    if err != nil {
        fmt.Println(err)
        conn.Close()
        return
    } else {
        fmt.Println("ok")
    }
    // sleep 10 seconds and re-send
    time.Sleep(10*time.Second)

    b := make([]byte, 400000)
    for i := range b {
        b[i] = 'x'
    }
    n, err := conn.Write(b)
    if err != nil {
        fmt.Println(err)
        conn.Close()
        return
    } else {
        fmt.Println("ok", n)
    }
    // sleep 10 seconds and re-send
    time.Sleep(10*time.Second)
} else {
    panic(err)
}

And here is the screenshot: attachment

1 answer

  • douting0585 2018-07-13 08:06

    There are several problems with your approach.

    Sort-of a preface

    The first one is that you do not wait for the server goroutine to complete. In Go, once main() exits for whatever reason, all the other goroutines still running, if any, are simply torn down forcibly.

    You're trying to "synchronize" things using timers, but this only works in toy situations, and even then it does so only from time to time.

    Hence let's fix your code first:

    package main
    
    import (
        "fmt"
        "log"
        "net"
        "time"
    )
    
    func main() {
        addr := "127.0.0.1:8999"
    
        tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
        if err != nil {
            log.Fatal(err)
        }
        listener, err := net.ListenTCP("tcp", tcpaddr)
        if err != nil {
            log.Fatal(err)
        }
    
        // Server
        done := make(chan error)
        go func(listener net.Listener, done chan<- error) {
            for {
                conn, err := listener.Accept()
                if err != nil {
                    done <- err
                    return
                }
                go func(conn net.Conn) {
                    var buffer [1024]byte
                    n, err := conn.Read(buffer[:])
                    if err != nil {
                        log.Println(err)
                    } else {
                        log.Println(">", string(buffer[0:n]))
                    }
                    if err := conn.Close(); err != nil {
                        log.Println("error closing server conn:", err)
                    }
                }(conn)
            }
        }(listener, done)
    
        // Client
        conn, err := net.Dial("tcp", addr)
        if err != nil {
            log.Fatal(err)
        }
        for i := 0; i < 2; i++ {
            _, err := conn.Write([]byte("hello"))
            if err != nil {
                log.Println(err)
                err = conn.Close()
                if err != nil {
                    log.Println("error closing client conn:", err)
                }
                break
            }
            fmt.Println("ok")
            time.Sleep(2 * time.Second)
        }
    
        // Shut the server down and wait for it to report back
        err = listener.Close()
        if err != nil {
            log.Fatal("error closing listener: ", err)
        }
        err = <-done
        if err != nil {
            log.Println("server returned:", err)
        }
    }
    

    I've sprinkled in a couple of minor fixes, such as using log.Fatal (which is log.Print followed by os.Exit(1)) instead of panicking, removed useless else clauses to adhere to the coding standard of keeping the main flow where it belongs, and shortened the client's pause between writes. I have also added checks for the errors that Close on a socket may return.

    The interesting part is that we now shut the server down properly by closing the listener and then waiting for the server goroutine to report back. (Unfortunately, at the time of writing Go did not return an error of a distinguishable type from net.Listener.Accept in this case, so we can't really check that Accept exited because we closed the listener.) Anyway, our goroutines are now properly synchronized, and there is no undefined behaviour, so we can reason about how the code works.

    Remaining problems

    Some problems still remain.

    The more glaring one is the wrong assumption that TCP preserves message boundaries, that is, that if you write "hello" to the client end of the socket, the server reads back "hello". This is not true: TCP treats both ends of the connection as producing and consuming opaque streams of bytes. This means that when the client writes "hello", the client's TCP stack is free to deliver "he" and postpone sending "llo", and the server's stack is free to yield "hell" to the read call on the socket and only return "o" (and possibly some other data) in a later read.

    So, to make the code "real" you'd need to somehow introduce these message boundaries into the protocol above TCP. In this particular case the simplest approach would be to use "messages" consisting of a fixed-length prefix of agreed-upon endianness indicating the length of the following data, and then the string data itself. The server would then use a sequence like

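    var msg [4100]byte
    // Read the 4-byte length prefix.
    _, err := io.ReadFull(sock, msg[:4])
    if err != nil { ... }
    mlen := int(binary.BigEndian.Uint32(msg[:4]))
    if mlen < 0 || mlen > len(msg)-4 {
      // length does not fit the buffer; handle error
    }
    if mlen == 0 {
      // empty message; go back to reading the next length prefix
    }
    // The payload starts right after the prefix, at offset 4.
    _, err = io.ReadFull(sock, msg[4:4+mlen])
    if err != nil { ... }
    s := string(msg[4:4+mlen])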
    

    Another approach is to agree that the messages do not contain newlines and to terminate each message with a newline (ASCII LF, \n, 0x0a). The server side would then use something like the usual bufio.Scanner loop to get full lines from the socket.

    The remaining problem with your approach is that it does not deal properly with what Read on a socket returns: note that io.Reader.Read (that's what sockets implement, among other things) is allowed to return an error while also having read some data from the underlying stream. In your toy example this might rightfully be unimportant, but suppose that you're writing a wget-like tool which is able to resume downloading a file: even if reading from the server returned some data and an error, you have to deal with that returned chunk first and only then handle the error.

    Back to the problem at hand

    The problem presented in the question, I believe, happens simply because in your setup you hit some TCP buffering problem due to the tiny length of your messages.

    On my box which runs Linux 4.9/amd64 two things reliably "fix" the problem:

    • Sending messages of 4000 bytes in length: the second call to Write "sees" the problem immediately.
    • Doing more Write calls.

    For the former, try something like

    msg := make([]byte, 4000)
    for i := range msg {
        msg[i] = 'x'
    }
    for {
        _, err := conn.Write(msg)
        ...
    

    and for the latter—something like

    for {
        _, err := conn.Write([]byte("hello"))
        ...
        fmt.Println("ok")
        time.Sleep(time.Second / 2)
    }
    

    (it's sensible to lower the pause between sending stuff in both cases).

    It's interesting to note that the former example hits the write: connection reset by peer (ECONNRESET in POSIX) error while the second one hits write: broken pipe (EPIPE in POSIX).

    This is because when we're sending in chunks of 4k bytes, some of the packets generated for the stream manage to become "in flight" before the server's side of the connection manages to propagate the information about its closure to the client, and those packets hit an already closed socket and get rejected with the RST TCP flag set. In the second example, an attempt to send another chunk of data sees that the client side already knows that the connection has been torn down, and the send fails without "touching the wire".

    TL;DR, the bottom line

    Welcome to the wonderful world of networking. ;-)

    I'd recommend buying a copy of "TCP/IP Illustrated", reading it and experimenting. TCP (and IP, and the other protocols above IP) sometimes do not work the way people expect when applying their "common sense".
