Race condition when reading stdout and stderr of a child process

In Go, I'm trying to:

start a subprocess
read from stdout and stderr separately
implement an overall timeout

After much googling, we've come up with some code that seems to do the job, most of the time. But there seems to be a race condition whereby some output is not read.

The problem seems to only occur on Linux, not Windows.

Following the simplest possible solution found with google, we tried creating a context with a timeout:

context.WithTimeout(context.Background(), 10*time.Second)

While this worked most of the time, we were able to find cases where it would just hang forever. There was some aspect of the child process that caused this to deadlock. (Something to do with grandchildren that were not sufficiently disassociated from the child process, and thus caused the child to never completely exit.)

Also, it seemed that in some cases the error that is returned when the timeout occurs would indicate a timeout, but would only be delivered after the process had actually exited (thus making the whole concept of the timeout useless).

func GetOutputsWithTimeout(command string, args []string, timeout int) (io.ReadCloser, io.ReadCloser, int, error) {
    start := time.Now()
    procLogger.Tracef("Initializing %s %+v", command, args)
    cmd := exec.Command(command, args...)

    // get pipes to standard output/error
    stdout, err := cmd.StdoutPipe()
    if err != nil {
        return emptyReader(), emptyReader(), -1, fmt.Errorf("cmd.StdoutPipe() error: %+v", err.Error())
    }
    stderr, err := cmd.StderrPipe()
    if err != nil {
        return emptyReader(), emptyReader(), -1, fmt.Errorf("cmd.StderrPipe() error: %+v", err.Error())
    }

    // setup buffers to capture standard output and standard error
    var buf bytes.Buffer
    var ebuf bytes.Buffer

    // create a channel to capture any errors from wait
    done := make(chan error)
    // create a semaphore to indicate when both pipes are closed
    var wg sync.WaitGroup
    wg.Add(2)

    go func() {
        if _, err := buf.ReadFrom(stdout); err != nil {
            procLogger.Debugf("%s: Error Slurping stdout: %+v", command, err)
        }
        wg.Done()
    }()
    go func() {
        if _, err := ebuf.ReadFrom(stderr); err != nil {
            procLogger.Debugf("%s: Error  Slurping stderr: %+v", command, err)
        }
        wg.Done()
    }()

    // start process
    procLogger.Debugf("Starting %s", command)
    if err := cmd.Start(); err != nil {
        procLogger.Errorf("%s: failed to start: %+v", command, err)
        return emptyReader(), emptyReader(), -1, fmt.Errorf("cmd.Start() error: %+v", err.Error())
    }

    go func() {
        procLogger.Debugf("Waiting for %s (%d) to finish", command, cmd.Process.Pid)
        err := cmd.Wait()                                             // this can  be 'forced' by the killing of the process
        procLogger.Tracef("%s finished: errStatus=%+v", command, err) // err could be nil here
        //notify select of completion, and the status
        done <- err
    }()

    // Wait for timeout or completion.
    select {
    // Timed out
    case <-time.After(time.Duration(timeout) * time.Second):
        elapsed := time.Since(start)
        procLogger.Errorf("%s: timeout after %.1f\n", command, elapsed.Seconds())
        if err := TerminateTree(cmd); err != nil {
            return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), -1,
                fmt.Errorf("failed to kill %s, pid=%d: %+v",
                    command, cmd.Process.Pid, err)
        }
        wg.Wait() // this *should* take care of waiting for stdout and stderr to be collected after we killed the process
        return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), -1,
            fmt.Errorf("%s: timeout %d s reached, pid=%d process killed",
                command, timeout, cmd.Process.Pid)
    //Exited normally or with a non-zero exit code
    case err := <-done:
        wg.Wait() // this *should* take care of waiting for stdout and stderr to be collected after the process terminated naturally.
        elapsed := time.Since(start)
        procLogger.Tracef("%s: Done after %.1f\n", command, elapsed.Seconds())
        rc := -1
        // Note that we have to use go1.10 compatible mechanism.
        if err != nil {
            procLogger.Tracef("%s exited with error: %+v", command, err)
            exitErr, ok := err.(*exec.ExitError)
            if ok {
                ws := exitErr.Sys().(syscall.WaitStatus)
                rc = ws.ExitStatus()
            }
            procLogger.Debugf("%s exited with status %d", command, rc)
            return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), rc,
                fmt.Errorf("%s: process done with error: %+v",
                    command, err)
        } else {
            ws := cmd.ProcessState.Sys().(syscall.WaitStatus)
            rc = ws.ExitStatus()
        }
        procLogger.Debugf("%s exited with status %d", command, rc)
        return ioutil.NopCloser(&buf), ioutil.NopCloser(&ebuf), rc, nil
    }
    //NOTREACHED: should not reach this line!
}

Calling GetOutputsWithTimeout("uname",[]string{"-mpi"},10) will return the expected single line of output most of the time. But sometimes it will return no output at all, as if the goroutine that reads stdout didn't start soon enough to "catch" all the output (or exited early?). The "most of the time" strongly suggests a race condition.

We will also sometimes see errors from the goroutines about "file already closed" (this seems to happen with the timeout condition, but will happen at other "normal" times as well).

I would have thought that starting the goroutines before the cmd.Start() would have ensured that no output would be missed, and that using the WaitGroup would guarantee they would both complete before reading the buffers.

So how are we missing output? Is there still a race condition between the two "reader" goroutines and the cmd.Start()? Should we ensure those two are running using yet another WaitGroup?

Or is there a problem with the implementation of ReadFrom()?

Note that we are currently using go1.10 due to backward-compatibility problems with older OSs but the same effect occurs with go1.12.4.

Or are we overthinking this, and a simple implementation with context.WithTimeout() would do the job?
