douwu7563 2019-03-20 14:30
浏览 79

为什么我的Go数组排序代码比Java慢得多?

After migrating one of my computing heavy backend programs from Java to Go, I find that the performance degraded instead of improving. I tested around some and it seems the array sorting code is the culprit (which I used heavily in my program). I have written the below two simplified programs to do a comparison, and somehow it seems Go's built-in sort function is a lot slower than Java's Arrays.sort method?

package main

import (
    "fmt"
    "math/rand"
    "sort"
    "time"
)

func main() {
    fmt.Println("Starting")
    const x = 1000000
    const y = x * 10
    var s [y]float64
    s1 := rand.NewSource(time.Now().UnixNano())
    r1 := rand.New(s1)
    start1 := time.Now()
    for i := 0; i < y; i++ {
        s[i] = r1.Float64()
    }
    end1 := time.Since(start1)
    ss := s[:]
    start2 := time.Now()
    sort.Float64s(ss)
    end2 := time.Since(start2)
    fmt.Println(end1)
    fmt.Println(end2)
    fmt.Println("Number: ", ss[x])
}

and it produces results like this:

Starting
136.6331ms  // The time taken to generate 10,000,000 random numbers
3.456781s   // The time taken to sort the 10,000,000 random numbers
Number:  0.10000285497001288

While with the Java program here

import java.util.*;

class RSTest {
    public static void main(String[] args) {
        System.out.println("Starting");
        int x = 1000000;
        int y = x * 10;
        Random gen = new Random(System.currentTimeMillis());
        double[] s = new double[y];
        long start1 = System.nanoTime();
        for (int i = 0; i < y; i++) {
            s[i] = gen.nextDouble();
        }
        long end1 = System.nanoTime();
        long start2 = System.nanoTime();
        Arrays.sort(s);
        long end2 = System.nanoTime();
        System.out.println((end1 - start1) / (1000000000.0));
        System.out.println((end2 - start2) / (1000000000.0));
        System.out.println(s[x]);
    }
}

the results are like this

Starting
0.2252634  // The time taken to generate 10,000,000 random numbers
1.0303157  // The time taken to sort the 10,000,000 random numbers
0.0999513608326642

the Go program takes around 130ms to generate 10 million random numbers and assign them to an array while Jave takes around 230ms to generate 10 million random numbers and assign them to an array, this part I think is the improvement I expect from going from Java to Go.

But for the sorting part, it took Java only around 1s to sort the 10 million random numbers but it took Go around 3.5s to do the 10 million random number sort? And this is quite consistent from multiple runs of the test.

So does that mean Go's built-in sort function is really this much inferior to Java's Arrays.sort method? Or did I use Go's sort function wrong? Or something is wrong with my programs?

Thanks.

Note: this is from Go 1.11 and Java 8, the current versions I'm running on my server. Also, please note that the two programs I posted here are purely for testing purposes that I have written in a couple minutes so may (or rather, most certainly do) contain some code that doesn't make much sense for real production systems.

Some Update:

Thanks to @nussjustin's suggestion, I tried sort.Slice with some promising results.

As currently I'm out of office and using a slower notebook, the baseline results for the two above tests are like this now:

For the Java Arrays.sort test

Starting
0.3590694
1.6030528 // The time taken to sort the 10,000,000 random numbers
0.10000905418967532

For the Go sort.Float64s test

Go
Starting
233.1957ms
5.4633992s // The time taken to sort the 10,000,000 random numbers
Number:  0.10002801819954663

And now after modifying the Go test with sort.Slice

package main

import (
    "fmt"
    "math/rand"
    "sort"
    "time"
)

func main() {
    fmt.Println("Starting")
    const x = 1000000
    const y = x * 10
    var s [y]float64
    s1 := rand.NewSource(time.Now().UnixNano())
    r1 := rand.New(s1)
    start1 := time.Now()
    for i := 0; i < y; i++ {
        s[i] = r1.Float64()
    }
    end1 := time.Since(start1)
    ss := s[:]
    start2 := time.Now()
    sort.Slice(ss, func(i, j int) bool { return ss[i] < ss[j] })
    end2 := time.Since(start2)
    fmt.Println(end1)
    fmt.Println(end2)
    fmt.Println("Number: ", ss[x])
}

The result is a big improvement over sort.Float64s, but still not as good as Java's array sort

Starting
281.4262ms
3.6745684s // The time taken to sort the 10,000,000 random numbers
Number:  0.10010604106864159

And I think someone complained that there is only 1 distribution for the tests (who later removed his comment), I tested for sorting normal distribution of random numbers too (albeit I'd say such a huge performance difference in sorting uniform distribution of random numbers is already quite a bad sign since the algorithms of sorting uniform distribution of random numbers should be quite mature already)

I just replace the random number generator from uniform distribution to normal distribution like this

Go:

s[i] = r1.NormFloat64()

Java:

s[i] = gen.nextGaussian();

And the result of Java's Arrays.sort method is

Starting
1.4126348
1.6118655
-1.2820310313627319

And Go's sort.Slice

Starting
434.9106ms
3.8936811s
Number:  -1.2818667132095363

So Go is sort.Slice is still about twice slower than Java's Arrays.sort, same as for uniform distribution of random numbers. The good thing is that in generating the normal distribution of random numbers, Go is three times faster than Java, compared to about 70% faster in generating uniform distribution of numbers.

  • 写回答

2条回答 默认 最新

  • douyangquan2474 2019-03-20 14:49
    关注

    Twenty years ago, Java 1.1 was slow. Since then, thousands of people have put their minds to the task to fix that. Today, Java code is usually on par or faster than C++. With Java 12 and GraalVM, we're all going to see another boost.

    Bad code in Java can be slow but the same is true for C++. Java doesn't come with a brain, you have to use your own :-)

    To answer the question, the code looks correct. My guess is that Java's sort implementation has in fact been optimized to the brink with data from thousands of use cases. Just look at the length: ~3000 lines with tons of corner cases compared to 500 in Go.

    评论

报告相同问题?

悬赏问题

  • ¥15 安卓adb backup备份应用数据失败
  • ¥15 eclipse运行项目时遇到的问题
  • ¥15 关于#c##的问题:最近需要用CAT工具Trados进行一些开发
  • ¥15 南大pa1 小游戏没有界面,并且报了如下错误,尝试过换显卡驱动,但是好像不行
  • ¥15 没有证书,nginx怎么反向代理到只能接受https的公网网站
  • ¥50 成都蓉城足球俱乐部小程序抢票
  • ¥15 yolov7训练自己的数据集
  • ¥15 esp8266与51单片机连接问题(标签-单片机|关键词-串口)(相关搜索:51单片机|单片机|测试代码)
  • ¥15 电力市场出清matlab yalmip kkt 双层优化问题
  • ¥30 ros小车路径规划实现不了,如何解决?(操作系统-ubuntu)