2018-10-22 16:30

Why is swapping elements of a []float64 in Go faster than swapping elements of a Vec<f64> in Rust?

I have two (equivalent?) programs, one in Go and the other in Rust. The average execution times are:

  • Go ~169ms
  • Rust ~201ms


package main

import (
    "fmt"
    "time"
)

func main() {
    work := []float64{0.00, 1.00}
    start := time.Now()

    for i := 0; i < 100000000; i++ {
        work[0], work[1] = work[1], work[0]
    }

    elapsed := time.Since(start)
    fmt.Println("Execution time: ", elapsed)
}


I compiled the Rust program with --release:

use std::time::Instant;

fn main() {
    let mut work: Vec<f64> = Vec::new();
    work.push(0.00);
    work.push(1.00);

    let now = Instant::now();

    for _x in 1..100000000 {
        work.swap(0, 1);
    }

    let elapsed = now.elapsed();
    println!("Execution time: {:?}", elapsed);
}

Is Rust less performant than Go in this instance? Could the Rust program be written in an idiomatic way, to execute faster?


2 answers

  • dougu3290 2018-10-22 16:51

    Could the Rust program be written in an idiomatic way, to execute faster?

    Yes. To create a vector with a few elements, use the vec![] macro:

    let mut work: Vec<f64> = vec![0.0, 1.0];
    for _x in 1..100000000 {
        work.swap(0, 1);
    }

    So is this code faster? Yes. Have a look at what assembly is generated:

      mov eax, 99999999
    .LBB0_1:
      add eax, -11
      jne .LBB0_1

    On my PC, this runs about 30 times faster than your original code.

    Why does the assembly still contain this loop that is doing nothing? Why isn't the compiler able to see that two pushes are the same as vec![0.0, 1.0]? Both very good questions and both probably point to a flaw in LLVM or the Rust compiler.

    However, sadly, there isn't much useful information to gain from your micro benchmark. Benchmarking is hard, like really hard. There are so many pitfalls that even professionals fall for. In your case, the benchmark is flawed in several ways. For a start, you never observe the contents of the vector later (it is never used). That's why a good compiler can remove all code that even touches the vector (as the Rust compiler did above). So that's not good.

    Apart from that, this does not resemble any real performance-critical code. Even if the vector were observed later, swapping an odd number of times is equivalent to a single swap. So unless you wanted to see whether the optimizer understands this swapping rule, sadly your benchmark isn't really useful.
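
    To illustrate the point about the vector never being observed, here is a minimal sketch of one way to keep the swaps from being optimized away. It assumes a toolchain recent enough to provide std::hint::black_box, which is only a best-effort hint to the optimizer, not a guarantee. The helper names swap_many and time_swaps are made up for this example.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Swaps the first two elements `iterations` times.
/// `black_box` asks the optimizer to treat the slice as observed on every
/// iteration, so the swaps are much less likely to be deleted or merged.
/// (Hypothetical helper, not part of the original program.)
fn swap_many(work: &mut [f64], iterations: u64) {
    for _ in 0..iterations {
        work.swap(0, 1);
        black_box(&mut *work);
    }
}

/// Times a call to `swap_many` and returns the elapsed wall-clock time.
/// (Hypothetical helper, not part of the original program.)
fn time_swaps(work: &mut [f64], iterations: u64) -> Duration {
    let start = Instant::now();
    swap_many(work, iterations);
    start.elapsed()
}

fn main() {
    let mut work: Vec<f64> = vec![0.0, 1.0];
    let elapsed = time_swaps(&mut work, 100_000_000);
    // Printing the vector also observes the result after the loop.
    println!("Execution time: {:?}, work = {:?}", elapsed, work);
}
```

    Even with this change the loop is still a micro benchmark of a single trivial operation, so the numbers say little about real code.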

  • duanbushi1867 2018-10-22 18:09

    (Not an answer) but to augment what Lukas wrote, here's what Go 1.11 generates for the loop itself:

        xorl    CX, CX
        movsd   8(AX), X0
        movsd   (AX), X1
        movsd   X0, (AX)
        movsd   X1, 8(AX)
        incq    CX
        cmpq    CX, $100000000
        jlt     68

    (Courtesy of https://godbolt.org)

    In either case, note that the time you measured was most probably dominated by the startup and initialization of the processes, so you did not actually measure the speed of the loops themselves. In other words, your approach is not correct.
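
    One way to check how much a single measurement can be trusted is to time the same loop several times inside one process and compare the samples. The following is only a sketch under that assumption: time_runs is a hypothetical helper, and std::hint::black_box (a best-effort optimizer hint) is used so that the loop is not removed entirely.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Calls `f` `runs` times and returns how long each call took.
/// (Hypothetical helper introduced for this example.)
fn time_runs<F: FnMut()>(runs: usize, mut f: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect()
}

fn main() {
    let mut work = vec![0.0_f64, 1.0];

    // Time the same swap loop five times within a single process.
    let samples = time_runs(5, || {
        for _ in 0..100_000_000u32 {
            work.swap(0, 1);
            // Ask the optimizer to treat the vector as observed.
            black_box(&mut work);
        }
    });

    for (i, sample) in samples.iter().enumerate() {
        println!("run {}: {:?}", i, sample);
    }
    if let Some(fastest) = samples.iter().min() {
        println!("fastest: {:?}", fastest);
    }
    println!("work = {:?}", work);
}
```

    If the samples differ widely from each other, a single timing such as the ones in the question is not a reliable basis for comparing the two languages.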

