You have several mistakes in your logic, your benchmark and your assumptions.
As for the casting: your results show that your for loop ran 1,000 times. Since each run loops 1M times, that actually makes 1 billion casting operations... not too shabby.
Actually, I have reworked your code a bit:
const (
	min = float64(math.SmallestNonzeroFloat32)
	max = float64(math.MaxFloat32)
)
func cast(in float64) (out float32, err error) {
	// We need to guard here: float32 has a much smaller range than float64,
	// so the value might not be representable after the conversion.
	// Compare the magnitude, so that zero and negative values pass the check.
	abs := math.Abs(in)
	if in != 0 && abs < min {
		return 0, fmt.Errorf("%g is closer to zero than the smallest nonzero float32 (%g)", in, min)
	} else if abs > max {
		return 0, fmt.Errorf("%g is outside the float32 range (largest magnitude %g)", in, max)
	}
	return float32(in), nil
}
// multi64 takes a variadic parameter, so that it can multiply
// an arbitrary number of values.
func multi64(in ...float64) (result float32, err error) {
	// Start at 1.0, since the zero value of a float64 is 0.0
	// and would turn every product into zero.
	im := 1.0
	for _, v := range in {
		im *= v
	}
	// Do the calculation with the original precision and cast only ONCE.
	// Ideally even this cast would not happen here but in the caller,
	// as the caller knows how to deal with special cases.
	return cast(im)
}
// multi32 is a rather nonsensical wrapper, since the for loop
// could easily be done in the caller.
// It is only here for comparison purposes.
func multi32(in ...float32) (result float32) {
	result = 1.00
	for _, v := range in {
		result = result * v
	}
	return result
}
// openFile is here for comparison, to show that you can do
// a... fantastic metric ton of casts in the time a single I/O operation takes.
func openFile() error {
	f, err := os.Open("cast.go")
	if err != nil {
		// Keep the underlying error instead of dropping it.
		return fmt.Errorf("error opening file: %v", err)
	}
	defer f.Close()
	br := bufio.NewReader(f)
	if _, _, err := br.ReadLine(); err != nil {
		return fmt.Errorf("error reading line: %v", err)
	}
	return nil
}
I benchmarked the reworked code with the following test code:
func init() {
	rand.Seed(time.Now().UTC().UnixNano())
}
func BenchmarkCast(b *testing.B) {
	b.StopTimer()
	v := rand.Float64()
	var err error
	b.ResetTimer()
	b.StartTimer()
	for i := 0; i < b.N; i++ {
		if _, err = cast(v); err != nil {
			b.Fail()
		}
	}
}
func BenchmarkMulti32(b *testing.B) {
	b.StopTimer()
	vals := make([]float32, 10)
	for i := 0; i < 10; i++ {
		vals[i] = rand.Float32() * float32(i+1)
	}
	b.ResetTimer()
	b.StartTimer()
	for i := 0; i < b.N; i++ {
		multi32(vals...)
	}
}
func BenchmarkMulti64(b *testing.B) {
	b.StopTimer()
	vals := make([]float64, 10)
	for i := 0; i < 10; i++ {
		vals[i] = rand.Float64() * float64(i+1)
	}
	var err error
	b.ResetTimer()
	b.StartTimer()
	for i := 0; i < b.N; i++ {
		if _, err = multi64(vals...); err != nil {
			b.Log(err)
			b.Fail()
		}
	}
}
func BenchmarkOpenFile(b *testing.B) {
	var err error
	for i := 0; i < b.N; i++ {
		if err = openFile(); err != nil {
			b.Log(err)
			b.Fail()
		}
	}
}
You get something like this:
BenchmarkCast-4        1000000000        2.42 ns/op
BenchmarkMulti32-4      300000000        5.04 ns/op
BenchmarkMulti64-4      200000000        8.19 ns/op
BenchmarkOpenFile-4        100000       19591 ns/op
So, even with this relatively stupid and unoptimized code, the culprit is the openFile benchmark.
Now, let us put this into perspective. 19,591 ns equals 0.019591 milliseconds. The average human can perceive latencies of about 20 milliseconds. So each single one of those 100,000 ("one hundred thousand") file openings, line reads and file closes is about 1,000 times faster than anything a human can perceive.
Casting, compared to this, is several orders of magnitude faster (2.42 ns versus 19,591 ns per operation), so cast all you like: your bottleneck will be I/O.
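The arithmetic behind this comparison can be checked with a few lines of Go. The constants are taken from the benchmark output above, the 20 ms perception threshold is the rough figure used in this answer, and the function names are only illustrative.

```go
package main

import "fmt"

const (
	castNsPerOp     = 2.42    // BenchmarkCast, ns/op
	openFileNsPerOp = 19591.0 // BenchmarkOpenFile, ns/op
	perceptionMs    = 20.0    // rough perception threshold in ms (assumption from the text)
)

// openFileMs converts the openFile result from nanoseconds to milliseconds.
func openFileMs() float64 { return openFileNsPerOp / 1e6 }

// headroom tells how many openFile calls fit into the perception threshold.
func headroom() float64 { return perceptionMs / openFileMs() }

// castsPerOpen tells how many casts fit into the time of one openFile call.
func castsPerOpen() float64 { return openFileNsPerOp / castNsPerOp }

func main() {
	fmt.Printf("one openFile call takes %.6f ms\n", openFileMs())
	fmt.Printf("about %.0f openFile calls fit into %.0f ms\n", headroom(), perceptionMs)
	fmt.Printf("about %.0f casts fit into one openFile call\n", castsPerOpen())
}
```

With these numbers, a single openFile call costs roughly as much as eight thousand casts, which is why the cast is not worth optimizing first.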
Edit
Which leaves the question: why do you not import the values as float64 in the first place?
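If the values arrive as text, a minimal sketch of reading them as float64 directly could look like the following. The input format (one decimal number per line) and the helper name parseValues are assumptions, since the import code is not shown here. strconv.ParseFloat with a bitSize of 64 returns a float64, so no later cast is needed.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseValues reads one number per line and returns them as float64.
// (Hypothetical helper; the real input format is an assumption.)
func parseValues(input string) ([]float64, error) {
	var vals []float64
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue // skip blank lines
		}
		v, err := strconv.ParseFloat(line, 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %v", line, err)
		}
		vals = append(vals, v)
	}
	return vals, sc.Err()
}

func main() {
	vals, err := parseValues("1.5\n2.25\n4\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(vals)
}
```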