Firstly, I have to say it: Profile first. Is this really a bottleneck in your code? If it is, you have a few options.
1) Disable bounds checking. I think there's an undocumented compiler flag that turns off slice bounds checking. I can't find it at the moment though. (EDIT: -B according to OP).
2) Write the routine in C (or assembler). You can write C for [586]c and link it into your Go package (you'll need to include some headers from $GOROOT/src/pkg/runtime), like so:
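#include "runtime.h"

// Sketch only: assumes the Go side declares func swapslice(s []int32)
// and that the data pointer in runtime.h's Slice struct is named array.
void
mypackage·swapslice(Slice s)
{
	int32 i, j, tmp, *a;

	a = (int32*)s.array;
	// Walk inward from both ends, swapping as we go.
	for (i = 0, j = s.len - 1; i < j; i++, j--) {
		tmp = a[i]; a[i] = a[j]; a[j] = tmp;
	}
}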
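
Before trying either option, it's worth measuring. Here is a minimal sketch of how you might time a plain Go version with the standard testing package. The reverse function and the slice size are placeholders I made up for illustration, not anything from your code:

```go
package main

import (
	"fmt"
	"testing"
)

// reverse is a hypothetical stand-in for the routine you suspect is slow.
// It reverses s in place by swapping elements from both ends inward.
func reverse(s []int) {
	for i, j := 0, len(s)-1; i < j; i, j = i+1, j-1 {
		s[i], s[j] = s[j], s[i]
	}
}

func main() {
	// Arbitrary test size; pick something representative of your workload.
	data := make([]int, 1<<16)
	for i := range data {
		data[i] = i
	}

	// testing.Benchmark runs the function with increasing b.N until the
	// timing is stable, and works outside of "go test".
	res := testing.Benchmark(func(b *testing.B) {
		for n := 0; n < b.N; n++ {
			reverse(data)
		}
	})
	fmt.Println(res) // iteration count and ns/op
}
```

Running this once with a normal build and once with bounds checking disabled (for example via go build -gcflags=-B on current toolchains) should show whether the checks cost enough to justify dropping down to C.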