All the answers on the question you found say it's not possible to use inline-asm in Go, with any syntax. GNU C inline-asm syntax isn't going to help.
But fortunately, you don't need inline asm for bsr (which finds the bit-index of the highest set bit). Go 1.9 added the math/bits package, whose bit-manipulation functions the compiler can treat as intrinsics; they are close enough to bsr that they should compile efficiently.
Use bits.LeadingZeros32 (from the math/bits package) to get lzcnt(x), which is 31-bsr(x) for non-zero x. This may cost extra instructions, especially on CPUs which only support bsr, not lzcnt (e.g. Intel pre-Haswell).
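As a minimal sketch of the 31-lzcnt(x) identity above (the helper name bsrViaLzcnt is made up for illustration, not part of math/bits):

```go
package main

import (
	"fmt"
	"math/bits"
)

// bsrViaLzcnt is a hypothetical helper: it returns the bit-index of the
// highest set bit of x, computed as 31 - lzcnt(x).
// For x == 0, LeadingZeros32 returns 32, so this yields -1.
func bsrViaLzcnt(x uint32) int {
	return 31 - bits.LeadingZeros32(x)
}

func main() {
	fmt.Println(bsrViaLzcnt(1))          // 0
	fmt.Println(bsrViaLzcnt(0x80000000)) // 31
	fmt.Println(bsrViaLzcnt(0))          // -1
}
```

Unlike the bsr instruction, the result for 0 is well defined here, which is one reason the compiler may need more than a single instruction.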
Or use Len32(x) - 1:
Len32(x) returns the number of bits required to represent x. It returns 0 for x=0 and 1 for x=1, so it's bsr(x) + 1, with defined behaviour for 0 (thus potentially costing extra instructions). Hopefully Len32(x) - 1 can compile directly to a bsr.
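A small sketch of the Len32 approach, printing a few values so the bsr(x) + 1 relationship is visible (bsrViaLen is a hypothetical name chosen for this example):

```go
package main

import (
	"fmt"
	"math/bits"
)

// bsrViaLen is a hypothetical helper: bit-index of the highest set bit,
// computed as Len32(x) - 1. It returns -1 for x == 0, since Len32(0) is 0.
func bsrViaLen(x uint32) int {
	return bits.Len32(x) - 1
}

func main() {
	for _, x := range []uint32{0, 1, 2, 3, 0x80000000} {
		fmt.Printf("Len32(%#x) = %d, Len32 - 1 = %d\n", x, bits.Len32(x), bsrViaLen(x))
	}
}
```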
Of course, if what you really wanted was lzcnt, then use LeadingZeros32 in the first place.
Note that bsr leaves the destination register unmodified for input = 0. Intel's docs only say the destination is left with an undefined value, so compilers probably don't take advantage of this behaviour, which AMD documents and Intel does provide in hardware.
At least in theory, though, Len32(x) - 1 could compile to a single bsr instruction if the compiler can prove that x is non-zero.
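One way to hand the compiler that non-zero fact is to branch on it first. This is only a sketch: highestSetBit is a hypothetical wrapper, and whether a given Go version actually drops the zero handling is something you would have to check in the generated code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// highestSetBit is a hypothetical wrapper. The explicit zero check means
// Len32 is only reached with x != 0, which is the condition under which a
// single bsr would suffice. No particular code generation is guaranteed.
func highestSetBit(x uint32) (idx int, ok bool) {
	if x == 0 {
		return 0, false
	}
	return bits.Len32(x) - 1, true
}

func main() {
	if i, ok := highestSetBit(40); ok {
		fmt.Println(i) // 5, since 40 is 0b101000
	}
	// To inspect what the compiler emitted, build with: go build -gcflags=-S
}
```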