See math.Sqrt for an example of how to do this.
- Write a stub function (a declaration with no body) that carries the documentation.
- Write a generic implementation as an unexported function.
- For each architecture, write an assembly function that either jumps to the unexported generic implementation or implements the function directly.
To handle the cpuid check, set a package variable in init() and have the assembly implementation branch on that variable.
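
The overall shape can be sketched in pure Go so it runs as a single file. This is only an analogue, not the real layout: with assembly, the exported function would be a body-less stub and the branch on the flag would live in the per-architecture .s file. The names PopCount, popCountGeneric, popCountFast and hasFastPath are hypothetical, and the runtime.GOARCH test is a placeholder assumption standing in for a real cpuid probe.

```go
package main

import (
	"fmt"
	"math/bits"
	"runtime"
)

// hasFastPath is the package variable set once in init().
// In the real pattern the assembly routine reads this flag and branches on it.
var hasFastPath bool

func init() {
	// Placeholder for a cpuid check (assumption for illustration only):
	// this sketch just looks at the architecture name.
	hasFastPath = runtime.GOARCH == "amd64"
}

// PopCount returns the number of set bits in x.
// With assembly this would be a declaration with no body; the
// documentation comment stays on this exported function either way.
func PopCount(x uint64) int {
	if hasFastPath {
		return popCountFast(x)
	}
	return popCountGeneric(x)
}

// popCountGeneric is the portable, unexported implementation that
// every architecture can fall back to.
func popCountGeneric(x uint64) int {
	n := 0
	for x != 0 {
		x &= x - 1 // clear the lowest set bit
		n++
	}
	return n
}

// popCountFast stands in for the architecture-specific routine.
// Here it simply calls math/bits instead of hand-written assembly.
func popCountFast(x uint64) int {
	return bits.OnesCount64(x)
}

func main() {
	fmt.Println(PopCount(0xFF)) // prints 8 on either path
}
```

Both paths must return identical results, so it is worth testing the generic function directly as well as through the exported entry point.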