I came across the following Go code:

type Element [12]uint64

func CSwap(x, y *Element, choice uint8)

func Add(z, x, y *Element)

where the CSwap and Add functions are implemented in assembly and look like this:

TEXT ·CSwap(SB), NOSPLIT, $0-17

    MOVQ    x+0(FP), REG_P1
    MOVQ    y+8(FP), REG_P2
    MOVB    choice+16(FP), AL   // AL = 0 or 1
    MOVBLZX AL, AX              // AX = 0 or 1
    NEGQ    AX                  // RAX = 0x00..00 or 0xff..ff

    MOVQ    (0*8)(REG_P1), BX
    MOVQ    (0*8)(REG_P2), CX
    // Rest removed for brevity

TEXT ·Add(SB), NOSPLIT, $0-24

    MOVQ    z+0(FP), REG_P3
    MOVQ    x+8(FP), REG_P1
    MOVQ    y+16(FP), REG_P2

    MOVQ    (REG_P1), R8
    MOVQ    (8)(REG_P1), R9
    MOVQ    (16)(REG_P1), R10
    MOVQ    (24)(REG_P1), R11
    // Rest removed for brevity

What I'm trying to do is translate this assembly into a syntax that is more familiar to me (I think mine is closer to NASM); the syntax above is Go assembler. I didn't have much trouble with the Add function and translated it correctly (according to my test results). In my case it looks like this:

.global add_asm
  push   r12
  push   r13
  push   r14
  push   r15

  mov    r8, [reg_p1]
  mov    r9, [reg_p1+8]
  mov    r10, [reg_p1+16]
  mov    r11, [reg_p1+24]
  // Rest removed for brevity

But I have a problem translating the CSwap function. I have something like this:

.global cswap_asm
  push   r12
  push   r13
  push   r14

  mov    al, 16
  mov    rax, al
  neg    rax

  mov    rbx, [reg_p1+(0*8)]
  mov    rcx, [reg_p2+(0*8)]

But this doesn't seem to be quite right, as I get an error when assembling it. Any ideas how to translate the CSwap assembly above into something like NASM?


Okay, after the two answers below, and some testing and digging, I found out that the code uses the following three registers for parameter passing:

#define reg_p1  rdi
#define reg_p2  rsi
#define reg_p3  rdx

Accordingly, rdx has the value of the choice parameter. So, all that I had to do was use this:

movzx  rax, dl // Get the lower 8 bits of rdx (reg_p3)
neg    rax

Using byte [rdx] or byte [reg_p3] was giving an error, but using dl seems to work fine for me.
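For reference, here is a rough Go model of what the movzx / neg pair computes, and of how such a mask is commonly used in a branch-free conditional swap. The loop body is an assumption about the part marked "Rest removed for brevity" (a typical XOR-and-mask pattern), not a transcription of the original assembly, and the names mask and cswap are made up for illustration.

```go
package main

import "fmt"

// Element mirrors the type from the question.
type Element [12]uint64

// mask models "movzx rax, dl" followed by "neg rax":
// zero-extend the byte, then take the two's-complement negation.
// For choice 0 the result is all zero bits; for choice 1, all one bits.
func mask(choice uint8) uint64 {
	return -uint64(choice)
}

// cswap swaps x and y when choice is 1 and leaves them alone when it is 0,
// without branching on choice.
// ASSUMPTION: this XOR-and-mask loop is a common pattern, not the elided assembly.
func cswap(x, y *Element, choice uint8) {
	m := mask(choice)
	for i := range x {
		t := m & (x[i] ^ y[i])
		x[i] ^= t
		y[i] ^= t
	}
}

func main() {
	x, y := Element{1, 2, 3}, Element{7, 8, 9}
	cswap(&x, &y, 0)
	fmt.Println(x[0], y[0]) // choice 0: values stay where they were
	cswap(&x, &y, 1)
	fmt.Println(x[0], y[0]) // choice 1: values are exchanged
}
```

The model only holds if choice is exactly 0 or 1, which is what the comments in the original assembly assume as well.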

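dsjklb0205 Since mov requires two operands of the same size, try movzx.
dououde4065 Your translation of the Go Add into the NASM add_asm looks wrong: you're missing the part that loads the arguments into REG_P1, REG_P2 and REG_P3. From their names, REG_P1 and so on look like registers, but who knows which registers.
douchen7366 You can always check your translation by assembling with the Go assembler and using a disassembler on the object file (e.g. objdump -drwC -Mintel). That gives you GNU .intel_syntax, which is MASM-like but easy to read when you're used to NASM.
douping1581 That's definitely not gas; it's the Go assembler.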


Basic docs about Go's asm: https://golang.org/doc/asm. It's not totally equivalent to NASM or AT&T syntax: FP is a pseudo-register name for whichever register the toolchain decides to use as the frame pointer (typically RSP or RBP). Go asm also seems to omit function prologue (and probably epilogue) instructions. As @RossRidge comments, it's a bit more like an internal representation such as LLVM IR than true asm.

Go also has its own object-file format, so I'm not sure you can make Go-compatible object files with NASM.

If you want to call this function from something other than Go, you'll also need to port the code to a different calling convention. Go appears to be using a stack-args calling convention even for x86-64, unlike the normal x86-64 System V ABI or the x86-64 Windows calling convention. (Or maybe the instructions that mov the function args into REG_P1 and so on disappear when Go builds this source for a register-arg calling convention?)

(This is why you had to use movzx eax, dl instead of loading from the stack at all.)

BTW, rewriting this code in C instead of NASM would probably make even more sense if you want to use it with C. Small functions are best inlined and optimized away by the compiler.

It would be a good idea to check your translation, or get a starting point, by assembling with the Go assembler and using a disassembler.

objdump -drwC -Mintel or Agner Fog's objconv disassembler would be good, but they don't understand Go's object-file format. If Go has a tool to extract the actual machine code or get it in an ELF object file, do that.

If not, you could use ndisasm -b 64 (which treats input files as flat binaries, disassembling all the bytes as if they were instructions). You can specify an offset/length if you can find out where the function starts. x86 instructions are variable length, and disassembly will likely be "out of sync" at the start of the function. You might want to add a bunch of single-byte NOP instructions (kind of a NOP sled) for the disassembler, so if it decodes some 0x90 bytes as part of an immediate or disp32 for a long instruction that was really not part of the function, it will be in sync. (But the function prologue will still be messed up).

You might add some "signpost" instructions to your Go asm functions to make it easy to find the right place in the mess of crazy asm from disassembling metadata as instructions. e.g. put a pmuludq xmm0, xmm0 in there somewhere, or some other instruction with a unique mnemonic that you can search for which the Go code doesn't include. Or an instruction with an immediate that will stand out, like addq $0x1234567, SP. (An instruction that will crash so you don't forget to take it out again is good here.)
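As a rough illustration of the signpost idea, the sketch below scans a byte buffer for the machine-code bytes of the marker instruction and reports the offsets. The byte sequence 66 0F F4 C0 is what I'd expect pmuludq xmm0, xmm0 to assemble to, but treat that as an assumption and check it against your assembler's output; the function name findSignposts and the sample buffer are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// signpost holds the bytes expected for "pmuludq xmm0, xmm0".
// ASSUMPTION: 66 0F F4 C0; verify against your assembler before relying on it.
var signpost = []byte{0x66, 0x0F, 0xF4, 0xC0}

// findSignposts returns every offset in blob at which the marker bytes occur.
func findSignposts(blob []byte) []int {
	var offsets []int
	for start := 0; ; {
		i := bytes.Index(blob[start:], signpost)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, start+i)
		start += i + 1 // continue the search just past this match
	}
}

func main() {
	// Stand-in for the contents of an object file; in practice the bytes
	// would come from os.ReadFile on the assembler's output.
	blob := []byte{0x00, 0x90, 0x90, 0x66, 0x0F, 0xF4, 0xC0, 0x48, 0xF7, 0xD8}
	fmt.Println(findSignposts(blob)) // [3]
}
```

A match only tells you where to start looking; the same four bytes can also turn up by coincidence inside metadata, so a marker with a long, unusual immediate is less ambiguous.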

Or you could use gdb's built-in disassembler: add an instruction that will segfault, such as a load from a bogus absolute address (movl 0, AX, a null-pointer dereference) or a load through a register holding a non-pointer value (e.g. movl (AX), AX). Then you'll have an instruction-pointer value for the instructions in memory, and can disassemble from some point before that. (The function start will probably be 16-byte aligned.)

Specific instructions.

MOVBLZX AL, AX reads AL, so the source is definitely an 8-bit operand. The size of the AX destination is given by the L part of the mnemonic, meaning long (32 bit), as in GAS AT&T syntax. (The gas mnemonic for that form of movzx is movzbl %al, %eax.) See "What does cltq do in assembly?" for a table of cdq / cdqe and the AT&T equivalents, and the AT&T / Intel mnemonics for the equivalent MOVSX instruction.

The NASM instruction you want is movzx eax, al. Using rax as the destination would be a waste of a REX prefix, because writing a 32-bit register already zeroes the upper half of the 64-bit register. Using ax as the destination would be a mistake: it wouldn't zero-extend into the full register, and would leave whatever garbage was in the upper bits. Go asm syntax for x86 is very confusing when you're not used to it, because AX can mean AX, EAX, or RAX depending on the operand size.
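To make the difference between the eax and ax destinations concrete, here is a small Go model of the two register-write rules on x86-64: a 32-bit write clears the upper 32 bits of the full register, while a 16-bit write leaves bits 16..63 untouched. The helper names writeEAX and writeAX are made up for this sketch; they model the architectural rule only, not any real API.

```go
package main

import "fmt"

// writeEAX models a write to a 32-bit destination register on x86-64:
// the upper 32 bits of the full 64-bit register are cleared, so the old
// value of rax plays no part in the result.
func writeEAX(rax uint64, v uint32) uint64 {
	return uint64(v)
}

// writeAX models a write to a 16-bit destination register:
// bits 16..63 of the full register keep their previous contents.
func writeAX(rax uint64, v uint16) uint64 {
	return rax&^0xffff | uint64(v)
}

func main() {
	stale := uint64(0xdeadbeefcafe0001) // leftover register contents; low byte holds choice = 1
	al := uint8(stale)

	fmt.Printf("%#x\n", writeEAX(stale, uint32(al))) // 0x1
	fmt.Printf("%#x\n", writeAX(stale, uint16(al)))  // 0xdeadbeefcafe0001
}
```

Negating the second result would not give an all-ones mask, which is why the 16-bit destination would break the code that follows.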

Obviously mov rax, al isn't a possibility: Like most instructions, mov requires both its operands to be the same size. movzx is one of the rare exceptions.

MOVB choice+16(FP), AL is a byte load into AL, not an immediate move. choice+16 is an offset from FP. This syntax is basically the same as AT&T addressing modes, with FP as a register and choice as an assemble-time constant.

FP is a pseudo-register name. It's pretty clear that it should simply be loading the low byte of the 3rd arg-passing slot, because choice is the name of a function arg. (In Go asm, choice is just syntactic sugar, or a constant defined as zero.)

Before a call instruction, rsp points at the first stack arg, so that + 16 is the 3rd arg. It appears that FP is that base address (and might actually be rsp+8 or something). After a call (which pushes an 8 byte return address), the 3rd stack arg is at rsp + 24. After more pushes, the offset will be even larger, so adjust as necessary to reach the right location.
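The offset arithmetic can be sketched as a tiny Go helper. It assumes, as described above, that the stack args sit at rsp+0, rsp+8, ... just before the call, that the call pushes an 8-byte return address, and that each push in the callee moves rsp down by another 8 bytes; argOffset is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

const slot = 8 // bytes per stack slot on x86-64

// argOffset returns the displacement from rsp of stack argument `index`
// (0-based) as seen inside the callee after it has executed `pushes`
// push instructions.
// ASSUMPTION: args are laid out at rsp+0, rsp+8, ... just before the call.
func argOffset(index, pushes int) int {
	returnAddr := slot // pushed by the call instruction
	return index*slot + returnAddr + pushes*slot
}

func main() {
	fmt.Println(argOffset(2, 0)) // 24: third arg right after the call
	fmt.Println(argOffset(2, 3)) // 48: third arg after three pushes
}
```

With the three pushes (r12, r13, r14) from the question's cswap_asm, a stack-passed choice would therefore be at [rsp + 48] rather than [rsp + 16].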

If you're porting this function to be called with a standard calling convention, the 3 integer args will be passed in registers, with no stack args. Which 3 registers depends on whether you're building for Windows vs. non-Windows. (See Agner Fog's calling conventions doc: http://agner.org/optimize/)

BTW, a byte load into AL and then movzx eax, al is just dumb. Much more efficient on all modern CPUs to do it in one step with

movzx  eax, byte [rsp + 24]      ; or rbp+32 if you made a stack frame.

I hope the source in the question is from un-optimized Go compiler output? Or the assembler itself makes such optimizations?

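douyou8266 Oh, you didn't say you wanted to call the NASM function from something other than Go. In hindsight that makes sense, but yes, of course you'd have to port it to a different calling convention rather than just translating all the instructions. You're right that you don't need to load the args from memory at all.
dongsi7759 Thanks. Updated the disassembly suggestions with things that don't need to understand the object-file format.
dongtidai6519 movzx eax, byte [rsp + 16] assembles fine with NASM and YASM. If you get an error from "something like" that, then whatever you tried wasn't close enough, or you have a %define macro that's breaking something. Maybe you %defined byte as the empty string? But that gives "operation size not specified", not a mismatch.
dqhnp44220 Okay, I found the solution and put it in the edited question. Your answer was helpful, though. Thanks.
duanji9311 Also, if I do something like movzx eax, byte [rsp + 16] (same with rbp), I still get the error: operand size mismatch for movzx, exit code: 1. But if I split them, like mov al, byte [rsp + 16] and then movzx eax, al, it compiles, but the result is still wrong; I guess I'm not getting the value of the third parameter, the u8 variable.
douqingji3026 I guess @RossRidge is right. I did run go tool asm test.s, which generated test.o for me. But when I run objconv -fnasm test.o, I get an error message: Error 2018: Unknown type 0x6F206F67 for file: test.o. Also, if I run objdump -drwC -Mintel test.o, I get an error too: objdump: test.o: File format not recognized.
doq91130 I'm not sure objdump will work, as Go apparently has its own object-file format. Its assembly language isn't really assembly either; it's more like an internal representation, such as LLVM's IL. The FP register is a pseudo-register that doesn't correspond to any real register. The operand choice+16(FP) just means the third argument; the choice part is ignored. There's some basic documentation on Go's "assembly" language on the website: golang.org/doc/asm
du27271 What NASM source line gives you the movzx error? movzx eax, byte [valid_mem_address] should assemble fine.
dtmtu02882 I took another look, and I think FP is a register name, either RSP or RBP. The right offset to reach choice depends on how many pushes you've done.
doushi2845 Also, I get an error: operand size mismatch for "movzx"
dpdbu24262 choice is actually the name of the parameter in the Go function.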


I think you can translate these as just

mov rbx, [reg_p1]
mov rcx, [reg_p2]

Unless I'm missing some subtlety, the offsets which are zero can just be ignored. The *8 isn't a size hint since that's already in the instruction.

The rest of your code looks wrong though. The MOVB choice+16(FP), AL in the original is supposed to be fetching the choice argument into AL, but you're setting AL to a constant 16, and the code for loading the other arguments seems to be completely missing, as is the code for all of the arguments in the other function.

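douying7289 You can ignore the (0*8) part; I trimmed the code, and that part repeats from (1*8) down to (11*8). The problem is that I don't know how to translate these three lines: MOVB choice+16(FP), AL; MOVBLZX AL, AX; NEGQ AX. In the Add function, my version works just fine by pushing a few r registers at the start.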