I came across the following Go code:
    type Element uint64

    //go:noescape
    func CSwap(x, y *Element, choice uint8)

    //go:noescape
    func Add(z, x, y *Element)
The CSwap and Add functions are implemented in assembly, and look like the following:
    TEXT ·CSwap(SB), NOSPLIT, $0-17
        MOVQ    x+0(FP), REG_P1
        MOVQ    y+8(FP), REG_P2
        MOVB    choice+16(FP), AL   // AL = 0 or 1
        MOVBLZX AL, AX              // AX = 0 or 1
        NEGQ    AX                  // RAX = 0x00..00 or 0xff..ff
        MOVQ    (0*8)(REG_P1), BX
        MOVQ    (0*8)(REG_P2), CX
        // Rest removed for brevity

    TEXT ·Add(SB), NOSPLIT, $0-24
        MOVQ    z+0(FP), REG_P3
        MOVQ    x+8(FP), REG_P1
        MOVQ    y+16(FP), REG_P2
        MOVQ    (REG_P1), R8
        MOVQ    (8)(REG_P1), R9
        MOVQ    (16)(REG_P1), R10
        MOVQ    (24)(REG_P1), R11
        // Rest removed for brevity
What I am trying to do is translate this assembly, which is in Go assembler syntax, into a syntax that is more familiar to me (I think mine is closer to NASM). I didn't have much trouble with the Add function and translated it correctly (according to test results). It looks like this in my case:
    .text
    .global add_asm
    add_asm:
        push r12
        push r13
        push r14
        push r15
        mov r8, [reg_p1]
        mov r9, [reg_p1+8]
        mov r10, [reg_p1+16]
        mov r11, [reg_p1+24]
        // Rest removed for brevity
But I have a problem translating the CSwap function. I have something like this:
    .text
    .global cswap_asm
    cswap_asm:
        push r12
        push r13
        push r14
        mov al, 16
        mov rax, al
        neg rax
        mov rbx, [reg_p1+(0*8)]
        mov rcx, [reg_p2+(0*8)]
But this doesn't seem to be quite correct, as I get an error when compiling it. Any ideas how to translate the above CSwap assembly to something like NASM?
Okay, after the two answers below, and some testing and digging, I found out that the code uses the following three registers for parameter passing:
    #define reg_p1 rdi
    #define reg_p2 rsi
    #define reg_p3 rdx
So rdx holds the value of the choice parameter, and all I had to do was use this:
    movzx rax, dl // Get the lower 8 bits of rdx (reg_p3)
    neg rax
Using byte [reg_3] was giving an error, but using dl seems to work fine for me.
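For reference, the mask that the movzx/neg pair builds can be sketched in plain Go. This is only an illustration of the technique: the rest of the original routine is elided in the question, so the XOR-based swap body and the names maskFromChoice and cswapWord are assumptions, not the library's actual code.

```go
package main

import "fmt"

// maskFromChoice mirrors the MOVBLZX/NEGQ pair: it zero-extends choice
// (expected to be 0 or 1) and negates it, giving 0x00..00 or 0xff..ff.
func maskFromChoice(choice uint8) uint64 {
	return -uint64(choice)
}

// cswapWord swaps *x and *y when choice is 1 and leaves them unchanged
// when choice is 0, without branching on choice.
// Hypothetical helper: the real CSwap body is elided in the question.
func cswapWord(x, y *uint64, choice uint8) {
	t := maskFromChoice(choice) & (*x ^ *y)
	*x ^= t
	*y ^= t
}

func main() {
	a, b := uint64(1), uint64(2)
	cswapWord(&a, &b, 1)
	fmt.Println(a, b) // values after a swap with choice = 1
	cswapWord(&a, &b, 0)
	fmt.Println(a, b) // values after a call with choice = 0
	fmt.Printf("%#x %#x\n", maskFromChoice(0), maskFromChoice(1))
}
```

The mask is all zeros or all ones, so ANDing it with x XOR y either cancels the swap or applies it in full.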
Basic docs about Go's asm: https://golang.org/doc/asm. It's not totally equivalent to NASM or AT&T syntax:
FP is a pseudo-register name for whichever register it decides to use as the frame pointer (typically RSP or RBP). Go asm also seems to omit function prologue (and probably epilogue) instructions. As @RossRidge comments, it's a bit more like an internal representation such as LLVM IR than true asm.
Go also has its own object-file format, so I'm not sure you can make Go-compatible object files with NASM.
If you want to call this function from something other than Go, you'll also need to port the code to a different calling convention. Go appears to be using a stack-args calling convention even for x86-64, unlike the normal x86-64 System V ABI or the x86-64 Windows calling convention. (Or maybe those instructions that mov function args into REG_P1 and so on disappear when Go builds this source for a register-arg calling convention?)
(This is why you had to use movzx eax, dl instead of loading from the stack at all.)
BTW, rewriting this code in C instead of NASM would probably make even more sense if you want to use it with C. Small functions are best inlined and optimized away by the compiler.
It would be a good idea to check your translation, or get a starting point, by assembling with the Go assembler and using a disassembler. objdump -drwC -Mintel or Agner Fog's objconv disassembler would be good, but they don't understand Go's object-file format. If Go has a tool to extract the actual machine code or get it in an ELF object file, do that.
If not, you could use ndisasm -b 64 (which treats input files as flat binaries, disassembling all the bytes as if they were instructions). You can specify an offset/length if you can find out where the function starts. x86 instructions are variable length, and disassembly will likely be "out of sync" at the start of the function. You might want to add a bunch of single-byte NOP instructions (a kind of NOP sled) for the disassembler, so that even if it decodes some of the 0x90 bytes as part of an immediate or disp32 of a long instruction that was not really part of the function, it will be back in sync before the real code starts. (But the function prologue will still be messed up.)
You might add some "signpost" instructions to your Go asm functions to make it easy to find the right place in the mess of crazy asm from disassembling metadata as instructions, e.g. put a pmuludq xmm0, xmm0 in there somewhere, or some other instruction with a unique mnemonic that you can search for and which the Go code doesn't include. Or an instruction with an immediate that will stand out, like addq $0x1234567, SP. (An instruction that will crash, so you don't forget to take it out again, is good here.)
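Locating such a signpost in a flat dump can also be done without a disassembler. The following is a minimal sketch that searches a byte slice for the little-endian encoding of a 32-bit immediate; findImm32 and the sample bytes are made up for illustration, and reading the real object file into the slice is left out.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// findImm32 returns the offset of the first little-endian occurrence of
// imm in code, or -1 if it is not present.
func findImm32(code []byte, imm uint32) int {
	var pat [4]byte
	binary.LittleEndian.PutUint32(pat[:], imm)
	return bytes.Index(code, pat[:])
}

func main() {
	// Made-up bytes standing in for a flat dump of the object file;
	// they contain the immediate 0x1234567 stored as 67 45 23 01.
	dump := []byte{0x90, 0x90, 0x48, 0x81, 0xc4, 0x67, 0x45, 0x23, 0x01, 0xc3}
	fmt.Println(findImm32(dump, 0x1234567))
}
```

The returned offset points at the immediate itself, not at the start of the instruction, so it only gives a rough place to start disassembling from.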
Or you could use gdb's built-in disassembler: add an instruction that will segfault, like a load from a bogus absolute address (movl 0, AX, a null-pointer deref) or from a register holding a non-pointer value (e.g. movl (AX), AX). Then you'll have an instruction-pointer value for the instructions in memory, and can disassemble from some point behind that. (Probably the function start will be 16-byte aligned.)
MOVBLZX AL, AX reads AL, so that's definitely an 8-bit source operand. The size for AX is given by the L part of the mnemonic, meaning long for 32 bit, like in GAS AT&T syntax. (The gas mnemonic for that form of movzx is movzbl %al, %eax.) See What does cltq do in assembly? for a table of cdq / cdqe and the AT&T equivalent, and the AT&T / Intel mnemonic for the equivalent MOVSX instruction.
The NASM instruction you want is movzx eax, al. Using rax as the destination would be a waste of a REX prefix. Using ax as the destination would be a mistake: it wouldn't zero-extend into the full register, and would leave whatever garbage was in the upper bits. Go asm syntax for x86 is very confusing when you're not used to it, because AX can mean AX, EAX, or RAX depending on the operand size.
mov rax, al isn't a possibility: like most instructions, mov requires both its operands to be the same size. movzx is one of the rare exceptions.
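The difference between the three destination sizes can be modelled with plain integer arithmetic. This is a sketch of the register-write rules only (partial writes keep the upper bits, 32-bit writes zero the upper half); the function names are invented for the illustration.

```go
package main

import "fmt"

// writeAL models a write to the 8-bit register AL: only the low byte of
// RAX changes, the upper 56 bits keep their old contents.
func writeAL(rax uint64, v uint8) uint64 {
	return rax&^0xff | uint64(v)
}

// movzxAXfromAL models `movzx ax, al`: the low 16 bits are replaced,
// the upper 48 bits keep their old (possibly garbage) contents.
func movzxAXfromAL(rax uint64) uint64 {
	return rax&^0xffff | uint64(uint8(rax))
}

// movzxEAXfromAL models `movzx eax, al`: writing a 32-bit register
// zero-extends into the full 64-bit register.
func movzxEAXfromAL(rax uint64) uint64 {
	return uint64(uint8(rax))
}

func main() {
	rax := writeAL(0xdeadbeefcafebabe, 1) // stale upper bits plus AL = 1
	fmt.Printf("%#016x\n", movzxAXfromAL(rax))
	fmt.Printf("%#016x\n", movzxEAXfromAL(rax))
	fmt.Printf("%#016x\n", -movzxEAXfromAL(rax)) // what neg rax then yields
}
```

Only the 32-bit destination gives a clean 0 or 1 in the full register, which is what the following neg needs to produce an all-zeros or all-ones mask.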
MOVB choice+16(FP), AL is a byte load into AL, not an immediate move. choice+16 is an offset from FP. This syntax is basically the same as AT&T addressing modes, with FP as a register and choice as an assemble-time constant.
FP is a pseudo-register name. It's pretty clear that it should simply be loading the low byte of the 3rd arg-passing slot, because choice is the name of a function arg. (In Go asm, choice is just syntactic sugar, or a constant defined as zero.)
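The FP offsets and the argument sizes in the TEXT directives ($0-17 and $0-24, i.e. frame size and argument size) follow from laying the arguments out one after another. A small sketch of that arithmetic, assuming 8-byte pointers and that each argument is aligned to its own size; argOffsets is a name made up here.

```go
package main

import "fmt"

// argOffsets lays out arguments the way struct fields are laid out:
// each argument is aligned to its own size (sizes are powers of two).
// It returns each argument's offset and the total argument size in bytes.
func argOffsets(sizes []int) (offsets []int, total int) {
	for _, sz := range sizes {
		if rem := total % sz; rem != 0 {
			total += sz - rem // pad up to the argument's alignment
		}
		offsets = append(offsets, total)
		total += sz
	}
	return offsets, total
}

func main() {
	// CSwap(x, y *Element, choice uint8): two 8-byte pointers and one byte.
	offs, total := argOffsets([]int{8, 8, 1})
	fmt.Println(offs, total)
	// Add(z, x, y *Element): three 8-byte pointers.
	offs, total = argOffsets([]int{8, 8, 8})
	fmt.Println(offs, total)
}
```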
Before the call, rsp points at the first stack arg, so that + 16 is the 3rd arg. It appears that FP is that base address (and might actually be rsp+8 or something). After a call (which pushes an 8 byte return address), the 3rd stack arg is at rsp + 24. After more pushes, the offset will be even larger, so adjust as necessary to reach the right location.
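The offset bookkeeping above can be written down as a one-line formula. This sketch assumes, as the paragraph does, that FP names the address of the first stack argument and that every push is 8 bytes; rspOffset is a hypothetical helper, not part of any toolchain.

```go
package main

import "fmt"

const wordSize = 8 // bytes per stack slot on x86-64

// rspOffset returns the displacement from RSP at which an argument lives
// inside the callee, given its offset from Go's FP pseudo-register.
// pushes is the number of 8-byte registers the callee has pushed so far.
func rspOffset(fpOffset, pushes int) int {
	returnAddr := wordSize // pushed by the call instruction
	return fpOffset + returnAddr + pushes*wordSize
}

func main() {
	fmt.Println(rspOffset(16, 0)) // choice right after the call
	fmt.Println(rspOffset(16, 3)) // choice after three register pushes
}
```

With no pushes this gives the rsp + 24 mentioned above; each saved register adds another 8 bytes.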
If you're porting this function to be called with a standard calling convention, the 3 integer args will be passed in registers, with no stack args. Which 3 registers depends on whether you're building for Windows vs. non-Windows. (See Agner Fog's calling conventions doc: http://agner.org/optimize/)
BTW, a byte load into AL and then movzx eax, al is just dumb. Much more efficient on all modern CPUs to do it in one step with

    movzx eax, byte [rsp + 24]   ; or rbp+32 if you made a stack frame.
I hope the source in the question is from un-optimized Go compiler output? Or the assembler itself makes such optimizations?
I think you can translate these as just

    mov rbx, [reg_p1]
    mov rcx, [reg_p2]
Unless I'm missing some subtlety, the offsets which are zero can just be ignored. The *8 isn't a size hint since that's already in the instruction.
The rest of your code looks wrong though. The MOVB choice+16(FP), AL in the original is supposed to be fetching the choice argument into AL, but you're setting AL to a constant 16, and the code for loading the other arguments seems to be completely missing, as is the code for all of the arguments in the other function.