How do segmented stacks work?

How do segmented stacks work? This question also applies to Boost.Coroutine, so I am using the C++ tag here as well. My main doubt comes from this article. It looks like they keep some space at the bottom of the stack and detect when it has been touched by registering some sort of signal handler on the memory allocated there (perhaps via mmap and mprotect?). When they detect that they have run out of space, they allocate more memory and continue from there. I have three questions about this:

1. Isn't this construct a user-space thing? How do they control where the new stack is allocated, and how do the instructions the program is compiled down to become aware of that?

   A push instruction basically just adjusts the stack pointer and stores a register's value on the stack. How, then, can the push instruction know where the new stack starts, and how can the pop know when it has to move the stack pointer back to the old stack?

2. They also say:

   "After we've got a new stack segment, we restart the goroutine by retrying the function that caused us to run out of stack"

   What does this mean? Do they restart the entire goroutine? Won't this possibly cause non-deterministic behavior?

3. How do they detect that the program has overrun the stack? If they keep a canary-ish memory area at the bottom, what happens when the user program creates an array big enough to overflow it? Will that not cause a stack overflow, and isn't it a potential security vulnerability?

If the implementations are different for Go and Boost, I would be happy to know how either of them deals with this situation.

1 answer

dqqfuth6736 2017-09-16 06:47
I'll give you a quick sketch of one possible implementation.

First, assume most stack frames are smaller than some size. For ones that are larger, we can use a longer instruction sequence at entry to make sure there is enough stack space. Let's assume we're on an architecture that has 4k pages and we're choosing 4k - 1 as the maximum stack frame size handled by the fast path.
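To make the fast-path/slow-path split concrete, here is a rough C++ sketch that decides which entry sequence a frame needs and, for large frames, lists the offsets a longer prolog would have to touch so that no page is skipped. The names `kPageSize`, `kMaxFastFrame`, `fits_fast_path`, and `probe_offsets` are made up for this illustration; a real compiler would emit the equivalent as machine instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed values from the text: 4k pages, fast path handles frames up to 4k - 1.
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kMaxFastFrame = kPageSize - 1;

// A frame smaller than a page cannot jump over a one-page guard,
// so a single write at its lowest address is enough.
bool fits_fast_path(std::size_t frame_size) {
    return frame_size <= kMaxFastFrame;
}

// For a larger frame, return the distances below the caller's stack pointer
// that must be touched: one per page, then the lowest address of the frame.
std::vector<std::size_t> probe_offsets(std::size_t frame_size) {
    assert(frame_size > 0);
    std::vector<std::size_t> offsets;
    for (std::size_t off = kPageSize; off < frame_size; off += kPageSize) {
        offsets.push_back(off);
    }
    offsets.push_back(frame_size);  // lowest address of the new frame
    return offsets;
}
```

Because consecutive probes are never more than one page apart, a frame of any size is guaranteed to hit the guard page if it crosses it, rather than stepping over it.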

The stack is allocated with a single guard page at the bottom, that is, a page that is not mapped for write. At function entry, the stack pointer is decremented by the stack frame size, which is less than the size of a page, and then the program arranges to write a value at the lowest address in the newly allocated stack frame. If the end of the stack has been reached, this write will cause a processor exception and ultimately be turned into some sort of upcall from the OS to the user program (e.g. a signal in UNIX-family OSes).
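As a minimal sketch of the allocation step on a POSIX system, the following C++ maps a block of pages and makes the lowest one inaccessible. `GuardedStack`, `allocate_guarded_stack`, and `free_guarded_stack` are hypothetical names for this example, and using `PROT_NONE` (no access at all) rather than merely read-only is my own assumption; either would make the prolog's write fault.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical descriptor for one stack block.
struct GuardedStack {
    char* base = nullptr;        // lowest mapped address (start of the guard page)
    std::size_t guard_size = 0;  // bytes that fault on access
    std::size_t total_size = 0;  // guard + usable bytes

    char* lowest_usable() const { return base + guard_size; }
    char* top() const { return base + total_size; }  // initial stack pointer
};

// Maps `usable_pages` writable pages with one inaccessible page below them.
// Returns a descriptor with base == nullptr on failure.
GuardedStack allocate_guarded_stack(std::size_t usable_pages) {
    assert(usable_pages > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t total = (usable_pages + 1) * page;
    GuardedStack s;
    void* mem = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return s;
    // Stacks grow downward, so the guard is the lowest page of the mapping.
    if (mprotect(mem, page, PROT_NONE) != 0) {
        munmap(mem, total);
        return s;
    }
    s.base = static_cast<char*>(mem);
    s.guard_size = page;
    s.total_size = total;
    return s;
}

void free_guarded_stack(GuardedStack& s) {
    if (s.base != nullptr) munmap(s.base, s.total_size);
    s = GuardedStack{};
}
```

Any address from `lowest_usable()` up to, but not including, `top()` can be written normally; a write below `lowest_usable()` lands in the guard page and raises the fault described above.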

The signal handler (or equivalent) has to be able to determine this is a stack extension fault from the address of the instruction that faulted and the address it was writing to. This is determinable as the instruction is in the prolog of a function and the address being written to is in the guard page of the stack for the current thread. The instruction being in the prolog can be recognized by requiring a very specific pattern of instructions at the start of functions, or possibly by maintaining metadata about functions. (Possibly using traceback tables.)
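The classification the handler has to make can be sketched as a pure function. Everything here (`Range`, `FaultContext`, `is_stack_extension_fault`) is a hypothetical model, and it assumes the runtime keeps a table of prolog address ranges. In a real POSIX handler installed with `SA_SIGINFO`, the faulting data address would come from `siginfo_t::si_addr`, while the faulting instruction address has to be read from the platform-specific `ucontext_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [lo, hi).
struct Range {
    std::uintptr_t lo;
    std::uintptr_t hi;
    bool contains(std::uintptr_t a) const { return a >= lo && a < hi; }
};

// Hypothetical metadata the handler would consult.
struct FaultContext {
    Range guard_page;              // guard page of the current thread's stack
    std::vector<Range> prologues;  // code ranges known to be function prologs
};

// True only if the faulting instruction is in a known prolog AND the address
// it wrote to lies in the guard page; anything else is an ordinary crash.
bool is_stack_extension_fault(const FaultContext& ctx,
                              std::uintptr_t fault_pc,
                              std::uintptr_t fault_addr) {
    assert(ctx.guard_page.lo < ctx.guard_page.hi);
    if (!ctx.guard_page.contains(fault_addr)) return false;
    for (const Range& r : ctx.prologues) {
        if (r.contains(fault_pc)) return true;
    }
    return false;
}
```

Requiring both conditions matters: a stray write into the guard page from the middle of a function body is a genuine bug and must not be treated as a request for more stack.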

At this point the handler can allocate a new stack block, set the stack pointer to the top of the block, arrange for the block to be unchained later, and then call the function that faulted again. This second call is safe because the fault is in the function prolog the compiler generated, and no side effects are allowed before validating that there is enough stack space. (The code may also need to fix up the return address for architectures that push it onto the stack automatically. If the return address is in a register, it just needs to be in the same register when the second call is made.)

Likely the easiest way to handle unchaining is to push a small stack frame onto the new extension block for a routine that, when returned to, unchains the new stack block and frees the allocated memory. It then returns the processor registers to the state they were in when the call that caused the stack to be extended was made.
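The bookkeeping behind chaining and unchaining can be modelled as a linked list of blocks. This is only a simulation under my own assumptions: `StackSegment`, `SegmentedStack`, `extend`, and `unchain` are invented names, the stack pointer is a plain integer, and no registers or return addresses are actually manipulated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical bookkeeping for one block in a chain of stack segments.
struct StackSegment {
    std::unique_ptr<char[]> memory;      // storage for this block
    std::size_t size = 0;
    std::uintptr_t saved_sp = 0;         // stack pointer to restore on unchain
    std::unique_ptr<StackSegment> prev;  // the block we extended from
};

struct SegmentedStack {
    std::unique_ptr<StackSegment> current;
    std::uintptr_t sp = 0;  // simulated stack pointer
    std::size_t depth = 0;  // number of chained blocks
};

// Chain a new block after the current one and move sp to the new block's top.
void extend(SegmentedStack& st, std::size_t size) {
    std::unique_ptr<StackSegment> seg(new StackSegment());
    seg->memory.reset(new char[size]);
    seg->size = size;
    seg->saved_sp = st.sp;  // remember where the old block left off
    seg->prev = std::move(st.current);
    st.sp = reinterpret_cast<std::uintptr_t>(seg->memory.get()) + size;
    st.current = std::move(seg);
    ++st.depth;
}

// What the small "unchain" frame would do when the extending call returns.
void unchain(SegmentedStack& st) {
    assert(st.current != nullptr);
    std::unique_ptr<StackSegment> done = std::move(st.current);
    st.sp = done->saved_sp;  // back to the old block
    st.current = std::move(done->prev);
    --st.depth;
}  // `done` goes out of scope here, freeing the block
```

Ordinary push and pop instructions never need to know about the chain: they only ever see the current value of the stack pointer, and the switch between blocks happens at call and return boundaries.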

The advantage of this design is that the function entry sequence is only a few instructions and is very fast in the non-extending case. The disadvantage is that when the stack does need to be extended, the processor incurs an exception, which may cost far more than a function call.

Go doesn't actually use a guard page, if I understand correctly. Rather, the function prolog explicitly checks the stack limit, and if the new stack frame won't fit, it calls a function to extend the stack.
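An explicit limit check of this kind boils down to one comparison at function entry. The sketch below models it in C++ with invented names (`StackBounds`, `frame_fits`, `allocate_frame`); it is not Go's actual prolog, just the arithmetic such a prolog has to perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-thread (or per-goroutine) record.
struct StackBounds {
    std::uintptr_t limit;  // lowest address a new frame may reach
};

// Models the compare emitted at function entry. The stack grows down,
// so the new frame would end at sp - frame_size.
bool frame_fits(const StackBounds& b, std::uintptr_t sp, std::size_t frame_size) {
    // Written as a subtraction from sp to avoid unsigned wraparound.
    return sp >= b.limit && sp - b.limit >= frame_size;
}

// Stack pointer after the frame is allocated; only valid once the check passed.
// If the check fails, the prolog would call the runtime's extension routine instead.
std::uintptr_t allocate_frame(const StackBounds& b, std::uintptr_t sp,
                              std::size_t frame_size) {
    assert(frame_fits(b, sp, frame_size));
    return sp - frame_size;
}
```

Compared with the guard-page design, every call pays for the comparison, but running out of space becomes an ordinary function call rather than a processor exception.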

Go 1.3 changed its design to not use a linked list of stack blocks. This avoids the cost of repeatedly extending and unchaining when a certain calling pattern crosses the extension boundary in both directions many times. They start with a small stack and use a similar mechanism to detect the need for extension, but when the stack does need to grow, the entire stack is moved to a larger block. This removes the need for unchaining entirely.
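The "move the whole stack" approach can be illustrated with a small simulation. `MovableStack`, `grow`, and `push_frame` are hypothetical names, doubling the size is an assumption of this sketch, and indices into a byte vector stand in for real addresses. The value returned by `grow` is the amount by which every position in the stack shifts, which is why a real runtime also has to adjust any pointers that refer into the old stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical model of a contiguous, downward-growing stack.
struct MovableStack {
    std::vector<unsigned char> block;
    std::size_t sp;  // index of the lowest used byte; block.size() when empty

    explicit MovableStack(std::size_t size) : block(size), sp(size) {
        assert(size > 0);
    }
    std::size_t used() const { return block.size() - sp; }
};

// Move the stack to a block twice as large (doubling is an assumption here).
// Returns how far every index into the stack has shifted.
std::size_t grow(MovableStack& s) {
    const std::size_t old_size = s.block.size();
    const std::size_t used = s.used();
    std::vector<unsigned char> bigger(old_size * 2);
    // Live frames sit at the high end, so copy them to the high end of the new block.
    std::memcpy(bigger.data() + bigger.size() - used, s.block.data() + s.sp, used);
    const std::size_t delta = bigger.size() - old_size;
    s.block.swap(bigger);
    s.sp += delta;
    return delta;
}

// Reserve a frame, growing first if it does not fit. Returns the frame's index.
std::size_t push_frame(MovableStack& s, std::size_t frame_size) {
    while (s.sp < frame_size) grow(s);  // fewer than frame_size bytes free below sp
    s.sp -= frame_size;
    return s.sp;
}
```

After a grow there is only one block again, so returning from the function that triggered it needs no special handling.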

There are quite a few details glossed over here. (E.g. one may not be able to do the stack extension in the signal handler itself. Rather the handler can arrange to have the thread suspended and hand it off to a manager thread. One likely has to use a dedicated signal stack to handle the signal as well.)

Another common pattern with this sort of thing is the runtime requiring there to be a certain amount of valid stack space below the current stack frame for either something like a signal handler or for calling special routines in the runtime. Go works this way and the stack limit test guarantees a certain fixed amount of stack space is available below the current frame. One can e.g. call plain C functions on the stack so long as one guarantees they do not consume more than the fixed stack reserve amount. (One can use this to call C library routines in theory, though most of these have no formal specification of how much stack they might use.)

Dynamic allocation in the stack frame, such as alloca or stack-allocated variable-length arrays, adds some complexity to the implementation. If the routine can compute the entire final size of the frame in the prolog, then it is fairly straightforward. Any increase in the frame size while the routine is running likely has to be modeled as a new call, though with Go's new architecture that allows moving the stack, it may be possible to make the alloca point a place where all of the routine's state is in a form that allows a stack move to happen there.
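For the straightforward case, where the dynamic part is known at entry, the prolog only has to compute the total before doing its usual check. A hedged sketch of that computation follows; `total_frame_size` and its parameters are invented for illustration, and the overflow checks are my own addition, included because an attacker-controlled element count is exactly the large-array scenario the question worries about.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Computes fixed_size + count * elem_size, rounded up to `align` (a power of two).
// Returns false if the arithmetic would overflow, in which case no frame size is valid.
bool total_frame_size(std::size_t fixed_size, std::size_t count,
                      std::size_t elem_size, std::size_t align,
                      std::size_t& out) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::size_t max = std::numeric_limits<std::size_t>::max();
    if (elem_size != 0 && count > max / elem_size) return false;
    const std::size_t dynamic = count * elem_size;
    if (dynamic > max - fixed_size) return false;
    const std::size_t raw = fixed_size + dynamic;
    if (raw > max - (align - 1)) return false;
    out = (raw + align - 1) & ~(align - 1);
    return true;
}
```

The resulting size then goes through the same fast-path or large-frame entry sequence as any other frame, so a huge array is caught by the stack check rather than silently jumping past it.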

