How can I profile C++ code running on Linux?

I have a C++ application, running on Linux, which I'm in the process of optimizing. How can I pinpoint which areas of my code are running slowly?

Reposted from: https://stackoverflow.com/questions/375913/how-can-i-profile-c-code-running-on-linux

YaoRaoLov 2009-06-12 20:58
If your goal is to use a profiler, use one of the suggested ones.

However, if you're in a hurry and you can manually interrupt your program under the debugger while it's being subjectively slow, there's a simple way to find performance problems.

Just halt it several times, and each time look at the call stack. If there is some code that is wasting some percentage of the time, 20% or 50% or whatever, that is the probability that you will catch it in the act on each sample. So that is roughly the percentage of samples on which you will see it. There is no educated guesswork required. If you do have a guess as to what the problem is, this will prove or disprove it.
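
The claim that the cost of a piece of code equals the chance of catching it on each halt can be checked with a small simulation. This is only a sketch: `count_catches`, its parameters, and the seed are hypothetical names introduced here, and each halt is assumed to be independent of the others.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Simulates `samples` random halts of a program in which the wasteful code
// is on the call stack a fraction `cost` of the time, and returns how many
// halts caught it. The seed only makes runs repeatable (an assumption of
// this sketch, not part of the technique).
std::size_t count_catches(std::size_t samples, double cost, unsigned seed) {
    assert(cost >= 0.0 && cost <= 1.0);
    std::mt19937 rng(seed);
    std::bernoulli_distribution caught(cost);  // true with probability `cost`
    std::size_t hits = 0;
    for (std::size_t i = 0; i < samples; ++i) {
        if (caught(rng)) ++hits;
    }
    return hits;
}
```

With many simulated halts, `count_catches(n, 0.3, seed) / n` should settle near 0.3; with only ten halts it will vary from run to run, which is fine because the goal is to find the problem, not to measure it precisely.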

You may have multiple performance problems of different sizes. If you clean out any one of them, the remaining ones will take a larger percentage, and be easier to spot, on subsequent passes. This magnification effect, when compounded over multiple problems, can lead to truly massive speedup factors.
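
The magnification effect is plain arithmetic, sketched below. The function names and the example shares (50%, 25%, 12.5%) are made up for illustration; shares are assumed to be non-overlapping fractions of the original run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each entry of `shares` is the fraction of the ORIGINAL run time that one
// problem costs. Returns the fraction of the NEW, shorter run time taken by
// each remaining problem after the problem at index `fixed` is removed.
std::vector<double> shares_after_fix(const std::vector<double>& shares,
                                     std::size_t fixed) {
    assert(fixed < shares.size());
    const double remaining = 1.0 - shares[fixed];
    assert(remaining > 0.0);
    std::vector<double> out;
    for (std::size_t i = 0; i < shares.size(); ++i) {
        if (i != fixed) out.push_back(shares[i] / remaining);
    }
    return out;
}

// Overall speedup factor if every listed problem is removed.
double total_speedup(const std::vector<double>& shares) {
    double sum = 0.0;
    for (double s : shares) sum += s;
    assert(sum < 1.0);
    return 1.0 / (1.0 - sum);
}
```

For hypothetical shares of 0.5, 0.25 and 0.125, fixing the first leaves the other two at 0.5 and 0.25 of the new run time, so they are twice as easy to catch on the next pass. Fixing all three gives a speedup of 1 / 0.125 = 8.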

Caveat: Programmers tend to be skeptical of this technique unless they've used it themselves. They will say that profilers give you this information, but that is only true if they sample the entire call stack, and then let you examine a random set of samples. (The summaries are where the insight is lost.) Call graphs don't give you the same information, because

1. they don't summarize at the instruction level, and
2. they give confusing summaries in the presence of recursion.

They will also say it only works on toy programs, when actually it works on any program, and it seems to work better on bigger programs, because they tend to have more problems to find. They will say it sometimes finds things that aren't problems, but that is only true if you see something once. If you see a problem on more than one sample, it is real.

P.S. This can also be done on multi-threaded programs if there is a way to collect call-stack samples of the thread pool at a point in time, as there is in Java.

P.P.S. As a rough generality, the more layers of abstraction you have in your software, the more likely you are to find that they are the cause of performance problems (and the opportunity to get speedup).

Added: It might not be obvious, but the stack sampling technique works equally well in the presence of recursion. The reason is that the time that would be saved by removal of an instruction is approximated by the fraction of samples containing it, regardless of the number of times it may occur within a sample.
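
A minimal sketch of that counting rule, assuming each sample has been written down as a list of frame labels. The `Stack` alias, the function name, and the idea of using strings as labels are hypothetical conveniences, not part of any tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Stack = std::vector<std::string>;  // one sample: frame labels, innermost first

// Fraction of samples whose stack contains `frame` at least once.
// A frame that repeats inside one sample (recursion) still counts that
// sample only once, so recursion does not inflate the estimate.
double fraction_containing(const std::vector<Stack>& samples,
                           const std::string& frame) {
    assert(!samples.empty());
    std::size_t hits = 0;
    for (const Stack& s : samples) {
        if (std::find(s.begin(), s.end(), frame) != s.end()) ++hits;
    }
    return static_cast<double>(hits) / static_cast<double>(samples.size());
}
```

If a hypothetical recursive `walk` appears three times in one sample and once in another, out of four samples in total, the estimate is 2/4, not 4/4.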

Another objection I often hear is: "It will stop someplace random, and it will miss the real problem". This comes from having a prior concept of what the real problem is. A key property of performance problems is that they defy expectations. Sampling tells you something is a problem, and your first reaction is disbelief. That is natural, but you can be sure if it finds a problem it is real, and vice-versa.

ADDED: Let me make a Bayesian explanation of how it works. Suppose there is some instruction I (call or otherwise) which is on the call stack some fraction f of the time (and thus costs that much). For simplicity, suppose we don't know what f is, but assume it is either 0.1, 0.2, 0.3, ... 0.9, 1.0, and the prior probability of each of these possibilities is 0.1, so all of these costs are equally likely a priori.

Then suppose we take just 2 stack samples, and we see instruction I on both samples, designated observation o=2/2. This gives us new estimates of the frequency f of I, according to this:

Prior P(f=x)   x     P(o=2/2|f=x)   P(o=2/2&&f=x)   P(o=2/2&&f >= x)   P(f >= x)
0.1            1     1              0.1             0.1                0.25974026
0.1            0.9   0.81           0.081           0.181              0.47012987
0.1            0.8   0.64           0.064           0.245              0.636363636
0.1            0.7   0.49           0.049           0.294              0.763636364
0.1            0.6   0.36           0.036           0.33               0.857142857
0.1            0.5   0.25           0.025           0.355              0.922077922
0.1            0.4   0.16           0.016           0.371              0.963636364
0.1            0.3   0.09           0.009           0.38               0.987012987
0.1            0.2   0.04           0.004           0.384              0.997402597
0.1            0.1   0.01           0.001           0.385              1

P(o=2/2) = 0.385

The last column says that, for example, the probability that f >= 0.5 is 92%, up from the prior assumption of 60%.

Suppose the prior assumptions are different. Suppose we assume P(f=0.1) is .991 (nearly certain), and all the other possibilities are almost impossible (0.001). In other words, our prior certainty is that I is cheap. Then we get:

Prior P(f=x)   x     P(o=2/2|f=x)   P(o=2/2&&f=x)   P(o=2/2&&f >= x)   P(f >= x)
0.001          1     1              0.001           0.001              0.072727273
0.001          0.9   0.81           0.00081         0.00181            0.131636364
0.001          0.8   0.64           0.00064         0.00245            0.178181818
0.001          0.7   0.49           0.00049         0.00294            0.213818182
0.001          0.6   0.36           0.00036         0.0033             0.24
0.001          0.5   0.25           0.00025         0.00355            0.258181818
0.001          0.4   0.16           0.00016         0.00371            0.269818182
0.001          0.3   0.09           0.00009         0.0038             0.276363636
0.001          0.2   0.04           0.00004         0.00384            0.279272727
0.991          0.1   0.01           0.00991         0.01375            1

P(o=2/2) = 0.01375

Now it says P(f >= 0.5) is 26%, up from the prior assumption of 0.6%. So Bayes allows us to update our estimate of the probable cost of I. If the amount of data is small, it doesn't tell us accurately what the cost is, only that it is big enough to be worth fixing.
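
Both updates can be reproduced with a few lines of code. This is a sketch under the same simplification used above (a discrete set of candidate costs, and I seen on every sample taken); `posterior_at_least` and its parameters are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Posterior probability that the cost f is at least `threshold`, given that
// instruction I was seen on all `hits` of `hits` stack samples.
// costs[i] is a candidate value of f and prior[i] its prior probability.
double posterior_at_least(const std::vector<double>& costs,
                          const std::vector<double>& prior,
                          int hits, double threshold) {
    assert(costs.size() == prior.size() && !costs.empty() && hits >= 0);
    double evidence = 0.0;  // P(o), summed over every candidate cost
    double tail = 0.0;      // P(o && f >= threshold)
    for (std::size_t i = 0; i < costs.size(); ++i) {
        double likelihood = 1.0;  // P(o | f = costs[i]) = costs[i]^hits
        for (int k = 0; k < hits; ++k) likelihood *= costs[i];
        const double joint = prior[i] * likelihood;
        evidence += joint;
        if (costs[i] >= threshold) tail += joint;
    }
    assert(evidence > 0.0);
    return tail / evidence;
}
```

With candidate costs 0.1 through 1.0, two hits, and a threshold of 0.5, the flat prior gives about 0.92 and the skeptical prior (0.991 on f = 0.1, 0.001 elsewhere) gives about 0.26.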

Yet another way to look at it is called the Rule Of Succession. If you flip a coin 2 times, and it comes up heads both times, what does that tell you about the probable weighting of the coin? The respected way to answer is to say that it's a Beta distribution, with average value (number of hits + 1) / (number of tries + 2) = (2+1)/(2+2) = 75%.
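
For reference, the 75% figure can be derived as follows, assuming a uniform prior on f (the standard setting for the Rule of Succession). The symbols s (number of hits) and n (number of samples) are introduced here for the derivation.

```latex
p(f) = 1 \quad (0 \le f \le 1), \qquad
p(f \mid s \text{ hits in } n \text{ samples}) \propto f^{s}(1-f)^{n-s}
\;\Longrightarrow\;
f \mid s, n \sim \mathrm{Beta}(s+1,\; n-s+1),
\qquad
\mathbb{E}[f \mid s, n] = \frac{s+1}{n+2},
\qquad
\mathbb{E}[f \mid s=2,\, n=2] = \frac{3}{4}.
```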

(The key is that we see I more than once. If we only see it once, that doesn't tell us much except that f > 0.)

So, even a very small number of samples can tell us a lot about the cost of instructions that it sees. (And it will see them with a frequency, on average, proportional to their cost. If n samples are taken, and f is the cost, then I will appear on nf+/-sqrt(nf(1-f)) samples. Example, n=10, f=0.3, that is 3+/-1.4 samples.)
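
The nf +/- sqrt(nf(1-f)) figure is the mean and standard deviation of a binomial count, which can be sketched as below. `HitEstimate` and `expected_hits` are hypothetical names, and the samples are assumed to be independent.

```cpp
#include <cassert>
#include <cmath>

// Expected number of samples showing instruction I, and the spread around it.
struct HitEstimate {
    double mean;    // n * f
    double stddev;  // sqrt(n * f * (1 - f))
};

// n = number of stack samples taken, f = fraction of time I is on the stack.
HitEstimate expected_hits(int n, double f) {
    assert(n >= 0 && f >= 0.0 && f <= 1.0);
    return {n * f, std::sqrt(n * f * (1.0 - f))};
}
```

`expected_hits(10, 0.3)` gives a mean of 3 and a standard deviation of about 1.4, matching the example above.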

ADDED, to give an intuitive feel for the difference between measuring and random stack sampling:
There are profilers now that sample the stack, even on wall-clock time, but what comes out is measurements (or hot path, or hot spot, from which a "bottleneck" can easily hide). What they don't show you (and they easily could) is the actual samples themselves. And if your goal is to find the bottleneck, the number of them you need to see is, on average, 2 divided by the fraction of time it takes. So if it takes 30% of time, 2/.3 = 6.7 samples, on average, will show it, and the chance that 20 samples will show it is 99.2%.
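
The two figures above (6.7 samples on average, 99.2% within 20 samples) follow from treating "found" as "seen on at least two samples". A minimal sketch of that arithmetic, with hypothetical function names and assuming independent samples:

```cpp
#include <cassert>
#include <cmath>

// Probability that code costing fraction f of the time appears on at least
// two of n independent stack samples: 1 - P(no hits) - P(exactly one hit).
double chance_seen_twice(int n, double f) {
    assert(n >= 0 && f >= 0.0 && f <= 1.0);
    if (n < 2) return 0.0;
    const double none = std::pow(1.0 - f, n);
    const double once = n * f * std::pow(1.0 - f, n - 1);
    return 1.0 - none - once;
}

// Average number of samples needed until the second sighting
// (mean of a negative binomial distribution with two successes).
double mean_samples_until_second(double f) {
    assert(f > 0.0 && f <= 1.0);
    return 2.0 / f;
}
```

For f = 0.3, `mean_samples_until_second` returns about 6.7 and `chance_seen_twice(20, 0.3)` returns about 0.992.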

Here is an off-the-cuff illustration of the difference between examining measurements and examining stack samples. The bottleneck could be one big blob like this, or numerous small ones, it makes no difference.

Measurement is horizontal; it tells you what fraction of time specific routines take. Sampling is vertical. If there is any way to avoid what the whole program is doing at that moment, and if you see it on a second sample, you've found the bottleneck. That's what makes the difference - seeing the whole reason for the time being spent, not just how much.