# How to efficiently calculate a running standard deviation?

I have an array of lists of numbers, e.g.:

```
[0] (0.01, 0.01, 0.02, 0.04, 0.03)
[1] (0.00, 0.02, 0.02, 0.03, 0.02)
[2] (0.01, 0.02, 0.02, 0.03, 0.02)
     ...
[n] (0.01, 0.00, 0.01, 0.05, 0.03)
```

What I would like to do is efficiently calculate the mean and standard deviation at each index of a list, across all array elements.

To do the mean, I have been looping through the array and summing the value at a given index of a list. At the end, I divide each value in my "averages list" by `n`.

To do the standard deviation, I loop through again, now that I have the mean calculated.

I would like to avoid going through the array twice, once for the mean and then once for the SD (after I have a mean).

Is there an efficient method for calculating both values, only going through the array once? Any code in an interpreted language (e.g. Perl or Python) or pseudocode is fine.


#### 14 answers

• 笑故挽风 2009-08-28 18:24

The answer is to use Welford's algorithm, which is very clearly defined after the "naive methods" in:

It's more numerically stable than either the two-pass or online simple sum of squares collectors suggested in other responses. The stability only really matters when you have lots of values that are close to each other as they lead to what is known as "catastrophic cancellation" in the floating point literature.
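
As a minimal sketch of Welford's update applied to the layout in the question (one running mean and one running sum of squared deviations per list index), something like the following should work. The class name `RunningStats`, its method names, and the `ddof` parameter are illustrative assumptions, and the sample rows are just the four rows visible in the question.

```python
import math


class RunningStats:
    """Per-column running mean and variance using Welford's update."""

    def __init__(self, width):
        self.n = 0                     # number of rows seen so far
        self.mean = [0.0] * width      # running mean of each column
        self.m2 = [0.0] * width        # running sum of squared deviations per column

    def push(self, row):
        """Fold one row into the running statistics."""
        self.n += 1
        for i, x in enumerate(row):
            delta = x - self.mean[i]              # deviation from the old mean
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])  # old delta times new deviation

    def variance(self, ddof=1):
        """Variance per column; ddof=1 divides by n - 1, ddof=0 divides by n."""
        if self.n - ddof <= 0:
            raise ValueError("not enough rows for this ddof")
        return [m2 / (self.n - ddof) for m2 in self.m2]

    def stddev(self, ddof=1):
        return [math.sqrt(v) for v in self.variance(ddof)]


rows = [
    (0.01, 0.01, 0.02, 0.04, 0.03),
    (0.00, 0.02, 0.02, 0.03, 0.02),
    (0.01, 0.02, 0.02, 0.03, 0.02),
    (0.01, 0.00, 0.01, 0.05, 0.03),
]

stats = RunningStats(width=5)
for row in rows:                # single pass over the data
    stats.push(row)

print(stats.mean)
print(stats.stddev())
```

Because each `push` only touches the running state, the rows can come from a stream or a file and never need to be held in memory at once.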

You might also want to brush up on the difference between dividing by the number of samples (N) and N-1 in the variance calculation (squared deviation). Dividing by N-1 leads to an unbiased estimate of variance from the sample, whereas dividing by N on average underestimates variance (because it doesn't take into account the variance between the sample mean and the true mean).
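
To make the two conventions concrete, here is a small sketch that computes both for a single column (the first value of each row shown in the question) and checks them against Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by N-1. The variable names are illustrative.

```python
import statistics

column = [0.01, 0.00, 0.01, 0.01]   # first value of each row shown in the question

n = len(column)
mean = sum(column) / n
sum_sq_dev = sum((x - mean) ** 2 for x in column)

population_var = sum_sq_dev / n        # divide by N
sample_var = sum_sq_dev / (n - 1)      # divide by N - 1

# The standard library exposes both conventions.
assert abs(population_var - statistics.pvariance(column)) < 1e-12
assert abs(sample_var - statistics.variance(column)) < 1e-12

print(population_var, sample_var)
```

The two differ by a factor of N/(N-1), so the choice matters most for small N and becomes negligible as N grows.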

I wrote two blog entries on the topic which go into more details, including how to delete previous values online:
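
For the deletion case, one way to proceed is to algebraically invert the Welford update. The sketch below is an assumption about how that can be done, not a copy of the implementation described in those blog entries; the function names `add` and `remove` and the `(n, mean, m2)` tuple layout are hypothetical.

```python
def add(state, x):
    """Return the (n, mean, m2) state after including x."""
    n, mean, m2 = state
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)
    return n, mean, m2


def remove(state, x):
    """Return the state after removing a value x that was added earlier."""
    n, mean, m2 = state
    if n < 1:
        raise ValueError("nothing to remove")
    if n == 1:
        return 0, 0.0, 0.0
    old_mean = (n * mean - x) / (n - 1)     # mean before x was included
    m2 -= (x - mean) * (x - old_mean)       # undo the m2 increment
    return n - 1, old_mean, max(m2, 0.0)    # clamp tiny negative rounding error


state = (0, 0.0, 0.0)
for x in (0.04, 0.03, 0.03, 0.05):
    state = add(state, x)

state = remove(state, 0.04)     # drop the first value again

n, mean, m2 = state
print(n, mean, m2 / (n - 1))    # count, mean, sample variance of the remaining values
```

Removal subtracts from the accumulated sum of squared deviations, so it may lose more precision than insertion; recomputing from scratch periodically is a reasonable safeguard for long-running sliding windows.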

You can also take a look at my Java implementation; the javadoc, source, and unit tests are all online:

This answer was selected by the asker as the best answer.
