doubi7739 2018-10-11 19:31

Why is my Python code 100x slower than the same code in PHP?

I have two points (x1 and x2) and want to sample a normal distribution between them at a given number of steps, normalized so that the y values for the x values between x1 and x2 sum to 1. Now to the actual problem:

I'm fairly new to Python and wonder why the following code, which produces the desired result, runs about 100x slower than the same program in PHP. There are about 2000 x1-x2 pairs and about 5 step values per pair.

I tried compiling with Cython and using multiprocessing, but that only improved things about 2x, which is still 50x slower than PHP. Any suggestions on how to improve the speed to at least match PHP's performance?

from scipy.stats import norm
import numpy as np
import time

# Calculates normal distribution
def calculate_dist(x1, x2, steps, slope):
    points = []
    range = np.linspace(x1, x2, steps+2)

    for x in range:
        y = norm.pdf(x, x1+((x2-x1)/2), slope)
        points.append([x, y])

    sum = np.array(points).sum(axis=0)[1]

    norm_points = []
    for point in points:
        norm_points.append([point[0], point[1]/sum])

    return norm_points

start = time.time()
for i in range(0, 2000):
    for j in range(10, 15):
        calculate_dist(0, 1, j, 0.15)

print(time.time() - start) # Around 15 seconds or so

Edit, PHP Code:

$start = microtime(true);

for ($i = 0; $i<2000; $i++) {
    for ($j = 10; $j<15; $j++) {
        $x1 = 0; $x2 = 1; $steps = $j; $slope = 0.15;
        $step = abs($x2-$x1) / ($steps + 1);

        $points = [];
        for ($x = $x1; $x <= $x2 + 0.000001; $x += $step) {
            $y = stats_dens_normal($x, $x1 + (($x2 - $x1) / 2), $slope);
            $points[] = [$x, $y];
        }

        $sum = 0;
        foreach ($points as $point) {
            $sum += $point[1];
        }

        $norm_points = [];
        foreach ($points as &$point) {
            array_push($norm_points, [$point[0], $point[1] / $sum]);
        }
    }
}

return microtime(true) - $start; # Around 0.1 seconds or so

Edit 2: I profiled each line and found that norm.pdf() was taking 98% of the time, so I found a custom normpdf function and defined it. The time is now around 0.67 s, which is considerably faster but still around 10x slower than PHP. Also, I think redefining common functions goes against the idea of Python's simplicity?!

The custom function (the source is another Stack Overflow answer):

from math import sqrt, pi, exp
def normpdf(x, mu, sigma):
    u = (x-mu)/abs(sigma)
    y = (1/(sqrt(2*pi)*abs(sigma)))*exp(-u*u/2)
    return y


1 answer

  • douan4347 2018-10-12 00:39

    The answer is that you aren't using the right tools and data structures for the task in Python.

    Calling numpy functionality from Python has quite an overhead (scipy.stats.norm.pdf uses numpy under the hood), so one would never call these functions for a single element but rather for the whole array (so-called vectorized computation). That means instead of

    for x in range:
        y = norm.pdf(x, x1+((x2-x1)/2), slope)
        ys.append(y)
    

    one would rather use:

    ys = norm.pdf(x,x1+((x2-x1)/2), slope)
    

    where x is now the whole array. This calculates the pdf for all elements of x in a single call and pays the overhead only once rather than len(x) times.

    For example, calculating the pdf for 10^4 elements takes less than 10 times as long as calculating it for one element:

    %timeit norm.pdf(0)   # 68.4 µs ± 1.62 µs
    %timeit norm.pdf(np.zeros(10**4))   # 415 µs ± 12.4 µs
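
The %timeit lines above are IPython magics. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The helper name time_call and the repeat count are assumptions made for this illustration, and the absolute numbers will differ from machine to machine:

```python
import timeit

import numpy as np
from scipy.stats import norm


def time_call(func, repeats=200):
    """Return the average time in seconds per call over `repeats` calls."""
    return timeit.timeit(func, number=repeats) / repeats


x_small = 0.0                 # a single element
x_large = np.zeros(10**4)     # ten thousand elements

t_scalar = time_call(lambda: norm.pdf(x_small))
t_array = time_call(lambda: norm.pdf(x_large))

print(f"one element:    {t_scalar * 1e6:.1f} µs per call")
print(f"10**4 elements: {t_array * 1e6:.1f} µs per call")
print(f"cost per element in the array call: {t_array / 10**4 * 1e9:.1f} ns")
```

The last line is the interesting one: the fixed per-call overhead is spread over all elements of the array.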
    

    Using vectorized computation will not only make your program faster but often also shorter/easier to understand, for example:

    def calculate_dist_vec(x1, x2, steps, slope):
        x = np.linspace(x1, x2, steps+2)
        y = norm.pdf(x, x1+((x2-x1)/2), slope)
        ys = y/np.sum(y)
        return x, ys
    

    Using this vectorized version gives you a speed-up of around 10x.
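
Before swapping one version for the other, it may be worth checking that both return the same numbers. The following is a minimal sketch of such a check; calculate_dist_loop is a condensed restatement of the question's loop (the name is chosen here for illustration), and calculate_dist_vec is the function from this answer:

```python
import numpy as np
from scipy.stats import norm


def calculate_dist_loop(x1, x2, steps, slope):
    # Loop version, following the question's code: one norm.pdf call per x.
    xs = np.linspace(x1, x2, steps + 2)
    mu = x1 + (x2 - x1) / 2
    ys = [norm.pdf(x, mu, slope) for x in xs]
    total = sum(ys)
    return [[x, y / total] for x, y in zip(xs, ys)]


def calculate_dist_vec(x1, x2, steps, slope):
    # Vectorized version: a single norm.pdf call for the whole array.
    x = np.linspace(x1, x2, steps + 2)
    y = norm.pdf(x, x1 + ((x2 - x1) / 2), slope)
    return x, y / np.sum(y)


loop_points = np.array(calculate_dist_loop(0, 1, 10, 0.15))
x, ys = calculate_dist_vec(0, 1, 10, 0.15)

print(np.allclose(loop_points[:, 0], x))    # same x values
print(np.allclose(loop_points[:, 1], ys))   # same normalized y values
print(np.isclose(ys.sum(), 1.0))            # y values sum to 1
```

Note that the vectorized version returns two arrays instead of a list of [x, y] pairs, so calling code has to be adapted accordingly.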

    The problem: norm.pdf is optimized for long vectors (nobody really cares how fast or slow it is for 10 elements if it is very fast for one million elements), but your test is biased against numpy because it only creates and uses short arrays, so norm.pdf cannot shine.
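
If all x1-x2 pairs are known up front, one possible way to give numpy longer arrays is to process every pair with the same step count in a single call. This is only a sketch under assumptions: the function name calculate_dist_batch is made up here, the same slope is used for all pairs, and passing arrays as start and stop to np.linspace requires NumPy 1.16 or newer:

```python
import numpy as np
from scipy.stats import norm


def calculate_dist_batch(x1s, x2s, steps, slope):
    """Normalized pdf values for many (x1, x2) pairs sharing one step count."""
    x1s = np.asarray(x1s, dtype=float)
    x2s = np.asarray(x2s, dtype=float)
    # Shape (n_pairs, steps + 2): one row of x values per pair.
    x = np.linspace(x1s, x2s, steps + 2, axis=-1)
    mu = x1s + (x2s - x1s) / 2
    # mu[:, None] broadcasts each pair's mean across its own row.
    y = norm.pdf(x, mu[:, None], slope)
    # Normalize every row so that it sums to 1.
    return x, y / y.sum(axis=1, keepdims=True)


# Same workload as the question's benchmark: 2000 pairs, steps from 10 to 14.
x1s = np.zeros(2000)
x2s = np.ones(2000)
for steps in range(10, 15):
    x, ys = calculate_dist_batch(x1s, x2s, steps, 0.15)

print(x.shape, ys.shape)
```

With this layout norm.pdf is called 5 times instead of once per pair and step value, so the per-call overhead is paid far less often.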

    So if it really is about small arrays and you are serious about speeding it up, you will have to write your own version of norm.pdf. Using Cython to create this fast, specialized function might be worth a try.
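
As a starting point before reaching for Cython, a specialized replacement might avoid numpy altogether for such short inputs. Below is a minimal pure-Python sketch that reuses the normpdf function from the question's second edit; the name calculate_dist_plain is an assumption for this illustration, and no timings were measured for it here:

```python
from math import exp, pi, sqrt


def normpdf(x, mu, sigma):
    # Density of the normal distribution with mean mu and standard deviation sigma.
    u = (x - mu) / abs(sigma)
    return exp(-u * u / 2) / (sqrt(2 * pi) * abs(sigma))


def calculate_dist_plain(x1, x2, steps, slope):
    # `steps` interior points plus both endpoints,
    # matching np.linspace(x1, x2, steps + 2).
    n = steps + 2
    step = (x2 - x1) / (n - 1)
    mu = x1 + (x2 - x1) / 2
    xs = [x1 + i * step for i in range(n)]
    ys = [normpdf(x, mu, slope) for x in xs]
    total = sum(ys)
    return [[x, y / total] for x, y in zip(xs, ys)]


points = calculate_dist_plain(0, 1, 10, 0.15)
print(len(points))                                    # 12 points for steps=10
print(abs(sum(y for _, y in points) - 1.0) < 1e-12)   # y values sum to 1
```

Plain lists and floats carry no array-creation overhead, which is the cost that dominates for inputs of 10 to 15 elements.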

    This answer was accepted by the asker as the best answer.