doubi7739 2018-10-12 03:31
Views: 146
Accepted

Why is my Python code 100x slower than the same code in PHP?

I have two points (x1 and x2) and want to generate a normal distribution with a given number of steps, such that the y values for the x values between x1 and x2 sum to 1. Now to the actual problem:

I'm fairly new to Python and wonder why the following code produces the desired result, but about 100x slower than the same program in PHP. There are about 2000 x1-x2 pairs and about 5 step values per pair.

I tried compiling with Cython and used multiprocessing, but that only improved things 2x, which is still 50x slower than PHP. Any suggestions on how to improve the speed to at least match PHP's performance?

from scipy.stats import norm
import numpy as np
import time

# Calculates a normal distribution between x1 and x2, normalized so the y values sum to 1
def calculate_dist(x1, x2, steps, slope):
    points = []
    xs = np.linspace(x1, x2, steps+2)  # renamed to avoid shadowing the built-in range

    for x in xs:
        y = norm.pdf(x, x1+((x2-x1)/2), slope)
        points.append([x, y])

    total = np.array(points).sum(axis=0)[1]  # renamed to avoid shadowing the built-in sum

    norm_points = []
    for point in points:
        norm_points.append([point[0], point[1]/total])

    return norm_points

start = time.time()
for i in range(0, 2000):
    for j in range(10, 15):
        calculate_dist(0, 1, j, 0.15)

print(time.time() - start) # Around 15 seconds or so

Edit: PHP code:

$start = microtime(true);

for ($i = 0; $i<2000; $i++) {
    for ($j = 10; $j<15; $j++) {
        $x1 = 0; $x2 = 1; $steps = $j; $slope = 0.15;
        $step = abs($x2-$x1) / ($steps + 1);

        $points = [];
        for ($x = $x1; $x <= $x2 + 0.000001; $x += $step) {
            $y = stats_dens_normal($x, $x1 + (($x2 - $x1) / 2), $slope);
            $points[] = [$x, $y];
        }

        $sum = 0;
        foreach ($points as $point) {
            $sum += $point[1];
        }

        $norm_points = [];
        foreach ($points as &$point) {
            array_push($norm_points, [$point[0], $point[1] / $sum]);
        }
    }
}

return microtime(true) - $start; # Around 0.1 seconds or so

Edit 2: I profiled each line and found that norm.pdf() was taking 98% of the time, so I defined a custom normpdf function; the time is now around 0.67 s, which is considerably faster but still around 10x slower than PHP. Also, doesn't redefining common functions go against Python's idea of simplicity?!

The custom function (from another Stack Overflow answer):

from math import sqrt, pi, exp
def normpdf(x, mu, sigma):
    u = (x-mu)/abs(sigma)
    y = (1/(sqrt(2*pi)*abs(sigma)))*exp(-u*u/2)
    return y
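As a quick sanity check (a sketch using the custom function above), the density at the mean should equal the analytic peak of the Gaussian, 1/(sigma*sqrt(2*pi)):

```python
from math import sqrt, pi, exp

def normpdf(x, mu, sigma):
    # Plain-Python Gaussian density, avoiding scipy's per-call overhead
    u = (x - mu) / abs(sigma)
    return (1 / (sqrt(2 * pi) * abs(sigma))) * exp(-u * u / 2)

# At x == mu, exp(0) == 1, so the density peaks at 1 / (sigma * sqrt(2*pi))
peak = normpdf(0.5, 0.5, 0.15)
print(peak == 1 / (0.15 * sqrt(2 * pi)))  # True
```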

1 Answer

  • douan4347 2018-10-12 08:39

    The answer is that you aren't using the right tools/data structures for the task in Python.

    Calling NumPy functionality has quite an overhead in Python (scipy.stats.norm.pdf uses NumPy under the hood), so one should never call these functions for a single element but rather for the whole array at once (so-called vectorized computation). That means instead of

    for x in range:
            y = norm.pdf(x, x1+((x2-x1)/2), slope)
            ys.append(y)
    

    one would rather use:

    ys = norm.pdf(x,x1+((x2-x1)/2), slope)
    

    This calculates the pdf for all elements of x at once, paying the overhead only once rather than len(x) times.

    For example, calculating the pdf for 10^4 elements takes less than 10 times longer than for a single element:

    %timeit norm.pdf(0)   # 68.4 µs ± 1.62 µs
    %timeit norm.pdf(np.zeros(10**4))   # 415 µs ± 12.4 µs
    

    Using vectorized computation will not only make your program faster but often also shorter and easier to understand. For example:

    def calculate_dist_vec(x1, x2, steps, slope):
        x = np.linspace(x1, x2, steps+2)
        y = norm.pdf(x, x1+((x2-x1)/2), slope)
        ys = y/np.sum(y)
        return x, ys
    

    Using this vectorized version gives you a speed-up of around 10x.

    The problem: norm.pdf is optimized for long vectors (nobody really cares how fast or slow it is for 10 elements if it is very fast for one million elements), but your test is biased against NumPy because it uses/creates only short arrays, so norm.pdf cannot shine.

    So if it is really about small arrays and you are serious about speeding it up, you will have to roll your own version of norm.pdf. Using Cython to create this fast, specialized function might be worth a try.
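    Before reaching for Cython, a plain NumPy version of the pdf (a sketch, not scipy's implementation; `normpdf_np` and `calculate_dist_fast` are hypothetical names) already skips scipy's per-call argument handling while staying vectorized:

```python
import numpy as np

SQRT_2PI = np.sqrt(2 * np.pi)

def normpdf_np(x, mu, sigma):
    # Vectorized Gaussian density without scipy's per-call overhead
    u = (np.asarray(x, dtype=float) - mu) / abs(sigma)
    return np.exp(-0.5 * u * u) / (abs(sigma) * SQRT_2PI)

def calculate_dist_fast(x1, x2, steps, slope):
    # Same result as the question's calculate_dist, returned as (xs, normalized ys)
    xs = np.linspace(x1, x2, steps + 2)
    ys = normpdf_np(xs, x1 + (x2 - x1) / 2, slope)
    return xs, ys / ys.sum()

xs, ys = calculate_dist_fast(0, 1, 10, 0.15)
print(round(float(ys.sum()), 10))  # 1.0
```

    Timings vary by machine, but this keeps the whole inner loop in NumPy without scipy's dispatch, which is exactly where the profiler said the time was going.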

    This answer was selected by the asker as the best answer.
