doudou1309 2009-09-16 16:58
浏览 199
已采纳

地图聚类算法

My current code is pretty quick, but I need to make it even faster so we can accommodate even more markers. Any suggestions?

Notes:

  • The code runs fastest when the SQL statement is ordered by marker name - which itself does a very partial job of clustering the markers (the names of markers in the same location are often, but not always similar).
  • I can't pre-cluster the markers, because they can be dynamically searched and filtered.
  • I've tried grid-based clustering - but the results often aren't very nice.
  • I know that the clusters are slightly skewed on a Mercator projection.
  • I'm not interested in a commercial clustering service.

The code:

$singleMarkers = array();
$clusterMarkers = array();

while (count($markers)) {
    $marker  = array_pop($markers);
    $cluster = array();

    // Compare marker against all remaining markers.
    foreach ($markers as $key => $compareMarker) {
        // This function returns the distance between two markers, at a defined zoom level.
        $pixels = pixelDistance($marker['lat'], $marker['lng'], $compareMarker['lat'], $compareMarker['lng'], $zoomLevel);
        // If two markers are closer than defined distance, remove compareMarker from array and add to cluster.
        if ($pixels < $distance) {
            unset($markers[$key]);
            $cluster[] = $compareMarker;
        }
    }

    // If a marker was added to cluster, also add the marker we were comparing to.
    if (count($cluster) > 0) {
        $cluster[] = $marker;
        $clusterMarkers[] = $cluster;
    } else {
        $singleMarkers[] = $marker;
    }
}

function pixelDistance($lat1, $lon1, $lat2, $lon2, $zoom) {
    $x1 = $lon1*10000000; //This is what I did to compensate for using lat/lon values instead of pixels.
    $y1 = $lat1*10000000;
    $x2 = $lon2*10000000;
    $y2 = $lat2*10000000;

    return sqrt(pow(($x1-$x2),2) + pow(($y1-$y2),2)) >> (21 - $zoom); //21 is the max zoom level
}

UPDATE

Here's the current code:

$singleMarkers = array();
$clusterMarkers = array();

// Minimum distance between markers to be included in a cluster, at diff. zoom levels
$DISTANCE = (10000000 >> $ZOOM) / 100000;

// Loop until all markers have been compared.
while (count($markers)) {
    $marker  = array_pop($markers);
    $cluster = array();

    // Compare against all markers which are left.
    foreach ($markers as $key => $target) {
        $pixels = abs($marker['lat']-$target['lat']) + abs($marker['lng']-$target['lng']);

        // If the two markers are closer than given distance remove target marker from array and add it to cluster.
        if ($pixels < $DISTANCE) {
            unset($markers[$key]);
            $cluster[] = $target;
        }
    }

    // If a marker has been added to cluster, add also the one we were comparing to.
    if (count($cluster) > 0) {
        $cluster[] = $marker;
        $clusterMarkers[] = $cluster;
    } else {
        $singleMarkers[] = $marker;
    }
}
  • 写回答

8条回答 默认 最新

  • donglvhe7591 2009-09-16 17:40
    关注

    Do you actually need to calculate the Euclidean distance? If you are just comparing relative magnitudes of distances, you can probably get away with using the Manhattan distance, which will save you two calls to pow() and one to sqrt():

    function pixelDistance($lat1, $lon1, $lat2, $lon2, $zoom) {
        $x1 = $lon1*10000000; //This is what I did to compensate for using lat/lon values instead of pixels.
        $y1 = $lat1*10000000;
        $x2 = $lon2*10000000;
        $y2 = $lat2*10000000;
    
        return ($x1-$x2) + ($y1-$y2) >> (21 - $zoom);
    }
    

    Not sure if you need the >> (21 - $zoom) bit for your calculations, so I left it in. But unless you actually need to use the calculated distance values elsewhere, you can probably get away with just using the latitude/longitude directly (no need to multiply by anything) and taking the Manhattan distance, assuming you pre-calculate $distance to fit in with that measure, which will be a lot cheaper in computational terms than coercing all the distances to fit in with the units and magnitude of $distance.

    EDIT: When I was researching this problem last year, I found some useful stuff on Wikipedia - yes, it can happen ;-)

    I can also highly recommend the book Programming Collective Intelligence: Building Smart Web 2.0 Applications which goes into clustering in great depth, as applied not only to geographical data but also to other areas of data analysis.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(7条)

报告相同问题?

悬赏问题

  • ¥35 平滑拟合曲线该如何生成
  • ¥100 c语言,请帮蒟蒻写一个题的范例作参考
  • ¥15 名为“Product”的列已属于此 DataTable
  • ¥15 安卓adb backup备份应用数据失败
  • ¥15 eclipse运行项目时遇到的问题
  • ¥15 关于#c##的问题:最近需要用CAT工具Trados进行一些开发
  • ¥15 南大pa1 小游戏没有界面,并且报了如下错误,尝试过换显卡驱动,但是好像不行
  • ¥15 自己瞎改改,结果现在又运行不了了
  • ¥15 链式存储应该如何解决
  • ¥15 没有证书,nginx怎么反向代理到只能接受https的公网网站