doubeijian2257 2016-06-28 16:56

Streaming a large, customizable list of items to the client

I’m creating an API that retrieves items from a third-party component and returns them in a specified XML/CSV/text structure, which the admin can customize via a template.

The problem: a single API request may easily include millions of items, so building the whole list server-side and then sending it to the client is not feasible memory-wise.

Instead the items should be created on-the-fly and the results should be sent to the client immediately, without storing them in PHP’s memory.

How is this possible?

Example template:

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

<items>
    {items}
        <item no="{number}">{item}</item>
    {/items}
</items>

Current code example without streaming. Not actually working, but you should get the idea:

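// Replace the {items}...{/items} block with the rendered list.
echo preg_replace_callback('@\{items\}(.*)\{/items\}@si', function (array $matches) {
    return createItems($matches[1]);
}, $template);


// Builds the complete list in memory; this is the part that does not scale.
// itemsExist() and getItem() stand for the third-party component.
function createItems($itemTemplate)
{
    $items = '';
    while (itemsExist()) {
        $items .= getItem($itemTemplate);
    }
    return $items;
}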

I guess I should stop buffering each item in a variable and echo the items directly instead? But how do I keep the surrounding structure intact, whether it is XML, CSV, JSON, or whatever else the template contains around the list?

1 answer

  • dongwo7858 2016-06-28 17:20

    If you're reaching the point where the result sets that you're generating on the server are too large to fit into memory, you should consider how the clients of your API are going to process such a large result set too.

    There are two patterns that I've seen to solve this kind of problem:

    1. Pagination

    Use pagination within your API to return pages of results, just as you would on a webpage. Usually, this involves supplying a URL to the "next page" of the result set in your API response. The client can then simply iterate over the API responses until no "next page" URL is present in the response, which indicates that the end of the result set has been reached.

    Your API response would look something like this:

    {
       "items": [ { }, { } ... ],
       "next_page": "http://my.domain.com/results?page=2"
    }
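
The pagination pattern can be sketched in PHP roughly as follows. This is only an illustration under assumptions: `fetchItems()`, `buildPage()` and `iteratePages()` are hypothetical names, the five-item data set is made up, and the real third-party component would need some way to fetch items by offset.

```php
<?php
// Hypothetical data source standing in for the third-party component.
// Returns at most $limit items, starting at $offset.
function fetchItems(int $offset, int $limit): array
{
    $total = 5; // assumption: tiny data set, for illustration only
    $items = [];
    for ($i = $offset; $i < min($offset + $limit, $total); $i++) {
        $items[] = ['number' => $i + 1, 'item' => 'Item ' . ($i + 1)];
    }
    return $items;
}

// Server side: builds one page of the response. Only $pageSize items
// (plus one look-ahead item) are held in memory at a time.
function buildPage(int $page, int $pageSize, string $baseUrl): array
{
    $offset = ($page - 1) * $pageSize;
    // Fetch one extra item to find out whether another page exists.
    $items = fetchItems($offset, $pageSize + 1);

    $response = ['items' => array_slice($items, 0, $pageSize)];
    if (count($items) > $pageSize) {
        $response['next_page'] = $baseUrl . '?page=' . ($page + 1);
    }
    return $response;
}

// Client side: follows next_page links until none is present.
// $fetch is any callable that maps a URL to a decoded response array.
function iteratePages(callable $fetch, string $firstUrl): Generator
{
    $url = $firstUrl;
    while ($url !== null) {
        $response = $fetch($url);
        yield from $response['items'];
        $url = $response['next_page'] ?? null;
    }
}
```

In a real endpoint the server would send `json_encode(buildPage(...))`, and the client's `$fetch` callable would perform the HTTP request and decode the body.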
    

    2. Asynchronous result set generation

    With this approach, your clients would POST to your API and be immediately given a token.

    The API would perform the generation of the entire response in the background - usually using a message queue system such as RabbitMQ or SQS - saving the result to a file on the web server. Note that this takes place outside of an HTTP request, so the client is not blocking the webserver for the duration of the process.

    The client polls the API regularly, passing the token that it received from the API previously. Eventually, the API will respond with some data to indicate that the result set has been generated and is ready to be downloaded. The API could then either include the contents of the result set in its response, or provide a URL from which the client can download the result set.
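
The token-and-poll flow might look like the following minimal sketch. It assumes that job state is kept in plain files inside a directory `$dir`. A real setup would hand the work to a queue consumer instead. `submitJob()`, `runJob()` and `pollJob()` are hypothetical names, not part of any library.

```php
<?php
// Assumption: jobs are tracked as files in $dir. A real system would use a
// message queue and a separate worker process instead.

// Called by the POST endpoint: registers a job and returns its token.
function submitJob(string $dir): string
{
    $token = bin2hex(random_bytes(16)); // 32 hex characters
    file_put_contents("$dir/$token.status", 'pending');
    return $token;
}

// Called by the background worker, outside of any HTTP request.
// Items are written one at a time, so memory use stays flat.
function runJob(string $dir, string $token, iterable $items): void
{
    $out = fopen("$dir/$token.result", 'w');
    foreach ($items as $item) {
        fwrite($out, $item . "\n");
    }
    fclose($out);
    file_put_contents("$dir/$token.status", 'done');
}

// Called by the polling endpoint with the token the client received.
function pollJob(string $dir, string $token): array
{
    // Accept only tokens of the expected shape, so that a crafted token
    // cannot point outside $dir.
    if (preg_match('/\A[0-9a-f]{32}\z/', $token) !== 1
        || !is_file("$dir/$token.status")) {
        return ['status' => 'unknown'];
    }
    $status = file_get_contents("$dir/$token.status");
    if ($status !== 'done') {
        return ['status' => $status];
    }
    // Assumption: the file path stands in for a download URL.
    return ['status' => 'done', 'download' => "$dir/$token.result"];
}
```

The client would call the polling endpoint with its token until the status is `done`, then fetch the result from the returned location.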

    There is a third alternative, but I wouldn't recommend it unless you plan on building client libraries for your API consumers. You could make use of PHP's stream_* functions to create a stream that your API operates over. This allows you to push data onto the stream, and your clients to read data from it, without consuming large amounts of memory. There is a lot of additional work involved, however, especially if the client needs to parse an entire XML/JSON document.
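
As a rough illustration of the streaming idea applied to the template from the question, the template can be split into the text before the list, the per-item template, and the text after the list. Each part is then written out as soon as it is available. The sketch below uses plain `fwrite()` on an output stream rather than the `stream_*` family. `splitTemplate()`, `streamItems()` and `generateItems()` are hypothetical names, and XML escaping is assumed.

```php
<?php
// Splits the template into the text before the list, the per-item
// template, and the text after the list.
function splitTemplate(string $template): array
{
    if (!preg_match('@\A(.*?)\{items\}(.*?)\{/items\}(.*)\z@s', $template, $m)) {
        throw new InvalidArgumentException('Template has no {items} block');
    }
    return [$m[1], $m[2], $m[3]];
}

// Writes the document piece by piece to the stream resource $out.
// No more than one rendered item is held in memory at a time.
function streamItems(string $template, iterable $items, $out): void
{
    [$head, $itemTemplate, $tail] = splitTemplate($template);
    fwrite($out, $head);
    $number = 0;
    foreach ($items as $item) {
        $number++;
        fwrite($out, strtr($itemTemplate, [
            '{number}' => (string) $number,
            // Assumption: XML output; other formats need their own escaping.
            '{item}'   => htmlspecialchars($item, ENT_XML1 | ENT_QUOTES, 'UTF-8'),
        ]));
        fflush($out); // pass the chunk on instead of letting it accumulate
    }
    fwrite($out, $tail);
}

// Hypothetical item source: a generator yields items one at a time.
function generateItems(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield "Item $i";
    }
}

// In an endpoint, the stream could be PHP's output stream:
// streamItems($template, generateItems(1000000), fopen('php://output', 'w'));
```

For the chunks to actually reach the client early, PHP output buffering has to be off, and the web server or any proxy in between must not buffer or compress the whole response first.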

    I would recommend pagination. It's easy to reason about, not difficult to implement on the API end, reusable, and it removes memory consumption issues on both the client and the server side.

    This answer was selected by the asker as the best answer.