doushi5117 2014-01-17 01:58

Parsing XML remotely vs. saving the file locally with cURL and parsing it locally

There is an XML data file on a remote server that is released every 10 minutes. I will be using a cron job to fetch and parse it, save the info to a MySQL database, and then display it on my site. I have a question about best practice when doing this.

The file is about 200-300 KB, so it's not very large, but I had two ideas on how to do this:

1) Just use simplexml_load_file() to load the file and parse the info.

2) Use cURL to grab the file and save it to my server and then do the parse from my server locally.

I'm curious what best practice is and what would be the most efficient. With simplexml_load_file(), is the file loaded locally and then parsed, or is it loaded several times over as you go through the data? If it's just loaded once, I suppose that would be the best bet. One of my concerns is that I don't want to bog down the server I'm grabbing the XML file from every time my cron job runs. I imagine it wouldn't, since it's such a small file, but I'm trying to grab the file at the intervals and then do what needs to be done with the data in the best possible way.

I'm trying to wrap my head around how these functions work. Let me know if you need any more clarification on the question. I appreciate the help!


  • drt5813 2014-01-17 02:10

    Both will work fine. Normally, on a file that small I'd probably do what you're doing now. That being said, if it is time-sensitive and on a cron job, I'd do something a little different.

    Pull the file over to your server and save a hash of it. If the new file's hash differs from the previous one, parse it; otherwise rerun the script in 30 seconds. If the cron job runs every 8-9 minutes, you're good to within +/- 2 minutes.

    That way you don't run the risk of the cron job running 30 seconds early and falling 9:30 minutes behind.
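The hash-and-retry idea above can be sketched roughly as follows. This is only an illustration, written in Python rather than PHP; the function name `fetch_if_changed`, the `hash_path` file, the retry count, and the choice of SHA-256 are all assumptions, not something the answer specifies. In PHP the same shape could be built on `hash_file()` and a saved copy of the feed.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional


def fetch_if_changed(
    fetch: Callable[[], bytes],
    hash_path: Path,
    retries: int = 4,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Return the feed body if its hash differs from the stored one, else None.

    `fetch` is any callable that downloads the feed and returns raw bytes
    (hypothetical; supply your own HTTP call).
    """
    # Hash recorded by the previous successful run, if any.
    previous = hash_path.read_text().strip() if hash_path.exists() else ""

    for attempt in range(retries + 1):
        body = fetch()
        digest = hashlib.sha256(body).hexdigest()
        if digest != previous:
            # New data: remember its hash and hand the body to the parser.
            hash_path.write_text(digest)
            return body
        if attempt < retries:
            # Same file as last time: wait and try again.
            sleep(delay_seconds)

    # The feed never changed within the retry budget.
    return None
```

A cron-driven script would call this once per run, parse the returned bytes when they are not `None`, and simply exit otherwise. Passing `fetch` and `sleep` in as parameters is a design choice that keeps the logic testable without touching the network.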

    To answer your question, "With simplexml_load_file(), is the file loaded locally and then parsed or is loaded several times over as you go through the data?": it pulls the file to your server once, then parses the XML.
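To make the "pulled once, then parsed" point concrete, here is a small sketch in Python (not PHP) of the same pattern: the document is downloaded a single time into memory, and every later traversal works on the in-memory tree. The helper names and the `item`/`title` element names are hypothetical, since the question does not show the feed's structure.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


def load_feed(fetch: Callable[[], bytes]) -> ET.Element:
    """Download the document once and parse it into an in-memory tree."""
    return ET.fromstring(fetch())


def item_titles(root: ET.Element) -> List[str]:
    """Walk the already-parsed tree; no network access happens here."""
    # "item" and "title" are placeholder element names for illustration.
    return [item.findtext("title", default="") for item in root.iter("item")]
```

Calling `item_titles(root)` any number of times triggers no further downloads, because only `load_feed` ever invokes `fetch`.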

    Hope that helps. :)

    Edit: For a more in-depth explanation of what is going on, you can search for "http stateless get request". It's a ton to get your head around, and the more I learn the more questions I have ;) but it'll explain what's going on when your script makes a request to GET the XML (or other MIME type) file from another server.

    This answer was accepted by the asker as the best answer.
