Parsing XML remotely vs. using cURL to save the file locally and parsing it there

An XML data file is published on a server every 10 minutes, and I will be using a cron job to parse it and update my site. I'll take the info, save it to a MySQL database and then display it on my site. I have a question about best practice when doing this.

The file is about 200 - 300 KB so it's not very large but I had two ideas on how to do this:

1) Just use simplexml_load_file() to load the file and parse the info.

2) Use cURL to grab the file and save it to my server and then do the parse from my server locally.

I'm curious what best practice is and what would be the most efficient. With simplexml_load_file(), is the file loaded locally and then parsed, or is it loaded several times over as you go through the data? If it's just loaded once, I suppose that would be the best bet. One of my concerns is that I don't want to bog down the server I'm grabbing the XML file from every time my cron job runs. I imagine it wouldn't, since it's such a small file, but I'm trying to just grab the file at the intervals and then do what needs to be done with the data in the best possible way.

I'm trying to wrap my head around how these functions work. Let me know if you need any more clarification on the question. I appreciate the help!

drt5813 2014-01-17 02:10
Both will work fine. Normally, on a file that small I'd probably do what you're doing now. That being said, if it is time-sensitive and on a cron job, I'd do something a little different.
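
For comparison, here is roughly what the second option from the question could look like, with the first option shown as a comment. This is a minimal sketch that assumes the PHP cURL extension is installed; the function name `downloadTo`, the `.part` suffix and the 20-second timeout are illustrative choices, not anything from the original posts.

```php
<?php
// Option 1 is a single call that fetches and parses in one step:
//   $xml = simplexml_load_file($url);

// Option 2 (hypothetical helper): download with cURL to a local file first.
function downloadTo(string $url, string $localPath, int $timeoutSeconds = 20): bool
{
    // Write to a temporary name so a failed download never
    // overwrites the last good copy.
    $tmpPath = $localPath . '.part';
    $out = fopen($tmpPath, 'wb');
    if ($out === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $out);           // stream the body into the file
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as failure
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    $ok = curl_exec($ch) !== false;
    unset($ch);                                     // release the cURL handle
    fclose($out);

    if (!$ok) {
        @unlink($tmpPath);
        return false;
    }
    return rename($tmpPath, $localPath);
}
```

After a successful download, `simplexml_load_file($localPath)` parses the saved copy. Either way the remote server sees one request per run; keeping a local copy mainly helps if you want to compare or re-parse the file without fetching it again.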

Pull the file over to your server and save a hash of it. If the new file's hash differs from the saved one, parse it; otherwise rerun the script in 30 seconds. If that runs every 8-9 minutes, you're good, +/- 2 minutes.

That way you don't run the risk of having the cron job run 30 seconds early and fall 9:30 minutes behind.
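
The hash check and retry described above could be sketched as follows. This is one possible reading of the idea, not a definitive implementation: `pollForUpdate`, the hash file, SHA-256 and the default of four tries are all assumptions made for illustration, and `file_get_contents()` stands in for the download step (cURL would work just as well).

```php
<?php
// Hypothetical helper: returns the parsed feed if it changed since the
// last successful run, or null if nothing new appeared within $maxTries.
function pollForUpdate(
    string $source,
    string $hashFile,
    int $maxTries = 4,
    int $waitSeconds = 30
): ?SimpleXMLElement {
    // Hash saved by the previous successful run ('' if there is none yet).
    $oldHash = is_file($hashFile)
        ? trim((string) file_get_contents($hashFile))
        : '';

    for ($try = 1; $try <= $maxTries; $try++) {
        $body = @file_get_contents($source);        // one request per attempt
        if ($body !== false) {
            $newHash = hash('sha256', $body);
            if (!hash_equals($oldHash, $newHash)) {
                $xml = simplexml_load_string($body);
                if ($xml !== false) {
                    // Remember the hash only once the file parsed cleanly.
                    file_put_contents($hashFile, $newHash);
                    return $xml;
                }
            }
        }
        if ($try < $maxTries) {
            sleep($waitSeconds);                     // same file as before: wait and retry
        }
    }
    return null;
}
```

The cron script would call `pollForUpdate()` with the feed URL and a writable path for the hash file, then write to the database only when the result is not null.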

To answer your question, "With simplexml_load_file(), is the file loaded locally and then parsed or is loaded several times over as you go through the data?": it pulls the file to your server once, then parses the XML. Going through the data afterwards does not request the file again.
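
To make the "loaded once" point concrete, the sketch below splits the work into an explicit fetch-and-parse step and a separate loop over the result. It is only an illustration: `loadFeed`, `collectReadings` and the `<reading station="…"><value>` layout are made up, since the real feed's structure is not given in the question.

```php
<?php
// In effect what simplexml_load_file($source) does: read the document
// once, then build an in-memory tree from it.
function loadFeed(string $source): ?SimpleXMLElement
{
    $body = @file_get_contents($source);   // the only access to $source
    if ($body === false) {
        return null;
    }
    $xml = simplexml_load_string($body);
    return $xml === false ? null : $xml;
}

// Walking the tree works purely on the in-memory copy.
// Element and attribute names here are hypothetical.
function collectReadings(SimpleXMLElement $xml): array
{
    $rows = [];
    foreach ($xml->reading as $reading) {
        $rows[] = [
            'station' => (string) $reading['station'],
            'value'   => (float) $reading->value,
        ];
    }
    return $rows;
}
```

The rows returned by `collectReadings()` are plain arrays, so they can be passed on to whatever inserts them into MySQL.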

Hope that helps. :)

Edit: For a more in-depth explanation of what is going on, you can search for "http stateless get request". It's a ton to get your head around, and the more I learn the more questions I have ;) but it'll explain what's going on when your script makes a request to GET the XML (or other MIME type) file from another server.

This answer was selected by the asker as the best answer.
Tags: mysql, php, xml
Asked: 2014-01-17 01:58