Synchronous JavaScript functions to scrape HTML and JS

Is there a library that would support synchronous JavaScript functions like the following?

function getPageHTML(url){
     // scrape HTML from external web page
     return html;
}

function getPageJS(url){
     // scrape final JavaScript variable results from external web page
     return js;
}

I like the concept behind pjscrape, but I don't want to use a command-line script. I don't mind using PHP, but I want my functions to be synchronous.

dongyue7796 2016-07-23 17:31
There is no JavaScript environment in which synchronous networking is the recommended way to retrieve data from an external server. That is simply not how JavaScript is designed. JavaScript uses asynchronous I/O: the result is delivered via a promise or a callback and cannot be returned directly from your function call.

The "A" in "Ajax" stands for asynchronous, and that is a cornerstone of making networked requests from JavaScript in the browser. The browser can technically make a synchronous Ajax call, but that is not recommended for a variety of reasons (for example, it hangs the browser UI for the duration of the call), and it is being deprecated in many circumstances because synchronous Ajax is almost never a good idea. In addition, Ajax calls from the browser are limited to the same origin your web page came from, or to servers that explicitly allow cross-origin requests. So you cannot expect an Ajax call to fetch an arbitrary page on the internet; most other pages will be off limits to an Ajax call made from a web page.

What the browser is good at is asynchronous networking, where the result is returned asynchronously via a callback or promise at some point in the future and the rest of your JavaScript continues to run in the meantime. This is how you should code your network requests.
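To make the ordering concrete, here is a minimal sketch. `fakeRequest` is a hypothetical stand-in that simulates a network call with a timer, so no real request is made; the point is only that the code after the call runs before the response handler does.

```javascript
// Hypothetical stand-in for a network request: resolves after a short delay.
function fakeRequest(url) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve("<html>stub for " + url + "</html>");
    }, 10);
  });
}

// Records the order in which the steps actually run.
const order = [];

order.push("before request");
const pending = fakeRequest("http://example.com").then(function (html) {
  // Runs later, once the simulated response has arrived.
  order.push("response handled");
  return html;
});
// Runs immediately; the request above has not completed yet.
order.push("after request");
```

Once `pending` settles, `order` holds "before request", "after request", "response handled", in that order, which is why the result cannot simply be returned from the function that started the request.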

If you want to get scraped results from an external site into a browser, the preferred architecture is to set up a server that does the work for you. The JavaScript in your web page makes an Ajax call to your own server, asking it to scrape a specific web site. The server (which has no cross-origin limitations on which hosts it can make requests to) then fetches the content, scrapes it into the desired results and returns the scraped data in the response to your Ajax call.
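A rough sketch of the server side of that architecture, assuming Node.js 18 or later (for the built-in `fetch`). The `/scrape` endpoint, the `scrapePage` and `createScrapeServer` names and the JSON response shape are all made up for illustration, and this version only returns raw HTML. Getting the page's final JavaScript variable values would need something that actually executes the page, such as a headless browser, which is not shown here.

```javascript
const http = require("http");

// Fetches a page on the server, where browser cross-origin limits do not apply.
// fetchImpl is injectable so the function can be exercised without a network.
async function scrapePage(targetUrl, fetchImpl = fetch) {
  const parsed = new URL(targetUrl); // throws on malformed input
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error("Only http and https URLs are supported");
  }
  const response = await fetchImpl(parsed.href);
  if (!response.ok) {
    throw new Error("Upstream responded with status " + response.status);
  }
  return { url: parsed.href, html: await response.text() };
}

// Hypothetical endpoint: GET /scrape?url=<encoded target URL>
function createScrapeServer(fetchImpl = fetch) {
  return http.createServer(async function (req, res) {
    const requestUrl = new URL(req.url, "http://localhost");
    const target = requestUrl.searchParams.get("url");
    if (requestUrl.pathname !== "/scrape" || !target) {
      res.writeHead(400, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "Expected /scrape?url=..." }));
      return;
    }
    try {
      const result = await scrapePage(target, fetchImpl);
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify(result));
    } catch (err) {
      // Report any fetch or validation failure to the caller.
      res.writeHead(502, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: err.message }));
    }
  });
}
```

To try it, call `createScrapeServer().listen(3000)` (the port is arbitrary). A real deployment would also want to restrict which targets may be requested, since an open endpoint like this will fetch any URL it is given.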

So, you could design a promise based interface in your client that could work asynchronously like this:

getPageJS(someUrl).then(function(data) {
     // process data here
}).catch(function(err) {
     // process error here
});
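One possible shape for `getPageJS` itself, as a sketch only. It assumes your own server exposes a `/scrape?url=...` endpoint that answers with JSON; that endpoint and the response format are assumptions, not something given in the question. The optional `fetchImpl` parameter is there purely so the function can be tried with a stub instead of a real network call.

```javascript
// Hypothetical client helper: asks *your own* server to scrape a page.
// The "/scrape" endpoint and its JSON response are assumptions.
function getPageJS(url, fetchImpl = fetch) {
  const endpoint = "/scrape?url=" + encodeURIComponent(url);
  return fetchImpl(endpoint).then(function (response) {
    if (!response.ok) {
      // Turn HTTP errors into a rejected promise so .catch() sees them.
      throw new Error("Scrape request failed with status " + response.status);
    }
    return response.json();
  });
}
```

The function returns a promise right away, so the caller keeps using `.then()` and `.catch()` (or `await` inside an `async` function) rather than expecting a direct return value.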
This answer was selected by the asker as the best answer.

Question tags: javascript, php. Asked 2016-07-23 16:39.
