dream0614 2012-03-09 05:51

When I fetch a page with a cURL request, how can I navigate that page if its paths are relative?

This is probably an easy question but I can't find the answer... I have a PHP script named 'send.php' which makes a cURL request to open an external web page. It outputs the external page to the browser. All completely by the book.

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_REFERER, $referer);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($ch);

All it does is post some data to a processing script on an external site and then display in the browser whatever that external script would normally display, i.e. a confirmation message, a thank-you page, etc.

The problem is that my 'send.php' is still the URL that appears in the address bar. So if I click around on that page and the links use relative paths, the browser resolves those relative paths against my current path, which of course leads to a 404. Additionally, if there are more forms on the page and the action attribute is an empty string, the browser posts those submissions to send.php on my server again, which then generates errors.

How can I make it so it will still send the post data and output the result of the processing script but still allow the user to navigate the output page as they normally would? Or if it's a multi-page form, they can continue filling out page 2 as if they were just on that site?

Thanks in advance

Update: Solved by replacing the final curl_exec($ch) call in the above code with these lines:

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
$url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
$response = str_ireplace('<head>', "<head><base href=\"$url\" />", $response);
echo $response;
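
The str_ireplace call above only matches a bare `<head>` tag. As a rough sketch of a slightly more tolerant variant, the hypothetical helper below (the name `inject_base_href` is not from the original post) also handles a `<head>` tag that carries attributes, skips pages that already declare a `<base>`, and escapes the URL before placing it in the attribute. The fallback of prepending the tag when no `<head>` is found is a crude assumption, not a recommendation.

```php
<?php
// Hypothetical helper (not from the original post): insert a <base> tag
// right after the opening <head ...> tag, whatever attributes it carries.
function inject_base_href(string $html, string $baseUrl): string
{
    // Leave the markup alone if the page already declares its own <base>.
    if (preg_match('/<base\b/i', $html) === 1) {
        return $html;
    }

    // Escape the URL so characters such as & or " cannot break the attribute.
    $tag = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES, 'UTF-8') . '" />';

    // Match <head> or <head attr="..."> but not <header>; replace the first hit only.
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function (array $m) use ($tag): string {
            return $m[0] . $tag;
        },
        $html,
        1,
        $count
    );

    // No <head> found (or regex failure): crude fallback, prepend the tag.
    return ($result !== null && $count > 0) ? $result : $tag . $html;
}
```

With this helper, the str_ireplace line would become `$response = inject_base_href($response, $url);`.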

3 answers

  • dpvv37755 2012-03-09 06:30

    You can get the URL that cURL finally resolved to (useful if you're using FOLLOWLOCATION) with curl_getinfo and CURLINFO_EFFECTIVE_URL, and then prepend this URL to all relative paths. As for how to tell whether a path is relative: if it starts with a '/', it's an absolute path, which is still "relative" to the domain, so only the scheme and host need to be prepended. If it starts with a scheme, it's a fully absolute URL and may even lead to a different domain. Anything else is relative to the directory of the current page.
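
    The rules above could be sketched roughly as follows. The function name `resolve_url` is hypothetical, and the sketch assumes the base is an absolute http(s) URL like the one CURLINFO_EFFECTIVE_URL returns. It loosely follows the reference resolution steps of RFC 3986 and ignores details such as user info in the base URL.

```php
<?php
// Hypothetical helper: resolve $rel against $base, loosely following RFC 3986 section 5.
// Assumes $base is an absolute http(s) URL whose path, if present, starts with "/".
function resolve_url(string $base, string $rel): string
{
    // Already absolute (has a scheme such as http:, https:, mailto:): use as is.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $rel) === 1) {
        return $rel;
    }

    $b = parse_url($base);
    $scheme = $b['scheme'] ?? 'http';
    $authority = ($b['host'] ?? '') . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';
    $baseQuery = isset($b['query']) ? '?' . $b['query'] : '';

    // Protocol-relative reference: //host/path inherits only the base scheme.
    if (strpos($rel, '//') === 0) {
        return $scheme . ':' . $rel;
    }

    $root = $scheme . '://' . $authority;

    // Empty reference (e.g. <form action="">) points back at the base document.
    if ($rel === '') {
        return $root . $basePath . $baseQuery;
    }
    // Query-only and fragment-only references keep the base path.
    if ($rel[0] === '?') {
        return $root . $basePath . $rel;
    }
    if ($rel[0] === '#') {
        return $root . $basePath . $baseQuery . $rel;
    }

    // Split off the query/fragment so dot-segment removal only touches the path.
    $cut = strcspn($rel, '?#');
    $relPath = substr($rel, 0, $cut);
    $suffix = substr($rel, $cut);

    // Leading slash: relative to the domain root; otherwise to the base directory.
    $path = ($relPath[0] === '/')
        ? $relPath
        : substr($basePath, 0, strrpos($basePath, '/') + 1) . $relPath;

    // Remove "." and ".." segments; $out[0] is the empty segment before the first "/".
    $out = [];
    $segments = explode('/', $path);
    $last = count($segments) - 1;
    foreach ($segments as $i => $seg) {
        if ($seg === '.' || $seg === '..') {
            if ($seg === '..' && count($out) > 1) {
                array_pop($out);
            }
            // A trailing "." or ".." names a directory, so keep the final slash.
            if ($i === $last) {
                $out[] = '';
            }
            continue;
        }
        $out[] = $seg;
    }

    return $root . implode('/', $out) . $suffix;
}
```

    For example, with the hypothetical base `http://example.com/forms/step1.php`, the link `page2.php` resolves into the `/forms/` directory, while `/thanks.html` resolves against the domain root.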

    As for how to actually find the URLs: you could load the page with DOMDocument::loadHTML and use DOMXPath to find all anchor tags (or links, if you like). Think about how much money Google engineers get paid for site scraping and URL following; this is probably not the simplest thing in the world to do optimally.
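
    A minimal sketch of that lookup, assuming PHP's DOM extension is available: the hypothetical function `extract_links` below lists the href of every anchor and link tag plus the action of every form, since the question also mentions forms with an empty action. It only collects the values; rewriting them is left out. Note that loadHTML rejects an empty string, so the caller should check the response first.

```php
<?php
// Hypothetical helper: list every URL-bearing attribute on <a>, <link> and <form> tags.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $found = [];

    // @href covers <a> and <link>; @action covers forms, including action="".
    foreach ($xpath->query('//a[@href] | //link[@href] | //form[@action]') as $node) {
        $attr = ($node->nodeName === 'form') ? 'action' : 'href';
        $found[] = [
            'tag'   => $node->nodeName,
            'attr'  => $attr,
            'value' => $node->getAttribute($attr),
        ];
    }

    return $found;
}
```

    Each returned value could then be classified as relative or absolute and rewritten with DOMElement::setAttribute before the document is printed with DOMDocument::saveHTML.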

    The asker selected this answer as the best answer.