如何在网站目录中查找文件？

I'm creating a web crawler. I'm ganna give it an URL and it will scan through the directory and sub directories for .html files. I've been looking at two alternatives:

scandir($url). This works on local files but not on http sites. Is this because of file permissions? I'm guessing it shouldn't work since it would be dangerous for everyone to have access to your website files.
Searching for links and following them. I can do file_get_contents on the index file, find links and then follow them to their .html files.

Do any of these 2 work or is there a third alternative?

写回答
好问题 0 提建议
追加酬金
关注问题
分享
邀请回答
编辑收藏删除结题
收藏举报

2条回答默认最新

关注

码龄粉丝数原力等级 --

被采纳

被点赞

采纳率
dtvhqlc57127 2012-04-05 09:39
关注
The only way to look for html files is to parse throuhg the file content returned by the server, unless by small chance they have enabled directory browsing on the server, which is one of the first things disabled usually, you dont have access to browse directory listings, only the content they are prepared to show you, and let you use.

You would have to start a http://www.mysite.com and work onwards scanning for links to html files, what if they have asp/php or other files which then return html content?

本回答被题主选为最佳回答 , 对您是否有帮助呢?

解决无用
评论打赏
分享
举报

评论

按下Enter换行，Ctrl+Enter发表内容

查看更多回答(1条)

报告相同问题？

关注问题

如何在网站目录中查找文件？ php
2012-04-05 09:34

回答 2 已采纳 The only way to look for html files is to parse throuhg the file content returned by the server, u
图中网址在ｆｔｐ哪个文件？ php 有问必答
2022-01-13 19:04

回答 1 已采纳通过网页代码是不能看不出文件所在路径的。
在包含文件的文件夹中查找关键字 php
2017-06-21 10:38

回答 2 已采纳 Try this, $findThisString = stripcslashes($_POST["recherchemotcle"]); $path = "tuto"; $dir = open
linux怎么在目录下查找文件,linux find-在指定目录下查找文件
2021-05-09 01:59

闻省的博客 find命令用来在指定目录下查找文件。任何位于参数之前的字符串都将被视为欲查找的目录名。如果使用该命令时，不设置任何参数，则find命令将在当前目录下查找子目录与文件。并且将查找到的子目录和文件全部进行显示。...
使用PHP在存档中查找特定文件 php
2012-04-22 02:25

回答 1 已采纳 If i understood it well, you need to find a FOI.zip in the archive, is it right? If yes, you have
循环遍历json文件以在PHP中查找特定值 json php
2018-08-11 05:56

回答 1 已采纳 First of all, the JSON you posted is invalid. I think it should look like this: [ { "
使用PHP验证查找恶意PDF文件？ javascript php
2016-09-21 11:28

回答 3 已采纳 Take a look into this project https://github.com/urule99/jsunpack-n - A Generic JavaScript Unpacke
如何在 PHP 中列出目录中的文件
2022-04-11 13:58

allway2的博客在本文中，我们将讨论如何在 PHP 中获取目录中所有文件的列表。在日常 PHP 开发中，您经常需要处理文件系统——例如，获取特定目录中的文件列表。PHP 提供了几种不同的方式来轻松读取目录的内容。今天，我们将通过...
在目录中按名称搜索特定文件 php
2018-08-02 05:51

回答 2 已采纳 In the line that you compare the name with the key, you can replace with if(strpos($book->getF
尝试使用PHP在XML文件中查找字符串 php xml
2015-07-24 15:07

回答 1 已采纳 Depending on what type of line breaks they are (which usually depends on the OS), you can use one
Apache在Laravel 5中查找index.php文件 apache laravel php
2015-09-09 16:54

回答 1 已采纳 You have to point apache to the public folder: <VirtualHost *:80> DocumentRoot /var/www/h
linux查找文件在哪个目录下,linux查找文件在哪个文件夹
2021-05-09 01:14

猫腻MX的博客 linux查找文件在哪个文件夹linux下查找文件可以使用find命令例如：find / -name tnsnames.ora查到：/opt/app/oracle/product/10.2/network/admin/tnsnames.ora/opt/app/oracle/product/10.2/network/admin/samples/...
PHP - 用于查找文件的通配符？ php
2014-06-10 02:22

回答 1 已采纳 This should work. if ($file !== "." && $file !== ".." && substr($file, -8) !== "desc.txt"){ );
linux下如何查找文件php.ini文件在哪里？
2019-09-29 22:17

qq_37138818的博客 linux下如何查找文件php.ini文件在哪里？命令如下： find / -name "php.ini" 方法一：这个是比较简单的方法，使用如下命令，可以清楚的看出当前的php使用的配置文件。 php --ini 方法二打印出phpinfo()...
python查找文件指定内容_python实现在目录中查找指定文件的方法
2020-12-09 23:13

weixin_39694838的博客本文实例讲述了python实现在目录中查找指定文件的方法。分享给大家供大家参考。具体实现方法如下：1. 模糊查找代码如下:import osfrom glob import glob #用到了这个模块def search_file(pattern, search_path=os....
没有解决我的问题, 去提问

悬赏问题

¥17 pro*C预编译“闪回查询”报错SCN不能识别
¥15 微信会员卡接入微信支付商户号收款
¥15 如何获取烟草零售终端数据
¥15 数学建模招标中位数问题
¥15 phython路径名过长报错不知道什么问题
¥15 深度学习中模型转换该怎么实现
¥15 HLs设计手写数字识别程序编译通不过
¥15 Stata外部命令安装问题求帮助！
¥15 从键盘随机输入A-H中的一串字符串，用七段数码管方法进行绘制。提交代码及运行截图。
¥15 TYPCE母转母，插入认方向

如何在网站目录中查找文件？

2条回答 默认 最新

悬赏问题

2条回答默认最新