PHP: using str_replace with a wildcard to strip scraped content?

I'm looking for a way to strip some HTML from a scraped HTML page. The page contains repetitive markup that I would like to delete, so I tried preg_replace() to remove the variable part.

Data I want to strip:

Producent:<td class="datatable__body__item" data-title="Producent">Example
Groep:<td class="datatable__body__item" data-title="Produkt groep">Example1
Type:<td class="datatable__body__item" data-title="Produkt type">Example2
.... 
...

Must be like this afterwards:

Producent:Example
Groep:Example1
Type:Example2

So most of each tag is identical; only the value of the data-title attribute changes. How can I delete this tag?

I tried a few things like this one:

$pattern = '/<td class=\"datatable__body__item\"(.*?)>/';
$tech_specs = str_replace($pattern,"", $tech_specs);

But that didn't work. Is there any solution to this?
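
For reference, str_replace() only matches literal strings, so the slash-delimited pattern above is searched for character by character and never found. A minimal sketch of the same idea with preg_replace(), which does interpret the pattern as a regular expression, is shown below. The sample input is copied from the question; the variable name $clean is only illustrative, and the pattern assumes the attribute values never contain a literal ">".

```php
<?php
// Sample input copied from the question.
$tech_specs = <<<'HTML'
Producent:<td class="datatable__body__item" data-title="Producent">Example
Groep:<td class="datatable__body__item" data-title="Produkt groep">Example1
Type:<td class="datatable__body__item" data-title="Produkt type">Example2
HTML;

// [^>]* consumes whatever attributes follow the class, up to the closing ">".
$pattern = '/<td class="datatable__body__item"[^>]*>/';

// preg_replace() treats $pattern as a regex; str_replace() would not.
$clean = preg_replace($pattern, '', $tech_specs);

echo $clean, PHP_EOL;
```

With the three sample lines this prints Producent:Example, Groep:Example1 and Type:Example2, one per line.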

3 answers

duanmo7075 2018-08-27 09:57

Well, maybe my question wasn't written very well. I had a table that I needed to scrape from a website. I needed the info in the table, but had to clean up some parts as mentioned. The solution I finally came up with is below, and it works. It still needs a few manual replacements, but that is because of the stupid " they use for inch. ;-)

Solution:

   // $techdata is the page parsed with simplehtmldom
   $specs = [];

   // find the table in the source code
   foreach ($techdata->find('table') as $table) {

     // filter out the rows
     foreach ($table->find('tr') as $row) {

       // take the innertext using simplehtmldom
       $tech_specs = $row->innertext;

       // strip some 'garbage': leading whitespace plus the first opening <td>
       $tech_specs = str_replace("  \t\t\t\t\t\t\t\t\t\t\t<td class=\"datatable__body__item\">", "", $tech_specs);

       // find the first cell of the row (the label) so I can use it
       $spec1 = explode('</td>', $tech_specs)[0];

       // use the label to replace the second opening <td> with a colon
       $tech_specs = str_replace("<td class=\"datatable__body__item\" data-title=\"" . $spec1 . "\">", ":", $tech_specs);

       // manual correction because of the " used
       $tech_specs = str_replace("<td class=\"datatable__body__item\" data-title=\"tbv Montage benodigde 19\">", ":", $tech_specs);

       // manual correction because of the " used
       $tech_specs = str_replace("<td class=\"datatable__body__item\" data-title=\"19\">", ":", $tech_specs);

       // strip some 'garbage': tab runs become a newline, closing tags and double spaces go
       $tech_specs = str_replace("\t\t\t\t\t\t\t\t\t\t", "\n", $tech_specs);
       $tech_specs = str_replace("</td>", "", $tech_specs);
       $tech_specs = str_replace("  ", "", $tech_specs);

       // put the clean row in an array ready for usage
       $specs[] = $tech_specs;
     }
   }
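
The manual corrections for the inch sign might be avoidable by splitting each row on its td tags instead of rebuilding the exact tag string. The sketch below is one possible variant, not the code used above: clean_row() is a hypothetical helper, the sample row is made up (only the label text comes from the answer; the value "Ja" and the doubled quote in data-title are assumptions), and it assumes no attribute value contains ">".

```php
<?php
// Hypothetical helper: turns the inner HTML of one <tr> into "label:value".
function clean_row(string $rowHtml): string
{
    // Split on every opening or closing td tag. [^>]* stops at the first ">",
    // so a stray " inside data-title does not need special handling.
    $parts = preg_split('/<\/?td\b[^>]*>/', $rowHtml);

    // Trim each piece and drop the empty ones left between the tags.
    $cells = array_filter(array_map('trim', $parts), 'strlen');

    return implode(':', $cells);
}

// Made-up row for illustration; the stray " mimics the inch sign problem.
$row = <<<'HTML'
    <td class="datatable__body__item">tbv Montage benodigde 19"</td>
    <td class="datatable__body__item" data-title="tbv Montage benodigde 19"">Ja</td>
HTML;

echo clean_row($row), PHP_EOL;
```

Inside the loop this would be called as clean_row($row->innertext), assuming innertext returns the row's markup as a string, as it does in the answer's code.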

This answer was accepted by the asker as the best answer.

Question tagged php, posted 2018-08-17 20:23; 3 answers, one accepted.