doushaju4901
2013-02-26 07:02

Filtering a table with regular expressions

  • regex
  • codeigniter
  • php
  • ruby

I have a table that is output by some open source software, but it is not output in an actual HTML table format, e.g.

<table>
  <thead>
    <tr>
      <td>Heading</td>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Content</td>
    </tr>
  </tbody>
</table>

Instead, the people who developed the software decided that it would be a good idea to output the table like so:

+------------+--------------+-------+-------------+------------+--------------+------+
| HEADING 1  | HEADING 2    | ETC   | ANOTHER     | HEADING3   | HEADING4     | SML  |
+------------+--------------+-------+-------------+------------+--------------+------+
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
| content    | more content | cont  | More more   | content    | content 2.0  | litl |
+------------+--------------+-------+-------------+------------+--------------+------+
| TOTALS        AGENTS:21   |  total|        total|       total|         total| total|
+------------+--------------+-------+-------------+------------+--------------+------+

So I can't build a web scraper to get the data, or rather I'm not sure whether I could build one, since the whole table is wrapped inside a single <pre> </pre> tag. Instead, I have been trying to use Ruby and regex to get the job done. So far I have managed to strip the leading |'s, and I can match the start of the border line (+-------+-----), but only that far: it seems I would have to repeat the pattern by hand for every column, because it does not repeat itself. Here is the code I have used so far:

text.lines.to_a.each do |line|
  line.sub(/^\| |^\+*-*\+*\-*/) do |match|
    puts "Regexp Match: " << match
  end
  STDIN.getc
  puts "New Line " << line
end

For example, the match for the first line is only +-----------------+---------- rather than the whole border. The result has to be in CSV format, so I'll use gsub to replace the remaining |'s with ,'s.
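The partial match is what that pattern should produce: `^\+*-*\+*\-*` spells out only two runs of "plus, then dashes", and a regex does not repeat a sequence unless the sequence is wrapped in a group with a quantifier. A minimal sketch, assuming every border line consists solely of `+` and `-` characters (the constant `BORDER` and the shortened sample strings are illustrative, not taken from the software):

```ruby
# Matches a whole border line: one or more "+----" runs, then a closing "+".
BORDER = /\A(?:\+-+)+\+\s*\z/

border = "+------------+-------------+-------+"
row    = "| content   | more content | cont  |"

puts border.match?(BORDER)  # a border line matches in full
puts row.match?(BORDER)     # a data row does not

# The pattern from the question stops after the second run of dashes,
# because nothing in it says "repeat this sequence".
partial = border[/^\+*-*\+*\-*/]
puts partial
```

The `(?: ... )+` group is what lets the same `\+-+` fragment cover any number of columns.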

I can use either PHP or Ruby, so any answer is more than welcome.
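For the CSV goal itself, one possible approach avoids matching the borders piece by piece: skip any line made only of `+` and `-`, split the remaining lines on `|`, and let Ruby's `csv` library handle quoting. This is only a sketch under those assumptions. The method name `ascii_table_to_csv` and the shortened sample table are hypothetical, and it assumes no cell contains a literal `|`.

```ruby
require "csv"

# Converts an ASCII-art table (as printed inside the <pre> tag) into CSV text.
# Border lines made only of "+" and "-" are skipped; every other line is
# split on "|" and each cell is stripped of its padding.
def ascii_table_to_csv(text)
  rows = text.each_line.map(&:strip).reject do |line|
    line.empty? || line.match?(/\A[+-]+\z/)
  end

  CSV.generate do |csv|
    rows.each do |line|
      # Drop the outer pipes, then split; -1 keeps trailing empty cells.
      cells = line.sub(/\A\|/, "").sub(/\|\z/, "").split("|", -1)
      csv << cells.map(&:strip)
    end
  end
end

sample = <<~TABLE
  +-----------+--------------+------+
  | HEADING 1 | HEADING 2    | ETC  |
  +-----------+--------------+------+
  | content   | more content | cont |
  +-----------+--------------+------+
  | TOTALS        AGENTS:21  | total|
  +-----------+--------------+------+
TABLE

puts ascii_table_to_csv(sample)
```

Because the TOTALS row merges its first two cells, it comes out with one field fewer than the other rows; whether to pad it or split "AGENTS:21" into its own column depends on what the CSV consumer expects.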


4 answers