dongyunwei8596 2012-07-12 08:08 Acceptance rate: 100%
Views: 63
Accepted

Can't get PHP to accept a pound sign (£) from a Beautiful Soup Python script

I have a script that pulls information from an event webpage. The URL is: http://everguide.com.au/melbourne/event/2012-jul-14/colour/

This PHP script calls a Python script (it's part of a for loop):

${"tmp" . $i} = utf8_encode (exec("python myscrape.py ${"eu" . $i}"));

It passes a URL. The Python script is:

# -*- coding: utf-8 -*-
import sys
URL = sys.argv[1]
#$URL = 'http://everguide.com.au/melbourne/event/2012-jul-14/colour/'

import urllib2
req = urllib2.Request(URL)
response = urllib2.urlopen(req)
html = response.read()

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html.decode('utf-8'))
soup.prettify()

import re


for node in soup.findAll(itemprop="name"):
    n = ''.join(node.findAll(text=True)) 
for node in soup.findAll(itemprop="url"):
    v = ''.join(node.findAll(text=True))

for node in soup.findAll("div", { "class" : "time" }):
    d = ''.join(node.findAll(text=True))

for node in soup.findAll("a", { "id" : "ctl00_holderBody_ctl00_lnkCat" }):
    c = ''.join(node.findAll(text=True)) 

vu = v
vu.encode('utf-8', 'xmlcharrefreplace')
re.escape(vu)

print n,"|", d,"|", vu,"|", c

This works well, but PHP only receives the output up to the pipe before vu; it can't get past that.

UTF-8 encoding is set on all files, both HTML and PHP.

When there is a special character in the v variable, the output breaks and stops. If there are no special characters, it works perfectly.

Expected output is:

Colour | 14 July @ 7:30PM | 1000 £ Bend | Clubs & Parties

This output appears when I run the script directly on the server (with the same python command), but through PHP I can't get the venue string back.

Please help

Rick

1 answer

  • dtpw54085 2012-07-12 08:17

    vu.encode returns the encoded string. Because you're not assigning the result, it is simply thrown away. Have you tried:

    vu = vu.encode('utf-8', 'xmlcharrefreplace')
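To illustrate why the unassigned call has no effect, here is a minimal sketch. It assumes a hypothetical venue string standing in for the scraped `vu` value; the variable names other than `vu` are made up for the example. It also shows that, as far as the codec rules go, `xmlcharrefreplace` only produces `&#163;` when the target encoding cannot represent the character (for example ASCII), since UTF-8 can encode the pound sign directly.

```python
# -*- coding: utf-8 -*-
# Hypothetical venue string standing in for the scraped `vu` value.
vu = u"1000 \u00a3 Bend"

# Calling encode() without assigning the result leaves vu untouched,
# because strings are immutable and encode() returns a new object.
vu.encode("utf-8", "xmlcharrefreplace")
unchanged = vu

# Assigning the result keeps the encoded bytes.
encoded_utf8 = vu.encode("utf-8", "xmlcharrefreplace")

# With an ASCII target, the pound sign cannot be encoded directly,
# so xmlcharrefreplace substitutes a numeric character reference.
encoded_ascii = vu.encode("ascii", "xmlcharrefreplace")

print(repr(unchanged))
print(repr(encoded_utf8))
print(repr(encoded_ascii))
```

If the PHP side ends up rendering the value into HTML, the ASCII variant may be the more robust choice, since it sidesteps any encoding mismatch between the two processes; that is a judgment call rather than something the original script requires.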

    You'll also need to drop the re.escape call, as it will mangle the encoded Unicode.
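A small sketch of what `re.escape` does to such a value, assuming a hypothetical venue string that already contains a character reference. Note that in the question's script the return value of `re.escape(vu)` is discarded, so the mangling would only show up if that result were assigned back.

```python
import re

# Hypothetical venue text after xmlcharrefreplace encoding.
venue = "1000 &#163; Bend"

# re.escape is meant for building regex patterns: it inserts backslashes
# before characters that are special in regular expressions.
escaped = re.escape(venue)

print(venue)
print(escaped)
```

The exact set of characters that gets a backslash differs between Python versions, but in every case the result is no longer the text you want to hand back to PHP.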

    Accepted by the asker as the best answer.
