file_get_contents() breaks ISO-8859-1 encoding

I am trying to read a page using file_get_contents() but I cannot get the character encoding to work.

This is my code:

    $username = "masked";
    $password = "maskedPass";
    $remote_url = 'https://utfws.utfpr.edu.br/aluno01/sistema/mplistahorario.inicio?p_curscodnr=212';

    // Create a stream
    $opts = array(
        'http'=>array(
            'method'=>"GET",
            'header' => array(
                "Authorization: Basic " . base64_encode("$username:$password"),
                'Accept-Charset: iso-8859-1'
            )

        )
    );

    $context = stream_context_create($opts);

    // Open the file using the HTTP headers set above
    $file = file_get_contents($remote_url, false, $context);

    echo $file;

I tried to change the character encoding to utf-8 but I always get a page with question marks instead of áéíóúãõç.

When I open the page directly in my browser it works just fine. Why is this happening?

doulipi3742 2016-04-05 16:48
It sounds to me like this might just be a problem of lost encoding details.

What you're describing is:

1. Request the document from the web server, specifying ISO-8859-1 as the encoding.

2. The server responds with the document in the requested encoding, including a header stating that the encoding is ISO-8859-1. This will look correct in a browser.

3. Output the document (but not the header data!) from PHP. Where this output goes isn't specified.

4. Open the data in some sort of viewer.

See where the encoding specification was lost, there in step 3?
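One way to keep hold of the encoding is to read it from the response headers before it gets dropped. The following is only a sketch: `charset_from_headers()` is a hypothetical helper name, and the sample header lines are made up for illustration rather than taken from the real server.

```php
<?php
// Hypothetical helper: find the charset declared in a list of raw HTTP
// response header lines, such as PHP's $http_response_header array.
function charset_from_headers(array $headers): ?string
{
    $charset = null;
    foreach ($headers as $line) {
        // Matches e.g. "Content-Type: text/html; charset=ISO-8859-1"
        if (preg_match('/^Content-Type:.*;\s*charset=["\']?([\w.:-]+)/i', $line, $m)) {
            $charset = $m[1]; // keep the last match, in case redirects add several
        }
    }
    return $charset;
}

// Sample header lines (an assumption, not the real server's response).
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=ISO-8859-1',
];
echo charset_from_headers($sample), "\n";
```

With the code from the question, calling `charset_from_headers($http_response_header)` right after `file_get_contents()` should give the charset the server declared, or `null` if it declared none.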

The data can be decoded correctly as ISO-8859-1, but it will only be decoded that way if the viewer is configured to use that encoding by default. Some apps may default to ISO-8859-1, but UTF-8 is a lot more common these days.
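If the viewer expects UTF-8, another option is to re-encode the bytes yourself. Here is a minimal sketch, assuming the `mbstring` extension is available and that the source really is ISO-8859-1; the sample string is invented for illustration.

```php
<?php
// Sample input (an assumption for illustration): "ação" encoded as ISO-8859-1.
// In that encoding ç is the single byte 0xE7 and ã is 0xE3.
$latin1 = "a\xE7\xE3o";

// The raw bytes are not valid UTF-8, so a UTF-8 viewer cannot show them properly.
var_dump(mb_check_encoding($latin1, 'UTF-8'));   // bool(false)

// Re-encode, stating explicitly what the source encoding is.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump(mb_check_encoding($utf8, 'UTF-8'));     // bool(true)
echo bin2hex($utf8), "\n";                       // 61c3a7c3a36f
```

Applied to the question's code, that would mean converting `$file` the same way before echoing it into a UTF-8 page.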

If you load the data into a different storage engine, say MySQL, the problem may compound. MySQL associates a charset with text data. If your database defaults to UTF-8 and you don't tell it the data is actually in ISO-8859-1, you're feeding it data that it assumes to be UTF-8, and the data will be treated as such in the database going forward. Even if you later ask the database for ISO-8859-1, it will try to re-encode the stored bytes from UTF-8 to ISO-8859-1, but they were never valid UTF-8, so the result is yet another incorrect set of bytes.
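For the database case, one approach is to convert to the connection's charset before inserting. This is a sketch under assumptions: the `pages` table, the `body` column, the function names and the connection details are all hypothetical, and it assumes PDO with the MySQL driver plus `mbstring`.

```php
<?php
// Convert a fetched body to UTF-8 so it matches a utf8mb4 connection.
// $sourceCharset is whatever the remote server declared (assumed known here).
function to_utf8(string $body, string $sourceCharset): string
{
    return mb_convert_encoding($body, 'UTF-8', $sourceCharset);
}

// Hypothetical storage routine: `pages` and `body` are illustrative names,
// not part of the original question.
function save_page(PDO $pdo, string $body, string $sourceCharset): void
{
    $stmt = $pdo->prepare('INSERT INTO pages (body) VALUES (:body)');
    $stmt->execute([':body' => to_utf8($body, $sourceCharset)]);
}

// Example connection with placeholder credentials. charset=utf8mb4 in the DSN
// tells MySQL how to interpret the bytes sent over this connection.
// $pdo = new PDO('mysql:host=localhost;dbname=example;charset=utf8mb4', 'user', 'secret');
// save_page($pdo, $file, 'ISO-8859-1');
```

The point of the design is that the bytes and the declared charset always travel together: the conversion happens at the boundary, so the database never has to guess.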

To address this problem, specify the encoding when you view the data, or when you save it to a database.
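When the viewer is a browser and PHP is relaying the page, specifying the encoding can be as simple as sending the charset along with the body. This is a rough sketch: `content_type_header()` is a hypothetical helper, and `$file` here is a stand-in for the body fetched in the question.

```php
<?php
// Hypothetical helper: build the Content-Type header line for a given charset.
function content_type_header(string $charset, string $mime = 'text/html'): string
{
    return sprintf('Content-Type: %s; charset=%s', $mime, $charset);
}

// Stand-in for the body returned by file_get_contents();
// "\xE1" is "á" in ISO-8859-1.
$file = "<p>hor\xE1rio</p>";

// Headers must be sent before any output, hence the guard.
if (!headers_sent()) {
    header(content_type_header('ISO-8859-1'));
}
echo $file;
```

This passes the bytes through untouched and restores the piece of information that was lost in step 3, rather than changing the data itself.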
This answer was selected as the best answer by the asker.

Tags: html, http, php. Asked 2016-04-05 16:17.
