I am generating CSV files. Occasionally the data source passes along accented characters and the like, which I would like to strip out. Is there a reasonably straightforward way to detect and strip out these non-ASCII (UTF-8) characters?
- drpmazn9021 2012-08-07 22:31
If you're sure you're getting UTF-8 as input, use iconv to convert the values to the encoding you're using in your output. Detecting UTF-8 characters isn't failsafe, because the same byte values are also valid ISO-8859-1 characters (or valid in any other 8-bit encoding, really).
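As a rough sketch of that conversion step, assuming PHP with the mbstring and iconv extensions available and ISO-8859-1 as the output encoding: the helper name `to_output_encoding` and the default target are illustrative assumptions, not part of the original answer. Note that the validity check only confirms the bytes form well-formed UTF-8; it cannot prove the sender actually meant UTF-8.

```php
<?php
// Hypothetical helper: convert a value that is assumed to be UTF-8
// into the encoding used for the CSV output.
function to_output_encoding(string $value, string $target = 'ISO-8859-1'): string
{
    // Well-formedness check only; valid UTF-8 bytes are also valid in 8-bit encodings.
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    // iconv() returns false when the conversion fails.
    $converted = iconv('UTF-8', $target, $value);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert input to $target");
    }

    return $converted;
}
```

Throwing on invalid input is one possible design; logging the row and skipping it may suit a batch CSV export better.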
If you just want the plain ASCII range (byte values 0-127), you can let iconv convert to the 'ascii' encoding and transliterate:
iconv("utf-8", "ascii//TRANSLIT", "Hei og hå")
will result in
Hei og ha
being returned.
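For the CSV case, the transliteration can be wrapped in a small function and applied to each field. This is only a sketch: `to_ascii` is a hypothetical name, and the exact transliteration result depends on the iconv implementation and the current locale, so the code also strips any bytes outside 0-127 that are left over.

```php
<?php
// Hypothetical helper: reduce a UTF-8 string to plain ASCII.
function to_ascii(string $value): string
{
    // Transliterate where possible; suppress the notice iconv emits on failure.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $value);

    // If iconv gave up, fall back to the untouched input.
    if ($ascii === false) {
        $ascii = $value;
    }

    // Safety net: drop every remaining byte outside the 0-127 range.
    return (string) preg_replace('/[^\x00-\x7F]/', '', $ascii);
}

// Usage: clean every field of a row before writing it to the CSV file.
$row = array_map('to_ascii', ['Hei og hå', 'plain text']);
```

Fields that are already pure ASCII pass through unchanged, so the function can be applied to every column without checking first.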
This answer was selected by the asker as the best answer.