douxun4860 2011-09-08 13:49

Removing Hebrew formatting characters from a string

I have a problem that has been kicking my ass for a couple of days now.

I have an array of strings, and each string contains a single Hebrew word.

These words were ripped from a PDF and appear in the array in the same order as shown in the PDF.

I want to take these words and reconstruct them into a sentence in the order they are in the array and the PDF. Seems very simple.

Edit: Here is the code. It's actually XML I'm looping through; I think that's irrelevant, but since I'm showing the code I'd better have it right :)

$sentence = '';
foreach ($text->TOKEN as $word) {
    // Cast each TOKEN node to a string before appending it.
    $sentence .= ' ' . (string) $word;
}

/*
This sentence will sometimes (not always) not have the same order as the XML.
Hebrew is read right to left, but that's not the issue; I just want to make a
string in the same order as the words.
*/
echo $sentence;

It's like the words have a mind of their own: the order gets jumbled into something that does not look logical to a non-Hebrew reader. Commas will even move to different words. But this is not always the case.

I do not read or speak Hebrew but from what I can gather there are some special characters in the language that might be affecting the order? My question is what do I have to do to strip them out?

I'm using PHP for this.

1 answer

  • doufu6130 2011-09-08 14:02

    Without seeing your code, here are two suggestions:

    1. Print out the array of Hebrew words with print_r and see what order they're in.
    2. Keep in mind that Hebrew is read right to left and not left to right.

    Otherwise, please provide more of your code for further assistance.
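
If the "special characters" the question mentions turn out to be invisible Unicode bidirectional control marks (an assumption; nothing in the question confirms they are present), a minimal sketch for detecting and stripping them could look like this. The names BIDI_CONTROLS, stripBidiControls and hasBidiControls are hypothetical, and the sample words are made up for illustration. It needs PHP 7 or later.

```php
<?php
// Hypothetical helpers; not taken from the original question.

// Unicode bidirectional control characters:
// U+061C (ALM), U+200E (LRM), U+200F (RLM),
// U+202A-U+202E (LRE, RLE, PDF, LRO, RLO),
// U+2066-U+2069 (LRI, RLI, FSI, PDI).
const BIDI_CONTROLS = '/[\x{061C}\x{200E}\x{200F}\x{202A}-\x{202E}\x{2066}-\x{2069}]/u';

// Returns true if the string contains at least one bidi control character.
function hasBidiControls(string $s): bool
{
    return preg_match(BIDI_CONTROLS, $s) === 1;
}

// Returns the string with all bidi control characters removed.
function stripBidiControls(string $s): string
{
    // preg_replace returns null on failure (for example invalid UTF-8
    // with the /u modifier); fall back to the untouched input then.
    $clean = preg_replace(BIDI_CONTROLS, '', $s);
    return $clean === null ? $s : $clean;
}

// Made-up sample data: two Hebrew words carrying stray control marks.
$words = ["\u{200F}שלום", "עולם\u{202C}"];

// Join the cleaned words in array order.
$sentence = implode(' ', array_map('stripBidiControls', $words));
```

Stripping these marks does not change the stored order of the words, which already matches the array. It only removes hints that affect how a renderer lays out mixed right-to-left and left-to-right text, so the string may still look reordered when displayed in a left-to-right context.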

