duannue2455 2013-06-25 21:35

Using NLP / machine learning to teach a machine how to detect whether a string is mathematical?

I want to be able to detect if a string is mathematical.

Strings that would evaluate to true on being mathematical would be "2", "42000", "-10", "-55.22", "forty-two", "fifty six", "negative ninety nine", and "negative one point seven".

And since the test is for mathematical strings rather than merely numerical ones, something as complex as "negative two times seven", or "two plus two", or "3 plus two", or "two - 1", or "2 ^ 7" would also pass.

Basically, the accepted vocabulary is spelled-out numbers, spelled-out ordinal numbers (first, thirteenth, thousandth), and the words "plus", "negative", "positive", "minus", "subtracted", "from", "times", "multiplied", "by", "divided", "over", "point", "to", "the", "power", "of", "and", "raised".

And the function would return false if it is not like one of those examples.

Is it proper to use machine learning / NLP to do this? Is there a better way to do this than NLP / Machine Learning?

Are there any existing scripts or functions that can do this?

If not, how can I do this with NLPTools or other PHP NLP tools?


1 answer

  • dongyun3897 2013-06-26 02:52

    Parsing is a better tool than machine learning for this problem. What you have described is a relatively simple grammar for arithmetic, with some aliases for numbers and a touch of syntax for those aliases. A tokenizer plus some basic syntactic analysis, which you could code directly, will produce better, more reliable results with significantly less computational effort than machine learning and optimization.
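
    To make the tokenizer-plus-grammar idea concrete, here is a minimal sketch in Python (the same structure should port to PHP). Everything in it is an assumption for illustration: the function name `is_mathematical`, the word lists, and the grammar, which covers only a subset of the phrases in the question (for example, it does not handle "to the thirteenth power").

```python
import re

UNITS = ("zero one two three four five six seven eight nine ten eleven twelve "
         "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()
SCALES = "hundred thousand million billion".split()
CARDINALS = set(UNITS + TENS + SCALES)

# Ordinals: irregular forms, regular "-th" forms, and "-ieth" forms for the tens.
IRREGULAR = {"one", "two", "three", "five", "eight", "nine", "twelve"}
ORDINALS = set("first second third fifth eighth ninth twelfth".split())
ORDINALS |= {w + "th" for w in CARDINALS
             if w not in IRREGULAR and not w.endswith("y")}
ORDINALS |= {w[:-1] + "ieth" for w in TENS}
NUMBER_WORDS = CARDINALS | ORDINALS

SIGNS = {"negative", "positive", "minus", "plus", "-", "+"}
# Multi-word operators come first so the longest phrase wins.
OPERATORS = [
    ("raised", "to", "the", "power", "of"),
    ("to", "the", "power", "of"),
    ("multiplied", "by"), ("divided", "by"), ("subtracted", "from"),
    ("plus",), ("minus",), ("times",), ("over",),
    ("+",), ("-",), ("*",), ("/",), ("^",),
]

TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|[a-z]+|[-+*/^]")


def tokenize(text):
    """Split text into tokens; return None if any character is unrecognized."""
    text = text.lower().strip()
    # Treat the hyphen in "forty-two" as a word separator, not a minus sign.
    text = re.sub(r"(?<=[a-z])-(?=[a-z])", " ", text)
    tokens = TOKEN_RE.findall(text)
    if "".join(tokens) != re.sub(r"\s+", "", text):
        return None
    return tokens


def parse_number(tokens, i):
    """Return the index just past a number starting at i, or None."""
    if i < len(tokens) and tokens[i][0].isdigit():
        return i + 1
    start = i
    while i < len(tokens) and tokens[i] in NUMBER_WORDS:
        i += 1
        # "and" / "point" are allowed only between two number words.
        if (i + 1 < len(tokens) and tokens[i] in {"and", "point"}
                and tokens[i + 1] in NUMBER_WORDS):
            i += 1
    return i if i > start else None


def parse_operand(tokens, i):
    """operand := sign? number"""
    if i < len(tokens) and tokens[i] in SIGNS:
        i += 1
    return parse_number(tokens, i)


def parse_operator(tokens, i):
    """Return the index just past an operator phrase at i, or None."""
    for op in OPERATORS:
        if tuple(tokens[i:i + len(op)]) == op:
            return i + len(op)
    return None


def is_mathematical(text):
    """expr := operand (operator operand)*"""
    tokens = tokenize(text)
    if not tokens:
        return False
    i = parse_operand(tokens, 0)
    while i is not None and i < len(tokens):
        i = parse_operator(tokens, i)
        if i is not None:
            i = parse_operand(tokens, i)
    return i == len(tokens)
```

    With this sketch, the examples from the question such as "negative one point seven", "two - 1", and "2 ^ 7" are accepted, while "hello world" and the incomplete "two plus" are rejected. Note that it checks syntax only; it does not evaluate the expression or validate that a spelled-out number is well formed.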

    One reason parsing is sufficient is that you don't need to worry about misspellings as much as you do with, say, people's names. If you want to get fancy about that, use a Jaro-Winkler-based similarity measure during lexical analysis to map each word to its nearest known token, and then run syntactic analysis on the resulting tokens. That is still much cheaper and less complex than machine learning.
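
    As a rough illustration of that fuzzy lexical step, the sketch below maps a possibly misspelled word to the closest word in a known vocabulary. It deliberately swaps in Python's standard-library `difflib` similarity ratio (a gestalt pattern matching style measure), not Jaro-Winkler, to stay dependency-free; a Jaro-Winkler implementation could be substituted for the scoring. The function name `normalize_token`, the vocabulary, and the 0.8 cutoff are all assumptions chosen for the example.

```python
import difflib

# Hypothetical vocabulary: a small sample of the words the grammar accepts.
VOCABULARY = (
    "zero one two three four five six seven eight nine ten "
    "twenty thirty forty fifty sixty seventy eighty ninety "
    "hundred thousand negative positive plus minus times "
    "multiplied divided subtracted over point power raised"
).split()


def normalize_token(token, cutoff=0.8):
    """Return the closest vocabulary word, or None if nothing is similar enough.

    Uses difflib's similarity ratio (not Jaro-Winkler); cutoff is an assumed
    threshold that would need tuning on real input.
    """
    matches = difflib.get_close_matches(token.lower(), VOCABULARY,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(normalize_token("fourty"))    # forty
print(normalize_token("negitive"))  # negative
print(normalize_token("banana"))    # None
```

    The normalized words would then be handed to the parser in place of the raw input words, and any word that normalizes to None makes the whole string non-mathematical.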

    I don't know much about PHP, but Google does, and there seem to be a few libraries that will help you. The search terms that will get you started are: token; lexical analysis; grammar; syntax; LR Parser; yacc; bison.
