谁家的June 2022-05-31 21:07
Status: closed

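Python: extracting words from a given string

Problem description and background

Count how many times each word occurs in the string str below. Store the result in a dict with the word as key and the count as value,
then sort by value from highest to lowest and print the dict.

Relevant code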
str  = """The Zen of Python, by Tim Peters
            Beautiful is better thanugly.
            Explicit is better than implicit.
            Simple is better than complex.
            Complex is better than complicated.
            Flat is better than nested.
            Sparseisbetterthandense.
            Readability counts.
            Specialcasesaren'tspecialenoughtobreaktherules.
            Although practicality beats purity.
            Errors should never pass silently.
            Unless explicitly silenced.
            In the face of ambiguity, refuse the temptation to guess.
            Thereshouldbeone--andpreferablyonlyone --obviouswayto do it.
            Although that way may not be obvious at first unless you're Dutch.
            Now is better than never.
            Although never is often better than *right* now.
            If the implementation is hard to explain, it's a bad idea.
            If the implementation is easy to explain, it may be a good idea.
            Namespacesareonehonkinggreatidea--let'sdomoreofthose!"""
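My approach and what I have tried

I tried a regular expression, but it cannot extract words from the sentences that have no spaces. For example, "Specialcasesaren'tspecialenoughtobreaktherules" is treated as one single word. In addition, aren't is not recognized as a single word.

The result I want

How can the sentences that have no spaces, and the word aren't, be split so that each word is extracted?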
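
The unspaced lines cannot be split by punctuation or whitespace alone, because nothing in the text marks the word boundaries. One possible approach is dictionary-based segmentation: given a known word list, find a split in which every piece is in that list. The sketch below is only an illustration under stated assumptions: the function name `segment` and the tiny `VOCAB` set are hypothetical, and real text would need a much larger word list plus a rule for choosing between ambiguous splits.

```python
def segment(run, vocab):
    """Split an unspaced run into vocabulary words; return None if no split exists."""
    n = len(run)
    # best[i] is a list of words that exactly covers run[:i], or None if there is none
    best = [None] * (n + 1)
    best[0] = []
    max_len = max(map(len, vocab))  # no piece can be longer than the longest known word
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = run[start:end]
            if best[start] is not None and piece in vocab:
                candidate = best[start] + [piece]
                # when several splits work, prefer the one with the fewest words
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


# Hypothetical word list, just large enough for the two sample runs below.
# The apostrophe form "aren't" is stored as a word so it is kept whole.
VOCAB = {"special", "cases", "aren't", "enough", "to", "break", "the", "rules",
         "sparse", "is", "better", "than", "dense"}

print(segment("specialcasesaren'tspecialenoughtobreaktherules", VOCAB))
print(segment("sparseisbetterthandense", VOCAB))
```

The input is lowercased before segmentation here, and punctuation such as `.` or `--` is assumed to have been stripped already. A run that contains a word missing from the list yields `None`, which is a useful signal that the word list needs extending.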

3 answers

  • Hann Yang 2022-05-31 22:08

    Note: str and dict are both built-in names (the built-in string and dictionary types), so avoid using them as variable names.
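
To illustrate why shadowing a built-in name causes trouble, here is a minimal sketch. The function `count_chars` is a hypothetical example and is not part of the original question or answer.

```python
def count_chars():
    str = "The Zen of Python"   # this local name hides the built-in str type
    try:
        # Intended: convert the length to text. Actually: tries to call a string object.
        return str(len(str))
    except TypeError as exc:
        return f"TypeError: {exc}"


result = count_chars()
print(result)
print(str(42))   # outside the function the built-in type is untouched
```

Inside the function, `str` refers to the text itself, so the conversion call fails. Choosing a name such as `zen` or `text` avoids the problem.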

    zen  = """The Zen of Python, by Tim Peters
                Beautiful is better than ugly.
                Explicit is better than implicit.
                Simple is better than complex.
                Complex is better than complicated.
                Flat is better than nested.
                Sparse is better than dense.
                Readability counts.
                Special cases aren't special enough to break the rules.
                Although practicality beats purity.
                Errors should never pass silently.
                Unless explicitly silenced.
                In the face of ambiguity, refuse the temptation to guess.
                There should be one-- and preferably only one --obvious way to do it.
                Although that way may not be obvious at first unless you're Dutch.
                Now is better than never.
                Although never is often better than *right* now.
                If the implementation is hard to explain, it's a bad idea.
                If the implementation is easy to explain, it may be a good idea.
                Namespaces are one honking great idea -- let's do more of those!"""
    
    
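    # Punctuation to replace with spaces. The apostrophe is not listed,
    # so "aren't" and "you're" stay whole.
    punc = [',', '.', '-', '!', '*']
    
    for p in punc:
        zen = zen.replace(p, ' ')
    
    # Lowercase so that "Complex" and "complex" count as the same word,
    # then split on any run of whitespace.
    lst = zen.lower().split()
    
    dic = {}
    
    for i in lst:
        dic[i] = dic.get(i, 0) + 1
    
    # Print the words sorted by count, highest first.
    for key, value in sorted(dic.items(), key=lambda x: x[1], reverse=True):
        print(f'{key:>15}:{value}')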
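
The same counting can also be sketched with a regular expression and `collections.Counter` from the standard library. This is a variant and not the answerer's own code: the pattern name `WORD_RE` and the shortened sample text are assumptions made for illustration. The pattern treats an apostrophe between letters as part of the word, which keeps contractions such as aren't in one piece.

```python
import re
from collections import Counter

# Shortened sample; the full text from the answer can be used the same way.
text = """Special cases aren't special enough to break the rules.
Although that way may not be obvious at first unless you're Dutch.
Although never is often better than *right* now."""

# A word is a run of letters, optionally followed by an apostrophe and more letters.
# Everything else (commas, periods, asterisks, dashes) acts as a separator.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")

words = WORD_RE.findall(text.lower())
counts = Counter(words)

# most_common() yields (word, count) pairs ordered by count, highest first
for word, count in counts.most_common():
    print(f"{word:>15}:{count}")
```

Like the loop above, this still relies on the text having spaces between words. It does not solve the unspaced lines from the question.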
    


    This answer was selected by the asker as the best answer.

Question events

  • Closed by the system on June 9
  • Answer accepted on June 1
  • Question created on May 31
