willyetao 2021-08-22 11:48 Acceptance rate: 100%
Closed

Python: several ways to implement array statistics

example input for index_from_tokens
[("cat",1),("cat",2),("cat",2),("door",1),("water",3)]
example output for index_from_tokens
index:{'cat':[(1,2)],'door':[(1,1)],'water':[(3,1)]}
doc_freq:{'cat':2,'door':1,'water':1}

3 answers

  • 广大菜鸟 2021-08-22 12:52
    
    def function(index_from_tokens: list):
        index = {}
        doc_freq = {}
        tmpDict = {}  # (key, value) pair -> number of occurrences
        for key, value in index_from_tokens:
            # Use a tuple as the combined key. Joining with "_" breaks when
            # a key itself contains an underscore, and forces value to be an int.
            newKey = (key, value)
            if newKey not in tmpDict:
                tmpDict[newKey] = 0
            tmpDict[newKey] += 1
        for (oldKey, oldValue), num in tmpDict.items():
            if oldKey not in index:
                index[oldKey] = []
            index[oldKey].append((oldValue, num))
            # Each distinct (key, value) pair adds one to the key's doc_freq
            if oldKey not in doc_freq:
                doc_freq[oldKey] = 0
            doc_freq[oldKey] += 1
        return index, doc_freq
    
    
    index_from_tokens = [("cat", 1), ("cat", 1), ("cat", 2), ("door", 1), ("water", 3)]
    index, doc_freq = function(index_from_tokens)
    print(index)
    print(doc_freq)
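
The combined-key idea above can also be expressed with the standard library's `collections.Counter`. This is only a minimal sketch: the name `index_with_counter` and the reading of each pair as `(term, doc_id)` are assumptions for illustration, not part of the original answer.

```python
from collections import Counter


def index_with_counter(tokens):
    """Build (index, doc_freq) from (term, doc_id) pairs (assumed meaning)."""
    # Counter tallies every distinct (term, doc_id) pair in first-seen order.
    pair_counts = Counter(tokens)
    index = {}
    for (term, doc_id), count in pair_counts.items():
        index.setdefault(term, []).append((doc_id, count))
    # doc_freq is the number of distinct doc ids recorded for each term.
    doc_freq = {term: len(postings) for term, postings in index.items()}
    return index, doc_freq


tokens = [("cat", 1), ("cat", 1), ("cat", 2), ("door", 1), ("water", 3)]
index, doc_freq = index_with_counter(tokens)
print(index)
print(doc_freq)
```

Because the pairs are hashed as tuples, terms that contain an underscore and non-integer ids are handled without any string parsing.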
    


    Method 2:

    
    def function(index_from_tokens: list):
        index = {}
        doc_freq = {}
        tmpDict = {}  # key -> list of every value seen with that key
        for key, value in index_from_tokens:
            if key not in tmpDict:
                tmpDict[key] = []
            tmpDict[key].append(value)
        for key, values in tmpDict.items():
            # Distinct values in first-seen order (a plain set has no guaranteed order)
            distinct = list(dict.fromkeys(values))
            # doc_freq is the number of distinct values for this key
            doc_freq[key] = len(distinct)
            index[key] = [(num, values.count(num)) for num in distinct]
        return index, doc_freq
    
    
    index_from_tokens = [("cat", 1), ("cat", 1), ("cat", 2), ("door", 1), ("water", 3)]
    index, doc_freq = function(index_from_tokens)
    print(index)
    print(doc_freq)
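
The group-then-count approach of Method 2 could be sketched with `collections.defaultdict` and `collections.Counter`, which counts each group in one pass instead of calling `list.count` once per distinct value. The function name `index_by_grouping` and the variable names are hypothetical.

```python
from collections import Counter, defaultdict


def index_by_grouping(tokens):
    """Group ids per term first, then count each group (sketch of Method 2)."""
    grouped = defaultdict(list)  # term -> every id seen with that term
    for term, doc_id in tokens:
        grouped[term].append(doc_id)

    index = {}
    doc_freq = {}
    for term, doc_ids in grouped.items():
        counts = Counter(doc_ids)  # id -> occurrences, in first-seen order
        index[term] = list(counts.items())
        doc_freq[term] = len(counts)  # number of distinct ids
    return index, doc_freq


tokens = [("cat", 1), ("cat", 1), ("cat", 2), ("door", 1), ("water", 3)]
index, doc_freq = index_by_grouping(tokens)
print(index)
print(doc_freq)
```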
    

    Method 3:

    
    def function(index_from_tokens: list):
        index = {}
        doc_freq = {}
        for key, value in index_from_tokens:
            if key not in index:
                index[key] = dict()
            if value not in index[key]:
                index[key][value] = 0
                # First time this value appears for this key
                if key not in doc_freq:
                    doc_freq[key] = 0
                doc_freq[key] += 1
            index[key][value] += 1
        # Convert each inner {value: count} dict into a list of (value, count) tuples
        for key, dict_value in index.items():
            index[key] = list(dict_value.items())
        return index, doc_freq
    
    
    index_from_tokens = [("cat", 1), ("cat", 1), ("cat", 2), ("door", 1), ("water", 3)]
    index, doc_freq = function(index_from_tokens)
    print(index)
    print(doc_freq)
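
The nested-dictionary bookkeeping in Method 3 can be shortened by letting `defaultdict(Counter)` create the inner counters on demand. This is a sketch under the same assumptions as the answer's sample data; `index_nested` is a hypothetical name.

```python
from collections import Counter, defaultdict


def index_nested(tokens):
    """Count ids per term in a nested mapping, then flatten to tuples."""
    nested = defaultdict(Counter)  # term -> Counter of id -> occurrences
    for term, doc_id in tokens:
        nested[term][doc_id] += 1  # missing terms and ids start at 0

    index = {term: list(counts.items()) for term, counts in nested.items()}
    doc_freq = {term: len(counts) for term, counts in nested.items()}
    return index, doc_freq


tokens = [("cat", 1), ("cat", 1), ("cat", 2), ("door", 1), ("water", 3)]
index, doc_freq = index_nested(tokens)
print(index)
print(doc_freq)
```

Building new dictionaries at the end, rather than overwriting values while iterating, keeps the counting structure and the returned result separate.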
    


    This answer was selected as the best answer by the question's author.
Question events

  • Closed, August 22
  • Answer accepted, August 22
  • Question created, August 22
