2301_81330815 2023-11-23 09:29 Acceptance rate: 0%
Views: 11

Count the occurrences of each word in a file


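Count how many times each word occurs in a file. Please provide:
Code for the map phase
Code for the reduce phase
Code for the driver phase
Test results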

1 answer

  • MA_SS 2023-11-29 20:14

    Map phase

    def map_phase(filename):
        word_count = {}
        with open(filename, 'r') as file:
            for line in file:
                words = line.strip().split()
                for word in words:
                word = word.lower()  # lowercase the word so counting is case-insensitive
                    word_count[word] = word_count.get(word, 0) + 1
        return word_count
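
Note that `split()` only splits on whitespace, so punctuation stays attached to words and `word,` would be counted separately from `word`. If that matters, a tokenizer along the following lines could replace the `split()` call. The name `tokenize` and its regular expression are assumptions for illustration, not part of the original answer, and the pattern only covers ASCII letters, digits, and apostrophes.

```python
import re

def tokenize(line):
    """Lowercase a line and return its runs of ASCII letters, digits, and apostrophes."""
    # Hypothetical helper: anything that is not a-z, 0-9, or ' acts as a separator.
    return re.findall(r"[a-z0-9']+", line.lower())

print(tokenize("Hello, world! Hello again."))
# prints ['hello', 'world', 'hello', 'again']
```

With this helper, the inner loop of `map_phase` would iterate over `tokenize(line)` instead of `line.strip().split()`.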
    

    Reduce phase

    def reduce_phase(mapped_data):
        reduced_data = {}
        for word_count in mapped_data:
            for word, count in word_count.items():
                reduced_data[word] = reduced_data.get(word, 0) + count
        return reduced_data
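
The merge performed by the reduce step can also be written with `collections.Counter` from the standard library, whose `update` method adds counts for matching keys. This is only an alternative sketch; the name `reduce_with_counter` is an assumption and does not appear in the original answer.

```python
from collections import Counter

def reduce_with_counter(mapped_data):
    """Merge a list of per-file {word: count} dicts into one dict of totals."""
    total = Counter()
    for word_count in mapped_data:
        # Counter.update adds the incoming counts instead of overwriting them.
        total.update(word_count)
    return dict(total)

print(reduce_with_counter([{"a": 1, "b": 2}, {"a": 3}]))
# prints {'a': 4, 'b': 2}
```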
    

    Driver phase

    def driver_phase(filenames):
        mapped_data = []
        for filename in filenames:
            mapped_data.append(map_phase(filename))
    
        reduced_data = reduce_phase(mapped_data)
        return reduced_data
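
The question also asks for test results, which are not shown here. Below is a minimal sketch of a test run. The three functions repeat the code from this answer so the script is self-contained; the file names `a.txt` and `b.txt`, their contents, and the helper `run_demo` are made up for illustration.

```python
import os
import tempfile

def map_phase(filename):
    word_count = {}
    with open(filename, 'r') as file:
        for line in file:
            for word in line.strip().split():
                word = word.lower()
                word_count[word] = word_count.get(word, 0) + 1
    return word_count

def reduce_phase(mapped_data):
    reduced_data = {}
    for word_count in mapped_data:
        for word, count in word_count.items():
            reduced_data[word] = reduced_data.get(word, 0) + count
    return reduced_data

def driver_phase(filenames):
    return reduce_phase([map_phase(name) for name in filenames])

def run_demo():
    """Write two small hypothetical input files and count their words."""
    samples = [
        ("a.txt", "Hello world\nhello MapReduce\n"),
        ("b.txt", "world of words\n"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name, text in samples:
            path = os.path.join(tmp, name)
            with open(path, "w") as f:
                f.write(text)
            paths.append(path)
        return driver_phase(paths)

result = run_demo()
for word, count in sorted(result.items()):
    print(word, count)
# prints:
# hello 2
# mapreduce 1
# of 1
# words 1
# world 2
```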
    
