dousi1961 2019-07-11 07:59
Views: 143
Accepted

Remove all words from a string that are contained in a slice

I'm trying to do topic extraction. My approach is to remove all auxiliary words (stop words) from each string and count the words that remain. Here is roughly what I do:

topic := make(map[string]int)
auxiliaryWord := []string{"hbs", "habis", "dan", "kapan", "bagaimana", "kita", "kamu", "warga", "pada", "paling", "ga", "gak", "enggak", "tidak", "bukan", "usai", "juga", "yg", "yang", "kpd", "kepada", "nya", "adanya", "jd", "jadi", "sih", "lah", "kan", "photo", "from", "by", "ini", "saja", "utk", "untuk", "lebih", "ternyata", "apa", "sok", "tau", "bagi", "eksis", "keluar", "kk", "kakak"}
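// chats is a []string of messages; needs the "slices" and "strings" imports
for _, chat := range chats {
    arrWord := strings.Split(chat, " ")
    for _, word := range arrWord {
        // linear scan of auxiliaryWord for every word
        if !slices.Contains(auxiliaryWord, word) {
            topic[word]++ // a missing key reads as 0, so no existence check is needed
        }
    }
}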

My question: is there a faster way to do this?

1 answer

  • donglu1971 2019-07-11 09:43

    Just precompute a hash set (a map) from the auxiliary words, then do a map lookup for each word instead of scanning the slice.

    package main
    
    import (
        "fmt"
        "strings"
    )
    
    var auxilaryWords = []string{"hbs", "habis", "dan", "kapan", "bagaimana", "kita", "kamu", "warga", "pada", "paling", "ga", "gak", "enggak", "tidak", "bukan", "usai", "juga", "yg", "yang", "kpd", "kepada", "nya", "adanya", "jd", "jadi", "sih", "lah", "kan", "photo", "from", "by", "ini", "saja", "utk", "untuk", "lebih", "ternyata", "apa", "sok", "tau", "bagi", "eksis", "keluar", "kk", "kakak"}
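    // auxHash is a set built from auxilaryWords; map lookups are O(1) on average.
    var auxHash = map[string]bool{}
    
    // CountTopics counts how often each non-auxiliary word occurs in chatWords.
    func CountTopics(chatWords []string) map[string]int {
        result := map[string]int{}
        for _, word := range chatWords {
            // A missing key yields false, so only auxiliary words are skipped.
            if !auxHash[word] {
                result[word]++
            }
        }
        return result
    }
    
    // init fills auxHash once, before main runs.
    func init() {
        for _, word := range auxilaryWords {
            auxHash[word] = true
        }
    }
    
    func main() {
        arrWord := strings.Split(`hai kakak habis makan apa`, " ")
        fmt.Println(CountTopics(arrWord)) // prints map[hai:1 makan:1]
    }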
    

    https://play.golang.org/p/Wr2gK_zizL0

    This answer was accepted by the asker as the best answer.
