duanqiang9212 2017-01-21 18:52
Viewed 10
Accepted

How to search a huge slice of map[string]string concurrently

I need to search a huge slice of map[string]string. My thought was that this is a good chance to use Go's channels and goroutines.

The plan was to divide the slice into parts and search them in parallel. But I was kind of shocked that my parallel version timed out while the plain search of the whole slice did the trick.

I am not sure what I am doing wrong. Below is the code I used to test the concept; the real code would involve more complexity.

// Search for a given term.
// This function gets the data that needs to be searched
// and the search term, and it returns the matched maps.
// The data is pretty simple: each map contains { key: someText }
func Search(data []map[string]string, term string) []map[string]string {
    set := []map[string]string{}

    for _, v := range data {
        if v["key"] == term {
            set = append(set, v)
        }
    }
    return set
}

So this works pretty well for searching the slice of maps for a given search term.
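
For example, with data shaped like that (just to illustrate, not my real data):

data := []map[string]string{
    {"key": "This"},
    {"key": "String One"},
}

matches := Search(data, "This") // matches now holds only the map {"key": "This"}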

Now I thought that if my slice had, say, 20K entries, I would like to do the search in parallel.

// All searches all records concurrently.
// It has the same function signature as the Search function,
// but its main task is to fan the slice out into 5 parts and search
// them in parallel.
func All(data []map[string]string, term string) []map[string]string {
    countOfSlices := 5

    part := len(data) / countOfSlices

    fmt.Printf("Size of the data:%v
", len(data))
    fmt.Printf("Fragemnt Size:%v
", part)

    timeout := time.After(60000 * time.Millisecond)

    c := make(chan []map[string]string)

    for i := 0; i < countOfSlices; i++ {
        // Fragments of the array passed on to the search method
        go func() { c <- Search(data[(part*i):(part*(i+1))], term) }()

    }

    result := []map[string]string{}

    for i := 0; i < part-1; i++ {
        select {
        case records := <-c:
            result = append(result, records...)
        case <-timeout:
            fmt.Println("timed out!")
            return result
        }
    }
    return result
}

Here are my tests:

I have a function to generate my test data and 2 tests.

func GenerateTestData(search string) ([]map[string]string, int) {
    rand.Seed(time.Now().UTC().UnixNano())
    strin := []string{"String One", "This", "String Two", "String Three", "String Four", "String Five"}
    var matchCount int
    numOfRecords := 20000
    set := []map[string]string{}
    for i := 0; i < numOfRecords; i++ {
        p := rand.Intn(len(strin))
        s := strin[p]
        if s == search {
            matchCount++
        }
        set = append(set, map[string]string{"key": s})
    }
    return set, matchCount
}

The two tests: the first just traverses the slice, the second searches in parallel.

func TestSearchItem(t *testing.T) {

    tests := []struct {
        InSearchTerm string
        Fn           func(data []map[string]string, term string) []map[string]string
    }{
        {
            InSearchTerm: "This",
            Fn:           Search,
        },
        {InSearchTerm: "This",
            Fn: All,
        },
    }

    for i, test := range tests {

        startTime := time.Now()
        data, expectedMatchCount := GenerateTestData(test.InSearchTerm)
        result := test.Fn(data, test.InSearchTerm)

        fmt.Printf("Test: [%v]:
Time: %v 

", i+1, time.Since(startTime))
        assert.Equal(t, len(result), expectedMatchCount, "expected: %v to be: %v", len(result), expectedMatchCount)

    }
}

It would be great if someone could explain to me why my parallel code is so slow. What is wrong with the code, what am I missing here, and what is the recommended way to search huge slices (50K+ entries) in memory?


3 answers

  • duanhuan5409 2017-01-21 19:46

    This looks like just a simple typo. The problem is that you divide your original big slice into 5 pieces (countOfSlices), and you properly launch 5 goroutines to search each part:

    for i := 0; i < countOfSlices; i++ {
        // Fragments of the array passed on to the search method
        go func() { c <- Search(data[(part*i):(part*(i+1))], term) }()
    
    }
    

    This means you should expect 5 results, but you don't. You wait for part - 1 results, which is 4000 - 1 = 3999:

    for i := 0; i < part-1; i++ {
        select {
        case records := <-c:
            result = append(result, records...)
        case <-timeout:
            fmt.Println("timed out!")
            return result
        }
    }
    

    Obviously, if you only launched 5 goroutines, each of which delivers a single result, you can only expect that many (5). And since your loop waits for far more results (which will never come), it times out as expected.

    Change the condition to this:

    for i := 0; i < countOfSlices; i++ {
        // ...
    }
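
    For completeness, here is roughly what the whole All function could look like with that fix applied. This is only a sketch (I haven't run it against your tests) and it assumes the same imports as your original file ("fmt", "time"). I also pass i into the goroutine so each one gets its own copy of the index, and let the last fragment absorb any remainder when len(data) isn't evenly divisible by 5 — neither of which your original code did:

    func All(data []map[string]string, term string) []map[string]string {
        countOfSlices := 5
        part := len(data) / countOfSlices

        timeout := time.After(60 * time.Second)
        c := make(chan []map[string]string)

        for i := 0; i < countOfSlices; i++ {
            // Pass i as an argument: all goroutines would otherwise share
            // the same loop variable and could read an already-changed value.
            go func(i int) {
                lo, hi := part*i, part*(i+1)
                if i == countOfSlices-1 {
                    hi = len(data) // last fragment picks up any remainder
                }
                c <- Search(data[lo:hi], term)
            }(i)
        }

        result := []map[string]string{}

        // Exactly one result per goroutine, so wait for countOfSlices of them.
        for i := 0; i < countOfSlices; i++ {
            select {
            case records := <-c:
                result = append(result, records...)
            case <-timeout:
                fmt.Println("timed out!")
                return result
            }
        }
        return result
    }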
    
    Marked as the accepted answer by the asker.