duansanzi5265 2019-01-05 16:48
浏览 38
已采纳

从N个元素的切片生成K个元素的算法

I'm trying to port an algorithm from this Stackoverflow question in Go. The algorithm I'm trying to get working is as follows: given a slice of strings of an arbitrary length, and a "depth", find all the combinations of the elements in the original slice that are of length depth. For example, if given a slice containing A, B, C, D, E, and F, and a depth of 3, the result should be:

[A, B, C]
[A, B, D]
[A, B, E]
[A, B, F]
[A, C, D]
[A, C, E]
[A, C, F]
[A, D, E]
[A, D, F]
[A, E, F]
[B, C, D]
[B, C, E]
[B, C, F]
[B, D, E]
[B, D, F]
[B, E, F]
[C, D, E]
[C, D, F]
[C, E, F]
[D, E, F]

I've tried to implement a few of the proposed solutions in the above post in Go, but unfortunately my Go skills aren't quite up to snuff yet. I've only started programming in Go a few weeks ago.

Here is the broken code that was a failed attempt to port this implementation in Java:

package main

import (
    "fmt"
)

func main() {
    combos := []string{"A","B","C","D","E","F"}
    combos = GetCombos(combos, 3)

    fmt.Println(combos)
}

func GetCombos(set []string, depth int) []string {
    var results []string
    element := make([]string, depth)
    return GetEnvCombos2(set, depth, 0, element, results)
}

func GetCombos2(set []string, depth int, start int, element, results []string) []string {
    if depth == 0 {
        var guess string
        for _, e := range element {
            guess += e
        }
        results = append(results, guess)
        return results
    }
    for i := start; i <= len(set) - depth; i++ {
        element[len(element) - depth] = set[i]
        results = append(results, GetEnvCombos2(set, depth - 1, i + 1, element, results)...)
    }

    return nil
}

I don't know if that implementation in Java is the most efficient way to do it, but it seemed fairly efficient and (I thought) relatively easy to port to Go. If there's a totally different, yet more efficient way to do this, I'd gladly accept that.

  • 写回答

1条回答 默认 最新

  • dongtu4028 2019-01-05 18:56
    关注

    Note:

    The correct answer to any combinatoric problem is practically never to put all possibly combinations into a container and process them afterwards. There are typically an enormous number of combinations and the temporary container tends to use up all available memory for items which are only going to be referenced once. The original Java program buries the processing step (in this case, "print the combination") deep inside the generation function, which is also practically never a good solution because it requires creating an entire new generator function for every different action.

    One way to structure combinatorial generation is to use a function which finds the next combination, given the previous one. Such a function is typically called an "iterator". If the last combination is provided, the function returns a return value indicating that there are no more combinations available. Often, the provided combination is modified "in-place", so that the return value is just a boolean indicating whether the combination was the last one or not. (It's usually considered best practice to reset the supplied combination to the first combination as well as reporting that it was previously the last combination.) That strategy doesn't work well with recursive algorithms such as the one you are porting.

    Many languages include some facility which allows recursive generation of possible values. For example, in Go you can write an iterator as a "go routine". That can produce really elegant code, although there is an underlying cost.

    It is always possible to reimplement a recursive function as an iterative one, by simulating the call stack with some kind of stack datastructure; however, the result is harder to understand and often slower (because native recursion is almost always faster than simulated recursion). And you might be able to find a non-recursive algorithm for iterating (possibly changing the iteration order).

    I'm not going to do any of those things, here, though. The following simply satisfies the same prototype as the original code, returning a (possibly enormous) slice of results, because the underlying issue is simply a question of design of recursive functions.


    The prototype of the internal recursive generator is

    func GetCombos2(set []string,
                    depth int, start int,
                    element []string,
                    results []string) []string
    

    (I added the type of element, for clarity.) It's useful to try to articulate what this function does, exactly, which might go something like this:

    Given a list of items set, a partial combination element which still requires depth items to be appended to it, and a list of combinations results, return results appended with the possible combinations starting with the prefix indicated by element whose continuations only contain items whose index is greater than equal to start. Combinations are generated in monotonically increasing index order, and it is required that all items in the prefix have indices less than start.

    That's a bit of a mouthful, and I'm not sure that reading it is immediately clearer than the code. But it is possibly helpful as a way of understanding what is going on. Here, I'm just going to focus on one small part:

    Given… results, … return results appended with… [the new combinations computed with these arguments]

    That's not the only possible way of writing this recursion. Another way would be to not require results as an argument, and to simply return the list of combinations generated according to the other arguments. That would produce slightly simpler code but it could be quite a bit slower because of the number of slices of partial results generated and immediately discarded. The use of "accumulator" arguments like results is a common technique for making recursion more efficient.

    What's important about this discussion is understanding what the return value of the recursive function is. If you use the "accumulator" strategy (with the results argument) then the return value is the entire list of results found up to this point, and you only append to it if you are adding a new result. If you use the non-accumulator strategy, then when you find a new result you return it immediately, leaving it to the caller to concatenate the various lists it receives from multiple calls.

    So the two strategies would look like this:

    Accumulator version:

    func GetCombos2(set []string, depth int, start int, element []string, results []string) []string {
        if depth == 0 {
            results = append(results, strings.Join(element, "")) 
        } else {
            for i := start; i <= len(set) - depth; i++ {
                element[len(element) - depth] = set[i]
                results = GetEnvCombos2(set, depth - 1, i + 1, element, results)
            }
        }
        return results
    }
    

    Non-accumulator version:

    func GetCombos2(set []string, depth int, start int, element []string) []string {
        if depth == 0 {
            return []string { strings.Join(element, "") }
        } else {
            var results []string
            for i := start; i <= len(set) - depth; i++ {
                element[len(element) - depth] = set[i]
                results = append(results, GetCombos2(set, depth - 1, i + 1, element)...)
            }
            return results
        }
    }
    

    EDIT: After writing that, I realised that the use of the string array elements is really a Java-ism which doesn't translate well to Go. (Or perhaps it's a C-ism badly translated to Java.) Anyway, the functions are slightly faster and quite a bit easier to read if we just pass a string representing the prefix so that we don't need to do the Join. (Go strings are immutable, so there's no need to copy them before putting them into the results slice.)

    That reduces the code to the following:

    Accumulator version (recommended, but an iterator would be even better):

    func GetCombos(set []string, depth int) []string {
        return GetCombosHelper(set, depth, 0, "", []string{})
    }
    
    func GetCombosHelper(set []string, depth int, start int, prefix string, accum []string) []string {
        if depth == 0 {
            return append(accum, prefix)
        } else {
            for i := start; i <= len(set) - depth; i++ {
                accum = GetCombosHelper(set, depth - 1, i + 1, prefix + set[i], accum)
            }
            return accum
        }
    }
    

    Non-accumulator version:

    func GetCombos(set []string, depth int) []string {
        return GetCombosHelper(set, depth, 0, "")
    }
    
    func GetCombosHelper(set []string, depth int, start int, prefix string) []string {
        if depth == 0 {
            return []string{prefix}
        } else {
            accum := []string{}
            for i := start; i <= len(set) - depth; i++ {
                accum = append(accum, GetCombosHelper(set, depth - 1, i + 1, prefix + set[i])...)
            }
            return accum
        }
    }
    

    On my laptop, given a set of 62 elements (upper and lower case letters plus digits) with depth 6, the non-accumulator version took 29.7 seconds (elapsed) and the accumulator version took 13.4 seconds. Both used about 4.5 gigabytes of memory, which seemed a bit high to me since there are "only" 61,474,519 six-character combinations, and the memory consumption works out to almost 80 bytes peak memory usage per combination.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥20 数学建模,尽量用matlab回答,论文格式
  • ¥15 昨天挂载了一下u盘,然后拔了
  • ¥30 win from 窗口最大最小化,控件放大缩小,闪烁问题
  • ¥20 易康econgnition精度验证
  • ¥15 msix packaging tool打包问题
  • ¥28 微信小程序开发页面布局没问题,真机调试的时候页面布局就乱了
  • ¥15 python的qt5界面
  • ¥15 无线电能传输系统MATLAB仿真问题
  • ¥50 如何用脚本实现输入法的热键设置
  • ¥20 我想使用一些网络协议或者部分协议也行,主要想实现类似于traceroute的一定步长内的路由拓扑功能