duansanzi5265 2019-01-05 16:48
浏览 38
已采纳

从N个元素的切片生成K个元素的算法

I'm trying to port an algorithm from this Stackoverflow question in Go. The algorithm I'm trying to get working is as follows: given a slice of strings of an arbitrary length, and a "depth", find all the combinations of the elements in the original slice that are of length depth. For example, if given a slice containing A, B, C, D, E, and F, and a depth of 3, the result should be:

[A, B, C]
[A, B, D]
[A, B, E]
[A, B, F]
[A, C, D]
[A, C, E]
[A, C, F]
[A, D, E]
[A, D, F]
[A, E, F]
[B, C, D]
[B, C, E]
[B, C, F]
[B, D, E]
[B, D, F]
[B, E, F]
[C, D, E]
[C, D, F]
[C, E, F]
[D, E, F]

I've tried to implement a few of the proposed solutions in the above post in Go, but unfortunately my Go skills aren't quite up to snuff yet. I've only started programming in Go a few weeks ago.

Here is the broken code that was a failed attempt to port this implementation in Java:

package main

import (
    "fmt"
)

func main() {
    combos := []string{"A","B","C","D","E","F"}
    combos = GetCombos(combos, 3)

    fmt.Println(combos)
}

func GetCombos(set []string, depth int) []string {
    var results []string
    element := make([]string, depth)
    return GetEnvCombos2(set, depth, 0, element, results)
}

func GetCombos2(set []string, depth int, start int, element, results []string) []string {
    if depth == 0 {
        var guess string
        for _, e := range element {
            guess += e
        }
        results = append(results, guess)
        return results
    }
    for i := start; i <= len(set) - depth; i++ {
        element[len(element) - depth] = set[i]
        results = append(results, GetEnvCombos2(set, depth - 1, i + 1, element, results)...)
    }

    return nil
}

I don't know if that implementation in Java is the most efficient way to do it, but it seemed fairly efficient and (I thought) relatively easy to port to Go. If there's a totally different, yet more efficient way to do this, I'd gladly accept that.

  • 写回答

1条回答 默认 最新

  • dongtu4028 2019-01-05 18:56
    关注

    Note:

    The correct answer to any combinatoric problem is practically never to put all possibly combinations into a container and process them afterwards. There are typically an enormous number of combinations and the temporary container tends to use up all available memory for items which are only going to be referenced once. The original Java program buries the processing step (in this case, "print the combination") deep inside the generation function, which is also practically never a good solution because it requires creating an entire new generator function for every different action.

    One way to structure combinatorial generation is to use a function which finds the next combination, given the previous one. Such a function is typically called an "iterator". If the last combination is provided, the function returns a return value indicating that there are no more combinations available. Often, the provided combination is modified "in-place", so that the return value is just a boolean indicating whether the combination was the last one or not. (It's usually considered best practice to reset the supplied combination to the first combination as well as reporting that it was previously the last combination.) That strategy doesn't work well with recursive algorithms such as the one you are porting.

    Many languages include some facility which allows recursive generation of possible values. For example, in Go you can write an iterator as a "go routine". That can produce really elegant code, although there is an underlying cost.

    It is always possible to reimplement a recursive function as an iterative one, by simulating the call stack with some kind of stack datastructure; however, the result is harder to understand and often slower (because native recursion is almost always faster than simulated recursion). And you might be able to find a non-recursive algorithm for iterating (possibly changing the iteration order).

    I'm not going to do any of those things, here, though. The following simply satisfies the same prototype as the original code, returning a (possibly enormous) slice of results, because the underlying issue is simply a question of design of recursive functions.


    The prototype of the internal recursive generator is

    func GetCombos2(set []string,
                    depth int, start int,
                    element []string,
                    results []string) []string
    

    (I added the type of element, for clarity.) It's useful to try to articulate what this function does, exactly, which might go something like this:

    Given a list of items set, a partial combination element which still requires depth items to be appended to it, and a list of combinations results, return results appended with the possible combinations starting with the prefix indicated by element whose continuations only contain items whose index is greater than equal to start. Combinations are generated in monotonically increasing index order, and it is required that all items in the prefix have indices less than start.

    That's a bit of a mouthful, and I'm not sure that reading it is immediately clearer than the code. But it is possibly helpful as a way of understanding what is going on. Here, I'm just going to focus on one small part:

    Given… results, … return results appended with… [the new combinations computed with these arguments]

    That's not the only possible way of writing this recursion. Another way would be to not require results as an argument, and to simply return the list of combinations generated according to the other arguments. That would produce slightly simpler code but it could be quite a bit slower because of the number of slices of partial results generated and immediately discarded. The use of "accumulator" arguments like results is a common technique for making recursion more efficient.

    What's important about this discussion is understanding what the return value of the recursive function is. If you use the "accumulator" strategy (with the results argument) then the return value is the entire list of results found up to this point, and you only append to it if you are adding a new result. If you use the non-accumulator strategy, then when you find a new result you return it immediately, leaving it to the caller to concatenate the various lists it receives from multiple calls.

    So the two strategies would look like this:

    Accumulator version:

    func GetCombos2(set []string, depth int, start int, element []string, results []string) []string {
        if depth == 0 {
            results = append(results, strings.Join(element, "")) 
        } else {
            for i := start; i <= len(set) - depth; i++ {
                element[len(element) - depth] = set[i]
                results = GetEnvCombos2(set, depth - 1, i + 1, element, results)
            }
        }
        return results
    }
    

    Non-accumulator version:

    func GetCombos2(set []string, depth int, start int, element []string) []string {
        if depth == 0 {
            return []string { strings.Join(element, "") }
        } else {
            var results []string
            for i := start; i <= len(set) - depth; i++ {
                element[len(element) - depth] = set[i]
                results = append(results, GetCombos2(set, depth - 1, i + 1, element)...)
            }
            return results
        }
    }
    

    EDIT: After writing that, I realised that the use of the string array elements is really a Java-ism which doesn't translate well to Go. (Or perhaps it's a C-ism badly translated to Java.) Anyway, the functions are slightly faster and quite a bit easier to read if we just pass a string representing the prefix so that we don't need to do the Join. (Go strings are immutable, so there's no need to copy them before putting them into the results slice.)

    That reduces the code to the following:

    Accumulator version (recommended, but an iterator would be even better):

    func GetCombos(set []string, depth int) []string {
        return GetCombosHelper(set, depth, 0, "", []string{})
    }
    
    func GetCombosHelper(set []string, depth int, start int, prefix string, accum []string) []string {
        if depth == 0 {
            return append(accum, prefix)
        } else {
            for i := start; i <= len(set) - depth; i++ {
                accum = GetCombosHelper(set, depth - 1, i + 1, prefix + set[i], accum)
            }
            return accum
        }
    }
    

    Non-accumulator version:

    func GetCombos(set []string, depth int) []string {
        return GetCombosHelper(set, depth, 0, "")
    }
    
    func GetCombosHelper(set []string, depth int, start int, prefix string) []string {
        if depth == 0 {
            return []string{prefix}
        } else {
            accum := []string{}
            for i := start; i <= len(set) - depth; i++ {
                accum = append(accum, GetCombosHelper(set, depth - 1, i + 1, prefix + set[i])...)
            }
            return accum
        }
    }
    

    On my laptop, given a set of 62 elements (upper and lower case letters plus digits) with depth 6, the non-accumulator version took 29.7 seconds (elapsed) and the accumulator version took 13.4 seconds. Both used about 4.5 gigabytes of memory, which seemed a bit high to me since there are "only" 61,474,519 six-character combinations, and the memory consumption works out to almost 80 bytes peak memory usage per combination.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥50 求解vmware的网络模式问题 别拿AI回答
  • ¥24 EFS加密后,在同一台电脑解密出错,证书界面找不到对应指纹的证书,未备份证书,求在原电脑解密的方法,可行即采纳
  • ¥15 springboot 3.0 实现Security 6.x版本集成
  • ¥15 PHP-8.1 镜像无法用dockerfile里的CMD命令启动 只能进入容器启动,如何解决?(操作系统-ubuntu)
  • ¥30 请帮我解决一下下面六个代码
  • ¥15 关于资源监视工具的e-care有知道的嘛
  • ¥35 MIMO天线稀疏阵列排布问题
  • ¥60 用visual studio编写程序,利用间接平差求解水准网
  • ¥15 Llama如何调用shell或者Python
  • ¥20 谁能帮我挨个解读这个php语言编的代码什么意思?