douhui9192 2018-06-14 21:58
Accepted

Weighted sampling without replacement using gonum

I have a big array of items and another array of weights of the same size. I would like to sample without replacement from the first array based on the weights from the second array. Is there a way to do this using gonum?

  • dpndp64206 2018-06-14 22:54

    sampleuv.Weighted and its Take method look like exactly what you want.

    From the doc:

    func NewWeighted(w []float64, src *rand.Rand) Weighted
    

    NewWeighted returns a Weighted for the weights w. If src is nil, rand.Rand is used as the random source. Note that sampling from weights with a high variance or overall low absolute value sum may result in problems with numerical stability.

    func (s Weighted) Take() (idx int, ok bool)
    

    Take returns an index from the Weighted with probability proportional to the weight of the item. The weight of the item is then set to zero. Take returns false if there are no items remaining.

    Therefore Take is indeed what you need for sampling without replacement.

    You can use NewWeighted to create a Weighted with the given weights, then use Take to extract one index with probability based on the previously set weights, and then select the item at the extracted index from your array of samples.


    Working example:

    package main
    
    import (
        "fmt"
        "time"
    
        "golang.org/x/exp/rand"
    
        "gonum.org/v1/gonum/stat/sampleuv"
    )
    
    func main() {
        samples := []string{"hello", "world", "what's", "going", "on?"}
        weights := []float64{1.0, 0.55, 1.23, 1, 0.002}
    
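        // Seed the source from the clock so each run draws differently.
        w := sampleuv.NewWeighted(
            weights,
            rand.New(rand.NewSource(uint64(time.Now().UnixNano()))),
        )
    
        // Take returns ok == false once no weighted items remain.
        i, ok := w.Take()
        if !ok {
            fmt.Println("no items left to sample")
            return
        }
    
        fmt.Println(samples[i])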
    }
    
    This answer was selected by the asker as the best answer.