Weighting a flat list to a normal distribution

I have a list of string items of arbitrary length. I need to "normalize" this list so that each item is weighted according to a normal distribution, with the weight appended to the string.

Is there a more effective and more mathematically/statistically sound way to go about this than what I have below?

func normalizeAppend(in []string, shuffle bool) []string {
    var ret []string

    if shuffle {
        shuffleStrings(in)
    }

    l := len(in)
    switch {
    case remain(l, 3) == 0:
        l3 := (l / 3)
        var low, mid, high []string
        for i, v := range in {
            o := i + 1
            switch {
            case o <= l3:
                low = append(low, v)
            case o > l3 && o <= l3*2:
                mid = append(mid, v)
            case o >= l3*2:
                high = append(high, v)
            }
        }

        q1 := 1600 / len(low)
        q2 := 6800 / len(mid)
        q3 := 1600 / len(high)

        for _, v := range low {
            ret = append(ret, fmt.Sprintf("%s_%d", v, q1))
        }

        for _, v := range mid {
            ret = append(ret, fmt.Sprintf("%s_%d", v, q2))
        }

        for _, v := range high {
            ret = append(ret, fmt.Sprintf("%s_%d", v, q3))
        }
    case remain(l, 2) == 0 && l >= 4:
        l4 := (l / 4)
        var first, second, third, fourth []string
        for i, v := range in {
            o := i + 1
            switch {
            case o <= l4:
                first = append(first, v)
            case o > l4 && o <= l4*2:
                second = append(second, v)
            case o > l4*2 && o <= l4*3:
                third = append(third, v)
            case o > l4*3:
                fourth = append(fourth, v)
            }
        }
        q1 := 1600 / len(first)
        q2 := 3400 / len(second)
        q3 := 3400 / len(third)
        q4 := 1600 / len(fourth)

        for _, v := range first {
            ret = append(ret, fmt.Sprintf("%s_%d", v, q1))
        }

        for _, v := range second {
            ret = append(ret, fmt.Sprintf("%s_%d", v, q2))
        }

        for _, v := range third {
            ret = append(ret, fmt.Sprintf("%s_%d", v, q3))
        }

        for _, v := range fourth {
            ret = append(ret, fmt.Sprintf("%s_%d", v, q4))
        }
    default:
        var first, second, third []string
        q1 := (1 + math.Floor(float64(l)*.16))
        q3 := (float64(l) - math.Floor(float64(l)*.16))
        var o float64
        for i, v := range in {
            o = float64(i + 1)
            switch {
            case o <= q1:
                first = append(first, v)
            case o > q1 && o < q3:
                second = append(second, v)
            case o >= q3:
                third = append(third, v)
            }
        }
        lq1 := 1600 / len(first)
        lq2 := 6800 / len(second) // middle band carries the remaining 68%, so the three bands total 10000
        lq3 := 1600 / len(third)
        for _, v := range first {
            ret = append(ret, fmt.Sprintf("%s_%d", v, lq1))
        }

        for _, v := range second {
            ret = append(ret, fmt.Sprintf("%s_%d", v, lq2))
        }

        for _, v := range third {
            ret = append(ret, fmt.Sprintf("%s_%d", v, lq3))
        }

    }

    return ret
}

Some requested clarification:

I have a list of items that will be chosen from many times, one at a time, by weighted selection. To start with, I have a list with (implied) weights of 1:

[a_1, b_1, c_1, d_1, e_1, f_1, g_1, h_1, i_1, j_1, k_1]

I'm looking for a better way to turn that list into something that produces a more 'normal' distribution of weights for selection:

[a_1, b_2, c_3, d_5, e_14, f_30, g_14, h_5, i_3, j_2, k_1]

Or perhaps I need to change my method to something more statistically grounded. The bottom line is that I want to control selection from a list of items in several ways, one of which is ensuring that items are returned in a way that approximates a normal curve.

1 answer

duanji6997 2016-11-05 13:30
If you just want to calculate the weights for a given list, then you need the following things:

The mean of the normal distribution

The variance of the normal distribution

A discretizer for the values

The first one is quite simple. You want the mean to be in the center of the list. Therefore (assuming zero-based indexing):

mean = (list.size - 1) / 2

The second is somewhat arbitrary and depends on how steeply you want the weights to fall off. Values of the normal density are practically zero beyond a distance of 3 * standard_deviation from the mean. So a good standard deviation in most cases is probably somewhere between a fourth and a sixth of the list length:

standard_deviation = (1/4 .. 1/6) * list.size

variance = standard_deviation^2
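To get a feel for the two ends of that range, here is a rough calculation of the weight at the edge of the list relative to the maximum weight. The symbols are assumptions for illustration: n is the list length, σ the standard deviation, and d the distance of the edge element from the mean, approximated as n/2.

```latex
\frac{w_{\text{edge}}}{w_{\max}} = \exp\!\left(-\frac{d^{2}}{2\sigma^{2}}\right),
\qquad d = \frac{n-1}{2} \approx \frac{n}{2}

\sigma = \frac{n}{4}: \quad
\exp\!\left(-\frac{(n/2)^{2}}{2\,(n/4)^{2}}\right) = e^{-2} \approx 0.135

\sigma = \frac{n}{6}: \quad
\exp\!\left(-\frac{(n/2)^{2}}{2\,(n/6)^{2}}\right) = e^{-9/2} \approx 0.011
```

So with a quarter of the list length, the edge items keep roughly 13.5% of the maximum weight, while with a sixth they keep only about 1%. If the maximum weight is small, the latter may round down to zero, which would exclude the edge items from selection entirely.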

Assuming that you want integer weights, you need to discretize the weights from the normal distribution. The easiest way to do this is by specifying the maximum weight (of the element at the mean position).

That's it. The weight for an element at position i is then:

weight[i] = round(max_weight * exp(-(i - mean)^2 / (2 * variance)))
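Since the question is in Go, the formula above might be sketched like this. The names `gaussianWeights`, `appendWeights`, and the `sdFraction` parameter are assumptions made for illustration, not something given in the answer. The mean is computed in floating point so that lists of even length are centred between the two middle items.

```go
package main

import (
	"fmt"
	"math"
)

// gaussianWeights returns one integer weight per position 0..n-1 using
// weight[i] = round(maxWeight * exp(-(i-mean)^2 / (2*variance))).
// sdFraction is the standard deviation as a fraction of the list length
// (for example somewhere between 1.0/6 and 1.0/4); it must be positive,
// otherwise all weights are returned as zero.
func gaussianWeights(n int, maxWeight, sdFraction float64) []int {
	weights := make([]int, n)
	if n == 0 || sdFraction <= 0 {
		return weights
	}
	mean := float64(n-1) / 2 // zero-based centre of the list
	sd := sdFraction * float64(n)
	variance := sd * sd
	for i := range weights {
		d := float64(i) - mean
		weights[i] = int(math.Round(maxWeight * math.Exp(-d*d/(2*variance))))
	}
	return weights
}

// appendWeights suffixes each item with its weight, in the "item_weight"
// format used in the question.
func appendWeights(items []string, maxWeight, sdFraction float64) []string {
	out := make([]string, len(items))
	for i, w := range gaussianWeights(len(items), maxWeight, sdFraction) {
		out[i] = fmt.Sprintf("%s_%d", items[i], w)
	}
	return out
}

func main() {
	items := []string{"a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k"}
	fmt.Println(appendWeights(items, 30, 1.0/6))
}
```

Items near the edges can round down to a weight of 0 when `maxWeight` is small or `sdFraction` is narrow. If every item must remain selectable, one option is to clamp the weights to a minimum of 1.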
This answer was accepted by the asker as the best answer.

Question posted 2016-11-04 19:50; 1 answer, accepted.