如何在围棋中生成一个固定长度的随机字符串？

I want a random string of characters only (uppercase or lowercase), no numbers, in Go. What is the fastest and simplest way to do this?

转载于:https://stackoverflow.com/questions/22892120/how-to-generate-a-random-string-of-a-fixed-length-in-go

写回答
好问题 0 提建议
追加酬金
关注问题
分享
邀请回答
编辑收藏删除结题
收藏举报

8条回答默认最新

关注

码龄粉丝数原力等级 --

被采纳

被点赞

采纳率
derek5. 2015-08-05 12:18
关注
Paul's solution provides a simple, general solution.

The question asks for the "the fastest and simplest way". Let's address this. We'll arrive at our final, fastest code in an iterative manner. Benchmarking each iteration can be found at the end of the answer.

All the solutions and the benchmarking code can be found on the Go Playground. The code on the Playground is a test file, not an executable. You have to save it into a file named XX_test.go and run it with go test -bench ..

I. Improvements

1. Genesis (Runes)

As a reminder, the original, general solution we're improving is this:

func init() { rand.Seed(time.Now().UnixNano()) } var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ") func RandStringRunes(n int) string { b := make([]rune, n) for i := range b { b[i] = letterRunes[rand.Intn(len(letterRunes))] } return string(b) }

2. Bytes

If the characters to choose from and assemble the random string contains only the uppercase and lowercase letters of the English alphabet, we can work with bytes only because the English alphabet letters map to bytes 1-to-1 in the UTF-8 encoding (which is how Go stores strings).

So instead of:

var letters = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

we can use:

var letters = []bytes("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

Or even better:

const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

Now this is already a big improvement: we could achieve it to be a const (there are string constants but there are no slice constants). As an extra gain, the expression len(letters) will also be a const! (The expression len(s) is constant if s is a string constant.)

And at what cost? Nothing at all. strings can be indexed which indexes its bytes, perfect, exactly what we want.

Our next destination looks like this:

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" func RandStringBytes(n int) string { b := make([]byte, n) for i := range b { b[i] = letterBytes[rand.Intn(len(letterBytes))] } return string(b) }

3. Remainder

Previous solutions get a random number to designate a random letter by calling rand.Intn() which delegates to Rand.Intn() which delegates to Rand.Int31n().

This is much slower compared to rand.Int63() which produces a random number with 63 random bits.

So we could simply call rand.Int63() and use the remainder after dividing by len(letterBytes):

func RandStringBytesRmndr(n int) string { b := make([]byte, n) for i := range b { b[i] = letterBytes[rand.Int63() % int64(len(letterBytes))] } return string(b) }

This works and is significantly faster, the disadvantage is that the probability of all the letters will not be exactly the same (assuming rand.Int63() produces all 63-bit numbers with equal probability). Although the distortion is extremely small as the number of letters 52 is much-much smaller than 1<<63 - 1, so in practice this is perfectly fine.

^{To make this understand easier: let's say you want a random number in the range of 0..5. Using 3 random bits, this would produce the numbers 0..1 with double probability than from the range 2..5. Using 5 random bits, numbers in range 0..1 would occur with 6/32 probability and numbers in range 2..5 with 5/32 probability which is now closer to the desired. Increasing the number of bits makes this less significant, when reaching 63 bits, it is negligible.}

4. Masking

Building on the previous solution, we can maintain the equal distribution of letters by using only as many of the lowest bits of the random number as many is required to represent the number of letters. So for example if we have 52 letters, it requires 6 bits to represent it: 52 = 110100b. So we will only use the lowest 6 bits of the number returned by rand.Int63(). And to maintain equal distribution of letters, we only "accept" the number if it falls in the range 0..len(letterBytes)-1. If the lowest bits are greater, we discard it and query a new random number.

Note that the chance of the lowest bits to be greater than or equal to len(letterBytes) is less than 0.5 in general (0.25 on average), which means that even if this would be the case, repeating this "rare" case decreases the chance of not finding a good number. After n repetition, the chance that we sill don't have a good index is much less than pow(0.5, n), and this is just an upper estimation. In case of 52 letters the chance that the 6 lowest bits are not good is only (64-52)/64 = 0.19; which means for example that chances to not have a good number after 10 repetition is 1e-8.

So here is the solution:

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" const ( letterIdxBits = 6 // 6 bits to represent a letter index letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits ) func RandStringBytesMask(n int) string { b := make([]byte, n) for i := 0; i < n; { if idx := int(rand.Int63() & letterIdxMask); idx < len(letterBytes) { b[i] = letterBytes[idx] i++ } } return string(b) }

5. Masking Improved

The previous solution only uses the lowest 6 bits of the 63 random bits returned by rand.Int63(). This is a waste as getting the random bits is the slowest part of our algorithm.

If we have 52 letters, that means 6 bits code a letter index. So 63 random bits can designate 63/6 = 10 different letter indices. Let's use all those 10:

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" const ( letterIdxBits = 6 // 6 bits to represent a letter index letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits letterIdxMax = 63 / letterIdxBits // # of letter indices fitting in 63 bits ) func RandStringBytesMaskImpr(n int) string { b := make([]byte, n) // A rand.Int63() generates 63 random bits, enough for letterIdxMax letters! for i, cache, remain := n-1, rand.Int63(), letterIdxMax; i >= 0; { if remain == 0 { cache, remain = rand.Int63(), letterIdxMax } if idx := int(cache & letterIdxMask); idx < len(letterBytes) { b[i] = letterBytes[idx] i-- } cache >>= letterIdxBits remain-- } return string(b) }

6. Source

The Masking Improved is pretty good, not much we can improve on it. We could, but not worth the complexity.

Now let's find something else to improve. The source of random numbers.

There is a crypto/rand package which provides a Read(b []byte) function, so we could use that to get as many bytes with a single call as many we need. This wouldn't help in terms of performance as crypto/rand implements a cryptographically secure pseudorandom number generator so it's much slower.

So let's stick to the math/rand package. The rand.Rand uses a rand.Source as the source of random bits. rand.Source is an interface which specifies a Int63() int64 method: exactly and the only thing we needed and used in our latest solution.

So we don't really need a rand.Rand (either explicit or the global, shared one of the rand package), a rand.Source is perfectly enough for us:

var src = rand.NewSource(time.Now().UnixNano()) func RandStringBytesMaskImprSrc(n int) string { b := make([]byte, n) // A src.Int63() generates 63 random bits, enough for letterIdxMax characters! for i, cache, remain := n-1, src.Int63(), letterIdxMax; i >= 0; { if remain == 0 { cache, remain = src.Int63(), letterIdxMax } if idx := int(cache & letterIdxMask); idx < len(letterBytes) { b[i] = letterBytes[idx] i-- } cache >>= letterIdxBits remain-- } return string(b) }

Also note that this last solution doesn't require you to initialize (seed) the global Rand of the math/rand package as that is not used (and our rand.Source is properly initialized / seeded).

One more thing to note here: package doc of math/rand states:

The default Source is safe for concurrent use by multiple goroutines.

So the default source is slower than a Source that may be obtained by rand.NewSource(), because the default source has to provide safety under concurrent access / use, while rand.NewSource() does not offer this (and thus the Source returned by it is more likely to be faster).

(7. Using rand.Read())

Go 1.7 added a rand.Read() function and a Rand.Read() method. We should be tempted to use these to read as many bytes as we need in one step, in order to achieve better performance.

There is one small "problem" with this: how many bytes do we need? We could say: as many as the number of output letters. We would think this is an upper estimation, as a letter index uses less than 8 bits (1 byte). But at this point we are already doing worse (as getting the random bits is the "hard part"), and we're getting more than needed.

Also note that to maintain equal distribution of all letter indices, there might be some "garbage" random data that we won't be able to use, so we would end up skipping some data, and thus end up short when we go through all the byte slice. We would need to further get more random bytes, "recursively". And now we're even losing the "single call to rand package" advantage...

We could "somewhat" optimize the usage of the random data we acquire from math.Rand(). We may estimate how many bytes (bits) we'll need. 1 letter requires letterIdxBits bits, and we need n letters, so we need n * letterIdxBits / 8.0 bytes rounding up. We can calculate the probability of a random index not being usable (see above), so we could request more that will "more likely" be enough (if it turns out it's not, we repeat the process). We can process the byte slice as a "bit stream" for example, for which we have a nice 3rd party lib: github.com/icza/bitio (disclosure: I'm the author).

But Benchmark code still shows we're not winning. Why is it so?

The answer to the last question is because rand.Read() uses a loop and keeps calling Source.Int63() until it fills the passed slice. Exactly what the RandStringBytesMaskImprSrc() solution does, without the intermediate buffer, and without the added complexity. That's why RandStringBytesMaskImprSrc() remains on the throne. Yes, RandStringBytesMaskImprSrc() uses an unsynchronized rand.Source unlike rand.Read(). But the reasoning still applies; and which is proven if we use Rand.Read() instead of rand.Read() (the former is also unsynchronzed).

II. Benchmark

All right, let's benchmark the different solutions.

BenchmarkRunes 1000000 1703 ns/op BenchmarkBytes 1000000 1328 ns/op BenchmarkBytesRmndr 1000000 1012 ns/op BenchmarkBytesMask 1000000 1214 ns/op BenchmarkBytesMaskImpr 5000000 395 ns/op BenchmarkBytesMaskImprSrc 5000000 303 ns/op

Just by switching from runes to bytes, we immediately have 22% performance gain.

Getting rid of rand.Intn() and using rand.Int63() instead gives another 24% boost.

Masking (and repeating in case of big indices) slows down a little (due to repetition calls): -20%...

But when we make use of all (or most) of the 63 random bits (10 indices from one rand.Int63() call): that speeds up 3.4 times.

And finally if we settle with a (non-default, new) rand.Source instead of rand.Rand, we again gain 23%.

Comparing the final to the initial solution: RandStringBytesMaskImprSrc() is 5.6 times faster than RandStringRunes().
本回答被题主选为最佳回答 , 对您是否有帮助呢?

解决无用
评论打赏
分享
举报

评论

按下Enter换行，Ctrl+Enter发表内容

查看更多回答(7条)

报告相同问题？

关注问题

如何在围棋中生成一个固定长度的随机字符串？ golang
2014-04-06 09:22

回答 8 已采纳 Paul's solution provides a simple, general solution. The question asks for the "the fastest and s
如何在官方的“围棋之旅”中播种随机数生成器？
2017-07-23 19:15

回答 2 已采纳 By default rand.Intn uses the globalRand.Intn. Its created internally, refer here. So when you set
C语言可以将特殊字符存入数组中吗？ c语言有问必答
2021-12-22 14:24

回答 2 已采纳你这 ╋ 不是 +， ╋ 用 utf-8 编码占三个字符，而 char 类型只能储存一个字符大小
TensorFlow Serving工具的使用方法，以及如何将模型部署到生产环境 ? How to Deploy Machine Learning Models in Production
2023-08-10 09:10

禅与计算机程序设计艺术的博客在实际工作中，我们需要部署机器学习模型，作为机器学习应用的一部分。在开发阶段，我们已经训练好了模型并对其进行了评估。但是如何将其部署到生产环境中，让它可以被实际使用的呢？下面我们将介绍TensorFlow ...
eclips语言中五子棋的弃手功能怎么编程？？？
2016-01-15 11:15

回答 3 已采纳禁手？ http://jingyan.baidu.com/article/92255446892067851648f40d.html
如何检查一个文件是否存在在围棋？ golang
2012-09-20 18:42

回答 7 已采纳 To check if a file doesn't exist, equivalent to Python's if not os.path.exists(filename): if _, e
Go编程语言规范
2021-03-12 14:09

小朱和泡芙的博客黑人的命也是命。支持平等司法倡议。...介绍符号源代码表示人物字母和数字词汇要素评论代币分号身份标识关键字词运算符和标点符号整数文字浮点文字虚构文字符文文字字符串文字常数变数种类方法集布尔类型数值类型
程序设计语言—Python&几种语言区别的总结
2024-04-22 20:50

张颖..的博客程序设计语言篇——Python，还有几种编程语言基础部分的一些对比。
算法编程
2022-10-17 15:57

BTDAER的博客 //求解大于str1的字符串个数以及大于str2的字符串个数，然后两者相减就能得到处于str1和str2之间的字符串个数 //对于求解长度len在[len1,len2]之间，且字典序大于某个字符串(str)的字符串个数： //顺序遍历(i=0:n-1...
Python全栈视频教程-尹成-专题视频课程
2017-07-18 08:41

尹成的博客 Python全栈视频培训教程，帮助开发者从零到一入门python：1.深入理解python语法机制与底层原理，2.深入实战python各种案例， 3.语音识别，图形界面，多线程爬虫，语音合成，游戏控制等等精彩案例。...
【一千个论文合集】计算机科学的26个细分领域近年必读论文集合
2022-05-11 15:17

AMiner学术搜索和科技情报挖掘的博客 AMiner必读论文是一个可以帮您了解某个领域、机构、期刊、会议的学术专辑，包括必读论文和代表学者，由AI初筛+学者复核后提供给您，您可以收藏为自己的论文清单
P vs. NP 五十年：AI正在解决不可解问题
2022-01-01 17:19

人工智能学家的博客来源：AI科技评论作者：Lance Fortnow编译：Don编辑：青暮P和NP问题一直是计算机领域的老大难问题，那么在近50年间，人们对这个问题有什么深入的研究呢？让我们在本文中深挖这个...
深度学习崛起十年：“开挂”的OpenAI革新者
2022-03-18 11:02

智源社区的博客来源｜The Robot Brains Podcast编译｜OneFlow社区AlexNet，AlphaGo，GPT、CLIP、DALL-E和Codex，这些在AI界轰动一时的杰作，对于任...
没有解决我的问题, 去提问

悬赏问题

¥15 装 pytorch 的时候出了好多问题，遇到这种情况怎么处理？
¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
¥15 手机接入宽带网线，如何释放宽带全部速度
¥30 关于#r语言#的问题：如何对R语言中mfgarch包中构建的garch-midas模型进行样本内长期波动率预测和样本外长期波动率预测
¥15 ETLCloud 处理json多层级问题
¥15 matlab中使用gurobi时报错
¥15 这个主板怎么能扩出一两个sata口
¥15 不是，这到底错哪儿了😭
¥15 2020长安杯与连接网探
¥15 关于#matlab#的问题：在模糊控制器中选出线路信息，在simulink中根据线路信息生成速度时间目标曲线（初速度为20m/s，15秒后减为0的速度时间图像）我想问线路信息是什么

如何在围棋中生成一个固定长度的随机字符串？

8条回答 默认 最新

I. Improvements

1. Genesis (Runes)

2. Bytes

3. Remainder

4. Masking

5. Masking Improved

6. Source

(7. Using rand.Read())

II. Benchmark

悬赏问题

8条回答默认最新

(7. Using `rand.Read()`)