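How do I randomly generate a string of uppercase letters and digits in Python?

I want to generate a string of size n.

It should be made up of digits and uppercase English letters, for example:

  • 6U1S75
  • 4Z4UKK
  • U911K4

How can I achieve this in a Pythonic way?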

Reposted from: https://stackoverflow.com/questions/2257441/random-string-generation-with-upper-case-letters-and-digits-in-python

csdnceshi59
ℙℕℤℝ Shouldn't this be renamed "Cryptographically secure random string generation..."?
about 2 years ago
weixin_41568174
from.. You would be interested in this answer then.
about 5 years ago
csdnceshi69
YaoRaoLov And while it might seem silly to think of the world's population, that's just because you want a huge buffer for potential collisions. See the birthday problem: en.wikipedia.org/wiki/Birthday_problem
over 5 years ago
csdnceshi69
YaoRaoLov It's easy to calculate the number of possible combinations. 10 numbers + 26 letters = 36 possible characters, to the power of 6 (length of string) is equal to about two billion. My rule of thumb for random values is "if I generated values for every human on Earth, how many values could they have each?". In this case that would be less than one value per person, so if this is to identify users or objects, it's too few characters. One alternative would be to add in lower case letters, which lands you at 62^6 = almost 57 billion unique values.
over 5 years ago
weixin_41568131
10.24 This is a very popular question. I wish an expert would add his take on the uniqueness of these random numbers for the top 3 answers i.e. the collision probability for range of string size, say from 6 to 16.
over 6 years ago
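
The combination counts and the birthday-problem concern raised in the comments above can be checked with a short calculation. This is a rough sketch: keyspace and collision_probability are illustrative names that do not appear in the thread, and the probability uses the standard birthday approximation 1 - exp(-n(n-1)/(2N)), where N is the number of possible strings.

```python
import math


def keyspace(alphabet_size, length):
    """Number of distinct strings of the given length."""
    return alphabet_size ** length


def collision_probability(n_ids, alphabet_size, length):
    """Approximate chance that at least two of n_ids random strings match."""
    space = keyspace(alphabet_size, length)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


print(keyspace(36, 6))  # 2176782336
print(keyspace(62, 6))  # 56800235584

# Chance of at least one collision among one million IDs, by string size
for length in (6, 8, 12, 16):
    print(length, collision_probability(1_000_000, 36, length))
```

Under these assumptions, six characters are far too few for a million identifiers, while sixteen make a collision very unlikely.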

27 answers

Answer in one line:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

or even shorter starting with Python 3.6 using random.choices():

''.join(random.choices(string.ascii_uppercase + string.digits, k=N))

A cryptographically more secure version; see https://stackoverflow.com/a/23728630/2213647:

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

In details, with a clean function for further reuse:

>>> import string
>>> import random
>>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
...    return ''.join(random.choice(chars) for _ in range(size))
...
>>> id_generator()
'G5G74W'
>>> id_generator(3, "6793YUIO")
'Y3U'

How does it work?

We import string, a module that contains sequences of common ASCII characters, and random, a module that deals with random generation.

string.ascii_uppercase + string.digits just concatenates the two strings that hold the uppercase ASCII letters and the digits:

>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
>>> string.ascii_uppercase + string.digits
'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

Then we use a list comprehension to create a list of 'n' elements:

>>> list(range(4)) # range yields 'n' numbers; list() collects them
[0, 1, 2, 3]
>>> ['elem' for _ in range(4)] # we use range to create 4 times 'elem'
['elem', 'elem', 'elem', 'elem']

In the example above, we use square brackets to create the list, but we don't in the id_generator function. There, the bare expression inside ''.join(...) is a generator expression, so Python doesn't create the list in memory but generates the elements on the fly, one by one (more about this here).
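
The difference between the two forms can be seen directly. This is a small illustration only; as_list and as_gen are made-up names.

```python
import types

# Square brackets: a list comprehension builds the whole list at once.
as_list = ['elem' for _ in range(4)]

# Parentheses (or a bare expression as the only argument of a call):
# a generator expression produces its items one by one, on demand.
as_gen = ('elem' for _ in range(4))

print(type(as_list).__name__)                   # list
print(isinstance(as_gen, types.GeneratorType))  # True

# str.join accepts either form; joining consumes the generator.
print(''.join(as_gen))  # elemelemelemelem
```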

Instead of asking to create 'n' times the string elem, we will ask Python to create 'n' times a random character, picked from a sequence of characters:

>>> random.choice("abcde")
'a'
>>> random.choice("abcde")
'd'
>>> random.choice("abcde")
'b'

Therefore random.choice(chars) for _ in range(size) really creates a sequence of size characters, each picked randomly from chars:

>>> [random.choice('abcde') for _ in range(3)]
['a', 'b', 'b']
>>> [random.choice('abcde') for _ in range(3)]
['e', 'b', 'e']
>>> [random.choice('abcde') for _ in range(3)]
['d', 'a', 'c']

Then we just join them with an empty string so the sequence becomes a string:

>>> ''.join(['a', 'b', 'b'])
'abb'
>>> [random.choice('abcde') for _ in range(3)]
['d', 'c', 'b']
>>> ''.join(random.choice('abcde') for _ in range(3))
'dac'
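
Putting the pieces together, a self-contained version for Python 3.6+ might look like the sketch below. Both imports are needed (leaving out import string raises a NameError), and ALPHABET is just an illustrative constant so that the character set is built only once.

```python
import random
import string

# Built once at import time instead of on every call
ALPHABET = string.ascii_uppercase + string.digits


def id_generator(size=6, chars=ALPHABET):
    """Return a random string of `size` characters drawn from `chars`."""
    # random.choices (Python 3.6+) samples with replacement, so repeats are allowed
    return ''.join(random.choices(chars, k=size))


print(id_generator())               # six random characters, e.g. G5G74W
print(id_generator(3, "6793YUIO"))  # three characters from the custom set
```

Like random.choice, this relies on the default pseudo-random generator, so it is not meant for passwords or tokens.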
csdnceshi59
ℙℕℤℝ If anyone is curious, this can yield 36^6=2 176 782 336 different combinations of letters/numbers.
over 2 years ago
csdnceshi64
游.程 You'll get it in every version if you don't import it...
over 2 years ago
csdnceshi61
derek5. I get name 'string' is not defined in python 3.6
over 2 years ago
csdnceshi71
Memor.の just wondering why all the examples are using string.ascii_uppercase. Is there any difference if string.uppercase is used? According to source code ascii_uppercase = uppercase. So I think there shouldn't be any difference.
over 2 years ago
csdnceshi56
lrony* You really want to bind those global objects to locals, avoiding mapping and attribute lookups.
almost 3 years ago
weixin_41568196
撒拉嘿哟木头 Every year I keep stumbling to this answer and being amazed by Python
almost 4 years ago
csdnceshi64
游.程 Correct, but that was never a requirement in the first place. And even so, the attacker won't necessarily know that; there are still 62**N possible passwords even if every single character in an arbitrary password is a numeral.
over 5 years ago
csdnceshi52
妄徒之命 Looks nice, but does not ensure the string will contain letters and digits. If it's used with a range lower than 12, there are many cases where the output contains only letters.
over 5 years ago
weixin_41568184
叼花硬汉 random.sample creates samples without replacement, in other words, without the possibility to repeat characters, which isn't in the OP's requirements. I don't think that would be desirable for most applications.
about 6 years ago
csdnceshi76
斗士狗 Very useful. Interestingly, Django is using this piece of code for generating passwords & CSRF tokens. Although you should replace random with random.SystemRandom() : github.com/django/django/blob/…
over 6 years ago
csdnceshi64
游.程 I don't know the math off-hand, but it is out there somewhere. Maybe try asking on Math.SE.
over 6 years ago
csdnceshi76
斗士狗 Is this implementation suitable for generating unique sequences? What's the probability of collision with 6 characters?
over 6 years ago
csdnceshi78
程序go I would use random.sample instead of random.choice
over 6 years ago
csdnceshi64
游.程 That wouldn't have a huge effect on the algorithm for sane values of N.
over 7 years ago
csdnceshi50
三生石@ I'd replace range with xrange.
over 7 years ago
csdnceshi64
游.程 en.wikipedia.org/wiki/Gettext#Operation
almost 8 years ago
csdnceshi54
hurriedly% how does I18n relate to using _ as a dummy variable ?
almost 8 years ago
csdnceshi70
笑故挽风 Great code - I also remove letters and numbers that can appear similar so that users can type them in without getting confused. E.g., I might remove the chars '0O1L'. Of course, this depends on what font you are rendering with.
about 8 years ago
csdnceshi64
游.程 Except when I18N matters.
over 8 years ago
csdnceshi79
python小菜 nice function. simple and effective. Would suggest 1 minor change, which is to use '_' instead of 'x' in the list comp. since _ is generally an idiom for throw away dummy var in python.
over 8 years ago
csdnceshi64
游.程 "... has a period of 2**19937-1."
over 8 years ago
weixin_41568127
?yb? How soon is the string going to repeat this way? Is random.choice strong enough?
over 8 years ago
csdnceshi69
YaoRaoLov yep, that's what I meant.
over 8 years ago
weixin_41568134
MAO-EYE Do you mean SystemRandom? If not, please supply URL.
over 8 years ago
csdnceshi69
YaoRaoLov If you want something a little bit more randomly secure, for your consideration: import random ; random = random.SecureRandom()
almost 9 years ago
csdnceshi55
~Onlooker I added a quick note about this in the answer, and a link to a more detailed answer about iterable, list comprehension, generators and eventually the yield keyword.
almost 9 years ago
csdnceshi55
~Onlooker I edited the answer so you'll have a detailed explanation of how this stuff works.
almost 9 years ago
csdnceshi64
游.程 It's not a list comprehension; it's a generator expression.
over 9 years ago
csdnceshi65
larry*wei could you comment on why the list comprehension in this case does not require surrounding square brackets?
over 9 years ago
csdnceshi62
csdnceshi62 How does this work??? I am new to Python and love its extreme high level-ness but this just blew my mind. Is there anywhere where I can read documentation on this?
almost 10 years ago
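
Two variations come up in the comments above: leaving out look-alike characters, and guaranteeing that both a letter and a digit appear. The sketch below shows one hedged way to do each. readable_id, mixed_id and the AMBIGUOUS set are hypothetical names, and the set '0O1L' is simply the one suggested in the comment. Forcing a letter and a digit slightly reduces the number of possible strings, and it was never part of the original requirement.

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits
AMBIGUOUS = set('0O1L')  # assumption: taken from the comment, adjust per font
READABLE = ''.join(c for c in ALPHABET if c not in AMBIGUOUS)


def readable_id(size=6):
    """Random string that avoids characters which are easy to confuse."""
    return ''.join(random.choice(READABLE) for _ in range(size))


def mixed_id(size=6):
    """Random string containing at least one letter and at least one digit."""
    if size < 2:
        raise ValueError("size must be at least 2")
    # One guaranteed letter, one guaranteed digit, the rest unrestricted
    picked = [random.choice(string.ascii_uppercase), random.choice(string.digits)]
    picked += [random.choice(ALPHABET) for _ in range(size - 2)]
    random.shuffle(picked)  # so the guaranteed characters are not always first
    return ''.join(picked)


print(readable_id())
print(mixed_id())
```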

Simply use Python's builtin uuid:

If UUIDs are okay for your purposes, use the built-in uuid package.

One Line Solution:

import uuid; uuid.uuid4().hex.upper()[0:6]

In Depth Version:

Example:

import uuid
uuid.uuid4() #uuid4 => full random uuid
# Outputs something like: UUID('0172fc9a-1dac-4414-b88d-6b9a6feb91ea')

If you need exactly your format (for example, "6U1S75"), you can do it like this:

import uuid

def my_random_string(string_length=10):
    """Returns a random string of length string_length."""
    random = str(uuid.uuid4()) # Convert UUID format to a Python string.
    random = random.upper() # Make all characters uppercase.
    random = random.replace("-","") # Remove the UUID '-'.
    return random[0:string_length] # Return the random string.

print(my_random_string(6)) # For example, D9E50C
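
One thing to keep in mind with this approach: the hex form of a UUID only ever contains the 16 characters 0-9 and A-F, so a 6-character slice has 16**6 possible values rather than 36**6. The sketch below checks this; uuid_fragment and HEX_UPPER are illustrative names.

```python
import uuid

HEX_UPPER = set('0123456789ABCDEF')


def uuid_fragment(length=6):
    """First `length` characters of a random UUID, uppercased."""
    # .hex is the 32-character form without hyphens
    return uuid.uuid4().hex[:length].upper()


sample = uuid_fragment()
print(sample)
print(set(sample) <= HEX_UPPER)  # True
print(16 ** 6, 36 ** 6)          # 16777216 2176782336

# In a version-4 UUID the 13th hex digit is always the version number 4,
# so slices longer than 12 characters are not fully random.
print(uuid.uuid4().hex[12])      # 4
```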
csdnceshi77
狐狸.fox uuid.uuid4().hex[:string_length].upper() is another way to do the one line solution. Works in both Python 2 and 3.
about 2 years ago
csdnceshi66
必承其重 | 欲带皇冠 Why limit yourself to just hex characters. Base64 or Base32 (for only uppercase characters and 6 different digits) to encode a random os.urandom() bytes sequence. Bypass the uuid middleman for more speed!
almost 3 years ago
weixin_41568131
10.24 Python 3: str(uuid.uuid4()).upper()[:6] (due to 'UUID' object has no attribute 'get_hex' error)
almost 3 years ago
csdnceshi73
喵-见缝插针 This should be the accepted answer to me
over 3 years ago
csdnceshi50
三生石@ uuid4, at least on my machine, returns only hexadecimal characters. If you want to convert to full alphabet string, you can, but you have to do trickery to get anything beyond the letters [a-f]. Also, strictly speaking, this does not answer the original question, which implied full alphabetical requirements.
over 5 years ago
csdnceshi56
lrony* Is it a good idea to truncate a UUID? Depending on how small string_length is, the probability of collision can be a concern.
over 6 years ago
csdnceshi67
bug^君 If you want to skip the string casting & hyphen replacing, you can just call my_uuid.get_hex() or uuid.uuid4().get_hex() and it will return a string generated from the uuid that does not have hyphens.
over 6 years ago
csdnceshi70
笑故挽风 uuid1: Generate a UUID from a host ID, sequence number, and the current time. uuid4: Generate a random UUID.
over 6 years ago
weixin_41568208
北城已荒凉 If I do uuid1 three times in a row I get: d161fd16-ab0f-11e3-9314-00259073e4a8, d3535b56-ab0f-11e3-9314-00259073e4a8, d413be32-ab0f-11e3-9314-00259073e4a8, which all seem to be suspiciously similar (the first 8 chars differ and the rest are the same). This isn't the case with uuid4
over 6 years ago
csdnceshi58
Didn"t forge +1 For thinking behind the question. Perhaps you could briefly explain the difference between uuid1 and uuid4.
over 6 years ago

This Stack Overflow question is the current top Google result for "random string Python". The current top answer is:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

This is an excellent method, but the PRNG in random is not cryptographically secure. I assume many people researching this question will want to generate random strings for encryption or passwords. You can do this securely by making a small change in the above code:

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

Using random.SystemRandom() instead of just random uses /dev/urandom on *nix machines and CryptGenRandom() in Windows. These are cryptographically secure PRNGs. Using random.choice instead of random.SystemRandom().choice in an application that requires a secure PRNG could be potentially devastating, and given the popularity of this question, I bet that mistake has been made many times already.

If you're using Python 3.6 or above, you can use the new secrets module.

''.join(secrets.choice(string.ascii_uppercase + string.digits) for _ in range(N))

The module docs also discuss convenient ways to generate secure tokens and best practices.
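
A self-contained sketch of the secrets variant, with the imports it needs. secure_id is an illustrative name; secrets.token_hex is shown as one of the token helpers the module provides, though it only yields hexadecimal characters.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def secure_id(size=6, chars=ALPHABET):
    """Random string built with the secrets module (Python 3.6+)."""
    return ''.join(secrets.choice(chars) for _ in range(size))


print(secure_id())
# token_hex(3) encodes 3 random bytes as 6 hex characters
print(secrets.token_hex(3).upper())
```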

weixin_41568134
MAO-EYE no. It may return "AAA000", which is a random string, and next "AAA000", which is also a random string. You must explicitly add a check for uniqueness.
over 2 years ago
weixin_41568110
七度&光 Will the random string always be unique? I wanted to use it as a primary key.
about 4 years ago
csdnceshi58
Didn"t forge xrange is not really a small note (unless everyone here is using Python 3=P). I'm really confused as to why most of these examples ignore performance big time. Another thing would be not to concatenate strings in a loop, it makes a huge difference. This code gets roughly (not very scientifically tested with time on a laptop) 20% faster with xrange and doing alphabet = string.ascii_uppercase + string.digits only once.
about 4 years ago
csdnceshi57
perhaps? small note - better to use xrange instead of range as the latter generates an in-memory list, while the former creates an iterator.
about 4 years ago
weixin_41568131
10.24 Great answer. Small note: You changed it to string.uppercase which can lead to unexpected results depending on the locale set. Using string.ascii_uppercase (or string.ascii_letters + string.digits for base62 instead of base36) is safer in cases where encoding is involved.
over 5 years ago
csdnceshi77
狐狸.fox Yes, the official standard library documentation for random warns about this: "Warning: The pseudo-random generators of this module should not be used for security purposes. Use os.urandom() or SystemRandom if you require a cryptographically secure pseudo-random number generator." Here is the ref: random.SystemRandom and os.urandom
over 5 years ago
csdnceshi53
Lotus@ I added it to the accepted answer.
almost 6 years ago
csdnceshi67
bug^君 This deserves to be higher up or incorporated in the accepted answer. Subtle but important difference.
over 6 years ago
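
As one of the comments above points out, random strings are not guaranteed to be unique, so a use such as a primary key needs an explicit check. A minimal in-memory sketch follows; unique_ids is a hypothetical helper, and a real application would more likely rely on a uniqueness constraint in its data store.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def unique_ids(count, size=6, chars=ALPHABET):
    """Return a set of `count` distinct random strings."""
    if count > len(set(chars)) ** size:
        raise ValueError("not enough distinct strings of this size")
    seen = set()
    # Adding a duplicate leaves the set unchanged, so just keep drawing
    while len(seen) < count:
        seen.add(''.join(secrets.choice(chars) for _ in range(size)))
    return seen


ids = unique_ids(100)
print(len(ids))  # 100
```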

Based on another Stack Overflow answer, Most lightweight way to create a random string and a random hexadecimal number, a better version than the accepted answer would be:

('%06x' % random.randrange(16**6)).upper()

This is much faster, although the result only contains the hexadecimal characters 0-9 and A-F.
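
One hedged way to parametrize the length is a nested format field for the width. random_hex is an illustrative name, not part of the answer.

```python
import random


def random_hex(n=6):
    """Return n random uppercase hexadecimal characters."""
    # {width} is substituted first, giving a spec such as 06X:
    # zero-padded, n characters wide, uppercase hex
    return '{:0{width}X}'.format(random.randrange(16 ** n), width=n)


print(random_hex())    # six hex characters
print(random_hex(10))  # ten hex characters
```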

csdnceshi70
笑故挽风 This is nice, though it will only use 'A-F' and not 'A-Z'. Also, the code gets a little less nice when you parametrize N.
over 6 years ago

This is a take on Anurag Uniyal's response and something that I was working on myself.

import random
import string

# Candidate characters: uppercase letters, digits and punctuation
chars = string.ascii_uppercase + string.digits + string.punctuation

key_count = int(input('How many 12-character keys do you want? '))

with open('Numbers.txt', 'w') as one_file:
    for _ in range(key_count):
        number = random.randint(1, 999)
        # chars * 6 lets a character appear up to six times in one key
        key = ''.join(random.sample(chars * 6, 12))
        one_file.write(str(number) + ": " + key + "\n")

import random

length = 127
# Allowed characters; remove any that you do not need
chars = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','0','1','2','3','4','5','6','7','8','9']
# Print one random string of the requested length
print(''.join(random.choice(chars) for _ in range(length)))

Here the length of the string can be changed through the length variable. It is a simple algorithm that is easy to understand. It uses a list, so you can discard characters that you do not need.

A simpler, faster but slightly less random way is to use random.sample instead of choosing each letter separately. If up to n repetitions should be allowed, enlarge the pool by repeating the character set n times, e.g.

import random
import string

char_set = string.ascii_uppercase + string.digits
print(''.join(random.sample(char_set*6, 6)))

Note: random.sample prevents character reuse. Multiplying the size of the character set makes repetitions possible, but they are still less likely than they are in a pure random choice. If we go for a string of length 6 and pick 'X' as the first character, then in the choice example the odds of getting 'X' for the second character are the same as the odds of getting 'X' as the first character. In the random.sample implementation with char_set*6, only 5 of the remaining 215 characters are 'X', so the odds of getting 'X' as the second character are 36/43 (about 84%) of the chance of getting it as the first character.
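
The size of this bias can be worked out exactly with fractions. The sketch below assumes the pool char_set*6 from the example (36 characters, 6 copies each); all variable names are illustrative.

```python
from fractions import Fraction

pool_size = 36   # distinct characters in char_set
copies = 6       # char_set * 6
length = 6       # characters drawn
total = pool_size * copies  # 216 items in the enlarged pool

# Chance that the first draw is 'X'
p_first = Fraction(copies, total)
# Chance that the second draw is 'X', given the first one was
p_second_given_first = Fraction(copies - 1, total - 1)
ratio = p_second_given_first / p_first

print(p_first)               # 1/36
print(p_second_given_first)  # 1/43
print(ratio)                 # 36/43

# Chance of drawing one specific character six times in a row
p_all_same_sample = Fraction(1)
for i in range(length):
    p_all_same_sample *= Fraction(copies - i, total - i)
p_all_same_choice = Fraction(1, pool_size) ** length

# How many times less likely 'XXXXXX' is with sample than with choice
print(float(p_all_same_choice / p_all_same_sample))
```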

weixin_41568184
叼花硬汉 I thought about it this way: without loss of generality, after picking arbitrary character 'G' from the 36 ^ 6 characters available, the 'G' from one of the 6 sets of 36 characters has been exhausted, leaving only 5/6 'G's available to be selected. Subsequently, 4/6, 3/6, 2/6 and 1/6 'G's are left, hence the sequence I posted. You're right, though, the denominator is changing. The sequence is actually (6/(36^6)) * (5/(36^6 - 1)) * (4/(36^6 - 2)) * (3/(36^6 - 3)) * (2/(36^6 - 4)) * (1/(36^6 - 5)) < 6!/((36^6)^6) = 20/(36^35) < 1/(36^6) of the random choice.
about 7 years ago
csdnceshi64
游.程 shouldn't next char probability be 6/7 of first as first=6/36 second=5/35
about 7 years ago
csdnceshi64
游.程 yes I agree this fact should be considered while choosing this solution, I will add it to answer, thanks.
about 7 years ago
weixin_41568184
叼花硬汉 The chance of getting a particular character repeated drops off as you move through the generated string. Generating a string of 6 characters from the 26 uppercase letters plus 10 digits, randomly choosing each character independently, any particular string occurs with frequency 1/(36^6). The chance of generating 'FU3WYE' and 'XXXXXX' is the same. In the sample implementation, the chance of generating 'XXXXXX' is (1/(36^6)) * ((6/6) * (5/6) * (4/6) * (3/6) * (2/6) * (1/6)) due to the non-replacement feature of random.sample. 'XXXXXX' is 324 times less likely in the sample implementation.
about 7 years ago
weixin_41568184
叼花硬汉 If random.sample prevents character reuse, multiplying the size of the character set makes multiple repetitions possible, but they are still less likely than they are in a pure random choice. If we go for a string of length 6, and we pick 'X' as the first character, in the choice example, the odds of getting 'X' for the second character are the same as the odds of getting 'X' as the first character. In the random.sample implementation, the odds of getting 'X' as any subsequent character are only 5/6 the chance of getting it as the first character.
about 7 years ago
csdnceshi64
游.程 thanks, now that makes this the correct and better answer :)
over 7 years ago
weixin_41568208
北城已荒凉 ''.join(random.sample(char_set*6,6)) solves the problem.
over 7 years ago
csdnceshi72
谁还没个明天 One of the examples has a repeat, so I doubt he is looking to disallow repeats.
over 10 years ago
csdnceshi64
游.程 for the given use case(if no repeat is ok) i will say it is still the best solution.
over 10 years ago
csdnceshi73
喵-见缝插针 This way isn't bad but it's not quite as random as selecting each character separately, as with sample you'll never get the same character listed twice. Also of course it'll fail for N higher than 36.
over 10 years ago

Taking the answer from Ignacio, this works with Python 2.6:

import random
import string

N=6
print(''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N)))

Example output:

JQUBT2

csdnceshi74
7*4 This is true and is what the OP implicitly requires. I also tested it just now and got: L1KBO7, BLSDEB, 3RQB59, PFOO20, so anything is possible.
over 3 years ago
csdnceshi56
lrony* since we are generating random string with string.ascii_uppercase + string.digits ~ ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789, there will be a possibility that random.choice may pick only letters, I tested it and got NZAUDKM, WBZMKFH,ZUJHHTU
over 3 years ago

I thought no one had answered this yet lol! But hey, here's my own go at it:

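import random
from functools import reduce  # reduce is no longer a builtin in Python 3

def random_alphanumeric(limit):
    # ASCII codes of all alphanumeric characters: 0-9, A-Z, a-z
    r = list(range(48, 58)) + list(range(65, 91)) + list(range(97, 123))
    random.shuffle(r)
    # Each character appears at most once; limit caps the random length
    return reduce(lambda i, s: i + chr(s), r[:random.randint(0, min(limit, len(r)))], "")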
csdnceshi63
elliott.david true my solution seems a bit overkill for the task, but I was aware of the other simpler solutions, and just wished to find an alternative route to a good answer. Without freedom, creativity is in danger, thus I went ahead and posted it.
almost 8 years ago
csdnceshi72
谁还没个明天 I won't vote this down, but I think it's far too complicated for such a simple task. The return expression is a monster. Simple is better than complex.
almost 8 years ago

For those of you who enjoy functional python:

from itertools import starmap, islice, repeat
from functools import partial
from string import ascii_letters, digits
from random import choice

join_chars = ''.join
identity = lambda o: o

def irand_seqs(symbols=join_chars((ascii_letters, digits)), length=6, join=join_chars, select=choice, breakup=islice):
    """ Generates an indefinite sequence of joined random symbols, each of a specific length
    :param symbols: symbols to select from,
        [defaults to string.ascii_letters + string.digits: digits 0 - 9, lower and upper case English letters.]
    :param length: the length of each sequence,
        [defaults to 6]
    :param join: method used to join the selected symbols,
        [defaults to ''.join, generating a string.]
    :param select: method used to select a random element from the given population,
        [defaults to random.choice, which selects a single element randomly]
    :param breakup: method used to cut the stream of symbols into chunks,
        [defaults to itertools.islice]
    :return: indefinite iterator generating random sequences of the given [:param length]
    >>> from tools import irand_seqs
    >>> strings = irand_seqs()
    >>> a = next(strings)
    >>> assert isinstance(a, str)
    >>> assert len(a) == 6
    >>> assert next(strings) != next(strings)
    """
    return map(join, starmap(breakup, repeat((map(select, repeat(symbols)), None, length))))

It generates an indefinite (infinite) iterator of joined random sequences. It first generates an indefinite sequence of randomly selected symbols from the given pool, then breaks this sequence into parts of the given length, and finally joins each part. It should work with any sequence that supports __getitem__. By default it simply generates random sequences of alphanumeric characters, though you can easily modify it to generate other things:

For example, to generate random tuples of digits:

>>> irand_tuples = irand_seqs(range(10), join=tuple)
>>> next(irand_tuples)
(0, 5, 5, 7, 2, 8)
>>> next(irand_tuples)
(3, 2, 2, 0, 3, 1)

If you don't want to use next for generation, you can simply make it callable:

>>> irand_tuples = irand_seqs(range(10), join=tuple)
>>> make_rand_tuples = partial(next, irand_tuples) 
>>> make_rand_tuples()
(1, 6, 2, 8, 1, 9)

If you want to generate the sequence on the fly, simply set join to identity.

>>> irand_tuples = irand_seqs(range(10), join=identity)
>>> selections = next(irand_tuples)
>>> next(selections)
8
>>> list(selections)
[6, 3, 8, 2, 2]

As others have mentioned if you need more security then set the appropriate select function:

>>> from random import SystemRandom
>>> rand_strs = irand_seqs(select=SystemRandom().choice)
>>> next(rand_strs)
'QsaDxQ'

The default selector is choice, which may select the same symbol multiple times in each chunk. If you instead want each member selected at most once per chunk, one possible usage is:

>>> from random import sample
>>> irand_samples = irand_seqs(range(10), length=1, join=next, select=lambda pool: sample(pool, 6))
>>> next(irand_samples)
[0, 9, 2, 3, 1, 6]

We use sample as our selector to do the complete selection, so the chunks are actually of length 1, and to join we simply call next, which fetches the next completely generated chunk. Granted, this example seems a bit cumbersome, and it is ...
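
For comparison, the same idea of an endless stream of random strings can be written as a plain generator function. This is a simplified sketch rather than a drop-in replacement for irand_seqs: irand_strings is a hypothetical name, and it only supports the string case.

```python
import random
import string
from itertools import islice


def irand_strings(symbols=string.ascii_letters + string.digits,
                  length=6, select=random.choice):
    """Yield random strings of `length` characters forever."""
    while True:
        yield ''.join(select(symbols) for _ in range(length))


strings = irand_strings()
print(next(strings))              # one random 6-character string
print(list(islice(strings, 3)))   # the next three strings

# A different selector can be passed in, as in the answer above
secure = irand_strings(select=random.SystemRandom().choice)
print(next(secure))
```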
