Let's suppose your public ID is any 6-character string drawn from a 64-character alphabet (e.g. the 26 letters of the English alphabet, in upper and lower case; the 10 decimal digits of the Hindu-Arabic numeral system; and two further characters, e.g. + and /).
The number of possible IDs that can be expressed in this way is 64^6, or almost 69 billion. To put it in context, that's almost 10 IDs for every person on the planet; or one ID every second until the year 4191 CE. I imagine that you won't exhaust the possibilities before your application is retired or replaced.
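The arithmetic behind those figures can be checked with a few lines of Python. This is only a sanity check; the constant names are illustrative and the year length is assumed to be the mean Gregorian year.

```python
ALPHABET_SIZE = 64  # characters available per position
ID_LENGTH = 6       # characters per public ID

# Total number of distinct 6-character strings over a 64-character alphabet.
keyspace = ALPHABET_SIZE ** ID_LENGTH

# Assumption: a mean Gregorian year of 365.2425 days.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60
years_at_one_per_second = keyspace / SECONDS_PER_YEAR

print(f"{keyspace:,} possible IDs")
print(f"roughly {years_at_one_per_second:.0f} years at one ID per second")
```

The keyspace comes out at 68,719,476,736, which would last a little under 2,180 years at one ID per second.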
So, how can one obtain such a 6-character string? It's worth noting that Base-64 encoding a 4-byte value, less any padding, will yield 6 characters (although it will only yield 2^32 ≈ 4.3 billion possible values, as the final character will be drawn from only 4 of the 64 possibilities).
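To see why a 4-byte value gives exactly 6 characters, and why the last one is so restricted, here is a small Python sketch using the standard base64 and struct modules. The helper name encode_u32 is made up for this illustration.

```python
import base64
import struct


def encode_u32(value: int) -> str:
    """Base-64 encode an unsigned 32-bit integer and drop the '=' padding."""
    raw = struct.pack(">I", value)  # 4 bytes, big-endian
    return base64.b64encode(raw).decode("ascii").rstrip("=")


# Every 4-byte value encodes to 8 characters, two of which are '=' padding.
lengths = {len(encode_u32(v)) for v in (0, 1, 255, 65536, 2**32 - 1)}

# The final character carries only the 2 lowest bits of the value (followed
# by 4 zero bits), so it can take just 4 of the 64 possible characters.
final_chars = {encode_u32(v)[-1] for v in range(256)}

print(lengths)
print(sorted(final_chars))
```

The four unused bits in that final character are the "extra 4 bits" that come up again in the timestamp discussion.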
The question then becomes "what 4-byte value should I use?" The suggestions you put forward were:
- A singular transformation of "real" ID
A MySQL INT is 4 bytes. If your "real ID" is INT UNSIGNED AUTO_INCREMENT, perhaps you could use that:*
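-- Encode: pad the id to 4 bytes, Base-64 encode it, and strip the '=' padding
SELECT TRIM(TRAILING '=' FROM TO_BASE64(LPAD(CHAR(id),4,CHAR(0))))
FROM my_table
WHERE ...;

-- Decode: restore the padding, then convert the 4 bytes back to a number
SELECT *
FROM my_table
WHERE id = CONV(HEX(FROM_BASE64(CONCAT(?, '=='))),16,10);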
Note, however, that this is merely an encoding of the id that can easily be decoded: it wouldn't be too hard for anyone else to convert it back to a number, which defeats the object of this exercise. One could mitigate such attacks by bitwise XORing the id against a known secret or, better yet, by using a one-way cryptographic hash function (in which case the public ID has to be stored with the record, since it can no longer be decoded back to the id).
- A transformation of timestamp
A MySQL TIMESTAMP is also 4 bytes. You could similarly use that (just apply UNIX_TIMESTAMP() and FROM_UNIXTIME() in the above examples as appropriate). Note that, if two timestamps are the same (to the second), you will have a collision: you might use the extra 4 bits in the final character to handle such collisions, although that only delays the problem until there are more than 16 records with the same timestamp.
- Generate list of unused numbers, as it is suggested in: Generate 6 Digit unique number
Certainly this is a possibility, but it would be a very large table indeed. More below.
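As an illustration of the timestamp option, the following Python sketch packs a 32-bit Unix timestamp and a 4-bit collision counter into 6 characters. It is one possible reading of "use the extra 4 bits", not a prescribed scheme; the function names and the way the counter is assigned are assumptions, and the characters are produced by hand because a counter in the spare bits goes beyond plain Base-64 of 4 bytes.

```python
from typing import Tuple

B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def pack_public_id(unix_seconds: int, counter: int) -> str:
    """Pack a 32-bit timestamp and a 4-bit counter into 6 characters."""
    if not 0 <= unix_seconds < 2**32:
        raise ValueError("timestamp must fit in 32 bits")
    if not 0 <= counter < 16:
        raise ValueError("counter must fit in 4 bits")
    value = (unix_seconds << 4) | counter  # 36 bits in total
    # Emit six 6-bit groups, most significant first.
    return "".join(
        B64_ALPHABET[(value >> shift) & 0x3F] for shift in range(30, -1, -6)
    )


def unpack_public_id(public_id: str) -> Tuple[int, int]:
    """Recover (unix_seconds, counter) from a 6-character public ID."""
    if len(public_id) != 6:
        raise ValueError("public ID must be exactly 6 characters")
    value = 0
    for ch in public_id:
        value = (value << 6) | B64_ALPHABET.index(ch)
    return value >> 4, value & 0xF
```

With the counter at 0 the result is the same string that Base-64 encoding the 4-byte timestamp (less padding) would give; the counter only distinguishes up to 16 records created within the same second.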
So, to answer your questions:
- Is the last option the best for you? Or can you suggest any other commonly used method/technique.
It depends on the threat model. Due to its simplicity, I'd probably go with Base-64 encoding the id (possibly XOR'd against a known secret).
- Is it OK to keep table with 900 000 rows for all 6 digits numbers?
Even if it were 6 decimal digits, you'd be talking about more than 900,000 rows (1 million, in fact). As shown above, by using a larger alphabet than just decimal digits, one has a vastly greater space to consider.
It isn't clear what you mean by "OK". If one has the storage space to use, I don't see what the issue would be. Whilst we are talking about many gigabytes of storage, it's pretty cheap stuff.
- How can I handle situation if (theoretically) my number of active offers exceeds the number of 900 000 (I know that if this happened, point 2 would be irrelevant)
This is why allowing a variable length is handy. It isn't clear why you object to variable lengths.
- Is it correct to have more inactive offers with potentially same ID? For example, it could happen that some user would like to refer to his old offer.
I wouldn't reuse IDs.
- Wouldn't be better to use letters together with digits in order to extend the number of possible combinations?
Er, yes. See above.
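For what it's worth, the approach I said I'd probably go with (Base-64 encoding the id, XOR'd against a secret) might look like this if done in application code rather than in SQL. It is a minimal Python sketch: SECRET_MASK is a made-up placeholder value and the function names are hypothetical. Bear in mind that XOR with a fixed mask is obfuscation rather than encryption, so anyone who learns a single id and its public ID can recover the mask.

```python
import base64
import struct

# Hypothetical secret: any fixed 32-bit value that is kept private.
SECRET_MASK = 0x5A17C3E9


def public_id_from_id(row_id: int, mask: int = SECRET_MASK) -> str:
    """XOR the id with the mask, then Base-64 encode the 4 bytes (no padding)."""
    if not 0 <= row_id < 2**32:
        raise ValueError("id must fit in an unsigned 32-bit integer")
    raw = struct.pack(">I", row_id ^ mask)
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def id_from_public_id(public_id: str, mask: int = SECRET_MASK) -> int:
    """Reverse public_id_from_id: restore padding, decode, and undo the XOR."""
    if len(public_id) != 6:
        raise ValueError("public ID must be exactly 6 characters")
    raw = base64.b64decode(public_id + "==", validate=True)
    return struct.unpack(">I", raw)[0] ^ mask


if __name__ == "__main__":
    pid = public_id_from_id(12345)
    print(pid, id_from_public_id(pid))
```

With a mask of 0 this reduces to the plain encoding used in the SQL above.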
* Note that the MySQL Base64 functions were added in v5.6.1; if using an earlier version, you will need to install a suitable UDF, perform the encoding manually (e.g. in a stored function), or else conduct it in a higher layer of your application.