dpl9717 2012-04-18 16:23

Worried about collisions

Suppose I have a system where a hash is generated out of a total of 1 million possible permutations. If there's a 10% chance of a collision, should I worry about the generating algorithm running 5 times?

HUH?!

Let's try that again:

I have a system similar to jsfiddle, where a user can "save" a file on my server. Right now I'm using '23456789abcdefghijkmnopqrstuvwxyz', which is 33 chars, and the file name is 4 chars long, for a total of 1,185,921 (33^4) possibilities.
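For illustration, here is a minimal sketch of how such a name could be drawn and how the count above comes out. The names `ALPHABET`, `NAME_LENGTH`, `KEYSPACE` and `random_name` are hypothetical, and using Python's `secrets` module is an assumption; the question does not say which language or random source the real system uses.

```python
import secrets

# Alphabet quoted in the question: it omits 0, 1 and l.
ALPHABET = "23456789abcdefghijkmnopqrstuvwxyz"
NAME_LENGTH = 4

# Number of distinct names: (alphabet size) ** (name length).
KEYSPACE = len(ALPHABET) ** NAME_LENGTH


def random_name() -> str:
    """Return one uniformly random candidate name (hypothetical helper)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(NAME_LENGTH))


if __name__ == "__main__":
    print(len(ALPHABET), KEYSPACE)  # 33 1185921
    print(random_name())            # a random 4-character name
```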

The "filename" is generated randomly and if there's a collision it reruns to get another filename. Using a birthday paradox calculator I can see that after I have 500 entries I have a 10% chance of a collision.
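The 10% figure can be checked without an online calculator. The sketch below computes the exact birthday-paradox probability, assuming every name is drawn uniformly at random; the function name `birthday_collision_probability` is made up for this example.

```python
import math


def birthday_collision_probability(n: int, keyspace: int) -> float:
    """Probability that n uniformly random names are NOT all distinct."""
    if n > keyspace:
        return 1.0  # pigeonhole: more names than slots
    # P(all distinct) = product over i in [0, n) of (1 - i / keyspace).
    # Summing logarithms avoids multiplying many factors close to 1.
    log_all_distinct = sum(math.log1p(-i / keyspace) for i in range(n))
    return -math.expm1(log_all_distinct)


if __name__ == "__main__":
    keyspace = 33 ** 4
    for n in (500, 5000):
        p = birthday_collision_probability(n, keyspace)
        print(f"{n} entries: P(at least one collision) = {p:.4f}")
```

With 500 entries this comes out at roughly 0.10, which matches the calculator; with 5000 entries it is very close to 1.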

What are the chances that I'll get a collision more than 5 times in a row? What about 4?

Is there any way to figure this out? Should I worry about it? What happens after 5000 entries?

Is there a program out there that can figure this out with any inputs?


  • dtjw6660 2012-04-18 16:37

    I don't think that the birthday paradox calculations apply. There's a difference between the odds of 500 random numbers out of 1185921 being all different and the odds of one new number being different once you have 500 known unique numbers.

    If you have 500 assigned names and generate a new one at random, the probability of a collision is 500/1185921, or about 4.2 × 10^-4. With 500 names taken, the chance of 4 collisions in a row is (500/1185921)^4 < 10^-13. With 5000 existing file names, the probability of a new name colliding is 5000/1185921, and the chance of 4 collisions in a row is < 10^-9.
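    The same arithmetic works for any inputs. Here is a minimal sketch, assuming each retry is an independent uniform draw and the set of taken names does not change between retries; the function names are made up for this example.

```python
def single_collision_probability(taken: int, keyspace: int) -> float:
    """Chance that one new random name hits one of the `taken` names."""
    return taken / keyspace


def consecutive_collisions_probability(taken: int, keyspace: int, k: int) -> float:
    """Chance that k independent draws in a row all collide."""
    return single_collision_probability(taken, keyspace) ** k


def expected_attempts(taken: int, keyspace: int) -> float:
    """Mean number of draws until a free name is found (geometric distribution)."""
    if taken >= keyspace:
        raise ValueError("no free names left")
    return 1.0 / (1.0 - single_collision_probability(taken, keyspace))


if __name__ == "__main__":
    keyspace = 33 ** 4  # 1185921
    for taken in (500, 5000):
        for k in (4, 5):
            p = consecutive_collisions_probability(taken, keyspace, k)
            print(f"taken={taken}, {k} collisions in a row: {p:.3g}")
        print(f"taken={taken}, expected attempts: {expected_attempts(taken, keyspace):.5f}")
```

    Under these assumptions the retry loop is not worth worrying about until a large fraction of the 1185921 names is in use; the expected number of attempts only reaches 2 when half of them are taken.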

