从Mysql迁移到Cassandra

Previously I was using the class found here to convert userID to some random string.

From his blog:

Running:

alphaID(9007199254740989);

will return 'PpQXn7COf' and:

alphaID('PpQXn7COf', true);

will return '9007199254740989'

So the idea was that users could do www.mysite.com/user/PpQXn7COf and i convert that to a normal integer so i could do in mysql

"Select * from Users where userID=".alphaID('PpQXn7COf', true)

Now i'm just started working with Cassandra an i'm looking for some replacement.

I want url like www.mysite.com/user/PpQXn7COf not like www.mysite.com/user/username1
The "PpQXn7COf" uuid must be as short as possible.

In the Twissandra example explained here: http://www.rackspace.com/cloud/blog/2010/05/12/cassandra-by-example/

They create some long uuid (i guess it is so long because then its almost 100 percent sure its random).

In mysql i just had a userID column with auto increasement so when i used the alphaID() function i always got a very short random string.

Anyone an idea how to solve this as clean as possible?

Edit:

It is used for a social media site so it must be persistent. Thats also why i don't want to use usernames/realnames in urls, user cant remain google undetected if they need.

I just got a simple idea, however i don't know how scalable it is

<?php
//createUUID() makes +- 14 char string with A-Z a-z 1-0 based on micro/milli/nanoseconds
while(get_count(createUUID()) > 0){//uuid  is unique
  //insert username pass, uuid etc into cassandra
  if($result == "1"){
      header('Location: http://www.mysite.com/usercenter');
  }else{
      echo "error";
  }
}
?>

When this gets the size of lets say twitter/facebook:

Will it execute in acceptable time?
Will it still generate unique uuid fast enough so if 10000 users/second are registering it isnt cluttering up?

写回答
好问题 0 提建议
追加酬金
关注问题
分享
邀请回答
编辑收藏删除结题
收藏举报

1条回答默认最新

关注

码龄粉丝数原力等级 --

被采纳

被点赞

采纳率
doukanzhuo4297 2011-04-16 14:03
关注
Auto-increments are not suitable for a robust distributed system. You can only assign a unique ID if every node in your system is available, to ensure it's unique.

You can of course, invent your own unique-id generator, but you must then ensure that it will generate unique IDs anywhere in your infrastructure.

For example, each node can just have a file which it (with suitable locking etc) just increments, but you will also need to ensure that they don't clash - for instance, by having the server ID included in the generation algorithm.

This may be operationally nontrivial - your ops engineers will need to ensure that all the servers in the infrastructure are configured correctly with their own ID generators set up so that they don't generate the same ID. However, it's possible.

UUIDs are the reasonable alternative, because they will definitely be unique.

A UUID is 128 bits; if we store 6 bits per character (i.e. base64) then that takes 22 characters, which is quite a long URI. If you want it shorter, you will need to generate unique IDs a different way.

Plus it all depends on "how unique" you actually need your IDs to be. If your IDs can safely be reused after a few months, you can probably do it in < 60 bits (depending also on the number of servers in your infrastructure, and how frequently you need to generate them).

We use

Server ID

Time (granularity = 2 seconds), but wraps after a few months

A per-server counter (which wraps frequently, but not within 2 seconds)

And stick all the bits together. This generates an ID which is < 64 bits long, but is guaranteed to be unique for the length of time it needs to be (which in our case is only a couple of months)

Our algorithm will malfunction and generate a duplicate ID if:

The system clock on one of our nodes goes backwards by the same amount of time in which the counter wraps.

Our operations engineers make a mistake and assign the same server ID to two servers.

Eventually, after about 9 months.
本回答被题主选为最佳回答 , 对您是否有帮助呢?

解决无用
评论打赏
分享
举报

评论

按下Enter换行，Ctrl+Enter发表内容

报告相同问题？

关注问题

从Mysql迁移到Cassandra mysql php
2011-04-16 11:47

回答 1 已采纳 Auto-increments are not suitable for a robust distributed system. You can only assign a unique ID
不断重新连接到Cassandra
2019-05-06 09:07

回答 1 已采纳 Special thanks to @Jim Wartnick for this. I just tried turning Cassandra off on my local machine a
使用多个主机ip迁移cassandra的库实现
2018-10-04 08:29

回答 1 已采纳 The ALTER should be realised / replicated across the cluster. Migrate uses the highest level of co
在Chaordic上从MySQL过渡到Cassandra
2020-04-25 23:18

danpu0978的博客 Ivan Linhares是Chaordic的工程经理。他已经担任首席工程师和软件开发... Chaordic是巴西基于大数据的电子商务个性化解决方案的领导者。该国最大的在线零售商，例如Saraiva，沃尔玛和Centauro，使用我们的解决...
Golang gocql无法连接到Cassandra（使用Docker） database docker
2018-09-10 22:11

回答 1 已采纳 Use the service name cassandra00 for the hostname per the docker-compose documentation https://doc
将地图作为值插入到Cassandra中
2016-02-15 04:05

回答 1 已采纳 I haven't used Go casandra client before but I guess passing map as a map instead of string should
连接到Cassandra时出现“ gocql：在超时时间内对连接启动无响应”
2018-02-02 04:41

回答 1 已采纳 Ok.. this is resolved. Posting in case this ends up helping someone the first error was resolved
cassandra_在Chaordic上从MySQL过渡到Cassandra
2020-05-20 02:35

danpu0978的博客 cassandra Ivan Linhares是Chaordic的工程经理。他已经担任首席工程师和软件开发人员超过15年。他热衷于发展优秀的团队并解决具有挑战性的技术问题。在Chaordic，他领导数据平台团队扩展其针对电子商务领域的...
集成cassandra和PHP php
2016-10-14 11:31

回答 1 已采纳 Datastax academy has a getting started course for PHP. Also, there is a demo application here wit
如何使用gocql在Cassandra中创建键空间
2018-03-12 21:12

回答 2 已采纳 I dont think there is any specific command in the library, but they create keyspaces as part of th
具有连接池支持的Golang Cassandra客户端
2017-03-06 05:43

回答 1 已采纳 You don't need a pool. Create a global Session. From https://godoc.org/github.com/gocql/gocql#Sess
Cassandra迁移工具
2020-04-26 10:18

danpu0978的博客他通过公开演讲和公司博客来传播自己的知识，还致力于开源工具，以帮助Cassandra社区。 SmartCat以低于西方国家竞争对手的价格提供经过认证的一流大数据系统实施服务。我们是该地区大数据流处理技术的推广者，...
Cassandra的Golang客户端[关闭]
2015-12-11 17:03

回答 1 已采纳 This is a very simple example of what I was referring to in the comments: type CassAPI interface
cassandra 工具_Cassandra迁移工具
2020-05-21 07:25

danpu0978的博客 cassandra 工具 Matija Gobec（@ mad_max0204）是大数据咨询公司SmartCat的联合创始人兼CTO，主要致力于解决数据密集型环境中的问题。他通过公开演讲和公司博客来传播自己的知识，还致力于开源工具以帮助Cassandra...
Cassandra在海量数据存储及大型项目案例介绍-part1
2022-04-16 17:34

思通数科x的博客 Cassandra起源 2007 年 Facebook 为了解决消息收件箱搜索问题（ Inbox Search problem）而开始设计 Cassandra 项目。当时 Facebook 遇到了传统的方法难以解决的超大数据量存储可扩展性问题。具体来说，项目团队...
没有解决我的问题, 去提问

悬赏问题

¥15 关于#目标检测#的问题：大概就是类似后台自动检测某下架商品的库存，在他监测到该商品上架并且可以购买的瞬间点击立即购买下单
¥15 神经网络怎么把隐含层变量融合到损失函数中？
¥30 自适应 LMS 算法实现 FIR 最佳维纳滤波器matlab方案
¥15 lingo18勾选global solver求解使用的算法
¥15 全部备份安卓app数据包括密码，可以复制到另一手机上运行
¥20 测距传感器数据手册i2c
¥15 RPA正常跑，cmd输入cookies跑不出来
¥15 求帮我调试一下freefem代码
¥15 matlab代码解决，怎么运行
¥15 R语言Rstudio突然无法启动

从Mysql迁移到Cassandra

1条回答 默认 最新

悬赏问题

1条回答默认最新