doupingtang9627 2012-02-11 01:56
Accepted

Which of the following data duplication options is recommended when sharding?

The book High Performance MySQL suggests that, when sharding a blog application, one may want to store comment data on two shards: the shard of the person posting the comment, and the shard where the post is stored.

This raises the question of how to duplicate this data reliably. Which of the following options for duplicating data across shards is recommended?

Option 1: Make 2 separate inserts from the PHP script.
Pros: a) The logic stays in the application layer.
Cons: a) The user has to wait for two inserts. b) This logic must be duplicated in every client that inserts similar data.
Conclusion: Seems reasonable.

Option 2: Create federated tables and use a trigger to insert the duplicate.
Pros: a) The app layer doesn't need to worry about multiple inserts.
Cons: a) Every shard needs a federated connection to every other shard. b) Federation will work between machines on a LAN, but what about two different sites? c) What if the connection to the federated server fails?
Conclusion: Doesn't seem like a sound idea.

Option 3: Messaging such as RabbitMQ
Pros: a) Different clients can insert data in one place, and all subscribers can consume the insert.
Cons: a) Complex. b) Hosting a messaging server and its clients may add overhead. c) Not sure how it will work with a look-up service to locate the appropriate shards.
Conclusion: Not sure.

Option 4: your suggestion?

I will greatly appreciate your help.


1 answer

  • dtnbjjq51949 2012-02-12 01:01

    As you point out, having triggers between the various shards is silly; the whole reason for sharding is independent database operations. So you can throw it out right away.

    Updating both tables at the same time is the approach with the fewest moving parts. Over the long term, it will be the most maintainable. And it will be the easiest to debug if something goes wrong.
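A minimal sketch of the "update both tables at the same time" approach, written in Python for brevity rather than the PHP mentioned in the question. Everything here is an assumption made for illustration: two in-memory SQLite databases stand in for MySQL shards, `shard_for` is a hypothetical modulo-based lookup, and the table and column names are invented. Because there is no transaction spanning both shards, the inserts are made idempotent so that a failed second write can simply be retried.

```python
import sqlite3

NUM_SHARDS = 2  # assumption: a small fixed shard count, for illustration only

def make_shard():
    """Create an in-memory SQLite database standing in for one MySQL shard."""
    conn = sqlite3.connect(":memory:")
    for table in ("comments_by_entry", "comments_by_user"):
        conn.execute(
            f"CREATE TABLE {table} ("
            "comment_id TEXT PRIMARY KEY, entry_id INTEGER, "
            "user_id INTEGER, body TEXT)"
        )
    return conn

shards = [make_shard() for _ in range(NUM_SHARDS)]

def shard_for(key):
    """Hypothetical shard lookup: plain modulo on an integer key."""
    return shards[key % NUM_SHARDS]

def add_comment(comment_id, entry_id, user_id, body):
    """Write the comment to the entry's shard, then to the commenter's shard."""
    row = (comment_id, entry_id, user_id, body)

    # First write: the shard where the post (entry) lives.
    entry_shard = shard_for(entry_id)
    with entry_shard:  # commits on success, rolls back on error
        entry_shard.execute(
            "INSERT OR IGNORE INTO comments_by_entry VALUES (?, ?, ?, ?)", row
        )

    # Second write: the shard of the person posting the comment.
    # If this fails, the first write is already committed; the caller
    # can retry safely because both inserts ignore duplicate comment_ids.
    user_shard = shard_for(user_id)
    with user_shard:
        user_shard.execute(
            "INSERT OR IGNORE INTO comments_by_user VALUES (?, ?, ?, ?)", row
        )
```

Calling `add_comment("c1", entry_id=10, user_id=7, body="hi")` twice leaves exactly one row in each table. `INSERT OR IGNORE` is SQLite syntax; MySQL spells the equivalent `INSERT IGNORE`.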

    But if response time is important, then you might think of some sort of messaging approach: update the comments-by-entry table, and queue a message to update the comments-by-user table. If it takes an hour for that message to be processed -- or if it gets lost in a system crash -- no big deal, you can always recover. By no means should you use a messaging approach to update both tables.
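The hybrid approach described above could look roughly like the following sketch. It is not a definitive implementation: in-memory SQLite databases stand in for the shards, a `deque` stands in for a real broker such as RabbitMQ, and `shard_for`, `drain_queue`, and `rebuild_comments_by_user` are hypothetical names. The comments-by-entry write is synchronous and treated as the source of truth, so the comments-by-user copy can always be rebuilt from it if a message is lost.

```python
import sqlite3
from collections import deque

NUM_SHARDS = 2  # assumption: fixed shard count, for illustration only
COLUMNS = "comment_id, entry_id, user_id, body"

def make_shard():
    """In-memory SQLite database standing in for one MySQL shard."""
    conn = sqlite3.connect(":memory:")
    for table in ("comments_by_entry", "comments_by_user"):
        conn.execute(
            f"CREATE TABLE {table} (comment_id TEXT PRIMARY KEY, "
            "entry_id INTEGER, user_id INTEGER, body TEXT)"
        )
    return conn

shards = [make_shard() for _ in range(NUM_SHARDS)]
pending = deque()  # stand-in for a message queue; not durable

def shard_for(key):
    """Hypothetical shard lookup: plain modulo on an integer key."""
    return shards[key % NUM_SHARDS]

def insert_row(conn, table, row):
    """Idempotent insert, so replays and rebuilds cannot create duplicates."""
    with conn:
        conn.execute(f"INSERT OR IGNORE INTO {table} VALUES (?, ?, ?, ?)", row)

def add_comment(comment_id, entry_id, user_id, body):
    """Write comments_by_entry synchronously; queue the comments_by_user write."""
    row = (comment_id, entry_id, user_id, body)
    insert_row(shard_for(entry_id), "comments_by_entry", row)
    pending.append(row)  # the user does not wait for this second write

def drain_queue():
    """Consumer side: apply queued rows to the commenter's shard."""
    while pending:
        row = pending.popleft()
        insert_row(shard_for(row[2]), "comments_by_user", row)  # row[2] is user_id

def rebuild_comments_by_user():
    """Recovery: re-derive comments_by_user from comments_by_entry on all shards."""
    for shard in shards:
        rows = shard.execute(f"SELECT {COLUMNS} FROM comments_by_entry").fetchall()
        for row in rows:
            insert_row(shard_for(row[2]), "comments_by_user", row)
```

This also illustrates why queuing both updates is discouraged: `rebuild_comments_by_user` only works because one of the two writes is guaranteed to have been committed before the request returned.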

    Answer by: @kdgregory Link: https://softwareengineering.stackexchange.com/a/134607/41398

    The asker selected this answer as the best answer.