High performance mysql book suggests that for sharding a blog application, one may want to put comments data across 2 shards: first, on the shard of a person posting comment, and on the shard where the post is stored.
So this raises the question how to reliably duplicate this data. Which of the following data duplication options across shards is recommended?
Option 1: Make 2 separate inserts from the PHP script.
Pros: a) Logic is in application layer.
Cons: a) User is held for 2 inserts. b) This logic will need to be duplicated in every client trying to insert similar data.
Conclusion: Seems reasonable.
Option 2: Form federated tables and use some trigger to handle the insert of duplicate.
Pros: a) App layer doesn't need to worry about multiple inserts
Cons: a) Every shard need to have federated connection to every other shard; b) Federation will work on machines in LAN, but what about at 2 different sites. c) what if connection to federated server fails.
Conclusion: Doesn't seem like a sound idea.
Option 3: Messaging such as RabbitMQ
Pros: a) Different clients can insert data at one place, and all subscribers can consume the insert.
Cons: a) Complex; b) may impose overhead in order to host messaging server, and clients; c) not sure how will it work with a look-up service to locate appropriate shards
Conclusion: Not sure
Option 4: your suggestion?
I will greatly appreciate your help.