doujiao3346 2019-08-06 08:54

How do I copy an index in Elasticsearch with this architecture?

I need to import data (millions of records) from multiple sources and store it in a database. Users searching that data should get results within 2-3 seconds.

For this, I designed an architecture in which a Go program imports data from the sources and pushes it to AWS SQS. A Lambda function is triggered whenever the SQS queue has data and pushes that data into AWS Elasticsearch. A REST API serves search results to the user.

A cron job runs the import every morning. My problem is that when a new batch arrives, I want to delete the existing data and replace all of it with the new data. I'm stuck on how to do this delete-and-replace step.

I thought of importing into a temporary index and then replacing the original index with it. The problem is that I don't know when the import has finished, so I don't know when it is safe to switch indices.

1 answer

  • duanhe7471 2019-08-16 14:06

    The concept you're after is an index alias. The basic workflow would be:

    1. Import today's data into a new, dated index, for example my-index-2019-09-16.
    2. Make sure the import is complete and worked correctly.
    3. Point the alias to the new index (it's an atomic switch between the indices):

      POST /_aliases
      {
          "actions" : [
              { "remove" : { "index" : "my-index-2019-09-15", "alias" : "my-index" } },
              { "add" : { "index" : "my-index-2019-09-16", "alias" : "my-index" } }
          ]
      }
      
    4. Delete the old index.
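
The steps above can be sketched in code. This is a minimal illustration, assuming one index per day named after the alias plus the ISO date, as in the example request; `index_name` and `alias_swap_body` are hypothetical helper names, not part of any Elasticsearch client. It only builds the body for `POST /_aliases`; sending it, and then deleting the old index, is left to whatever HTTP client the job already uses.

```python
import json
from datetime import date, timedelta

ALIAS = "my-index"  # alias name taken from the example request above


def index_name(day: date) -> str:
    """Return the dated index name, e.g. my-index-2019-09-16 (assumed naming scheme)."""
    return f"{ALIAS}-{day.isoformat()}"


def alias_swap_body(old_index: str, new_index: str, alias: str = ALIAS) -> dict:
    """Build the POST /_aliases body that moves `alias` in a single request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    today = date(2019, 9, 16)
    yesterday = today - timedelta(days=1)
    # Print the request body; the caller would POST it to /_aliases.
    print(json.dumps(alias_swap_body(index_name(yesterday), index_name(today)), indent=2))
```

Because the search API only ever queries the alias `my-index`, it needs no change when the underlying index is swapped.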

    Disk usage roughly doubles during the import, since both indices exist at the same time, but otherwise this should work without issues, and you only delete data once it has a proper replacement.
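
One possible way to decide that the replacement is ready (step 2) is to compare the number of records the importer sent with the number of documents in the new index. The sketch below assumes the importer records how many records it pushed to SQS (the `expected` value) and that `base_url` points at the cluster; both, along with the function names, are assumptions for illustration. It reads the `count` field returned by `GET /<index>/_count`.

```python
import json
import urllib.request


def parse_count(response_body: str) -> int:
    """Extract the document count from a GET /<index>/_count response body."""
    return int(json.loads(response_body)["count"])


def import_is_complete(expected: int, indexed: int) -> bool:
    """Treat the import as complete once every expected record is in the index."""
    # An empty batch is never considered a valid replacement.
    return expected > 0 and indexed >= expected


def fetch_count(base_url: str, index: str) -> int:
    """Ask Elasticsearch how many documents the index currently holds."""
    with urllib.request.urlopen(f"{base_url}/{index}/_count") as resp:
        return parse_count(resp.read().decode("utf-8"))
```

A scheduled job could call `fetch_count` on the new index until `import_is_complete` returns true and only then switch the alias. Newly indexed documents may take a moment to show up in the count, so a short wait between checks is reasonable.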

    This answer was selected as the best answer by the asker.
