Google App Engine datastore query with cursor won't iterate all items

In my application I have a datastore query with a filter, such as:

datastore.NewQuery("sometype").Filter("SomeField<", 10)

I'm using a cursor to iterate over batches of the result (e.g. in different tasks). If the value of SomeField is changed while iterating over it, the cursor no longer works on Google App Engine (it works fine on devappserver).

I have a test project here: https://github.com/fredr/appenginetest. In my test I ran /db, which sets up the datastore with 10 items with their values set to 0, then ran /run/2, which iterates over all items where the value is less than 2, in batches of 5, and updates the value of each item to 2.

The result on my local devappserver (all items are updated): [screenshot: devappserver result]

The result on App Engine (only five items are updated): [screenshot: appengine result]

Am I doing something wrong? Is this a bug? Or is this the expected result? In the documentation it states:

Cursors don't always work as expected with a query that uses an inequality filter or a sort order on a property with multiple values.

1 answer

dongshuan8722 2015-05-28 11:35
The problem is the nature and implementation of the cursors. The cursor contains the key of the last processed entity (encoded), and so if you set a cursor to your query before executing it, the Datastore will jump to the entity specified by the key encoded in the cursor, and will start listing entities from that point.

Let's examine your case

Your query filter is Value<2. You iterate over the entities of the query result, and you change (and save) the Value property to 2. Note that Value=2 does not satisfy the filter Value<2.

In the next iteration (the next batch), a cursor is present and you apply it correctly. When the Datastore executes the query, it jumps to the last entity processed in the previous iteration and tries to list the entities that come after it. But the entity the cursor points to may no longer satisfy the filter, because the index entry for its new Value of 2 has most likely been updated already. This is non-deterministic: eventual consistency applies here because you did not use an ancestor query, which would guarantee strongly consistent results, and the time.Sleep() delay just increases the probability that the index has already been updated.

So the Datastore sees that the last processed entity does not satisfy the filter. It does not search all the entities again; it reports that no more entities match the filter, so no more entities are updated (and no errors will be reported).
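To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is a toy in-memory model, not the real Datastore or its API: Store, QueryLess and UpdateAllWithCursor are hypothetical names, the "cursor" is simply the key of the last returned item, and the rule that a resumed query returns nothing when that item has dropped out of the filter is an assumption taken from this explanation.

```go
package main

import "fmt"

// Item is a hypothetical entity with an integer Value property.
type Item struct {
	Key   int
	Value int
}

// Store is a toy in-memory stand-in for the Datastore (NOT the real API).
type Store struct {
	items []*Item // kept in key order, keys start at 1
}

// NewStore creates n items with Value 0.
func NewStore(n int) *Store {
	s := &Store{}
	for k := 1; k <= n; k++ {
		s.items = append(s.items, &Item{Key: k})
	}
	return s
}

// QueryLess returns up to limit items with Value < max that come after the
// item whose key equals cursor (0 means "start from the beginning").
// Assumption from the explanation above: if the item the cursor points to no
// longer satisfies the filter, the query reports no further results.
func (s *Store) QueryLess(max, limit, cursor int) (batch []*Item, next int) {
	for _, it := range s.items {
		if it.Key == cursor && it.Value >= max {
			return nil, cursor // cursor position fell out of the result set
		}
	}
	next = cursor
	for _, it := range s.items {
		if it.Key <= cursor || it.Value >= max {
			continue
		}
		if len(batch) == limit {
			break
		}
		batch = append(batch, it)
		next = it.Key
	}
	return batch, next
}

// UpdateAllWithCursor sets Value to newValue batch by batch, resuming each
// query from the previous cursor, and returns how many items were updated.
func UpdateAllWithCursor(s *Store, max, newValue, batchSize int) int {
	updated, cursor := 0, 0
	for {
		batch, next := s.QueryLess(max, batchSize, cursor)
		if len(batch) == 0 {
			return updated
		}
		for _, it := range batch {
			it.Value = newValue
			updated++
		}
		cursor = next
	}
}

func main() {
	// New value 2 fails the filter Value < 2: the run stops after one batch.
	fmt.Println(UpdateAllWithCursor(NewStore(10), 2, 2, 5)) // prints 5
	// New value 1 still passes the filter: the cursor works as expected.
	fmt.Println(UpdateAllWithCursor(NewStore(10), 2, 1, 5)) // prints 10
}
```

The first call reproduces the "only five items are updated" result from the question; the second shows that the cursor is fine as long as the update does not move entities out of the filter.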

Suggestion: don't use cursors while filtering or sorting by the same property you're updating.

By the way:

The part from the Appengine docs you quoted:

Cursors don't always work as expected with a query that uses an inequality filter or a sort order on a property with multiple values.

This is not what you think. This means: cursors may not work properly on a property which has multiple values AND the same property is either included in an inequality filter or is used to sort the results by.

By the way #2

In the screenshot you are using SDK 1.9.17. The latest SDK version is 1.9.21. You should update it and always use the latest available version.

Alternatives to achieve your goal

1) Don't use cursors

If you have many records, you won't be able to update all your entities in one step (in one loop), but let's say you update 300 entities at a time. If you repeat the query, the already updated entities will not be in the results, because the updated Value=2 does not satisfy the filter Value<2. Just redo the query and update until the query has no results. Since your change is idempotent, it causes no harm if the index entry of an entity is updated late and the entity is therefore returned by the query more than once. It would be best to delay the execution of the next query to minimize the chance of this (e.g. wait a few seconds before redoing the query).

Pros: Simple. You already have the solution, just exclude the cursor handling part.

Cons: Some entities might get updated multiple times (therefore the change must be idempotent). Also the change performed on entities must be something which will exclude the entity from the next query.
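A minimal sketch of this "repeat until empty" loop, again against a toy in-memory slice rather than the real Datastore. Item, NewItems, queryLess and UpdateUntilEmpty are hypothetical names introduced for illustration; in a real app queryLess would be the filtered query with a limit and no cursor.

```go
package main

import (
	"errors"
	"fmt"
)

// Item is a hypothetical entity with an integer Value property.
type Item struct {
	Key   int
	Value int
}

// NewItems creates n items with Value 0.
func NewItems(n int) []*Item {
	items := make([]*Item, n)
	for i := range items {
		items[i] = &Item{Key: i + 1}
	}
	return items
}

// queryLess stands in for running the filtered query (Value < max) with a
// limit and without any cursor.
func queryLess(items []*Item, max, limit int) []*Item {
	var batch []*Item
	for _, it := range items {
		if it.Value < max {
			batch = append(batch, it)
			if len(batch) == limit {
				break
			}
		}
	}
	return batch
}

// UpdateUntilEmpty repeats query+update until the query returns nothing and
// reports how many rounds produced results. It refuses to run when the new
// value would keep entities inside the filter, since the loop would never end.
func UpdateUntilEmpty(items []*Item, max, newValue, batchSize int) (int, error) {
	if newValue < max {
		return 0, errors.New("update must exclude the entity from the next query")
	}
	rounds := 0
	for {
		batch := queryLess(items, max, batchSize)
		if len(batch) == 0 {
			return rounds, nil
		}
		for _, it := range batch {
			it.Value = newValue // idempotent: safe even if applied twice
		}
		rounds++
		// Against the real Datastore, a short pause here would give the
		// index updates time to land before the next query.
	}
}

func main() {
	rounds, err := UpdateUntilEmpty(NewItems(10), 2, 2, 5)
	fmt.Println(rounds, err)
}
```

The guard at the top encodes the second "con" directly: the approach only terminates if the change removes each entity from the query's result set.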

2) Using Task Queue

You could first execute a keys-only query and defer the updates to tasks. For example, you could create tasks that each receive, say, 100 keys; each task would load the entities by key and perform the update. This would ensure each entity is updated only once. This solution has a slightly longer delay because it involves the task queue, but that is not a problem in most cases.

Pros: No duplicated updates (therefore change may be non-idempotent). Works even if the change to be performed would not exclude the entity from the next query (more general).

Cons: Higher complexity. Bigger lag/delay.
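The batching step can be sketched with plain Go. This only shows how a list of keys could be split into groups of at most 100; ChunkKeys is a hypothetical helper, the keys are represented as strings for simplicity, and the keys-only query and the task queue calls themselves are left out.

```go
package main

import "fmt"

// ChunkKeys splits keys into consecutive batches of at most size keys each.
// Each batch would become the payload of one task.
func ChunkKeys(keys []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(keys); start += size {
		end := start + size
		if end > len(keys) {
			end = len(keys)
		}
		batches = append(batches, keys[start:end])
	}
	return batches
}

func main() {
	// Placeholder keys; in a real app these would come from the keys-only query.
	keys := make([]string, 250)
	for i := range keys {
		keys[i] = fmt.Sprintf("key-%03d", i)
	}
	for i, b := range ChunkKeys(keys, 100) {
		fmt.Printf("task %d: %d keys (%s .. %s)\n", i, len(b), b[0], b[len(b)-1])
	}
}
```

Because the full key list is fixed before any update happens, later changes to Value cannot affect which entities get processed, which is why this variant avoids duplicate updates.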

3) Using Map-Reduce

You could use the map-reduce framework/utility to do massively parallel processing of many entities. Not sure if it has been implemented in Go.

Pros: Parallel execution, can handle even millions or billions of entities. Much faster when there are many entities. Plus the pros listed under 2) Using Task Queue.

Cons: Higher complexity. Might not be available in Go yet.

This answer was selected as the best answer by the asker.
