dongzhuxun5136 2017-04-03 11:22
34 views
Accepted

Storing long text in the Datastore

Is Datastore suitable to store really long text, e.g. profile descriptions and articles?

If not, what's the Google Cloud alternative?

If yes, what would be the ideal way to store it in order to maintain formatting such as linebreaks and markdown supported keywords? Simply store as string or convert to byte? And should I be worried about dirty user input?

I need it for a Go project (I don't think language is relevant, but maybe Go have some useful features for this)


1 answer

  • douwen9540 2017-04-03 11:56

    Yes, it's suitable if you're OK with certain limitations.

    These limitations are:

    • the overall entity size (properties + indices) must not exceed 1 MB (this should be OK for profiles and most articles)
    • texts longer than a certain limit (currently 1500 bytes) cannot be indexed, so the entity may store a longer string, but you won't be able to search in it / include it in query filters; don't forget to tag these fields with "noindex"

    As for the type, you may simply use string, e.g.:

    type Post struct {
        UserID  int64  `datastore:"uid"`             // short, indexed property
        Content string `datastore:"content,noindex"` // long text, excluded from indexing
    }
    

    string values preserve all formatting, including newlines, HTML and Markdown markup.
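
    For illustration (this is not from the original answer), here is a minimal save / load round trip sketch using the cloud.google.com/go/datastore client; the project ID, kind name and the client choice are assumptions. Stored newlines and Markdown come back byte-for-byte:

    package main

    import (
        "context"
        "fmt"
        "log"

        "cloud.google.com/go/datastore"
    )

    type Post struct {
        UserID  int64  `datastore:"uid"`
        Content string `datastore:"content,noindex"`
    }

    func main() {
        ctx := context.Background()
        // "my-project" is a placeholder project ID.
        client, err := datastore.NewClient(ctx, "my-project")
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        // Newlines and Markdown are stored and returned unchanged.
        in := Post{UserID: 1, Content: "# Title\n\nFirst line\nSecond *line*"}
        key, err := client.Put(ctx, datastore.IncompleteKey("Post", nil), &in)
        if err != nil {
            log.Fatal(err)
        }

        var out Post
        if err := client.Get(ctx, key, &out); err != nil {
            log.Fatal(err)
        }
        fmt.Println(out.Content == in.Content) // prints: true
    }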

    "Dirty user input?" That's the issue of rendering / presenting the data. The datastore will not try to interpret it or attempt to perform any action based on its content, nor will transform it. So from the Datastore point of view, you have nothing to worry about (you don't create text GQLs by appending text ever, right?!).

    Also note that if you store large texts in your entities, those large texts will be fetched whenever you load / query such entities, and you must also send them whenever you modify and (re)save such an entity.

    Tip #1: Use projection queries when certain queries don't need the whole text; this avoids moving "big" data around and ultimately speeds up those queries.
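
    As an illustrative sketch (the kind name and the helper name are assumptions, and it reuses the Post type and imports from the round-trip example above), a projection query transfers only the projected, indexed properties, so the large content property never leaves the Datastore:

    // ListPostAuthors is a hypothetical helper: it fetches only the
    // indexed "uid" property of each Post, never the large "content".
    func ListPostAuthors(ctx context.Context, client *datastore.Client) ([]Post, error) {
        q := datastore.NewQuery("Post").
            Project("uid") // only indexed properties can be projected
        var posts []Post
        _, err := client.GetAll(ctx, q, &posts)
        return posts, err
    }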

    Tip #2: To "ease" the burden of not being able to index large texts, you may add duplicate properties such as a short summary or the title of the large text, because string values no longer than 1500 bytes can be indexed.
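
    For example (the field names below are illustrative, not from the original answer), you could keep a short, indexed title next to the unindexed content and filter on that:

    type Post struct {
        UserID  int64  `datastore:"uid"`
        Title   string `datastore:"title"`            // short (< 1500 bytes), indexed, filterable
        Content string `datastore:"content,noindex"`  // full text, too long to index
    }

    // q := datastore.NewQuery("Post").Filter("title =", "Storing long text")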

    Tip #3: If you want to go over the 1 MB entity size limit, or you just want to decrease your datastore size usage in general, you may opt to store large texts compressed inside entities. Since they are too long to be indexed, you can't search / filter them anyway, and they compress very well (often to below 40% of the original size). So if you have many long texts, you can shrink your datastore size to roughly a third just by storing all texts compressed. Of course this adds to the entity save / load time (as you have to compress / decompress the texts), but it is often still worth it.
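
    A minimal sketch of what the compression could look like with the standard compress/gzip package (the package, struct and helper names are made up for illustration):

    package post

    import (
        "bytes"
        "compress/gzip"
        "io"
    )

    // CompressedPost stores the gzipped text in an unindexed byte slice.
    type CompressedPost struct {
        UserID  int64  `datastore:"uid"`
        Content []byte `datastore:"content,noindex"` // gzip-compressed text
    }

    // compress gzips the text before it is saved into Content.
    func compress(s string) ([]byte, error) {
        var buf bytes.Buffer
        zw := gzip.NewWriter(&buf)
        if _, err := zw.Write([]byte(s)); err != nil {
            return nil, err
        }
        if err := zw.Close(); err != nil {
            return nil, err
        }
        return buf.Bytes(), nil
    }

    // decompress restores the original text after the entity is loaded.
    func decompress(b []byte) (string, error) {
        zr, err := gzip.NewReader(bytes.NewReader(b))
        if err != nil {
            return "", err
        }
        defer zr.Close()
        out, err := io.ReadAll(zr)
        return string(out), err
    }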

    This answer was selected as the best answer by the asker.