dongzhuxun5136 2017-04-03 11:22
34 views
Accepted

Storing long text in the Datastore

Is Datastore suitable to store really long text, e.g. profile descriptions and articles?

If not, what's the Google Cloud alternative?

If yes, what would be the ideal way to store it in order to maintain formatting such as linebreaks and markdown supported keywords? Simply store as string or convert to byte? And should I be worried about dirty user input?

I need it for a Go project (I don't think language is relevant, but maybe Go have some useful features for this)


1 answer

  • douwen9540 2017-04-03 11:56

    Yes, it's suitable if you're OK with certain limitations.

    These limitations are:

    • the overall entity size (properties + indices) must not exceed 1 MB (this should be OK for profiles and most articles)
    • texts longer than a certain limit (currently 1500 bytes) cannot be indexed, so the entity may store a longer string, but you won't be able to search in it / include it in query filters; don't forget to tag these fields with "noindex"

    As for the type, you may simply use string, e.g.:

    type Post struct {
        UserID  int64  `datastore:"uid"`             // short, indexed property
        Content string `datastore:"content,noindex"` // long text, excluded from indexing
    }
    

    string values preserve all formatting, including newlines, HTML and Markdown markup.
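
    For illustration (this is not from the original answer), here is a minimal save / load round trip sketch using the cloud.google.com/go/datastore client; the project ID, kind name and the client choice are assumptions. Stored newlines and Markdown come back byte-for-byte:

    package main

    import (
        "context"
        "fmt"
        "log"

        "cloud.google.com/go/datastore"
    )

    type Post struct {
        UserID  int64  `datastore:"uid"`
        Content string `datastore:"content,noindex"`
    }

    func main() {
        ctx := context.Background()
        // "my-project" is a placeholder project ID.
        client, err := datastore.NewClient(ctx, "my-project")
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        // Newlines and Markdown are stored and returned unchanged.
        in := Post{UserID: 1, Content: "# Title\n\nFirst line\nSecond *line*"}
        key, err := client.Put(ctx, datastore.IncompleteKey("Post", nil), &in)
        if err != nil {
            log.Fatal(err)
        }

        var out Post
        if err := client.Get(ctx, key, &out); err != nil {
            log.Fatal(err)
        }
        fmt.Println(out.Content == in.Content) // prints: true
    }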

    "Dirty user input?" That's the issue of rendering / presenting the data. The datastore will not try to interpret it or attempt to perform any action based on its content, nor will transform it. So from the Datastore point of view, you have nothing to worry about (you don't create text GQLs by appending text ever, right?!).

    Also note that if you store large texts in your entities, those large texts will be fetched whenever you load / query such entities, and you must also send them whenever you modify and (re)save such an entity.

    Tip #1: Use projection queries when certain queries don't need the whole text; this avoids moving "big" data around and ultimately speeds up those queries.
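
    As an illustrative sketch (the kind name and the helper name are assumptions, and it reuses the Post type and imports from the round-trip example above), a projection query transfers only the projected, indexed properties, so the large content property never leaves the Datastore:

    // ListPostAuthors is a hypothetical helper: it fetches only the
    // indexed "uid" property of each Post, never the large "content".
    func ListPostAuthors(ctx context.Context, client *datastore.Client) ([]Post, error) {
        q := datastore.NewQuery("Post").
            Project("uid") // only indexed properties can be projected
        var posts []Post
        _, err := client.GetAll(ctx, q, &posts)
        return posts, err
    }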

    Tip #2: To "ease" the burden of not being able to index large texts, you may add duplicate properties such as a short summary or the title of the large text, because string values no longer than 1500 bytes can be indexed.
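
    For example (the field names below are illustrative, not from the original answer), you could keep a short, indexed title next to the unindexed content and filter on that:

    type Post struct {
        UserID  int64  `datastore:"uid"`
        Title   string `datastore:"title"`            // short (< 1500 bytes), indexed, filterable
        Content string `datastore:"content,noindex"`  // full text, too long to index
    }

    // q := datastore.NewQuery("Post").Filter("title =", "Storing long text")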

    Tip #3: If you want to go over the 1 MB entity size limit, or you just want to decrease your datastore size usage in general, you may opt to store large texts compressed inside entities. Since they are too long to be indexed, you can't search / filter them anyway, and they compress very well (often to below 40% of the original size). So if you have many long texts, you can shrink your datastore size to roughly a third just by storing all texts compressed. Of course this adds to the entity save / load time (as you have to compress / decompress the texts), but it is often still worth it.
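
    A minimal sketch of what the compression could look like with the standard compress/gzip package (the package, struct and helper names are made up for illustration):

    package post

    import (
        "bytes"
        "compress/gzip"
        "io"
    )

    // CompressedPost stores the gzipped text in an unindexed byte slice.
    type CompressedPost struct {
        UserID  int64  `datastore:"uid"`
        Content []byte `datastore:"content,noindex"` // gzip-compressed text
    }

    // compress gzips the text before it is saved into Content.
    func compress(s string) ([]byte, error) {
        var buf bytes.Buffer
        zw := gzip.NewWriter(&buf)
        if _, err := zw.Write([]byte(s)); err != nil {
            return nil, err
        }
        if err := zw.Close(); err != nil {
            return nil, err
        }
        return buf.Bytes(), nil
    }

    // decompress restores the original text after the entity is loaded.
    func decompress(b []byte) (string, error) {
        zr, err := gzip.NewReader(bytes.NewReader(b))
        if err != nil {
            return "", err
        }
        defer zr.Close()
        out, err := io.ReadAll(zr)
        return string(out), err
    }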

    This answer was selected as the best answer by the asker.