weixin_39983383
2021-01-12 13:50

Titan Version Upgrade

Titan auto-upgrades the database's storage version as soon as it touches it (e.g. the database is at 0.3.1, then I access it with 0.3.2, and 0.3.1 clients no longer work). That auto-upgrade can lead to an accidental upgrade, which can break things unexpectedly if you aren't paying attention.

I'm not sure what's viable here, but I think that a configuration option that allows or disallows upgrade on connection (defaulting to disallow) would solve the problem. That way one would have to explicitly control the upgrade process.
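For illustration only, here is a rough sketch of what such an opt-in guard could look like; the configuration key and method below are hypothetical and not part of Titan's actual API:

```java
// Hypothetical sketch of the proposed "disallow upgrade by default" flag; the
// key name and method are invented for illustration, not Titan's real API.
public class UpgradeGuardSketch {

    static final String ALLOW_UPGRADE_KEY = "storage.allow-version-upgrade"; // hypothetical key

    static void checkVersionOnConnect(String storedVersion, String runtimeVersion,
                                      boolean allowUpgrade) {
        if (storedVersion.equals(runtimeVersion)) {
            return; // versions already match, nothing to do
        }
        if (!allowUpgrade) {
            // Default: refuse to touch the stored version so that older
            // clients sharing the backend keep working.
            throw new IllegalStateException("Storage version " + storedVersion
                    + " does not match runtime version " + runtimeVersion
                    + "; set " + ALLOW_UPGRADE_KEY + "=true to upgrade explicitly.");
        }
        // Explicitly allowed: perform the upgrade by writing the new version.
        System.out.println("Upgrading storage version " + storedVersion + " -> " + runtimeVersion);
    }
}
```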

This question originates from the open-source project: thinkaurelius/titan


6 replies

  • weixin_39983383 · 4 months ago

    Great explanation. I think I fully understand how "upgrade" will work now. I think that if all releases within a given 0.x line are backward/forward compatible, that will be good. That way, folks can take advantage of Titan patches without a lot of thinking. Thanks for the details.

  • weixin_39640417 · 4 months ago

    Hi Stephen. Thanks for documenting this. We already have multiple services talking with the graph and we need to be able to upgrade systems independently. Steve asked me to escalate this issue. Thanks!

  • weixin_39983383 · 4 months ago

    Or, can one of you comment on what version we might see a fix for this in, and how Titan might better deal with this issue?

  • weixin_39621185 · 4 months ago

    Matthias and I conferred about this. We think the storage format version should be decoupled from the Titan release version and incremented separately. The only compatibility change in the 0.3.x series that I'm aware of affects ES. I think we can relax the check for 0.3.x and move to a different storage format versioning scheme (separate from Titan release versions, incremented only for backwards-incompatible layout changes) for 0.4.x and beyond.

    For the titan03 branch, I think we can support independent system upgrades on a shared backend by relaxing the check-and-upgrade logic in Backend.java's initialize(...) method to accept storage versions starting with "0.3." interchangeably. I'm planning to test this out and update this issue.
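    As a rough illustration of what that relaxed check might look like, here's a minimal sketch; the class and method names are made up and do not match the actual Backend.java code:

    ```java
    // Illustrative sketch only (not the real Backend.java logic): treat every
    // 0.3.x release as interchangeable on a shared storage backend.
    public class RelaxedVersionCheckSketch {
        static boolean isCompatibleStorageVersion(String storedVersion, String runtimeVersion) {
            if (storedVersion.equals(runtimeVersion)) return true;
            // Accept any stored version sharing the "0.3." prefix instead of
            // requiring an exact match or triggering an automatic upgrade.
            return storedVersion.startsWith("0.3.") && runtimeVersion.startsWith("0.3.");
        }
    }
    ```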

  • weixin_39983383 · 4 months ago

    Can you elaborate a bit on what the behavior is now in regard to this? I'm sorry, but from your description it's not entirely clear to me. There were two problems we were facing. First, the auto-upgrade allowed accidents to happen (upgrading when an upgrade wasn't expected and thereby breaking other systems using the old version).

    The second problem I didn't go into exactly in the description, but it's related in a way: right now, when Titan 0.3.1 upgrades to 0.3.2, it happens the moment any process touches the graph with a 0.3.2 artifact. So, if service A and service B both rely on Titan and service A is redeployed with 0.3.2, service B will immediately start seeing problems until it too can upgrade.

    Based on the changes you've made, how will an upgrade affect the scenario I've laid out? How do systems with lots of independent services all talking to the graph cleanly upgrade without downtime? Any thoughts on that matter?

  • weixin_39621185 · 4 months ago

    442f468 disabled Titan's automatic backend-version upgrade behavior. The patched code in Backend.java still checks the version at startup, and it still requires that the version belongs to a set of known-compatible versions. In 0.3.1, the set is { 0.3.0, 0.3.1 }. For 0.3.2, it's { 0.3.0, 0.3.1, 0.3.2 }. In 0.4.0 it will be something else. But the patched code won't modify an existing version in the storage backend. It only writes the version if it is null (initializing an empty store). For the 0.3.x branch, we'll probably leave the storage version at 0.3.2 unless there's some breaking data format change, and that seems far-fetched for a bugfix-level version increment.
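    A minimal sketch of that behavior, assuming only what's described above (the names are illustrative and not Titan's real Backend.java symbols):

    ```java
    import java.util.Set;

    // Illustrative sketch of the patched startup check: validate the stored
    // version against a known-compatible set, write it only when the store is
    // empty, and never overwrite an existing version.
    public class StorageVersionCheckSketch {

        static void verifyOrInitializeStorageVersion(String storedVersion,
                                                     String runtimeVersion,
                                                     Set<String> compatibleVersions) {
            if (storedVersion == null) {
                // Empty store: record the current version once, at initialization time.
                writeStorageVersion(runtimeVersion);
                return;
            }
            if (!compatibleVersions.contains(storedVersion)) {
                throw new IllegalStateException("Storage version " + storedVersion
                        + " is not compatible with Titan " + runtimeVersion);
            }
            // Known-compatible existing version: leave it untouched so other
            // processes running older 0.3.x releases keep working.
        }

        static void writeStorageVersion(String version) {
            // Placeholder for the backend write; per the description above, the
            // real patch performs this only when initializing an empty store.
        }
    }
    ```

    Under this sketch, a 0.3.2 process would pass a compatible set of { 0.3.0, 0.3.1, 0.3.2 } and leave an existing 0.3.1 entry alone.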

    Re problem 1: auto-upgrade was deleted.

    Re problem 2: if this patch is brought online against a storage backend alongside Titan 0.3.1 processes, it will not modify the version in the storage backend and should not interfere with those 0.3.1 processes.

    Re "how do systems with lots of indepedent services all talking to the graph cleanly upgrade without downtime": in this issue, we were basically forcing little spurious version upgrades even though nothing in the storage format had changed. When we actually do change storage format, this will be harder. The severity of the specific format change will dictate whether we can efficiently and safely apply it online or whether they'll require an offline conversion.

    Does that address the scenario you laid out? Did I miss something?

    442f468a5b0c962b5c087baa682302976a08cbb8 is on master. It touches a bunch of context lines that don't exist in titan03, mostly just to change whitespace. I don't think git cherry-pick can natively ignore whitespace. So I generated a whitespace-ignored patch with git show -p -w 442f468a5b0c962b5c087baa682302976a08cbb8 and applied it on titan03: 0934f636844abe3e6be67f358f53b328cc70cf37.

    Is this useful to you? How can we help get this applied to your cluster? Could you use guidance downgrading an accidentally-upgraded 0.3.2 cluster to 0.3.1?

