doupin5408 2018-07-26 01:54

How do I encode blob names that end with a period?

Azure docs:

Avoid blob names that end with a dot (.), a forward slash (/), or a sequence or combination of the two.

I cannot avoid such names because of legacy S3 compatibility, so I must encode them.

How should I encode such names?

I don't want to use base64, since that would make debugging very hard when looking at names in Azure's blob console.

Go has https://golang.org/pkg/net/url/#QueryEscape but it has this limitation:

Go's implementation of url.QueryEscape (specifically, the private shouldEscape function) escapes all characters except letters, decimal digits, '-', '_', '.', and '~', so a trailing dot is left unescaped.
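To illustrate the limitation, here is a small sketch that passes a few dot-terminated names through url.QueryEscape. The wrapper name escapeName and the sample names are made up for this example; only url.QueryEscape comes from the standard library.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeName is a hypothetical wrapper used only for this illustration.
// It applies url.QueryEscape, which leaves '.' untouched.
func escapeName(name string) string {
	return url.QueryEscape(name)
}

func main() {
	// Sample names are assumptions chosen to show trailing dots.
	for _, name := range []string{"single.", "double..", "dir/file."} {
		fmt.Printf("%q -> %q\n", name, escapeName(name))
	}
}
```

The trailing dots come back unchanged, while the slash is percent-encoded, so QueryEscape alone does not produce a name that satisfies the Azure recommendation.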

1 answer

  • doxp30826 2018-07-26 08:35

    I don't think there is any universal solution for this outside your application's scope. Within your application's scope you can use ANY encoding, so it comes down to personal preference for how you want your data laid out. There is no "right" way to do this.

    Regardless, I believe you should go for these properties:

    • The conversion MUST be bidirectional and free of conflicts within your expected file name space.
    • DO keep file names that don't end in dots unencoded.
    • For file names that end in dots, DO encode only the conflicting dots, keeping the original name readable.

    This keeps most files (the non-conflicting ones) short, with their original, intuitive, and hopefully meaningful names. And should you ever be able to rename or phase out the conflicting files, you can simply remove the conversion logic without restructuring all the stored data and its URLs.

    I'll suggest two examples. Let's say you have these files:

    /someParent/normal.txt
    /someParent/extensionless
    /someParent/single.
    /someParent/double.. 
    

    Use special subcontainers

    You could remove the N trailing dots from the file name and translate them into a subcontainer name: "dot", "dotdot", etc.

    The resulting URLs would look like:

    /someParent/normal.txt
    /someParent/extensionless
    /someParent/dot/single
    /someParent/dotdot/double
    

    When reading, you remove the "dot"*N folder level and append N dots back to the file name. Obviously, this assumes you never need such "dot" folders as data themselves.

    This is preferred if stored files can come in with any extension but you can make some assumptions about the folder structure.
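The subcontainer approach above could be sketched in Go roughly as follows. The function names encodeDots and decodeDots are hypothetical, and the sketch assumes that no real folder is ever named "dot", "dotdot", and so on, and that a base name never consists only of dots.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeDots moves N trailing dots of the base name into a
// "dot"*N folder level. Names without trailing dots pass through.
// Assumption: the base name is not made up only of dots.
func encodeDots(p string) string {
	idx := strings.LastIndex(p, "/")
	prefix, base := p[:idx+1], p[idx+1:]
	trimmed := strings.TrimRight(base, ".")
	n := len(base) - len(trimmed)
	if n == 0 {
		return p
	}
	return prefix + strings.Repeat("dot", n) + "/" + trimmed
}

// dotCount returns N if seg is exactly "dot" repeated N times, else 0.
func dotCount(seg string) int {
	if seg == "" || len(seg)%3 != 0 {
		return 0
	}
	n := len(seg) / 3
	if strings.Repeat("dot", n) != seg {
		return 0
	}
	return n
}

// decodeDots reverses encodeDots: it drops a "dot"*N parent folder
// and appends N dots to the base name.
// Assumption: no genuine folder is ever named "dot", "dotdot", etc.
func decodeDots(p string) string {
	idx := strings.LastIndex(p, "/")
	if idx < 0 {
		return p
	}
	dir, base := p[:idx], p[idx+1:]
	j := strings.LastIndex(dir, "/")
	n := dotCount(dir[j+1:])
	if n == 0 {
		return p
	}
	return dir[:j+1] + base + strings.Repeat(".", n)
}

func main() {
	names := []string{
		"/someParent/normal.txt",
		"/someParent/extensionless",
		"/someParent/single.",
		"/someParent/double..",
	}
	for _, name := range names {
		enc := encodeDots(name)
		fmt.Printf("%s -> %s -> %s\n", name, enc, decodeDots(enc))
	}
}
```

Running the round trip over the four sample files is a quick way to check that the conversion is bidirectional for the names you expect.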

    Use discardable artificial extension

    Since the conflict is at the end of the name, you could just append a never-used dummy extension to the affected files. For example "endswithdots", though you could choose something more suitable depending on which extensions you expect:

    /someParent/normal.txt
    /someParent/extensionless
    /someParent/single.endswithdots
    /someParent/double..endswithdots
    

    On reading, if the file extension is "endswithdots", remove the "endswithdots" part from the end of the file name.

    This is preferred if your data could have any container structure but you can make some assumptions about incoming extensions.
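A minimal Go sketch of the artificial-extension approach might look like this. The names addDummyExt and stripDummyExt are hypothetical, and the sketch assumes no incoming file ever legitimately uses the "endswithdots" extension.

```go
package main

import (
	"fmt"
	"strings"
)

// dummyExt is the never-used extension from the example above.
// Assumption: no real file ever arrives with this extension.
const dummyExt = "endswithdots"

// addDummyExt appends the dummy extension to names ending in a dot
// and leaves every other name unchanged.
func addDummyExt(name string) string {
	if strings.HasSuffix(name, ".") {
		return name + dummyExt
	}
	return name
}

// stripDummyExt reverses addDummyExt, keeping the original dots.
func stripDummyExt(name string) string {
	if strings.HasSuffix(name, "."+dummyExt) {
		return strings.TrimSuffix(name, dummyExt)
	}
	return name
}

func main() {
	names := []string{
		"/someParent/normal.txt",
		"/someParent/extensionless",
		"/someParent/single.",
		"/someParent/double..",
	}
	for _, name := range names {
		enc := addDummyExt(name)
		fmt.Printf("%s -> %s -> %s\n", name, enc, stripDummyExt(enc))
	}
}
```

Because only the suffix is touched, the stored names stay readable in the blob console, and dropping the scheme later just means removing these two calls.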


    I would advise against Base64 or any other full-name encoding, as it would make file names notably longer and lose any meaningful details the file names may contain.

    This answer was accepted by the asker as the best answer.
