Admittedly, things are easier if Markdown is run before Bleach. This discussion, however, assumes that Bleach must be run first, for example when I want to store safe Markdown text and render it later.
Escaping with Bleach
When escaping Markdown markup with Bleach, `>` characters are always escaped and replaced with `&gt;`, making blockquotes impossible.
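To make the problem concrete, here is a minimal sketch that uses Python's standard `html.escape` as a stand-in for the escaping step. This is an assumption for illustration only: it is not Bleach itself, and Bleach's output may differ in other respects.

```python
import html

# Stand-in for the escaping step (assumption: html.escape only mimics
# how ">" becomes "&gt;"; it is not Bleach).
source = "> This should be a blockquote"
escaped = html.escape(source, quote=False)

# The line no longer starts with ">", so a Markdown parser would treat it
# as an ordinary paragraph rather than a blockquote.
print(escaped)
```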
My first thought was to "teach" Bleach to handle `>` correctly, but I don't think that is appropriate: Bleach is not Markdown-specific and cannot be expected to learn about every markup language that can include HTML.
My next thought is to "teach" the Markdown parser about escaped `&gt;`'s that would otherwise result in a blockquote.
This would use a regex similar to the one that transforms `>` into a blockquote, but one that looks for `&gt;` instead of `>`, possibly packaged as a Markdown extension: unescape_blockquotes.
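The regex idea could look roughly like the sketch below. It is a plain standard-library function, not an actual Markdown extension, and the pattern is my own assumption rather than the one the Markdown parser uses: it restores `&gt;` only at the start of a line, after at most three spaces.

```python
import re

# Hypothetical pattern: one or more "&gt;" markers at the start of a line,
# optionally indented by up to three spaces.
ESCAPED_QUOTE_RE = re.compile(r"^( {0,3})((?:&gt; ?)+)", re.MULTILINE)


def unescape_blockquotes(text: str) -> str:
    """Turn leading "&gt;" markers back into ">" and leave the rest escaped."""
    def restore(match: re.Match) -> str:
        return match.group(1) + match.group(2).replace("&gt;", ">")

    return ESCAPED_QUOTE_RE.sub(restore, text)


escaped = "&gt; quoted line\n&gt; &gt; nested quote\n1 &gt; 0 stays escaped"
print(unescape_blockquotes(escaped))
```

An `&gt;` in the middle of a line is left alone, so only text that could have been a blockquote marker is affected. A real extension would also need to decide what to do with such lines inside code blocks, which this sketch ignores.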
- Does this make sense?
- Is there a better way to solve this problem?
- weixin_39994806, 5 months ago
I would not try running Bleach on Markdown text. Bleach uses html5lib under the hood, and I would expect the output to be mangled by Bleach. Yes, we recommend Bleach as a way to sanitize Markdown, but only after rendering the Markdown text as HTML.
- weixin_39804603, 5 months ago
There are multiple reasons why using Bleach to sanitize Markdown is VERY lacking and has a lot of issues:
  - You can't prevent the user from writing HTML tags themselves.
  - You have to specify all the tags and attributes that you want to allow. Is there a list somewhere of all the tags and attributes that Markdown can generate?
  - If you use a plugin that outputs a special tag with classes or something similar (for example, to embed a YouTube video), you don't want the user to be able to put arbitrary iframes in Markdown, so now you have to write a special callable filter for Bleach to allow this.
  - If you process something like `<script></script>Hi!` with Markdown, it outputs `'<script></script>\n\n<p>Hi</p>'`. If you now process that with Bleach allowing `<script>`, the script tags end up outside a paragraph.