douketangyouzh5219 2016-12-01 01:28

Should I allow content containing script tags (from a WordPress database) to be dynamically inserted into the HTML of my app?

I am building an app for a local online newspaper company.

They have an existing website, a WordPress site, where they upload news stories (WordPress posts).

The only people uploading the news stories are journalists within the company.

In one of the main sections of the app I'm building, I connect to this WordPress database (with a PHP file on the same server) and retrieve news story content to display within the app. I built this service myself with PHP and use JavaScript to insert the content into the HTML on the client side.

I have been reading up on security (including the OWASP cheat sheet for XSS prevention) and have been taking the necessary steps to make the app as secure as possible, including encoding the data before inserting it into the HTML. However, some of the content coming from the database contains HTML, and this is where my concern lies (more details below).

Here is the flow of the app:

Establish a PDO connection with the WordPress database (also setting the charset to UTF-8 and calling setAttribute(PDO::ATTR_EMULATE_PREPARES, false)), as stated here, for protection against SQL injection.

<?php
include_once 'wp_psl_config.php';
//initiate a PDO connection
$pdoConnection = new PDO(HOSTDBNAME, USER, PASSWORD);
$pdoConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$pdoConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdoConnection->exec("SET CHARACTER SET utf8");
?>

I am using parameterized queries and prepared statements to retrieve news stories as follows:

function getStoryData($story_id, $pdoConnection){
   $data = array();
   $query = 'SELECT * FROM wp_posts WHERE ID=:story_id';

   $statement = $pdoConnection->prepare($query);
   $statement->bindValue(':story_id', $story_id, PDO::PARAM_INT);
   $statement->execute();
   $statement->setFetchMode(PDO::FETCH_ASSOC);
   //store content into $data array ($data stays empty if no post matches)
   $row = $statement->fetch();
   if ($row !== false) {
      $data = $row;
   }
   return $data;
}

On the client side I have been using the OWASP ESAPI JavaScript library to encode content before inserting it into the HTML. I use the encodeForHTML() function to encode post_title, post_excerpt, post_date, etc. before inserting them into my HTML, as these do not contain any HTML that needs to be rendered.

Here is an example of my JavaScript/jQuery code for generating and inserting the HTML:

var safe_post_title = $ESAPI.encoder().encodeForHTML(post_title);
var safe_story_html = '<h3 class="story_headline">' + safe_post_title + '</h3>';        
$('#story_area').html(safe_story_html);
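
For readers unfamiliar with what HTML encoding does to a value like post_title, here is a minimal sketch of the idea in plain JavaScript. It is not ESAPI's implementation (encodeForHTML encodes a wider set of characters); the helper name escapeHtml and the sample title are hypothetical and only illustrate why an encoded title cannot introduce markup.

```javascript
// Hypothetical helper, not part of ESAPI: replaces the five characters
// that are significant in HTML text and in quoted attribute values.
function escapeHtml(text) {
  var replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#x27;'
  };
  return String(text).replace(/[&<>"']/g, function (ch) {
    return replacements[ch];
  });
}

// Made-up title containing markup, to show that it ends up as inert text.
var post_title = 'Council <script>alert(1)</script> & budget';
var safe_story_html = '<h3 class="story_headline">' + escapeHtml(post_title) + '</h3>';
console.log(safe_story_html);
```

The script tag in the sample title is emitted as text rather than as an element, which is the property the encoded fields rely on.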

However, the WordPress post_content field (which contains the main story content) contains many different HTML elements as well as script tags, and this is where my concern is.

Here is an example of the data in the WordPress post_content field:

Line of text... more text... more text.
more text...
If you're not sure who represents you, you can find out 
<a href="http://example.com/">here</a>. 

<h5>Search here:</h5> 

<div id="ragic_webview"></div> 

<script type="text/javascript">// <![CDATA[ 

var ragic_url = 'www.ragic.com/companyname/sheets/3'; 
var ragic_feature= 'fts'; 
var exactMatch = true; 

/* * * DON'T EDIT BELOW THIS LINE * * */ 

(function() { 
var rq = document.createElement('script'); 
rq.type = 'text/javascript'; 
rq.async = true; 
rq.src = window.location.protocol == "https:" ? "https://www.ragic.com/intl/common/loadfts.js" : "http://www.ragic.com/intl/common/loadfts.js"; 

(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(rq); 

})(); 


// ]]>
</script> 

<noscript>Please enable JavaScript to view the <a href="http://www.ragic.com/?ref_noscript">Online database form by Ragic.</a></noscript> 
<a id="ragic-link" href="http://www.ragic.com">online database form by <span class="logo-ragic">Ragic</span></a> 

Another example of post_content data:

Line of text... more text... more text.
more text...

<script id="infogram_0_housing_list_by_area" src="//e.infogr.am/js/embed.js?c5h" type="text/javascript"></script> 


<div style="width: 100%; padding: 8px 0; font-family: Arial; font-size: 13px; line-height: 15px; text-align: center;">

<a style="color: #989898; text-decoration: none;" href="https://infogr.am/housing_list_by_area" target="_blank">Housing List, by Area</a> <span class="break_between_paragraphs"></span>

<a style="color: #989898; text-decoration: none;" href="https://infogr.am" target="_blank">

Create your own infographics</a>
</div>

Some main questions I have:

  1. The company has anti-spam protection on their WordPress site. Does this lessen the security concern for me when displaying this content in the app?

  2. Also, should I allow the script tags at all?

  3. Overall, can you give me some advice on the most secure way to display this data? I have looked into HTML Purifier. Is this a good option?

1 answer

  • douwen9345 2016-12-01 01:59

    The company has anti-spam protection on their WordPress site. Does this lessen the security concern for me when displaying this content in the app?

    Not even a little bit. WordPress anti-spam plugins only screen comments.

    Also, should I allow the script tags at all?

    This will depend on your use case. Your example posts appear to include <script> tags that were intentionally inserted as part of a post, so you may need to leave them in.
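
If the embeds do have to stay, one way to narrow the risk is to load external scripts only from hosts the newspaper already relies on. The following is a minimal sketch of such a check, assuming the two hosts seen in the sample posts (www.ragic.com and e.infogr.am) are the ones to allow. The function name isAllowedScriptSrc, the allowlist, and the placeholder base URL are all hypothetical, and the check covers only external src URLs, not inline script bodies.

```javascript
// Assumed allowlist, taken from the sample posts in the question.
var ALLOWED_SCRIPT_HOSTS = ['www.ragic.com', 'e.infogr.am'];

// Returns true only when src resolves to an http(s) URL on an allowed host.
function isAllowedScriptSrc(src, allowedHosts) {
  var url;
  try {
    // A base is needed to resolve protocol-relative URLs such as
    // "//e.infogr.am/js/embed.js?c5h"; this base is only a placeholder.
    url = new URL(src, 'https://app.example/');
  } catch (e) {
    return false; // unparsable URL
  }
  if (url.protocol !== 'https:' && url.protocol !== 'http:') {
    return false; // rejects javascript:, data:, and similar schemes
  }
  // Exact hostname match, so "www.ragic.com.evil.example" is rejected.
  return allowedHosts.indexOf(url.hostname) !== -1;
}

console.log(isAllowedScriptSrc('//e.infogr.am/js/embed.js?c5h', ALLOWED_SCRIPT_HOSTS));
console.log(isAllowedScriptSrc('https://evil.example/x.js', ALLOWED_SCRIPT_HOSTS));
```

With the assumed allowlist, the first call accepts the infogr.am embed from the sample post and the second rejects a script hosted elsewhere. Relative paths resolve against the placeholder base and are rejected as well.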

    Overall, can you give me some advice on the most secure way to display this data? I have looked into HTML Purifier. Is this a good option?

    In general, yes. HTML Purifier is a good way of dealing with untrusted HTML.

    In this specific case, probably not. From what you've described, the HTML content is all being written by users with special access to the application (the journalists) -- it's trusted input, and may not need to be filtered.

    This answer was selected by the asker as the best answer.
