douketangyouzh5219 2016-12-01 01:28

Should I allow content containing script tags (from a WordPress database) to be dynamically inserted into the HTML in my app?

I am building an app for a local online newspaper company.

They have an existing WordPress website where they upload news stories (WordPress posts).

The only people uploading the news stories are journalists within the company.

In one of the main sections of the app I'm building, I connect to this WordPress database (with a PHP file on the same server) and retrieve news story content to display within the app. I built this service myself in PHP and use JavaScript to insert the content into the HTML on the client side.

I have been reading up on security (including the OWASP cheat sheet for XSS prevention) and have been taking the necessary steps to make the app as secure as possible, including encoding the data before inserting it into the HTML. However, some of the content coming from the database contains HTML, and this is where my concern lies (more details below).

Here is the flow of the app:

Establish a PDO connection with the WordPress database, setting the charset to UTF-8 and calling setAttribute(PDO::ATTR_EMULATE_PREPARES, false), as recommended for protection against SQL injection.

<?php
include_once 'wp_psl_config.php';
//initiate a PDO connection
$pdoConnection = new PDO(HOSTDBNAME, USER, PASSWORD);
$pdoConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$pdoConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdoConnection->exec("SET CHARACTER SET utf8");
?>

I am using parameterized queries and prepared statements to retrieve news stories as follows:

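function getStoryData($story_id, $pdoConnection){
   $query = 'SELECT * FROM wp_posts WHERE ID = :story_id';

   // Prepare the statement and bind the ID as an integer before executing
   $statement = $pdoConnection->prepare($query);
   $statement->bindValue(':story_id', $story_id, PDO::PARAM_INT);
   $statement->execute();

   // Fetch the single matching row as an associative array
   $data = $statement->fetch(PDO::FETCH_ASSOC);

   // fetch() returns false when no row matches; return an empty array instead
   return $data === false ? array() : $data;
}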

On the client side I have been using the OWASP ESAPI JavaScript library to encode content before inserting it into the HTML. I use the encodeForHTML() function to encode post_title, post_excerpt, post_date, etc. before inserting them, as these fields do not contain any HTML that needs to be rendered.

Here is an example of my JavaScript/jQuery code for generating and inserting the HTML:

var safe_post_title = $ESAPI.encoder().encodeForHTML(post_title);
var safe_story_html = '<h3 class="story_headline">' + safe_post_title + '</h3>';        
$('#story_area').html(safe_story_html);
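To make the encoding step concrete, here is a minimal sketch of what HTML-encoding a plain-text field amounts to. The helper `escapeHtml` and the sample title are hypothetical and are not part of ESAPI; ESAPI's encodeForHTML() encodes a wider range of characters, so treat this as an illustration of the idea rather than a replacement.

```javascript
// Hypothetical helper (NOT the ESAPI implementation): replaces the five
// characters that are significant in HTML text and quoted attribute values.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#x27;'
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical title, chosen to show markup being neutralised
const title = '<img src=x onerror=alert(1)> Council & housing';

// The encoded title is now inert text inside the heading
const headline = '<h3 class="story_headline">' + escapeHtml(title) + '</h3>';
```

Because the angle brackets and ampersand are turned into entities, the browser shows the title as literal text instead of parsing it as an element. This is exactly why the same treatment cannot be applied to a field whose HTML is meant to be rendered.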

However, the WordPress post_content field (which contains the main story content) contains many different HTML elements and also script tags, and this is where my concern is.

Here is an example of the data in the wordpress post_content field:

Line of text... more text... more text.
more text...
If you're not sure who represents you, you can find out 
<a href="http://example.com/">here</a>. 

<h5>Search here:</h5> 

<div id="ragic_webview"></div> 

<script type="text/javascript">// <![CDATA[ 

var ragic_url = 'www.ragic.com/companyname/sheets/3'; 
var ragic_feature= 'fts'; 
var exactMatch = true; 

/* * * DON'T EDIT BELOW THIS LINE * * */ 

(function() { 
var rq = document.createElement('script'); 
rq.type = 'text/javascript'; 
rq.async = true; 
rq.src = window.location.protocol == "https:" ? "https://www.ragic.com/intl/common/loadfts.js" : "http://www.ragic.com/intl/common/loadfts.js"; 

(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(rq); 

})(); 


// ]]>
</script> 

<noscript>Please enable JavaScript to view the <a href="http://www.ragic.com/?ref_noscript">Online database form by Ragic.</a></noscript> 
<a id="ragic-link" href="http://www.ragic.com">online database form by <span class="logo-ragic">Ragic</span></a> 

Another example of post_content data:

Line of text... more text... more text.
more text...

<script id="infogram_0_housing_list_by_area" src="//e.infogr.am/js/embed.js?c5h" type="text/javascript"></script> 


<div style="width: 100%; padding: 8px 0; font-family: Arial; font-size: 13px; line-height: 15px; text-align: center;">

<a style="color: #989898; text-decoration: none;" href="https://infogr.am/housing_list_by_area" target="_blank">Housing List, by Area</a> <span class="break_between_paragraphs"></span>

<a style="color: #989898; text-decoration: none;" href="https://infogr.am" target="_blank">

Create your own infographics</a>
</div>

Some main questions I have:

  1. The company has anti-spam protection on their WordPress site. Does this lessen the security concern for me when displaying this content in the app?

  2. Should I allow the script tags at all?

  3. Overall, what is the most secure way to display this data? I have looked into HTML Purifier. Is this a good option?

1 answer

  • douwen9345 2016-12-01 01:59

    > The company has anti-spam protection on their WordPress site. Does this lessen the security concern for me when displaying this content in the app?

    Not even a little bit. WordPress anti-spam plugins only screen comments.

    > Should I allow the script tags at all?

    This will depend on your use case. Your example posts appear to include <script> tags that were intentionally inserted as part of a post, so you may need to leave them in.

    > Overall, what is the most secure way to display this data? I have looked into HTML Purifier. Is this a good option?

    In general, yes. HTML Purifier is a good way of dealing with untrusted HTML.

    In this specific case, probably not. From what you've described, the HTML content is all being written by users with special access to the application (the journalists) -- it's trusted input, and may not need to be filtered.
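    If the script tags are kept but you still want some control over which third-party hosts a post can load scripts from, one possible approach is to check each script's src against an allowlist before inserting the content. The sketch below is only an illustration under stated assumptions: `ALLOWED_SCRIPT_HOSTS`, `BASE_URL` and `isAllowedScriptSrc` are hypothetical names, the two hosts are taken from the example posts in the question, and requiring https is a design choice rather than something the question asks for.

```javascript
// Hypothetical allowlist: hosts taken from the two example posts in the
// question. A real list would have to be agreed with the editors.
const ALLOWED_SCRIPT_HOSTS = new Set(['www.ragic.com', 'e.infogr.am']);

// Placeholder base URL, used only to resolve relative and
// protocol-relative src values such as "//e.infogr.am/js/embed.js?c5h".
const BASE_URL = 'https://app.example.com/';

// Returns true only for https URLs whose hostname is on the allowlist.
function isAllowedScriptSrc(src) {
  let url;
  try {
    url = new URL(src, BASE_URL);
  } catch (err) {
    return false; // src could not be parsed as a URL: reject it
  }
  return url.protocol === 'https:' && ALLOWED_SCRIPT_HOSTS.has(url.hostname);
}
```

    Note that this only covers scripts loaded through a src attribute. An inline script such as the Ragic snippet has no src to check, so it would still need to be either trusted as written or handled separately.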
