duanmiyang6201
2019-08-08 21:17
浏览 223

检测base64 dataURL图像中的恶意代码或文本

I have the following 3 "dataURL image", all of them returns the same image if you open them via "URL", but two of the below dataURL code has "PHP code" and "JavaScript code" embedded on last.

How can I remove those malicious codes from my base64 dataURL image coming from users I don't trust.

base64 dataURL image (safe):

data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAMCAgMCAgMDAwMEAwMEBQgFBQQEBQoHBwYIDAoMDAsKCwsNDhIQDQ4RDgsLEBYQERMUFRUVDA8XGBYUGBIUFRT/2wBDAQMEBAUEBQkFBQkUDQsNFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBT/wAARCAA2AFwDASIAAhEBAxEB/8QAFwABAQEBAAAAAAAAAAAAAAAAAAIBA//EACcQAAEDAwQBBAMBAAAAAAAAAAABAgMTUVIREhSRYQQjQWIxMkJy/8QAFwEBAQEBAAAAAAAAAAAAAAAAAAEHCP/EABgRAQEBAQEAAAAAAAAAAAAAAAABEUEx/9oADAMBAAIRAxEAPwCeNJZRxpLKRVddRVddTFeuvZ4pfTSYqpnHkwUlZHO/pTN7sl7IL48mCjjyYKRvdkvZqPci/lewOnGkso40llIquuoquuoF8aSymO9NIn8qpNV11MWRy/KgVx5MFC+nmX8b2/5doRvdkvY3uyXsDvugxUboMVI4z7DjPsXpPFufBi4nWDF3ZKwPT41FGTFxBWsGLuxrDg7smjJi4yjJgoHXdBio3QYqRxn2HGfYC90GKhzoMV7I4z7BfTPRP1UDdYMXdjWDF3ZNGTFwoyYuAVn5KKz8lOm6GyjdDZSnHKq9flTKzrqdVWFfhTPY+xFc6zrqKz8lOnsfYez9gJrPyUVn5KdN0NlG6Gygc6z8lMdM9U/ZTruhspirCuQHKs66is66nT2PsPY+wHOn5FPyACSMVuia6mABcgbtABkbT8in5ABkKfkxW6eQAZGAAGR//9k=

base64 dataURL 2 image (PHP code injected):

data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAMCAgMCAgMDAwMEAwMEBQgFBQQEBQoHBwYIDAoMDAsKCwsNDhIQDQ4RDgsLEBYQERMUFRUVDA8XGBYUGBIUFRT/2wBDAQMEBAUEBQkFBQkUDQsNFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBT/wAARCAA2AFwDASIAAhEBAxEB/8QAFwABAQEBAAAAAAAAAAAAAAAAAAIBA//EACcQAAEDAwQBBAMBAAAAAAAAAAABAgMTUVIREhSRYQQjQWIxMkJy/8QAFwEBAQEBAAAAAAAAAAAAAAAAAAEHCP/EABgRAQEBAQEAAAAAAAAAAAAAAAABEUEx/9oADAMBAAIRAxEAPwCeNJZRxpLKRVddRVddTFeuvZ4pfTSYqpnHkwUlZHO/pTN7sl7IL48mCjjyYKRvdkvZqPci/lewOnGkso40llIquuoquuoF8aSymO9NIn8qpNV11MWRy/KgVx5MFC+nmX8b2/5doRvdkvY3uyXsDvugxUboMVI4z7DjPsXpPFufBi4nWDF3ZKwPT41FGTFxBWsGLuxrDg7smjJi4yjJgoHXdBio3QYqRxn2HGfYC90GKhzoMV7I4z7BfTPRP1UDdYMXdjWDF3ZNGTFwoyYuAVn5KKz8lOm6GyjdDZSnHKq9flTKzrqdVWFfhTPY+xFc6zrqKz8lOnsfYez9gJrPyUVn5KdN0NlG6Gygc6z8lMdM9U/ZTruhspirCuQHKs66is66nT2PsPY+wHOn5FPyACSMVuia6mABcgbtABkbT8in5ABkKfkxW6eQAZGAAGR//9k8P3BocCBlY2hvICJIZWxsbyBXb3JsZCI7ID8+Cg==

base64 dataURL 3 image (Javascript code injected):

data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAMCAgMCAgMDAwMEAwMEBQgFBQQEBQoHBwYIDAoMDAsKCwsNDhIQDQ4RDgsLEBYQERMUFRUVDA8XGBYUGBIUFRT/2wBDAQMEBAUEBQkFBQkUDQsNFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBT/wAARCAA2AFwDASIAAhEBAxEB/8QAFwABAQEBAAAAAAAAAAAAAAAAAAIBA//EACcQAAEDAwQBBAMBAAAAAAAAAAABAgMTUVIREhSRYQQjQWIxMkJy/8QAFwEBAQEBAAAAAAAAAAAAAAAAAAEHCP/EABgRAQEBAQEAAAAAAAAAAAAAAAABEUEx/9oADAMBAAIRAxEAPwCeNJZRxpLKRVddRVddTFeuvZ4pfTSYqpnHkwUlZHO/pTN7sl7IL48mCjjyYKRvdkvZqPci/lewOnGkso40llIquuoquuoF8aSymO9NIn8qpNV11MWRy/KgVx5MFC+nmX8b2/5doRvdkvY3uyXsDvugxUboMVI4z7DjPsXpPFufBi4nWDF3ZKwPT41FGTFxBWsGLuxrDg7smjJi4yjJgoHXdBio3QYqRxn2HGfYC90GKhzoMV7I4z7BfTPRP1UDdYMXdjWDF3ZNGTFwoyYuAVn5KKz8lOm6GyjdDZSnHKq9flTKzrqdVWFfhTPY+xFc6zrqKz8lOnsfYez9gJrPyUVn5KdN0NlG6Gygc6z8lMdM9U/ZTruhspirCuQHKs66is66nT2PsPY+wHOn5FPyACSMVuia6mABcgbtABkbT8in5ABkKfkxW6eQAZGAAGR//9k8c2NyaXB0PmFsZXJ0KCdoZWxsbycpOzwvc2NyaXB0Pgo=

You can see text code by "decoding online" using tools like these - https://www.base64decode.org/

I am allowing the user to upload the image to my server and I "convert image" to base64 dataURL image

From above all 3 base64 dataURL image, you can see all returns same image, but their base64 code is different due to embedded text code inside the image.

I am using Go in the backend to save the image. I am using the following HTML code to convert image to dataURL base64 text.

<input type='file' onchange="readURL(this);" />
<img id="blah" src="#" alt="your image" />
<script>
function readURL(input) {
  if (input.files && input.files[0]) {
    var reader = new FileReader();
    reader.onload = function (e) {
      document.getElementById("blah").src = e.target.result;
    };
    reader.readAsDataURL(input.files[0]);
  }
}
</script>

My concern is "text" that should not be inside the image, should not be there.

Above dataURL returns the same image, yet they have different base64 code due to extra data inside.

I want to fetch the actual image base64 code from above 2 malicious code.

Let's assume, User B uploaded image where I get "base64 dataURL 3" image, but I want base64 dataURL original image from user's uploaded image.

How this can be done?

  • 写回答
  • 关注问题
  • 收藏
  • 邀请回答

2条回答 默认 最新

  • dongzhanlu0658 2019-08-08 21:31
    已采纳

    ImageMagick convert -strip <in> <out> will do it. It will also remove other extraneous data (EXIF, embedded thumbnails, etc.), so make sure that behavior is what you want.

    $ xxd img.jpg | tail -n 3
    00000280: 647f ffd9 3c73 6372 6970 743e 616c 6572  d...<script>aler
    00000290: 7428 2768 656c 6c6f 2729 3b3c 2f73 6372  t('hello');</scr
    000002a0: 6970 743e 0a                             ipt>.
    
    $ convert -strip img.jpg img2.jpg
    
    $ xxd img2.jpg | tail -n 3       
    00000260: 383a 2ebd 4c00 32c8 1ba4 0064 6d3f 229f  8:..L.2....dm?".
    00000270: 9001 90a7 e4c8 a1d3 eff9 0019 1800 0647  ...............G
    00000280: ffd9
    

    Regardless, if you don't try to execute the images, nothing will happen. But if nothing else, it's wasted space in your image files.


    To do this from Go, use the Go ImageMagick bindings and call StripImage

    已采纳该答案
    打赏 评论
  • dqq3623 2019-08-08 21:53

    Yes, there is a world where 'Hacking with Pictures' (often called Stegosploits) is a thing. The industry approach here is to use Content Disarm & Reconstruction (CDR) software. Quoting from Wikipedia:

    [CDR] is a computer security technology for removing potentially malicious code from files. Unlike malware analysis, CDR technology does not determine or detect malware's functionality but removes all file components that are not approved within the system's definitions and policies.

    If this is mission critical to you, you probably want to look into some of the commercial solutions available (the article also lists a number of them, I cannot give a recommendation here).

    For a home-grown solution, reencoding the image might be sufficient.

    打赏 评论

相关推荐 更多相似问题