At the heart of the problem is the variety of ways in which a file’s type can be determined. A jpg filename extension, for example, indicates an image in JPEG format. The web server may also define Content-Type (image/jpeg in this case) in the HTTP header, but as a rule it determines the type of file being uploaded from its file name extension. Finally, most web browsers also check the first few bytes of a file (its “signature”) for known byte sequences, such as PNG, PK, JPEG JFIF, and so on.
Internet Explorer 4 introduced a fourth method, known as MIME sniffing or MIME type detection. Since then, no version of IE has automatically assumed that a file taken from the web has the content type stated by the server in the HTTP header. Nor does it trust the file name extension or the signature on their own. Instead, Internet Explorer also examines the first 256 bytes of the file to determine its type. The snag is that it does this only if the user calls up the URL directly to download the file. No problems arise when locally stored files, or images that the browser links to via image tags (IMG) in HTML pages, are opened with Internet Explorer.
MIME sniffing was originally meant to guard against incorrect indications of content type by servers. These could be exploited by attackers to circumvent protective functions in Internet Explorer that were meant to prevent the browser automatically executing downloaded files, such as hta files. MIME sniffing also makes the browser tolerant of accidental errors in Content-Type statements. If, for example, the server announces text/plain, but then supplies an HTML file, Internet Explorer will handle it as HTML.
With the common GIF, JPEG and PNG formats, the browser ignores the result of MIME sniffing as long as the filename extension, Content-Type and signature all indicate the same type. Only if the results are inconsistent will Internet Explorer handle the file as the type identified by MIME sniffing.
security_logo_en.png, Content-Type: image/png, Signature: PNG
In the second example, we have changed the file extension to JPG. The server has noted this and changed the content-type to image/jpeg. But the signature check on the file says the file is PNG. Because the content-type (image/jpeg) clashes with the signature (PNG), the browser takes a closer look and renders the file as HTML.
security_logo_en.jpg, Content-Type: image/jpeg, Signature: PNG
In the third example, the file extension is BMP, the content type is image/bmp and the signature is BMP. Everything looks correct, but the file is still interpreted as text/html. The reason is that the server states the content type as image/bmp when it really should be image/x-ms-bmp. This misstatement of content type is not uncommon; we did not specifically configure our server to send the wrong content type.
security_logo_en.bmp, Content-Type: image/bmp, Signature: BMP
Help at hand
Microsoft has identified the problem and plans to deal with it in the forthcoming version of Internet Explorer. IE 8 no longer sniffs images and therefore ignores embedded HTML. It also understands the proprietary Content-Type extension authoritative=true|false (e.g. content-type=text/html; authoritative=true;), which enables MIME sniffing to be switched off for individual downloads. Internet Explorer then handles the file as indicated by the server.
For critical cases, the new “X-Download-Options: noopen” header ensures that files are displayed strictly outside the site context. That means even HTML files can be delivered securely, because the browser will only offer to save the file. It will, unfortunately, take some time before Internet Explorer 8 has replaced its predecessor, to the extent that web site operators can rely on such measures.
Crafted files can actually be fended off quite simply right now. Ever since Windows XP SP2, users have been able to disable MIME sniffing in Internet Explorer by going to Internet Options, Security, Internet, Adjust, and disabling the option “Open files based on their content and not the filename extension”. However, that could reopen some old holes! Whether it improves security can only be demonstrated by practical tests. The tip shouldn’t really have to be spread among users in any case: it would be better if web service operators took security precautions to protect their visitors and ensured that their systems don’t deliver crafted images.
Administrators can use scripts to check the type consistency of any files uploaded to their servers. If an image has a .jpg file name extension, for instance, and the signature at the start of the file says the same (confirmed using the command file image.jpg under Linux or getimagesize under PHP), all is in order and the server can deliver it. Even if it does contain HTML code, Internet Explorer will not execute it. It should be noted here, however, that only images can be secured in this way, and that the Content-Type stated by the server absolutely must be correct. The trick doesn’t work with other formats.
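The consistency check described above can be sketched in a few lines of Python. This is only an illustration, not a replacement for the file command or getimagesize: the function names, the extension table and the short list of signatures are assumptions made for this example, and it covers only GIF, JPEG and PNG.

```python
import os

# Leading "magic" bytes of the three common image formats.
# (Illustrative list only; real tools such as `file` know many more.)
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": (b"GIF87a", b"GIF89a"),
    "jpeg": b"\xff\xd8\xff",
}

# File name extensions mapped to the image type they claim.
EXTENSION_TO_TYPE = {
    ".png": "png",
    ".gif": "gif",
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
}


def sniff_signature(head: bytes):
    """Return the image type indicated by the leading bytes, or None."""
    for kind, prefix in SIGNATURES.items():
        if head.startswith(prefix):
            return kind
    return None


def is_type_consistent(filename: str, head: bytes) -> bool:
    """True only if the extension and the signature name the same image type."""
    extension = os.path.splitext(filename)[1].lower()
    expected = EXTENSION_TO_TYPE.get(extension)
    return expected is not None and sniff_signature(head) == expected
```

An upload script would read the first few bytes of the stored file, call is_type_consistent with the original file name, and reject the upload if the result is False. The server must still send the matching Content-Type for the check to be of any use.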
For absolute certainty, however, the first 256 bytes of the file can be checked for HTML code. Patterns that lead IE to identify HTML code are the usual tags like <body>, <head>, <html>, <img>, <script> and so on. If none of these patterns occurs within the first 256 bytes of the file, Microsoft’s browser won’t be able to interpret anything.
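A rough sketch of such a scan in Python follows. The tag list is limited to the five tags named above and is certainly incomplete; the constant and function names are hypothetical, and the pattern is an approximation rather than a reproduction of Internet Explorer’s actual detection logic.

```python
import re

# Number of leading bytes Internet Explorer examines, per the text above.
SNIFF_WINDOW = 256

# Only the tags named in the article; the real list is longer ("and so on").
HTML_TAG_PATTERN = re.compile(
    rb"<\s*(html|head|body|img|script)\b", re.IGNORECASE
)


def looks_like_html(data: bytes) -> bool:
    """True if one of the listed tags starts within the first 256 bytes."""
    return HTML_TAG_PATTERN.search(data[:SNIFF_WINDOW]) is not None
```

A stricter variant could reject any file whose first 256 bytes contain a “<” followed by a letter, at the cost of occasionally refusing a harmless image.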
An administrator can also configure the server so that, when files are being downloaded (as opposed to pages being opened), it always delivers the header “Content-Disposition: attachment; filename="<filename.ext>"”. This prevents the browser from opening the files in the context of the Internet site. Instead, it opens the file with a locally linked application, though this may well irritate users. Unfortunately, such header rewrites only work if the user can be prevented from having direct access to files. For that reason, the storage locations of uploaded files should not be publicly readable, and the use of random file names is advisable.
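As a minimal sketch, a download script might build its response headers and storage names along the following lines. Both helper functions are hypothetical and not tied to any particular web server or framework; the character replacement is a simple precaution assumed for this example, not a complete header-encoding scheme.

```python
import os
import secrets


def attachment_headers(filename: str,
                       content_type: str = "application/octet-stream") -> dict:
    """Build response headers that make the browser offer the file for download."""
    # Strip characters that could break out of the quoted filename parameter
    # or inject an extra header line.
    safe_name = (filename.replace("\\", "_").replace('"', "_")
                 .replace("\r", "").replace("\n", ""))
    return {
        "Content-Type": content_type,
        "Content-Disposition": f'attachment; filename="{safe_name}"',
    }


def random_storage_name(original_filename: str) -> str:
    """Return an unguessable name for storing an upload, keeping the extension."""
    extension = os.path.splitext(original_filename)[1].lower()
    return secrets.token_hex(16) + extension
```

The upload would be stored under the name returned by random_storage_name in a directory the web server does not expose directly, and delivered only through a script that sends the headers from attachment_headers.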
The most efficient method is to convert the format of image files using ImageMagick or a comparable tool. That eliminates any fragments of code from images so they no longer present any danger to users. Big sites like Facebook and Twitter convert the portrait photographs uploaded by their users, but be careful, as this might open another attack vector. For example, if somebody discovered a buffer overflow problem in ImageMagick, attackers could try and exploit this with specially crafted pictures.
It’s as though a once-faithful guard dog has suddenly spun around with a snarl and become a threat to Internet Explorer users. Countermeasures do exist, but whether they will become firmly established in the medium term is an open question. Cross-site scripting via manipulated images doesn’t seem to be widespread at the moment, but things can change very rapidly: interactive web sites are becoming preferred targets for criminals. Changing to an alternative browser, such as Firefox, could provide a remedy. Firefox carries out MIME sniffing too, but it doesn’t suddenly render an image as HTML.