What follows is not a complete working download script, but rather a set of issues you should be aware of, which will help you write better code.
1. Never accept paths as input
It’s very tempting to write something like
readfile($_GET['file']);
which hands any file the web server can read to whoever asks for it.
You might think you’re being extra clever by doing something like
$mypath = '/mysecretpath/' . $_GET['file'];
but a value such as ../../etc/passwd walks straight back out of that folder.
What you must do, always, is sanitize the input. Accept only file names, like this:
$path_parts = pathinfo($_GET['file']);
$file_name = $path_parts['basename'];
$file_path = '/mysecretpath/' . $file_name;
Even better would be to accept only numeric IDs and get the file path and name from a database (or even a text file or key=>value array if it’s something that doesn’t change often). Anything is better than blindly accepting requests.
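As a minimal sketch of the ID approach, assuming a small hard-coded map is enough: the function name resolveDownload, the constant DOWNLOAD_DIR and the two file names are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical download folder; adjust to your own layout.
const DOWNLOAD_DIR = '/mysecretpath/';

/**
 * Resolve a requested ID to a full path, or null if the ID is not known.
 * $files is a whitelist: numeric ID => file name inside DOWNLOAD_DIR.
 */
function resolveDownload($id, array $files): ?string
{
    // Accept strings made of digits only; arrays, empty strings and paths are rejected.
    if (!is_string($id) || !ctype_digit($id)) {
        return null;
    }
    $key = (int) $id;
    if (!isset($files[$key])) {
        return null;
    }
    // basename() is a second line of defence in case the map is edited carelessly.
    return DOWNLOAD_DIR . basename($files[$key]);
}

// Hypothetical whitelist; in practice this could come from a database.
$files = [1 => 'manual.pdf', 2 => 'setup.zip'];
```

A request handler would then call resolveDownload($_GET['id'] ?? '', $files) and answer with a 404 when it gets null back.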
If you need to restrict access to a file, you should generate encrypted, one-time IDs, so you can be sure a generated path can be used only once.
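One possible way to get single-use IDs is sketched here. Note that it swaps encryption for random, unguessable tokens, which is a different technique from the one named above. The functions issueToken and redeemToken are hypothetical, and the $store array stands in for whatever persistent storage you use (a session or a database table).

```php
<?php
// Issue a random token that maps to a file ID.
// $store stands in for persistent storage such as a session or a table.
function issueToken(array &$store, int $fileId): string
{
    $token = bin2hex(random_bytes(16)); // 32 hex characters
    $store[$token] = $fileId;
    return $token;
}

// Redeem a token: returns the file ID the first time, null on any later attempt.
function redeemToken(array &$store, string $token): ?int
{
    if (!isset($store[$token])) {
        return null;
    }
    $fileId = $store[$token];
    unset($store[$token]); // single use: forget the token immediately
    return $fileId;
}
```

The download link carries the token instead of the file ID, so a copied link stops working after the first download.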
2. Use headers correctly
This is a very widespread problem and unfortunately even the PHP manual is plagued with errors. Developers usually say “this works for me” and they copy stuff they don’t fully understand.
First of all, I notice the use of headers like Content-Description and Content-Transfer-Encoding. There is no such thing in HTTP. Don’t believe me? Have a look at RFC2616, which specifically states “HTTP, unlike MIME, does not use Content-Transfer-Encoding, and does use Transfer-Encoding and Content-Encoding”.
You may add those headers if you want, but they do absolutely nothing.
Sadly, this wrong example is present even in the PHP manual.
Second, regarding the MIME-type, I often see things like Content-Type: application/force-download. There’s no such thing, and Content-Type: application/octet-stream (RFC1521) would work just as well (or maybe application/x-msdownload if it’s an exe/dll). If you’re thinking about Internet Explorer, it’s even better to specify it clearly rather than force it to “sniff” the content. See MIME Type Detection in Internet Explorer for details.
Even worse, I see these kinds of statements:
header("Content-Type: application/force-download");
header("Content-Type: application/octet-stream");
header("Content-Type: application/download");
By default, header() replaces any earlier header of the same name, so unless you call header("Content-Type: some-value", FALSE), the new Content-Type header will replace the old one.
3. Forcing download and Internet Explorer bugs
What would it be like to not have to worry about old versions of Internet Explorer? A better world, that’s for sure.
To force a file to download, the correct way is:
header("Content-Disposition: attachment; filename=\"$file_name\"");
The code above will fail in IE6 unless the following are added:
header("Pragma: public");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
According to Microsoft, this use of Cache-Control is wrong, especially with both values set to zero, but it works in IE6, and IE7 and later ignore it, so no harm done.
If you still get strange results when downloading (especially in IE), make sure that PHP output compression is disabled, as well as any server compression (sometimes the server inadvertently applies compression to the output produced by the PHP script).
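The headers discussed in sections 2 and 3 can be gathered in one place. This is only a sketch: downloadHeaders is a hypothetical helper, and stripping quotes, backslashes and line breaks from the file name is an extra precaution that the original text does not mention.

```php
<?php
/**
 * Build the header lines for a forced download.
 * Returns strings so the caller decides when to pass them to header().
 */
function downloadHeaders(string $fileName): array
{
    // Remove characters that would break the quoted filename or inject a new header line.
    $safeName = str_replace(['"', '\\', "\r", "\n"], '', basename($fileName));

    return [
        // A single, real MIME type instead of application/force-download.
        'Content-Type: application/octet-stream',
        "Content-Disposition: attachment; filename=\"$safeName\"",
        // IE6 workarounds described in the text.
        'Pragma: public',
        'Cache-Control: must-revalidate, post-check=0, pre-check=0',
    ];
}
```

Each returned string can be passed to header() in a loop, before any output is sent.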
4. Handling large file sizes
readfile() is a simple way to output files. Historically it had some performance issues, and while the documentation claims there are no memory problems, real-life scenarios beg to differ, because of output buffering and other subtle things. Regardless, if you need byte range support, you still have to output the file the old-fashioned way.
The simplest way to handle this is to output the file in “chunks”:
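set_time_limit(0); // large files may need more than the default time limit
$file = @fopen($file_path, "rb");
if ($file !== false) {
    while (!feof($file)) {
        print(fread($file, 1024 * 8)); // send 8 KB at a time
        if (ob_get_level() > 0) {
            ob_flush(); // only valid while an output buffer is active
        }
        flush();
    }
    fclose($file);
}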
5. Disable Gzip / output compression / output buffering
This is the source of many seemingly obscure errors. If you have output buffering, the file will not be sent to the user in chunks but only at the end of the script. Secondly, you’re most likely outputting a binary file that does not need compression anyway. Thirdly, some older browser+server combinations might become confused because you’re requesting a text file (PHP) but receiving compressed data with a different content type.
To avoid this, assuming you’re using Apache, create a .htaccess file in the folder containing your download script with this directive:
SetEnv no-gzip dont-vary
This will disable compression in that folder.
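If editing .htaccess is not an option, roughly the same effect can be attempted from inside the script. The helper name disableOutputProcessing is made up for this sketch, and whether each call has any effect depends on your server setup (apache_setenv() exists only when PHP runs as an Apache module).

```php
<?php
// Try to switch off compression and buffering for the current request.
function disableOutputProcessing(): void
{
    // Set Apache's no-gzip variable, as the .htaccess directive does; only under the Apache module.
    if (function_exists('apache_setenv')) {
        @apache_setenv('no-gzip', '1');
    }

    // Turn off PHP's own zlib output compression.
    @ini_set('zlib.output_compression', 'Off');

    // Discard every active output buffer so later output is not held back.
    while (ob_get_level() > 0) {
        if (!@ob_end_clean()) {
            break; // a buffer that cannot be removed: stop instead of looping forever
        }
    }
}
```

Call it before sending any headers or file data.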
6. Resumable downloads
For large files, it’s useful to allow downloads to be resumed. Doing so is more involved, but it’s really worth doing, especially if you serve large files or video/audio.
I’m not going to write a complete example, but I will point you in the right direction.
First, you need to signal the browser that you support ranges:
header("Accept-Ranges: bytes");
At the start of your script, after checking the file (if it exists, etc.), you have to check if a range is requested:
$range = isset($_SERVER['HTTP_RANGE']) ? $_SERVER['HTTP_RANGE'] : null;
The value looks like ‘bytes=0-99’ for the first 100 bytes, ‘bytes=-99’ for the last 99 bytes, ‘bytes=100-’ to skip the first 100 bytes, or ‘bytes=1720-8392’ for something in the middle. Be aware that multiple ranges can be specified (e.g. ‘100-200,400-’) but processing and especially delivering those ranges is more complicated, so no one bothers.
So, now that you have the range, you have to make sure that it’s expressed in bytes, that it does not contain multiple ranges and that the range itself is valid: the end is not smaller than the start, and the start is not negative and falls inside the file. An end past the last byte is not an error; per the HTTP specification it is treated as the last byte of the file. Note that ‘bytes=-’ is not a valid request. If the range is not valid, you must output
header('HTTP/1.1 416 Requested Range Not Satisfiable');
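The checks listed above can be sketched as one function. parseRange is a hypothetical name; the sketch handles a single range only and answers null whenever the article would send a 416. It clamps an end that lies past the last byte instead of rejecting it, which is how RFC 2616 defines byte ranges.

```php
<?php
/**
 * Parse a single-range Range header value against a file size.
 * Returns [start, end] (both inclusive) or null if the request should get a 416.
 */
function parseRange(string $header, int $fileSize): ?array
{
    // Bytes only, one range only: "bytes=START-END", "bytes=START-" or "bytes=-SUFFIX".
    if (!preg_match('/^bytes=(\d*)-(\d*)$/', trim($header), $m)) {
        return null;
    }
    [, $first, $last] = $m;

    if ($first === '' && $last === '') {
        return null; // "bytes=-" is not a valid request
    }

    if ($first === '') {
        // Suffix form: the last N bytes of the file.
        $suffix = (int) $last;
        if ($suffix === 0) {
            return null;
        }
        $start = max(0, $fileSize - $suffix);
        $end = $fileSize - 1;
    } else {
        $start = (int) $first;
        // A missing or oversized end is clamped to the last byte.
        $end = ($last === '') ? $fileSize - 1 : min((int) $last, $fileSize - 1);
    }

    // The start must lie inside the file and must not come after the end.
    if ($start >= $fileSize || $start > $end) {
        return null;
    }
    return [$start, $end];
}
```

The returned pair supplies the $start and $end values needed for the Content-Range header.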
Then, you must send a bunch of headers:
header('HTTP/1.1 206 Partial Content');
header('Accept-Ranges: bytes');
header("Content-Range: bytes $start-$end/$filesize");
$content_length = $end - $start + 1;
header("Content-Length: $content_length");
Some people leave out Accept-Ranges. Others forget that, given a file size of 1000 bytes, a full range would be 0-999, so the header would read Content-Range: bytes 0-999/1000. Yet others forget that when you send a range, the Content-Length must match the length of the range rather than the size of the whole file.
You can output the file using the method described above, skipping until the start of the range and delivering the length of the range.
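A rough sketch of that last step, combining the chunked loop from section 4 with a seek to the start of the range. The function name outputRange and its return convention are assumptions made for this example.

```php
<?php
/**
 * Echo bytes $start..$end (inclusive) of a file in chunks.
 * Returns the number of bytes written, or -1 if the file cannot be opened.
 */
function outputRange(string $path, int $start, int $end, int $chunkSize = 8192): int
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return -1;
    }

    fseek($handle, $start);          // skip to the start of the range
    $remaining = $end - $start + 1;  // same value as the Content-Length header
    $sent = 0;

    while ($remaining > 0 && !feof($handle)) {
        // Never read past the end of the requested range.
        $data = fread($handle, min($chunkSize, $remaining));
        if ($data === false || $data === '') {
            break;
        }
        echo $data;
        flush();
        $remaining -= strlen($data);
        $sent += strlen($data);
    }

    fclose($handle);
    return $sent;
}
```

For a full download, the same function can be called with 0 and the file size minus one.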
Closing thoughts
I did my best to provide only accurate information. It would be truly sad for me if an article about avoiding common PHP errors contained errors itself.
Regardless, my point stands: PHP makes it easy to hack together code that appears to be working, but developers should read and adhere to the official specifications.
UPDATE: I released a free script that adheres to the above guidelines.
Source: http://www.richnetapps.com/the-right-way-to-handle-file-downloads-in-php/