
Unicode Characters In Image Url - 404

I am trying to open an image that has Latin characters in its name (113_Atlético Madrid). I saved it by encoding its name with the PHP function rawurlencode(), so now its new name

Solution 1:

%-encoding is for URLs. Filenames are not URLs. You use the form:

http://example.org/images/113_Atl%C3%A9tico%20Madrid.png

in the URL, and the web server will decode that to a filename something like:

/var/www/example-site/data/images/113_Atlético Madrid.png

You should use rawurlencode() when you're preparing the filename to go in a URL, but you shouldn't use it to prepare the filename for disc storage.
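As a minimal sketch of that split (the helper name imageUrl() and the base URL are illustrative assumptions, not part of the original question), the name stays unencoded on disc and is encoded only at the point where the URL is built:

```php
<?php
// Hypothetical helper: name and base URL are assumptions for illustration.
function imageUrl(string $baseUrl, string $filename): string
{
    // Encode only when the name becomes a URL path segment.
    return rtrim($baseUrl, '/') . '/' . rawurlencode($filename);
}

// Name as stored on disc (this source file is assumed to be saved as UTF-8).
$filename = '113_Atlético Madrid.png';

echo imageUrl('http://example.org/images', $filename), "\n";
// http://example.org/images/113_Atl%C3%A9tico%20Madrid.png
```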

There is an additional problem here: storing non-ASCII filenames on disc is unreliable across platforms. Especially if you run on a Windows server, PHP file APIs like move_uploaded_file() may well use an encoding that you didn't want, and you might end up with a mangled filename like 113_AtlÃ©tico Madrid.png (the UTF-8 bytes of é misread as Windows-1252).

There isn't necessarily an easy fix to this, but you could use any form of encoding, even %-encoding. So if you stuck with your current rawurlencode() for making filenames:

/var/www/example-site/data/images/113_Atl%C3%A9tico%20Madrid.png

that would be OK, but you would then have to apply rawurlencode() twice to generate the matching URL, because the web server decodes one level of %-encoding before looking up the file:

http://example.org/images/113_Atl%25C3%25A9tico%2520Madrid.png
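A short sketch of the two encoding levels, assuming the file really was saved under its %-encoded name (the variable names are illustrative only):

```php
<?php
// Original name (this source file is assumed to be saved as UTF-8).
$original = '113_Atlético Madrid.png';

// First level: the name actually written to disc.
$onDisc = rawurlencode($original);

// Second level: the name that goes into the URL.
$inUrl = rawurlencode($onDisc);

echo $onDisc, "\n"; // 113_Atl%C3%A9tico%20Madrid.png
echo $inUrl, "\n";  // 113_Atl%25C3%25A9tico%2520Madrid.png

// Decoding the URL form once gives back the on-disc name.
var_dump(rawurldecode($inUrl) === $onDisc); // bool(true)
```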

But in any case, it's very risky to include potentially user-supplied arbitrary strings as part of a filename. You may be open to directory traversal attacks, where the name contains a string like /../../ to access the filesystem outside of the target directory. (And these attacks commonly escalate to arbitrary code execution in PHP apps, which are typically deployed with weak permissions.) You would be much better off using an entirely synthetic name, as suggested (+1) by @MatthewBrown.
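One way to build such a synthetic name is sketched below. The helper name syntheticImageName(), the fixed .png extension, and the use of random bytes rather than a timestamp are all assumptions for illustration; the point is only that nothing user-supplied reaches the filesystem path:

```php
<?php
// Hypothetical helper: name and fixed ".png" extension are assumptions.
function syntheticImageName(int $id): string
{
    // 16 random bytes become 32 hex characters, so the name can only
    // contain digits, a-f, '_' and '.'; no '/' or '..' is possible.
    return $id . '_' . bin2hex(random_bytes(16)) . '.png';
}

$name = syntheticImageName(113);
echo $name, "\n"; // e.g. 113_<32 hex characters>.png
```

If the original name is worth keeping for display, it could be stored in the database next to the synthetic one instead of on disc.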

(Note this still isn't the end of security problems with allowing user file uploads, which it turns out is a very difficult feature to get right. There are still issues with content-sniffing and plugins that can allow image files to be re-interpreted as other types of file, resulting in cross-site scripting issues. To prevent all possibility of this it is best to only serve user-supplied files from a separate hostname, so that XSS against that host doesn't get you XSS against the main site.)

Solution 2:

If you do not need to preserve the name of the file (and often there are good reasons not to) then it might be best to simply rename the file entirely. The current timestamp is a reasonable choice.

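if (isset($_FILES['Team'])) {
    $avatar = $_FILES['Team'];
    $date = new DateTime();
    // Synthetic name: $id plus a timestamp. The 'P' (UTC offset) specifier is
    // left out because it adds ':', which is not allowed in Windows filenames.
    $model->avatar = "{$id}_" . $date->format('Y-m-d-H-i-s') . ".png";
    // Check and write the same target path.
    $target = getcwd() . "/images/avatars/teams/{$model->avatar}";
    if (!is_file($target)) {
        // A form field named "Team[avatar]" puts the temp path here.
        move_uploaded_file($avatar['tmp_name']['avatar'], $target);
    }
}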

After all, what the file was called before it was uploaded shouldn't matter much and, more importantly, if two users each upload a picture called "me.png" there is far less chance of a conflict.

If you are married to the idea of encoding the file name then I can only point you to other answers:
