Gads

Spoof User Agent in http calls


Recently I was trying to download numerous files from a certain website using a shell script I wrote. With in the script, first I used wget to retrieve the files, but I kept on getting the following error message –

1
2
HTTP request sent, awaiting response... 403 Forbidden
2012-12-30 06:17:45 ERROR 403: Forbidden.

Then hoping that this was just a wget problem, I replaced wget with curl. It turned out that Curl would actually create a file with the same name as the one being download, but to my surprise the file was not downloaded. Instead, it contained an html file with 403 Forbidden message.

1
2
3
403 Forbidden
Forbidden
You don't have permission to access /dir/names.txt on this server.

What was surprising is that I could download the files using Firefox, Internet Explorer, elinks and even text based browser ‘lynx’. It seems that the website was blocking access from client browsers with certain ‘User-Agent’ header field. So the trick was to simply modify the User-Agent to a ‘legitimate’ one. Both curl and wget support the altering of User-Agent header field. You can use below commands to change the User-Agent parameter –

1
2
3
4
5
USER_AGENT="User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"
 
wget --user-agent="$USER_AGENT" -c http://linuxfreelancer.com/status.html
 
curl -A "$USER_AGENT" -O http://linuxfreelancer.com/status.html

In addition to wget or curl, a much easier to use CLI HTTP client httpie can be used. Passing custom HTTP headers is intuitive using httpie, installation and usage details can be accessed here. Modifying the User-Agent header using httpie is shown below –

1
2
3
USER_AGENT="User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"
 
http http://linuxfreelancer.com/ "$USER_AGENT"

All commands in one –

1
2
3
4
5
6
USER_AGENT="User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"
wget --user-agent="$USER_AGENT" -c http://linuxfreelancer.com/status.html
 
curl -A "$USER_AGENT" -O http://linuxfreelancer.com/status.html
 
http http://linuxfreelancer.com/ "$USER_AGENT"

References

Curl – https://cheat.sh/curl

Wget – https://www.gnu.org/software/wget/manual/wget.html

Httpie – https://httpie.org/

Leave a Reply

Your email address will not be published. Required fields are marked *