Masquerading Your Browser
By Eric Giguere
September 19, 2003
Updated October 28, 2004
See also this related tool: Eric's HTTP Header Viewer
And this related article: How to Detect Internet Explorer
When your browser requests a page from a Web server, the browser sends information about itself along with the request. Typically, this information includes the browser type (Internet Explorer, Opera, Mozilla, etc.), the browser version, and the underlying platform (Windows XP, Linux, Mac OS X, etc.). The server then uses this information to select an appropriate page format for the browser, since different browsers (and even different versions of the same browser) differ in their support for HTML and JavaScript.
Sometimes the server misinterprets or fails to recognize this information and sends you an incorrectly formatted page. In rare cases, the server may even deny you access to its pages, whether it's for political reasons (you're using a browser that the site disapproves of) or because its pages have only been tested for use with specific browser versions. The solution is to fool the server by having your browser
masquerade as another browser.
The User-Agent Header
When a browser requests a page, it also sends a set of values to the server. These value strings are called headers. If you use Internet Explorer to fetch the URL
http://www.amazon.com/index.html, this is what the
browser might send to Amazon's server:
GET /index.html HTTP/1.1
Host: www.amazon.com
Accept: */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
The last four lines of this request are headers. Each header consists of a name and a value, separated by a colon. Most headers are optional, and the standard headers are defined by the HTTP protocol specification.
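As a rough illustration of the name/value structure, the sample request above can be split apart with a few lines of JavaScript. This is only a sketch: parseRequest is a hypothetical helper name, and it ignores details such as headers that are continued across lines.

```javascript
// Split a raw HTTP request into its request line and a map of headers.
// parseRequest is a hypothetical helper, not part of any library.
function parseRequest(rawRequest) {
  // Headers end at the first blank line; anything after it is the body.
  const head = rawRequest.split(/\r?\n\r?\n/)[0];
  const [requestLine, ...headerLines] = head.split(/\r?\n/);
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // not a "name: value" line, so skip it
    // Header names are case-insensitive, so store them in lower case.
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { requestLine, headers };
}

// The request shown above, joined with the line endings HTTP uses.
const sampleRequest = [
  "GET /index.html HTTP/1.1",
  "Host: www.amazon.com",
  "Accept: */*",
  "Accept-Language: en-us",
  "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)",
].join("\r\n");

const parsedRequest = parseRequest(sampleRequest);
console.log(parsedRequest.headers["user-agent"]);
```

Only the first colon on each line is treated as the separator, which matters because header values (URLs, for instance) may contain colons of their own.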
A browser identifies itself using the User-Agent header. (User agent is a generic term for an application, like a browser, that is acting as an agent for the user.) The header value consists of a series of product identifiers and/or comments. A product identifier is a string like "Mozilla/4.0" or "Opera/7.02" that identifies the product by name and (optionally) by version. Additional attributes about the product, referred to as comments, are enclosed in parentheses,
such as "(compatible; MSIE 6.0; Windows
NT 5.0)". The product identifiers and comments can come in any order, but in general the most significant values are listed first.
Note: Some browsers do not exactly follow the user agent format defined by the HTTP specification. The Opera browser, for example, appends language information to the string.
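The product/comment structure described above can be sketched as a small JavaScript tokenizer. The function name parseUserAgent and the "other" token type are assumptions made for this illustration; the sketch does not handle nested parentheses, and anything that is neither a product nor a comment (such as Opera's trailing language tag) is simply labeled "other".

```javascript
// Break a User-Agent value into product identifiers and comments.
// parseUserAgent is a hypothetical helper written for this example.
function parseUserAgent(userAgent) {
  const tokens = [];
  // Match either a parenthesized comment or a run of characters that
  // contains no whitespace and no parentheses.
  const pattern = /\(([^)]*)\)|([^\s()]+)/g;
  // A simplified "name" or "name/version" shape for product identifiers.
  const productShape = /^[\w.!-]+(\/[\w.!-]+)?$/;
  let match;
  while ((match = pattern.exec(userAgent)) !== null) {
    if (match[1] !== undefined) {
      // Comment: attributes are separated by semicolons.
      const attributes = match[1]
        .split(";")
        .map((part) => part.trim())
        .filter((part) => part.length > 0);
      tokens.push({ type: "comment", attributes });
    } else if (productShape.test(match[2])) {
      const [name, version = null] = match[2].split("/");
      tokens.push({ type: "product", name, version });
    } else {
      // Non-standard extras, such as the "[en]" that Opera appends.
      tokens.push({ type: "other", text: match[2] });
    }
  }
  return tokens;
}

console.log(parseUserAgent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"));
console.log(parseUserAgent("Opera/7.x (Windows NT 5.1; U) [en]"));
```

For the Internet Explorer string, this yields one product token (Mozilla, version 4.0) followed by one comment token with three attributes.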
You can view your browser's user agent string by using JavaScript to read the navigator.userAgent property.
One way to do this is by entering the following URL
(sometimes referred to as a bookmarklet) in your browser's address bar:
javascript:document.writeln(navigator.userAgent)
Alternatively, you can create an HTML file that does the
same thing and load it in your browser:
<html>
<body>
Your browser sends this User-Agent header:
<pre>
<script type="text/javascript">
<!--
document.writeln( navigator.userAgent );
// -->
</script>
</pre>
</body>
</html>
If you've disabled your browser's JavaScript support, neither technique will work. If you would rather not enable JavaScript, you can instead run a local Web server and configure it to log the user agent headers.
For example, logging user agent information with the
Apache 2.0 Web server is simply a matter of uncommenting the line:
CustomLog logs/agent.log agent
in the server's httpd.conf file. When you restart the server, user agent details will be appended to the
agent.log
file in the logs directory.
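If you don't have Apache installed, the same idea can be sketched with a few lines of JavaScript run under Node.js. This is an illustrative stand-in, not a replacement for a real server: the function names, the "-" placeholder for a missing header, and port 8080 are all assumptions made for the example.

```javascript
const http = require("http");

// Build one log line from a request's headers. Node.js presents header
// names in lower case. The "-" placeholder is an assumption for this sketch.
function agentLogLine(headers) {
  return headers["user-agent"] || "-";
}

// Create (but do not start) a server that passes each User-Agent value
// to the supplied log function and echoes it back as plain text.
function createAgentLogger(log) {
  return http.createServer((req, res) => {
    const line = agentLogLine(req.headers);
    log(line);
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("Your browser sent this User-Agent header:\n" + line + "\n");
  });
}

// To try it, uncomment the next line and run this file with Node.js,
// then visit http://localhost:8080/ in the browser you want to inspect.
// createAgentLogger(console.log).listen(8080);
```

Because the server only reads the request headers, it works even when JavaScript is disabled in the browser being tested.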
Check this out: If you're reading this live on EricGiguere.com, you
can easily check your headers using my simple
HTTP header viewer page.
Common User-Agent Values
The User-Agent header is also used in HTTP requests made by Web crawlers, scanning tools, and other kinds of Web clients, not just browsers. There is no central registry for tracking user agent values, but lists of common settings have been compiled in various places.
Here are some sample user agent strings:
- Googlebot/2.X (http://www.googlebot.com/bot.html)
- The Google Web crawler.
- Mozilla/3.0 (Win95; I)
- Netscape Navigator 3.0 on Windows 95.
- Mozilla/3.01 (Macintosh; PPC)
- Netscape Navigator 3.01 on a Macintosh.
- Mozilla/4.0 (compatible; MSIE 4.01; AOL 4.0; Windows 98)
- The AOL browser, based on Microsoft Internet Explorer 4.01, on Windows 98.
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
- Microsoft Internet Explorer 6.0 on Windows 2000.
- Mozilla/5.0 (compatible; Konqueror/2.2.2; Linux 2.4.14-xfs; X11; i686)
- Konqueror 2.2.2 for Linux.
- Mozilla/5.0 (Windows; U; Win98; en-US; rv:0.9.2) Gecko/20010726 Netscape6/6.1
- Netscape 6.1 on Windows 98.
- Opera/6.x (Windows NT 4.0; U) [de]
- The German version of Opera 6.x on Windows NT.
- Opera/7.x (Windows NT 5.1; U) [en]
- The English version of Opera 7.x on Windows XP.
Note that most browsers today identify themselves using the "Mozilla" product name (an early name for the Navigator browser), even if they are not derived from the original Netscape Navigator browser. Notice in the listing above that even Microsoft's Internet Explorer identifies itself using the "Mozilla" product name to indicate that it is roughly compatible with a specific version of Navigator or its successors. Today, "Mozilla" also refers to a newer Web browser available from
the Mozilla Foundation at www.mozilla.org.
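Because so many browsers share the "Mozilla" product name, a server that wants to tell them apart has to look at the rest of the string. The following JavaScript sketch shows one way this might be done for the sample strings listed above. The describeBrowser function and its rule table are hypothetical, cover only those samples, and are not how any particular server works.

```javascript
// Guess the browser behind a User-Agent value, using only the sample
// strings from this article. describeBrowser is a hypothetical helper.
function describeBrowser(userAgent) {
  // Order matters: the most specific markers are checked first, because
  // many browsers also claim the generic "Mozilla" product name.
  const rules = [
    [/Googlebot\/(\S+)/, "Googlebot"],
    [/Opera\/(\S+)/, "Opera"],
    [/Konqueror\/([^;\s)]+)/, "Konqueror"],
    [/Netscape6\/(\S+)/, "Netscape"],
    [/MSIE ([^;\s)]+)/, "Internet Explorer"],
    [/^Mozilla\/(\S+)/, "Netscape Navigator"], // fallback for plain Mozilla strings
  ];
  for (const [pattern, name] of rules) {
    const match = pattern.exec(userAgent);
    if (match) return name + " " + match[1];
  }
  return "unknown";
}

console.log(describeBrowser("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"));
console.log(describeBrowser("Mozilla/3.0 (Win95; I)"));
```

A table like this is exactly what masquerading exploits: a browser that sends Internet Explorer's string is indistinguishable from Internet Explorer as far as these rules are concerned.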
A more extensive set of common user agent values and the browsers they refer to can be found at
www.pgts.com.au/pgtsj/pgtsj0208c.html.
A different list that includes the values used by various Web crawlers is
found at
www.psychedelix.com/agents.html.
Changing your User-Agent Value
Here are specific instructions on masquerading your browser by modifying the
User-Agent value it sends:
- Opera
- By default, Opera identifies itself as a version of
Microsoft Internet Explorer. Opera 6 pretends it's IE 5.0, Opera 7 pretends it's IE 5.5 and IE 6.0. To change the settings in Opera 7, go to the Network panel of the Preferences dialog and adjust the value selected in the Browser Identification drop-down list. Note that the values in this list are currently fixed, with no way to add
additional user agent strings. Full details on the list values can be found in Opera's knowledge base at
www.opera.com/support/search/supsearch.dml?index=570.
- Internet Explorer
- Most sites support Internet Explorer, so there's usually little reason to
change its user agent value. If you are experienced with the Windows registry, however, you can easily add new values to the user agent value's comment section by using the registry editor (regedit.exe)
to add new subkeys to the key:
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Internet Settings\User Agent\Post Platform
After restarting the browser, the names of these
subkeys are automatically inserted into the browser's user agent string.
- Mozilla 1.x/Netscape 7.x
- The user agent value can be specified in the defaults\pref\all.js file by
adding the "general.useragent.override" property, as in this example:
pref( "general.useragent.override", "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)" );
- Firefox
- Users of Firefox (an alternative browser from the Mozilla Foundation) can install extensions for managing the user agent value, such as the User Agent Switcher at
http://chrispederick.myacen.com/work/firefox/useragentswitcher/
or the UserAgent bar at
http://uabar.mozdev.org.
Alternatively, type the special URL "about:config" in the browser's location bar to access the browser's properties and use the popup menu to add a
new property called "general.useragent.override" with the desired value.
Another way to masquerade your browser is to access the Internet through a proxy server and have that server modify your outgoing
User-Agent headers. Squid, for example, supports the fake_user_agent configuration option. You can also use a proxy server that supports anonymous browsing.
A problem with the proxy server approach, however, is that it only works for pages fetched via HTTP. Page requests made securely via HTTPS cannot be
modified by the proxy server because all traffic between the browser and
the destination Web server is encrypted.
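The header rewriting that such a proxy performs can be sketched in JavaScript as follows. This is a simplified model of the idea, not Squid's implementation: masqueradeHeaders and proxyCanRewrite are hypothetical names, and a real proxy would also have to handle the network connections themselves.

```javascript
// Return a copy of the outgoing headers with the User-Agent replaced.
// Header names are compared case-insensitively, as HTTP requires.
function masqueradeHeaders(headers, fakeUserAgent) {
  const rewritten = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== "user-agent") rewritten[name] = value;
  }
  rewritten["User-Agent"] = fakeUserAgent;
  return rewritten;
}

// A proxy can only rewrite requests it can read, which in this model
// means plain HTTP; HTTPS traffic passes through encrypted.
function proxyCanRewrite(url) {
  return new URL(url).protocol === "http:";
}

const outgoing = { Host: "www.amazon.com", "User-Agent": "Opera/7.02" };
const fakeAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)";
if (proxyCanRewrite("http://www.amazon.com/index.html")) {
  console.log(masqueradeHeaders(outgoing, fakeAgent));
}
```

The original headers object is left untouched, so the same function can be applied to every request that passes through without side effects.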
The Pitfalls of Browser Masquerading
Fortunately, there are few pitfalls to browser masquerading. If the masquerading doesn't work, you can try a different user agent string or restore your browser's original setting. The real pitfall is that browser masquerading doesn't fix the underlying problem: badly configured, inflexible Web servers. Masquerading your browser may be necessary for you to use the site, but it
doesn't give the site owners any incentive to fix the site.
See Also
The HTTP 1.1 protocol specification can be found at
ftp://ftp.isi.edu/in-notes/rfc2616.txt.
Section 14.43 defines the User-Agent header.
The Web Robots Database at
http://www.robotstxt.org/wc/active.html
lists user agent strings for many Web robots/crawlers.
Macintosh users will find detailed instructions for modifying browser user agent values at
http://macfaq.org/hacks/useragent.html.
User groups have permission to reprint this article for free
as described on the copyrights page.