Masquerading Your Browser
See also this related tool: Eric's HTTP Header Viewer
And this related article: How to Detect Internet Explorer
When your browser requests a page from a Web server, the browser sends information about itself along with the request. Typically, this information includes the browser type (Internet Explorer, Opera, Mozilla, etc.), the browser version, and the underlying platform (Windows XP, Linux, Mac OS X, etc.). The server then uses this information to select an appropriate page format for the browser, since different browsers (and even different versions of the same browser) have varying incompatibilities in their support for HTML and JavaScript.
Sometimes the server misinterprets or fails to recognize this information and sends you an incorrectly formatted page. In rare cases, the server may even deny you access to its pages, whether it's for political reasons (you're using a browser that the site disapproves of) or because its pages have only been tested for use with specific browser versions. The solution is to fool the server by having your browser masquerade as another browser.
When a browser requests a page, it also sends a set of values to the server. These value strings are called headers. If you use Internet Explorer to fetch the URL http://www.amazon.com/index.html, this is what the browser might send to Amazon's server:
GET /index.html HTTP/1.1
Host: www.amazon.com
Accept: */*
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
The last four lines of this request are headers. Each header consists of a name and a value, separated by a colon. Most headers are optional, and the standard ones are defined by the HTTP protocol specification. A browser identifies itself using the User-Agent header. (User agent is a generic term for an application, like a browser, that is acting as an agent for the user.) The header value consists of a series of product identifiers and/or comments. A product identifier is a string like "Mozilla/4.0" or "Opera/7.02" that identifies the product by name and (optionally) by version. Additional attributes about the product, referred to as comments, are enclosed in parentheses, such as "(compatible; MSIE 6.0; Windows NT 5.0)". The product identifiers and comments can come in any order, but in general the most significant values are listed first.
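To make the structure concrete, here is a minimal sketch in JavaScript that splits a request like the one above into its headers and then breaks the User-Agent value into product identifiers and comments. The function names parseRequest and parseUserAgent are made up for this illustration, and the parsing is deliberately simplified: it assumes flat comments with no nested parentheses and ignores any request body.

```javascript
// Split raw HTTP request text into its request line and headers.
// Sketch only: assumes one header per line and no request body.
function parseRequest(text) {
  const lines = text.split(/\r?\n/).filter((line) => line !== "");
  const headers = {};
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // not a "name: value" line, so skip it
    headers[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { requestLine: lines[0], headers };
}

// Split a User-Agent value into product identifiers and comments.
// Sketch only: does not handle nested parentheses or escaped characters.
function parseUserAgent(value) {
  const parts = [];
  // Match either a parenthesized comment or a run of non-space characters.
  const pattern = /\(([^()]*)\)|(\S+)/g;
  let match;
  while ((match = pattern.exec(value)) !== null) {
    if (match[1] !== undefined) {
      // Comment: its attributes are separated by semicolons.
      const attributes = match[1]
        .split(";")
        .map((s) => s.trim())
        .filter((s) => s !== "");
      parts.push({ type: "comment", attributes });
    } else {
      // Product identifier: a name, optionally followed by "/" and a version.
      const slash = match[2].indexOf("/");
      parts.push({
        type: "product",
        name: slash === -1 ? match[2] : match[2].slice(0, slash),
        version: slash === -1 ? null : match[2].slice(slash + 1),
      });
    }
  }
  return parts;
}

const sampleRequest = [
  "GET /index.html HTTP/1.1",
  "Host: www.amazon.com",
  "Accept: */*",
  "Accept-Language: en-us",
  "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)",
].join("\r\n");

const parsedRequest = parseRequest(sampleRequest);
console.log(JSON.stringify(parseUserAgent(parsedRequest.headers["User-Agent"]), null, 2));
```

For the sample request this yields one product identifier (name "Mozilla", version "4.0") followed by one comment with three attributes.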
Note: Some browsers do not exactly follow the user agent format defined by the HTTP specification. The Opera browser, for example, appends language information to the string.
You can view your browser's user agent string with JavaScript via the navigator.userAgent property. One way to do this is to enter the following URL (sometimes referred to as a bookmarklet) in your browser:
javascript:document.writeln(navigator.userAgent)
Alternatively, you can create an HTML file that does the same thing and load it in your browser:
<html>
<body>
Your browser sends this User-Agent header:
<pre>
<script type="text/javascript">
<!--
document.writeln( navigator.userAgent );
// -->
</script>
</pre>
</body>
</html>
If you've disabled your browser's JavaScript support, neither technique will work. However, if you refuse to enable JavaScript, you can run a local Web server and configure it to log the user agent headers. For example, logging user agent information with the Apache 2.0 Web server is simply a matter of uncommenting the line:
CustomLog logs/agent.log agent
in the server's httpd.conf file. When you restart the server, user agent details will be appended to the agent.log file in the logs directory.
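If you would rather not install Apache, the same idea can be sketched with a few lines of JavaScript running under Node.js. This is only an illustration, not a replacement for a real Web server: the names agentLogLine and startAgentLogger and the port number 8080 are assumptions made up for this example.

```javascript
const http = require("http");

// Build one log line from a request's headers. Node lowercases incoming
// header names, so the User-Agent value is found under "user-agent".
// A dash is logged when the header is absent.
function agentLogLine(headers) {
  return headers["user-agent"] || "-";
}

// Hypothetical helper: start a local server that records the user agent
// of every request. `log` is any function that accepts a string.
function startAgentLogger(port, log = console.log) {
  const server = http.createServer((req, res) => {
    const line = agentLogLine(req.headers);
    log(line);
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("Your browser sent this User-Agent header:\n" + line + "\n");
  });
  // Listen on the loopback address only, since this is a local test server.
  server.listen(port, "127.0.0.1");
  return server;
}

// Usage (not started automatically):
//   startAgentLogger(8080);
// then load http://127.0.0.1:8080/ in the browser you want to check.
```

Each page request then prints the browser's user agent string to the console and echoes it back in the page, so this works even with JavaScript disabled in the browser.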
Check this out: If you're reading this live on EricGiguere.com, you can easily check your headers using my simple HTTP header viewer page.
The User-Agent header is also used in HTTP requests made by Web crawlers, scanning tools, and other kinds of Web clients, not just browsers. There is no central registry for tracking user agent values, but lists of common settings have been compiled in various places. Here are some sample user agent strings:
Note that most browsers today identify themselves using the "Mozilla" product name (an early name for the Navigator browser), even if they are not derived from the original Netscape Navigator browser. Notice in the listing above that even Microsoft's Internet Explorer identifies itself using the "Mozilla" product name to indicate that it is roughly compatible with a specific version of Navigator or its successors. Today, "Mozilla" also refers to a newer Web browser available from the Mozilla Foundation at www.mozilla.org.
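Because so many browsers lead with "Mozilla", a server cannot rely on the product name alone and usually has to look inside the comment as well. The following JavaScript sketch shows the general idea for Internet Explorer. It is a rough heuristic under stated assumptions, not a reliable detector: the function name internetExplorerVersion is made up for this example, and the check for "Opera" assumes that a masquerading Opera still includes its own name somewhere in the string.

```javascript
// Heuristic only: return the version from the "MSIE" token if the string
// looks like Internet Explorer, or null otherwise.
function internetExplorerVersion(userAgent) {
  // Assumption: an Opera browser imitating Internet Explorer still
  // mentions "Opera" somewhere in its user agent string.
  if (userAgent.includes("Opera")) return null;
  const match = /\bMSIE (\d+(?:\.\d+)*)/.exec(userAgent);
  return match ? match[1] : null;
}

// The "Mozilla/4.0" product name says little; the comment carries the detail.
console.log(internetExplorerVersion("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)")); // prints 6.0
console.log(internetExplorerVersion("Opera/7.02")); // prints null
```

A masquerading browser defeats this kind of check by design, which is exactly why the technique described in this article works.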
A more extensive set of common user agent values and the browsers they refer to can be found at www.pgts.com.au/pgtsj/pgtsj0208c.html. A different list that includes the values used by various Web crawlers is found at www.psychedelix.com/agents.html.
Here are specific instructions on masquerading your browser by modifying the User-Agent value it sends:
Internet Explorer (Windows registry key):
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Internet Settings\User Agent\Post Platform
Mozilla (preference setting):
pref( "general.useragent.override", "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)" );
Another way to masquerade your browser is to access the Internet through a proxy server and have that server modify your outgoing User-Agent headers. Squid, for example, supports the fake_user_agent configuration option. Alternatively, use a proxy server that supports anonymous browsing. A problem with the proxy server approach, however, is that it only works for pages fetched via HTTP. Page requests made securely via HTTPS cannot be modified by the proxy server because all traffic between the browser and the destination Web server is encrypted.
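The header rewriting step that such a proxy performs can be sketched in JavaScript as follows. This is not how Squid or any particular proxy is implemented. It only illustrates the idea, and the function name withFakeUserAgent is made up for this example.

```javascript
// Return a copy of the outgoing request headers with the User-Agent
// replaced. Header names are compared case-insensitively, so an existing
// "user-agent" or "User-Agent" entry is dropped before the fake is added.
function withFakeUserAgent(headers, fakeAgent) {
  const result = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== "user-agent") {
      result[name] = value;
    }
  }
  result["User-Agent"] = fakeAgent;
  return result;
}

const outgoingHeaders = {
  "Host": "www.amazon.com",
  "Accept": "*/*",
  "User-Agent": "Opera/7.02",
};

// The proxy forwards the request with the rewritten headers.
console.log(withFakeUserAgent(outgoingHeaders, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"));
```

The original headers object is left untouched. For an HTTPS request the proxy never sees these headers in the clear, so there is nothing for a function like this to rewrite.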
Fortunately, there are few pitfalls to browser masquerading. If the masquerading doesn't work, you can try a different user agent string or restore your browser's original setting. The real pitfall here is that browser masquerading doesn't fix the underlying problem: badly configured, inflexible Web servers. Masquerading your browser may be necessary for you to use the site, but it doesn't give the site owners any incentive to fix the site.
The HTTP 1.1 protocol specification can be found at ftp://ftp.isi.edu/in-notes/rfc2616.txt. Section 14.43 defines the User-Agent header.
The Web Robots Database at http://www.robotstxt.org/wc/active.html lists user agent strings for many Web robots/crawlers.
Macintosh users will find detailed instructions for modifying browser user agent values at http://macfaq.org/hacks/useragent.html.
User groups have permission to reprint this article for free as described on the copyrights page.
Copyright ©2003-2009 Eric Giguere. Site design and programming by Eric Giguere. This page was last modified on October 28, 2004.