History
The term "cookie" was derived from "magic cookie", which is a packet of data that a program receives and sends back unchanged. Magic cookies were already used in computing when computer programmer Lou Montulli had the idea of using them in Web communications in June 1994.[6] At the time, he was an employee of Netscape Communications, which was developing an e-commerce application for a customer. The customer was MCI and the application was the "MCI Mall". Vint Cerf and John Klensin represented MCI in technical discussions with Netscape Communications. MCI did not want the MCI Mall servers to have to retain partial transaction states, which led it to ask Netscape to find a way to store that state in each user's computer. Cookies provided a solution to the problem of reliably implementing a virtual shopping cart.[7][8]
Together with John Giannandrea, Montulli wrote the initial Netscape cookie specification the same year. Version 0.9beta of Mosaic Netscape, released on October 13, 1994,[9][10] supported cookies. The first use of cookies (out of the labs) was checking whether visitors to the Netscape website had already visited the site. Montulli applied for a patent for the cookie technology in 1995, and US 5774670 was granted in 1998. Support for cookies was integrated in Internet Explorer in version 2, released in October 1995.[11]
The introduction of cookies was not widely known to the public at the time. In particular, cookies were accepted by default, and users were not notified of the presence of cookies. The general public learned about them after the Financial Times published an article about them on February 12, 1996.[citation needed] In the same year, cookies received a lot of media attention, especially because of potential privacy implications. Cookies were discussed in two U.S. Federal Trade Commission hearings in 1996 and 1997.
The development of the formal cookie specifications was already ongoing. In particular, the first discussions about a formal specification started in April 1995 on the www-talk mailing list. A special working group within the IETF was formed. Two alternative proposals for introducing state in HTTP transactions had been put forward by Brian Behlendorf and David Kristol respectively, but the group, headed by Kristol himself and Aron Afatsuom, soon decided to use the Netscape specification as a starting point. In February 1996, the working group identified third-party cookies as a considerable privacy threat. The specification produced by the group was eventually published as RFC 2109 in February 1997. It specifies that third-party cookies are either not allowed at all, or at least not enabled by default.
At this time, advertising companies were already using third-party cookies. The recommendation about third-party cookies of RFC 2109 was not followed by Netscape and Internet Explorer. RFC 2109 was superseded by RFC 2965 in October 2000.
A definitive specification for cookies as used in the real world was published as RFC 6265 in April 2011.
Terminology
Session cookie
A session cookie[12] lasts only for the duration of the user's visit to the website. A web browser normally deletes session cookies when it quits. A session cookie is created when neither an Expires nor a Max-Age directive is provided at cookie creation time.
Persistent cookie
A persistent cookie[12] outlasts user sessions. If a persistent cookie has its Max-Age set to 1 year, then, within the year, the initial value set in that cookie is sent back to the server every time the user visits the server. This could be used to record a vital piece of information such as how the user initially came to this website. For this reason, persistent cookies are also called tracking cookies.
Secure cookie
A secure cookie has the Secure attribute enabled and is only used via HTTPS, ensuring that the cookie is always encrypted when transmitted from client to server. This makes the cookie less likely to be exposed to cookie theft via eavesdropping.
HttpOnly cookie
The HttpOnly cookie is supported by most modern browsers.[13][14] On a supporting browser, an HttpOnly session cookie will be used only when transmitting HTTP (or HTTPS) requests, thus restricting access from other, non-HTTP APIs (such as JavaScript). This restriction mitigates but does not eliminate the threat of session cookie theft via cross-site scripting (XSS).[15] This feature applies only to session-management cookies, and not other browser cookies.
Third-party cookie
First-party cookies are cookies set with the same domain (or a subdomain of it) as the one shown in the browser's address bar. Third-party cookies are cookies set with a domain different from the one shown in the address bar; this happens when a web page features content from a third-party domain, for example an advertisement banner served by www.advexample.com. Privacy settings in most modern browsers allow third-party tracking cookies to be blocked.
For example: Suppose a user visits www.example1.com, which sets a cookie with the domain ad.foxytracking.com. When the user later visits www.example2.com, another cookie is set with the domain ad.foxytracking.com. Eventually, both of these cookies will be sent to the advertiser when loading their ads or visiting their website. The advertiser can then use these cookies to build up a browsing history of the user across all the websites on which this advertiser has a presence.
Supercookie
A "supercookie" is a cookie with a public suffix domain, like .com, .co.uk or k12.ca.us.[16]
Most browsers, by default, allow first-party cookies, that is, cookies whose domain is the requesting host itself or a parent domain of it. For example, a user visiting www.example.com can have a cookie set with domain www.example.com or .example.com, but not .com.[17] A supercookie with domain .com would be blocked by browsers; otherwise, a malicious website, like attacker.com, could set a supercookie with domain .com and potentially disrupt or impersonate legitimate user requests to example.com. The Public Suffix List is a cross-vendor initiative to provide an accurate list of domain name suffixes, which changes over time.[18] Older versions of browsers may not have the most up-to-date list, and will therefore be vulnerable to certain supercookies.
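The check described above can be sketched in Python as follows. This is only an illustration: the three-entry `PUBLIC_SUFFIXES` set is a hypothetical stand-in for the real Public Suffix List, and the function names are invented for this example. Real browsers apply further rules that are not modelled here.

```python
# Hypothetical, hard-coded excerpt; a real browser consults the full
# Public Suffix List, which is much longer and changes over time.
PUBLIC_SUFFIXES = {"com", "co.uk", "k12.ca.us"}


def is_public_suffix(domain: str) -> bool:
    """Return True if the domain (with or without a leading dot) is a public suffix."""
    return domain.lower().lstrip(".") in PUBLIC_SUFFIXES


def may_set_cookie_domain(request_host: str, cookie_domain: str) -> bool:
    """Decide whether a response from request_host may set a cookie for cookie_domain."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    if is_public_suffix(domain):
        return False  # this would be a supercookie
    # The cookie domain must be the host itself or a parent domain of it.
    return host == domain or host.endswith("." + domain)
```

With this sketch, `www.example.com` may set a cookie for `.example.com`, while any attempt to set one for `.com` is refused regardless of the requesting host.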
The term "supercookies" is sometimes used for tracking technologies that do not rely on HTTP cookies. Two such "supercookie" mechanisms were found on Microsoft websites: cookie syncing that respawned MUID cookies, and ETag cookies.[19] Due to media attention, Microsoft later disabled this code:
In response to recent attention on "supercookies"
in the media, we wanted to share more detail on the immediate action we
took to address this issue, as well as affirm our commitment to the
privacy of our customers. According to researchers, including Jonathan
Mayer at Stanford University, "supercookies" are capable of re-creating
users' cookies or other identifiers after people deleted regular
cookies. Mr. Mayer identified Microsoft as one among others that had
this code, and when he brought his findings to our attention we promptly
investigated. We determined that the cookie behavior he observed was
occurring under certain circumstances as a result of older code that was
used only on our own sites, and was already scheduled to be
discontinued. We accelerated this process and quickly disabled this
code. At no time did this functionality cause Microsoft cookie
identifiers or data associated with those identifiers to be shared
outside of Microsoft.
Zombie cookie
A zombie cookie is any cookie that is automatically recreated after a user has deleted it. This is accomplished by a script storing the content of the cookie in other locations, such as the local storage available to Flash content, HTML5 storage and other client-side mechanisms, and then recreating the cookie from these backup stores when the cookie's absence is detected.
Uses
Session management
Cookies may be used to maintain data related to the user during navigation, possibly across multiple visits. Cookies were introduced to provide a way to implement a "shopping cart" (or "shopping basket"),[7][8] a virtual device into which users can store items they want to purchase as they navigate throughout the site.
Shopping basket applications today usually store the list of basket contents in a database on the server side, rather than storing basket items in the cookie itself. A web server typically sends a cookie containing a unique session identifier. The web browser sends back that session identifier with each subsequent request, and the server stores the shopping basket items in association with that identifier.
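A server-side basket keyed by a session identifier might look like the following Python sketch. The cookie name `session_id`, the in-memory `BASKETS` dictionary (standing in for a database) and the function name are assumptions made for illustration, not part of any particular shop software.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Dict, List, Optional, Tuple

# Session identifier -> basket contents; stands in for a server-side database.
BASKETS: Dict[str, List[str]] = {}


def add_to_basket(cookie_header: Optional[str], item: str) -> Tuple[str, List[str]]:
    """Add an item to the caller's basket and return (session id, basket contents).

    cookie_header is the value of the request's Cookie header, or None if the
    browser sent no cookies. Unknown or missing identifiers get a fresh session.
    """
    cookies = SimpleCookie(cookie_header or "")
    morsel = cookies.get("session_id")
    session_id = morsel.value if morsel is not None else None
    if session_id not in BASKETS:
        session_id = secrets.token_hex(16)  # new, hard-to-guess identifier
        BASKETS[session_id] = []
    BASKETS[session_id].append(item)
    return session_id, BASKETS[session_id]
```

Only the identifier travels in the cookie; the basket itself never leaves the server, so the user cannot alter its contents by editing the cookie.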
Allowing users to log in to a website is a frequent use of cookies.
Typically the web server will first send a cookie containing a unique
session identifier. Users then submit their credentials and the web
application authenticates the session and allows the user access to
services.
Personalization
Cookies may be used to remember information about a user who has visited a website in order to show relevant content in the future. For example, a web server may send a cookie containing the username last used to log in to a website so that it may be filled in for future visits.
Many websites use cookies for personalization based on users' preferences. Users select their preferences by entering them in a web form and submitting the form to the server. The server encodes the preferences in a cookie and sends the cookie back to the browser. This way, every time the user accesses a page, the server also receives the cookie in which the preferences are stored, and can personalize the page according to the user preferences. For example, the Wikipedia website allows authenticated users to choose the webpage skin they like best; the Google search engine once allowed users (even non-registered ones) to decide how many search results per page they want to see.
Tracking
Tracking cookies may be used to track internet users' web browsing. This can also be done in part by using the IP address of the computer requesting the page or the referrer field of the HTTP request header, but cookies allow for greater precision. This can be demonstrated as follows:
- If the user requests a page of the site, but the request contains no
cookie, the server presumes that this is the first page visited by the
user; the server creates a random string and sends it as a cookie back
to the browser together with the requested page;
- From this point on, the cookie will be automatically sent by the
browser to the server every time a new page from the site is requested;
the server sends the page as usual, but also stores the URL of the
requested page, the date/time of the request, and the cookie in a log
file.
By analyzing the log file collected in the process, it is then
possible to find out which pages the user has visited, and in what
sequence.
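The procedure can be sketched in Python as below. The cookie name `visitor`, the in-memory `LOG` list (in place of a log file) and the function names are hypothetical; for simplicity the sketch also logs the first request, in which the identifier is issued.

```python
import secrets
from datetime import datetime, timezone
from http.cookies import SimpleCookie
from typing import List, Optional, Tuple

# Each entry is (visitor id, ISO timestamp, requested URL); stands in for a log file.
LOG: List[Tuple[str, str, str]] = []


def handle_request(url: str, cookie_header: Optional[str]) -> Optional[str]:
    """Log a request; return a Set-Cookie value if a new identifier was issued."""
    cookies = SimpleCookie(cookie_header or "")
    set_cookie = None
    if "visitor" in cookies:
        visitor_id = cookies["visitor"].value
    else:
        # No cookie: presume a first visit and issue a random identifier.
        visitor_id = secrets.token_hex(8)
        set_cookie = "visitor=" + visitor_id
    LOG.append((visitor_id, datetime.now(timezone.utc).isoformat(), url))
    return set_cookie


def pages_visited(visitor_id: str) -> List[str]:
    """Return the URLs requested by one visitor, in the order they were logged."""
    return [url for vid, _, url in LOG if vid == visitor_id]
```

Calling `pages_visited` with an identifier reproduces the analysis step: it yields the pages that browser requested, in sequence.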
Implementation
Cookies are arbitrary pieces of data chosen by the Web server and sent to the browser. The browser returns them unchanged to the server, introducing a state (memory of previous events) into otherwise stateless HTTP transactions. Without cookies, each retrieval of a Web page or component of a Web page is an isolated event, mostly unrelated to all other views of the pages of the same site. Other than being set by a web server, cookies can also be set by a script in a language such as JavaScript, if supported and enabled by the Web browser.
Cookie specifications[14][21][22] suggest that browsers should be able to save and send back a minimum number of cookies. In particular, a web browser is expected to be able to store at least 300 cookies of four kilobytes each, and at least 20 cookies per server or domain.
Setting a cookie
Transfer of Web pages follows the HyperText Transfer Protocol (HTTP). Regardless of cookies, browsers request a page from web servers by sending them a usually short piece of text called an HTTP request.
For example, to access the page http://www.example.org/index.html,
browsers connect to the server www.example.org sending it a request that
looks like the following one:
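A minimal request of this kind can be sketched in Python as follows. The helper name `build_request` is hypothetical, and real browsers send many more header fields than the two lines shown.

```python
def build_request(host: str, path: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request (header section only)."""
    lines = [
        "GET " + path + " HTTP/1.1",  # request line: method, path, protocol version
        "Host: " + host,              # the Host header is mandatory in HTTP/1.1
        "",                           # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines)


print(build_request("www.example.org", "/index.html"))
```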
The server replies by sending the requested page preceded by a similar packet of text, called an HTTP response. This packet may contain lines requesting the browser to store cookies:
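As an illustration, the Python sketch below assembles such a response with the standard `http.cookies` module. The function name `build_response` and the cookie names and values used with it are placeholders chosen for this example.

```python
from http.cookies import SimpleCookie
from typing import Dict


def build_response(body: str, cookies: Dict[str, str]) -> str:
    """Return the text of a minimal HTTP response that sets the given cookies."""
    jar = SimpleCookie()
    for name, value in cookies.items():
        jar[name] = value
    head = ["HTTP/1.1 200 OK", "Content-Type: text/html"]
    # One Set-Cookie line per cookie the server wants the browser to store.
    head += ["Set-Cookie: " + morsel.OutputString() for morsel in jar.values()]
    return "\r\n".join(head) + "\r\n\r\n" + body


print(build_response("(content of page)", {"name": "value", "name2": "value2"}))
```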
The server sends lines of Set-Cookie only if the server wishes the browser to store cookies. Set-Cookie is a directive for the browser to store the cookie and send it back in future requests to the server (subject to expiration time or other cookie attributes), if the browser supports cookies and cookies are enabled. For example, the browser requests the page http://www.example.org/spec.html by sending the server www.example.org a request like the following:
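Such a request, and the way a server might read the cookies back out of it, can be sketched in Python as below. Both function names and the cookie names and values are hypothetical; the parsing relies on the standard `http.cookies` module.

```python
from http.cookies import SimpleCookie
from typing import Dict


def build_request_with_cookies(host: str, path: str, cookies: Dict[str, str]) -> str:
    """Return a minimal GET request that carries the stored name-value pairs."""
    lines = ["GET " + path + " HTTP/1.1", "Host: " + host]
    if cookies:
        # All cookies go into a single Cookie header, separated by "; ".
        pairs = "; ".join(name + "=" + value for name, value in cookies.items())
        lines.append("Cookie: " + pairs)
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_cookie_header(request_text: str) -> Dict[str, str]:
    """Extract the name-value pairs from the Cookie header of a request, if any."""
    for line in request_text.split("\r\n"):
        if line.lower().startswith("cookie:"):
            jar = SimpleCookie(line.split(":", 1)[1].strip())
            return {name: morsel.value for name, morsel in jar.items()}
    return {}
```

Only names and values appear in the Cookie header; attributes such as the expiration time are kept by the browser and are not sent back.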
This is a request for another page from the same server, and differs
from the first one above because it contains the string that the server
has previously sent to the browser. This way, the server knows that this
request is related to the previous one. The server answers by sending
the requested page, possibly adding other cookies as well.
The value of a cookie can be modified by the server by sending a new Set-Cookie: name=newvalue line in response to a page request. The browser then replaces the old value with the new one.
The term "cookie crumb" is sometimes used to refer to the name-value pair.[23] This is not the same as breadcrumb web navigation, which is the technique of showing in each page the list of pages the user has previously visited; this technique, however, may be implemented using cookies.
Cookies can also be set by JavaScript or similar scripts running within the browser. In JavaScript, the object document.cookie is used for this purpose. For example, the instruction document.cookie = "temperature=20" creates a cookie of name temperature and value 20.[24]
Cookie attributes
Besides the name–value pair, servers can also set these cookie
attributes: a cookie domain, a path, expiration time or maximum age,
Secure flag and HttpOnly flag. Browsers will not send cookie attributes
back to the server. They will only send the cookie’s name-value pair.
Cookie attributes are used by browsers to determine when to delete a
cookie, block a cookie or whether to send a cookie (name-value pair) to
the servers.
Domain and Path
The cookie domain and path define the scope of the cookie—they tell
the browser that cookies should only be sent back to the server for the
given domain and path. If not specified, they default to the domain and
path of the object that was requested. An example of Set-Cookie
directives from a website after a user logged in:
The first cookie, LSID, has default domain docs.foo.com and Path /accounts, which tells the browser to use the cookie only when requesting pages contained in docs.foo.com/accounts. The other two cookies, HSID and SSID, would be sent back by the browser while requesting any subdomain in .foo.com on any path, for example www.foo.com/.
Cookies can only be set on the top domain and its subdomains. Setting cookies on www.foo.com from www.bar.com will not work for security reasons.[25]
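The scoping rules can be illustrated with the Python sketch below, which loosely follows the domain-matching and path-matching rules of RFC 6265. It is a simplification: the function names are invented for this example, IP-address hosts are not handled, and a cookie with a default domain is treated like one with an explicit Domain attribute.

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """Return True if host equals the cookie domain or is a subdomain of it."""
    host = host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """Return True if request_path lies at or below cookie_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # "/accounts" covers "/accounts/x" but not "/accountsx".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def should_send(host: str, path: str, cookie_domain: str, cookie_path: str) -> bool:
    """Decide whether a browser would attach the cookie to a request for host and path."""
    return domain_matches(host, cookie_domain) and path_matches(path, cookie_path)
```

For instance, `should_send("www.foo.com", "/", ".foo.com", "/")` holds, while a cookie scoped to `docs.foo.com` and `/accounts` is withheld from a request for `/other` on the same host.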
Expires and Max-Age
The Expires directive tells the browser when to delete the cookie. It is specified in the form of “Wdy, DD Mon YYYY HH:MM:SS GMT”,[26] indicating the exact date/time this cookie will expire. As an alternative to setting cookie expiration as an absolute date/time, RFC 6265 allows the use of the Max-Age attribute to set the cookie’s expiration as an interval of seconds in the future, relative to the time the browser received the cookie. An example of Set-Cookie directives from a website after a user logged in:
The first cookie, lu, is set to expire sometime on 15-Jan-2013; it will be used by the client browser until that time. The second cookie, made_write_conn, does not have an expiration date, making it a session cookie. It will be deleted after the user closes the browser. The third cookie, reg_fb_gate, has its value changed to deleted, with an expiration time in the past. The browser will delete this cookie right away. Note that a cookie will only be deleted when the domain and path attributes in the Set-Cookie field match the values used when the cookie was created.
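How a browser might turn these two attributes into a single expiry time is sketched below in Python. The function names are hypothetical, the date strings are illustrative, and the rule that Max-Age takes precedence over Expires is taken from RFC 6265.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime
from typing import Optional


def expiry_time(received_at: datetime,
                expires: Optional[str] = None,
                max_age: Optional[int] = None) -> Optional[datetime]:
    """Return the absolute expiry time of a cookie, or None for a session cookie."""
    if max_age is not None:
        # Max-Age is relative to the moment the browser received the cookie.
        return received_at + timedelta(seconds=max_age)
    if expires is not None:
        # Expires is an absolute date such as "Tue, 15 Jan 2013 21:47:38 GMT".
        return parsedate_to_datetime(expires)
    return None


def is_expired(expiry: Optional[datetime], now: datetime) -> bool:
    """Session cookies (expiry None) never expire by time; others expire once due."""
    return expiry is not None and expiry <= now
```

A server deletes a cookie by sending it again with an expiry in the past, for which `is_expired` returns True immediately.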
Secure and HttpOnly
The Secure and HttpOnly attributes do not have associated values.
Rather, the presence of the attribute names indicates that the Secure
and HttpOnly behaviors are specified.
The Secure attribute is meant to keep cookie communication limited to encrypted transmission, directing browsers to use cookies only via secure/encrypted connections. Naturally, web servers should set Secure cookies via secure/encrypted connections, lest the cookie information be transmitted in a way that allows eavesdropping when first sent to the web browser.
The HttpOnly attribute directs browsers to use cookies via the HTTP protocol only. (This includes HTTPS; HttpOnly is not the opposite of Secure.) An HttpOnly cookie is not accessible via non-HTTP methods, such as calls via JavaScript (e.g., referencing "document.cookie"), and therefore cannot be stolen easily via cross-site scripting (a pervasive attack technique[27]). As shown in previous examples, both Facebook and Google use the HttpOnly attribute extensively.
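On the server side, both flags can be attached with Python's standard `http.cookies` module, as in the sketch below. The function name and the cookie name and value used with it are placeholders.

```python
from http.cookies import SimpleCookie


def session_cookie_header(name: str, value: str) -> str:
    """Return a Set-Cookie value carrying both the Secure and HttpOnly flags."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["secure"] = True    # send only over encrypted connections
    jar[name]["httponly"] = True  # hide from document.cookie and similar script APIs
    return jar[name].OutputString()


print("Set-Cookie: " + session_cookie_header("sid", "abc123"))
```

Both attributes appear in the output as bare names without a value, matching the description above.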
Browser settings
Most modern browsers support cookies and allow the user to disable them. The following are common options:
[28]
- To enable or disable cookies completely, so that they are always accepted or always blocked.
- Some browsers incorporate a cookie manager for the user to see and
selectively delete the cookies currently stored in the browser.
- By default, Internet Explorer allows only third-party cookies that are accompanied by a P3P "CP" (Compact Policy) field.[29]
Most browsers also allow a full wipe of private data including cookies. Add-on tools for managing cookie permissions also exist.
Privacy and third-party cookies
Cookies have some important implications for the privacy and anonymity of Web users. While cookies are sent only to the server setting them or to a server in the same Internet domain, a Web page may contain images or other components stored on servers in other domains. Cookies that are set during retrieval of these components are called third-party cookies. The standards for cookies, RFC 2109 and RFC 2965, specify that browsers should protect user privacy and not allow third-party cookies by default. But most browsers, such as Mozilla Firefox, Internet Explorer, Opera and Google Chrome, do allow third-party cookies by default, as long as the third-party website has a Compact Privacy Policy published.
In this fictional example, an advertising company has placed banners in
two websites. Hosting the banner images on its servers and using
third-party cookies, the advertising company is able to track the
browsing of users across these two sites.
Advertising companies use third-party cookies to track a user across multiple sites. In particular, an advertising company can track a user across all pages where it has placed advertising images or web bugs. Knowledge of the pages visited by a user allows the advertising company to target advertisements to the user's presumed preferences.
Website operators who do not disclose third-party cookie use to consumers run the risk of harming consumer trust if cookie use is discovered. Having clear disclosure (such as in a privacy policy) tends to eliminate any negative effects of such cookie discovery.[34]
The possibility of building a profile of users is considered by some a
potential privacy threat, especially when tracking is done across
multiple domains using third-party cookies. For this reason, some
countries have legislation about cookies.
The United States government set strict rules on setting cookies in 2000 after it was disclosed that the White House drug policy office used cookies to track computer users viewing its online anti-drug advertising. In 2002, privacy activist Daniel Brandt found that the CIA had been leaving persistent cookies on computers which had visited its website. When notified it was violating policy, the CIA stated that these cookies were not intentionally set and stopped setting them.[35] On December 25, 2005, Brandt discovered that the National Security Agency (NSA) had been leaving two persistent cookies on visitors' computers due to a software upgrade. After being informed, the National Security Agency immediately disabled the cookies.[36]
The 2002 European Union telecommunication privacy Directive contains rules about the use of cookies.
[37]
In particular, Article 5, Paragraph 3 of this directive mandates that
storing data (like cookies) in a user's computer can only be done if:
- the user is provided information about how this data is used;
- the user is given the possibility of denying this storing operation.
However, this article also states that storing data that is necessary
for technical reasons is exempted from this rule. This directive was
expected to have been applied since October 2003, but a December 2004 report says (page 38) that this provision was not applied in practice, and that some member countries (Slovakia, Latvia, Greece, Belgium, and Luxembourg) did not even implement the provision in national law. The same report suggests a thorough analysis of the situation in the Member States.
The P3P specification includes the possibility for a server to state a privacy policy, which specifies which kind of information it collects and for which purpose. These policies include (but are not limited to) the use of information gathered using cookies. According to the P3P specification, a browser can accept or reject cookies by comparing the privacy policy with the stored user preferences or ask the user, presenting them the privacy policy as declared by the server.
Many web browsers, including Apple's Safari and Microsoft Internet Explorer versions 6 and 7, support P3P, which allows the web browser to determine whether to allow third-party cookies to be stored. The Opera web browser allows users to refuse third-party cookies and to create global and specific security profiles for Internet domains.[38] Firefox 2.x dropped this option from its menu system but restored it with the release of version 3.x.[39]
Third-party cookies can be blocked by most browsers to increase
privacy and reduce tracking by advertising and tracking companies
without negatively affecting the user's Web experience.
[40]
Many advertising operators have an opt-out option to behavioural
advertising, with a generic cookie in the browser stopping behavioural
advertising.
Cookie theft and session hijacking
Most websites use cookies as the only identifiers for user sessions,
because other methods of identifying web users have limitations and
vulnerabilities. If a website uses cookies as session identifiers,
attackers can impersonate users’ requests by stealing a full set of
victims’ cookies. From the web server's point of view, a request from an
attacker has the same authentication as the victim’s requests; thus the
request is performed on behalf of the victim’s session.
Listed here are various scenarios of cookie theft and user session
hijacking (even without stealing user cookies) which work with websites
which rely solely on HTTP cookies for user identification.
Network eavesdropping
A cookie can be stolen by another computer that is allowed to read from the network.
Traffic on a network can be intercepted and read by computers on the network other than the sender and receiver (particularly over unencrypted open Wi-Fi). This traffic includes cookies sent on ordinary unencrypted HTTP sessions. Where network traffic is not encrypted, attackers can therefore read the communications of other users on the network, including HTTP cookies as well as the entire contents of the conversations.
An attacker could use intercepted cookies to impersonate a user and perform a malicious task, such as transferring money out of the victim’s bank account.
This issue can be resolved by securing the communication between the user's computer and the server by employing Transport Layer Security (HTTPS protocol) to encrypt the connection. A server can specify the Secure flag while setting a cookie, which will cause the browser to send the cookie only over an encrypted channel, such as an SSL connection.[14]
Publishing false sub-domain – DNS cache poisoning
Via DNS cache poisoning, an attacker might be able to cause a DNS server to cache a fabricated DNS entry, say f12345.www.example.com, with the attacker’s server IP address. The attacker can then post an image URL from his own server (for example, http://f12345.www.example.com/img_4_cookie.jpg). Victims reading the attacker’s message would download this image from f12345.www.example.com. Since f12345.www.example.com is a sub-domain of www.example.com, victims’ browsers would submit all example.com-related cookies to the attacker’s server; the compromised cookies would also include HttpOnly cookies.[clarification needed]
This vulnerability is usually for Internet Service Providers to fix, by securing their DNS servers. But it can also be mitigated if www.example.com is using Secure cookies. Victims’ browsers will not submit Secure cookies if the attacker’s image is not using encrypted connections. If the attacker chose to use HTTPS for his img_4_cookie.jpg download, he would have the challenge[42] of obtaining an SSL certificate for f12345.www.example.com from a Certificate Authority. Without a proper SSL certificate, victims’ browsers would display (usually very visible) warning messages about the invalid certificate, thus alerting victims as well as security officials from www.example.com.
Cross-site scripting – cookie theft
Scripting languages such as JavaScript and JScript are usually allowed to access cookie values and have some means to send arbitrary values to arbitrary servers on the Internet. These facts are used in combination with sites allowing users to post HTML content that other users can see.
As an example, an attacker may post a message on www.example.com with the following link:
When another user clicks on this link, the browser executes the piece of code within the onclick attribute, thus replacing the string document.cookie with the list of cookies of the user that are active for the page. As a result, this list of cookies is sent to the attacker.com server. If the attacker’s posting is on https://www.example.com/somewhere, secure cookies will also be sent to attacker.com in plain text.
Cross-site scripting is a constant threat, as there are always some crackers trying to find a way of slipping in script tags to websites. It is the responsibility of the website developers to filter out such malicious code.
In the meantime, such attacks can be mitigated by using HttpOnly cookies. These cookies will not be accessible by client-side script, and therefore the attacker will not be able to gather these cookies.
Cross-site scripting
If an attacker was able to insert a piece of script into a page on www.example.com, and a victim’s browser was able to execute the script, the script could simply carry out the attack. This attack would use the victim’s browser to send HTTP requests to servers directly; therefore, the victim’s browser would submit all relevant cookies, including HttpOnly cookies, as well as Secure cookies if the script request is on HTTPS.
For example, on MySpace, Samy posted a short message “Samy is my hero” on his profile, with a hidden script to send Samy a “friend request” and then post the same message on the victim’s profile. A user reading Samy’s profile would send Samy a “friend request” and post the same message on this person’s profile. Then, the third person reading the second person’s profile would do the same. Pretty soon, this Samy worm became one of the fastest spreading worms of all time.
This type of attack (with automated scripts) would not work if a website had CAPTCHA to challenge client requests.
Cross-site scripting – proxy request
In older versions of browsers, there were security holes allowing attackers to script a proxy request by using XMLHttpRequest. For example, a victim is reading an attacker’s posting on www.example.com, and the attacker’s script is executed in the victim’s browser. The script generates a request to www.example.com with the proxy server attacker.com. Since the request is for www.example.com, all example.com cookies will be sent along with the request, but routed through the attacker’s proxy server; hence, the attacker can harvest the victim’s cookies.
This attack would not work for Secure cookies, since Secure cookies go with HTTPS connections, and the HTTPS protocol dictates end-to-end encryption, i.e., the information is encrypted on the user’s browser and decrypted on the destination server www.example.com, so the proxy servers would only see encrypted bits and bytes.
Cross-site request forgery
For example, Bob might be browsing a chat forum where another user,
Mallory, has posted a message. Suppose that Mallory has crafted an HTML
image element that references an action on Bob's bank's website (rather
than an image file), e.g.,
If Bob's bank keeps his authentication information in a cookie, and if
the cookie hasn't expired, then the attempt by Bob's browser to load the
image will submit the withdrawal form with his cookie, thus authorizing
a transaction without Bob's approval.
Drawbacks of cookies
Besides privacy concerns, cookies also have some technical drawbacks. In particular, they do not always accurately identify users, they can be used for security attacks, and they are often at odds with the Representational State Transfer (REST) software architectural style.[43][44]
Inaccurate identification
If more than one browser is used on a computer, each usually has a
separate storage area for cookies. Hence cookies do not identify a
person, but a combination of a user account, a computer, and a Web
browser. Thus, anyone who uses multiple accounts, computers, or browsers
has multiple sets of cookies.
Likewise, cookies do not differentiate between multiple users who share the same
user account, computer, and browser.
Inconsistent state on client and server
The use of cookies may generate an inconsistency between the state of
the client and the state as stored in the cookie. If the user acquires a
cookie and then clicks the "Back" button of the browser, the state on
the browser is generally not the same as before that acquisition. As an
example, if the shopping cart of an online shop is built using cookies,
the content of the cart may not change when the user goes back in the
browser's history: if the user presses a button to add an item in the
shopping cart and then clicks on the "Back" button, the item remains in
the shopping cart. This might not be the intention of the user, who
possibly wanted to undo the addition of the item. This can lead to
unreliability, confusion, and bugs. Web developers should therefore be
aware of this issue and implement measures to handle such situations.
Alternatives to cookies
Some of the operations that can be done using cookies can also be done using other mechanisms.
IP address
Some users may be tracked based on the IP address of the computer requesting the page. The server knows the IP address of the computer running the browser or the proxy, if any is used, and could theoretically link a user's session to this IP address.
IP addresses are, generally, not a reliable way to track a session or identify a user. Many computers designed to be used by a single user, such as office PCs or home PCs, are behind a network address translator (NAT). This means that several PCs will share a public IP address. Furthermore, some systems, such as Tor, are designed to retain Internet anonymity, rendering tracking by IP address impractical, impossible, or a security risk.
URL (query string)
A more precise technique is based on embedding information into URLs. The query string part of the URL is the one that is typically used for this purpose, but other parts can be used as well. The Java Servlet and PHP session mechanisms both use this method if cookies are not enabled.
This method consists of the Web server appending query strings to the
links of a Web page it holds when sending it to a browser. When the
user follows a link, the browser returns the attached query string to
the server.
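As an illustration of this link rewriting, the following sketch uses Python's standard urllib.parse module. The function names and the parameter name "sid" are assumptions chosen for the example; they are not the names used by the Java Servlet or PHP mechanisms.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def add_session_to_url(url, session_id, param="sid"):
    """Return the URL with the session identifier in its query string."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any stale copy of the session one.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != param]
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))

def read_session_from_url(url, param="sid"):
    """Return the session identifier carried by the URL, or None."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)

# The server rewrites a link before sending the page ...
link = add_session_to_url("/cart?item=42", "abc123")
print(link)
# ... and recovers the identifier when the browser follows that link.
print(read_session_from_url(link))
```

Because the identifier travels inside the URL, anyone who receives a copy of the link also receives the identifier.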
Query strings used in this way and cookies are very similar, both
being arbitrary pieces of information chosen by the server and sent back
by the browser. However, there are some differences. Since a query
string is part of a URL, if that URL is later reused, the same attached
piece of information is sent to the server. For example, if the
preferences of a user are encoded in the query string of a URL and the
user sends this URL to another user by e-mail, those preferences will be
used for that other user as well.
Moreover, even if the same user accesses the same page twice,
there is no guarantee that the same query string is used both times.
For example, if the user arrives at the page from a page internal to the
site the first time and from an external search engine the second time,
the two query strings would typically differ, while the cookies would be
the same. For more details, see query string.
Other drawbacks of query strings are related to security: storing
data that identifies a session in a query string enables or simplifies
session fixation attacks, referrer logging attacks and other security
exploits. Transferring session identifiers as HTTP cookies is more secure.
Hidden form fields
Another form of session tracking is to use web forms with hidden fields.
This technique is very similar to using URL query strings to hold the
information and has many of the same advantages and drawbacks. If the
form is handled with the HTTP GET method, the fields actually become part
of the URL the browser sends upon form submission. Most forms, however,
are handled with HTTP POST, which causes the form information, including
the hidden fields, to be sent in the body of the request, which is neither
part of the URL nor of a cookie.
This approach presents two advantages from the point of view of the
tracker: first, having the tracking information placed in the HTML
source and POST input rather than in the URL means it will not be
noticed by the average user; second, the session information is not
copied when the user copies the URL (to save the page on disk or send it
via email, for example).
This method can be easily used with any framework that supports web forms.
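A minimal sketch of the hidden-field technique follows, using only the Python standard library. The field name "sid", the form action "/checkout", and both function names are hypothetical and chosen for illustration.

```python
from html import escape
from urllib.parse import parse_qs

def render_form(session_id):
    """Return an HTML form that carries the session in a hidden field."""
    safe_value = escape(session_id, quote=True)  # avoid breaking the markup
    return (
        '<form method="post" action="/checkout">\n'
        f'  <input type="hidden" name="sid" value="{safe_value}">\n'
        '  <input type="submit" value="Continue">\n'
        '</form>'
    )

def session_from_post_body(body):
    """Extract the hidden field from a URL-encoded POST body, or None."""
    values = parse_qs(body).get("sid")
    return values[0] if values else None

print(render_form("abc123"))
# On submission the browser sends the field in the request body, not the URL.
print(session_from_post_body("sid=abc123&quantity=2"))
```

With POST, the identifier appears in neither the address bar nor a copied link, which matches the two advantages listed above.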
window.name
All current web browsers can store a fairly large amount of data (2–32 MB)
via JavaScript using the DOM property window.name. This data can be used
instead of session cookies and is also cross-domain. The technique can be
coupled with JSON/JavaScript objects to store complex sets of session
variables [45] on the client side.
The downside is that every separate window or tab initially has an empty
window.name; with tabbed browsing, this means that each tab the user opens
individually starts without a window name. Furthermore, window.name can be
used for tracking visitors across different websites, making it of concern
for Internet privacy.
In some respects this can be more secure than cookies because the data is
not automatically sent to the server with each request, so it is not
vulnerable to network cookie sniffing attacks. However, if special
measures are not taken to protect the data, it is vulnerable to other
attacks because the data is available across different websites opened in
the same window or tab.
HTTP authentication
HTTP includes the basic access authentication and the digest access
authentication protocols, which allow access to a Web page only when the
user has provided the correct username and password. If the server
requires such credentials for granting access to a web page, the browser
requests them from the user and, once obtained, the browser stores and
sends them in every subsequent page request. This information can be used
to track the user.
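For basic access authentication, the value the browser sends with each request can be sketched as follows. The function names are hypothetical; the encoding itself (the word "Basic" followed by the Base64 form of "username:password") is the one the basic scheme defines. Digest authentication works differently and is not shown.

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header value for basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

def parse_basic_auth_header(header):
    """Recover (username, password) from a basic Authorization value."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a basic authentication header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the two parts; a password may
    # itself contain colons.
    username, _, password = decoded.partition(":")
    return username, password

header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(parse_basic_auth_header(header))
```

Since the same header value accompanies every request to the protected pages, the server can associate those requests with one user, much as it would with a session cookie. Base64 is an encoding rather than encryption, so the value is readable by anyone who can observe an unencrypted connection.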