Tracking User Sessions Originating from Proxy Servers
“Bug in the software,” say you. “Nay,” say I.
You have probably heard by now that there is a shortage of available IP addresses. One of the ways that Internet companies get around this is by routing their users through proxy servers. When that happens, every one of those users shows up on your web site with the IP address of the proxy server, and that IP resolves to the physical location of the proxy server, not of the user. Scratching your head, are ya?
How about an example:
Say that users in different parts of the world all use one of those global Internet companies, and each dials into a local number in his or her area. For this example, say the users are located as follows:
User 1: Anaheim, California
User 2: Los Angeles, California
User 3: Buenos Aires, Argentina
User 4: Dallas, Texas
User 5: Madrid, Spain
User 6: Portland, Oregon
Their connections are all routed to a group of proxy servers owned by their ISP (say, BIG COMPANY INC.). Say BIG COMPANY INC. is in Virginia. This means that every time a user who uses BIG COMPANY INC. as an Internet Service Provider connects to your web site, your server records the IP address of the proxy server in the log file, not the address of the user's own machine.
Follow?
Okay. Now, whatever software you happen to be using to create your reports takes each IP in your log files and tries to resolve it through DNS to a host name, to provide you with some meaningful data. You anxiously sit back waiting for your report, and when it is finished you look in absolute horror as you encounter the following scenarios:
1. Your web server in Los Angeles hosts a small site of interest only to local folks, and yet the #1 Most Active Organization is BIG COMPANY INC. in Virginia.
Explanation: Your reporting software resolves the IP in your web server log file. It resolves the IP to BIG COMPANY INC., which is registered where? You guessed it: Virginia. What can you do about it? Not much, I am afraid. There is no way to recover the user's real location from the log file. Cookies won’t even help, since cookies do not store demographic information.
Or
2. You notice the following data:
536,616 Successful Hits
5,627 Page Views
21,862 User Sessions
Explanation: Get this... you are never going to believe this. (And yes, that is more User Sessions than Page Views.) Most companies dynamically assign IPs to each user. This practice is nothing new... it’s been done for some time. But NOW, some big companies are assigning IPs “on the fly.”
“So what?” say you. “Brace yourself,” say I.
Using the example above, say that “user 1” from BIG COMPANY INC connects to your web site via the proxy server with the IP 256.256.256.257. He or she is happily viewing the content of your web site, and happens upon a link to another page on your site. Here’s what occurs:
1. The user, whose first hit was logged as coming from 256.256.256.257, clicks the link.
2. The request goes to the primary proxy server at BIG COMPANY INC., and this time it is directed to a DIFFERENT proxy server, say 256.256.256.259. The reporting software counts this as two different user sessions from BIG COMPANY INC. even though the activity is from one user, and by its own rules rightly so, because the two hits came from two different IPs.
Here is another clincher (just when you thought it was safe to keep reading). Suppose that throughout the day, 300 users connect to your web site from BIG COMPANY INC., and each of these users logs some activity on your server, all from the same proxy IP. As long as 30 minutes never elapse in the log file between hits from that IP, all of it is treated as the same User Session even though several different users initiated the activity. In this case, you will see very few User Sessions with a tremendous amount of activity.
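To see both effects side by side, here is a minimal sketch in Python of the kind of IP-plus-timeout session counting described above. It is only an illustration, not the code of any particular reporting package: the function name count_sessions_by_ip and the sample hits are made up, and the IPs are the same placeholder addresses used in this article.

```python
from datetime import datetime, timedelta

# The 30-minute rule described in the article.
SESSION_TIMEOUT = timedelta(minutes=30)

def count_sessions_by_ip(hits):
    """Count sessions the way an IP-only report would.

    hits is a list of (timestamp, ip) tuples (a made-up format for this sketch).
    A new session starts the first time an IP is seen, or when 30 minutes
    or more have passed since the previous hit from that IP.
    """
    last_seen = {}
    sessions = 0
    for when, ip in sorted(hits):
        previous = last_seen.get(ip)
        if previous is None or when - previous >= SESSION_TIMEOUT:
            sessions += 1
        last_seen[ip] = when
    return sessions

start = datetime(2000, 1, 1, 9, 0)

# One real user whose second click leaves through a different proxy.
one_user_two_proxies = [
    (start, "256.256.256.257"),
    (start + timedelta(minutes=1), "256.256.256.259"),
]

# 300 different users, one hit each, four minutes apart, same proxy IP.
many_users_one_proxy = [
    (start + timedelta(minutes=4 * i), "256.256.256.257") for i in range(300)
]

split_count = count_sessions_by_ip(one_user_two_proxies)
merged_count = count_sessions_by_ip(many_users_one_proxy)
print(split_count, merged_count)
```

In this sketch the single user is counted as two sessions, while the 300 users collapse into one.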
This gets so convoluted that even I don’t understand it sometimes. If I haven’t lost you by now, you either have the patience of a saint, or you work for BIG COMPANY INC.
What is the solution? Remember: cookies are yummy! Configure your server to pass cookies. Here is what happens with cookies:
1. User connects to your web site from IP 256.256.256.257 – Your server passes on a cookie, which is stored in the user’s browser. That cookie information is logged in the log file.
2. User connects immediately after with a different IP. No matter, because the cookie is still the same as the one logged with that user's last hit. The IP can be overlooked and the cookie can be used to track the User Session. Each User Session will have its own fingerprint, regardless of IP.
So now you know what is up with all that goofy data in your reports and how to fix it. Setting your web server to issue cookies is a simple proposition, and knowing how your users are maneuvering through your site is very important if you plan on having an effective web site.
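The two steps above can be sketched in a few lines of Python. This is a rough illustration under assumptions, not how any specific log analyzer works: the function name sessions_by_cookie, the (timestamp, ip, cookie) tuple layout, and the cookie values are all hypothetical. The same 30-minute rule is applied, but keyed on the cookie instead of the IP.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)

def sessions_by_cookie(hits):
    """Return {cookie: number_of_sessions}, ignoring the IP entirely.

    hits is a list of (timestamp, ip, cookie) tuples (a made-up format).
    """
    last_seen = {}
    counts = {}
    for when, _ip, cookie in sorted(hits):
        previous = last_seen.get(cookie)
        if previous is None or when - previous >= SESSION_TIMEOUT:
            counts[cookie] = counts.get(cookie, 0) + 1
        last_seen[cookie] = when
    return counts

start = datetime(2000, 1, 1, 9, 0)
hits = [
    # One user whose two hits arrive through two different proxies.
    (start, "256.256.256.257", "visitor-a"),
    (start + timedelta(minutes=1), "256.256.256.259", "visitor-a"),
    # Two other users who share the first proxy IP.
    (start + timedelta(minutes=2), "256.256.256.257", "visitor-b"),
    (start + timedelta(minutes=3), "256.256.256.257", "visitor-c"),
]

result = sessions_by_cookie(hits)
print(result)
```

Here the user who hopped proxies stays in one session, and the users sharing a proxy IP each get a session of their own.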
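How you switch cookies on depends on your web server, so check its documentation. As a rough idea of what the server ends up doing on each request, here is a sketch using only the Python standard library. The cookie name visitor_id, the one-year lifetime, and the function name ensure_visitor_cookie are assumptions made for the example.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"          # hypothetical cookie name
ONE_YEAR_SECONDS = 365 * 24 * 3600  # assumed lifetime

def ensure_visitor_cookie(cookie_header):
    """Return (visitor_id, set_cookie_value).

    cookie_header is the raw Cookie request header (or an empty string).
    If the browser already sent our cookie, reuse its value and return None
    for set_cookie_value. Otherwise mint a new ID and return the value to
    send back in a Set-Cookie response header.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar:
        return jar[COOKIE_NAME].value, None

    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["path"] = "/"
    out[COOKIE_NAME]["max-age"] = ONE_YEAR_SECONDS
    return visitor_id, out[COOKIE_NAME].OutputString()

# First visit: no cookie yet, so one is issued.
new_id, set_cookie = ensure_visitor_cookie("")

# Next hit, possibly from a different proxy IP: the browser sends it back.
same_id, nothing_to_set = ensure_visitor_cookie(COOKIE_NAME + "=" + new_id)
```

Whatever the server does, the cookie value also has to be written to the log file alongside each hit, or the reporting software has nothing to key on.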