program. However, there are some challenges that limit the effectiveness of this method. The four biggest issues are as follows.
1. Dynamically Assigned IP Addresses
The first problem companies had with log file parsing was difficulty identifying unique users. The original method for identifying a user on the website was to use the IP address of the user’s computer. However, most people are not connected directly to the Internet. They connect to the Internet through an Internet Service Provider (ISP) such as AOL. So the IP address recorded in the log comes from AOL, not from the actual user. Additionally, the ISP may dynamically assign a new IP address, in some cases with each click. This is good for the ISP, but it makes it very difficult to identify unique visitors on the website.
There was a solution to this problem. It was to make user cookies a standard practice for websites. When a user enters the website, a cookie is placed on his computer. Then, with each subsequent click on the site, the web log records both the IP address and the cookie. That allows all of the hits during the visit to be associated with that specific user. However, this method still does not work for people who have disabled cookies on their computers. For these cases, the imperfect method of using IP addresses is all that can be done using the log file information.
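To make the cookie approach concrete, here is a minimal Python sketch. It assumes the log has already been parsed into records holding an IP address, an optional cookie ID, and the requested page. The record layout, the sample data, and the function names are illustrative assumptions, not part of any particular log format or product.

```python
from collections import defaultdict

# Hypothetical, already-parsed log records: (ip_address, cookie_id or None, path).
# The field layout is an assumption for illustration; real log formats vary.
records = [
    ("10.0.0.1", "cookie-A", "/home"),
    ("10.0.0.2", "cookie-A", "/products"),  # same user, new IP assigned by the ISP
    ("10.0.0.3", None, "/home"),            # cookies disabled
    ("10.0.0.3", None, "/contact"),
    ("10.0.0.4", "cookie-B", "/home"),
]


def visitor_key(ip_address, cookie_id):
    """Prefer the cookie; fall back to the less reliable IP address."""
    if cookie_id:
        return ("cookie", cookie_id)
    return ("ip", ip_address)


def group_hits_by_visitor(records):
    """Map each visitor key to the list of paths that visitor requested."""
    hits = defaultdict(list)
    for ip_address, cookie_id, path in records:
        hits[visitor_key(ip_address, cookie_id)].append(path)
    return dict(hits)


visitors = group_hits_by_visitor(records)
print(len(visitors))  # number of distinct visitors found
```

In this sample, the two hits carrying the same cookie are attributed to one visitor even though the IP address changed, while the cookieless hits can only be grouped by IP address.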
2. Page Caching
The second problem encountered with log file parsing would prove to be even more difficult than the first. As soon as a company’s website becomes popular, its web servers start experiencing performance drains. During periods of high traffic, this means customers may have to wait a long time for pages to be displayed, because the server is also processing the requests of many other customers at the same time. In order to optimize the performance of their websites, companies started saving copies of the pages being served in temporary storage, called a cache. That way, if the same page is requested again, it can be served from the cache. This results in tremendous performance gains, which both improve the user experience and save money. So page caching quickly became an indispensable practice. However, when a page is served from the cache, no entry is recorded in the log file. Therefore, it is impossible to accurately record site visits when page caching is being used.
3. Outsourcing Web Analytics
The third challenge confronting companies using log file parsing was the desire to have another company perform their analytics for them. Web analytics is a somewhat technical endeavor. Not all companies are able to dedicate in-house staff to it. On the other hand, it is a fairly straightforward process that could easily be done by an outside vendor. However, with log file parsing, the software must directly access the web server logs to work. That means the software must be installed in-house on the company’s web servers. This makes it difficult to outsource.
4. Measuring Business Objectives
The fourth problem companies had using log file parsing was that it is difficult to directly measure whether you are meeting your business objectives online. The log file records which pages are being served. It does not necessarily tell you what the customer was doing while they were on that page. For example, you can measure whether a sale took place on the website, by checking to see if a confirmation page was served. But it is difficult to tell what they actually bought, or how much they spent. That information is not typically recorded in the log file.
The ABC’s Of Page Tagging
The second method of performing Web Analytics, page tagging, became the method of choice for marketers after 2001. Companies were still reeling from the recession that followed the Dot-Com crash. Many were looking for a pay-as-you-go outsourcing solution for their Web Analytics. Businesses were also learning how to tie website activity more directly to their marketing objectives. They wanted a solution that reported marketing results rather than just the technical activity on the website.
Page tagging allows companies to overcome the challenges experienced with log file parsing. With page tagging, you identify all of the actions you want to measure on the website. Then you put a small piece of programming code (usually JavaScript) on every page where those actions occur. This is called tagging the page. When an identified action occurs, the tag sends a message to the Web Analytics software, which records the action in a database. As with log file parsing, analytics is then performed on the information in the database to report on key site metrics.
Page tagging is only offered as an outsourced solution.
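The message a tag sends is often just a web request whose address carries the details of the action. The Python sketch below shows how the receiving side might turn such a request into a record ready for the database. The host name, path, and parameter names are made-up assumptions for illustration, not any real vendor’s interface.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical request that a page tag might send when an action occurs.
# The host, path, and parameter names are illustrative assumptions.
beacon_url = (
    "https://analytics.example.com/collect"
    "?visitor=cookie-A&action=add_to_cart&page=%2Fproducts%2F42&value=19.99"
)


def parse_beacon(url):
    """Turn the query string of a tag request into a flat event record."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list of values per key; keep the first of each.
    return {key: values[0] for key, values in query.items()}


event = parse_beacon(beacon_url)
print(event["action"], event["page"])
```

Because the record describes an action rather than a file request, it can carry business details, such as the value of an item, that a web server log would not normally contain.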
Going Beyond Log File Parsing
Page tagging has some significant advantages over log file parsing. For these reasons it has become the method of choice for companies who are using Web Analytics as a strategic tool to measure and increase the profitability of their Internet Marketing programs. Page tagging overcomes three of the four major challenges faced by log file parsing. Identifying unique users still relies on cookies being enabled on the user’s computer.
1. Overcomes Page Caching Limitations
With page tagging, the action is recorded by programming code on the web page itself. When the web page loads on the user’s computer, the script file runs and records the identified actions. This allows companies to overcome the problem of caching web pages. Whether a page is served from the web server or the cache, it will still be recorded when it is loaded by the user’s browser.
Nevertheless, this method also has its drawbacks. The data collected by page tagging depends on the user’s browser running the script file contained in the page tag. This will fail for some percentage of users on the website. Those users will then be missing from the reported site metrics. Those users whose computers do run the page tag scripts, though, will be recorded accurately. So, even though there is missing data in the report, the trends reported will be accurate.
2. Enables Outsourcing
As important as overcoming the caching limitation is the ability to outsource Web Analytics. Page tagging sends information over the Internet to the Web Analytics software. One of the great things about the Internet is that the software can be literally anywhere in the world. That means Web Analytics can be set up for your company’s website without installing any software at all. You just need to put the tags on your website and direct the output to your Web Analytics vendor. Their software will process the information and provide all the reports for you.
3. Measures Business Objectives
Since page tagging records actions occurring while a user is viewing the web page, and not just the log file entry recorded when the page loads, this method is able to capture more information about the user’s visit. You can capture information entered into forms contained on the web page as well as data pulled from a database into the page view. Examples of the information you can record with page tagging are:
- Responses submitted in online forms
- Items put into the shopping cart
- Actions taking place within a Flash content element
- Behavior occurring within a page view, such as scrolling down or accessing an onsite utility
Problems with Page Tagging
It would be nice if there was a perfect world of clean data. Unfortunately, there are always tradeoffs. As with log file parsing, there are also shortcomings to page tagging.
The biggest shortcoming of log file parsing is caused by the source of information used to generate reports. Analytics is limited by what is captured in the web server logs. In the same way, the shortcomings of page tagging are also caused by its source of data. Page tagging only records information sent from the user’s browser once a page loads. There are two significant drawbacks:
1. Missing Visits
The first drawback to page tagging was already discussed. It relies on information captured by a script file running while the page is active on the user’s computer. Therefore, it will be missing data from users with browsers that fail to run the script file.
2. Unable to Run Site Diagnostics
A second, and more significant, problem with page tagging is the inability to run certain site diagnostics. Page tagging can only report successful page loads for computers that successfully run the script file contained in the tag. Therefore, it is unable to record failed requests, such as broken links. It is also unable to provide the complete picture of site traffic provided by the web server logs.
Because of this drawback, it is not uncommon for companies to set up a basic log file parsing solution to measure site diagnostics, while using page tagging to measure their business objectives.
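As a rough illustration of such a basic diagnostic, the Python sketch below counts failed requests in a handful of web server log lines. The sample lines follow the widely used Common Log Format, but the data is made up, and the pattern and function name are assumptions for illustration only.

```python
import re
from collections import Counter

# Sample lines in the Common Log Format; the data itself is made up.
log_lines = [
    '10.0.0.1 - - [10/Oct/2005:13:55:36 -0700] "GET /home.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2005:13:56:01 -0700] "GET /old-page.html HTTP/1.0" 404 512',
    '10.0.0.3 - - [10/Oct/2005:13:57:12 -0700] "GET /old-page.html HTTP/1.0" 404 512',
    '10.0.0.1 - - [10/Oct/2005:13:58:40 -0700] "GET /cart.html HTTP/1.0" 500 128',
]

# Capture the requested path and the three-digit status code from each line.
LINE_PATTERN = re.compile(r'"(?:GET|POST|HEAD) (\S+) [^"]*" (\d{3})')


def failed_requests(lines):
    """Count requests per path whose status code is 400 or higher."""
    failures = Counter()
    for line in lines:
        match = LINE_PATTERN.search(line)
        if match and int(match.group(2)) >= 400:
            failures[match.group(1)] += 1
    return failures


broken = failed_requests(log_lines)
print(broken.most_common())  # paths with the most failures first
```

A page tag could not have reported any of these failures, because the pages involved never loaded in a browser.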
Website Traffic Metrics
You now know how the two methods of Web Analytics work. These methods both start with basic data coming from a user’s visit on your website. That data is then assembled into meaningful information that can be compiled into reports measuring the success of your website. The only thing remaining to understand how Web Analytics works is to see what the basic building blocks of a web traffic report are. We conclude this chapter with a brief overview of the basic metrics used to create Web Analytics reports. In the next chapter, we will take a look at how these building blocks can be assembled to create your website usage reports.
1. Hit
A hit is the very first metric used to measure website activity. It is also the simplest metric to calculate. A hit is simply one entry in the web server log. In the very first websites, each web page might be no more than a simple HTML page with text on it. In this simple page, there are no images or other files associated with the web page. So each web page has only one single entry in the log. That translates into one hit for each page viewed on the website.
That quickly changed. Today, there are very few web pages that contain nothing except HTML code and text. As we’ve seen above, you may have pictures, graphic images, movies or other media on a single web page. Each one of these will record a separate entry in the log file. Therefore, each time a page is viewed, there will be many “hits” recorded in the log. For this reason, a hit is not really a useful metric any longer.
2. Page View
A page view is one complete web page loaded into a user’s browser. In the web server log, a page view consists of the HTML file for the web page plus all the graphics and other files associated with that page. A page view is made up of one or more hits.
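The difference between hits and page views can be shown with a short Python sketch. It assumes a simple rule for telling pages apart from supporting files, namely the file extension. That rule, the sample paths, and the function name are illustrative assumptions; real analytics software uses more elaborate rules.

```python
# Hypothetical list of requested paths taken from a web server log, in order.
requested_paths = [
    "/index.html", "/logo.gif", "/style.css",
    "/products.html", "/logo.gif", "/photo1.jpg", "/photo2.jpg",
]

# Assumption: requests with these extensions count as pages; everything else
# (images, stylesheets, scripts) is a supporting file.
PAGE_EXTENSIONS = (".html", ".htm")


def count_hits_and_page_views(paths):
    """Return (hits, page_views) for a list of requested paths."""
    hits = len(paths)
    page_views = sum(1 for path in paths if path.lower().endswith(PAGE_EXTENSIONS))
    return hits, page_views


hits, page_views = count_hits_and_page_views(requested_paths)
print(hits, page_views)
```

Here seven log entries amount to only two page views, which is why the raw hit count says little about how many pages visitors actually looked at.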
3. Visit / Session
The words visit and session are used interchangeably. They refer to all of the pages viewed by a single user at one sitting. The session is identified by finding all of the hits for a given user that occur within a specified period of time of each other. Typically, a half hour is used as the cutoff. In other words, a session is calculated by stringing together all of the hits for a given user, where each hit occurs no more than 30 minutes after the one immediately before it. The result is a complete session.
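The session calculation described above can be sketched in a few lines of Python. The sketch assumes the hits for one user have already been collected as timestamps; the function name and the sample times are made up for illustration.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # the half-hour cutoff described above


def split_into_sessions(hit_times, timeout=SESSION_TIMEOUT):
    """Group one visitor's hit timestamps into sessions.

    A new session starts whenever the gap since the previous hit
    is longer than the timeout.
    """
    sessions = []
    for moment in sorted(hit_times):
        if sessions and moment - sessions[-1][-1] <= timeout:
            sessions[-1].append(moment)
        else:
            sessions.append([moment])
    return sessions


# Made-up timestamps for a single visitor.
hit_times = [
    datetime(2005, 10, 10, 9, 0),
    datetime(2005, 10, 10, 9, 10),
    datetime(2005, 10, 10, 9, 35),  # 25 minutes after the previous hit
    datetime(2005, 10, 10, 11, 0),  # 85 minutes later: a new session
    datetime(2005, 10, 10, 11, 5),
]

sessions = split_into_sessions(hit_times)
print([len(session) for session in sessions])
```

Note that the cutoff applies to the gap between consecutive hits, not to the total length of the visit, so a session can last much longer than 30 minutes as long as the user keeps clicking.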
4. Unique Visitor
A unique visitor is a visitor to the website who can be uniquely identified. That way, if the same visitor returns multiple times, you can measure his activity over time. Unique visitors are typically identified by the user cookie. As discussed above, the older method of using the IP address is not a reliable method for measuring unique visitors. Note that a unique visitor can actually be more than one person. When a family or multiple employees at a company use the same computer, they will all have the same user cookie.
5. Authenticated User
If the user is required to log in to the website at the start of the visit, they become an authenticated user.
6. Referring URL
The referring URL is the web page where the link that sent a visitor to your website is located. If the user types your URL directly into her web browser, she will have no referring URL. Such visitors are sometimes called walk-ins.
7. Entry Page
The first page in a unique visit is called the entry page.
8. Exit Page
The last page in a unique visit is called the exit page.
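The last three metrics can all be read off the ordered list of page views in a single visit. The Python sketch below assumes each page view is stored together with its referring URL, if any. The record layout, the example addresses, and the function name are illustrative assumptions.

```python
# Hypothetical page views for one visit, in order: (page, referring_url or None).
visit = [
    ("/landing.html", "http://www.example.com/links.html"),
    ("/products.html", "http://www.mysite.example/landing.html"),
    ("/checkout.html", "http://www.mysite.example/products.html"),
]


def summarize_visit(page_views):
    """Return the entry page, exit page, and referring URL for one visit."""
    entry_page, referring_url = page_views[0]
    exit_page = page_views[-1][0]
    return {
        "entry_page": entry_page,
        "exit_page": exit_page,
        "referring_url": referring_url,
        # No referrer on the first page view is treated here as a walk-in.
        "walk_in": referring_url is None,
    }


summary = summarize_visit(visit)
print(summary["entry_page"], summary["exit_page"], summary["walk_in"])
```

Only the referring URL of the first page view matters for the visit as a whole, since later page views are normally reached from pages on your own site.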
Web Analytics solutions come in many varieties. There are solutions for small businesses that provide basic reporting at a low cost. There are also solutions for large businesses that provide in-depth, customized reporting and analysis at a much higher cost. Whatever size business you have, there is a Web Analytics solution for you.
==========================
This article is an excerpt from The Complete Internet Marketer: A Practical Guide To Everything You Need To Know About Marketing Online by Jay Neuman.
Since 1994, Jay Neuman has been helping businesses as varied as Fortune 500 companies, startup Dot-Coms and nonprofit organizations overcome their Internet Marketing and Database Marketing challenges.
Jay is currently Sole Proprietor of the KnExT Consulting Group (www.knextconsulting.com). He can be reached at jay.neuman@knextconsulting.com.