Appendix L: Using The 404 Error Reporting Application
Whenever someone follows a link, types in an address, or uses a bookmark to access a page that doesn't exist, the web server generates an error (number 404, "File not found"). We've set up this application to track all the 404 errors generated by either the English (www.hud.gov) or Spanish (espanol.hud.gov) website. You can use this 404 error reporting application to find broken links and, more importantly, to find where users are coming from when they get the 404 error.
Why should I use this application?
This reporting application will complement the information you get in LinkBot reports. In fact, since this data is live (the data is pulled from the server when you run the report), you can check for errors without waiting for the monthly LinkBot report.
Using the options listed below, you can discover:
- Are you linking to a page that no longer exists, thereby causing a 404 error?
- Is someone linking to one of your pages that no longer exists? (In that case, you can check the "referring page" field to find out who, and send them an email asking them to change the link.)
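If you send the report to a spreadsheet, the second question can also be answered with a short script. The following is only a rough sketch: it assumes each row of the report has been read into a dictionary, and the keys "page" and "referrer" are hypothetical names chosen for illustration, not the application's actual column headings.

```python
from collections import Counter


def referrers_for_page(rows, missing_page):
    """Count the referring pages behind 404 errors on one missing page.

    `rows` is a list of dicts with the hypothetical keys "page" and
    "referrer". Rows with an empty referrer (the browser did not send
    one) are skipped. Results are ordered most frequent first.
    """
    counts = Counter(
        row["referrer"]
        for row in rows
        if row["page"] == missing_page and row["referrer"]
    )
    return counts.most_common()


# Made-up sample rows, for illustration only.
rows = [
    {"page": "/old/report.cfm", "referrer": "http://example.org/links.html"},
    {"page": "/old/report.cfm", "referrer": "http://example.org/links.html"},
    {"page": "/old/report.cfm", "referrer": ""},
    {"page": "/other.cfm", "referrer": "http://example.com/"},
]
top = referrers_for_page(rows, "/old/report.cfm")
```

The referring page at the top of the list is the one sending the most visitors to the missing page, so it is usually the first site owner to contact.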
How do I use the application?
To use the 404 error reporting application, go to web management (http://hudatwork.hud.gov/po/odoc/webinc/) and click on the link for 404 errors under the tool box. Since the actual 404 error application sits on the www.hud.gov server, you can also access this report from anywhere you have access to the web (you don't have to be connected to HUD). In that case, go directly to http://www.hud.gov/utilities/404error_viewer.cfm
Report Options
You have several options in running the report:
- Date Range
We keep the 404 errors for the last 30 days. To select a date range, pick a starting date and an ending date. You will get a report of all 404 errors for the entire range between those dates. (Each day runs from just after midnight through 11:59 p.m.)
- Filter Criteria
You can filter the results to narrow down what you are looking to find. Options are:
- IP Address: if you know the specific IP you're looking for (used for weeding out search engines)
- Browser type: if you're interested in whether one browser or another is causing more errors
- HUD Server: here you can select either www.hud.gov for English content or espanol.hud.gov for the Spanish mirror.
- Output option: You have the choice of sending the output to a web page (HTML) or to an Excel spreadsheet. For small reports (a day or two), the HTML version works fine. If you're looking at a larger range, or a large number of pages, the spreadsheet option will allow you to manipulate the data more efficiently.
- Select Sort Criteria
You can do some sorting right off the bat. Again, there are several options for sorting. (If you selected Excel as your output, you can do all these sorts in Excel.)
- Date/Time: gives you a chronological listing of errors. This is particularly useful in identifying search engines and other automated accesses. (For example, it's not likely that an individual will create 50 404 errors in one minute. When you see this, it's probably automated access, such as a search engine.)
- Page that caused the 404 error: This is the page someone tried to find and received a 404 error.
- URL Variables: these are the variables used by the database. For example, listserv name, state name, language variables, etc.
- Referring page: if allowed by the user's browser, we also try to capture how the person got to the error. So, if they followed a bad link, the referring link should be listed here.
- Browser: again, you can look up by browser type. This is also useful for stripping out search engines, which normally (although not always) identify themselves in the browser name. (E.g., MSIECrawler, Yahoo! Slurp - Yahoo!'s Web Crawler - http://www.inktomi.com/slurp.html)
- User's IP address: this is also useful for identifying search engines, etc. If you notice 75 404 errors from one address in a short time period, it's probably safe to assume it's an automated system. (The 404 error reporting tool automatically strips out any IP address that creates 250 or more 404 errors in the same 24-hour period. This removes many of the more common search engines automatically.)
- Server name: This has the same effect as selecting the server name in the filter criteria above.
- Sort Order:
This is pretty self-explanatory. You can have the output listed in either lowest-to-highest (ascending, e.g., A to Z for alpha lists) or highest-to-lowest (descending, e.g., Z to A) order.
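The rules of thumb for spotting automated access (50 errors in one minute, or the tool's own cutoff of 250 or more errors from one IP address in the same 24-hour period) can be applied to an exported report with a short script. The following is a rough sketch, not the application's actual logic: it assumes the report rows have been reduced to (timestamp, IP address) pairs, and it treats "the same 24-hour period" as a rolling window, which may not match how the tool counts. The function name and sample addresses are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def ips_over_threshold(rows, threshold=250, window=timedelta(hours=24)):
    """Return the set of IPs with `threshold` or more errors in one window.

    `rows` is an iterable of (timestamp, ip) pairs. The window is rolling:
    an IP is flagged if any span shorter than `window` contains at least
    `threshold` of its errors. (Assumption: the real tool may instead
    count per calendar day.)
    """
    by_ip = defaultdict(list)
    for when, ip in rows:
        by_ip[ip].append(when)

    flagged = set()
    for ip, times in by_ip.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            # Move the window start forward until the span fits the window.
            while t - times[start] >= window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(ip)
                break
    return flagged


# Made-up sample data: one address with 50 errors in under a minute,
# and one with 5 errors spread over several hours.
base = datetime(2024, 1, 1, 0, 0, 0)
rows = [(base + timedelta(seconds=i), "192.0.2.1") for i in range(50)]
rows += [(base + timedelta(hours=i), "192.0.2.2") for i in range(5)]

crawlers = ips_over_threshold(rows, threshold=50, window=timedelta(minutes=1))
```

With the sample data, only the first address is flagged by the 50-per-minute rule, and neither address reaches the default cutoff of 250 in 24 hours. Adjust `threshold` and `window` to match whichever rule of thumb you want to test.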