LinuxDevCenter.com

oreilly.comSafari Books Online.Conferences.

We've expanded our Linux news coverage and improved our search! Search for all things Linux across O'Reilly!

Search
Search Tips

advertisement

Print Subscribe to Linux Subscribe to Newsletters
Linux & Unix > Excerpts >

Ads in Cache-Friendly Pages

by Jennifer Vesperman
03/21/2002

You want to make your Web pages cache-friendly, but you're worried you'll lose advertising revenue. Ads are a reality in our current model of the Web, and it's important to ensure that they work and to keep track of how often they are served. Fortunately, it's easy to make ads work without losing cache friendliness in our pages.

To explain how to make ads uncacheable while having a cacheable page, I need to explain entities and selectively applying Cache-control headers. For a detailed description of Cache-control headers, see Cache-Friendly Web Pages.

The principles described here work for all the major elements in a Web page. Once you understand them, you can selectively apply cache headers to every entity on your pages.

Entities and cache control

HTTP is a client-server protocol. The client initiates the contact, asking the server to provide the entity described in the URI (Uniform Resource Identifier). A URI can be a location (a URL) or a name (a URN). URLs are the most common, but HTTP can handle either.

The server returns a response. The response consists of some HTTP headers and an "entity." The entity includes entity headers and a body that is (we hope) the requested data. Cache-control headers and Expires headers are entity headers, and they apply only to the entity they are included in.

Related Reading

Web Caching
By Duane Wessels

Inside the entity, especially if the entity is HTML, there may be URIs referring to additional entities. HTML image tags are the most common of these, and Web browsers read image tags and then send additional GET requests for the new entities to Web servers.

Any time you include a URI (relative or full) in your HTML, you may be referring to another entity. The exception is a fragment token, #foo, in the same entity it refers to. If stripping the fragment off would leave you pointing to the same page, it's the same entity. Other than that, every distinct URI refers to a distinct entity.

The browser (or other client) will make a separate request to pull down each entity. In the case of a link, it waits for the user to initiate the request. In the case of images, it usually initiates the request itself. (Some browsers do not request images, or do so only on user-initiated request.) Other included objects may or may not be automatically downloaded; please see the HTML specification to determine what the browser is expected to do with them.

All images, including images that are ads, are individual entities. And individual entities have their own, individual Cache-control or Expires headers. So we can have ads in cache-friendly Web pages without actually caching the ads.

This fact also allows us to have images with very long expiry times in frequently changing (and rapidly expiring) Web pages, and to have other elements uncacheable.

  • Set the expiry of the main Web page to whatever expiry is appropriate for that page.
  • Set the expiry of unchanging images or other included entities to the maximum (a year).
  • Set expiries of anything you want not cached to "do not cache" using either Cache control or a zero expiry time. You may want to do this for ads.

Setting ads to be uncacheable is one way of attempting to count the hits accurately. It's a rather unfriendly way of doing it, though; it forces the reloading of a (usually) static object, every time.

Pages: 1, 2

Next Pagearrow




Tagged Articles

Be the first to post this article to del.icio.us

Recommended for You

Sponsored Resources

  • Inside Lightroom
Advertisement

Sponsored by:

Sign up today to receive special discounts,
product alerts, and news from O'Reilly.
Privacy Policy >
View Sample Newsletter >
  • Youtube
  • http://www.youtube.com/OreillyMedia
  • Twitter
  • Subscribe
  • View All RSS Feeds >
O'Reilly Media

800-889-8969 or 707-827-7019
Monday-Friday 7:30am-5pm PT
©2011, O'Reilly Media, Inc.
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
  • About O'Reilly
  • Academic Solutions
  • Contacts
  • Customer Service
  • Careers
  • Press Room
  • Privacy Policy
  • Terms of Service
  • Writing for O'Reilly
  • Community
  • Authors
  • Forums
  • Membership
  • Newsletters
  • RSS Feeds
  • User Groups
  • More O'Reilly Sites
  • igniteshow.com
  • makerfaire.com
  • makezine.com
  • craftzine.com
  • labs.oreilly.com
  • Partner Sites
  • PayPal Developer Zone
  • O'Reilly Insights on Forbes.com