by Christine Churchill
If you're a regular reader of the MarketPosition newsletter, then you already know about the potential pitfalls of using frames, Flash, and dynamically generated pages on your site. Search engine spiders struggle with these technologies, so using them heavily can limit your success in the search engines.
Well, here's another item to add to that list: invalid HTML code. Bad HTML can hurt your site in the search engines without you ever realizing it.
What exactly is HTML validation? It's the process of checking the syntax of your HTML code to find places where you've violated the rules of the language. The official rules for writing HTML are defined by the World Wide Web Consortium (W3C). Those rules include strict definitions stating which HTML tags are legitimate parts of the language, and how you should structure your HTML documents.
HTML errors that violate these rules include things like badly nested tags (where you close an outer element before closing the element nested inside it), content model violations (where you nest tags that aren't allowed inside one another), and badly formed tables.
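
To make "badly nested tags" a little more concrete, here is a minimal Python sketch that flags closing tags that arrive out of order. It uses the standard library's html.parser module. The names NestingChecker and find_nesting_problems are made up for this illustration. This is nowhere near a real validator: it knows nothing about content models or tables, and it assumes every non-void element gets an explicit closing tag, so it will complain about legal HTML that omits optional closing tags such as </li>.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are never pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: records closing tags that arrive out of order."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of element names that are currently open
        self.problems = []   # human-readable descriptions of nesting errors

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # <br/> style tags also fire an end-tag event; ignore it
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()  # the normal, well-nested case
            return
        line, _ = self.getpos()
        if tag in self.open_tags:
            inner = self.open_tags[-1]
            self.problems.append(
                f"line {line}: </{tag}> closes before <{inner}> was closed"
            )
            # Recover by unwinding the stack down to the matching open tag.
            while self.open_tags.pop() != tag:
                pass
        else:
            self.problems.append(f"line {line}: </{tag}> has no matching open tag")


def find_nesting_problems(html):
    """Return a list of nesting problems found in an HTML string."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    # The second snippet closes <p> while <b> is still open.
    for snippet in ("<p><b>bold text</b></p>", "<p><b>bold text</p></b>"):
        print(snippet, "->", find_nesting_problems(snippet) or "no problems found")
```

A dedicated validator checks far more than this, so treat the sketch as a way to see what one class of error looks like, not as a replacement for validating your pages.
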
Sound confusing? Don't worry - many HTML editors include a built-in validator that will check your page and point out this sort of error. In addition, online services like the WDG HTML Validator and the W3C's own validator offer free page validation.
So what exactly is the impact of this sort of error? It depends upon who's reading the page. Errors may have no impact at all in your browser, or they could cause the text to appear in the wrong place or in the wrong size on the page. At their worst, HTML errors can keep sections of your Web page from displaying.
To be honest, validation is about as much fun as a trip to the dentist. Sometimes it feels like you're getting a root canal. The first time you validate your page, you could see dozens of errors. That's especially true if you coded your pages by hand, which tends to result in more errors. Even FrontPage and other WYSIWYG editors don't always produce code that validates cleanly. Of course, there's some assurance in the idea that a search engine would try to tolerate the common HTML errors created by popular editors like FrontPage, but it's still not a sure bet.
What makes matters worse is that it's hard to see the value of fixing all of these problems when your page displays just fine under Internet Explorer. Indeed, the whole reason the W3C stresses validation is that following the official rules of the HTML language makes it easier for browsers to interpret your page correctly. If the latest browsers do that already, why bother with HTML validation?
The reason is simple: search engine spiders also need to interpret your HTML. And while the Microsoft and Netscape browsers are very forgiving of your HTML errors, search engine spiders aren't nearly as kind. It helps to think of a search engine spider as a web browser - just like a browser, the spider needs to interpret your page and figure out what you're saying. Only then can it properly index your page. Search engine spiders also care about the structure of your Web page because they give extra weight to keywords placed inside certain HTML tags.
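
To see why structure matters here, consider a rough Python sketch of the first step any such spider would need: working out which text sits inside which tags. The set of "prominent" tags below is purely an assumption for illustration, since the search engines don't publish theirs, and the names ProminentTextCollector and collect_prominent_text are invented for this example. It is not a description of how any real spider works.

```python
from html.parser import HTMLParser

# Assumed for illustration only: real search engines do not publish which
# tags they weight, or by how much.
PROMINENT_TAGS = {"title", "h1", "h2"}


class ProminentTextCollector(HTMLParser):
    """Hypothetical helper: gathers the text found inside prominent tags."""

    def __init__(self):
        super().__init__()
        self.current = None  # prominent tag we are inside, if any
        self.found = {}      # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in PROMINENT_TAGS:
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        text = data.strip()
        if self.current and text:
            self.found.setdefault(self.current, []).append(text)


def collect_prominent_text(html):
    """Return a dict mapping each prominent tag to the text found inside it."""
    collector = ProminentTextCollector()
    collector.feed(html)
    collector.close()
    return collector.found


if __name__ == "__main__":
    page = (
        "<html><head><title>Blue Widgets</title></head>"
        "<body><h1>Buy Blue Widgets</h1><p>Cheap widgets here.</p></body></html>"
    )
    print(collect_prominent_text(page))
```

The point of the sketch is that the result depends entirely on the parser finding where each tag starts and ends. If the markup is broken in a way the parser can't recover from, text can end up credited to the wrong tag or to no tag at all.
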
But there's a big difference between web browsers and search engine spiders: web browsers are under pressure from the marketplace to correctly display as many Web pages as possible. Any browser hoping to be taken seriously needs to understand the latest web technologies and be able to cope with all those badly written pages out there on the Web. Users would quickly abandon any browser that fails to render the average Web page.
One would think that same market pressure would push the search engines to improve their spiders, making them more forgiving. After all, there's a great deal of competition in the search engine world. Yet that doesn't seem to be happening, partly because it's difficult to tell when this issue has come into play. It's surprising how far search engine spiders lag behind the major browsers.
I don't want to give you the impression that any and all HTML errors will wreck your search engine ranking. Spiders do tend to forgive many errors, such as badly nested tags. But I have direct experience with bad HTML hurting a search engine ranking. A few months ago I helped a webmaster who had lost his Top 10 ranking because of a simple typo in his HTML. One badly placed angle bracket kept Googlebot from correctly parsing the home page, causing the page to fall completely out of the index. The page displayed correctly under all the major browsers, but it still caused problems for Googlebot.
So validating your pages is a wise precaution, particularly if you write the HTML code by hand. Clean, well-written HTML is important if you want to ensure a good search engine ranking. It also helps guarantee that your page will be displayed properly on older or more obscure browsers that are less forgiving. Therefore, you have two compelling reasons to validate your HTML today.
Christine Churchill is President of KeyRelevance.com, a full-service search engine marketing firm. She is also on the Board of Directors of the Search Engine Marketing Professional Organization (SEMPO) and serves as co-chair of the SEMPO Technical Committee.
