Semantically Rich Markup

While reading my book on CSS today I came across the best explanation I’ve seen of how to structure your HTML code for the search engines. Unfortunately, there is a lot of confusion about how to optimize HTML code for the search engines. Some people don’t realize that what you see in the browser is not what the search engines see. The search engines can only parse the HTML code as plain text, as if they were blind. They do not render the page in the browser and apply human judgment to what they see. This is why an ugly web site may rank better than a beautifully designed web site. Other people seem to think there is only so much text the search engines will parse, so you should place all your content at the top of the page. They want you to get rid of JavaScript, CSS, and meta tags in favor of as much text content as possible. Then there are those people who absolutely forbid Flash because they’ve heard that Flash is bad for SEO. Actually, Flash is only bad when the entire web site is designed in Flash and there is no text to be found. This does not mean that you should avoid any use of Flash.

But as the CSS book explains, it is semantically rich markup that is important. Intelligent markup of the content helps search engines better understand how they should weight and index web pages. In other words, it is not the sheer amount of text or code that is important. What is important for the search engine’s interpretation is how the code structures the text. For example, blocks of text that are not contained within the <p> tag will not be seen as paragraphs. You may have a lot of text on your web page, but without <p> tags the search engines will not see a lot of paragraphs. Therefore you may only get credit for having a single paragraph even though you have line breaks to separate the text into paragraphs. It is quite possible that your entire text content will be overlooked by a parser if none of it is contained within tags!
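To make the paragraph point concrete, here is a minimal sketch using Python's standard `html.parser` module. It is only an illustration of how a text-only parser encounters markup, not a model of how any real search engine works; the `TagCounter` class, the `count_tags` helper, and both sample strings are made up for this example.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts opening tags the way a simple, non-rendering parser meets them."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; tally each one.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(html):
    """Hypothetical helper: return a {tag: count} dict for an HTML string."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Looks like three paragraphs in a browser, but only line breaks separate them.
visual_only = "First paragraph.<br><br>Second paragraph.<br><br>Third paragraph."

# The same text with each paragraph marked up as a paragraph.
semantic = "<p>First paragraph.</p><p>Second paragraph.</p><p>Third paragraph.</p>"

print(count_tags(visual_only).get("p", 0))  # 0
print(count_tags(semantic).get("p", 0))     # 3
```

Both snippets look much the same on screen, yet only the second one tells a parser that the page contains three paragraphs.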

Another example is how headers are defined. A header should be indicated using the <h1> tag. You can use font tags or CSS to scale text so it is header size but it will not be seen as a header by the search engines. Therefore you will not get any credit for having headers in your content. Unordered lists will not be seen if you use a bullet point character to indicate list items instead of the <li> tag.
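The same idea applies to headers and lists. The sketch below, again built on Python's standard `html.parser`, pulls out whatever is marked up as a heading or a list item. The `OutlineParser` class, the `outline` helper, and the sample markup are hypothetical, and a real search engine's parser is certainly more sophisticated than this.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class OutlineParser(HTMLParser):
    """Collects the text of heading and <li> elements, ignoring everything else."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.list_items = []
        self._current = None  # tag whose text is being collected, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS or tag == "li":
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = "".join(self._buffer).strip()
            target = self.list_items if tag == "li" else self.headings
            target.append(text)
            self._current = None


def outline(html):
    """Hypothetical helper: return (headings, list_items) found in the markup."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.headings, parser.list_items


# A "header" made with a font tag and a "list" made with bullet characters.
visual_only = '<font size="6"><b>Our Services</b></font><br>&bull; Design<br>&bull; Hosting'

# The same content with a real heading and a real unordered list.
semantic = "<h1>Our Services</h1><ul><li>Design</li><li>Hosting</li></ul>"

print(outline(visual_only))  # ([], [])
print(outline(semantic))     # (['Our Services'], ['Design', 'Hosting'])
```

The first snippet yields no headings and no list items at all, which is the sense in which you "get no credit" for structure that exists only visually.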

Content is frequently poorly marked up because people copy and paste text from Word documents and other sources without looking at the HTML code. Word uses a lot of inline style bloat to format text exactly as it appears in a word processing document. Copying and pasting from other sources frequently carries over a lot of <div> and <span> tags which provide a meaningless structure to the document.

Therefore it is essential to understand HTML code. You must stop relying on a visual inspection of your web page and remember to view the source. You have to know how to replace content that is formatted by inline styles, <font> tags, and meaningless <div> and <span> tags with semantic tags, such as <h1>, that are formatted by the style sheet. You also need to understand CSS so that you can reapply formatting to the proper tags for your content.
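Viewing the source by hand gets tedious on long pages, so a small script can point out where the presentational markup is. This is a rough sketch under stated assumptions: it uses Python's standard `html.parser`, the `PresentationalMarkupFinder` class and `audit` helper are invented names, and it only flags <font> tags and inline `style` attributes. Deciding which semantic tag should replace each one is still up to you.

```python
from html.parser import HTMLParser


class PresentationalMarkupFinder(HTMLParser):
    """Records the source line of every <font> tag and inline style attribute."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        line, _offset = self.getpos()  # 1-based line where this tag starts
        if tag == "font":
            self.findings.append((line, "<font> tag"))
        if any(name == "style" for name, _value in attrs):
            self.findings.append((line, f"inline style on <{tag}>"))


def audit(html):
    """Hypothetical helper: return a list of (line, description) findings."""
    finder = PresentationalMarkupFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


# Made-up example of the kind of markup a copy and paste can leave behind.
pasted = """<div style="margin:0">
<font size="6">Our Services</font>
<span style="font-size:12pt">We design web sites.</span>
</div>"""

for line, problem in audit(pasted):
    print(f"line {line}: {problem}")
```

Each finding is a candidate for cleanup: the <font> line here would become an <h1>, the <span> a <p>, and the sizes and margins would move into the style sheet.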

But don’t become a tag zealot and forbid the use of <div> tags. These are often necessary as hooks for applying the CSS. What a <div> tag should not be used for is creating visual paragraphs; that is the job of the <p> tag. In other words, this is a matter of making proper use of the tags according to how they were originally intended to mark up the content.
