Article
Web Design: SEO (Search Engine Optimization)
Author: Ralf Eichinger
First we have to distinguish between full-text search engines (e.g. Google, Yahoo, MSN, Ask Jeeves, Witch, Abacho, Hotbot, Lycos, Infoseek, Fireball, Excite, Webcrawler), human-edited catalogs (e.g. Open Directory Project, AllesKlar, Yahoo, Web.de) and commercial search engines (e.g. Overture).
Directories of search engines can be found at www.suchlexikon.de and at Klug Suchen!
The following describes optimization for full text search engines.
The success of optimization depends on
- the ranking algorithms, which can change from day to day
- the number of documents indexed for a search keyword
After optimizing, register your website with as many search engines and catalogs as possible.
Note: after you optimize and register your website, it can take a few weeks before a search engine spider visits your site for indexing, and then several more weeks until your site appears in the result lists.
Some common behaviour of search engines
- Search engines index only a limited number of a website's pages. So it may be useless to optimize pages that are far away from the "entry" of your site.
- Search engines use software called a "spider" or "robot" to crawl the web and index pages. Each search engine has its own spider, with names like "ramBot xtreme" (Aladin), "Scooter/1.0" (Altavista), "ArchitextSpider" (Excite), "KIT-Fireball/2.0" (Fireball), "BackRub" (Google), "Slurp.so/1.0" (Hotbot), "Sidewinder" (Infoseek), "Lycos_Spider_(T-Rex)" (Lycos) or "Googlebot". Spiders follow the hyperlinks of a page to the next page, and so on. They visit websites not only after registration, but also by following a link to the site from another website.
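The two behaviours above (following links from page to page, and indexing only a limited number of pages) can be sketched as a breadth-first walk with a page budget. This is only an illustration and not how any particular search engine works: the in-memory `links` map, the `crawl` function and the `maxPages` limit are hypothetical stand-ins for real page fetching and a real index quota.

```javascript
// Hypothetical site: each key is a page, each value lists the pages it links to.
const links = {
  "/": ["/products", "/about"],
  "/products": ["/products/trout", "/about"],
  "/about": [],
  "/products/trout": ["/products/trout/recipes"],
  "/products/trout/recipes": [],
};

// Breadth-first crawl from the entry page; stops once maxPages pages are indexed.
function crawl(linkMap, entry, maxPages) {
  const indexed = [];
  const seen = new Set([entry]);
  const queue = [entry];
  while (queue.length > 0 && indexed.length < maxPages) {
    const page = queue.shift();
    indexed.push(page);
    for (const target of linkMap[page] || []) {
      if (!seen.has(target)) {
        seen.add(target);
        queue.push(target);
      }
    }
  }
  return indexed;
}

// With a budget of 3 pages, the deeper trout pages are never reached.
console.log(crawl(links, "/", 3));
```

Pages close to the entry page are reached first, which is why optimizing deeply nested pages may not pay off.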
Optimization Programs
There are a number of programs that automate registering your website with full-text search engines.
What is needed?
- URL of the website
- keywords for the website
Tips and Tricks
- optimize each page for 3 to 5 search keywords
- keywords:
- should characterize the content
- should be terms that search engine users often search for (so company names are not a good choice)
- can be categories of products or services
- can be synonyms or slang words (you may use a keyword generator for that)
- do not need to be case-sensitive
- analyze other successful websites that the search engine returns for your keywords
- avoid frameset designs
- some search engines don't process frameset websites
- the frameset page itself contains no content to index
- single pages often have no navigation of their own, so users arriving from search engines get frustrated
- if a frameset design is needed
- provide a full set of meta tags
- it is very important to provide a noframes section containing links to pages that in turn link to all other pages of the site (such as the sitemap page), so that the spider can follow them.
- single pages sometimes get indexed, so a visitor sees only that page without navigation or links to other pages. It therefore helps to put a link to the homepage (that is, the frameset page) on every page. In addition, the surrounding frameset can be loaded afterwards by using the following JavaScript in each page:
<!-- Add this script to the HEAD of every "inner" frame page -->
<script type="text/javascript">
  // change "/frameset.html" to the path of your frameset page
  var frameset = '/frameset.html';
  // if this page was opened outside the frameset, reload it inside the frameset
  if (top === self) {
    window.location = frameset + '?' + window.location.pathname;
  }
</script>

<!-- Add this script to the HEAD of your frameset page -->
<script type="text/javascript">
  function setPage() {
    if (location.search) {
      var innerPage = location.search.substring(1);
      // only load local paths, so the query string cannot point the frame
      // at another host ("//host", "/\host") or at another URL scheme
      if (/^\/(?![\/\\])/.test(innerPage)) {
        // change "contentFrame" to the name of the frame
        // where you want to load the page
        window.frames['contentFrame'].location.href = innerPage;
      }
    }
  }
</script>

<!-- Add an onload handler to the frameset tag -->
<frameset rows="5,*,5" border="0" frameborder="no" framespacing="0" onload="setPage()"> ...
- "good" content rules!
- don't stuff the content with masses of keywords (keyword stuffing); such pages often get bad rankings or are banned
- aim for a good keyword density, i.e. the ratio of occurrences of a keyword to all words of the content (Google: 8%, Yahoo: 12%)
- so discuss only one topic per page, and split the website into more pages if pages cover too many topics
- you can get good content on a topic e.g. from Wikipedia or the Open Directory Project, but don't copy and paste it! Pages copied 1:1 often get banned.
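The keyword density mentioned above can be estimated with a few lines of code. The following is a minimal sketch under simple assumptions: `keywordDensity` is a hypothetical helper, it handles single-word keywords only, and it counts every run of letters or digits as one word, which may differ from how a search engine tokenizes a page.

```javascript
// Share of words in `text` that equal `keyword`, as a percentage (case-insensitive).
function keywordDensity(text, keyword) {
  // Split the text into words made of letters and digits (Unicode-aware).
  const words = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || [];
  if (words.length === 0) return 0;
  const wanted = keyword.toLowerCase();
  const hits = words.filter((word) => word === wanted).length;
  return (hits / words.length) * 100;
}

// Illustrative sample text, not taken from a real page.
const sample = "Fresh trout from our trout farm: smoked trout and trout for stocking ponds";
console.log(keywordDensity(sample, "trout").toFixed(1) + "%");
```

A value far above the densities quoted above suggests that the text is drifting towards keyword stuffing.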
- elements affecting relevance
- head-elements:
- title (shown in the browser's title bar; must describe the website briefly and concisely; is used as bookmark text and in search engine result lists): keywords it contains get high relevance (so use only nouns, and use the keywords for which this page is optimized)
example: "Forellenzucht Danuber - Hommingberger Gepardenforellen"
search engines use only 80-200 characters (so leave the company name out to fit more keywords)
- meta-tags
simple meta tag generators can be found online
- meta name="description" content="...": short description of the content in one or two sentences (some engines display it in the result list!)
example: <meta name="description" content="Forellenzucht Danuber ist der Spezialist für Hommingberger Gepardenforellen: Wir liefern sie als Frischfische, Räucherfische und Satzfische"/>
- meta name="keywords" content="...": keywords describing the page
because this tag has often been used for manipulation, many search engines ignore it. Use it nevertheless.
List keywords or keyword combinations separated by commas.
example: <meta name="keywords" content="Forellenzucht Danuber, Hommingberger Gepardenforellen, Frischfische, Räucherfische, Satzfische"/>
keep in mind to use only words that appear in the content of the page. Otherwise a search engine may ban you.
- meta name="robots" content="...": defines rules for search engine crawlers, e.g. forbidding them to crawl the pages. To allow indexing of the whole content use "index,follow".
example: <meta name="robots" content="index,follow"/>
- meta name="revisit-after" content="...": asks spiders to visit this page again after a certain time. It is mostly ignored, because robots have their own schedule.
example: <meta name="revisit-after" content="2 days"/>
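The head elements described above can also be generated from a small page description instead of being typed by hand. The sketch below is one possible way to do that; `escapeHtml`, `buildHeadTags` and the shape of the `page` object are hypothetical, and the sample values are taken from the examples above.

```javascript
// Escape characters that would break out of an HTML element or attribute value.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the title and meta tags for one page, one tag per line.
function buildHeadTags(page) {
  return [
    "<title>" + escapeHtml(page.title) + "</title>",
    '<meta name="description" content="' + escapeHtml(page.description) + '"/>',
    '<meta name="keywords" content="' + escapeHtml(page.keywords.join(", ")) + '"/>',
    '<meta name="robots" content="index,follow"/>',
  ].join("\n");
}

console.log(buildHeadTags({
  title: "Forellenzucht Danuber - Hommingberger Gepardenforellen",
  description: "Forellenzucht Danuber ist der Spezialist für Hommingberger Gepardenforellen",
  keywords: ["Hommingberger Gepardenforellen", "Frischfische", "Räucherfische"],
}));
```

Escaping the values keeps a quotation mark or angle bracket in a description from producing broken markup.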
- body-elements:
- use important keywords in headlines (h1, h2, ...), link texts, filenames, img alt attributes and HTML comments
- markup like <b> and <strong> emphasizes the relevance of a keyword
- off-the-page-elements:
- domain name:
keywords in the domain name get high relevance, so register several domains that redirect to one.
example: www.hommingberger-gepardenforelle
- directory names:
keywords in the directory structure are also a good idea; they also help visitors find their way around.
example: www.hommingberger-gepardenforelle/Rezepte
- link-popularity:
- the more other websites link to your website, the higher the link popularity of your site. You can check this by entering the search term
link:www.yourdomain.de
in Altavista or Fireball.
- if the other websites are ranked high, your site is rated higher. So it is a good idea to ask highly rated sites to link to your site.
another tactic is to search for highly rated domains (e.g. in the Yahoo catalog) that are no longer registered and to register them for the purpose of providing links to your site.
- keywords get high ratings if they are contained in the link text on the other site.
- file "robots.txt":
create a file called "robots.txt" (all lower case) in the root directory of your website, so that it can be reached at "www.yourdomain.de/robots.txt". Most spiders fetch and analyze this file (see the Web Robots Database).
Syntax (a syntax checker can be found online):
User-agent: name of the spider the rule is made for
*: placeholder for "anything/any name/etc."
#: comment
Disallow: blocks the given directory from being indexed; only one directory per line is allowed

Examples:
# completely prevent indexing by robots
User-agent: *
Disallow: /

# invite all robots (same as an empty file)
User-agent: *
Disallow:

# prevent indexing of certain directories by robots
User-agent: *
Disallow: /unwichtig/
Disallow: /cgi-local/

# exclude a certain robot
User-agent: Giordano
Disallow: /

# invite a certain robot
User-agent: WebCrawler
Disallow:

# exclude certain files for all robots
User-agent: *
Disallow: /dontreadme.html
Disallow: /notimportant.html
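To check what a robots.txt file actually blocks, its rules can be evaluated in code. The following is a minimal sketch that covers only the syntax described above (User-agent, Disallow, comments and the "*" placeholder) with plain prefix matching. `parseRobotsTxt` and `isAllowed` are hypothetical helpers, and real spiders may support further directives or match robot names differently.

```javascript
// Parse robots.txt text into records of { agents: [...], disallow: [...] }.
function parseRobotsTxt(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (field === "user-agent") {
      // A User-agent line after a rule starts a new record.
      if (!lastWasAgent) {
        current = { agents: [], disallow: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if (field === "disallow" && current) {
      if (value !== "") current.disallow.push(value); // empty value blocks nothing
      lastWasAgent = false;
    }
  }
  return groups;
}

// True if `path` may be fetched by `userAgent` under the given robots.txt text.
function isAllowed(robotsTxt, userAgent, path) {
  const groups = parseRobotsTxt(robotsTxt);
  const ua = userAgent.toLowerCase();
  // Prefer a record that names this robot; otherwise fall back to "*".
  const group =
    groups.find((g) => g.agents.some((a) => a !== "*" && a !== "" && ua.includes(a))) ||
    groups.find((g) => g.agents.includes("*"));
  if (!group) return true; // no matching record: nothing is blocked
  return !group.disallow.some((prefix) => path.startsWith(prefix));
}

const robotsTxt = [
  "# prevent indexing of certain directories by all robots",
  "User-agent: *",
  "Disallow: /unwichtig/",
  "Disallow: /cgi-local/",
  "",
  "# exclude a certain robot completely",
  "User-agent: Giordano",
  "Disallow: /",
].join("\n");

console.log(isAllowed(robotsTxt, "Googlebot", "/index.html"));
console.log(isAllowed(robotsTxt, "Googlebot", "/cgi-local/form"));
console.log(isAllowed(robotsTxt, "Giordano", "/index.html"));
```

Running such a check against the pages you want indexed is a quick way to catch a rule that blocks more than intended.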
- provide a sitemap-page that contains links to all pages of the site
- regularly check the position of your website and optimize it
- update the content often; search engines like fresh content and therefore visit your website more often.
- register your site regularly; a one-time registration is not sufficient!
Mistakes
- providing important information in formats other than HTML:
big search engines can index Word, Excel or PDF documents, but small ones can't. Always provide important information as an HTML page.
- providing navigation only inside a Flash animation:
robots can't navigate through Flash animations. Always provide navigation links besides the Flash animation, like "skip intro".
- navigation contained in scripts
- sites requiring acceptance of cookies aren't indexed, because robots don't accept cookies.
- pages generated dynamically from database content often don't get indexed.
- an accidentally inserted <meta name="robots" content="noindex,nofollow"/> prevents spiders from indexing the page and even from following the links it contains
- <meta name="robots" content="index,nofollow"/> lets spiders index the page but prevents them from following the links it contains
- a wrongly configured "robots.txt" file can prevent indexing; nevertheless it is sometimes reasonable to exclude spiders from indexing some directories (robots.txt validator and tutorial)
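The robots meta tag mistakes above are easy to detect automatically. As a rough sketch, the hypothetical helper `parseRobotsMeta` below turns the content attribute into two flags. It assumes that missing directives default to index and follow, and that "none" stands for "noindex,nofollow".

```javascript
// Interpret the content attribute of <meta name="robots" content="...">.
// Directives that are absent default to index and follow.
function parseRobotsMeta(content) {
  const tokens = content.toLowerCase().split(",").map((token) => token.trim());
  const none = tokens.includes("none");
  return {
    index: !(none || tokens.includes("noindex")),
    follow: !(none || tokens.includes("nofollow")),
  };
}

console.log(parseRobotsMeta("noindex,nofollow"));
console.log(parseRobotsMeta("index,nofollow"));
console.log(parseRobotsMeta("index,follow"));
```

A page meant to appear in search results should yield `true` for both flags.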
Sources
- article "Aufsteiger" in the magazine "c't", no. 9/2005, pages 158-163
- article "Ich bin wichtig!" in the magazine "c't", no. 23/1999, pages 180-186
