02 February 2007, 15:57
10 common mistakes... (continuation)
Reason #1: Your Web site does not have a unique IP address.
Does your Web site have a unique IP address? If not, your Web site is running
the risk of getting banned from the search engines.
Human beings use domain names like yahoo.com, but network computers use IP
addresses, which are numeric addresses written as four numbers, separated by
periods.
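The "four numbers, separated by periods" format can be checked mechanically. The following is a minimal sketch using Python's standard ipaddress module; the function name is_dotted_quad is made up for this illustration.

```python
import ipaddress


def is_dotted_quad(text):
    """Return True if text is an IPv4 address such as 64.58.76.225."""
    try:
        ipaddress.IPv4Address(text)
    except ipaddress.AddressValueError:
        # Not four numbers in the range 0-255 separated by periods.
        return False
    return True


print(is_dotted_quad("64.58.76.225"))  # an IP address
print(is_dotted_quad("yahoo.com"))     # a domain name, not an IP address
```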
Every domain name translates to a so-called IP address. For example,
yahoo.com is translated to "64.58.76.225". Just enter "http://64.58.76.225/" in
your Web browser and you'll go to www.yahoo.com.
Many Web hosting services don't give out unique IP addresses to their
customers to save money. They assign the same IP address to multiple domain
names. This means that several hundred Web sites could all be using the same IP
address as your site does.
There are 3 reasons why you need a unique IP address:
- If you're sharing an IP address with 50 other sites, you're trusting
them not to over-submit or spam the search engines. When a search engine
blocks an IP address, all the sites that are sharing that IP address are
blocked. You could wind up being banned from the search engine.
- If the server or the search engine spider software is misconfigured, the
search engine spider may end up obtaining a Web page from another domain
with the same IP address. This may mean that the other Web site gets indexed
instead of yours, or your Web site will be found for the keywords which are
applicable to the other site.
- Rumor has it that having your own unique IP address may help your search
engine ranking.
So when you select a Web hosting service, make sure that your domain name has
a unique IP address, even if it means that you have to pay a bit more for your
hosting.
Are you sharing an IP address with people you don't even know? Here's a way
to test it yourself:
- Go to "http://www.eamnesia.com/hostinfo/i.jhtml" and enter your domain name
(for example, yahoo.com).
(If this URL doesn't work anymore, go to "http://www.name-space.com/search/"
and enter your domain name in the nslookup field.)
- The result page shows you what IP address your site resolves to (for
example, 64.58.76.225).
- Copy the IP address to the clipboard.
- Open a new window in your Web browser, enter the IP address (for example,
http://64.58.76.225 ) and hit Return.
- If your Web site appears, you have your own IP address. If another Web
site or an error message appears, you probably share the IP address with
others.
If you are unsure, ask your Web hosting service company if your Web site has
its own IP address.
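If you already know a few other domain names hosted by the same provider, a script can show which of them resolve to the same IP address as yours. The sketch below is only an illustration: group_by_ip is a made-up helper, the domain names and addresses in the lookup table are hypothetical, and passing no resolver makes it use socket.gethostbyname for real lookups.

```python
import socket
from collections import defaultdict


def group_by_ip(domains, resolve=socket.gethostbyname):
    """Map each IP address to the list of domains that resolve to it."""
    groups = defaultdict(list)
    for domain in domains:
        try:
            groups[resolve(domain)].append(domain)
        except OSError:
            # Lookup failed (socket.gaierror is a subclass of OSError).
            groups[None].append(domain)
    return dict(groups)


# Hypothetical lookup table standing in for real DNS answers.
FAKE_DNS = {
    "my-site.example": "192.0.2.10",
    "neighbor.example": "192.0.2.10",
    "other.example": "192.0.2.99",
}

for ip, names in group_by_ip(FAKE_DNS, resolve=FAKE_DNS.__getitem__).items():
    print(ip, names)
```

An address that lists more than one domain is shared. This only covers the domains you pass in, so it cannot prove that an address is exclusive to your site.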
Reason #2: You are hosting your Web site at a free Web space provider.
Some search engines (e.g. AltaVista) limit the number of pages they will
index from a single domain. For example, if your Web site is hosted at
Geocities.com or Tripod.com, it may not be listed simply because the maximum
page limit for that domain name has been reached.
Some search engines no longer even index pages residing on common free Web
hosting services. Their complaint is that they get too many spam or low-quality
submissions from free Web site domains.
However, Google is the exception. Google does index Web pages on
Geocities.com and Tripod.com. Those pages also seem to have a Google PageRank of
at least 3/10 because they are linked from a popular domain.
Reason #3: You don't allow robots to index your Web site.
Imagine you're an Internet marketing service company and you keep trying very
hard to get a top ranking in the search engines for your customer.
Even after several weeks, the customer's Web site hasn't been listed in any
search engine. Then you start to realize that the search engine spiders and
robot programs cannot access the Web site because your customer blocks them (by
mistake).
There are two ways to block search engine robots: a) with a simple text file
in the root directory of the host server, or b) with a certain META tag in the
Web pages.
a) Robots.txt
The host server might have a plain text file named "robots.txt" in the root
directory. It contains rules for the search engine spiders. The rules in the
robots.txt file follow the Robots Exclusion Protocol, a document designed to
help Web administrators and authors of Web spiders agree on a way to navigate
and catalog Web sites.
The content of the robots.txt file consists of two main commands: "User-agent"
and "Disallow".
The User-agent command specifies the name of the robot to which the
following commands apply. You can set this to "*" to have the spidering
commands applied to any robot.
The second command, "Disallow", specifies a partial URL that should not be
indexed by the Web robot.
The text
---
User-agent: *
Disallow: /
---
tells all search engine spider programs to go away. If you find a text file
called "robots.txt" in the root directory of the host server with the above
content, you should delete it immediately. The text file says that no search
engine is allowed to index your Web site.
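To see how a robot would interpret a given robots.txt file, you can feed its text to the robots.txt parser in Python's standard library (urllib.robotparser). This is a minimal sketch: the helper name allows_crawling is made up for this illustration, and real search engine spiders may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allows_crawling(robots_txt, url="/", agent="*"):
    """Return True if the given robots.txt text lets agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_ONE_DIRECTORY = "User-agent: *\nDisallow: /private/\n"

print(allows_crawling(BLOCK_ALL))                                # whole site
print(allows_crawling(BLOCK_ONE_DIRECTORY, "/index.html"))       # normal page
print(allows_crawling(BLOCK_ONE_DIRECTORY, "/private/a.html"))   # blocked area
```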
Even if your robots.txt file doesn't contain the above commands, you should
make sure that its syntax is correct. A robots.txt file with faulty syntax
can also prevent search engine spiders from indexing your Web site.
To check the syntax of your robots.txt file, you can use this free tool (just
enter your domain name www.domain.com): http://www.sxw.org.uk/computing/robots/check.html
b) The META ROBOTS tag
There's a second way to stop search engine robot programs from indexing your
Web site: the META ROBOTS tag. If you find the following HTML tag in your Web
pages:
---
<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">
---
you should replace it immediately with
---
<META NAME="ROBOTS" CONTENT="INDEX,FOLLOW">
---
If you want all search engine spiders to index all Web pages, you can also
remove the META ROBOTS tag from your Web pages.
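Checking many pages by hand is tedious, so a small script can look for a blocking META ROBOTS tag instead. The sketch below uses Python's standard html.parser module and treats a page as blocked when its robots META tag contains the "noindex" directive. The names RobotsMetaFinder and blocks_indexing are made up for this illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every META ROBOTS tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives += [part.strip().lower() for part in content.split(",")]


def blocks_indexing(html):
    """Return True if the page asks robots not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW"></head></html>'
print(blocks_indexing(page))
```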
Further information about both ways to stop search engines from indexing your
Web site can be found at: http://stakh.com/blogen/
Reason #4: Your Web pages are created dynamically.
Databases and dynamically generated Web pages are great tools to manage the
contents of big Web sites. Imagine you'd have to manage the Web site contents of
the New York Times without databases...
Unfortunately, dynamically generated Web pages can be a nightmare for search
engine spiders because the pages don't actually exist until they are requested.
A search engine spider is not going to be able to select all necessary variables
on the submit page.
The exceptions are the spider programs from Google and Inktomi. They are able
to index Web pages that are dynamically generated, even those that use question
marks and query strings.
On the other hand, AltaVista isn't able to index dynamically generated Web
pages, and here's why:
http://help.altavista.com/adv_search/ast_haw_wellindexed
If you create dynamic Web pages with the help of Active Server Pages (ASP),
ColdFusion, CGI, Perl or the Apache Server, then the following Web page offers
good advice:
http://spider-food.net/dynamic-page-optimization-b.html
Reason #5: Your Web pages require a full-fledged browser.
When search engines crawl the Web to find new Web pages, they use special
software for it, called "spiders", "robots" or "crawlers".
These crawler programs don't have the functionality of full-fledged Web
browsers such as Microsoft Internet Explorer or Netscape Navigator.
In fact, search engine robot programs look at your Web pages like a text
browser does. They like text, text, and more text. They ignore information
contained in graphic images but they can read text descriptions.
This means that search engine spider programs are not able to use Web browser
technology to access your site. If your Web pages require Flash, DHTML, cookies,
JavaScript, Java or passwords to access the page, then search engine spiders
might not be able to index your Web site.
Therefore, it might be a good idea to test your Web pages with very old
versions of Web browser applications or with the software program "Lynx", a
text-only browser.
Lynx is available for download. Here's an online version of Lynx that allows
you to test your Web pages with a text-only browser quickly and easily:
http://www.delorie.com/web/lynxview.html
A "search engine spider simulator" shows how search engines see your Web site:
http://www.delorie.com/web/ses.cgi
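For a rough local approximation of such a text-only view, you can strip a page down to its text and the text descriptions of its images. The sketch below uses Python's standard html.parser module; the names TextOnlyView and visible_text are made up for this illustration, and the assumption that a spider reads exactly this text is a simplification, not a description of any particular search engine.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Keep page text and image descriptions, drop scripts and styles."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            # The picture itself is ignored; only its ALT text is kept.
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    view = TextOnlyView()
    view.feed(html)
    view.close()
    return " ".join(view.parts)


sample = '<h1>Widgets</h1><script>var x = 1;</script><img src="a.gif" alt="Blue widget">'
print(visible_text(sample))
```

If important content on your page is missing from this output, it probably only exists in scripts, images without descriptions, or plug-ins.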
Reason #6: Your Web site has a low link popularity.
Link popularity is becoming _the_ determining factor for a top search engine
ranking.
Link popularity means the number of Web sites linking to your site. However,
the quality of links is more important than the quantity of links. For instance,
if the New York Times links to your site, their single link might count a lot
more than 30 links from your friends' personal homepages.
By now, all top search engines use link popularity in their ranking formulas:
AltaVista, Inktomi, MSN Search, HotBot. For Google, it's even the most important
factor in ranking sites.
The idea behind link popularity is that other Web sites will link to your
site only if you are a quality site offering quality resources. So if many Web
sites link to your site, search engines come to the conclusion that your site
must be very popular and deserves a high ranking.
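The basic counting idea can be sketched in a few lines. The example below only counts how many distinct sites link to each site; it ignores link quality, and it is not the ranking formula of any search engine. The function name link_popularity and the site names are hypothetical.

```python
def link_popularity(links):
    """Count the distinct sites linking to each target site.

    links is an iterable of (source_site, target_site) pairs.
    """
    inbound = {}
    for source, target in links:
        if source != target:  # a site's links to itself don't count
            inbound.setdefault(target, set()).add(source)
    return {site: len(sources) for site, sources in inbound.items()}


# Hypothetical link data.
links = [
    ("news.example", "your-site.example"),
    ("friend.example", "your-site.example"),
    ("friend.example", "your-site.example"),   # repeated link, counted once
    ("your-site.example", "your-site.example"),
    ("your-site.example", "news.example"),
]
print(link_popularity(links))
```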
As Google co-founder Sergey Brin said in an interview: "...a page that is
pointed to by many other sites is important. In other words, external approval
raises a page's ranking."
Link popularity can do a lot for your site. Not only will the most important
search engines rank you higher, but links from other sites will also drive more
traffic to you.
In addition, as more sites link to you, the odds increase that search engine
spider programs will encounter your site more regularly so that it's less likely
that they drop your site from their index.
What's the link popularity of your site?
The freeware Windows program Link Popularity Check tells you what the link
popularity of your Web site is and compares it with competitor sites.
Link Popularity Check is a classic freeware application: it has no nag
screens, it doesn't change the system registry, it makes no unauthorized
connections to the Internet and it comes with a hands-free uninstaller.
Download Link Popularity Check (freeware).
How to improve the link popularity of your Web site:
You can search the search engines for Web sites that are related to your
business, find the webmaster's contact information and then solicit a reciprocal
link. Do this every day and your link popularity will climb steadily, although
slowly, and the work is very time-consuming.
Reason #7: Your Web site has a slow host server.
Search engine crawler programs that index Web pages don't have much time.
There are approximately 2-4 billion Web pages all over the world and search
engines want to index all of them.
So if the host server of your Web site has a slow connection to the Internet,
you may find that your Web site is not indexed by the major search engines at
all.
AltaVista and Google specifically mention the problem on their Web sites.
AltaVista: "If a site has a slow connection or the pages are very complex, it
might time out before the crawler can index all the text."
Found at:
http://help.Altavista.com/adv_search/ast_haw_wellindexed
Google: "Your site may not have been reachable when we tried to crawl it
because of network or hosting problems. When this happens, we retry multiple
times, but if the site cannot be crawled, it will not be listed in our current
index."
Found at:
http://www.Google.com/webmasters/2.html#A3
You may also want to limit the size of your homepage to less than 60K. That
would also benefit the still numerous users who connect to the Internet with a
slow modem. Even for the casual Internet user, the performance of a Web site
can make the difference between pleasure and frustration.
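Both checks, page size and response time, are easy to script once you have a way to fetch the page. The sketch below assumes that "60K" means 60 * 1024 bytes and that you supply your own fetch function (for example, one built on urllib.request.urlopen). The names homepage_size_report and timed are made up for this illustration.

```python
import time

MAX_HOMEPAGE_BYTES = 60 * 1024  # assumption: "60K" read as 60 * 1024 bytes


def homepage_size_report(html_bytes, limit=MAX_HOMEPAGE_BYTES):
    """Report the page size and whether it stays under the limit."""
    size = len(html_bytes)
    return {"bytes": size, "within_limit": size < limit}


def timed(fetch):
    """Call fetch() and return its result plus the elapsed seconds."""
    start = time.perf_counter()
    body = fetch()
    return body, time.perf_counter() - start


# Stand-in for a real download, so the example runs without a network.
body, seconds = timed(lambda: b"<html>" + b"x" * 1000 + b"</html>")
print(homepage_size_report(body), round(seconds, 3))
```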
How you can test the speed of your server:
http://web-hosting.candidinfo.com/site-response-speed.asp
Reason #8: Your Web page URL contains special characters.
Most search engines have problems indexing Web pages when their URLs contain
special characters. The following special characters are known to be "search-engine-spider-stoppers":
- ampersand (&)
- dollar sign ($)
- equals sign (=)
- percent sign (%)
- question mark (?)
These characters are often found in the URLs of dynamically generated Web
pages. They signal to the search engine crawler program that there could be an
infinite loop of possibilities for that page. That's why many crawlers ignore
Web page URLs that contain the above characters.
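A quick way to audit a list of your own URLs is to scan each one for the characters listed above. This is a minimal sketch; the function name spider_stoppers_in is made up for this illustration.

```python
SPIDER_STOPPERS = "&$=%?"  # the five characters from the list above


def spider_stoppers_in(url):
    """Return the problem characters that occur in url, in list order."""
    return [ch for ch in SPIDER_STOPPERS if ch in url]


print(spider_stoppers_in("www.site.com/articles/query.asp?article=83"))
print(spider_stoppers_in("www.site.com/articles/83.html"))
```

An empty result means the URL contains none of the five characters.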
AltaVista and Lycos explain on their help pages why they cannot index such
Web pages.
HotBot recommends that you submit your dynamic Web pages with all parameters
added onto the URL (for example, "www.site.com/articles/query.asp?article=83").
Google and Inktomi utilize crawler programs that are able to index
dynamically generated Web pages, even those that use question marks.
So what can you do if you have dynamically generated Web pages with special
characters? If you use the Apache Server, ASP, CGI/Perl or ColdFusion, the
following Web page provides some solutions:
http://spider-food.net/dynamic-page-optimization-b.html
Reason #9: Search engines could not resolve your DNS name.
There's a mistake that novice users often make. They register a domain name (for
example, www.my-great-site.com), and they immediately submit the Web site URL to
the search engines.
Then they wonder why the search engines didn't index their site. The answer
is that the search engines weren't able to reach it yet.
It takes approximately 2-4 days until a domain name becomes active. All
Internet access providers must update their records (DNS tables) to reflect new
site locations. The process of updating DNS tables is called propagation.
Search engines must also update their DNS tables and until then, the new
domain name www.my-great-site.com doesn't work.
So when you register a new domain name, you should wait about 48-72 hours
before you submit the domain name to the search engines.
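Before submitting, you can at least confirm that the new name resolves from your own computer. The sketch below is an illustration only: domain_resolves is a made-up helper that uses socket.gethostbyname by default, and a successful lookup on your machine does not guarantee that the search engines' DNS records have been updated as well.

```python
import socket


def domain_resolves(domain, resolve=socket.gethostbyname):
    """Return True if the domain name currently resolves to an IP address."""
    try:
        resolve(domain)
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError.
        return False
    return True


# Stand-in resolver so the example runs without a network connection.
def not_propagated_yet(name):
    raise socket.gaierror("name not known yet")


print(domain_resolves("www.my-great-site.com", resolve=not_propagated_yet))
```

Calling domain_resolves("www.my-great-site.com") without the resolve argument performs a real DNS lookup.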
Copyright © 2003 Stakh SEO News ST-K