Jul 15



    Sometimes you don’t want parts of your website to be indexed by search engine bots. Perhaps some directories are for private use only, or perhaps you are working on a test site. In theory, a page that is not linked from anywhere is safe from spiders because no one knows its exact URL. In practice, you want to make sure that the private part of your site remains private.

    There are 3 common ways to prevent search engines from indexing some directories or files.

    1. Using robots.txt
      When a search engine crawler visits your site, it looks for a file named robots.txt at the root of the site for instructions. To keep well-behaved crawlers out of a directory, put the following in robots.txt:

      User-agent: *
      Disallow: /directory_name/

      If you have 2 directories, MyData and TestData, that you want to keep private, robots.txt will look like:

      User-agent: *
      Disallow: /MyData/
      Disallow: /TestData/

      The asterisk * indicates that the instruction applies to all spiders. If you only want to block Google's crawler, use:

      User-agent: googlebot
      Disallow: /MyData/
      Disallow: /TestData/
    2. Using meta tags in each file
      You can put the following meta tag between the <head> and </head> tags in any file you do not want search engine spiders to index.

      <meta name="robots" content="noindex" />
    3. Using a password
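      You can password-protect files and directories either through your web server's control panel or with applications that have a password option. Unlike the first two methods, which rely on crawlers choosing to obey them, a password blocks access outright.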
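
Before relying on the first two methods, it can help to check that the rules say what you intended. The following is a minimal sketch using only Python's standard library. The helper names `is_allowed` and `has_noindex`, the bot names, and the sample paths are made up for illustration. It only tests how the rules parse and does not prove that any particular search engine will honor them.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# The two robots.txt variants from method 1.
ROBOTS_ALL = """\
User-agent: *
Disallow: /MyData/
Disallow: /TestData/
"""

ROBOTS_GOOGLE = """\
User-agent: googlebot
Disallow: /MyData/
Disallow: /TestData/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if robots_txt lets user_agent fetch path (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex (hypothetical helper)."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return "noindex" in finder.directives


if __name__ == "__main__":
    # "AnyBot" and "OtherBot" are placeholder crawler names.
    print(is_allowed(ROBOTS_ALL, "AnyBot", "/MyData/report.html"))
    print(is_allowed(ROBOTS_GOOGLE, "OtherBot", "/MyData/report.html"))
    print(has_noindex('<head><meta name="robots" content="noindex" /></head>'))
```

With the googlebot-only file, every other crawler is still allowed into both directories, which is worth keeping in mind when choosing between the two variants.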
