  #1  
07-13-2009, 02:12 PM
Alphanso
What are spiders

Crawlers, Agents, Bots, Robots and Spiders
These five terms all describe basically the same thing; in this article they are referred to collectively as spiders or "agents". A search engine spider is an automated program that locates and collects data from web pages for inclusion in a search engine's database, and follows links to find new pages on the World Wide Web. The term "agent" is more commonly applied to web browsers and mirroring software.

If you've ever examined your server logs or web site traffic reports, you've probably come across some weird and wonderful names for search engine spiders, including "Fluffy the Spider" and Slurp. Depending upon the type of web traffic reports you receive, you may find spiders listed in the "Agents" section of your statistics.
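
If your host only gives you raw log files, a short script can build the same "Agents" list. The following is a minimal sketch, assuming your server writes the common "combined" log format, where the user agent is the last quoted field on each line. The sample lines, IP addresses and version numbers are made up for illustration, and `count_user_agents` is a hypothetical helper name.

```python
import re
from collections import Counter

# In the combined log format each line ends with: "referer" "user-agent"
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')

def count_user_agents(lines):
    """Count how often each user agent string appears in the log lines."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:  # skip lines that are not in the expected format
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample data, not taken from a real server.
SAMPLE_LOG = [
    '192.0.2.1 - - [13/Jul/2009:14:12:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '192.0.2.7 - - [13/Jul/2009:14:13:05 +0000] "GET /a.jpg HTTP/1.0" 200 90210 '
    '"-" "Teleport Pro/1.29"',
    '192.0.2.7 - - [13/Jul/2009:14:13:06 +0000] "GET /b.jpg HTTP/1.0" 200 88112 '
    '"-" "Teleport Pro/1.29"',
]

if __name__ == "__main__":
    # List agents from most to least frequent.
    for agent, hits in count_user_agents(SAMPLE_LOG).most_common():
        print(hits, agent)
```

An agent that requests many files in a very short time, as the mirroring tool does in the sample, is usually the one worth a closer look.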

Not all spiders are good
Who actually owns these spiders? It helps to be able to tell the beneficial ones from the bad. Some agents are generated by software such as Teleport Pro, an application that lets people download a full "mirror" of your site onto their hard drives for viewing later, or sometimes for more insidious purposes such as plagiarism. If you have a large or image-heavy site, this kind of site stripping can also have a serious impact on your monthly bandwidth usage.

Banning spiders and agents
If you notice entries like Teleport Pro and WebStripper in your traffic reports, someone has been busy attempting to download your web site. You don't have to just sit back and let this happen. If you are commercially hosted, you can add a couple of lines to your robots.txt file asking repeat offenders to stay away. Keep in mind that robots.txt is advisory: well-behaved agents obey it, but an agent that ignores it has to be blocked on the server itself.

The robots.txt file gives search engine spiders and agents directions by telling them which directories and files they may examine and retrieve. These rules are known as the Robots Exclusion Standard.
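
To see how a well-behaved agent would read such rules, here is a rough sketch using Python's standard `urllib.robotparser` module. The robots.txt text is only an example: the exact `User-agent` tokens for Teleport Pro and WebStripper are assumptions based on the names shown in traffic reports, and `www.example.com`, `/private/` and `build_parser` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Example rules only; the User-agent tokens are assumed, check your own logs.
ROBOTS_TXT = """\
User-agent: Teleport Pro
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: *
Disallow: /private/
"""

def build_parser(robots_text):
    """Parse robots.txt text the way a compliant spider would."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    checks = [
        ("WebStripper", "http://www.example.com/index.html"),
        ("Teleport Pro/1.29", "http://www.example.com/images/logo.gif"),
        ("Googlebot", "http://www.example.com/index.html"),
        ("Googlebot", "http://www.example.com/private/report.html"),
    ]
    for agent, url in checks:
        verdict = "allowed" if parser.can_fetch(agent, url) else "disallowed"
        print(agent, url, verdict)
```

The two site rippers are asked to stay out of the whole site, while every other agent is only kept out of `/private/`. This only shows what a compliant agent should do; nothing in robots.txt forces an agent to obey.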
  #2  
07-18-2009, 10:56 AM
Shweta

Hello Alphanso,

You have provided some really useful information, but I think it is not enough for newcomers to this field.

For my part, I would like to know how I can protect my website or software from hackers by making changes to the robots.txt file. Please also tell me the names of the common spiders and what they mean.
  #3  
07-23-2009, 04:31 PM
guneet
robots.txt does not stop hacking of web pages or software

The robots.txt file only blocks or allows indexing by search engine spiders; it does not provide any security to protect your web pages or software from hackers. If you are concerned about hackers, you need to build login forms for user verification (checked on the server, not only in JavaScript) and use strong, hard-to-guess passwords for the main admin entry of the website or software.
All times are GMT +6.5.

