The Official Website of AgoraCart and Agora.cgi
AgoraCart Free User Forums

This is the official FAQ and Cool Tips guide For the AgoraCart shopping Cart software


Spider attack from Yahoo
My web host thinks I am being hit by a spider from Yahoo that is using up my monthly bandwidth at a rapid rate. It comes back every few seconds. How do you add a robots.txt file to deter the spider, and where do you put it? All I know is that you are supposed to put some text in the header.

Anyway, it is my main agora.cgi page that it keeps downloading (http://miraclebaby.co.nz/shopping/agora.cgi).

Here's the info on robots.txt:
http://www.robotstxt.org/wc/robots.html

You might want to create a site map page for the spiders...
This will help with indexing in the search engines.

HTH!


_________________
God Bless!
Bonnie - AgoraCart Moderator

i noticed you have a lot of product data outside of the store folder. are all of your products described outside of the store? if so, then disallow robots from entering the "shopping" folder.
right now the google page rank for your store is grayed out, which suggests there is a problem. this could be due to duplicate content. if it is, disallowing search engines from indexing the store won't hurt anything: if you have enough descriptive text for your products outside of the store, then there is no reason for the spiders to be entering the store.
d

I've got the main website, which links to the shopping cart throughout (an info site and a shopping site together). I don't know what to put to stop just the shopping part from being accessed, so for now I have put up a robots.txt file that blocks Yahoo, where the problem was coming from, from the whole site. It would be better to block it from the cart area only, but I wouldn't know how to.

I don't know what I would have done to make Google gray out the shopping area, but wouldn't the site have better rankings if it ranked the shopping area?

the user agent can be spoofed. your administrator must make sure that it is indeed yahoo causing the problem. also, robots sometimes go nuts, and this may be an isolated incident. if the abuse continues then action will be necessary. msn hammered one of my sites once and drove the bandwidth up in a very short time. however, it never happened again so i didn't do anything about it.
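
One way an administrator might check that a visitor claiming to be Yahoo really is Yahoo is a forward-confirmed reverse DNS lookup: resolve the IP address to a host name, check the domain, then resolve that host name back to the same IP. The following is only a minimal Python sketch. The function name, the stub lookups, the sample host name and the sample IP addresses are all hypothetical, and the `.crawl.yahoo.net` suffix is an assumption that should be confirmed against Yahoo's own documentation before relying on it.

```python
import socket


def is_verified_crawler(ip, domain_suffix,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a host under domain_suffix
    and that host name resolves forward to the same ip."""
    try:
        host = reverse(ip)[0]
    except OSError:  # no reverse DNS record for this address
        return False
    if not host.lower().endswith(domain_suffix):
        return False
    try:
        addresses = forward(host)[2]
    except OSError:  # host name does not resolve
        return False
    return ip in addresses


# Offline demonstration with made-up stub lookups (no network needed).
def stub_reverse(ip):
    # A spoofer controls their own reverse DNS, so it can claim anything.
    return ("crawler1.crawl.yahoo.net", [], [ip])


def stub_forward(host):
    # The real owner of the name only lists its own address.
    return (host, [], ["203.0.113.7"])


ASSUMED_SUFFIX = ".crawl.yahoo.net"  # assumption, verify before use
genuine = is_verified_crawler("203.0.113.7", ASSUMED_SUFFIX, stub_reverse, stub_forward)
spoofed = is_verified_crawler("203.0.113.8", ASSUMED_SUFFIX, stub_reverse, stub_forward)
print(genuine, spoofed)
```

On a live server you would call `is_verified_crawler(ip, ASSUMED_SUFFIX)` with the default lookups, using IP addresses taken from the access log.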

if you have the same data for all products in static pages (html files outside of the store) as you do in the store then there is no reason for the spiders to index the store. further, this can be considered duplicate content, which could very well be why google has grayed out the ranking meter for your store pages. again, there is nothing to gain by allowing a spider to index your store if all of your products' information can be indexed outside of the store.
you're better off with the static pages (SEO wise) than having all of your product information (hidden to many spiders) inside dynamic pages. so keep what you have. just disallow spiders from indexing your store.
one important thing here. i am assuming that duplicate content is the issue with google and possibly other search engines. exclude the spiders with robots.txt from indexing your store. if you find that your placement, traffic or sales have dropped, or you see any other negative effect after adding the robots.txt file, then either delete the robots.txt or change it to allow everything.

robots.txt file - how to
create a new plain text file with notepad. name it robots.txt
to allow all robots to index everything add the following to the robots.txt file contents...

User-agent: *
Disallow:


to block all robots from indexing everything (not advisable!)...

# warning this is not a good thing to do!
User-agent: *
Disallow: /


to exclude the store folder from all robots...

User-agent: *
Disallow: /store_folder_name/


the beginning "/" is the public root, so the full path from the root to the store folder is necessary. if your store is in the cgi-bin then you would do this...

User-agent: *
Disallow: /cgi-bin/store_folder_name/

if the store is in a sub folder of a folder in the public root (public_html or www or whatever) then do the following....

User-agent: *
Disallow: /members/store_folder_name/

this would look something like the following in a url...

http://www.domain.com/members/store_folder_name/agora.cgi
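
If you are unsure what path to put after Disallow, it can be worked out from the store's url: it is the folder part of the url, starting at the public root. A small Python sketch of that idea follows. The helper name `disallow_line_for` is made up for illustration, and it assumes the store script sits directly inside the folder you want to exclude.

```python
import posixpath
from urllib.parse import urlparse


def disallow_line_for(store_url):
    """Build a Disallow line for the folder that holds the store script."""
    path = urlparse(store_url).path      # e.g. /members/store_folder_name/agora.cgi
    folder = posixpath.dirname(path)     # e.g. /members/store_folder_name
    # Always end with "/" so only that folder and what is under it match.
    return "Disallow: " + folder.rstrip("/") + "/"


print(disallow_line_for("http://www.domain.com/members/store_folder_name/agora.cgi"))
print(disallow_line_for("http://miraclebaby.co.nz/shopping/agora.cgi"))
```

Be careful with a store script that sits directly in the public root: the helper then returns `Disallow: /`, which blocks the whole site.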

another thing to remember: a Disallow path matches every url that starts with it, so any files and sub folders under the folder you're excluding will not be accessed or indexed either. so in your case the following will prevent spiders from accessing any files and/or folders in your store but allow indexing everywhere else...


User-agent: *
Disallow: /shopping/
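
Before uploading, the rules can be tried out locally. The following is a rough sketch using Python's standard `urllib.robotparser` module. It only shows how that one parser reads the file, not how any particular search engine behaves, and the paths `/shopping/html/page.html` and `/index.html` are hypothetical examples rather than known pages on the site.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /shopping/
"""


def is_allowed(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


SITE = "http://miraclebaby.co.nz"
paths = [
    "/shopping/agora.cgi",        # the store script from this thread
    "/shopping/html/page.html",   # hypothetical file in a sub folder
    "/index.html",                # hypothetical static page outside the store
]
results = {path: is_allowed(ROBOTS_TXT, "Slurp", SITE + path) for path in paths}
for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Python's parser applies the first rule that matches, so with this parser an empty `Disallow:` line placed ahead of the folder rule would allow everything. The sketch leaves it out for that reason.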


whenever dealing with SEO fixes or changes you must monitor results over time to make sure you're not doing anything harmful. a simple mistake can have negative results. the fact that your store has been grayed out doesn't necessarily mean something really nasty is going on... just that something isn't right. you may be slightly penalized in placement or you may suffer severe placement problems by not fixing it. it's hard to tell without finding exactly why google has a problem with your store. however, the fact that your main domain and static pages have ranking tells me that your site hasn't been banned.

add the last example to your robots.txt file then upload it in ascii mode to your public root folder.
the robots.txt file will not in itself be harmful if indeed the problem is duplicate content. but then again, this is google, and they don't bother telling you why about anything, only that in their infinite wisdom they decided to mess with your head. it would be no problem for them to provide a site check form that tells you what the major problems are, much like the W3C does. but they don't. they just do whatever they want and explain nothing to anyone. so keep an eye on changes in placement, traffic, page ranking, etc. if there is any negative effect, remove the robots.txt asap.

d

oh, one other thing. if you want to block one or more robots from your store and allow everyone else access, you can do something like this (leave a blank line between the records)...

# disallows yahoo only
User-agent: Slurp
Disallow: /shopping/

User-agent: *
Disallow:

or....

# disallows msn and yahoo only
User-agent: MSNBot
Disallow: /shopping/

User-agent: Slurp
Disallow: /shopping/

User-agent: *
Disallow:
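
To see which robots a file like this actually shuts out of the store, the per-robot records can be checked with Python's standard `urllib.robotparser`. This is a sketch only: the helper name `blocked_agents` is made up, "Googlebot" is used simply as an example of a robot not named in the file, and real crawlers may interpret the file differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# disallows msn and yahoo only
User-agent: MSNBot
Disallow: /shopping/

User-agent: Slurp
Disallow: /shopping/

User-agent: *
Disallow:
"""

STORE_URL = "http://miraclebaby.co.nz/shopping/agora.cgi"


def blocked_agents(robots_txt, agents, url):
    """Return the agents from the list that may not fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


blocked = blocked_agents(ROBOTS_TXT, ["MSNBot", "Slurp", "Googlebot"], STORE_URL)
print(blocked)
```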


another thing you can do is tell them to slow the heck down...

# tells all bots not to come here and start downloading everything all at once. take small bites!

User-agent: *
Disallow:
Crawl-delay: 120 # in seconds

this may be more helpful with yahoo and other bots site wide, though keep in mind that Crawl-delay is a non-standard extension and not every robot honors it. i would still disallow all bots from entering your store too, provided all product data is located outside the store!
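
As a quick sanity check that the Crawl-delay line is written in a form a parser can read, Python's standard `urllib.robotparser` (version 3.6 or later) can report the delay it finds. This sketch only confirms that the file parses; whether a given robot obeys the delay is up to that robot.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow:
Crawl-delay: 120 # in seconds
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The "*" record applies to any robot not named elsewhere in the file.
delay = parser.crawl_delay("Slurp")
still_allowed = parser.can_fetch("Slurp", "http://miraclebaby.co.nz/shopping/agora.cgi")
print(delay, still_allowed)
```

The delay comes back as a number of seconds, and the empty Disallow means nothing is blocked, so the robot is only being asked to slow down.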

Okay, I will disallow robots from the shopping area and see how it goes.
