The Official Website of AgoraCart and Agora.cgi
AgoraCart Free User Forums

This is the official FAQ and Cool Tips guide For the AgoraCart shopping Cart software


Googlebot crawling too much
Googlebot 18107+13

I noticed that Googlebot is very busy on my agora.cgi files.

This is in version 4.

It seems to be crawling a lot of the leftover carts.

It keeps coming back and going through everything.

Normally we want Googlebot, but this seems like too much.

Any ideas on what, if anything, to do?

You will need to create a robots.txt file for your site that tells the robots where they can and cannot go.
Here's some info on them:
http://www.robotstxt.org/wc/robots.html
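
A minimal sketch of what such a file does, using Python's standard urllib.robotparser to check which URLs a well-behaved bot may fetch. The rules, the www.example.com domain, and the /store/ and /Catalog/ paths are assumptions made up for illustration; they are not taken from anyone's real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all bots out of /store/ and /cgi-bin/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /store/
Disallow: /cgi-bin/
"""


def build_parser(text):
    """Parse robots.txt text (no network access) and return the parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical URLs: one inside the store, one in a static catalog.
store_url = "http://www.example.com/store/agora.cgi?cart_id=123.456"
catalog_url = "http://www.example.com/Catalog/index.html"

for url in (store_url, catalog_url):
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(verdict, url)
```

Disallow rules match by path prefix, so a rule for a directory covers every URL under it, query string and all. That is why a rule cannot single out the cart_id links while leaving the product links in the same script crawlable.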

HTH!


_________________
God Bless!
Bonnie - AgoraCart Moderator

What do you mean it's crawling leftover carts? The only way it can "see" other carts is if there is a link someplace with the cart_id hardcoded in it: on your web site, a forum, a guestbook, another website, etc.
I'm guessing here, as all of the above are possible, but each time the bot indexes the contents of your store (cart links, pages, categories, etc.) a new cart_id is created for the session. The bot indexes the links and contents, then returns later to check that the links (with the cart_id) are still valid. So it's possible for Google to hold numerous links with different cart_ids that it has to recheck.
I have a program that creates a static page index of the dB, linked to a static HTML page for every product in the dB. Each individual product page links directly to the category and the p_id in the store. Once the site is set up this way, you can use robots.txt to keep all bots out of the store. Otherwise you can't, unless you want to suffer from a lack of SE traffic.
I wouldn't worry about this issue. Where you need to worry is when a bot gets into an endless loop and drives your bandwidth through the roof, but that's not happening much these days. One other issue is the links that search engines hand to visitors. If more than one person clicks a link with a hardcoded cart_id (provided by Google, for example), it could get messy if they are paying customers.
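
To illustrate the guess above, here is a rough sketch that drops the cart_id parameter from crawled URLs and groups them by what is left, so you can see how many distinct links point at the same store page. The example URLs and the helper names (strip_cart_id, group_by_page) are made up for illustration; only the cart_id parameter name comes from this thread.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_cart_id(url):
    """Return the URL with any cart_id query parameter removed."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "cart_id"
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


def group_by_page(urls):
    """Map each cart_id-free URL to the list of crawled URLs that reduce to it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_cart_id(url)].append(url)
    return dict(groups)


# Hypothetical URLs as a bot might have collected them on separate visits.
crawled = [
    "http://www.example.com/store/agora.cgi?cart_id=111.1&product=Bracelets",
    "http://www.example.com/store/agora.cgi?cart_id=222.2&product=Bracelets",
    "http://www.example.com/store/agora.cgi?cart_id=333.3&product=Rings",
]

for page, variants in group_by_page(crawled).items():
    print(len(variants), "crawled link(s) for", page)
```

Run against the URLs from your own access log, a large count per page would be consistent with the bot rechecking many session-specific links for the same content.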
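
The poster's generator is not shown in this thread, so the following is only a sketch of the general idea: build one static page per product plus an index, with each product page linking into the store by p_id and never carrying a cart_id. The store URL, the product fields, the file naming, and the agora.cgi?p_id=... link format are all assumptions for illustration.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical store script location.
STORE_URL = "http://www.example.com/store/agora.cgi"


def product_page(product):
    """Render a static HTML page for one product, linking to the store by p_id."""
    link = STORE_URL + "?" + urlencode({"p_id": product["p_id"]})
    return (
        "<html><head><title>{name}</title></head><body>\n"
        "<h1>{name}</h1>\n"
        "<p>{description}</p>\n"
        '<p><a href="{link}">View in store</a></p>\n'
        "</body></html>\n"
    ).format(
        name=escape(product["name"]),
        description=escape(product["description"]),
        link=escape(link),
    )


def build_catalog(products):
    """Return a dict of filename -> HTML: one page per product plus index.html."""
    pages = {}
    items = []
    for product in products:
        filename = "product-{}.html".format(product["p_id"])
        pages[filename] = product_page(product)
        items.append(
            '<li><a href="{}">{}</a></li>'.format(filename, escape(product["name"]))
        )
    pages["index.html"] = (
        "<html><body>\n<ul>\n" + "\n".join(items) + "\n</ul>\n</body></html>\n"
    )
    return pages


# Made-up sample rows standing in for the store dB.
sample_products = [
    {"p_id": 101, "name": "Silver Bracelet", "description": "Sterling silver."},
    {"p_id": 102, "name": "Gold Ring", "description": "14k gold band."},
]
catalog = build_catalog(sample_products)
print(sorted(catalog))
```

Writing each entry of the returned dict to a file in a catalog directory gives bots plain HTML to index, while the store itself can be disallowed in robots.txt.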
d

<a href="http://mycart....com/getmydvd/agora.cgi?cart_id=564542.29788*c37SA4564542.29788"

Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

It's showing links like these.

So is that an old cart, or did the bot create a new cart_id?

I want the bots to index my products using agora.cgi, but not the carts.

I used robots.txt to disallow most of the other directories, but I don't want to disallow the entire store, because I want the products to get indexed.

Right. robots.txt will keep the honest bots out of store directories. However, that is not an issue unless you have links to files and/or folders in the store, which you don't or shouldn't. So you're back to square one, because all links contain cart_ids from the second store page visited onwards, with the exception of returning from the viewcart page and some other functions.
There are several ways to prevent the cart_id from displaying in a link or address...

You can strip the cart_id=(%%cart_id%%) from all scripts and links. This will prevent the cart_id from displaying in the query string. This is not recommended. However, I know people who claim this has never presented a problem for any of their customers.

You can build a product link list outside of the store that links to each product by p_id rather than by product. In the store headers, add the robots meta tag with nofollow. This will have some limited success.

You can build a complete online catalog outside of the store that links to the product, the product category, and other destinations on your site. The catalog would include the product description; the price is not necessary. Then write a robots.txt telling all bots to stay out of your store directory...
http://www.hgltd1.com/
The store is in the hgltd1 subdirectory, for example
http://www.hgltd1.com/hgltd1/category-Bracelets.htm
The online catalog (built by my static page generator) is in the Catalog subdirectory
http://www.hgltd1.com/Catalog/index.html
I chose to keep bots out of several directories, including the store. Since all the information relevant to search engines can be found in the catalog, there is no reason to allow bots to index the store's dB. So I created this robots.txt to keep them out...
http://www.hgltd1.com/robots.txt

A while back I updated the robots.txt and omitted the hgltd1 subdirectory for experimental reasons, so if you have the Google Toolbar you will see a PR (PageRank) where there shouldn't be one.

One more way to prevent the cart_id from being included in the URLs/links is to use mod_rewrite. This is not a cure-all unless the store is heavily hacked to allow for rewriting throughout. At hgltd1.com you will see the store with mod_rewrite in the hgltd1 subdirectory.

My personal opinion? The online catalog is the best way, while keeping bots out of the store. It's a lot of work to do by hand, but worth it. It's also tons easier with the online catalog generator.
d
