- 5 05 2002 - 01:36 - katatonik

what I fail to understand

camp catatonia used to be hosted at one server. then it relocated. the pages on the old server are still there, but all redirected to the corresponding pages on the new one. the pages on the old server expressly say “piss off” to each robot that aims to index them, and they do so in nice language (with a meta-tag). there’s no robots.txt file, but the old site still gets hits from search engines that one might expect to respect meta-tags which tell them to piss off.
it doesn’t matter that much, but i’m just wondering why all this happens. oh, am i wondering.

google isn't perfect. i have experienced the same anomalies with waplog.ch. meta tags and a robots.txt file have long been in place there, yet the (two) pages have been indexed by google for the last few months. i haven't checked other search engines, because frankly i don't care about em. ;) did you find any other engine that didn't cope with your indexing directions?

ubique (May 5, 14:37) #

these are the main engines, or sites, that continue to index the old site: www.inktomi.com, www.fastsearch.net, altavista, www.search.at, www.speakeasy.net, and, of course, google. lots of google. it's especially bizarre with the individual entry pages, because in my setup, the entry pages contain only comments. i don't get that many hits for these indexed pages, luckily, but they still get indexed :-)

katatonik (May 5, 15:05) #

hm, i dunno. none of these engines can afford to ignore indexing directions. but i think i got something for you. your robots.txt file resides at http://.../, right? a robots.txt file must reside on the server root, that is at http://.../ (where you probably don't have any access privileges, alas). imagine robots had to look up a robots.txt file for every directory they index - that would mean a whole lot of extra traffic for big sites.
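
to make the "server root" point concrete, here's a minimal python sketch of how a crawler might work out where to look for robots.txt, given any page url. the function name `robots_txt_url` and the example.com address are made up for illustration; this is not taken from any real crawler.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the one place a robot would look for robots.txt for this page.

    Only the scheme and host are kept; the page's own directory is ignored,
    which is why a robots.txt stored in a subdirectory is never consulted.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# hypothetical page in a user's subdirectory
print(robots_txt_url("http://example.com/~someone/weblog/entry.html"))
```

so one lookup per host covers every directory on it, and a file anywhere below the root simply isn't seen.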

also, there might be a syntactic problem with your meta tags. currently you have:

[meta name="robots" content="noindex"]
[meta name="robots" content="nofollow"]

it's possible that the second definition overwrites the first one. this page (http://.../) indicates it should be:

[meta name="robots" content="noindex,nofollow"]
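
for what it's worth, here's a small python sketch of how a tolerant robot could read these tags. it collects the directives from every robots meta tag on a page, so the two-tag and one-tag forms come out the same. that real search engines behave this way is only an assumption, and the names `RobotsMetaParser` and `robots_directives` are made up for this example. the sample html uses real angle brackets instead of the square ones above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # content may hold one directive or a comma-separated list
        for token in (attr.get("content") or "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the union of all robots directives found in the html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


two_tags = '<meta name="robots" content="noindex"><meta name="robots" content="nofollow">'
one_tag = '<meta name="robots" content="noindex,nofollow">'
print(robots_directives(two_tags) == robots_directives(one_tag))
```

a robot that kept only the last tag it saw would end up with just "nofollow" for the two-tag version, which is the overwriting worry mentioned above.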


ubique (May 5, 15:44) #

thanks, but having two meta-tags instead of one was a feature only of the main index page (a leftover); on the entry pages, they are all separated.
i don't have a robots.txt file, exactly because i don't have root access on this server, but i thought that perhaps meta-tags would be enough. oh well, just wait and see.
actually i'd much rather delete the old pages, but am still not certain whether people have updated bookmarks already. i tried a server redirect, but couldn't get that to work (don't remember why).

katatonik (May 5, 15:52) #

well, then i'm out of ideas too. wait it out and drink tea, i'd say. ;)

ubique (May 6, 17:32) #

thanks. do you happen to have a few tea recommendations on hand? :-)

katatonik (May 6, 17:50) #

best of all is greek/turkish (depending on where you shop) mountain tea, a.k.a. lemon sage. it's calming.

nusquam (May 7, 23:51) #

