[LRUG] mod_ruby question

Graham Seaman graham at theseamans.net
Fri Mar 9 06:46:44 PST 2007

Matthew Westcott wrote:
> On 9 Mar 2007, at 14:08, Graham Seaman wrote:
>> Dominic Mitchell wrote:
>>> I believe that's actually incorrect... HEAD should replicate a GET
>>> request *exactly*, including the Content-Length for the non-existent
>>> content.
>>>    http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.4
>> Thanks; of course, you're right - and when I looked, a GET isn't
>> returning the Content-Length either.
> As far as I'm aware, it's entirely standard practice (and permitted
> by the HTTP spec) to omit the content-length header on dynamic pages,
> on both HEADs and GETs. Going back to the original question - what
> oddities are you seeing with Googlebot / Yahoo that lead you to
> suspect you need a content-length?
Repeated 'HEAD' requests on the index page (several a day over about a
week since I submitted the site), never followed by a GET or any attempt
to spider. In the case of Yahoo this has ended up with Slurp
apparently just giving up and going away; with Google I later added a
sitemap, and Google is now reading the files from the sitemap, but I
suspect it might not have done so without it.

Of course I don't know if it's lack of info in the headers that is
causing this; it may just be current behaviour (e.g. check the site is in
continuous existence for a few weeks before attempting to index), but it
seemed a plausible guess. It's been several years since I last tried
to get a site indexed and I'm sure indexer behaviour has changed a lot
in the meantime. However, having got this far, just for my own
satisfaction I would prefer to see a more complete set of headers
returned, or at least be sure I know how to do it if needed.
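
For what it's worth, a minimal sketch of one way to get a Content-Length
onto a dynamic page, assuming the whole page can be rendered into a string
before any header goes out. build_response is a hypothetical helper in
plain Ruby, not a mod_ruby call, and the Content-Type value and sample page
are made up for illustration; how the headers are then handed to Apache is
left out.

```ruby
# Hypothetical helper (not part of mod_ruby): takes the fully rendered page
# as a String, so its size is known before any header is sent.
def build_response(request_method, body)
  headers = {
    "Content-Type"   => "text/html; charset=utf-8", # assumed for the example
    # Content-Length counts bytes, not characters.
    "Content-Length" => body.bytesize.to_s
  }
  # A HEAD response carries the same headers as the GET would, but no body.
  sent_body = request_method == "HEAD" ? "" : body
  [headers, sent_body]
end

# Made-up page with one non-ASCII character, so bytes and characters differ.
page = "<html><body>Caff\u00e8</body></html>"

get_headers,  get_body  = build_response("GET",  page)
head_headers, head_body = build_response("HEAD", page)
```

Here the HEAD and GET headers come out identical and only the body
differs, which is what RFC 2616 section 9.4 asks for. The cost is
buffering the whole page in memory rather than streaming it.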

The site is http://www.abruzzocasa.eu/penne if you can see anything odd.



> - Matt
> _______________________________________________
> chat mailing list
> chat at lrug.org
> http://lists.lrug.org/listinfo.cgi/chat-lrug.org
