How do I return a proper 404 to Google while still showing user-friendly content to the user?
- by Marek
I am bouncing between posting this here and on Superuser. Please excuse me if you feel this does not belong here.
I am observing the behavior described here: Googlebot is requesting random URLs on my site, like aecgeqfx.html or sutwjemebk.html. I am sure that I am not linking to these URLs from anywhere on my site.
I suspect this may be Google probing how we handle nonexistent content. To cite from an answer to the linked question:
[google is requesting random urls to] see if your site correctly
handles non-existent files (by returning a 404 response header)
We have a custom page for nonexistent content: a styled page saying "Content not found, if you believe you got here by error, please contact us", with a few internal links, served (naturally) with a 200 OK. The page is served directly at the requested URL (there is no redirect to a single error URL).
I am afraid this may count against the site with Google: they may not interpret the user-friendly page as a 404 Not Found, and may conclude that we are trying to fake something or are serving duplicate content.
How should I proceed so that Google does not consider the site bogus, while users who click a dead link by accident still get a user-friendly message?
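
To make the setup concrete, here is a minimal sketch of the behavior in question, written with Python's standard `http.server` purely for illustration. It is not taken from the site: `KNOWN_PATHS`, `build_response`, `Handler`, and `serve` are hypothetical names, and the HTML is a stand-in for the styled page. It shows the same friendly body being sent at the requested URL, but with a 404 status instead of 200 OK.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical set of paths that really exist on the site.
KNOWN_PATHS = {"/", "/index.html"}

# Stand-in for the styled "not found" page with a few internal links.
NOT_FOUND_HTML = (
    "<html><body><h1>Content not found</h1>"
    "<p>If you believe you got here by error, please contact us.</p>"
    '<p><a href="/">Home</a></p></body></html>'
)


def build_response(path):
    """Return a (status, body) pair for the given request path."""
    if path in KNOWN_PATHS:
        return 200, "<html><body>Real content</body></html>"
    # Same friendly page, but the status line says 404 rather than 200.
    return 404, NOT_FOUND_HTML


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Start a local test server (assumed port; blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting a random path such as `/aecgeqfx.html` should show the friendly page in a browser while the response status is 404. Whether that alone is enough for Google is exactly what I am asking.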