Google indexed the same page under two URLs (despite rel-canonical)

Posted by unor on Pro Webmasters
Published on 2013-11-02T15:58:09Z Indexed on 2013/11/02 16:02 UTC

The Super User question "Playing mp3 in quodlibet displays “GStreamer output pipeline could not be initialized” error" is indexed under two URLs in Google:

  • http://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia
  • http://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia/652058

The first is the canonical URL; both pages include the corresponding rel-canonical link:

<link rel="canonical" href="http://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia" />
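One way to confirm what a page declares as canonical is to parse its markup and read the link element. The following is only a minimal sketch using Python's standard library; the names CanonicalFinder and find_canonical are made up for illustration, and the snippet string stands in for the full page source, which would have to be fetched separately.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Hypothetical helper: remembers the first <link rel="canonical"> href."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        attr = dict(attrs)
        # rel may contain several space-separated tokens.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the markup, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Stand-in for the page source: the link element quoted in the question.
snippet = (
    '<link rel="canonical" href="http://superuser.com/questions/651591/'
    'playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia" />'
)
canonical = find_canonical(snippet)
print(canonical)
```

Running the same function over the source of both URLs should return the same value if both pages really carry the same rel-canonical.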

Google also indexed http://superuser.com/a/652058, which redirects to the answer:

http://superuser.com/questions/651591/playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia/652058#652058

Now, the second URL from above is the same as this one minus the fragment #652058.

So Google seems to strip the fragment, which yields the same page under another URL (one with the answer ID /652058 as a path suffix), and indexes that URL too, despite the rel-canonical and the duplicate content.
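The URL comparison can be reproduced with Python's standard library. This is only a sketch of the string relationship between the URLs from the question; it says nothing about how Google actually processes them, and the variable names are chosen here for illustration.

```python
from urllib.parse import urldefrag

# URLs taken from the question.
base = (
    "http://superuser.com/questions/651591/"
    "playing-mp3-in-quodlibet-displays-gstreamer-output-pipeline-could-not-be-initia"
)
redirect_target = base + "/652058#652058"  # where /a/652058 redirects to
second_indexed = base + "/652058"          # the second URL found in the index

# urldefrag splits a URL into the part before "#" and the fragment itself.
stripped, fragment = urldefrag(redirect_target)

print(stripped == second_indexed)  # redirect target minus fragment vs. second indexed URL
print(stripped == base)            # the answer ID is still in the path, so not the canonical URL
```

Dropping the fragment leaves the answer ID in the path, so the result differs from the canonical URL even though both addresses serve the same page.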

Shouldn’t Google recognize this and only index the canonical variant?

And why does Stack Exchange include the answer ID in the URL path rather than only in the fragment, which results in several URL variants for the same page?

© Pro Webmasters or respective owner
