Google indexing pages with #! although we don't have any
- by Benjamin Gruenbaum
Our company has developed a Single Page Application using AngularJS and its routing. Google indexed the JavaScript version of our site decently, but some pages were not indexed well, so we have developed an HTML-only version.
We have followed the Ajax Crawling Specification posted here, and our pages have a <meta name='fragment' content='!'> tag and canonical URLs. We expect http://www.example.com/foo/bar to be fetched from http://www.example.com/?_escaped_fragment_=/foo/bar.
However, since we rolled out the AJAX specification, all of our pages are indexed twice: once with the JavaScript version as http://www.example.com/foo/bar and once with the new version as http://www.example.com/#!/foo/bar. This is harmful to us because it is duplicate content, and it also misrepresents our site.
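For reference, here is a minimal Python sketch of how the AJAX crawling scheme relates the two URL forms, based on a reading of the specification rather than on Google's actual implementation. The function names pretty_to_ugly and ugly_to_pretty are hypothetical, and the escaping is simplified (the specification escapes a narrower set of characters than urllib's quote does).

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit

PARAM = "_escaped_fragment_"


def pretty_to_ugly(url: str) -> str:
    """Return the URL a crawler following the scheme would request."""
    parts = urlsplit(url)
    # "#!value" becomes "_escaped_fragment_=value". A page that only opts
    # in through the meta tag has no "#!" part, so the value is empty.
    value = parts.fragment[1:] if parts.fragment.startswith("!") else ""
    # Assumption: quote() is a simplified stand-in for the spec's escaping.
    pair = PARAM + "=" + quote(value, safe="/")
    query = parts.query + "&" + pair if parts.query else pair
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def ugly_to_pretty(url: str) -> str:
    """Return the URL that would be associated with a crawled URL."""
    parts = urlsplit(url)
    value = None
    kept = []
    for key, val in parse_qsl(parts.query, keep_blank_values=True):
        if key == PARAM:
            value = val
        else:
            kept.append((key, val))
    if value is None:
        return url  # not an _escaped_fragment_ URL, leave it untouched
    # A non-empty value maps back to a "#!" fragment; an empty one to none.
    fragment = "!" + value if value else ""
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), fragment)
    )


print(ugly_to_pretty("http://www.example.com/?_escaped_fragment_=/foo/bar"))
print(pretty_to_ugly("http://www.example.com/foo/bar"))
```

Under this reading, a non-empty _escaped_fragment_ value on the root URL corresponds to the #! form http://www.example.com/#!/foo/bar, while a page without a hash fragment that carries the meta tag would be requested as http://www.example.com/foo/bar?_escaped_fragment_= with an empty value.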
I have tried looking for similar questions here and in the Google product forum but could not come up with anything.