I have a website with thousands of dynamic pages. I want to use the robots.txt file to disallow certain URL patterns corresponding to pages with duplicate content.
For example, I have a page for article itemA belonging to category catA/subcatA, with the URL:
/catA/subcatA/itemA
This is the URL that I want Google to index.
This article is also visible via tagging in various other places on the website. The URLs produced via tagging look like:
/tagA1/itemA
This URL I do NOT want Google to index. However, I do want all tag listings indexed:
/tagA1
So how can I achieve this? Disallow URLs containing a specific string followed by a '/'?
/tagA1/itemA - disallow
/tagA1 - allow
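
Here is what I'm imagining, as a minimal sketch for a single tag (tagA1 stands in for each real tag path, so presumably I'd need one Disallow line per tag):

User-agent: *
Disallow: /tagA1/

If I understand robots.txt prefix matching correctly, Disallow: /tagA1/ only matches URLs that continue past the trailing slash, so /tagA1/itemA would be blocked while /tagA1 itself and /catA/subcatA/itemA would stay crawlable. Is that right, and is there a pattern that covers all tags without listing each one explicitly?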