I've encountered a strange issue with our internal Windows DNS infrastructure. We have a website hosted on Amazon EC2 with DNS running on Amazon Route 53. In the public-facing DNS we have a wildcard record set up as an A record Alias pointing to an AWS Elastic Load Balancer sitting in front of our EC2 instances. For those who are not aware, an A record Alias behaves like a CNAME record, but no extra lookup is required on the client side (see http://docs.amazonwebservices.com/Route53/latest/DeveloperGuide/CreatingAliasRRSets.html for more information). We have a secondary domain whose www subdomain is a CNAME pointing to a subdomain on the primary domain, which resolves against the wildcard entry. For example, www.secondary.com is a CNAME to sub1.primary.com, but there is no explicit entry for sub1.primary.com, so it resolves to the wildcard record. This setup works without issue publicly.
The issue arises in the internal DNS at our corporate office, where we use the same primary domain for some internal-only sites. In this setup we have two Active Directory DNS servers, one running Server 2003 and one running Server 2008 R2. The zone is an AD-integrated zone, but it is not the AD domain. In the internal DNS the wildcard record points to a third external domain, which is also hosted on Route 53 with an A record Alias pointing to the same ELB instance. For example, *.primary.com is a CNAME to tertiary.com, so in effect www.secondary.com is a CNAME to sub1.primary.com, which matches *.primary.com, which is in turn a CNAME to tertiary.com. In this setup, attempting to resolve www.secondary.com fails. Clearing the cache on the Server 2003 instance allows it to resolve once, but subsequent attempts fail. Against the 2008 R2 server it fails even with a clean cache. It seems that only Windows clients are affected; a Mac running OS X Mountain Lion does not experience this issue.
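To make the intended chain concrete, here is a minimal Python sketch of how I expect the internal lookup to behave. It is only a model: the RECORDS table is hypothetical data mirroring the example names above, the wildcard matching is simplified to swapping the leftmost label for "*", and none of it reflects how the Windows DNS server is actually implemented.

```python
# Hypothetical record data mirroring the internal setup described above.
RECORDS = {
    "www.secondary.com": ("CNAME", "sub1.primary.com"),
    "*.primary.com": ("CNAME", "tertiary.com"),
    "tertiary.com": ("A", "x.x.x.x"),  # placeholder for the ELB address
}


def lookup(name):
    """Return the record for name, falling back to a wildcard at the parent."""
    if name in RECORDS:
        return RECORDS[name]
    labels = name.split(".")
    # Simplified wildcard matching: swap the leftmost label for "*".
    wildcard = ".".join(["*"] + labels[1:])
    return RECORDS.get(wildcard)


def resolve(name, max_hops=8):
    """Follow CNAMEs until an A record is found; return (chain, address)."""
    chain = [name]
    for _ in range(max_hops):
        record = lookup(chain[-1])
        if record is None:
            return chain, None  # dead end: no record and no wildcard match
        rtype, value = record
        if rtype == "A":
            return chain, value
        chain.append(value)  # CNAME: continue with the target name
    return chain, None  # gave up after too many hops


chain, address = resolve("www.secondary.com")
print(" -> ".join(chain), "=>", address)
```

The expected result is the chain www.secondary.com -> sub1.primary.com -> tertiary.com ending in an address, which is what the public setup and the Mac client effectively produce.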
I'm even able to replicate the issue using nslookup.
Against the 2003 server, with a freshly cleared cache, I receive the appropriate response for www.secondary.com:
Non-authoritative answer:
Name: subdomain.primary.com
Address: x.x.x.x (Public IP)
Aliases: www.secondary.com
Subsequent checks simply return:
Non-authoritative answer:
Name: www.secondary.com
If you set the query type to CNAME, you get the appropriate response every time. www.secondary.com gives you:
Non-authoritative answer:
www.secondary.com canonical name = subdomain.primary.com
And subdomain.primary.com gives you:
subdomain.primary.com canonical name = tertiary.com
And setting type back to A gives you the appropriate response for tertiary.com:
Non-authoritative answer:
Name: tertiary.com
Address: x.x.x.x (Public IP)
Against the 2008 R2 server things are a little different. Even with a clean cache, www.secondary.com returns just:
Non-authoritative answer:
Name: www.secondary.com
The CNAME records are returned appropriately. www.secondary.com returns:
Non-authoritative answer:
www.secondary.com canonical name = subdomain.primary.com
And subdomain.primary.com gives you:
subdomain.primary.com canonical name = tertiary.com
tertiary.com internet address = x.x.x.x (Public IP)
tertiary.com AAAA IPv6 address = x::x (Public IPv6)
And setting type back to A gives you the appropriate response for tertiary.com:
Non-authoritative answer:
Name: tertiary.com
Address: x.x.x.x (Public IP)
Requests directly against subdomain.primary.com work correctly.
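For anyone who wants to check a batch of names for the empty answers shown above, here is a rough Python sketch that scans captured nslookup output for an answer section that has a Name line but no address. It assumes the Windows nslookup layout pasted in this post; the function name and sample strings are my own, and continuation lines of a multi-line "Addresses:" block are not handled.

```python
def answer_addresses(nslookup_output):
    """Collect addresses listed after the first 'Name:' line of nslookup output."""
    addresses = []
    seen_name = False
    for line in nslookup_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Name:"):
            # Anything before this belongs to the server header, not the answer.
            seen_name = True
        elif seen_name and stripped.startswith(("Address:", "Addresses:")):
            addresses.append(stripped.split(":", 1)[1].strip())
    return addresses


# Sample text modeled on the responses in this post (addresses are placeholders).
good = """Non-authoritative answer:
Name: subdomain.primary.com
Address: x.x.x.x
Aliases: www.secondary.com"""

empty = """Non-authoritative answer:
Name: www.secondary.com"""

for label, text in (("first lookup", good), ("later lookup", empty)):
    found = answer_addresses(text)
    print(label, "->", found if found else "no address returned")
```

The output could be captured with something like subprocess.run(["nslookup", name, server], capture_output=True, text=True), but I have left that out so the sketch runs without touching the network.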