In a previous position, I encountered a rather strange problem that I would like to share here. It has to do with DNS resolution: from the company's internal network, it was not possible to resolve the host name www.banks2ifrs.ru. to its IP address.
We determined that the name servers managing the banks2ifrs.ru. domain are ns3.nic.ru. and ns4.nic.ru. At the same time, we saw that the name servers managing the fbk.ru. domain are ns3.nic.ru. and gw.fbk.ru. Interestingly, it was possible to resolve www.fbk.ru. using gw.fbk.ru. but not using ns3.nic.ru.
Moreover, we noted that the reverse resolution of the name servers ns3.nic.ru. and ns4.nic.ru. is not consistent: their addresses translate back to ns3.ripn.net. and ns4.ripn.net. respectively. It is also worth mentioning that nothing went wrong when querying the name server gw.fbk.ru.
# host -t a ns3.nic.ru.
ns3.nic.ru has address 194.85.61.20
# host -t a ns4.nic.ru.
ns4.nic.ru has address 194.226.96.8
# host -t ptr 194.85.61.20
20.61.85.194.in-addr.arpa domain name pointer ns3.ripn.net.
# host -t ptr 194.226.96.8
8.96.226.194.in-addr.arpa domain name pointer ns4.ripn.net.
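The reverse lookups above query a name derived from the address (octets reversed, under in-addr.arpa) and then compare the PTR target with the forward name. As a minimal sketch of that check, assuming Python 3 and its standard `ipaddress` module, the helper names `reverse_name` and `same_host` below are hypothetical and only illustrate the comparison; they do not perform any network query.

```python
import ipaddress


def reverse_name(ip: str) -> str:
    """Return the name queried for a PTR lookup of the given address."""
    return ipaddress.ip_address(ip).reverse_pointer


def same_host(forward: str, ptr_target: str) -> bool:
    """Compare two host names, ignoring case and the trailing root dot."""
    return forward.rstrip(".").lower() == ptr_target.rstrip(".").lower()


# Forward name, address and PTR target, copied from the host output above.
observed = [
    ("ns3.nic.ru.", "194.85.61.20", "ns3.ripn.net."),
    ("ns4.nic.ru.", "194.226.96.8", "ns4.ripn.net."),
]

for forward, ip, ptr_target in observed:
    status = "match" if same_host(forward, ptr_target) else "mismatch"
    print(f"{forward} {ip} {reverse_name(ip)} {ptr_target} {status}")
```

A mismatch here only says that the forward and reverse zones name the host differently; it does not by itself explain a failed lookup.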
The error from the DNS query seems to be related to an incomplete answer from the server (the truncated flag, TC, was set to 1 in the network trace) when the query is made over UDP. In this case, the resolver must automatically fall back to TCP, which was most likely prohibited by the company's network security policy. This may also mean that the answer is longer than 512 bytes. So we tried advertising different sizes for the UDP message buffer, but without being confident that this option went through the network devices properly. Nonetheless, it would seem curious to get an answer longer than 120 bytes.
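To make the mechanics above more concrete, here is a minimal sketch, assuming Python 3 and only the standard `struct` module, of the three pieces involved: the TC bit in the DNS header, the EDNS0 OPT record that advertises a UDP buffer size, and the two-byte length prefix used when the same message is sent over TCP. The function names and the query id are hypothetical, and nothing is sent on the network; this is not the tool we used at the time.

```python
import struct
from typing import Optional

TC_MASK = 0x0200  # TC ("truncated") bit in the 16-bit flags word of the header
RD_MASK = 0x0100  # RD ("recursion desired") bit
TYPE_A, TYPE_OPT, CLASS_IN = 1, 41, 1


def encode_name(name: str) -> bytes:
    """Encode a host name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in (part for part in name.split(".") if part):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qid: int = 0x1234,
                udp_size: Optional[int] = None) -> bytes:
    """Build an A query; if udp_size is given, append an EDNS0 OPT record."""
    arcount = 0 if udp_size is None else 1
    header = struct.pack("!HHHHHH", qid, RD_MASK, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", TYPE_A, CLASS_IN)
    opt = b""
    if udp_size is not None:
        # OPT pseudo-record: root name, type 41, the CLASS field carries the
        # advertised UDP size, the TTL field is 0, and the RDATA is empty.
        opt = struct.pack("!BHHIH", 0, TYPE_OPT, udp_size, 0, 0)
    return header + question + opt


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_MASK)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte length, as required over TCP."""
    return struct.pack("!H", len(message)) + message


plain = build_query("www.banks2ifrs.ru.")
with_edns = build_query("www.banks2ifrs.ru.", udp_size=4096)
print(len(plain), len(with_edns), is_truncated(plain))
```

A client that receives a UDP answer for which `is_truncated` is true is expected to repeat the same query over TCP, which is exactly the step a firewall blocking TCP port 53 would break.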
Last, we can note that the components of a complex network layout (DMZ, firewalls, NAT, etc.) may interact badly and hamper DNS queries, at least in certain circumstances.
After further investigation, the network team decided to permit DNS queries over TCP. And it worked. It worked when letting the internal DNS servers do the job themselves...
# dig +trace -t a www.banks2ifrs.ru.
; <<>> DiG 9.3.4-P1 <<>> +trace www.banks2ifrs.ru.
;; global options: printcmd
. 449798 IN NS L.ROOT-SERVERS.NET.
. 449798 IN NS M.ROOT-SERVERS.NET.
. 449798 IN NS A.ROOT-SERVERS.NET.
. 449798 IN NS B.ROOT-SERVERS.NET.
. 449798 IN NS C.ROOT-SERVERS.NET.
. 449798 IN NS D.ROOT-SERVERS.NET.
. 449798 IN NS E.ROOT-SERVERS.NET.
. 449798 IN NS F.ROOT-SERVERS.NET.
. 449798 IN NS G.ROOT-SERVERS.NET.
. 449798 IN NS H.ROOT-SERVERS.NET.
. 449798 IN NS I.ROOT-SERVERS.NET.
. 449798 IN NS J.ROOT-SERVERS.NET.
. 449798 IN NS K.ROOT-SERVERS.NET.
;; Received 512 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
ru. 172800 IN NS ns.ripn.net.
ru. 172800 IN NS ns2.nic.fr.
ru. 172800 IN NS ns2.ripn.net.
ru. 172800 IN NS ns5.msk-ix.net.
ru. 172800 IN NS ns9.ripn.net.
ru. 172800 IN NS sunic.sunet.se.
;; Received 297 bytes from 199.7.83.42#53(L.ROOT-SERVERS.NET) in 125 ms
banks2ifrs.ru. 345600 IN NS ns4.nic.ru.
banks2ifrs.ru. 345600 IN NS ns3.nic.ru.
;; Received 107 bytes from 194.85.105.17#53(ns.ripn.net) in 66 ms
www.banks2ifrs.ru. 86400 IN A 83.222.6.194
banks2ifrs.ru. 86400 IN NS ns4.nic.ru.
banks2ifrs.ru. 86400 IN NS ns3.nic.ru.
;; Received 91 bytes from 194.226.96.8#53(ns4.nic.ru) in 65 ms
... and it worked when directly querying the name servers responsible for the wanted domain:
# dig @ns4.nic.ru. -t a www.banks2ifrs.ru.
; <<>> DiG 9.3.4-P1 <<>> @ns4.nic.ru. -t a www.banks2ifrs.ru.
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1530
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0
;; QUESTION SECTION:
;www.banks2ifrs.ru. IN A
;; ANSWER SECTION:
www.banks2ifrs.ru. 86400 IN A 83.222.6.194
;; AUTHORITY SECTION:
banks2ifrs.ru. 86400 IN NS ns4.nic.ru.
banks2ifrs.ru. 86400 IN NS ns3.nic.ru.
;; Query time: 65 msec
;; SERVER: 194.226.96.8#53(194.226.96.8)
;; WHEN: Tue Apr 15 21:05:12 2008
;; MSG SIZE rcvd: 91
Note: The answer is 91 bytes long, so nothing is wrong on this side.
I think we will never know exactly what was going wrong here, even though the heart of the problem seems to be specific to these same two name servers.