为什么 ping 域名及ip的响应时间差别很大

2016-10-27

出现问题的现象和下面的博文类似: ping-slow-by-hostname-not-by-ip

上面文章中提到了可能是由于反向域名解析(reverse dns) 延迟引起的 ping 域名时间特长, 事实上我们通过 strace 跟踪 ping -c 1 highdb.com 程序的调用情况如下显示:

socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("114.114.114.114")}, 16) = 0
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
sendto(4, "\273u\1\0\0\1\0\0\0\0\0\0\6highdb\3com\0\0\1\0\1",29, MSG_NOSIGNAL, NULL, 0) = 32
poll([{fd=4, events=POLLIN}], 1, 5000)  = 1 ([{fd=4, revents=POLLIN}])
ioctl(4, FIONREAD, [96])                = 0
recvfrom(4, "\273u\201\200\0\1\0\4\0\0\0\0\6highdb\3com\0\0\1\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("114.114.114.114")}, [16]) = 96
close(4) 
......
write(1, "PING highdb.com (85.90.244.13"..., 57PING highdb.com (85.90.244.138) 56(84) bytes of data.
......
......
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("114.114.114.114")}, 16) = 0
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}])
sendto(4, "}=\1\0\0\1\0\0\0\0\0\0\003203\0012\00259\003123\7in-add"..., 43, MSG_NOSIGNAL, NULL, 0) = 43
poll([{fd=4, events=POLLIN}], 1, 5000) = Timeout
......

ping 程序先做了正向的 dns 解析(从域名到ip)后, 打印出 ping 的结果, 后面又进行了一次反向解析(sendto 函数中的 ..in-add 字串), 为什么要进行一次反向解析, 我们从源码结构来看, 代码见: ping.c

401     case 'n':
402         options |= F_NUMERIC;
403         break;
......
......
576     if (inet_aton(target, &whereto.sin_addr) == 1) {
577         hostname = target;
578         if (argc == 1)
579             options |= F_NUMERIC;
......
......
1588    getnameinfo(sa, salen, address, sizeof address, NULL, 0, getnameinfo_flags | NI_NUMERICHOST);
1589    if (!exiting && !(options & F_NUMERIC))
1590        getnameinfo(sa, salen, name, sizeof name, NULL, 0, getnameinfo_flags);

从 1589 行代码来看, 默认情况下(不指定 -n 选项的情况下 if 里的组合条件为真)每次 ping 的时候都需要进行一次反向解析, getnameinfo() 函数即是反向解析.

另外在执行 ping 的时候, 使用 tcpdump 抓包出现以下信息(114.114.114.114 为本地使用的 dns):

13:43:43.248262 IP 10.0.21.5 -> 85.90.244.138: ICMP echo request, id 23321, seq 1, length 64
13:43:43.255232 IP 85.90.244.138 -> 10.0.21.5: ICMP echo reply, id 23321, seq 1, length 64
13:43:43.274894 IP 10.0.21.5 -> 85.90.244.138: ICMP 10.0.21.5 udp port 47570 unreachable, length 79

使用 wareshark 查看包的内容如下:

0000  7c c7 06 00 00 00 00 00 10 11 12 13 14 15 16 17   |...............
0010  18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27   ........ !"#$%&'
0020  28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37   ()*+,-./01234567

No.     Time                       Source                Destination           Protocol Length Info
      6 2016-10-26 12:37:46.482722 10.0.21.5              114.114.114.114       ICMP     113    Destination unreachable (Port unreachable)

Frame 6: 113 bytes on wire (904 bits), 113 bytes captured (904 bits)
Ethernet II, Src: Dell_66:4f:b1 (00:22:19:66:4f:b1), Dst: Cisco_9f:f3:1f (00:00:0c:9f:f3:1f)
Internet Protocol Version 4, Src: 10.0.21.5 (10.0.21.5), Dst: 114.114.114.114 (114.114.114.114)
Internet Control Message Protocol
    Type: 3 (Destination unreachable)
    Code: 3 (Port unreachable)
    Checksum: 0xea93 [correct]
    Internet Protocol Version 4, Src: 114.114.114.114 (114.114.114.114), Dst: 10.0.21.5 (10.0.21.5)
    User Datagram Protocol, Src Port: 53 (53), Dst Port: 42200 (42200)
    Domain Name System (response)
        Transaction ID: 0xc947
        Flags: 0x8182 Standard query response, Server failure
        Questions: 1
        Answer RRs: 0
        Authority RRs: 0
        Additional RRs: 0
        Queries
            138.244.90.85.in-addr.arpa: type PTR, class IN

138.244.90.85.in-addr.arpa 行即为反向解析.

综上来看, 如果反向解析出现延迟或超时, ping 的响应时间就会很长, 当然直接 ping ip 是不需要上述所说的两步解析操作, 因此会快上很多. 所以要解决 ping 域名慢的方法归根结底是要解决反向解析, 因此 ping 的 -n 选项禁止反向解析或将指定的域名配上反向解析记录都能满足这里的要求.