Friends, benefactors, and beneficiaries, we welcome you to the third edition of the Dataplane.org newsletter. The newsletter is published on the first Monday of every month. In this edition we look into our DNS signal data, recently enhanced with a new sensor module. We highlight what this data represents and why unsolicited queries may not be the threat intelligence you thought they were. Next up is a summary of the IPv6-based traffic our sensor modules are reporting. The community consensus would suggest there is little to no unsolicited IPv6 activity to be seen since the address space is impractically large to survey. Is that right? Lastly, we conclude with a brief summary of the technical work we performed since last month and, as always, an organizational update.
DNS Signal Data
Of all signal data we currently make freely available, the DNS signals are among the least understood. The source IP addresses associated with unsolicited DNS queries are often NOT a potential threat. Most of the DNS signal data is derived from unsolicited DNS queries that are carried in a single UDP message, and as many of you will surely know, these can be source address spoofed. Our DNS sensors do not provide a positive response, so we would not expect them to be used for amplification/reflection attacks. However, even though we will never amplify a response from a query and we do not associate any domain names with sensor IP addresses, we see a surprising number of source address spoofed queries. The odds of a query being source address spoofed are approximately 66% based on our recent data! We have verified that this is not specific to our sensors. Spoofed DNS queries are being sent to seemingly random IPv4 addresses all over the Internet.
Approximately a year ago it was brought to our attention that the DNS signals data was being used in automated systems and the IP addresses being reported were being blocked by some operators. All of the reports have text indicating they are not block lists. We contemplated removing the DNS data from public download, but instead have embarked on an effort to educate and market the data more effectively. We recently deployed a new DNS module that will provide us with additional context that enhances our DNS signal data analysis capabilities. For example, we are now publishing the DNS TCP signal report that will help distinguish between source addresses that are spoofed and those that are not. Above we provide a glimpse of top query names our sensors have seen in 2021 and in the past week. This data helps us understand the scope and extent of amplification/reflection attacks using DNS, even when we are neither the victim nor operating vulnerable open resolvers or forwarders.
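The distinction the TCP report makes possible can be sketched in a few lines of Python. This is only an illustration, not the format of any Dataplane.org report: the (source address, transport) tuples and the classify_sources name are hypothetical, and the sketch assumes a TCP event is only recorded after a completed handshake, which a blindly spoofed source cannot normally achieve.

```python
from collections import defaultdict


def classify_sources(events):
    """Label each source address by the transports it was seen on.

    `events` is an iterable of (source_ip, transport) tuples; this input
    shape is an assumption made for the example.
    """
    transports = defaultdict(set)
    for source_ip, transport in events:
        transports[source_ip].add(transport.lower())

    labels = {}
    for source_ip, seen in transports.items():
        # Seen over TCP: the address answered a handshake, so it is
        # unlikely to be spoofed. UDP only: nothing confirms the address.
        labels[source_ip] = "confirmed" if "tcp" in seen else "unverified"
    return labels


if __name__ == "__main__":
    sample = [
        ("192.0.2.1", "udp"),
        ("192.0.2.1", "tcp"),
        ("198.51.100.7", "udp"),
    ]
    for address, label in sorted(classify_sources(sample).items()):
        print(address, label)
```

An "unverified" label does not mean the address was spoofed, only that a UDP-only observation gives no evidence either way, which is why such addresses make poor block list entries.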
IPv6-specific Signal Data
A colleague was surprised to find IPv6 traffic to a web honeypot, which raised the question of how this was possible if IPv6 enumeration is practically infeasible. There may be a number of explanations as to how an IPv6 client found the website. Perhaps a DNS entry exists that was discovered in a zone transfer or zone walk? Perhaps the last 64 bits were not very random (e.g. 2001:db8::1)? Perhaps the honeypot originated some IPv6 traffic that ended up in logs or a monitoring system, which was then used by a web crawler? Whatever the reason, we thought this was a good opportunity to take a look at what IPv6 traffic our sensors see and, if any, how much of it there is. The short answer is that they see some, but not much. The volume of IPv6 traffic over the past year has been so small in comparison to IPv4 that it isn’t even worth trying to graph it. Instead we’ll summarize the approximate number of events we saw for each of our primary sensor services in the past 12-14 months:
Service    | IPv6 events (approximate)
-----------+--------------------------
DNS        | 1800
SIP        | 300
SMTP       | 0
SSHclient  | 500
SSHpwauth  | 0
TELNET     | 10
VNC        | 0
Those values amount to a small fraction of a percent of the corresponding IPv4 event counts. Even as anomalies, we don’t find sensor IPv6 data all that interesting to explore further at this time. They might be most useful for us to monitor internally as early warning signs that a sensor’s IPv6 address has been discovered or that someone is starting to find the more easily guessed addresses. Otherwise, there are other types of IPv6 signals we think are more useful and interesting. Our IP protocol 41 signal report is a good example. We also have some authoritative DNS server projects on the horizon that we expect will help shed more light on the IPv6-enabled Internet.
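To make "easily guessed" concrete, here is a small Python sketch that flags IPv6 addresses whose last 64 bits hold only a small number, as in the 2001:db8::1 example. The helper names and the 16-bit threshold are assumptions chosen for illustration; this is a rough heuristic, not a description of how anyone actually scans.

```python
import ipaddress

LOW_64_MASK = (1 << 64) - 1


def interface_id(address: str) -> int:
    """Return the last 64 bits of an IPv6 address as an integer."""
    return int(ipaddress.IPv6Address(address)) & LOW_64_MASK


def looks_guessable(address: str, max_bits: int = 16) -> bool:
    """True when the last 64 bits fit in `max_bits` bits (e.g. ::1, ::53).

    The threshold is a hypothetical cut-off: 16 bits means at most 65536
    candidates per /64, which is trivial to enumerate.
    """
    return interface_id(address).bit_length() <= max_bits


if __name__ == "__main__":
    for candidate in ("2001:db8::1", "2001:db8::1a2b:3c4d:5e6f:7081"):
        print(candidate, looks_guessable(candidate))
```

A check like this only covers small numeric identifiers. Addresses that embed a MAC address or an IPv4 address follow other predictable patterns that it does not detect.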
The State of Dataplane.org
We rolled out three improved sensor modules this past month. Two of those involved changes to the handling of UTF-8 encoded strings in our SIP and TELNET code. The processing of their UTF-8 event data was problematic under our old versions. We also deployed a completely new DNS module as was briefly mentioned above. The new module captures additional DNS message data we thought might be useful for future analysis. We now collect practically all the DNS header fields, including EDNS(0) fields such as the DO bit and message size option. We also wanted a way to respond to any UDP-based query with the TC bit set to signal a client that it should retry using TCP. We hope to perform more detailed analysis with all this additional DNS data in the future thanks to these changes.
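For readers unfamiliar with the TC mechanism, the following Python sketch builds a minimal truncated reply from a raw UDP query. It is not the code our sensor module runs; the function names are made up for this example, and it assumes a well-formed query carrying exactly one question. The reply echoes the query ID and question, sets the QR and TC flags, and carries no records, so it is never larger than the query.

```python
import struct

QR_FLAG = 0x8000        # message is a response
TC_FLAG = 0x0200        # message was truncated, retry over TCP
KEEP_MASK = 0x7900      # opcode and RD bits copied from the query
HEADER_LEN = 12


def question_end(message: bytes) -> int:
    """Return the offset just past the first question (name, type, class)."""
    offset = HEADER_LEN
    while True:
        if offset >= len(message):
            raise ValueError("question name runs past end of message")
        length = message[offset]
        if length & 0xC0:
            raise ValueError("unexpected compression pointer in question")
        offset += 1 + length
        if length == 0:
            break
    end = offset + 4  # QTYPE and QCLASS, two bytes each
    if end > len(message):
        raise ValueError("question is missing QTYPE/QCLASS")
    return end


def truncated_reply(query: bytes) -> bytes:
    """Build a reply with QR and TC set, one question, and no records."""
    if len(query) < HEADER_LEN:
        raise ValueError("message shorter than the DNS header")
    message_id, flags, qdcount = struct.unpack("!HHH", query[:6])
    if qdcount != 1:
        raise ValueError("this sketch only handles single-question queries")
    reply_flags = (flags & KEEP_MASK) | QR_FLAG | TC_FLAG
    header = struct.pack("!HHHHHH", message_id, reply_flags, 1, 0, 0, 0)
    return header + query[HEADER_LEN:question_end(query)]
```

A client that honors the TC bit will repeat the query over TCP, where completing the handshake shows that the source address can actually receive traffic.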
As noted earlier, we began publishing a new DNS over TCP signals report this past month. We feel good about the current set of sensor modules and signals data based on our sensor network. We occasionally have requests to produce reports for other services such as HTTP and RDP. However, for the immediate future we are going to focus on other types of signal data. For example, we would like to deploy a traffic flow (i.e. IPFIX) collector on the sensors. We think this will give us a broader and more versatile view of Internet activity from sensor observations despite the limited application fidelity flow-based monitoring provides.
On the administration front, this newsletter will mark the official beginning of our journey towards not-for-profit status. We have retained an attorney and the first meeting to move forward with appropriate filings is scheduled to take place the day this newsletter is published.
Lastly, as we alluded to last week, we have begun to invite people to our Slack workspace. If you have not yet received an invite, but would like one, feel free to reach out to me or Matt at info [at] dataplane [dot] org. It is a low volume communications path to discuss anything related to Dataplane.org or adjacent to our work and data.