Sensors, Anomalies, and RPKI Refresh
October 2022 Dataplane.org Newsletter
With only three months left in 2022, the team here at Dataplane.org is hard at work on several fronts. We look forward to unveiling a number of projects and network enhancements in upcoming newsletters. Here we will show an updated visual of our sensor deployment, discuss a DNS query flood anomaly we observed, talk about our upcoming refresh of rpki.dataplane.org, and finish up with some organizational updates.
We discussed some of the lessons learned from low-cost providers in the April 2022 and September 2022 newsletters, and more details will come out in Bill’s presentation at CHI-NOG 10 later this week. While preparing the presentation, we took the time to update a graphical representation of our current sensor distribution around the globe. To provide data that has value, we aim to operate across a wide swath of networks wherever we can procure services (as long as geopolitical events do not hamper this ability). Our network has grown, and as it grows we are working to ensure the quality of our data persists.
Currently we are operating:
Over 300 sensors
In over 100 unique IPv4 /8 netblocks
On 6 continents
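For readers curious how a figure like the /8 count can be derived, here is a minimal sketch. It assumes a plain list of sensor addresses; the function name and the sample addresses (taken from documentation ranges) are hypothetical and do not reflect our actual inventory or tooling.

```python
import ipaddress

def unique_slash8s(addresses):
    """Return the set of distinct IPv4 /8 networks covering the given addresses."""
    blocks = set()
    for addr in addresses:
        ip = ipaddress.IPv4Address(addr)
        # strict=False masks off the host bits, yielding the enclosing /8
        blocks.add(ipaddress.ip_network(f"{ip}/8", strict=False))
    return blocks

# Hypothetical sample addresses, for illustration only
sample_sensors = ["192.0.2.1", "192.0.2.200", "198.51.100.7", "203.0.113.9"]
slash8_count = len(unique_slash8s(sample_sensors))
print(slash8_count)
```

The first two sample addresses share a /8, so they count once.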
We will continue to expand the breadth of our deployment over the next several months and plan to keep growing into 2023. If you find value in our signals and are interested in donating servers to our cause, or making a charitable donation to offset our operating costs, feel free to contact us.
DNS Query Anomalies
The sensors we use to observe network activity do not solicit incoming traffic. At least not intentionally. They exist simply to observe what an otherwise largely passive system attached to the Internet might see. We do not assign domain names to their IP addresses and they do not originate traffic apart from what is required for management and reporting. However, there are certain scenarios in which a sensor may end up attracting traffic. We detail one recent case here.
In September 2022, one of our sensors was assigned an address that turned out to have been previously used by a hosting provider’s domain name registration system. The prior use had residual effects that were immediately seen by our sensor. This address was associated with a default name server (NS) resource record (RR) for all unconfigured COM domain names registered through the hosting provider. We knew something interesting was happening when we saw an unusual spike in our DNS over TCP statistics.
Our sensor’s DNS module will suggest UDP queries be retransmitted with TCP. Rarely do unsolicited DNS queries switch from UDP to TCP since the sources we see are usually unsophisticated scanners or spoofed addresses. When we saw a dramatic rise in TCP-based queries we knew something had changed. We analyzed our DNS data by grouping TCP-based signals by sensor and spotted the source of the anomaly immediately. But why exactly was this happening?
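The grouping step described above can be sketched roughly as follows. This is an illustration under assumptions, not our actual pipeline: the record fields (`sensor`, `proto`), the function names, and the "ten times the median" threshold are all hypothetical.

```python
from collections import Counter
from statistics import median

def tcp_counts_by_sensor(signals):
    """Count TCP-based DNS signals per sensor."""
    return Counter(s["sensor"] for s in signals if s["proto"] == "tcp")

def outlier_sensors(counts, factor=10):
    """Flag sensors whose TCP count exceeds `factor` times the median count."""
    if not counts:
        return []
    # Floor the baseline at 1 so a median of 0 cannot flag every sensor
    baseline = max(median(counts.values()), 1)
    return sorted(name for name, n in counts.items() if n > factor * baseline)

# Hypothetical sample data: three sensors, one with a TCP spike
signals = (
    [{"sensor": "s1", "proto": "tcp"}] * 2
    + [{"sensor": "s2", "proto": "tcp"}] * 3
    + [{"sensor": "s3", "proto": "tcp"}] * 500
    + [{"sensor": "s1", "proto": "udp"}] * 40
)
counts = tcp_counts_by_sensor(signals)
print(outlier_sensors(counts))
```

Comparing against the median rather than the mean keeps a single noisy sensor from dragging the baseline up with it.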
For illustrative purposes we’ll denote the hosting provider’s domain as example.com, the default domain registration placeholder NS RR they used as CHANGEME.example.com, and our sensor IPv4 address as 192.0.2.1. In the COM zone there was a glue A RR mapping CHANGEME.example.com to 192.0.2.1. This glue record would be returned in replies to queries to the COM name servers for any of those unconfigured names from the hosting provider. Resolvers then use the glue record to decide where to look next for an authoritative answer. Those resolvers looked to us and our sensor. By our count, there were dozens of these “unconfigured” names, which resulted in us seeing approximately 300,000 to 700,000 additional queries per day.
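To make the referral chain concrete, here is a toy model of it. It is a simplified sketch, not a real resolver: the delegation tables, the "unconfigured" and "configured" domain names, and the function names are hypothetical, while CHANGEME.example.com and 192.0.2.1 are the placeholders used above.

```python
# Toy stand-in for the COM zone's delegation data (illustrative values only)
DELEGATIONS = {
    "unconfigured1.com.": ["CHANGEME.example.com."],
    "unconfigured2.com.": ["CHANGEME.example.com."],
    "configured.com.": ["ns1.configured.com."],
}
# Glue A records: name server name -> address handed out with the referral
GLUE = {
    "changeme.example.com.": "192.0.2.1",
    "ns1.configured.com.": "198.51.100.53",
}

def referral(qname):
    """Return (NS name, glue address) pairs a COM server would hand back."""
    ns_names = DELEGATIONS.get(qname.lower(), [])
    # DNS names are case-insensitive, so normalize before the glue lookup
    return [(ns, GLUE.get(ns.lower())) for ns in ns_names]

def next_hops(qname):
    """Addresses a resolver would query next, based on the glue it received."""
    return [addr for _, addr in referral(qname) if addr is not None]

for name in DELEGATIONS:
    print(name, "->", next_hops(name))
```

In this model every unconfigured name leads to the same address, which is why a single stale glue record can steer queries for dozens of domains to one host.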
Our DNS sensor module only returns negative answers so the effect was largely benign, but these unexpected queries resulted in what might be considered “pollution” of our DNS signal data. This is a matter for debate, but it underscores a point we often make: signals data reports are not block lists. They are records of observations made, which can be insightful, but often require additional context. If anyone was using DNS signal data for blocking purposes, chances are very good that they would have blocked legitimate, widely-used resolvers. If instead someone was using the data to better understand Internet activity, hopefully this serves as a public record of what was behind that recent spike in activity.
Technically we could have answered those queries, returning an address that would service web or email requests in an attempt to masquerade as the legitimate destination. In other words, we could have hijacked those domain names and collected a wealth of data, perhaps much to the chagrin of the domain name owners and those on the client end of the communications. We did not do this.
We eventually reached out to the hosting provider summarizing the problem. They responded quickly and updated their glue record to point to another address.
We have plans to provide more DNS signals and capabilities in the coming months, far beyond what we observe from sensors. We hope this case of domain “inconsistency” is a useful foreshadowing of the kinds of things we’re working towards.
Refresh of rpki.dataplane.org Planned
Our RPKI publication point (PP) and certificate authority (CA) setup has been experiencing problems. We believe these troubles stem from a ROA publication experiment that has pushed the system beyond its limits. In a nutshell, as part of jtk’s PhD research work in RPKI, frequent and numerous ROAs were automatically published over the course of many months. This work was to help understand ROA to RP propagation behavior. That experiment has concluded and we recently halted the creation of new ROAs. We want to remove all these extraneous ROAs, but it appears we cannot do so through the software’s user interface (UI) or command line tools. CPU usage spikes and the attempts to remove them usually time out. We’ve found no easy way to debug or fix the current system so we’re going to rebuild it from scratch. This means some downtime will be incurred, which may be reflected in the RPKI stats. We’ll also take this opportunity to upgrade the software we’ve been using.
We have been in contact with the software developer to see if there is a fundamental scaling problem we may have run into with our unusual experiment. We’ve found the problem too difficult to troubleshoot on our own so we’ve relayed our system data to them for evaluation. In the meantime, we’re going to move on and clean up our portion of the RPKI repository. We do not anticipate running any experiments like this involving rpki.dataplane.org in the future. Instead, we hope to focus on the simpler and safer matter of RP monitoring and PP measurement. We’ve not yet scheduled the rebuild, but we expect to perform this work sometime in the next few weeks.
As we explore upcoming projects, we are working with both our legal and tax partners to determine whether it may be necessary to have a for-profit entity affiliated with Dataplane.org to handle certain operating expenses and potential revenue streams, and to ensure we do not lose our tax exemption as a charitable organization. This will not impact our mission, signals, or ability to provide educational services to the public and data for research purposes, but we consider transparency a crucial part of our DNA. Once this process reaches a conclusion, we will share it with our community in upcoming newsletters. We are also nearly finished with the logistical setup of our banking partners to allow charitable donations to be received through various channels.
We welcome feedback on any items covered in our update or suggestions for improvement.
Feel free to reach out via email, Twitter, or Slack (request an invite if you need one).