Friends, this month brings no shortage of activities to report on. Much of our day-to-day work takes place out of view of our web pages, social media, and other communications channels. In our first newsletter we listed four technical TODOs. Before today, all of those but one had been completed. This month we are happy to announce a preview of that fourth, an API-based search on our data. We also introduce new summary statistics pages for all of our published signal data. Many have taken an interest in our systems and observed signal data across the Eurasia region. We briefly summarize our presence there and share some thoughts inspired by recent events that may shape what we do in the coming years. Lastly, as always, we close with an organizational update, including an introduction to the newest addition to the leadership team.
Dataplane.org API Technical Preview
We are excited to announce our data search API. This preview is available with a sample subset of current data that can be queried with your choice of software tools, or through a web interface on our public development website at https://dev.dataplane.org/signals.html.
To automate searches with software tools of your choice, check out our API documentation at https://api.dataplane.org.
Documentation of the available API calls, along with expected return codes, can be found on the API landing page. A number of public features will be available without authentication, while future functionality requiring API keys will both widen the scope of searchable data and allow us to offer more features to supporters.
Currently you can search by IP address, short prefix, or ASN. The number of queries per day is limited to minimize abuse and system load. Take a whack at it, get a feel for how it works, and make note of anything you like or don’t like about it. As we build out our API roadmap, we really want your feedback, which will influence future iterations and the final production release.
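Because the daily query limit makes malformed searches costly, it may help to check a search term locally before sending it. The following is a minimal sketch in Python using only the standard library. The function name classify_query and the accepted input forms (for example an optional "AS" prefix on ASNs) are our own assumptions for illustration and are not part of the API; the actual calls and parameters are described at https://api.dataplane.org.

```python
import ipaddress
import re
from typing import Tuple


def classify_query(term: str) -> Tuple[str, str]:
    """Classify a search term as 'asn', 'prefix', or 'ip'.

    Hypothetical client-side helper, not part of the Dataplane.org API.
    Returns (kind, normalized_term); raises ValueError on invalid input.
    """
    term = term.strip()

    # Bare digits or an "AS"-prefixed number are treated as an ASN.
    match = re.fullmatch(r"(?:AS)?(\d+)", term, flags=re.IGNORECASE)
    if match:
        asn = int(match.group(1))
        if asn > 4294967295:  # ASNs are at most 32 bits wide
            raise ValueError(f"ASN out of range: {term}")
        return ("asn", str(asn))

    # Anything containing a slash is parsed as an IPv4 or IPv6 prefix.
    # strict=False masks off host bits, e.g. 192.0.2.7/24 -> 192.0.2.0/24.
    if "/" in term:
        return ("prefix", str(ipaddress.ip_network(term, strict=False)))

    # Otherwise the term must be a single IPv4 or IPv6 address.
    return ("ip", str(ipaddress.ip_address(term)))
```

A call such as classify_query("192.0.2.7/24") yields a kind and a normalized term that a script could then map onto whichever API call the documentation specifies for that kind of search.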
Signal Statistics
As we propel Dataplane.org into a serious operation that others come to rely on, we need some basic data metrics, not only for ourselves but also to show the world what we have, what we’ve seen, and what we may need to do differently in the future. This month we introduce an entire suite of Signal Statistics. These graphs provide a much-needed longitudinal view of our data. Each is a high-level overview of a published signal data set, typically a time-series summary of events seen each day. This is as much for us as it is for the community. For example, consider the graph shown below:
This time-series plot shows the number of “DNS rd” signals seen per day over the last 14 to 15 months. The most obvious anomaly occurred on December 26, 2021. We took a quick look and found that there was an explosion in unique source addresses issuing UDP-based queries for IN/RRSIG/pizzaseo.com.
In the Dataplane.org newsletter #2 we reported on a curious collapse of SSH signals in October. Well, guess what? We can now see in our new graphs that the signal levels appear to have mysteriously recovered:
Interesting data events notwithstanding, we want to know our signal data is being recorded and reported accurately. As much as we encourage the community to use these graphs to get a sense of trends, we’ll be keeping an eye on them to ensure our systems are performing as expected. There are and will be more anomalies. We may occasionally call attention to them and perform some analysis in a future newsletter, but if you have questions about what you see, don’t hesitate to ask us about it.
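For readers who want to look for anomalies in per-day counts themselves, here is one simple approach, sketched in Python. It is not how we produce or review the graphs; it is just a common robust technique that compares each day to the median and flags days that sit far above it, measured in units of the median absolute deviation (MAD). The function name find_spikes, the threshold of 5, and the input format (a mapping of day label to count) are all assumptions made for illustration.

```python
from statistics import median
from typing import Dict, List


def find_spikes(daily_counts: Dict[str, int], threshold: float = 5.0) -> List[str]:
    """Return the day labels whose count is far above the median.

    Illustrative helper only. A day is flagged when
    (count - median) / MAD exceeds the threshold.
    """
    values = list(daily_counts.values())
    if not values:
        return []

    med = median(values)
    # Median absolute deviation: a spread estimate that a single
    # large spike does not inflate the way a standard deviation would.
    mad = median(abs(v - med) for v in values)
    # Avoid dividing by zero when most days have identical counts.
    scale = mad if mad > 0 else 1.0

    return [day for day, count in daily_counts.items()
            if (count - med) / scale > threshold]
```

Feeding it a series of daily totals returns only the days that stand well clear of the typical level, which is a reasonable starting list of dates to ask us about.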
Commentary on UA and RU Signals
Dataplane.org operates several resources that can provide some insight from a distinct vantage point in Internet infrastructure and operations. We exist to provide a reliable and trustworthy service, but don’t expect us to chase every latest threat, technology, or discovery. Nonetheless, major world events are hard to ignore. Where we have something of value to contribute on questions of the day, we will do our best to help. In regard to what is going on in Ukraine (UA) and Russia (RU), we want to outline what, if anything, Dataplane.org sees, and how this affects our organization going forward.
We have sensor nodes in both UA and RU. All systems are online and have been operational without disruption since before February 24, 2022, with one exception. One of our RU providers locked our account and shut down our systems on February 26 for reported abuse. It was hard not to jump to conclusions, but after contacting the provider they informed us that a recent change caused an anti-abuse system to malfunction. Everything was restored within a few hours. In UA, we have experienced what appears to be packet loss for some traffic with one of our nodes, but we cannot say with any confidence if this is due to congestion, rate-limiting, or other issues. The network paths to and from this node appear to be stable.
We have not performed any rigorous analysis of signal data changes on UA or RU sensor nodes, nor have we spent much time looking for a change involving traffic to or from those countries in any of our signal data from other sensors, our RPKI activity, DNS authoritative servers, email systems, or web servers. However, a cursory look at some of our data didn’t show any noticeable differences so we have not pursued this line of analysis further.
We are fully prepared to lose access to any systems operating within UA or RU, although probably for different reasons depending on the country. Some of our RU systems were paid for in advance, with some service extending into 2023 and we intend to let them run as-is. It seems plausible that we may lose them before then or be unable to pay for them in the next billing cycle due to sanctions.
We are casually and periodically performing some regional measurements of availability and latency, but we’ve not automated this type of monitoring. So far, these events have not led to any changes in what we do, but they have raised our awareness of potential future avenues of work.
The State of Dataplane.org
Our big organizational news this month is to welcome Bill Eaheart to the leadership team. Bill will help lead and steer the organization on both a financial and technical level. Bill, John, and Matt have all known each other for about the same length of time and we all felt it made a lot of sense to join up and take advantage of his willingness to participate.
We are still focused on putting the organization on a solid legal footing. We are most of the way through completing a lengthy questionnaire from our attorney. This is the informal part of the process where the attorney turns our organization’s hopes and dreams into government-required filings for non-profits. We expect that by the time we publish our next newsletter the official paperwork process will have begun.
On the technical front we have started a handful of projects, but we’ll focus on two that are currently in progress. One is the deployment of some new back-end infrastructure. As our activity has grown and the community’s reliance on what we do increases, we need to improve core system resiliency. Currently our back end is centralized at a single location that cannot scale much beyond where it is. This is largely a systems administration exercise, but it will take some time to adapt everything to our new redundant infrastructure.
A second project underway is a public repository of almost all our historical signals data. This is work primarily for the benefit of researchers. There will be only very limited redaction of data to help limit identification of our sensors, but we will provide a nearly complete picture of all the signals data beyond even what you see in our signal reports published on the web pages. We are planning on making data available after six months have passed. We feel this is sufficient for research and evaluation purposes.
We are beginning to consider what we must do to secure funding to make this organization self-sustaining. One idea is a data subscription model that provides access to the full set of real-time signals data and other new data sets. Another approach is to institute a support contract model where our analytical expertise and development projects support those that have special needs, much like what is done in many open source software projects. A non-profit lives and dies by the funding it can attract and we will be no exception. We want to make supporting us easy to justify, but we don’t like to ask, so this will be one of our biggest challenges. We’d be very interested in gauging your reaction early on in this process.
Find us through email, our Twitter feed, or our Slack space to chat about anything you read here today.