MAT Working Group
RIPE 84
18 May 2022
12 noon
BRIAN TRAMMELL: Good afternoon everyone, welcome to the MAT Working Group at RIPE 84 in Berlin and hybrid on the Internet. I always like coming to Berlin. It used to be that I would be able to fly into Tegel and walk wherever I was going; apparently there is this new airport which is almost in Poland, so I took like a one‑hour flight and a one‑hour train ride, and as I was coming up through the main station in Berlin I just whipped out the camera and took a shot of this here, and it's great to be here again with everyone. We have a fairly packed agenda today so I'm not going to ramble any more about how nice Berlin is.
So, there is the intro, hi, I am Brian. The agenda: we will be having four presentations today, one from Leslie Daigle, one from students from the Hogeschool Utrecht, one on flow monitoring in BelWü, and one from Robert with the ever‑present tools update, this time with slides because we told him he was going to be here.
And then we will do introductions and a farewell. So we have two new Chairs joining this time, Massimo Candela, who I heard is in the building but I don't see him here at the moment, and Stephen Strowes who is joining remotely. I will be stepping down as a co‑chair and Nina will be continuing; Nina is also remote. So, with that, let us go ahead and get started with Leslie Daigle.
LESLIE DAIGLE: Thank you. If I stand here I don't get to see the comfort slides. The joys of being slightly height challenged. We'll figure it out.
Right. So what I'd like to talk about today is, first of all, why we all care about all the unwanted traffic on the Internet, and then some perspective on just how much is out there. If you were at this Working Group meeting at RIPE 83 last fall, you'll have seen some material presented by George Michaelson, and this is sort of a logical follow‑on to that presentation.
Then I want to spend a bit of time talking about what bad looks like ‑‑ how do we quantify what bad‑acting networks are ‑‑ and then have a bit of discussion. I hope there will be time for some discussion and some engagement to see what we think are acceptable levels, or how we figure out how to get to acceptable levels, and where we go from here.
So, why do we all care? So, who recalls the delightful MIRAI botnet attack? A few people here remember the MIRAI attack on the Dyn systems, and that was basically largely attributed to a bunch of IoT devices that had been owned by the MIRAI virus and coordinated to create a large scale attack from very minimal devices, and this is also known as why suddenly we have so many regulations around the world against default passwords.
Of course, it's not all about conscripting devices into the world's largest botnet; the players that are attacking IoT devices are basically hammering on anything available, any open port in the IPv4 world. So, a lot of these attackers are just trying to get into any system that they can get into, get a toehold, and then from whatever machine they have gotten onto in your network they are trying to get anywhere else. I think that's why we should all care about all this unwanted traffic that's just parading around the Internet.
Why do we, as the Global Cyber Alliance, care? This is a not‑for‑profit organisation which is dedicated to reducing cyber risk, and the material I'm talking about today is part of its AIDE project. I can't not say it: it's not actually yet automated nor a defence ecosystem, and some days I'm not even sure it's IoT, but for the purposes of discussion we will refer to the AIDE project. What it consists of is a global honeyfarm with hundreds of sensors all over the world, four years of data that has been collected from those sensors, and a bunch of other cool tech I am not going to talk about today but would be happy to talk about offline.
Right. So, I have said that we have a bad problem with unwanted traffic flowing freely on the Internet. Just how big is this, how big a problem is it? The charts that follow in this section are from my colleague Rufo de Francisco, who has been doing a deep dive into our data for over a year to try to put a size on this.
So, this is the kind of thing we see. I'll explain the grey part of the graph and the apparent disconnect. We basically see tens of thousands of attacks per sensor per week. So this is the kind of stuff that is literally hitting every open IPv4 site, any open IPv4 port, everywhere, and I know that because I have looked at the sensors for an open port on our home network and I can see in our system log there: yeah, that IP address, yeah, I see it. So, literally these are attacking everything. The disconnect in this graph is because we started the honeyfarm in September of 2018 with about a 1,200‑node network of sensors. They end‑of‑lifed early last year and we transitioned to a smaller 200‑node honeyfarm at the end of last year, and the grey part of the chart is when we had some old and some new sensors overlapping, so basically ignore that part.
Another aspect in which we can see that the problem is not small. It doesn't matter that you can't actually read this graph; the shape conveys the point. These are the IP addresses that were hitting our sensor network for the first three years, basically the old sensor network, and the connected lines at the very top show there are five IP addresses that were attacking our sensors every single day for those three years. So they just don't quit. And if you recall ‑‑ well, I haven't mentioned it yet ‑‑ this particular sensor network is tuned to MIRAI attacks, so these are basically infected hosts on IP addresses that are not being rebooted, not being cleaned up, or are deliberately acting maliciously.
I needed to give you some sense of globality, so I thought why don't I show you attacks to and from a particular country, and somehow Russia sprang to mind. You can see that there are attacks going everywhere, all over the world, from any given country to any other country, pretty much all the time. You might see that there seem to be more places attacking Russia than Russian IP addresses attacking back. That's because we don't have sensors in literally every country, but we have them in many.
And in some of the data that we have, we can actually detect a specific attacker, so I wanted to give a bit of a flavour of what it looks like when an attacker is attacking our systems, and this is just sort of a heat map of where this one attacker was attacking our sensors from. And you can see it's pretty much all over the world. And I wanted to zoom in just because the red on grey doesn't work very well, but, you know, it literally is countries scattered all over the place. So don't think that you can protect your network by just saying my systems just don't need to talk to this part of the world or that part of the world, because the attackers will be coming at you from all over the world.
So, just to highlight some of what that sort of showed. We have a lot of attack traffic hitting any host on the Internet at any time, that there are some very persistent players out there or at least systems that just aren't getting enough attention to get cleaned up. It could be somebody's fridge, I don't know. And it's literally from everywhere to everywhere. So putting a fence on it isn't really going to work.
And so I wanted to talk a bit about what "bad" looks like. How can we say what's a bad amount or quantification of traffic coming out of a network? Because that's basically what we see in our sensors. We can see what geography it's coming from, although I have got stories about that. We can see what networks it's coming from. So, what would good look like? In other words, what would bad look like?
In the following charts I did a bit of a deeper dive from November 2021 to May 8th of this year, which is 160 days of data.
So our approximately 200 sensors saw attacks from almost 11,000 distinct autonomous systems, and over 2,000 of them fielded more than 1,000 attacks on our sensors. 40 of those autonomous systems launched attacks from more than 1,000 distinct IP addresses within their network. And we also saw just over a quarter of a million distinct IP addresses touching our sensors in that time frame. So that's some numbers.
And then if you look at it graphically, I just numbered the networks with sequential numbers; I am not interested in naming and shaming at this point. The numbers at the bottom of this graph are consistent through the rest of the graphs, and that will become important in a minute. So, you can see that there is one AS that fielded over 20 million attacks on our sensors in those 160 days. And even at the tail of this top‑20 graph, we're still seeing, you know, over three quarters of a million attacks from that autonomous system.
So, does that make the ones on the left bad actors? Is that what bad looks like? And the reason why I said the numbers at the bottom of the graph ‑‑ the numbers of the networks ‑‑ are relevant is because if you look, instead, at how many different attackers came out of a network, you get a different "best of" list. So, I did try to scale up the numbers at the bottom of the graph. I hope they are mildly legible; they are at least available in the slides online.
The ordering is different. It's not massively different, but there are some interesting things. You can see the second highest bar was number 144 in terms of number of attacks. So there are many more attackers in this network, each launching fewer attacks.
If we look in terms of how many attacks were sent by each attacker, you can see again the chart is a little different. The numbers at the bottom are different, so this is a different way of characterising how bad is a given network. And again, even at the tail end of this, the tail of this particular top 20 graph, we have over 18,000 attacks per attacking IP address in those 160 days.
So, I don't know, it's like just sitting there churning away, attacking, clearly not just us but everything on the Internet.
And then I had a look at using Hurricane Electric's site to get a guesstimate of how many IP addresses were in any given one of the top 20 ASes. You can see there is fair variability in terms of how much of the network is apparently infected and how many bad actors there are in the network. Each of these numbers is individually pretty small, thankfully; it's not like we have networks that are dedicated to attacking ‑‑ I guess if we did we would feel better about blocking them.
But the ‑‑ there is a fairly big range in terms of, you know, how well the network has been cleaned up. Or at least, prevented from launching attacks.
So, I think the things I wanted you to particularly take away from those points are, again, that the network IDs are not the same in every graph. The characteristics of these individual networks ‑‑ in terms of how many attacks they are launching, how many different IP addresses are launching those attacks, and how many attacks are being launched by a given IP ‑‑ are pretty different. And the level of IP address bad behaviour in a network is small, less than 1% of the network, but it's also highly variable between networks. So some networks are very clearly putting effort into cleaning things up, and some networks are not. Which is kind of a hopeful thing, because where I'd really like to get to from this is: you know, that this amount of rampant bad‑acting traffic isn't right, and that reducing the amount of bad traffic coming out of a network is probably more helpful than trying to just block everything that appears to be acting badly. And I know in the Anti‑Abuse Working Group tomorrow there is a presentation that takes sort of the opposite approach, so that will be fun, but I think that, if we want to continue to have a free and open Internet, it's more important to figure out what is an appropriate level of bad traffic coming out of a given network, and how we get to, you know, those ideals, if you will?
I mentioned that the AIDE project was started with the intention of becoming a defence ecosystem, in other words being able to detect bad activity and give some kind of an alert in time to help protect networks, but there is only so much you can actually do with that, right. We have a few cases where we can track a particular attacker and say this is the same actor in all these instances, but for the most part, with the level of data that we're getting anyway, it's not really feasible to say, you know, here comes another wave of MIRAI. It really would be far better to stop it at the source, and I can't imagine that all of this activity is really within the respective operators' network usage guidelines.
So, what I'd like to talk about is: what is worst? What is the right way to look at bad, and what are the right ways to define acceptable traffic coming out of a network? And, you know, what's reasonably detectable within the context of a network? I don't know if network operators even see this kind of data in their own networks. We see it because we're sitting outside the network and are being poked by them. I'm sure that, you know, if you are just generally looking at traffic exiting your autonomous system, it all looks legitimate for all intents and purposes, so what should we say is reasonable activity and how could we get there? What should we do about unacceptable activity? By the way, if you want to see whether your AS is part of this ‑‑ I am happy to talk to you afterwards.
So I think that's pretty much what I had. But, if there is any time for questions or discussion, I would be happy to focus on the, you know, what does good look like and how do we get there? Thank you.
BRIAN TRAMMELL: Thank you very much Leslie. I think we have Jen Linkova in the queue.
JEN LINKOVA: Thank you. Very interesting. A question: am I right that you are talking about IPv4 scenarios?
LESLIE DAIGLE: Yes, we are only measuring IPv4. I haven't ‑‑ I think the 4 billion addresses are reasonably more scannable than all those in IPv6, but ‑‑
JEN LINKOVA: I am interested in whether the v6 landscape is any safer yet. And the second question is: in v4, you probably see public v4 sources, so you don't actually know how many devices are behind them, right? While in a v6 experiment you actually might see kind of end‑to‑end communications, it also might give you a problem with indicating whether it's a single device or many devices behind a single NAT pool or something.
LESLIE DAIGLE: True enough. As I alluded to, there are some other interesting things in terms of geography ‑‑ where we think the attacks are coming from, are they really where we think they are, where we think our sensors are located ‑‑ so, yeah, thanks.
NIALL O'REILLY: I am not wearing my Vice‑Chair hat, I am just somebody who struggles to keep my own network clean. You mentioned earlier, Leslie, that you weren't interested in naming and shaming, and I'm not a fan of that either. But one of the things that I find useful for my own housekeeping is having websites out there that I can visit which will give me some kind of report. Do you think that's a useful approach to encourage more of the kind of cleaning up that you mentioned, so that people can go, in the first place, and see how bad am I, should I do something about it, and what can I do about it? And secondly, can I, you know, sign up to some best practice that says I'm doing these cleaning things?
LESLIE DAIGLE: Yeah, so that's where I would like to get to. I would be happy to share the data of, you know, what does your particular autonomous system's experience look like from our perspective, for any value of "your", but I think that in order to really frame that up in an organised fashion, I would like to see some community agreement about what those acceptable levels are. I mean, we're never going to get to zero on any of those dimensions, but, you know, is it the actual number? Is it the trend? I didn't talk about trends. What is an acceptable level and how should we encourage operators to focus on that? Because I'd be much happier to publish something that said, you know, "trending well" rather than like "oh my goodness, look at the spew out of this network".
GERT DOERING: I am very interested to see what badness is in our network. I try to keep it clean but housekeeping is not always perfect.
That was one aspect. The other one is: what's detectable? That is something that has interested me for years, basically since the Code Red outbreak that killed our NetFlow export, and so we have been monitoring anomalies in NetFlow, and some bits are sort of easy to see, like a brute‑force port scan: if you have a machine that sends single packets to millions of others, it's either a name server or somebody port scanning, and then you look at the ports and you get an idea. If it's a smarter attack just doing HTTP requests to possible victims, no chance, it could be a normal client. So, this is something I'm definitely interested in: what traffic patterns do you see in the attacks, what traffic patterns could I see in NetFlow, and how to optimise a generic ISP NetFlow setup to also find this stuff.
LESLIE DAIGLE: That sounds interesting. I'd be happy to catch up with you afterwards.
SHANE KERR: I may have missed it, but what exactly do you consider an attack, from your point of view?
LESLIE DAIGLE: So, the honeyfarm basically is a bunch of fairly lightweight Linux servers, and effectively they are just sitting there with no particular purpose and being contacted. So, pretty much any contact is an attack. Scanning we can detect, and if you are just scanning the network and saying, you know, who is there, we filter that out of the data. So that's, in terms of the numbers I am showing, what's considered an attack. And then having looked at some of the raw data itself, it was like, yeah, these are attacks. Obviously I'm not looking at all of them; I am not scanning the many terabytes of data.
SHANE KERR: Maybe we could discuss, since I have to be brief, ways to try to figure that out. Because there is just a lot of random crap which may just be badly configured IP addresses, or software with bugs and all of that kind of stuff. And it doesn't seem like it's distinguishing accidental traffic from malicious traffic.
LESLIE DAIGLE: That's a fair point and I'd be happy to go back and see if we can go through our data and make some of those assessments. But yeah, as I said, having looked at sort of the live stream of connections coming in, it was like, yeah, you know, no.
SHANE KERR: Okay, that's fair.
BRIAN TRAMMELL: Thank you very much, Leslie, sounds like we're going to have lots of follow‑up discussion after this. Thanks a lot.
(Applause)
SPEAKER: Hello, first of all, it's good to be here for the first time. We're going to talk a bit about our project, anomaly alerting for RIPE Atlas.
Our agenda. We're going to talk a bit about the project context. The solution, the anomaly detection and the prototype.
First of all, the project context.
About us: we're second‑year students from university. We're a group of five; sadly one of our members couldn't come, so he is still at home. We're studying to become software engineers, and that's also why this project is a bit challenging for us, because we don't have the knowledge of network engineers and you need to have a lot of knowledge about networking, so that's a challenge.
The project context:
The question RIPE asked us, because we're doing a project for them, was: how do we provide more value to anchor hosts? We had a few choices, but the choice we went for was monitoring software.
The problem statement is: right now it's easy to monitor your autonomous system, or your network, but the problem most people have is that you can't see what happens outside of your autonomous system or network. There is a solution for it, the status check API, but the problem is that it's hard to set up and it's also complicated to use.
The solution:
SPEAKER: So, we think we came up with a solution for this. We wanted to make a monitoring system for outside of your own network. This would be based on existing RIPE Atlas data. It should be easy to set up ‑‑ just spin up a Docker image, type in your AS number and it should just work. But it should also be customisable, because maybe one neighbour is more important than another neighbour.
So, we split this application into two parts. We have pluggable detection methods, and we have a user‑based filtering system, and the filtering system would replace setting thresholds manually.
So, here is a diagram showing how this would work in our application. The user would insert their AS number, and from RIPE Atlas it would get all the measurement data for the existing anchors in your AS; currently we are using the anchor mesh measurements to train a one‑day baseline. Then every 15 minutes or so, when new results come in, it would use those in the detection methods to decide if there is an anomaly or not. If there is an anomaly, it would go to the user feedback part; based on user feedback, you would get an alert or not, and then it would send an alert via e‑mail or anything else where you wanted to receive your alerts. And if you are not happy with some kind of alert, you can give negative feedback, and then you won't get those alerts any more.
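To make the flow just described a bit more concrete, here is a minimal Python sketch of the idea ‑‑ a pluggable detector checked every cycle, followed by a user‑feedback filter. Everything in it (the function names, the 5×MAD threshold, the fake data source) is invented for illustration and is not the team's actual implementation.

```python
# Illustrative sketch only: a pluggable detector run each cycle, followed by a
# user-feedback filter. All names, thresholds and the fake data source are
# invented for this example; this is not the students' actual code.
import random
import statistics
from typing import Callable, Dict, List

def entry_point_delay_detector(baseline: List[float], latest: float) -> bool:
    """Flag the latest RTT if it sits far above the one-day baseline spread."""
    med = statistics.median(baseline)
    mad = statistics.median(abs(x - med) for x in baseline) or 1e-9
    return (latest - med) / mad > 5        # purely illustrative threshold

def feedback_filter(alert: Dict, suppressed: set) -> bool:
    """Drop alerts the user has previously marked as not interesting."""
    return (alert["asn"], alert["kind"]) not in suppressed

def run_cycle(fetch_results: Callable[[], Dict[str, List[float]]],
              detector: Callable[[List[float], float], bool],
              suppressed: set) -> List[Dict]:
    """One 15-minute cycle: fetch results, detect anomalies, filter by feedback."""
    alerts = []
    for probe_id, rtts in fetch_results().items():
        baseline, latest = rtts[:-1], rtts[-1]
        if detector(baseline, latest):
            alert = {"asn": 64500, "kind": "entry-point-delay", "probe": probe_id}
            if feedback_filter(alert, suppressed):
                alerts.append(alert)
    return alerts

# Dummy data source standing in for the RIPE Atlas anchor-mesh results.
def fake_results() -> Dict[str, List[float]]:
    return {f"probe-{i}": [random.gauss(20, 2) for _ in range(96)] + [60.0]
            for i in range(3)}

print(run_cycle(fake_results, entry_point_delay_detector, suppressed=set()))
```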
So, this is the part where we had to give it a name, so we started to call it RIPE alerts.
SPEAKER: I'll talk a little bit about the alert detection that we have so far. As my team mate said, it is pluggable, but we have implemented a first option already, which we call entry point delay. We basically use the traceroute mesh data and then we iterate through the hops backwards. We match the IP address with the AS number using RIS Whois data, and when the hop is outside of your own network, we start looking at the round‑trip time that it took to enter your network. And we do anomaly detection on this to find any weird delays.
A little check that we did: a graph of how many probes measure through a certain neighbour. On the left you can see the RIPEstat data about the neighbours. It's interesting that they match. One interesting thing is that the largest ASes are actually in reverse order, which I think is a result of the probe distribution, but at least we get the same AS numbers. Then we do anomaly detection per probe or anchor that's measuring towards the end point. We use a sliding window for this that measures for positive change, and the red lines are where it marks an anomaly. We then do this for all anchors in a single neighbouring AS, and as you can see here on the right, there is a clear moment where all the anchors have a spike in round‑trip time, so we can assume that something weird is going on there. We aggregate the result, and right now, if the number of probes comes above a certain percentage of the total for the AS, then we mark it as an anomaly and it will get sent to the user feedback system. So, this is the first one that we have implemented. There are some other ideas: looking at route changes based on the traceroute data, looking at delays in the entirety of a neighbouring network ‑‑ if we use ping ‑‑ or looking at the delay per country, those kinds of things, but we would love input from actual users on what would be interesting to you and if you have any other ideas.
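A rough Python sketch of the entry‑point‑delay idea described above: walk the traceroute backwards to the last hop outside your AS, flag positive RTT jumps per probe with a sliding window, then aggregate per neighbouring AS. The IP‑to‑ASN lookup, thresholds and example data are all stand‑ins for illustration, not the project's code.

```python
# Sketch only: entry-point delay detection. ip_to_asn() is a placeholder for a
# RIS-whois style lookup; the thresholds and example data are invented.
from statistics import mean, stdev
from typing import Dict, List, Optional

MY_ASN = 64500  # illustrative

def ip_to_asn(ip: str) -> int:
    """Placeholder for a real IP-to-origin-AS lookup (e.g. RIS whois data)."""
    return 64501 if ip.startswith("203.0.113.") else MY_ASN

def entry_rtt(hops: List[dict]) -> Optional[float]:
    """Walk the traceroute backwards and return the RTT of the last hop that
    is still outside MY_ASN, i.e. the delay up to the point of entry."""
    for hop in reversed(hops):
        if ip_to_asn(hop["ip"]) != MY_ASN:
            return hop["rtt"]
    return None

def is_spike(window: List[float], latest: float, k: float = 3.0) -> bool:
    """Simple positive-change detector over a sliding window of recent RTTs."""
    if len(window) < 10:
        return False
    return latest > mean(window) + k * stdev(window)

def as_level_anomaly(per_probe_flags: Dict[str, bool],
                     min_fraction: float = 0.5) -> bool:
    """Only raise an AS-level anomaly if enough probes agree."""
    return sum(per_probe_flags.values()) / max(len(per_probe_flags), 1) >= min_fraction

# Entry RTT for a toy traceroute: the 203.0.113.x hops are "outside" MY_ASN.
trace = [{"ip": "203.0.113.1", "rtt": 5.0},
         {"ip": "203.0.113.9", "rtt": 12.5},
         {"ip": "198.51.100.1", "rtt": 13.0}]
print("entry RTT:", entry_rtt(trace))

# Two probes measuring towards the same neighbour, both spiking at the end.
flags = {}
for probe, rtts in {"p1": [21, 20, 22, 20, 21, 19, 20, 22, 21, 20, 55],
                    "p2": [30, 31, 29, 30, 32, 30, 31, 29, 30, 31, 70]}.items():
    *window, latest = rtts
    flags[probe] = is_spike(window, latest)
print("AS-level anomaly:", as_level_anomaly(flags))
```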
SPEAKER: I am going to show you the prototype of what we have made. Currently it's very easy to set up. The only thing you have got to do is start up the application, go to settings, enter the autonomous system number that you want to monitor, and click save, of course. After that the application will automatically look for anomalies for the given autonomous system. It will show the most recent alert at the top, and to the right you can see the probes' anomaly graph. Next is the anomaly overview and feedback. The special thing about it is that we have a feedback engine, and every time an anomaly comes in, it will be processed by the feedback engine, so next time it will know whether it's a real anomaly or a fake anomaly.
There are also some features we still want to implement but have not implemented yet; this is a work in progress. One thing we want to do is give the user more visualisation of the data which is coming in. We want to do this with graphs, but we also have some other ideas, such as a network topology map. As you can see, each grey circle represents a hop, and the colours: green represents low milliseconds, and red represents high milliseconds, etc.
SPEAKER: Well, this was everything we had to show for now. Please come meet us; we have a demo that Wolf has on his laptop, and we'll be in the coffee break room at the RIPE Atlas stand. And you can find the code, it's open source on GitHub ‑‑ it probably won't work right now because we're working on it. So, thank you for your time, and if there are any questions, we are happy to answer them.
(Applause)
BRIAN TRAMMELL: We do have time for two questions. So, Daniel and Shane.
DANIEL KARRENBERG: I am one of the inventors of RIPE Atlas, and you are stars, because when we first conceived the whole thing, this is exactly the stuff that we would hope that people would develop. It's taken a couple of decades, but it's really, really good to see that even second year students can do something. I really like the human feedback thing and so on.
But since this is a question, I have to ask a question.
The feedback module that you have, does that use machine learning or ‑‑ and can you maybe say 20 seconds on the methodology there.
SPEAKER: It's still very much in development. But the basic idea is that we do some feature extraction, looking at the latest alert, how much time there has been in between, we extract some of the information about the AS, all that kind of stuff, and then we just try to predict if the value is an alert or not, and we do that based on the feedback. Probably it's just going to be something like a random forest, something along those lines at first, because we really need to build up some sort of ground truth on what is alert‑worthy and what isn't. That's the main goal: building a dataset of ground truth values.
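For illustration of how such feedback could feed a random forest, here is a tiny scikit‑learn sketch. The feature names, values and labels are made up; the project's real features and model may well differ.

```python
# Toy example: user feedback ("useful" / "not interesting") as labels for a
# random forest that predicts whether a new anomaly is alert-worthy. The
# features and data are invented for illustration only.
from sklearn.ensemble import RandomForestClassifier

# Each row: [rtt_increase_ms, hours_since_last_alert, n_probes_affected]
X = [[40,  0.5, 12],
     [ 3, 24.0,  1],
     [55,  1.0, 20],
     [ 2, 48.0,  2]]
y = [1, 0, 1, 0]   # 1 = user found the alert useful, 0 = not interesting

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(model.predict([[45, 2.0, 15]]))   # most likely classified as alert-worthy
```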
DANIEL KARRENBERG: Great work. I will find you.
SHANE KERR: I did some experimenting with using Atlas for monitoring many years ago and the economics didn't work out for me. I couldn't get enough credits to actually monitor with anything like realtime monitoring. Is that something you have looked at?
SPEAKER: So, there are ideas to make custom measurements in the end, and use custom measurements, but for now we're using the RIPE Atlas anchor measurements, which are just continuous anyway, so we don't need any credits for that.
SHANE KERR: Okay.
BRIAN TRAMMELL: Thank you very much and please go and visit ‑‑ it's there in front of the main room right, or has it been moved? Over in the main room.
(Applause)
Daniel, please come on up.
DANIELE NAGELE: Hello everyone. So, let's get right into it. I am Daniel. I work at BelWü, which is the research and education network for the German federal state I have marked on the map for you. I am assigned to a research project which is focused on combining research with operations, so, somewhat of a conflict sometimes.
What we're doing there, we're doing flow monitoring, which is what I'm going to talk to you about today. The flow pipeline tool we have developed is completely configuration defined. We process any form of network flow and to show you how that goes, I have a little graphic on the right‑hand side.
In principle, you define a list of segments and these segments can be anything from inputs to modifications, and a flow basically passes through every segment you define. In this case, we have one input; it generates the flows and inserts them into the pipeline. We can modify that flow with a number of segments we have developed: there is a BGP segment, there's annotation by subnets, there is SNMP for interface descriptions, and so on. You can anonymise flows. We also have filtering segments, which allow us to do statistical things like sliding window analysis, with a specific query language which is tcpdump style, loosely based on that I'd say, and lastly we can export flows to a variety of targets, so to speak.
You'll note that we have outputs and exports; the difference is that an output preserves all the fields in a flow, while an export goes to a time series and you can't have all the dimensions there.
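To illustrate the segment concept (not the tool's actual configuration syntax ‑‑ see the project's repository for that), here is a toy Python sketch where each segment is a generator that consumes flows and yields them on, possibly enriched or filtered.

```python
# Toy illustration of the segment idea only; the real tool is driven by a
# declarative configuration, not hand-written code like this.
from typing import Dict, Iterable, Iterator

Flow = Dict[str, object]

def input_segment() -> Iterator[Flow]:
    """Stand-in for an eBPF, goflow2 or Kafka input segment."""
    yield {"src": "192.0.2.1", "dst": "198.51.100.7", "dport": 123, "bytes": 4200}
    yield {"src": "192.0.2.9", "dst": "198.51.100.8", "dport": 443, "bytes": 900}

def annotate(flows: Iterable[Flow]) -> Iterator[Flow]:
    """Enrichment segment, e.g. BGP/RPKI data or SNMP interface descriptions."""
    for f in flows:
        f["dst_asn"] = 64500            # illustrative lookup result
        yield f

def flowfilter(flows: Iterable[Flow], dport: int) -> Iterator[Flow]:
    """Filtering segment, roughly like a 'dst port 123' filter expression."""
    return (f for f in flows if f["dport"] == dport)

def printer(flows: Iterable[Flow]) -> None:
    """Output segment that just prints every flow it receives."""
    for f in flows:
        print(f)

# Wire the segments together in order, as the configuration file would.
printer(flowfilter(annotate(input_segment()), dport=123))
```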
It's also popular with our researchers for dataset generation, machine learning folks really love the CSV files so that's one use case. It's also Open Source so you can look at it after the talk.
And regarding the inputs.
We have several. We have an eBPF input. It uses a custom flow cache, which we can configure to match the production flow processing we have from routers or something.
There have been some talks at this meeting about statistical soundness and bias in data; we use this to have the same bias in all of our data, so there is no way of getting around sampling bias. We use that for instance in university border scenarios where we have Linux firewalls or something, but for our backbone we use the goflow2 integration, so we get the flows from network devices directly. This again can be any type of flow ‑‑ sFlow, IPFIX, NetFlow, whatever ‑‑ and we just ingest from there.
Lastly, we have the Kafka input. We can use that to get pre‑defined or pre‑generated flows. This is also based on goflow somewhat, because that uses Kafka as an output format as well, and we can have pre‑filtered or pre‑enriched flows for specific interfaces of interest, like in this example here, where we have them for any of our IXP interfaces.
That was a bit dry up to now but I have prepared some examples for the rest of the talk.
Just a simple example config where we grab flows from eBPF on a specific device; that is passed on to a flow filter. In this case the filter is coming from the first command line argument, because network engineers love their command line stuff. Then we just print it. As you can see, in this example it's not annotated, not enriched at all; it's just the basic measures of the NetFlow and the time the flow took.
So, simple stuff so far.
We can also do what we saw on the first day ‑‑ I think it was a talk from Kentik ‑‑ doing RPKI measurements on traffic volume instead of prefix based: not how many prefixes are valid or invalid, but how much traffic goes through in an RPKI invalid state. In this case I did the example with goflow, so you invoke a goflow segment, you receive NetFlow from a router ‑‑ it can be any kind of flow, sFlow, whatever ‑‑ and you also have a BGP session with that router. So you can go ahead and annotate each flow with data from BGP; that might be the RPKI status, which is relevant here, but also AS paths and so on.
Then we use a flow filter to filter specifically on the RPKI status. You can see the syntax on the left; maybe look at that after the talk, it's a bit much for a single slide. On the right‑hand side you can see that we are capturing the result of the flow filter even if it's dropped, so we can do one thing if a flow is dropped and another if it's not dropped. In this case it's printed both times, once with a highlight, once without. Obviously in a real scenario you would do some detailed analysis with the passed flows, for instance put them into an SQL database or whatever, and do something not as elaborate with the dropped flows maybe.
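As a rough illustration of the "traffic volume by RPKI state" idea, the sketch below annotates each flow with a validation state and sums bytes per state; the rpki_state() lookup and the flow records are invented placeholders, not the pipeline's real annotation.

```python
# Sketch only: bytes per RPKI validation state instead of a count of prefixes.
from collections import Counter

flows = [
    {"dst_prefix": "203.0.113.0/24",  "bytes": 1_200_000},
    {"dst_prefix": "198.51.100.0/24", "bytes": 300_000},
    {"dst_prefix": "192.0.2.0/24",    "bytes": 50_000},
]

def rpki_state(prefix: str) -> str:
    """Placeholder for the BGP/RPKI annotation a real pipeline would do."""
    return {"203.0.113.0/24": "valid",
            "198.51.100.0/24": "not-found",
            "192.0.2.0/24": "invalid"}[prefix]

volume = Counter()
for f in flows:
    volume[rpki_state(f["dst_prefix"])] += f["bytes"]

print(volume)   # e.g. Counter({'valid': 1200000, 'not-found': 300000, 'invalid': 50000})
```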
In the last example, we are doing some basic DDoS detection. It's taking flows from a Kafka cluster; in our case that's all the outside border NetFlow we have. We are filtering that and we can use different filters. I started out with the premise that we have different, how should I say, different focuses in our network: on one side we have the research folk, who always have the latest models and nicest things they want to try out in a real network, and we have the operations folk, who want to keep it stable and simple. In this case, we can use this in a production environment and have networking folk try out different filters, so instead of having some model drill down into it like the research folks would like to do, we just have our network engineers input different filters.
So, the first one would be a classic NTP reflection thingy maybe, so it's basically just filtering for NTP. You can look at that and find out the top talkers. On the bottom right you can see what that would look like. You can log that into a file, you can use that to alert yourself with some HTTP hooks or whatever, and if the network engineer knows there is nothing fishy going on here, it's all okay, he can iterate using that filter. So, you can change the filter really fast; it's a very short interval in which you can update it.
In the second case, we just add port zero, which gives us mostly fragmented packets. That's often a good cue that something stupid is going on in the network. And the last case would be just looking at TCP flows which are SYN only, so SYN floods basically.
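The three filters just mentioned can be illustrated as simple predicates over flow records plus a top‑talker count; the field names and sample flows below are generic, invented examples rather than the tool's filter syntax.

```python
# Illustration only: NTP reflection top talkers, port-zero fragments, SYN-only flows.
from collections import Counter

flows = [
    {"src": "192.0.2.1",   "proto": 17, "src_port": 123,   "dst_port": 40000, "tcp_flags": 0x00},
    {"src": "192.0.2.1",   "proto": 17, "src_port": 123,   "dst_port": 40001, "tcp_flags": 0x00},
    {"src": "203.0.113.5", "proto": 6,  "src_port": 55555, "dst_port": 80,    "tcp_flags": 0x02},
]

def is_ntp(f):        # UDP with port 123 on either side
    return f["proto"] == 17 and 123 in (f["src_port"], f["dst_port"])

def is_port_zero(f):  # fragments often show up with port 0 in flow records
    return f["src_port"] == 0 and f["dst_port"] == 0

def is_syn_only(f):   # TCP flows carrying only the SYN flag
    return f["proto"] == 6 and f["tcp_flags"] == 0x02

print("NTP top talkers:", Counter(f["src"] for f in flows if is_ntp(f)).most_common(5))
print("port-zero flows:", sum(map(is_port_zero, flows)))
print("SYN-only flows:", [f["src"] for f in flows if is_syn_only(f)])
```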
As on the slide, feedback is the thing we value most, I think, with this tool. It gives us the opportunity to have the operations folk do analysis which the research folk can then use to have their datasets generated using this pipeline.
So, I think I'm almost done already. Thanks for your time. If you have any questions, please feel free, also pick me up after, I am still around.
BRIAN TRAMMELL: Thank you very much.
(Applause)
We have time for one and a quarter questions with the rate that we're going on questions recently. Going once, going twice... Daniel is around, please find him.
ROBERT KISTELEKI: Hello everyone. This is the usual update from the RIPE NCC in general about the tools that we do for you in this space.
We do research. We published a number of articles recently on various topics; you may have heard Emile talking about the situation in Ukraine and Russia and the analysis that we did there. There was a k‑root prefix leak in China back in November; we did an analysis on that, and just this week we published another article on some interesting network behaviour. I encourage you to look at all of this on RIPE Labs ‑‑ you can read this and more, and perhaps you can write your own article there, by the way.
We have also been doing something else in the research space, and that is prototyping some visualisations based on the RIPE Atlas data. Emile and others have been working on publishing something on Observable. That's an interactive tool: you can go there and type in your country or AS number and you can see probe connections and disconnections, which are particularly interesting if there is something going wrong with that AS or country.
In the RIPE Atlas space: for a long time we have been focusing on getting more probes out there, and that was done, so now we are focusing on trying to be in more places, because diversity of the network is useful; that's the current strategy as well. I am happy to report that in that sense we have a probe in Antarctica as well. I would like to ask you not to all measure towards it, because that probe will melt ‑‑ I imagine it will be very popular. So, please be gentle on the host, especially because they are on satellite, so it costs them something.
The version 5 probes are coming. I'm about to approve the test batch. It looks good, so the proper manufacturing of the whole new set is going to happen any time soon. We are still working on the UI; there is a lot of work to be done there, of course, as always. We had a feature request some time ago that the network should be able to measure what is called the reserved space in IPv4, which we did not allow before. But it seems like that address space is actually used on various networks, so we basically removed some of the blockers for this and it's about to be concluded.
We are still looking for sponsors. If you like the idea, you can increase the population of anchors, and as a sponsor you can get probes and a number of other benefits. Please look at that link and you will find out what it really means.
We have all the data in Google BigQuery. You can go there right now; there are articles on RIPE Labs on how to do that, with some examples as well. And we're looking at what this really brings to the community as value, what it brings to NCC operations as value, looking at what the costs of running this are, and trying to determine if this is actually a good idea in the long run.
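For anyone curious what querying that data looks like, here is a minimal sketch using the standard google‑cloud‑bigquery client; the project, dataset and table names are placeholders ‑‑ the RIPE Labs articles give the real ones.

```python
# Sketch only: the table reference below is a placeholder, not the real dataset.
from google.cloud import bigquery

client = bigquery.Client()   # uses your configured Google Cloud credentials
sql = "SELECT COUNT(*) AS n FROM `example-project.example_dataset.example_table`"

for row in client.query(sql).result():
    print(row.n)
```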
And of course, the infrastructure always needs some help, some nurturing, and at the moment, for example, we're looking at whether it would make sense to move our in‑house big data back‑end somewhere onto the Cloud for, you know, reasons ‑‑ cost or efficiency or scalability or whichever reasons those are.
We are about to put out some proposals from the Atlas perspective to change the behaviour a bit. We do not want to make these decisions unilaterally; we want to consult the community to get a sense of what people think, whether these are really good ideas or not. One of them ‑‑ this has been discussed last year or before ‑‑ is that we are really, really struggling in a couple of dimensions to keep up the so‑called non‑public measurements. In theory, this is to be able to make a measurement in the network and ask the system: please do not publish my data. Some people use this to verify upcoming services and servers, but it's really, really low usage and it makes us do all kinds of weird exceptions in the system that we'd like not to do.
Also, we hear a couple of would‑be sponsors saying: we would really like to do HTTP measurements on your network. And it's totally understandable. So far we have tried to focus on the network level measurements ‑‑ pings and traceroutes and DNS and so on ‑‑ but we are open to the suggestion, so what we did was put out a proposal with enough checks and balances to keep the network safe but still allow HTTP measurements to the public. It's obviously going to be some kind of opt‑in from the host perspective, so we're not going to allow HTTP measurements from your probe if you do not want that.
The whole point of this is that we are going to discuss it, probably in the MAT Working Group, maybe on the mailing list, the Atlas mailing list ‑‑ we'll decide, maybe both.
You can find Chris or myself if you want to talk about these topics. We have some probes in the upcoming batch, and there are some target networks we published, so if you happen to be in those networks and can host a probe there, then you might get a physical probe, which has not happened for a while.
Moving on to RIPEstat. If you have been using RIPEstat recently, then you will have noticed that we replaced the UI with a more modern version last year, and the stat team is actually working on what we defined to be feature parity between the new UI and the old one; there are some features in the old one which people like and we are making sure those are migrated into the new UI as well. You can see a number of what we now call info cards ‑‑ they used to be called widgets in the old system. Most importantly I would highlight the BGPlay one; that also works in the new UI now and it has also got a face lift during the transition.
But otherwise it's the good old one from back in the day.
But then, we really do want to phase out the old UI, and what's going to happen there is that we really want to open source it, so you can expect ‑‑ or the current plan is ‑‑ that we will put out all the code and every module that goes behind the old user interface; that's the plan.
We have been incorporating ROA information into some of the widgets now; that has been requested by the community.
And also, the team worked together with various community members ‑‑ they put out a call back in the day ‑‑ to revamp the blocklist feature, which also includes potential whitelist features. So this is coming, this is around the corner; there are some legalities and policies that we need to figure out, and then that is going to be available.
The team has been working on the consistency of the data API ‑‑ there were bits and pieces that were different in different calls and so on ‑‑ so the quality of the API is going to be enhanced, and they are also focusing on operations, to make sure that the service is available and we can scale up if we need to.
Michela [and] Stefan are here if you want to talk about RIPEstat in particular; they are the best people to approach.
And then RIPE IPmap. IPmap still needs a lot of care to make it reusable for the community, but we are working on this as we speak. Just this week, a new DNS engine has been released behind the scenes. This means that we are using the DNS PTR records that exist for the various hops along the traceroute to give a hint to the engine about where those IP addresses might be, and we are using one of the datasets for this.
The various engines ‑‑ that's what we call the components that help the system figure out where the hops of the infrastructure are ‑‑ now work together with each other, so they inform each other nowadays, and the focus from now on is going to be on the crowdsourced engine, because we heard that a whole lot of people would love to help the system and have data that can inform it about where particular prefixes and IP addresses are. So that's going to be our next step. And if you want to talk about these details, then Chris is the best person to approach.
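A minimal sketch of the PTR‑hint idea: reverse‑resolve each hop of a traceroute and keep the hostname, which often embeds a location hint (an airport code or city name) that a geolocation engine can use. This is just the lookup step, not IPmap's actual engine.

```python
# Sketch: reverse DNS for traceroute hops; hostnames often carry location hints.
import socket

hops = ["193.0.14.129", "198.51.100.1"]   # example hop IPs; the second will not resolve

for ip in hops:
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:
        hostname = None
    print(ip, "->", hostname)
```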
Okay. And finally, one slide on RIS.
So RIS is the Routing Information Service ‑‑ or System, I think ‑‑ that collects BGP information. The website has been renewed; it's nice and shiny, it has proper documentation. I encourage you to look at it, it's really a step up from what it was previously. We are still looking for peers, so if you would like to peer with RIS, then by all means come and talk to us ‑‑ in this case Emile or Michela or the RIS team, they are all in the room, so find them. There has been some research published recently based on RIS information; there was this award‑winning Labs article about bias in measurement infrastructure that also touches on RIS, so that's nice, I really love that. And also, on the data level, we have made improvements, so there is better metadata nowadays. What I would like to highlight is the upcoming change: what we have internally is Kafka shuttling all the messages from the route collectors into the system, and what we have been thinking about for a long time ‑‑ and it is around the corner, you can try it out if you want if you talk to us, there is a public instance of that ‑‑ is that, if you know what Kafka is, if you can tap into Kafka flows, we will provide a public Kafka instance of all the messages flying around in ‑‑ I almost said BGPlay ‑‑ in RIS. So, if you want to drink from the fire hose, basically you can. I do not recommend it for everyone because it's not easy to do, but in some cases that's highly useful. For example, what we imagine is that this will be an enabler for having local end points, if you will, for fire hoses all over the world. So all of a sudden RIS data can appear on, you know, the west coast and the east coast and in Asia, and be visible locally, and then for bonus points we imagine that we can run RIS Live on top of it locally over there, so, for example, the Japanese community can just tap into the local Japanese RIS Live instance to observe their prefixes and how RIS sees those prefixes. So that's pretty cool, I think.
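For a sense of what drinking from such a fire hose could look like, here is a hypothetical sketch with the kafka‑python client; the broker address and topic name are placeholders, since the public instance described here was still being set up ‑‑ ask the RIS team for the real details.

```python
# Hypothetical sketch: the bootstrap server and topic are placeholders, not real endpoints.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "example-ris-messages",                        # placeholder topic name
    bootstrap_servers=["kafka.example.net:9092"],  # placeholder broker address
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for record in consumer:
    update = record.value
    # e.g. filter here for announcements of the prefixes you care about
    print(update)
```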
And that's all I have to say. If there are any questions, I am happy to answer or you can find me and my colleagues around.
BRIAN TRAMMELL: Thank you very much Robert.
(Applause)
So, we have got about a minute and 30 seconds left, and I would like to use that to welcome our two new co‑chairs. Stephen, please jump into the queue, and we will ‑‑ I think I have to click some buttons, and then you have audio and video.
STEPHEN STROWES: Hi, my name is Stephen Strowes, you haven't been able to see me until this point. I can't even see if I'm on the screen.
BRIAN TRAMMELL: You are.
STEPHEN STROWES: Sweet. I have been able to see your faces; it's been nice to see some familiar faces even if that's been unidirectional. I am looking forward to co‑chairing the MAT Working Group and I'd like to thank the Working Group for its support. I am an Internet measurements kind of person; I have been in these kinds of communities for a while, working from the research angle, through the ubiquitous deployment of IPv6, through to my time in the R&D team at the RIPE NCC. So, I have worked with a bunch of you. Having bounced between different environments, I think I have got a good sense of how to communicate across people and I like trying to do things this way. I think MAT is a great venue for doing this and I look forward to doing more of that here and on the list. I'm looking forward to getting started.
BRIAN TRAMMELL: Cool. Thank you very much Stephen.
(Applause)
MASSIMO CANDELA: I am going to be super fast. So, hello everybody. My name is Massimo Candela. The first thing I would like to say is thank you, Brian, for everything you have done for this Working Group in the past years, for all the e‑mails of mine that you answered and everything you have done.
So, just briefly: yeah, measurements are something I like. Currently I work at NTT; in the past I was in the R&D department of the RIPE NCC, and I think this Working Group is amazing for supporting and creating solutions and insights both for the operator world and for the academic world, which is something I would like to keep and grow in the coming months. Thank you very much for your support during the selection procedure. That's all.
(Applause)
BRIAN TRAMMELL: And thank you very, very much. Thanks a lot to Massimo and to Stephen for standing and to Nina for continuing. I think we have a really, really, really strong Chair team going forward. I was thrilled to see Nina continue and to see both Stephen and Massimo step up.
I have really enjoyed chairing this Working Group for the past ‑‑ I don't even know how long it's been ‑‑ seven, eight years. Internet measurement remains kind of a hobby of mine, but my day job has been pushing me towards the data centre side of things. I expect you'll see me around other venues on the Internet and I hope to be up on the big stage in the Plenary talking about some of the work that I'm doing at some point in the future.
But, you know, watch this space! So, thank you all very much, so long and thanks for all the fish. Lunch is served.
(Lunch break)
LIVE CAPTIONING BY
MARY McKEON, RMR, CRR, CBC
DUBLIN, IRELAND.