Plenary
Friday, 20 May 2022
At 9 a.m.:
WOLFGANG TREMMEL: I am chairing this session together with Peter. I see we have a packed audience this morning, everybody is fully awake and eager to get started. Let's start this Friday morning, 9 o'clock plenary, and the first speaker is Jaromir Talir talking about using eIDAS to verify the identity of ccTLD registrants.
JAROMIR TALIR: Thank you, good morning to everybody who managed to wake up after the social. I work for the .cz registry. This talk is about digital identity. More and more people are carrying some form of digital identity in their pockets, whether it's a chip card or something on the mobile phone, and at the same time, we see the growing demand to provide the identity of the owners of registry resources, which may be TLDs but may apply also to RIRs. So we, together with a couple of other ccTLDs, took a chance to explore the possibility of using these digital identities, as part of a project we call RegeID which was co‑funded by an EU grant. So this is the content of my talk.
I will briefly describe what is the project about, the output, some showcase from us and some challenges.
The eIDAS regulation is from 2014, quite an old regulation. It actually has two parts. One is for the trust services like certificates, signatures, seals, things like this. And a completely distinct part is about the eID, so I will only be talking about the eID part. The scope of this regulation is to establish the cross‑border recognition of governmental eIDs in the EU or European Economic Area. To make it happen, the regulation made it mandatory for all public on‑line services to recognise these foreign eIDs in every country in Europe, and it's been effective since September 2018. I said it's for public online services; access for private services is left to the decision of each country. Unfortunately, most countries decided not to allow it, but I will get to that later.
The regulation itself introduces the concept of something called level of assurance, which encapsulates on the one side the strength of the eID means, whether it's single factor or multi factor, a software or hardware solution, and on the other hand, the strength of the verification of the identity during the issuance, whether somebody physically checked your ID or there was some remote identification. The level (low, substantial or high) is based on these properties.
And the regulation also defines the mandatory and optional attributes, both for natural and legal persons. Most importantly, it describes the whole procedure by which the countries notify their eID means. By the way, it's complex; it includes something called peer review, where all other countries discuss the features and technical things related to ‑‑ or procedures related to the issuance of the eID before it is notified and starts to be mandatory to recognise by the other countries.
On the technical level it's designed in such a way that there is a network composed of nodes in every country, which hides the differences in each country, and there is a profile of an authentication protocol that is mandatory for the communication between the nodes. Each node connects, within the country, the local on‑line services that request authentication and the eID means that provide the authentication.
It's maybe seen a little bit in this schema, the diagram where you can see that each country is running a node. They have mutual relations like a peer‑to‑peer network, where each two countries must exchange the URLs of their services and the public certificates to establish the trust and to verify the signatures. You can also see here that, for example, if an Estonian service wants to provide authentication, it has to be connected to the Estonian node. If I want to use it, I am first redirected to the eIDAS node of Estonia, where I select that I am from the Czech Republic. We are operators of the Czech node as well. Then I'm redirected to the Czech governmental identity solution where I identify myself, and on the way back, all my attributes are encapsulated, signed and transferred back to the Czech eIDAS node, then to the Estonian node and to the service. This is one way it can be done. There is another way which is more decentralised, and it's possible for each country to decide whether they will go the decentralised or centralised way. In the decentralised approach, the country that decides to go this way has to provide software that other countries deploy on their premises, and authentication is local. There is only one country that decided to go this way, our host Germany. It's definitely better from the privacy perspective, but it would be quite a nightmare for the operator of the node to have software from 27 countries and take responsibility for that. So luckily for us, it's just Germany, and all other countries decided to go the centralised way.
Germany was also the first country, in 2017, that notified the ‑‑ now the number is constantly growing every year. At the moment there are 18 countries that are already providing notified eID means to their citizens that can be used cross‑border. There is one more country coming, Norway, which has passed the peer review but has not yet notified, and there are about 9 more countries coming. I should also say that countries may have multiple eID means provided to the citizens. For example, in the Czech Republic we don't have just the Czech citizen card; we as a registry are also operating an identity service, mojeID, which went through this notification procedure and will soon also be available for cross‑border use.
So there are a couple of countries that are missing, but even if you are from a country that doesn't provide you with an eID, there are some opportunities to get one. I know of at least two countries. Estonia provides something called e‑Residency, and if you pay €120 you can get a card that you can use for online transactions around the whole of Europe. Germany is the other country, and it's cheaper: for €37 you can get a German card that you can use in the whole of Europe for electronic transactions. I will stop praising Germany in this presentation.
So, the last thing about eIDAS: it's an old regulation, eight years already, so a process of revision was started last year. The proposal introduces a wallet to store the identity information. It's even more decentralised than the current centralised approach. There are also possibilities for it to be more widely used in the private sector; the authors would probably even like the big platforms to be obliged to use it. The legislative process is going on right now together with the technical solution in parallel, and maybe some outputs will be available at the end of this year, so ‑‑ but it's not related to our project so I will just skip that.
So, in the previous programme period the European Commission was, I guess almost every year, providing possibilities for online e‑services to connect to the ‑‑ this is how they call it, eID digital service infrastructures, and they provided 70% funding to do that. So we formed a consortium with a couple of other companies and tried to submit a proposal. The first time, in 2018, we failed; there was a short time to prepare that submission. But we worked on it, tried again the next year, and then we succeeded.
Here is the list of project partners: the Netherlands, Denmark, Estonia and the Czech Republic, together with CENTR, the association of mostly European ccTLDs, and a Dutch company that was part of SIDN but was later sold [to] Signicat.
For us as the registry, the project was to connect our registrant‑facing portals to this network and provide the possibility to link the eID with the contact in the registry. For the ‑‑ the main goal was to carry out research into the potential uses of these tools among the registries and registrars, and it was mainly about the dissemination of the ‑‑ these ideas and achievements in the domain industry.
And the results: to fulfil the visibility requirements, we created a website and made a video explaining these things. I will talk later about the research. We also connected all the portals to this network, so at least the first two, the Estonian and Czech solutions, are live, so you can just click and, if you have an eID, you can try to log in there. For Denmark, I was told it's being put into production as we speak, so maybe in the afternoon or maybe on Monday you will have the possibility as well. And SIDN have the solution in preparation, but it will take them a couple more weeks to get into production. So you can just click and try.
I talked about research there. There were two surveys, among the registries and the registrars. The results for the registries: they definitely feel it's important to verify the registrant data, but they have limited opportunity for that. They mostly do it by paper‑based procedures, so it's hard. Some of them already have the possibility to use their own national eIDs for the purpose, so it's easier for them to verify identities. The vast majority would welcome the opportunity to use this. There are some concerns, definitely: the limited availability for legal entities, and also some registrars are outside Europe, but at least from our perspective it's just a small minority, so it's not a big issue.
For the registrars, definitely they are a mixed bunch. They mostly operate on a global scale, so they would like to have a solution that is standardised across the world for all their customers and for all TLDs, which could be hard. Surprisingly, they don't see themselves as primarily responsible for the validation of the registration data, but they know that they are part of the ecosystem so they should participate somehow, and of course they are afraid of the cost and complexity of these solutions.
So now the showcase from the .cz registry. We maintain the Open Source registry that is used by many other countries. So we made the changes in the core of the registry: new database tables to maintain these eIDAS identifiers linked to the contact, and new APIs. We have the registrant portal that we call domain browser, which has some features for registrants that have been there for a long time, so we haven't touched those in this project. Users can go there and see all their domains regardless of the registrar they used for registration, they can set the security feature called registry lock on the domains, and they can even see the DNS traffic on the domains if they want to. We provided this service to all the Czech registrants that have our identity service, so as part of the project we extended the authentication possibilities to the ‑‑ to using the eIDAS network for the foreign registrants in our registry. It's a win‑win situation: we provide some features, and if they link, they can use all these features; and for us the advantage is that if they link, we know who they are.
So here are some screenshots from the system. I could do a live demo, but this is probably better. This is the home page where the new button was added, login with your ID. By the way, these are from the testing environment, so all the identities in this environment are fake. I have some backup slides at the end so you can actually go through the whole process yourself: you can register a domain and a contact on the test environment and try to link it, so just try it.
You can also see here that ‑‑ I said there are 18 countries already notified. Here there are more, about 25 in the test environment, so you can see the difference: almost the whole of Europe is already sort of trying this technology, and even though they have not notified these ‑‑ they are at least in some phase of preparation. For the purpose of testing it's always good to use the Swedish solution, because in their testing environment they have the possibility to select some prepared identity from a list, so you don't need to have any credentials and you can still demo the authentication transactions. So here I will just click this demo Sweden Connect reference IdP and select the identity for this case, pretending to be her. As a next step I have to consent that the data here will be transferred from the identity provider back to the service. This is the information that is transferred: name, date of birth and the unique identifier. There is another consent on the Czech side, which is weird, but that is how it is at the moment. And then you end up back at the ‑‑ our application, the domain browser, and here is the most important part of the changes that we had to implement, because this is where the ‑‑ this identity matching takes place. In our case we decided that, to verify that it's you, you still need to provide the handle of your contact in the registry and something called authorisation information, or auth info, that is mostly used for transfers. It's a shared secret between the registrant and the registry, so the registrant should somehow know it or know how to get it from the registry. So if the registrant fills in the information, the handle and the auth info, then he gets right into the service and can enjoy the features. If the user does that a second time, the previous step is skipped because we ‑‑ already have this link, so there is no matching on the way.
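The matching step described here can be illustrated with a minimal Python sketch. It is not taken from the .cz registry software: the `Contact` and `Registry` classes, the `login` method and the sample identifier are hypothetical names, and refusing a contact that is already linked to another eID is an assumption made only for the illustration.

```python
import hmac
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Contact:
    handle: str                      # contact handle in the registry
    auth_info: str                   # shared secret between registrant and registry
    eidas_id: Optional[str] = None   # linked eIDAS unique identifier, if any


@dataclass
class Registry:
    contacts: Dict[str, Contact] = field(default_factory=dict)
    links: Dict[str, str] = field(default_factory=dict)  # eIDAS identifier -> handle

    def login(self, eidas_id: str, handle: Optional[str] = None,
              auth_info: Optional[str] = None) -> Optional[str]:
        """Return the linked contact handle, or None if the matching fails."""
        # Returning user: the link already exists, so matching is skipped.
        if eidas_id in self.links:
            return self.links[eidas_id]
        # First login: both the handle and the auth info are required.
        if handle is None or auth_info is None:
            return None
        contact = self.contacts.get(handle)
        # Assumption: a contact already linked to another eID is not re-linked.
        if contact is None or contact.eidas_id is not None:
            return None
        # Compare the shared secret in constant time.
        if not hmac.compare_digest(contact.auth_info.encode(), auth_info.encode()):
            return None
        contact.eidas_id = eidas_id
        self.links[eidas_id] = handle
        return handle


registry = Registry()
registry.contacts["CONTACT-1"] = Contact("CONTACT-1", "s3cret")
first = registry.login("SE/CZ/demo-id", "CONTACT-1", "s3cret")  # links the eID
again = registry.login("SE/CZ/demo-id")                         # no matching needed
```

The second call succeeds without handle or auth info, which mirrors the point that the matching only happens on the first login.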
So, now the important project challenges. Definitely the access to the eIDAS node is the worst problem. As I mentioned, every country has different rules and requirements, and it's mostly open only for governmental organisations. The reason is liability: they are afraid that some private organisation will sue them if ‑‑ somebody used the governmental eID for some bad things, so they would rather not touch this issue, or some of them provide the possibility as a paid service to cover that cost. There are some exceptions, mostly in cases where the private companies are fulfilling some governmental role or are bound by some legislation, which is actually the case, for example, in Denmark, where DK Hostmaster operates under a domain act that actually requires them to verify the identities of all Danish registrants, or registrants with a .dk domain. This serves as a good example: they said, we must do that, so please give us access to the eIDAS node. The other TLDs are mostly non‑governmental organisations, so the access is rather exceptional, but luckily ‑‑ before we submitted the grant, we had to negotiate with our agencies in this area to get approval for access to the node, so we explained to them why this is useful and how it will help security on the Internet and, in the end, they gave us this access.
A slightly provocative statement: with respect to this upcoming NIS legislation that will maybe impose such requirements, it can be seen as an opportunity rather than a threat, because if there is something like that, we can say: you want this from us, so give us the tools that we can use for this verification.
The other thing that we talked about is how to select the proper level of assurance. This is up to each service, whether single‑factor authentication is enough for them or they want some stronger features. We thought it would be great to agree on the same level across multiple TLDs and not have every country be different. When you think about this, it's important to look not just at the required strength but also at the availability. For example, in Denmark almost every citizen has this NemID, the governmental identity, but from the perspective of eIDAS it's only at the substantial level, so if ‑‑ we had decided to require the high level of assurance, we would probably lose the whole country. So in the end we decided to require the substantial level, which we think is enough. Even in our country, almost all governmental services require only the substantial level. So we went this way.
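The trade‑off between required strength and availability can be sketched in a few lines of Python. The three eIDAS levels (low, substantial, high) are ordered, so a service accepts any eID means at or above its required level. The scheme names below are placeholders, not real notified schemes, and the function names are hypothetical.

```python
from enum import IntEnum
from typing import Dict, List


class LoA(IntEnum):
    """eIDAS levels of assurance, ordered from weakest to strongest."""
    LOW = 1
    SUBSTANTIAL = 2
    HIGH = 3


def is_accepted(presented: LoA, required: LoA) -> bool:
    # An eID means is accepted when its level is at least the required one.
    return presented >= required


def usable_schemes(schemes: Dict[str, LoA], required: LoA) -> List[str]:
    # Names of the schemes a service could still accept at this requirement.
    return sorted(name for name, loa in schemes.items()
                  if is_accepted(loa, required))


# Placeholder data for illustration only.
EXAMPLE_SCHEMES = {
    "scheme-a": LoA.HIGH,
    "scheme-b": LoA.SUBSTANTIAL,
    "scheme-c": LoA.LOW,
}

at_substantial = usable_schemes(EXAMPLE_SCHEMES, LoA.SUBSTANTIAL)
at_high = usable_schemes(EXAMPLE_SCHEMES, LoA.HIGH)
```

Raising the requirement from substantial to high drops `scheme-b` from the list, which is the effect described for a country whose widely used eID is notified only at the substantial level.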
The other challenge was ‑‑ I mentioned the distinction between natural and legal persons. Even though the regulation includes the possibility to transfer the attributes of legal persons to the service provider, it's almost unused. At the moment there are only two notified eIDAS schemes that provide at least some of these attributes, not all of them, and it's the ‑‑ something called eHerkenning in the Netherlands. Even though in the registry about half of the registrants are natural persons and half are legal persons, we decided to accept this situation and we concentrated only on natural persons.
So the next challenge was insufficient data to do this matching. The mandatory data that must be transferred is first name, surname, date of birth and identifier. There are a couple more attributes and they are optional. This is not ‑‑ maybe if there were an address, that could be enough, but only half of the countries provide the address in the eID, and what is worse, the address in the transaction is in structured form and different countries have selected different subsets of it, so if we had implemented that, we would have had to implement parsers for each country. So in the end we decided not to rely on the current address and to use some different approach for the identity matching procedures.
We explored multiple possibilities and implemented them in each country. For example, one way is actually to take this person identifier and pass it ‑‑ via a registrar, to the registry, which is a nice way because you do it at the beginning and then when you log into the registry, you already have this match established. This approach was selected by Estonia, and the reason is that everybody there knows their identifier; it's printed on the card or they can see it on the phone, so it's quite easy for them to take it and put it in during the registration when they fill in the information about themselves, like name, surname and the ID. But this is not the case for all other countries. I don't know my eIDAS identifier; it is some random hash stored somewhere in the registry. So for this purpose Estonia implemented something so that if you log into the Estonian registrant portal, you will see the identifier, and then you can cut and paste it into the registrar portal for the registration, so you can imagine how easy that is from the user perspective.
The other option, which I have shown on the screenshots, is the one we selected, using auth info together with some check of the other data like surname and date of birth. The other approach, selected by Denmark, was using some pre‑existing authentication, because if you register a .dk domain you immediately get an account, a user name and password, from the registry. Later on, if you want to link the eID, you can use this to prove that it's you.
For .nl, they have a procedure where they send verification links over e‑mail, and I think that involves some manual checks as well, so different procedures.
So, if you link the contacts with identities, now you have the issue of ‑‑ how to update the data, because if you have a verified set of data you probably don't want the registrar ‑‑ or the registrant via the registrar, to remove it or update it randomly. We thought that we could maybe take the contact out of the registrar's responsibility and maintain it ourselves, but we didn't go this way; rather, we used these statuses on the field like server update prohibited that registrars are sort of used to, where we indicate that the registrant cannot update these data. With every authentication the new data can get into the system, and we inform the registrars about the change.
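A rough Python sketch of this update policy follows. It assumes, for simplicity, that the prohibition is a single status on the whole contact; the talk leaves open exactly where the status is set. The class and function names are hypothetical, and the list of notification strings only stands in for whatever mechanism informs the registrar.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Status name in the style of the EPP "serverUpdateProhibited" status value.
SERVER_UPDATE_PROHIBITED = "serverUpdateProhibited"


@dataclass
class Contact:
    data: Dict[str, str]
    statuses: Set[str] = field(default_factory=set)


def registrar_update(contact: Contact, changes: Dict[str, str]) -> bool:
    """Apply a registrar-initiated update unless the contact is protected."""
    if SERVER_UPDATE_PROHIBITED in contact.statuses:
        return False
    contact.data.update(changes)
    return True


def eid_refresh(contact: Contact, verified: Dict[str, str],
                notifications: List[str]) -> Dict[str, str]:
    """Store data from a fresh eID authentication and protect the contact."""
    changed = {k: v for k, v in verified.items() if contact.data.get(k) != v}
    contact.data.update(changed)
    contact.statuses.add(SERVER_UPDATE_PROHIBITED)
    if changed:
        # Placeholder for informing the sponsoring registrar about the change.
        notifications.append("changed: " + ", ".join(sorted(changed)))
    return changed


contact = Contact({"name": "Old Name"})
notes: List[str] = []
before = registrar_update(contact, {"name": "Typo Name"})   # allowed: not yet verified
changed = eid_refresh(contact, {"name": "Jane Doe"}, notes)  # verified data wins
after = registrar_update(contact, {"name": "Someone Else"})  # now refused
```

Only a new authentication changes the protected data in this sketch; registrar‑side updates are refused once the contact is linked.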
And the last challenge: you could guess that there's quite a horrible user experience aspect to this. There are multiple parties involved in a single transaction and a lot of redirects on the way. There is a different UI in each country, and it's of course hard to trace failures: if there is some error on the way, you don't know whom to reach to find out what happened. The European Commission prepared a new logo that every country should use and made usage guidelines, but it's not mandatory to use them, so not every country follows these guidelines.
So my last slide is the conclusion. If you ask me whether eIDAS is the final solution for this, definitely not yet. The EUDI wallet is coming but it may take some time. Anyway, it could still be a useful tool: if you ‑‑ if you fulfil all these requirements and accept all the limitations, you can use it. We have the ‑‑ we see it as infrastructure that we have created during the project, and you can build on that; ‑‑ for example, offer more verification possibilities when we are trying to tackle the problem of some suspicious contacts, or ask them to verify and offer them this as a possibility. So thank you. I am happy to answer your questions or comments.
(Applause)
PETER HESSLER: So thank you very much. I will be moderating this part of the Q&A.
SPEAKER: Speaking for myself. If I authenticate and you get my personal data from the central whatever register there is, is there a push mechanism to let you know that, for example, my whatever permanent residence address has changed or not?
JAROMIR TALIR: No. This is something that is not expected because it would be hard for each ‑‑ because the way the eID is done is different in every country. So, yeah, the only way at the moment is, if the country is providing the current data at the moment of the authentication transaction, you just authenticate again and everything goes through from the citizen registry back to the service.
SPEAKER: Understood. I am asking specifically because usually the register has in its conditions that you have to keep your data up to date and blah‑blah‑blah, and it's total pain to, if you change your address, it's total pain to go and update all the places where the address is so that's why I am asking, thank you.
PETER HESSLER: Apparently, we have no written questions, no one in the queue. Last call for coming up to the microphones, say something at the mic? And thank you very much.
(Applause).
PETER HESSLER: Next up we have Marcin Nawrocki.
MARCIN NAWROCKI: I'm Marcin and I am from Freie Universität Berlin, this is joint work with Nick from Microsoft and also others and... Hamburg University of Applied Sciences.
So this is the last plenary session, I want to have a quick recap what happened during the first session.
As you can see, Thomas King said that latency is the new currency of the Internet and I agree. So, we really try to deliver content as quickly as possible to clients on the Internet. By pushing the content as close as possible to the client, but this is not always possible, right? And so the latency gets high. And what's the other possibility that we have, what can we do? Well, basically, we can design protocols that can deal with the high latency. So we designed better protocols and one of these new protocols is QUIC, and if the latency is high, every roundtrip time hurts so QUIC was designed to have a handshake process which only requires one roundtrip time. Currently when you visit the website you usually have at least three handshakes, so we have TCP and ‑‑ at least three roundtrip times and QUIC tries to reduce this to one roundtrip time handshake.
Also the other design goal of QUIC was to reduce or prevent UDP amplification attacks, so we did not ‑‑ we did want to prevent the mistakes we made before; DNS amplification attacks are still very common. So there is an upper limit on what servers can respond with: a server only responds with at most triple the data size that it received from the client. So these are the design goals.
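The anti‑amplification rule is simple arithmetic, and a short Python sketch makes it concrete. The factor of three comes from the talk; the function names are hypothetical and the sketch counts bytes only, ignoring everything else a real QUIC stack tracks.

```python
AMPLIFICATION_FACTOR = 3  # a server may send at most 3x what it has received


def send_budget(bytes_received: int, bytes_sent: int = 0) -> int:
    """Bytes the server may still send before the client address is validated."""
    return max(0, AMPLIFICATION_FACTOR * bytes_received - bytes_sent)


def respects_limit(bytes_received: int, bytes_sent: int) -> bool:
    """True when an observed response stays within the amplification limit."""
    return bytes_sent <= AMPLIFICATION_FACTOR * bytes_received


minimum_initial_budget = send_budget(1200)      # 3 * 1200 = 3600 bytes
remaining = send_budget(1200, bytes_sent=3000)  # 600 bytes left
```

With the minimum initial size of 1,200 bytes the server has a budget of 3,600 bytes for its first flight, which is why a larger client initial leaves more room for the server's response.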
For all of you who have not seen a QUIC handshake in practice, this is how it looks. Basically we have the client, and the client sends an initial packet, and this is very much like a TCP SYN packet. However, there's a large difference because the initial messages from the client are actually quite large: an initial message is at least 1,200 bytes. When the server receives the initial packet it responds with an initial message as well, but also with a handshake message which includes all the TLS certificates and so on.
So, we actively scanned 1,000 top domains and what we see so far is that around ‑‑ or exactly 169 servers support currently QUIC out of these thousand domains and the results that you see are based on those 169 servers or domains.
So have the design goals been met? We think no, and my job today is to walk you through our plot so that you hopefully come to the same conclusion.
Our client here was sending QUIC initial packets with a size of 1272 bytes and we see three things that can happen. Basically this is a bar chart, and for roughly half of the servers we see that the server responds with more than three times the data, more than it's actually allowed, so these servers do not respect the threshold that QUIC defines. For the second half, we see a bottleneck: the handshake requires more roundtrip times, so it's not the fast one‑roundtrip‑time handshake that we expected. And the very thin green bar that you see is the optimal case that we hoped for ‑‑ this is the one‑roundtrip‑time handshake.
So, the red bar is basically an implementation issue, that's what we think, because this limit is specified by the RFC, so there is likely an implementation which simply does not respect this limit. The orange bar was a little bit more complicated, this is probably configuration issue, but we will look into this.
So, my talk is more or less structured around this plot, so we will now look what happens when we go to the left, so what happens when we see larger and smaller QUIC initials and we will talk in detail about the red, orange and green bar.
So if we go to the left, so we send smaller QUIC initials with the client, almost all handshakes require multiple roundtrip times, and we expected this because the servers are not allowed to respond with so many bytes. Then if we go to the right, we see that more fast one‑roundtrip‑time handshakes take place; it's still only a small fraction of the total handshakes, but this is better than with the initial size that we used.
Now, the last part is tricky because if we use initial sizes close to the max, what we see is we reduce the reachability and do not reach the servers, this is because QUIC forbids IP fragmentation and this way when we use large QUIC initials we are not able to reach the servers.
So, let's take a look at the red bars. So, again, everything in red did not respect the amplification limit and we wanted to know how bad is it, so the responses were larger than 3 X but the good message here was that actually, the largest response that we observed was not larger than 4.4 basically, so to ‑‑ so, what we see here is basically that it's not as bad as we expected; however, some implementations still need bug fixes because they do not respect the RFC. We are still in the process to figure out which implementations they are actually.
If we focus on the orange bars, we tried to find out what causes multiple roundtrip times. There are two possible reasons. The first is very much like a TCP SYN cookie, a form of DDoS prevention: before the server actually responds with an initial, as I showed you before, it sends a retry packet. In doing so it does not allocate any memory for the connection, performs almost none of the TLS work it normally does, and waits for the client to come back. But only two domains actually use the retry tokens, so this is not the reason for the multiple roundtrip times we see. And these two domains were operated by one operator, which runs a Russian‑speaking social media site.
The other reason is very large TLS certificates, and this is the majority. So, again, we have a CDF which shows us the distribution of the TLS data size that we received while performing the QUIC handshake, and what you can see is that we usually receive more than 4,000 bytes. So everything that you see here in the upper right area, which is the largest area, actually triggers multiple roundtrip times, and this is bad. So what we see here is that TLS data, especially the large certificates, are responsible for badly performing QUIC handshakes.
So, let's talk about the green part. If you ask me, based on our measurements what is currently the best initial size for QUIC handshakes, at least for these thousand domains that we measured it is around 1,350. Again, this is a trade‑off between very small initials which trigger multiple roundtrip times because of the large TLS certificate but on the other side, we have the problem that large initials actually reduce the reachability because QUIC forbids IP fragmentation and the packets simply got dropped.
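The trade‑off can be written as a small, deliberately simplified Python model. It assumes that a handshake completes in one roundtrip only if the server's whole first flight fits within three times the client's initial size, and that an initial larger than the path's maximum UDP payload is dropped because it cannot be fragmented. The 1,472‑byte payload limit (a 1,500‑byte MTU minus 20 bytes of IPv4 header and 8 bytes of UDP header) and the function names are assumptions for illustration, not measurements from the talk.

```python
import math

AMPLIFICATION_FACTOR = 3


def classify(initial_size: int, server_flight_bytes: int,
             max_udp_payload: int = 1472) -> str:
    """Classify a handshake attempt under the simplified model."""
    if initial_size > max_udp_payload:
        return "unreachable"   # QUIC forbids IP fragmentation, packet is dropped
    if server_flight_bytes <= AMPLIFICATION_FACTOR * initial_size:
        return "1-rtt"         # the whole server flight fits in the budget
    return "multi-rtt"         # the server must wait for the client first


def min_initial_size(server_flight_bytes: int) -> int:
    """Smallest initial size whose 3x budget covers the server flight."""
    return math.ceil(server_flight_bytes / AMPLIFICATION_FACTOR)


small = classify(1200, 4000)
medium = classify(1350, 4000)
large = classify(1480, 4000)
needed = min_initial_size(4000)
```

For a server flight of 4,000 bytes this model needs an initial of at least 1,334 bytes, so a 1,200‑byte initial takes multiple roundtrips while a 1,350‑byte one fits, and anything above the assumed payload limit does not arrive at all.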
So, what do we learn from our measurements, is there anything we can recommend? Well, reduce the size of TLS data. I know this is quite difficult because the TLS ecosystem is quite complex, but maybe you can optimise something in your TLS data so that you can actually use the one‑roundtrip‑time handshake that QUIC offers. If you can't, on the other hand, and you are forced to use multiple roundtrip times, we recommend activating the retry tokens, because if you already have multiple roundtrip times you get DDoS prevention at no additional cost.
So all the measurements are based on QUIC reach. QUIC reach is an Open Source tool that we released, and you can download it and measure your site, so go ahead; we would like to receive feedback. QUIC reach is based on MsQuic, which is the default implementation of QUIC by Microsoft and which will also be included in Microsoft projects in the future.
So with that, I am coming to an end, so use our tool, check your QUIC servers and with that I am happy to receive questions. Thank you.
(Applause)
WOLFGANG TREMMEL: Are there any questions? I see Jim Reid is ‑‑ oh Gert.
JIM REID: Great work, very interesting stuff, so well done to you for producing this interesting information. Just a very simple question: Are you planning to take your findings to the IETF because I think there's probably a good idea for writing up some advisory guidance on how to deploy and use QUIC, especially if the problem is misconfigurations?
MARCIN NAWROCKI: So this is still work in progress. The things I presented, we are still working on the details and yes, we are planning to present the stuff at the IETF and other conferences.
JIM REID: Great stuff.
GERT DOERING: Good morning. Great stuff on a Friday morning, really kept me awake, come here. I am curious about the big TLS bits of it. I admit I have not looked into this in much detail yet. I know that in the DNSSEC arena, there is a movement going from RSA keys with huge bit sizes to elliptic curve to get the packet sizes down. Is this something that could help here or am I totally on the wrong track?
MARCIN NAWROCKI: You are actually correct. Let me see. Yeah, so again, this is work that we still ‑‑ it's ongoing work but we still try to find out what makes TLS certificates ‑‑
GERT DOERING: And the subject alt names.
MARCIN NAWROCKI: Yes. So again, we are pretty sure that people can use better algorithms, better ciphers to decrease the footprint. The second thing: because there are large deployments, CDNs, the certificates we receive have a lot of names. And the other thing: if you receive a certificate for a website it's not only this one, you usually receive the whole certificate chain, and this also makes it larger, so it's actually a team effort.
GERT DOERING: Thank you.
WOLFGANG TREMMEL: Any more questions? Are there any written questions?
PETER HESSLER: There are no written questions.
WOLFGANG TREMMEL: All right. Then, thank you very much.
(Applause)
WOLFGANG TREMMEL: So, the next one is remote presentation ‑‑ okay, here he is. Hello, Geoff, hello Australia, do you hear us?
GEOFF HUSTON: This sounds like Eurovision again, doesn't it?
WOLFGANG TREMMEL: 10 points, all right. Okay, the next one is Geoff Huston talking about revocation. Geoff, go ahead, thank you.
GEOFF HUSTON: Thank you very much, and I am just doing the entire screen fumble at this point to try and figure out how to make sure that things are happy sharing my screen. Let me try once more in the Chrome tab. Have I got a screen share running or a permission? I have uploaded my slides, can they present my slides from that side, please? Oh, you are so good.
WOLFGANG TREMMEL: We have the slides now.
GEOFF HUSTON: Thank you, thank you very much, that's perfect. This follows on from the previous two talks but it introduces, I suppose, a subtle set of twists, and so I'd like to talk today about certificate revocation. Some of you may have noticed there was a bit of a kerfuffle over in Europe, still is really, and one of the things that happened in early March was that a number of Russian banks had their certificates revoked because of the US sanctions. In particular, some banks had their certificates revoked by Thawte as a CA; it was pretty prompt, it was March 11th, and when you look up the logs that certificate is marked as revoked. So what does this mean? Well, in theory, a revoked certificate says this is not a good certificate, do not go here, this is bad. So, if certificate revocation worked, then clients shouldn't be able to connect to you, because if the certificate is revoked this whole TLS handshake should in theory fail; somewhere down in the bowels of the TLS exchange should be a handshake around this certificate, and the revocation status should pop up by some magic, I guess, and say no, not a good certificate, it's been revoked. So that was the theory. The revocation of those certificates by Thawte would basically take that website down or cause whatever website it was to look for another certificate authority that the rest of the world trusts. So, this was indeed intended to be a disruptive action, but let's just take a step back here and actually look at whether this works, and look at the implications of what revocation is all about.
So, it's a bit hard to start with the Russian banks, so let's start with one of my banks, an Australian bank, the Commonwealth Bank of Australia. It's a very grand name and there is the front screen; it immediately jumps to the bottom line: Hi, tell me your secrets. Now, why should I trust it? That's a really damn good question, because it's very hard to tell if that's my bank or a clever scam. If I look behind that web page, I'm actually not banking with the Commonwealth Bank of Australia, I am banking with the bank of Akamai, and you probably are too. Most of these retail banks actually use various forms of CDNs these days, and so if I was a clever scammer, I would actually do my scamming using Akamai; it's much harder to actually find out that I'm being a naughty person. But my question is still, why should I enter my credentials into this screen? Why should I tell, theoretically, Akamai all my secrets? Why should I trust this?
Well, damn fine question. So if you click on that padlock, it tells you some very reassuring stuff. This certificate was issued by a company called Entrust, and anyone who calls themselves Entrust immediately scores minus points, in my book; it's not even null points, it's negative, because Entrust sounds about as dodgy as you can get. I have never met them, I have no idea what this means, so I am still not very happy so what does the certificate actually say? Let's sort of figure out if this certificate is worth trusting. So show certificate.
Now, my Mac, trusting little soul that it is, says it's valid. So what? My Mac says lots of things are valid, maybe it shouldn't. Let's dig into this a little bit deeper.
Not only does it have a green tick but there are a couple of dates down there. This certificate was issued in August 2021, and it's going to be valid until August 2022. Now, hang on, that's a 12‑month certificate, and at the time I looked it was already seven months old, and it doesn't matter if I look again next month or the month after, it will still be trusted. This is an amazing piece of predictive trust, because what happens if the private key of the Commonwealth Bank of Australia is leaked in the next five months? What if Entrust, despite such a wonderful name, is actually breached in the next months and its certificates leak? The real issue is, what if something happens between now and August of this year that says this certificate shouldn't be trusted?
Okay. So what do I do? Well, it's the same answer as we started with: It's revocation. Now, normally a certificate is valid for those dates, but if something bad happens you want to unsay it, you want to suck it back out of the world, and the way that you do this in the X.509 world is that every certificate authority maintains this thing called a certificate revocation list. Now it's really simple, it's just a list of all the serial numbers that are dud, where for whatever reason some bad thing has happened, or maybe it's just part of their process: don't trust this certificate any longer; it cannot be trusted. No.
Then what they do is simply add that serial number to the certificate revocation list, as simple as that.
So, if I'm worried about the Commonwealth Bank of Australia's certificate because I suspect something dodgy has gone on, or even if I just don't know, maybe I should check the certificate revocation list to see if that serial number has been listed. But it's not me, it's my Mac, it's my browser, it's down in the machinery. So, how does my client, my browser, perform this check? So, here is this certificate which has been signed and, inside there, there is an extension field and, if you can read the 3‑point font I have put up there to test exactly how big last night was and whether your eyeballs are still working, there is an extension there called a CRL distribution point, and there is a URI associated with it, so in the case of the Commonwealth Bank, crl.entrust.net/something or other, and that is where the certificate revocation list for Entrust resides.
Well and good. It's not an https URI. But, but, it is a signed object, it can be validated, and you should validate it. So in some ways, authenticating the object itself is what's necessary, and whether you protect the transport with TLS matters less, because essentially you are going to validate what you see when you finally get it; https is not strictly necessary here. But here is the certificate revocation list, so what's in it? Well, I got it with wget from that URL and, using the Texas Chainsaw Massacre of all things PKI, the Swiss Army knife, OpenSSL, there are commands that say treat this gobbledygook as a CRL and tell me what's going on.
So it says I am a certificate revocation list and I got produced on March 14th and I'm not going to produce another one until March 21st. So if anything bad happens on March 15th, hang on, I won't tell you until March 21st. Why not? Oh because ‑‑ no reasons, thanks. I appreciate your diligence, people.
There were also 5,072 certificates on that list as of March 14th which Entrust had revoked. For reasons that I still don't understand, they tell you the reason code as to why each one was revoked. I really don't understand why. Bad is bad. I am bad because I am superseded: well, I am not going to trust you. The key got compromised: I am not going to trust you. Why are you telling me this stuff? A lot of them are saying it's superseded, it's just the process. Interestingly, out of those 5,072, 387 said oops, the key got compromised, which is actually, if it's the truth, a surprisingly big number. Wow. Didn't expect that.
Okay, so what does the browser do? It retrieves this list of 5,072 serial numbers, it validates the digital signature of that CRL against the CRL contents to make sure it's the real deal, checks that it was generated with the CA's private key, and goes through the process of building a validation chain all the way back to a known trust anchor; much as with the original certificate in TLS, you have got to do the same procedure against the CRL. You check the dates, because if the CRL is out of date, or if you are before the validity date, not good. You should be within that window. By the way, your own clock should be accurate, and it's often the case that it's not. Then I look for that certificate serial number and if I find it listed there, that's a really bad certificate. It's a dud. Don't go there, even the CA is saying this is bad.
Sounds good. Sounds brilliant. Sounds, I don't know, does anyone actually do it? Well, no. Why not? Because why do I have to download 5,072 serial numbers just to check on one? Why do I have to go through this entire sort of dance to retrieve all of the revoked certificates from that CA just to check one?
Now, I think I understand what's going on here, and I am old enough to remember credit cards, you know, pre‑coded and all that, where we used to run with bits of plastic, and before online networks they used to circulate to retailers a list of the bad credit cards of that time, and send this list out through the mail once a week, because crooks move slow, don't they? The shop assistant, if they thought your credit card was a bit dodgy, would look through the entire list of bad credit card numbers and if yours was listed, they weren't going to give it back; that was a bad credit card, not good for credit. So it might have worked in this bygone age but this is just overkill, it's not going to work that way.
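The client-side CRL check described above can be sketched in a few lines of Python. This is a minimal illustration only: the `Crl` class, the `check_serial` function and the serial numbers are hypothetical, and the sketch assumes the CRL's signature and validation chain have already been verified, which a real client must do first.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Crl:
    """Hypothetical view of a CRL whose signature was already validated."""
    this_update: datetime
    next_update: datetime
    revoked_serials: frozenset


def check_serial(crl: Crl, serial: int, now: datetime) -> str:
    """Return 'stale', 'revoked' or 'not-listed' for one serial number."""
    # The CRL is only usable inside its [this_update, next_update] window,
    # which is why the client's own clock has to be accurate.
    if not (crl.this_update <= now <= crl.next_update):
        return "stale"
    # The whole list was downloaded just to answer this one lookup.
    return "revoked" if serial in crl.revoked_serials else "not-listed"


# Dates follow the example in the talk; the serial numbers are made up.
crl = Crl(
    this_update=datetime(2022, 3, 14, tzinfo=timezone.utc),
    next_update=datetime(2022, 3, 21, tzinfo=timezone.utc),
    revoked_serials=frozenset({0x1001, 0x1002}),
)
print(check_serial(crl, 0x1001, datetime(2022, 3, 15, tzinfo=timezone.utc)))
```

Note that "not-listed" only means the serial was not revoked as of `this_update`; anything revoked later in the week is invisible until the next CRL is issued.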
So, how do we make this better? Let's go to plan B. Plan B is the Online Certificate Status Protocol, OCSP, and what it does is replace downloading the entire list with a query/response protocol. So I query that CA and I say, here is the serial number, what do you reckon? And in essence, the CA looks up its own certificate revocation list to see if it's there or not and says good or bad, which kind of sounds interesting, right? So, what we are seeing now is, in the TLS handshake, the TLS server sends over the domain name certificate and then the client, in that certificate, finds the OCSP portal and says, Hi, I have got a question for you, CA, what do you think about this particular certificate? You have to ask the CA because if you ask the TLS server and it's trying to hoodwink you, you are not going to get an answer you should believe anyway. So this referral back to what in DNS we would call the authoritative ‑‑ is the way through to try and get the right answer from the people who should know.
So, this sounds pretty good. Does it work? Does anyone use it? And so the question really is: How many folk would still go and retrieve an object if the certificate was revoked and the CA doesn't publish a CRL? Because most of them don't, because there's no point. So, who uses OCSP? What a great question. Let's use an ad. Ads are amazing, they can measure all kinds of things; they retrieve the URL from the user's perspective, and in this case, rather than playing around with other aspects, we are playing around with the certificate behind the https side of the web object. So, we are going to generate a Let's Encrypt wildcard certificate, at great cost to everybody, and then immediately we are going to revoke it. So this is now a dead certificate, and whenever anyone comes to retrieve this, they are going to find that it's a dud if they use OCSP.
And if I do a manual check of this, using, again, the Texas Chainsaw Massacre of PKI, and do an OCSP query against that issuer with that serial number, which is what the browser actually does, that exact call, it will find, in amongst all of the fancy stuff there, in ASCII, it says: Dud. The certificate status is revoked. And here is the time it got revoked. But there's a bit more here that I am kind of curious about. This update, March 13th; next update, March 20th. So if I asked about a certificate that was revoked on March 14th, what's the answer? That long silence is I don't know either, I really don't know. But it kind of worries me that I'm getting one‑week periods; this is effectively an extract from the CRL that Let's Encrypt would have, were it to produce one. They don't produce one because Let's Encrypt CRLs would be massive and you would be there all week trying to download them. This is revoked, this is cool, let's see what happens now. Does anyone actually do this? I am going to capture all of the TCP packets at the server end of this exchange and I'm going to look at the SNI field, because at the moment I am not using Encrypted Client Hello so I can see it, and capture all the web logs to see who, if anyone, is actually doing this OCSP check. If folk are obeying revocation of a Let's Encrypt certificate, they are doing it because they are doing OCSP.
Now, if all of you clients out there in the big Internet performed this revocation check, then no one would be fetching the object, nobody, because it's a revoked certificate, there is no alternative, don't get it. And if no one supports OCSP, we will see a really high correlation, because if no one supports it everyone goes, don't care, I am going to fetch it anyway. So, before I actually go to the results of the ad experiment, let's have a look with a benchmark test, let's raise our expectations here; it's pretty easy to do this with a few systems and a few browsers. On macOS, and I was running 12.2.1 in March, Chrome, Firefox and Safari do the check. There is no Edge on macOS. My phone, same answers, no, don't care. Windows: Firefox cares, but Chrome and Edge, no, don't care. Debian, only Firefox. So only Apple platforms and Firefox perform the checks and for everybody else, it sucks to be you, here is the web object because we don't care.
So Apple platforms and Firefox. What's the market share of Apple platforms and Firefox out there in user land? Well, you know, the statistics say it's around a 20% market share these days. If I had a relatively good ad distribution, and Google are very, very good at that, they just send ads around the planet, good on them, we would expect to see about a 20% rate of folk who are actually doing OCSP checking. And in one of those rare moments in research when theory and practice come together and actually agree for once, thank you gods, it is: out of these five days in March, 16‑and‑a‑half million sample points, 21% said not going to go there, it's a dud, and the other 79% said life is an adventure, let's take a risk, I don't care. So theory and practice are doing fine. That's really disappointing. That's incredibly disappointing. 79% of users just don't care whether it's revoked or not. So where are these people who actually care? They are the people who have Firefox and Apples. Who has a lot per capita? Saudi Arabia. Who else? Bits of South America, Australia, China. Who doesn't? A whole bunch of Europe, Africa and Mexico, they just don't check. Interesting.
So where do we go from here, what's the score card of OCSP? Well, after I have done that TLS handshake and before I start worrying about QUIC from the previous presentation, I have got to do an OCSP check. That's another roundtrip time, and it's not going back to the server that's in the CDN, it's going back to the CA, and the CA really has a problem, because it's going to get a whole bunch of very, very time‑critical queries, so unless it's got all of its credentials out there on a good CDN you are going to be spending time doing this OCSP check. Not good. So, a roundtrip time, and it might not be very fast.
But there is something else that's going on. The server knows I am going to that site, but now I have just told the CA what I'm doing. The CA didn't know about me, it didn't know I was going to that site, it was a secret between me and the server; now it's a secret between me, the server and the CA. Why should Entrust know who my bank is? But they do. They do know who my bank is. What a leak. And as I said, the CA is now going to handle all these time‑critical OCSP queries. And what happens if I launch the query, wait for a while, and don't get the answer I want? No answer. Should I just say oh, let's go there anyway? That's not good, because if I go there anyway, then if someone DDoSes the CA, everyone is going to go to a bad certificate site. If the CA's OCSP server is unreachable because of a DDoS and I fail over and go, yeah, I am just going to go there anyway, that's not good. If I don't allow it, then that's equally not good. So if it's uncontactable, you are in a bit of a quandary about what to do. Hard fail or soft fail? Some clients support checking and some don't. Which strikes me as: this is the security of the Internet, let's just toss a coin, because there is no uniform single way of dealing with this. It's just, what machine are you using, what's the day of the week? I am going to check or I'm not going to check. That's ridiculous. And I would actually say that's a fail.
Now, this is not a unique answer. I found this blog entry from Adam Langley, the man behind a huge amount of work from Google and Chrome in the same area, who had made the same point back in 2014: online revocation checking is useless, it doesn't stop attacks, it just makes things slower, it's just theatre. As he says, and you have got to agree, you need an absurdly specific situation for it to be useful. The rest of the time you are wasting everyone's time. Okay, so OCSP is a fail.
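The quandary about an unreachable OCSP responder comes down to a single policy switch. The following Python sketch is illustrative only; the names `OcspResult` and `proceed` are hypothetical and not taken from any browser.

```python
from enum import Enum


class OcspResult(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    NO_ANSWER = "no answer"  # timeout, DDoS on the CA, blocked path, ...


def proceed(result: OcspResult, hard_fail: bool) -> bool:
    """Decide whether the client continues with the TLS session."""
    if result is OcspResult.GOOD:
        return True
    if result is OcspResult.REVOKED:
        return False
    # No answer from the CA: soft-fail connects anyway (an attacker who can
    # block OCSP defeats the check), hard-fail refuses (a DDoS on the CA
    # takes every site it certified offline).
    return not hard_fail


for hard in (False, True):
    label = "hard-fail" if hard else "soft-fail"
    print(label, {r.value: proceed(r, hard) for r in OcspResult})
```

The two policies differ only in the "no answer" row, which is exactly the case an attacker can force.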
Time for plan C, because it can only get better. And Adam came out with this in the same article, which said maybe we should do short‑lived certificates, or maybe put into the certificate itself, so it's signed and you can't fiddle with it, OCSP Must‑Staple. Certificates last for ages, a year or more; if you bring the certificate lifetime down a bit, revocation is only necessary inside a short time window, and so maybe you just use short‑lived certificates or you do the OCSP Must‑Staple.
Now, OCSP Must‑Staple is kind of interesting, because it off‑loads the work from the client to the server. The server does the OCSP check on your behalf, and because the answer is signed, it can send that answer to you and you can validate it. You don't have to ask the CA; you have to validate the OCSP response, so whether the CA gave it to you or the server gave it to you makes no difference if you can validate the answer. So the way this looks is now subtly different, it gets rid of that roundtrip time. The TLS server makes the OCSP query on your behalf, and it sends you both the domain name certificate and the response and lets you, the client, validate both, so yea, no privacy leak. Yea, no delay. So, looking interesting. Yeah? Because now I'm offloading both the time and the work penalty somewhere else and getting rid of the privacy leak. That's RFC 6066 and RFC 7633, for those of you who love reading RFCs. So I have said all that, it's really, really good, wonderful, because if Must‑Staple is part of the certificate and you don't get any OCSP data, something bad is going on, so hard fail, no equivocation, no doubt.
So, who supports OCSP stapling? Mac, Chrome, Firefox, Safari? Only Firefox. I am getting confused. But we do need to talk about Chrome. We really, really do. Because Chrome doesn't use OCSP checking these days, it uses a concept called CRLSets, and there's a URL that you can look at; it actually dates to 2012, it predates the whole we‑must‑use‑Must‑Staple thing. What Chrome does on your behalf, thank you, Chrome, is crawl across all these participating CAs, trim the CRLs to strip out what it thinks are unimportant revocations, and send the result to the Chrome browser. So the Chrome browser is already pre‑loaded with all the important revoked certificates currently on the Internet. What a clever browser. But what if the CA I used isn't part of Chrome's CRLSet? It sucks to be you. What if I am not important, and I'm not? Well, it sucks to be me.
Chrome just goes, not revoked, I don't care, let's have a session. I find this weird, you know. With the whole approach that we have been taking to security, we have put an awful lot of emphasis on security being necessary. All that work with Let's Encrypt to make certification and security not a luxury good but a commodity that everybody can use, everybody, and we are being told HTTPS is what we have to do, yet Chrome is saying revocation, that's only for the rich and exclusive; for everyone else, live with it. Why? Why is Chrome saying on the one hand, security is for everybody; on the other hand, oh, if you are in my privileged set I will do something about you, everybody else, I don't care about revocation, you are stuck with it. Chrome might have said no, but other folk have said okay, stapled OCSP makes sense, we will do stapling: yea Akamai. From what I could find when I looked in March, Azure was under construction. Fastly, yea, and whatever your favourite CDN is, you should ask them about this because it's one way around this that kind of works. Or does it?
Think about this for a second. What's the point? You see, when I evaluate a certificate, I accept it as long as I can construct a validation chain to local trust anchors, and what CRLs and OCSP are able to say is: you may be able to construct that validation chain, but the CA doesn't want you to trust that any more, circumstances have changed. So most of the time OCSP should say it's fine, and the only time it's really useful to you is when the OCSP response says it's been revoked.
Now, why is the server sending me the certificate and a stapled OCSP response if the server already knows that that certificate is a dud? Why doesn't the server say I am not going to do this, rather than give you the option? Because the server knows the OCSP information already, and it should fail the TLS handshake and go: look, I would really like to take you there but, you know, there is no certificate, and I'm not going to offer you one because the CA pulled it out. So what's the point of handing that job to me if the only answer is no? If the server already knows no, what's the point of going, you might have a different answer? I know it's bad but it's your life and your browser, you can hang yourself, I am a mere server: that's what it would say. So in some ways, you get led to the conclusion that CRLs are a failure, OCSP is a failure, and stapled OCSP is actually a failure. If the entire point of this certificate infrastructure is to tell the user whether the location they have reached is or is not the location they intended to reach, then it has to be able to inform the user that the certificate being used is not to be trusted right now. It's not the status of up to seven days ago; it's now. CRLs lag in time. OCSP lags in time. Stapled OCSP lags in time. That Ethereum hack back in 2018 lasted a week‑and‑a‑half, not ‑‑ an hour‑and‑a‑half. If I want to know if a certificate is being revoked, I'd really like to know if it's being revoked now. Not oh, it wasn't revoked a week ago, you are all good to go. That's not helping me. And so if the entire purpose of these revocation mechanisms is to reduce that trust window, then if all you can do is reduce it to a trust window of seven days, I would call that a fail. So what we really want is not Let's Encrypt saying, with certificates of 90 days we don't need to worry about revocation. You do. One week, you still need to worry.
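The Must‑Staple rule described in the talk, hard fail when the certificate demands a stapled response and none arrives, can be sketched as follows. This is a simplified illustration: `accept_certificate` is a hypothetical name, the stapled response is assumed to be already signature‑checked and fresh, and a real client without a staple might fall back to a direct OCSP query rather than simply proceeding.

```python
from typing import Optional


def accept_certificate(must_staple: bool, stapled: Optional[str]) -> bool:
    """Client-side decision on a certificate plus optional stapled status.

    `stapled` is None when the server stapled nothing, otherwise the status
    from the (already validated) OCSP response: "good" or "revoked".
    """
    if stapled is None:
        # Missing staple: fatal only if the certificate carries Must-Staple.
        return not must_staple
    return stapled == "good"


print(accept_certificate(must_staple=True, stapled=None))      # hard fail
print(accept_certificate(must_staple=False, stapled=None))     # proceeds
print(accept_certificate(must_staple=True, stapled="good"))    # proceeds
print(accept_certificate(must_staple=True, stapled="revoked"))  # refused
```

Because the stapled answer is signed by the CA, the client does not need to trust the server that relays it, only to validate it.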
You need certificates with a trust window of a few hours, because then revocation wouldn't matter: within an hour or two, it's a new certificate, and the old one can't be trusted any more because it has run out. So, revocation is a failure, it's not delivering what we need in today's Internet; attacks are too fast, trust is lasting too long. What does that mean? All certificates are a failure? We have moved beyond what we want from certificates. The trust windows are broken for our world. Long‑lived certificates and non‑functional revocation is a combination from hell. We do it because anything else seems like a lot of work and we don't want to change the world, but we do it because we actually don't care about the user, we really don't care.
Certificates are actually incapable of informing a client that mischief is happening right now. They can tell you that mischief happened a week ago, so what? And if what a client really wants to know is, I am about to enter my credentials to my bank, is this safe? The answer from the certificate system is I don't know, it sucks to be you, but go try, see what happens. I can't help you.
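The trust‑window argument is simple arithmetic, sketched below in Python. It rests on two stated assumptions: a client that checks revocation learns of it no later than the next CRL or OCSP refresh, and a client that does not check keeps trusting the certificate until it expires. The function name and the scenarios are illustrative, not measurements.

```python
from datetime import timedelta
from typing import Optional


def worst_case_exposure(remaining_validity: timedelta,
                        status_refresh: Optional[timedelta]) -> timedelta:
    """Longest time a compromised certificate can still be accepted."""
    if status_refresh is None:
        # No revocation checking: trusted until the certificate runs out.
        return remaining_validity
    # With checking, the stale status is replaced at the next refresh,
    # unless the certificate expires first.
    return min(remaining_validity, status_refresh)


scenarios = {
    "1-year certificate, weekly CRL/OCSP": worst_case_exposure(
        timedelta(days=365), timedelta(days=7)),
    "90-day certificate, no checking": worst_case_exposure(
        timedelta(days=90), None),
    "2-hour certificate, no checking": worst_case_exposure(
        timedelta(hours=2), None),
}
for name, window in scenarios.items():
    print(f"{name}: {window}")
```

Under these assumptions only the last scenario gets the window below the hour‑and‑a‑half scale of the attack mentioned in the talk.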
So, we can't fix X509, doing certificates of one hour validity periods is a nightmare, it's not built for that. So why are we using this crap from the 1970s? Why is this actually relevant to our life at all? Why are we bothering?
Now, there's an age‑old saying out there that no matter what the question is, the DNS is always the answer. It's the DNS, it is, indeed, always the DNS. And the issue really isn't TLS, the issue really isn't digital certificates; the issue is that the certificate infrastructure comes from a previous century and it's just not working. The DNS has exactly those issues, but instead of designing stuff with a next‑week timeline, they actually design stuff with seconds and minutes. TTLs are astonishingly effective: you can set the maximum retention times back to a remarkably small time window and refresh against the authoritative servers, so the DNS outperforms the entire certificate infrastructure by orders of magnitude. Can I put keys in the DNS? Well, yes, there is DANE and stapled DANE. The whole idea of this is to associate a key pair with a service name, put the public key into the DNS alongside all the other attributes, use TTLs, and if you are getting some kind of suspected compromise, change it, and you have got a TTL time and it will flush. Sign things with DNSSEC, that's what it's there for, and staple that validation chain into the TLS handshake as a stapled chain response, using basically that entire bundle in a TCP session so you haven't got the DNS problems of UDP. That combination of DANE, chained responses and stapling gets us out of this mess.
My time is up. Where have we got to? That whole X.509 certificate system, well, it's broken, it's so broken it's unfixable. Pulling certificate lifetimes down to a small number of hours, that's the doomsday scenario and it's not going to work. The DNS manages to achieve that. So what if we ditch all this X.509 brokenness and turn this over to DANE and DNS and DNSSEC, because at least at that point we can hand users what we wanted to hand them: is it safe to go there? Yes or no, now. Not last week, now. And my time is up; if there are any questions, I will happily answer them. Thank you very much. If I can't answer them, I will take a wild guess.
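Putting a key into the DNS with DANE can be illustrated with a short Python sketch that builds a TLSA record. It uses the common "3 1 1" parameters (DANE‑EE, SubjectPublicKeyInfo selector, SHA‑256 matching), which are one choice among several and are an assumption here, not something stated in the talk. The host name, TTL and key bytes are placeholders; a real deployment would hash the DER‑encoded public key and sign the zone with DNSSEC.

```python
import hashlib


def tlsa_rdata(spki_der: bytes) -> str:
    """TLSA RDATA '3 1 1 <hex>': DANE-EE, SubjectPublicKeyInfo, SHA-256."""
    return "3 1 1 " + hashlib.sha256(spki_der).hexdigest()


def tlsa_record(host: str, port: int, spki_der: bytes, ttl: int) -> str:
    """Zone-file line binding a public key to a TCP service name."""
    return f"_{port}._tcp.{host}. {ttl} IN TLSA {tlsa_rdata(spki_der)}"


def key_matches(rdata: str, presented_spki_der: bytes) -> bool:
    """Client side: does the key offered in TLS match the published record?"""
    return rdata == tlsa_rdata(presented_spki_der)


# Placeholder bytes stand in for real DER-encoded SubjectPublicKeyInfo data.
old_key = b"placeholder-spki-old"
new_key = b"placeholder-spki-new"

# After a suspected compromise, publish the new key; caches drop the old
# record once the TTL (300 seconds here, an arbitrary choice) has passed.
record = tlsa_record("www.example.com", 443, new_key, ttl=300)
print(record)
print(key_matches(tlsa_rdata(new_key), old_key))
```

The "revocation" step in this model is just replacing the record: the old key stops matching as soon as resolvers refresh, which is bounded by the TTL rather than by a weekly publication cycle.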
(Applause)
LEE HOWARD: Lee Howard from IPv4 ‑‑ Geoff, this was as much fun as always. I had occasion to do a bit of not quite the same research but adjacent research a few years ago and I found, I thought you might be interested, that something like 1% of CRL and OCSP servers have a AAAA record. The occasion was trying to run a v6‑only infrastructure: if I only need to reach certain things, fine, but trying to reach a website on an IPv6‑only system with no translation, I can't even check the server at all.
GEOFF HUSTON: I think it underlines the issue, Lee, that no one in the certificate world cares in the slightest about revocation. Really don't.
LEE HOWARD: Exactly, same thing
GEOFF HUSTON: And users assume they do, poor deluded user. Every single time that assumption is glaringly wrong, it's not right.
JIM REID: Great talk as always. I have got a couple of concerns about this. I think you are right, all this X.509 nonsense is not really working at all, but I'm not sure if DANE is the answer. I mean, DANE hasn't really set the heather on fire since it was first standardised all those years ago; there is very little usage of it. A second concern about that: if we are going to start putting some identity token in the DNS to validate TLS sessions, how are you going to get the likes of Commonwealth Bank to sign their zones? Do they have the expertise? And we have that problem for all the other people providing content on the Internet: how are they geared up to deploy DNSSEC?
GEOFF HUSTON: So, there are a number of parts to unpack in that, Jim, and one is, the fact that even the most cretinous folk can get a TLS certificate disturbs me a lot, and it should disturb you. If they are that clueless about keys they shouldn't have a TLS certificate either, because these are not exactly toys here. People rely on that data. They rely on the integrity of the data. And if you are not doing your side of the bargain in getting that certificate, that trust that users are putting in is completely misplaced. So the next kind of issue, why do folk think DANE is wrong? It comes down to two issues. DNSSEC over UDP is a nightmare. It is. Large responses and TCP failover and so on are causing us all to go nuts. Yea DNS over TLS, yea DNS over HTTPS: if we could do this part of the work, we can actually do one thing that makes DNSSEC sing and dance. There's an RFC, 799 ‑‑ 7901, that takes all of the validation queries, which take a geological era in DNS over UDP, and packages them up with the answer to be validated. No more queries, here are all the DNSSEC answers, knock yourself out and perform the validation chain yourself. So if you use DNSSEC and chained responses, and you use DNS over TLS, then all of a sudden all those DNSSEC doesn't work, DANE doesn't work kind of objections get punted out the door. Yes, we are not doing this today. Why not? Because we are relying on the idea that X.509 works, and what a stupid answer that is, because it doesn't work.
JIM REID: Thanks, Geoff.
PETER HESSLER: For time‑keeping, we are running close to the end of the session so I'm cutting the queue now. You can still ask yourself. I will go to one of the text questions.
WOLFGANG TREMMEL: There are two written questions in the queue. One very long one which I do not want to read out in full and basically the question goes: Online VTU dot arrow seems to be used from global sign and before March 29 so it just depends on CA policy who gets the OV certificate. So is it still broken, OCSP does only deal with certs not direct requesting organisation. So again, no proof of organisation.
GEOFF HUSTON: There's a lot to unpack in that question about organisations and certs and I think it's probably best handled off‑line here in a much, much bigger answer but I will say one thing: Certificate transparency gives you same month responseness, it's busted.
WOLFGANG TREMMEL: The second question from ‑‑
GEOFF HUSTON: If someone wants to build it I am sure I can persuade APNIC to fund somebody's costs; this needs to happen, it seriously needs to happen.
JEN LINKOVA: Hi, Geoff. Yesterday we heard in this room that this Internet thing is kind of working and good ideas stay but bad ideas die, thanks for proving once again that that doesn't seem to be the case.
(Applause)
PETER HESSLER: Thank you very much, Geoff. And with that, we are ending this part of the plenary session. We will be taking a break shortly, in about one minute the results of the GM will be announced. I also just want to mention that there is a weather alert for Berlin, there should be some thunderstorms later this afternoon so please check the weather and adjust your travel if need be and we will see you after the break.