Database Working Group

Thursday, 19 May 2022

At 2 p.m.

WILLIAM SYLVESTER: Welcome, everyone. We have an exciting agenda today for the Database Working Group. A couple of quick items: if you are using Meetecho, we are going to be managing the queue through that, so please try to use that the best that you can. We are all getting used to it, so bear with us as we work through it; this is new for us as well. In regards to our agenda, I want to thank the secretariat and the NCC for their help, our scribe as well as the other support that we have from the RIPE NCC. A couple of things. We are looking for Chairs: we currently have one Chair position open, so if you are interested in being a Chair or learning more about that, reach out on the mailing list or to the Chairs directly. As we are working on succession planning, we changed our rules several years ago so that we had three Chairs that were offset annually, so we never had more than one Chair coming up for renewal. With the future coming, we would like to have some more folks involved. On our agenda today we have got our usual operational updates, which Ed is going to help out with. Denis has some information about our current NWIs; what's exciting about that is a lot of the input we have had from the database task force and how that's generated some new things for us to consider. Up first is Maria from the RIPE NCC, who is going to talk about our legal review of a couple of our NWIs.

MARIA STAFYLA: Hello, everyone. I am the senior legal counsel at the RIPE NCC. And I am here to give you an update on two NWIs, NWI‑13 and NWI‑2.

So, before we go into the specific details of each topic, I would like to highlight some underlying principles that we took into consideration when performing the legal analysis of these topics. And they are actually common for both items. So this is why we are ‑‑ I will describe ‑‑ I will take you through them at the beginning of the presentation.

So the RIPE database is meant to contain specific information for the purposes that are currently defined in the RIPE database terms and conditions. If there is a request for new sets of data to be inserted, what we check is whether this data is in line with the purposes, the current purposes of the database. Now, if the purposes have changed, then the community should discuss this, should get consensus via the community processes and document this.

Especially when, in the new types of data to be inserted, there is also personal data involved, then additional legal checks need to be performed and we need to check whether the processing of this data is in line with the purposes of the RIPE database. This is in order to understand whether this can be in conformity with GDPR. Again, if the purposes have changed, then the community should discuss this, should get consensus on this and get it documented before the processing takes place.

Why is this legal review necessary? It is necessary in order to ensure that only data, including personal data, that serves the purposes of the database gets inserted, and that we do not have something that is not relevant. This is also in line with the data minimisation principle that the RIPE database task force suggested in their report that was published at the previous meeting. In terms of personal data especially, this is also necessary in order to ensure that unnecessary personal data is not processed in the database. And in terms of legal consequences, this is necessary in order to limit the GDPR and liability exposure of the RIPE NCC with regards to personal data that is inserted in the database. And just to flag that the RIPE NCC has shared responsibilities with the party that is inserting the information, and this is because we are the party who makes the database available for the community to use for the specific purposes.

Now let's talk first about NWI 13, the geofeed attribute. Some background information about this NWI.

The request was to create a geofeed attribute in order to correlate geolocation information with the use of IP addresses, and the information would be inserted by putting a URL in the attribute that would point to a CSV file that would contain the geolocation information. This was already done using the remarks attribute. And the question that was raised to us, to legal, was whether the geofeed information could lead ‑‑ could be considered as personal data under GDPR.
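As a rough illustration of the CSV file that a geofeed URL points to, here is a minimal sketch of reading such a file. It assumes the column order used by the self-published geofeed format (prefix, country code, region, city, postal code); the sample data and the function name `parse_geofeed` are hypothetical and not taken from the presentation.

```python
import csv
import ipaddress

# Hypothetical sample file; documentation prefixes only.
SAMPLE = """\
# prefix,country,region,city,postal
192.0.2.0/24,NL,NL-NH,Amsterdam,
2001:db8::/32,DE,DE-BE,Berlin,
"""

def parse_geofeed(text):
    """Return a list of (network, country, region, city) tuples."""
    # Skip blank lines and comment lines before handing rows to the CSV reader.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    entries = []
    for row in csv.reader(lines):
        # Pad short rows so missing trailing fields become empty strings.
        row = [field.strip() for field in row] + [""] * 5
        network = ipaddress.ip_network(row[0])
        entries.append((network, row[1], row[2], row[3]))
    return entries

entries = parse_geofeed(SAMPLE)
for network, country, region, city in entries:
    print(network, country, region, city)
```

The point for the legal question is visible here: the database holds only the URL, while the location data lives in the external file.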

So, in our legal analysis that was also shared in the Database Working Group mailing list, we communicated our outcome and we basically said that depending ‑‑ if a registration is reasonably assumed to be used by an individual, then, yes, the geofeed information could lead to the identification ‑‑ could lead to the identification of the user and therefore it would be considered as personal data.

Looking at the current purposes of the RIPE database, they do not justify processing of personal data for geolocation reasons. Therefore, we suggested restrictions to be implemented so that unnecessary personal data will not be processed.

Based on that, our initial advice was to implement the restrictions based on the prefix size; however, there were concerns raised on the mailing list and there were indeed drawbacks with this approach. That's why our database team came up with a new technical implementation that addresses the legal concerns, so now the proposal is to restrict the geofeed based on the status of the resource. I can give you more information about this later on.
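The two approaches mentioned above can be contrasted with a small sketch. Everything specific in it is an assumption for illustration: the /24 and /48 thresholds are simply the sizes mentioned later in this session, and the set of restricted statuses is hypothetical, since the presentation does not list which statuses would be restricted.

```python
import ipaddress

# Assumed thresholds: the most specific prefix length on which a geofeed
# reference would still be accepted, per address family.
MAX_PREFIX_LEN = {4: 24, 6: 48}

# Hypothetical set of statuses treated as possibly relating to an individual.
RESTRICTED_STATUSES = {"ASSIGNED PA"}

def allowed_by_prefix_size(prefix):
    """Prefix-size rule: reject anything more specific than the threshold."""
    net = ipaddress.ip_network(prefix)
    return net.prefixlen <= MAX_PREFIX_LEN[net.version]

def allowed_by_status(status, restricted=RESTRICTED_STATUSES):
    """Status rule: reject objects whose status is in the restricted set."""
    return status.upper() not in restricted

print(allowed_by_prefix_size("192.0.2.0/25"))
print(allowed_by_status("ALLOCATED PA"))
```

The status rule does not depend on how large the block is, which is the drawback of the prefix-size rule that was raised on the mailing list.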

Taking a step back, we also looked into what other geolocation information is already in the database. So we have the geoloc attribute, the country code in the organisation object and the country code in the resource objects. Some of these attributes are maintained by the users themselves, and the country code in the organisation object is maintained by the RIPE NCC and is meant to indicate the legal address of the resource holder.

We looked into the purposes, the current purposes of the RIPE database, and we saw that the current purposes could only justify geolocation information in general to be inserted, namely for scientific research and network operations and topology. We also looked into the report from the database task force, where they recognised there is an active user group for geolocation information; however, they did not establish that geolocation is one of the purposes of the ‑‑ main purposes the RIPE database must fulfil.

So the question is now whether this has changed, because the purposes of the database are not static, they are dynamic and can evolve over time. So if this has changed and if the community believes that geolocation is one of the main purposes the database should fulfil, we will reevaluate the situation and, accordingly, we will reevaluate whether the implemented restrictions are still required to be in place.

This is about the geofeed attribute. I will go to the next topic and perhaps I can take questions at the end.

NWI‑2 is about displaying history of objects where available, and in like, in short, to even show the history of deleted objects.

The recommended change is to drop the restriction of the most recent deletion point, and to allow access to the history of deleted objects. When we started going through it, we realised that the scope of this requested change is not clearly defined, in the sense of which objects we are talking about: are we talking about everything that has been deleted or something specific? But from the discussions on the mailing list it seems that the main argument ‑‑ one of the main arguments that is driving this change is to be able to see who was the former resource holder at a specific time. Therefore, we believe that the organisation and resource objects are the most relevant to this request.

Now, we started conducting our legal analysis and we took as the basis that resource holders may be natural or legal persons. The contact details that may currently be in the database can refer to resource holders who are natural persons, and therefore we have privacy concerns; we also have contact details of the appointed contact persons of resource holders who are legal persons, and then again we have our concerns. And there are also attributes that are free text, where anyone may insert whatever information they like; for example, description, remarks, address, e‑mail, where, although they are not meant to contain personal data, someone might use them for that as well.

Looking at the current purposes of the database, and again, in line with the logic that we need a purpose in order for this data to be processed, we needed a clear purpose that would justify the display of the full history, and currently none of the defined purposes of the database justifies such processing. Therefore, if this is to be allowed, we would need to apply filtering rules that would filter out any attribute that could contain personal data.
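A filtering rule of the kind described above could look roughly like the following sketch. It is not the RIPE NCC's implementation: the filtered set only contains the free-text attributes named in the presentation (description, remarks, address, e-mail), the sample object is made up, and continuation lines are ignored for brevity.

```python
# Illustrative set of attributes to drop from historical output.
FILTERED_ATTRIBUTES = {"descr", "remarks", "address", "e-mail"}

def parse_object(text):
    """Split 'key: value' lines into (key, value) pairs, skipping comments."""
    attrs = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("%"):
            continue
        key, _, value = line.partition(":")
        attrs.append((key.strip().lower(), value.strip()))
    return attrs

def filter_history(attrs, filtered=FILTERED_ATTRIBUTES):
    """Keep only attributes that are not in the filtered set."""
    return [(key, value) for key, value in attrs if key not in filtered]

# Hypothetical historical version of an organisation object.
SAMPLE = """\
organisation: ORG-EX1-RIPE
org-name: Example Org
address: 1 Example Street
e-mail: contact@example.net
remarks: free text
"""

filtered = filter_history(parse_object(SAMPLE))
for key, value in filtered:
    print(f"{key}: {value}")
```

With this sample, only the `organisation` and `org-name` lines survive, which is enough to answer who held a resource without exposing contact details.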

Again, we wanted to see the bigger picture of how this request would affect the rest of the database and what data is inserted in there. We looked into ‑‑ we looked into this matter together with the ‑‑ what other historical information we provide there, and currently historical queries on specific objects are allowed. We do apply filtering rules in order to eliminate exposure of unnecessary personal data. And we also looked into what recommendations the task force made with regards to this matter. So the task force recognised that there is a need for historical information to be there; however, they made certain recommendations. So they recommended that access to historical data should be strictly ‑‑ should be limited to what is necessary for the purposes. They recommended that the community considers the criteria under which wider access to a specific set of data could be granted to researchers for research purposes, and also, they recommended that the community considers how to easily demonstrate changes of holdership of Internet number resources, especially in cases when resources get split or merged.

Now, again, in line with the same logic that we followed for the NWI 13, and following also the discussions that have been happening in the Database Working Group mailing list the last few days, an important question to be answered is whether historical information is still a requirement to be provided in the database, and if yes, to clearly define what is in scope and what does this include. Do we need all information, what objects is it crucial for the community to be ‑‑ what historical objects is crucial for the community to see there and also, are all the attributes of these objects necessary to be returned in the historical information?

And that's from my side. Any questions? Or comments?

WILLIAM SYLVESTER: We are going to hold off questions until we go through the next presentation, and as we go through, so Denis is going to talk about NWIs and then we will take questions for Maria's presentation then as well.

DENIS WALKER: Hi, I'm one of the co‑chairs of the Database Working Group. I have got my presentation up. We have ‑‑ NWIs are one of these things that sometimes just drag on and on, and it's clear that some of them nobody is interested in, so the Chairs recently recommended we close a couple of them. One was NWI‑1, which was about tracking changes to abuse‑c when you have got many abuse‑c attributes spread around your network, but basically, nobody was interested, so we have closed that one. NWI‑8 had two phases defined. The first phase was about the default maintainer; that was implemented some time ago and many people are using it. But then there was the second phase, where users could define their own authentication groups. Again, it's just sat there for a few years, nobody at all seemed interested in it, so we have ditched it.

If somebody suddenly finds a need for it, it can always be opened again but, as for now, it's closed.

Now, historical queries. This NWI was opened in May 2016, so it's been around for a while. As Maria said, we can only at the moment query the data that goes back to the most recent occasion when an object was deleted. That was a completely arbitrary decision, I wrote the spec for this when I was in NCC many years ago, we had no requirements, we didn't know whether anybody wanted historical data so we went for the low hanging fruit, this was the easiest option to get something up and running and tested. Over the years since then, several people have asked for this arbitrary limit to be dropped, and up until that point, nobody objected to doing that so the Chairs recommended that we drop this restriction. Then we have had some objections recently based on privacy concerns, which obviously are an important issue. We have also had a comment about whether public interest plays ‑‑ the RIPE database is a public register or it's the public face of a register, so as Maria was saying about the purposes, what is the reason that we actually hold data in the database? What is the reason for possibly showing any of the data that's no longer currently visible in the database? And does public interest override privacy or does privacy override public interest? These are all issues that now need to be considered. And they are by no means easy questions to answer.

As Maria said, it's currently being reviewed by the legal team.

Then there is NWI‑4, the inetnum status. Again, this has been sitting there since 2016; we get a couple of comments and silence for a while and more comments. We really want to get some of these things wrapped up if we can. This is about how we assign an entire allocation. This workaround has been around for years where you split it in two and you make two assignments. It works but it's not strictly correct. This was raised again in the address policy Working Group this week, but it was considered that this is not a policy issue; this is a technical issue in the database: how do you physically represent this duality of allocation and assignment in the database? There have been several solutions proposed recently, some technically not possible; strictly speaking it's software and, as Shane always used to say, it's software, anything is possible. But on the cost benefit graph, some of these proposed solutions are so far off the scale that, yes, we could do it, but is it worth the cost?
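The workaround described above, splitting an allocation in two and registering each half as an assignment, can be sketched with Python's standard `ipaddress` module. The function name and the documentation prefix are illustrative only.

```python
import ipaddress

def split_for_assignment(allocation):
    """Return the two halves of an allocation, one prefix length longer."""
    net = ipaddress.ip_network(allocation)
    return list(net.subnets(prefixlen_diff=1))

halves = split_for_assignment("192.0.2.0/24")
for half in halves:
    print(half)
# prints 192.0.2.0/25 and then 192.0.2.128/25
```

The two /25 objects together cover the same addresses as the /24, but the database then no longer says directly that the whole allocation is one assignment, which is the gap NWI‑4 is about.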

So, the Chairs recommended the simplest solution that has been put on the table, one that came from the RIPE NCC: the suggestion is that we have a new status value. I put the name to it, ALLOCATED-ASSIGNED PA, but the name can be anything you like. This is not a perfect solution, it doesn't solve all the issues about having an allocation and assignment, but at least it does allow you to represent in the RIPE database the fact that this allocation is also an assignment. So, it solves part of the problem. The question is, do you want this? Is it an important enough issue to be solved? Especially now with all the /24 allocations, do you really want /25 assignments or do you want to just simply represent in the database that this allocation has been assigned?

Over the years, we have changed the structure of status attributes and the values many times and nothing seriously broke, it doesn't mean to say nothing won't break in the future, but the RIPE NCC's prepared an impact analysis so basically, the option is on the table, do you want it?

NWI‑10 is the definition of a country. The implementation is still ongoing. The RIPE NCC expect this to be completed by RIPE 85. I don't think there's anything else we need to discuss or any decisions need to be made, unless you think differently.

NWI‑12, NRTM v4, Sasha did a lot of work on this, the solution was proposed and agreed, it's now an ongoing IETF process to develop into the new standard. In the meantime, the RIPE NCC will start implementing the IETF draft and hope to have something maybe put up and running by RIPE 85. So again, it's in progress.

NWI‑13 geofeed, the implementation is completed. That was the easy bit. Now, we have these ongoing discussions since we implemented it, especially the legal issues, and particularly, as Maria said, the functionality of geofeed or geolocation is not covered by the defined purposes of the RIPE database. Now, some of you may remember at the last RIPE meeting, I did two presentations on the purposes. I was heavily criticised for daring to make those presentations so soon after the task force had completed their work. Somebody even said in the comments that it was inappropriate for me to make those presentations. But this issue about the purposes is not going to go away unless we actually address it. Because we know what the purposes of the RIPE database always were, but what are the purposes today? Is geofeed or geolocation actually recognised as a purpose of the database today? But defining the purposes not only affects geolocation but also the use of the IRT object, historical data, whether we should have assignments in the database or not, personal data, LEA access to the database. These are all issues that cannot be fully resolved unless we have this discussion. The community has to think about the RIPE database in 2022: what is it for? What does it mean? What does it do? We all know who puts data into this database but who takes it out? Should the consumers of this data in any way be recognised as part of the purpose of having this database and having it as a public database? So, we have to have this discussion. You can criticise me all you like for going on and on and on about this but until we have this discussion, until we identify what the purposes are in 2022, until we have a consensus from this community on that, many of these issues simply cannot be finished. So, again, it's up to you guys: Do you want to talk about it or not?

MNT‑ref is one of these barely used attributes that most people probably don't even think about, but the RIPE database has got a lot of information in it that, over the years, accidentally or not, has become misrepresentative of who is responsible. If you don't want to bother handling abuse queries, you just link to somebody else's role object and you have all the abuse reports for your network sent to somebody else. You can reference almost anything you like in this database and there's no mechanism to stop it, except this MNT‑ref, which I think only exists for the organisation object. So you can stop somebody pretending to be you in terms of the organisation, but you can't stop them referencing your role objects, person objects, maintainers, just about anything. So the idea was to extend the optional use of this reference and put it into the person, role and maintainer objects. You don't have to use it if you don't want to, it's optional, but at least if you are concerned about somebody misrepresenting themselves and pretending to be you, you can stop them.

We have these recommendations from the database task force, and we have now added another four NWIs to the list for the issues that are considered to be on the radar of the Database Working Group. These will be the baseline requirements for registration information, use of the RIPE database as an IPAM solution or not, another aspect of historical data which overlaps with but isn't the same as NWI‑2, and operational contact information.

So these issues will be discussed on the mailing list between now and the next RIPE meeting.

Now, technical changes. The Chairs have noticed, and I am sure many of you have noticed, that it's actually very difficult, these days, to get approval or consensus on a technical change. Everybody is busy. We all have busy lives, you have jobs to do, businesses to run, family commitments, holidays, and time to look at database issues. You know which one usually ends up at the bottom of the list. People don't have time to read e‑mails, especially the ones I write, which tend to be quite long. They don't have time to read articles, you don't have time to join a lot of the discussions. You are busy. And so some of these issues remain open for year after year after year. And then when the Chairs finally think that after five years we have had four comments that were all positive and nobody objected, you know, we say, okay, let's go ahead and do it. And then we get one or two objections, which sometimes are perfectly valid objections, but it makes it almost impossible to solve the problems.

Now, I am going to make a little suggestion. It might be ridiculous, crazy, stupid, completely unworkable suggestion, but I'm just going to throw it out there and see what you think and it might at least prompt somebody to think there is a better way of doing this; if so, please help us.

What if we set up some kind of technical committee, like a sort of permanent task force, but with transient members that can come and go within this group? People who may have time to consider an issue: maybe last week you were busy, maybe next week you know you are going to be busy, but this week you are on holiday, you are lying on the beach with chilled wine in one hand and your mobile phone in the other. This is the perfect moment to think about the RIPE database. What else are you going to do on the beach? Maybe that's the week you can join the conversation and help us out a little bit. This doesn't have to be stuck within the rigid boundaries of the Working Groups, the Uncle Tom Cobley's Working Group or whatever, because a lot of these issues really don't fit exactly uniquely within the bounds of how we have defined Working Groups. So, eventually, you can perhaps make a recommendation. It is probably a completely unworkable idea but it's an idea which I hope will make somebody think of a better way of doing it.

Lastly, NWI versus policy. The numbered work items have been up and running for quite a time. The intention was to have a means of managing and documenting technical changes. Sometimes what starts off as a very simple technical change evolves into something much more complicated and there are legal issues, there are management issues, there are all sorts of issues. So we reach a point where perhaps a policy is needed rather than just thinking of it as a simple technical change. And some examples would be the policy which I am going to talk about later on, on personal data, and this issue of historical data. It was a simple change, we just drop this completely arbitrary technical constraint, but, no, it's not simple by any means. There are so many privacy concerns, legal concerns, maybe we are reaching the point where we should say we need a policy on what exactly historical data is. What is it for? What can it be used for, what should it be used for, what can we allow it to be used for? So maybe some of these issues are now moving out of the NWI arena and into the possibility of having a policy.

So, any questions on this and Maria for the legal issues as well?

SPEAKER: So I will try to keep this short because there is so much here and other people in the queue. I want to apologise, I am one of the people who had last‑minute objections.

DENIS WALKER: No problem.

SPEAKER: Too many e‑mails for me on the mailing list. One of the main things I wanted to comment on: in NWI‑2 you had a question of whether public interest should ‑‑ if it kind of trumps privacy, and I think ‑‑

DENIS WALKER: Yes, or the other way around.

SPEAKER: Yes, and I think it's kind of a special case in that I think public interest generally weighs a lot in some cases in the RIPE database. However, I think that's much less true for the historical data. I think in historical data public interest has to go way below privacy, and privacy weighs a lot higher when it comes to historical data. And personally, I think that for attributes that are free text, like description, contact details, there is almost never any legitimate good reason to keep that in historical data, because it can contain personal data, and why would you need historical contact information? It's kind of pointless in the vast majority of cases. And I don't think that's at all ‑‑ then also about geofeed. I don't think it's necessarily that geolocation has a purpose, I think the purpose there is to facilitate ‑‑ or to facilitate the ‑‑ basically people wanting to link data to a certain block ‑‑ block of IP addresses in an authoritative way, like it's not necessarily just geolocation, if this was something else ‑‑

DENIS WALKER: There is a way to generalise this, which I said on the mailing list quite some time ago, we could define a purpose as being external services that require access to information in the RIPE database.

SPEAKER: Or basically an authoritative link, because all this is is an URL, it's not even the geoloc attribute where we are putting coordinates

DENIS WALKER: The IRT is not covered by any of the defined purposes of the database, but in a way that is an external purpose that uses that data in the database.

SPEAKER: Yes but what my point was with geofeed is that it's just a link to some external data source because it needs to be authoritative in a standardised format.

MARIA STAFYLA: If I may reply to this. I understand that indeed it is a link and the information is not even inserted, the geolocation information itself is not inserted in the database. However, from a legal point of view, there has to be clear responsibility over who is responsible for certain data, and yes, maybe the information itself is not included in there; however, the link that points to this information is inserted in the database, and this is why we did the whole analysis over this in terms of GDPR.

SPEAKER: As many other people said, the issue with that argument is that, well, you could include information for a /128 in the geofeed file of an INET(6)NUM for a /29, you can just ‑‑ a geofeed file entry for a /128, that's going to be the same issue, and, like, the only way to prevent that is to block any kind of geofeed attribute in the RIPE database.

MARIA STAFYLA: I would like to understand a little bit the example and I can ask Ed to explain a little bit this outside the ‑‑

SPEAKER: I am happy to talk to you or someone else after this session.

MARIA STAFYLA: Sure. And perhaps, Ed, you can join that ‑‑

PETER HESSLER: I have quite similar concerns about the geolocation restrictions. I was going to give exactly the /128 example as well. Also it's important to consider that many people ‑‑ they are not going to publish a street address, they are not going to publish exactly like 8 significant digits of the latitude or longitude, just a city or country. And in my view one of the purposes of the database is to provide authoritative information about Internet numbers, and the location of these ‑‑ of this information is unfortunately a thing that we need to be authoritative about, because you have various geolocation companies who make up whatever they feel like. I have a PI allocation where I am apparently located in seven different countries, and none of them is the correct location, and I would love to say, actually, this is in this country, versus this street address, blah‑blah‑blah. And this ‑‑ I feel this is an arbitrary restriction based on just the size of the allocation. I can create a geofeed for, I believe, a /24, or is it only /48 that's restricted?

EDWARD SHRYANE: That's one of the legitimate concerns and one of the drawbacks of the current implementation, prefix size is really hard to get right and to say where we say prefix size could possibly be personally identifiable.

PETER HESSLER: The status suggestion was not described or listed on either set of slides. Can you quickly say what that ‑‑

EDWARD SHRYANE: That was covered on the Working Group mailing list a couple of months ago. I will summarise it again, I will restate what the intention is and what we plan to do. We haven't implemented it yet because I wanted to get the discussion to come to a conclusion and take any feedback into account, but it's something we need to document properly and have a discussion about. But I will say that the draft, the draft RFC does say that the prefixes that are in the CSV file are meant to relate to the prefix it's referred from and the client is meant to exclude anything outside that prefix.
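The client-side behaviour described here, keeping only geofeed entries that fall within the prefix whose object references the file, can be sketched as follows. This is an illustration using Python's standard `ipaddress` module, not code from any real geofeed client; the function name and the example prefixes are assumptions.

```python
import ipaddress

def entries_in_scope(referring_prefix, geofeed_prefixes):
    """Keep only prefixes equal to or more specific than the referring prefix."""
    parent = ipaddress.ip_network(referring_prefix)
    kept = []
    for prefix in geofeed_prefixes:
        net = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError across address families, so compare
        # the IP version first.
        if net.version == parent.version and net.subnet_of(parent):
            kept.append(net)
    return kept

in_scope = entries_in_scope(
    "2001:db8::/29",
    ["2001:db8::/48", "2001:db8::1/128", "2001:db0::/32", "192.0.2.0/24"],
)
for net in in_scope:
    print(net)
```

Note that more specific entries, down to a single /128, stay in scope under this rule, which is the case raised in the discussion.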

PETER HESSLER: Correct, but include smaller ‑‑

EDWARD SHRYANE: More specific you mean.

PETER HESSLER: Yes. If you have a /29 you can enumerate all the way down /28s and forward the portals that have a /28 allocation. They cannot ‑‑ they cannot correct this false information that is out there.

EDWARD SHRYANE: We have this legal concern you have a more specific prefix that could be personally identifiable, linked to a geofeed, geolocation could also be personally identifiable.

PETER HESSLER: You don't check that for the large allocations, so this arbitrary restriction is, in my view, ridiculous and made up.

EDWARD SHRYANE: We have to go with the legal concern, it's something we can discuss more on the mailing list.

MARIA STAFYLA: I think this is why we mentioned that if geolocation is one of the purposes the database must fulfil, and this is documented and there is a consensus about that, then we will reevaluate the situation from the legal side, and maybe then the restrictions will not even be necessary, because this will be in line with the purposes the database must fulfil.

DENIS WALKER: Just a subtle point on what you meant. You mentioned the purpose being to provide authoritative information on numbers, but do we have a definition of authoritative information or is it what is generally considered from a historical perspective to be that information?

PETER HESSLER: That is a good point. I don't think that this phrasing has been used with RIPE community before. This is just my perspective on what it should be and whether should is current versus future is an open question.

DENIS WALKER: Yes, but those terms do need to be defined as well.

WILLIAM SYLVESTER: The queue is Erik and Nick. We will go to you.

ERIK BAIS: I have a couple of points that I want to address and I will do the one about NWI‑4 last, because I have ‑‑ and we already had a small ‑‑ the topic about the historic data, yes, it's very relevant, at least for us in the work we do as a broker because to see what are the changes in the past and we actually look at that kind of data. So for us, that's valid info if you want to see how we use that, we have that also on public page where you can actually query that whole stuff.

Including the removal of the personal data, so we don't show that, but if you want to see how many iterations this AS number has had or this prefix, it's all possible in the tool.

Then there is the topic about process, can we improve on the NWI process? Yes, please. I'm co‑chair for address policy so I have a preference on a certain way of doing that, and it may not be the perfect way for the Database Working Group, but there are parts in that process that we call PDP which can be used to smooth things out and actually have fixed days, this is the timeline for an NWI process. And whether you use ‑‑ if you call it NWI or PDP or something, whatever, something with more strict structure in there, I think would benefit all of us, because it includes something like an impact analysis and that's where I come now to NWI‑4. I would like to see an impact analysis, a technical impact analysis, if possible, so that we can actually see how can we move forward because I think NWI‑4 is really necessary, it's annoying for customers that have small allocation and want to assign that whole ‑‑ that whole ‑‑ the whole prefix, and not needing to say I need to split it up because I cannot put one prefix with the same assignment in the database because there is GDPR constraints. For us as a broker, prefix broker, we do leasing and some of the leases are 22s and if the prefix itself is a 22 or 24, we have to refer back to cutting it up in two pieces and then creating overlapping route object and RPKI stuff so it is easier for us and technically you know it looks better if you can just have one assignment in there. What kind of status it gets, I couldn't care less, please fix it.

DENIS WALKER: Yeah, so the NCC is working on an impact analysis right now. As for the structure of NWIs, the original intention was that this would be a quick and simple way of handling a simple technical change; the other thing was to avoid having all the formality and dates and time periods. In practice, it's probably never been that simple.

ERIK BAIS: That's why my suggestion was to perhaps revisit how we do this and go from there, you know, with some structure. I will be more than happy to chat with you ‑‑

DENIS WALKER: Perhaps we could take this off‑line.

NICK HILLIARD: From INEX. And as Erik mentioned, I have some queries about NWI‑4. The issue as I see it in relation to NWI‑4 is not about whether there's going to be breakage but what level of breakage we consider acceptable and who deals with that breakage. At the moment we have a situation where, as Erik says, there are issues with creating route objects. The proposal on the table for using the ALLOCATED‑ASSIGNED PA status will create an inconsistency in the data model used by the RIPE NCC, so essentially allocations are allocations, assignments within the allocation block are assignments, except if the allocation size is equal to the assignment size, in which case they are this new object, and modelling that is something that will need to be done. It will need to be handled by other software which interfaces with the RIPE NCC and, you know, whether it's broker software, whether it's DCIM or IPAM software, hacks are going to have to be put into place to deal with this issue, because you are essentially saying there's now a completely new way of referring to a situation where the allocation size equals the assignment size.

DENIS WALKER: I don't think it's a new situation because this has always existed but we generally lie because we put data in the database that masks the truth but just gets around the reality.

NICK HILLIARD: Absolutely. There are hacks out there already, and really the issue is, as I said, is not whether there's breakage but what sort of breakage we want there to be and in particular ‑‑

DENIS WALKER: What is likely to break?

NICK HILLIARD: Well in particular where do we want the landing zone to be, say, 10 or 20 years down the line? What do we want the database model of the RIPE database to be like in that sort of time period? Because this is a decision that we make now because it's easier to do this than to change the RIPE database software, it's a lot easier for the RIPE NCC, but ten or twenty years down the road is this really what we want to have as the long‑term solution.

DENIS WALKER: In 10 or 20 years' time I think we need a new design of data model.

NICK HILLIARD: And if you were able to say that was going to have happened I would be saluting you ‑‑

DENIS WALKER: I said it ten years ago.

NICK HILLIARD: I am quite happy to stand at a mic and say I would be gobsmacked ‑‑

DENIS WALKER: Perhaps we can take this off‑line.

NICK HILLIARD: The point is, the model that we use at the moment has a certain set of breakage, and the model that has been proposed as the solution outsources the breakage from the RIPE NCC to the community software. The model that I proposed on the mailing list insources the breakage to the RIPE NCC, and it creates a lot of breakage and is quite a major project. There does need to be a consideration of where we want the long‑term landing zone of this particular problem to be.

ELVIS VELEA: I want to talk a bit about history of objects. The IP addresses that have been issued by the RIPE NCC now have quite some history: there have been bad actors and the addresses have been blacklisted in various ways ‑‑ used for a few months ‑‑ and those IP addresses could stay in the blacklist. If historical data about assignment by the RIPE NCC is deleted or not available, it's difficult for the current user to prove they are not the original abuser and to get off those blacklists. I believe that information showing the history of what the RIPE NCC has allocated or assigned, and to whom, should be of interest and should be available. So, the history of who the block has been allocated to in the past should be available. Historical information about objects that are not allocations or assignments made by the RIPE NCC, I'm a bit torn there because, yes, there is a lot of private information there and that should not really be made available. But I still believe that the information should be made available to the original maintainer, for example historical information about assignments in cases where an LIR wants to look for the history of when an assignment was made to a customer, and stuff like that, but that has been deleted and maybe another customer is using the block right now. Maybe it would be a good idea to be able to just make this information available to the original maintainer, the original creator of that info.

DENIS WALKER: Thanks for that, Elvis.

NIALL O'REILLY: RIPE vice‑chair. I think there's more that I wanted to say than time to do it. I don't propose to take it off‑line because I don't see the mailing list as off‑line, and if people who have said they will take stuff off‑line mean follow up on the mailing list, please do, that's what I'm going to do.

SPEAKER: Maybe for the historical stuff and maybe also for the geofeed stuff, it might be worth having some kind of Zoom calls or something at some point where also maybe Ed and, I can't read your name from here, someone from the NCC legal team participate, so we can have a slightly longer, more dedicated discussion on those topics. Sometimes when it's on the mailing list and it's passing from the team to the legal team, it gets so ‑‑ so many steps.

DENIS WALKER: That sounds like a good idea. Thank you.


WILLIAM SYLVESTER: Up next we have Ed with the operational update.

EDWARD SHRYANE: I am Ed Shryane, I work for the RIPE NCC as senior technical analyst as part of ‑‑ here is the database team, who are responsible for all of the work since the last RIPE meeting. Thanks to them for their work. Progress since the last RIPE meeting: Whois releases, there were two major ones and two fixes for vulnerabilities. In December we implemented the first, with the geofeed attribute as you have heard; graceful shutdown is one of the dividends from the Cloud migration proof of concept that we worked on; and in May, just a couple of weeks ago, we added another feature release with bug fixes, things like abuse‑c being required for end user organisations. We are disallowing weak keys and hash algorithms and working on switching from built‑in to standalone Elasticsearch for the full text search.

There were a number of outages. I can't say more than has been said about the network issues, but the last one, the connectivity issues on 18 January, was caused by a DNS change. We have been migrating from proxy servers to a hardware load balancer and the IPs were shared between environments, so when we switched, we switched production as well, and it caused an outage. It was enough to disrupt DNS ‑‑ the time out, and the connected users were affected by it for up to 30 minutes.

Obviously the fix now is to use separate IPs per environment, so we had to separate that so that changes to one environment won't affect others.
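
[Editor's illustration] The fix described above, separate IPs per environment, lends itself to a simple automated check. The following is a minimal sketch, assuming a hypothetical inventory that maps environment names to sets of addresses; the names, the addresses (documentation range) and the function `shared_addresses` are all invented for illustration and do not describe the RIPE NCC's actual setup.

```python
from collections import defaultdict

def shared_addresses(environments: dict) -> dict:
    """Return each IP address that appears in more than one environment."""
    seen = defaultdict(set)
    for env_name, addresses in environments.items():
        for address in addresses:
            seen[address].add(env_name)
    # Keep only the addresses used by two or more environments.
    return {addr: sorted(envs) for addr, envs in seen.items() if len(envs) > 1}

# Hypothetical inventory using documentation addresses.
environments = {
    "production": {"192.0.2.10", "192.0.2.11"},
    "test": {"192.0.2.11", "192.0.2.20"},
}
conflicts = shared_addresses(environments)
print(conflicts)
```

An empty result would mean that a change to one environment's addresses cannot touch another environment.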

Statistics. As before, changes since the last RIPE meeting: we finished all of the abuse‑c validation for last year, over 110,000 addresses. There were two clean‑ups of the NONAUTH source; the numbers are small but they are continuing, it's now automated and requires no work from the RIPE NCC, and it is causing the NONAUTH source to shrink over time, and that will accelerate as RPKI gets picked up. NWI‑8, we are over a thousand LIR organisations using that, and NWI‑9, the open NRTM, over 200 distinct clients using that as of last month.

Person objects, as you all know, there is a lot of personal data in the RIPE database, most coming from assignments, and the top ten LIRs are responsible for about a million of those persons. We unlocked 600,000 in June 2020; they were originally unmaintained, they were locked by us ten years ago and we unlocked them by reassigning them back to the LIR maintainer about two years ago. Since then, the LIRs have told us that they are maintaining those objects and they have informed their users that they are doing that. But very few of them have been deleted and few have been updated, and we are talking about person objects which are more than ten years old, so there is not a lot of change coming into the RIPE database to maintain those. One thing that's been talked about is the remainder of the person objects locked by the RIPE NCC; there are 13,000 of those. There are 3,000 referenced from multiple objects, but for the vast majority of those we have a clear link between the locked person and the referencing object, so we suggest that, like the phase 1 project, we now unlock those 10,000 where we are sure there is a one‑to‑one link with the maintainer and assign them to the referencing maintainer.
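
[Editor's illustration] The selection rule sketched above, unlock a person only where there is a clear link to one maintainer, could look roughly like this. It is a sketch under stated assumptions: the input is a hypothetical mapping from a locked person's identifier to the maintainers of the objects that reference it, and the identifiers and the function `split_by_maintainer_link` are invented for illustration.

```python
def split_by_maintainer_link(references: dict) -> tuple:
    """Partition locked persons by how many distinct maintainers reference them."""
    unambiguous = {}
    ambiguous = []
    for person, maintainers in references.items():
        distinct = set(maintainers)
        if len(distinct) == 1:
            # Exactly one referencing maintainer: safe to reassign to it.
            unambiguous[person] = next(iter(distinct))
        else:
            # Zero or several maintainers: needs a manual decision.
            ambiguous.append(person)
    return unambiguous, sorted(ambiguous)

# Hypothetical references: person identifier -> maintainers of referencing objects.
references = {
    "AB1-EXAMPLE": ["EXAMPLE-A-MNT"],
    "CD2-EXAMPLE": ["EXAMPLE-A-MNT", "EXAMPLE-B-MNT"],
    "EF3-EXAMPLE": ["EXAMPLE-C-MNT", "EXAMPLE-C-MNT"],
}
to_reassign, needs_review = split_by_maintainer_link(references)
print(to_reassign)
print(needs_review)
```

Note that a person referenced from several objects can still be unambiguous when all of those objects share one maintainer, as in the third entry.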

Search UI, as my colleague presented at the last RIPE meeting, we have been working hard on the UI, the web query page, and this is the progress since the last RIPE meeting. We have integrated the page template, that happened earlier last year, so there is a consistent look and feel across different RIPE NCC sites. But we have also implemented the relevant filtering in the drop‑downs after you search for something, and also searching by flag: for the 5 to 10% of users who are typing flags into that field, we now explain better under the field, in real time, how those will affect your results, so it is a bit easier to understand. The idea is to simplify the page and make it easier for everybody to use. The remaining work, though, is to start integrating the full text search ‑‑ they are currently on separate pages. I think the Whois searching is very good for numbers and for resources but not so good for words, for names, and that's where full text search is much better and we can start using that more, so that will be upcoming work, to try and integrate that and make full use of our search API. Related to that, once we have switched to Elasticsearch we will be more free to better support a public API, and that's something that has already been promised, but we will open that up so users can query through Whois to the Elasticsearch database.

Operational work, there is a bunch of that. Graceful shutdown simplifies our environment, it means we don't have to script restarts, and so deployments and maintenance are way easier. We removed the proxy servers, so a third of our servers in production have disappeared; it makes maintenance work much easier and removes a layer. We haven't missed the proxy servers so I think it's gone well. One casualty of that is the client certificate authentication: we had a test environment that allowed you to use client certificates to authenticate updates. We have to migrate that work over now to the new load balancer we are using; we have started doing that but we haven't resolved it yet. And finally, Elasticsearch: we need to set that up operationally as well so we can start using it and drop the built‑in Lucene.

UTF‑8, the Working Group asked the RIPE NCC to do an impact analysis of the technical and functional impact, how that would affect the RIPE NCC. I published a labs article and I would welcome any feedback. I think the next steps are to ask the community for more functional requirements and where exactly UTF‑8 should be added. It's not only a technical addition; there will be changes needed across the RIPE NCC to support this in processes, the internal registry and the RIPE database itself.

White pages, this came up, Cynthia pointed this out a year‑and‑a‑half ago, I made a proposal at the RIPE 82 ‑‑ sorry, white pages, there was a suggestion to deprecate white pages. It's not used very much, there are four references to the white pages organisation, so we have been asked to deprecate that. We have removed the documentation, removed the org type from the software, e‑mailed the persons referencing the organisation, and we plan to finish the clean‑up and delete the object once we have made arrangements with those persons.

The alternative to white pages that we have been using is that, on request, we have been adding an exclusion to the clean‑up of unreferenced objects job. The problem is there's no defined purpose for doing that; we have been doing it to assist our users, so that the person objects that they are referencing outside the RIPE database don't get deleted, but we need to know from the community whether we should continue doing this. Is it in accordance with the purposes of the RIPE database to make these exclusions?
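
[Editor's illustration] The exclusion mechanism described above amounts to a set difference: take the person objects that nothing references, then remove the ones on an agreed exclusion list. A minimal sketch, assuming hypothetical identifiers and a hypothetical function `cleanup_candidates`; the real clean‑up job's rules are not described in this session.

```python
def cleanup_candidates(persons: set, referenced: set, excluded: set) -> list:
    """List unreferenced person objects that are not on the exclusion list."""
    unreferenced = persons - referenced
    # Excluded objects survive even though nothing in the database references them.
    return sorted(unreferenced - excluded)

# Hypothetical data: all persons, those referenced somewhere, and exclusions on request.
persons = {"AA1-EXAMPLE", "BB2-EXAMPLE", "CC3-EXAMPLE", "DD4-EXAMPLE"}
referenced = {"AA1-EXAMPLE"}
excluded = {"CC3-EXAMPLE"}  # e.g. a contact used by an external service

candidates = cleanup_candidates(persons, referenced, excluded)
print(candidates)
```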

Default maintainer, as I said at the last meeting, we are making changes to that. One thing to say is that it's optional, there are nearly 1,000 LIRs that don't have a default maintainer. We need to properly ask for that in our software. We need to extend the default maintainer, where it's used, to all of the top level resources for an organisation, including PI and aut‑num, and there have been cases where internal users are using override to change objects without using the set default maintainer, so there are two processes going on in parallel and we need to resolve that. One of the issues is that for closed LIRs, the default maintainer isn't unlinked from the organisation, so we need to resolve that too.

Upcoming changes, apart from the numbered work items: missing history, which we have been aware of, and in particular NWI‑2 makes it more urgent that we look at it. There is a small number of existing objects that are missing some history between mid‑2001, when the current iteration of the database was created, and 2004. It affects 23,000 out of 8 million objects, so a small amount, but it does have an impact on those old objects. It affects querying for version history, so generally the first version of the object would be missing. How did this happen? It's been a long time; what I have heard is that, because it was quite systematic at some points, it's possible Whois did this automatically. Also there is a break‑up script that accidentally deleted data. Since the database is a registry and version history is a feature that's used, we should have a complete history. We had a plan to do this, we looked at it, and it's possible to restore the missing versions from the update logs, which we have for that time period, and we can cross‑check with our internal registry, which does have the full versions, so there are no gaps internally.
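
[Editor's illustration] The restoration idea described above, recovering missing versions from the update logs, starts with finding which logged updates have no counterpart in the public version history. A rough sketch, assuming each version and log entry can be reduced to a comparable timestamp string; the data and the function `missing_versions` are hypothetical and say nothing about the actual log format.

```python
def missing_versions(public_history: list, update_log: list) -> list:
    """Return update-log timestamps that have no matching public version."""
    known = set(public_history)
    # Preserve log order so restored versions can be replayed chronologically.
    return [stamp for stamp in update_log if stamp not in known]

# Hypothetical object: the first two updates are absent from the public history.
update_log = ["2001-08-01", "2002-03-15", "2004-06-30", "2010-01-05"]
public_history = ["2004-06-30", "2010-01-05"]

to_restore = missing_versions(public_history, update_log)
print(to_restore)
```

A cross‑check against a second source, such as the internal registry mentioned above, would follow the same pattern.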

RIPE database documentation, this is something we would like to modernise. It's currently static text on the main website with lots of legacy pages scattered around. There's already been an effort to modernise documentation, it's already been done for RIPEstat, so we have an example. We would like a consistent UI across all applications, built‑in search facilities from within the page, and to make it easier for the team to get the documentation updated. Currently we go through a website tool to update each page manually. We will have a pipeline so that we can make a Whois release with doc changes and deploy them at the same time. We are planning to do this after this RIPE meeting and combine all of the other legacy database documentation into this new site as well, so you will have one site to look at for all of the RIPE database documentation.

Whois tags, apologies for earlier, this is the suggestion from Cynthia, from December two years ago. The tags are a Beta feature we added in 2013 to add metadata around objects. It's rarely used; it shows up in the dump files but it's not queried for. We have four query flags, very rarely used, and the tags are no longer updated, they haven't been updated in years, so I suggest we remove the tag feature altogether. I will make a suggestion on the mailing list, and anyone who has an objection to this, please follow up.

Service criticality, so I think this is my final slide, as ‑‑ my colleague Kaveh has presented yesterday, the components of the criticality will include availability, confidentiality and integrity, we give these areas a rating, between low and very high, we will be publishing something specific for the RIPE database in the coming weeks and we will ask the community to give us your feedback on how you use the RIPE database and how critical do you find the service.

That's it. Thank you for your attention. Any questions?

CYNTHIA REVSTROM: I forgot what I was going to say. Never mind.

EDWARD SHRYANE: I am happy to follow up afterwards, thank you. That's it, thank you very much.


KAVEH RANJBAR: The tags were my brainchild in 2013, when I was working closely with the database team. A big plus one ‑‑ I want to deprecate them.

SPEAKER: But just ‑‑ Harry Cross, representing myself. Just a thought about the whole Unicode UTF‑8 situation: in the light of increasing GDPR requirements, is it not important that we think about whether we need to support Unicode to handle scripts like ‑‑ characters in the database? But on the flip side of that, I am slightly concerned that broadening the character set may lead to pollution of the database, so to speak, and I would be interested to see if there was some sort of balance that could be managed.

EDWARD SHRYANE: Absolutely, and it's something I kind of touched on in the labs article. Interoperability is also very important, we need to make sure it continues to be usable, but there are some technical solutions: we can support a subset and do normalisation and exclude certain characters. They all need to be taken into account in the implementation. Historically we only support ASCII in names, so it's a tiny character set, and that's the default character set in the database.

SPEAKER: Yes, and also the whole GDPR thing, I don't know whether the script that something is written in matters for the accuracy or whether, as long as there is something in there that represents that name?

EDWARD SHRYANE: Yeah, and whether we have in the implementation two attributes maybe, where one is normalised or ASCII and one is the correct name, or we have one attribute which has the UTF‑8 or Unicode encoding, it's up to the community for feedback.
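
[Editor's illustration] The two‑attribute option mentioned above, one normalised name and one ASCII form, can be sketched with Python's standard `unicodedata` module. This is only one possible approach under stated assumptions: the attribute names `name` and `name-ascii` and both function names are hypothetical, and stripping combining marks is a crude fallback, since letters without a decomposition (for example "Ø") are simply dropped.

```python
import unicodedata

def normalise_name(name: str) -> str:
    """Normalise to NFC so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", name)

def ascii_fallback(name: str) -> str:
    """Derive a rough ASCII form by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII is dropped; a real design would need
    # a proper transliteration policy agreed by the community.
    return stripped.encode("ascii", "ignore").decode("ascii")

# Input written with combining accents; NFC folds them into single characters.
record = {"name": normalise_name("Jose\u0301 Mu\u0308ller")}
record["name-ascii"] = ascii_fallback(record["name"])
print(record)
```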

Harry: I know referencing the BGP tools talk earlier, I feel ‑‑ whatever is going to happen is probably going to make Ben's life a lot worse.

CYNTHIA REVSTROM: I remember what I was going to say. So, it's about the ‑‑ white pages clean‑up and how if we should support those kind of ‑‑ well basically unreferenced objects not being cleaned up and there is at least one case where that is and I think it's anchors in the RIPE Atlas, I feel like that's totally valid use case for contact and not have a separate system for contact information so I feel this needs to be supported at least in that case, I am not sure if there are other cases.

EDWARD SHRYANE: One other case where it is used: there is a reference from a domain name, so from a domain registrar, to reference a person in the RIPE database, that was another use.

CYNTHIA REVSTROM: I am not sure about that, that's probably also correct, at least for RIPE Atlas, another RIPE NCC service.

ORN ORRASON: RIPE, at least in those cases with RIPE Atlas it makes a lot of sense to not allow them to be cleaned up.

WILLIAM SYLVESTER: Up next we have Denis who is going to talk about personal data in the RIPE database.

DENIS WALKER: Hello again. It's still me and still Denis Walker.

Personal data in the RIPE database. Privacy has been a common theme I think for this session today. A brief summary:

I published this policy proposal on 10th May. Personal data has always been entered into the RIPE database. The question is, in almost all cases, personal data is not actually needed. The only time we really need personal data is when a resource holder is a natural person. Everywhere else we'd prefer business details to personal details.

Now, it's often been said, yeah, but the data subject gave consent and it's fine publishing personal data if somebody has given their consent to it. But this is an open, public database. It's available globally to anybody who is connected to the Internet. So once you publish information to this database, it's out there, it's in the real world, you can't take it back. It's too late if, after you have published it, you think, ah, I have some privacy issues with putting my personal details in this public database. It's too late. And most people don't realise this. Yes, you guys, the operators of these systems, you know what this database is, but when you get consent from your end users, your customers, whoever, to put their details in, do you really go into detail to explain to them the consequences of publishing their name and address in this open, public database? So, really, unless it's absolutely essential for the purpose of the database to publish personal information, it shouldn't be there.

Now, the ‑‑ the main function of the RIPE database is to be this public face of a register. The registry needs to record who holds blocks of IP addresses. This is the main reason the database exists. Even if it's a natural person. The issue of publishing the addresses of natural people is a little bit more restrictive, but in any case, even if it's a public register, do we really want to publish the name and address of a natural person? We could go down the road of domain registries, where you can tick a box to say please don't publish my details on your registry, and you will get some comment saying that the details are held by the registry but not necessarily public. We can go down that road, but you also then have to understand that there are consequences of doing that. People and organisations try to handle abuse issues on the Internet. The police and law enforcement agencies try to handle criminal activities on the Internet. They need to know who is using these blocks. So, if we decide to take some of the information that is now public and make it semi‑private, we have to think later about how we allow people to have access to the information, and who we allow. Now that's out of scope for this policy proposal. But the Anti‑Abuse Working Group needs to be aware that, further down the line, if, at the end of the day, we decide that this information shouldn't be publicly available, they are going to have to consider how abuse handlers and LEAs actually get access to this information. So it's not relevant to this policy proposal, but it is going to be one of the consequences if we go down this road.

Phone numbers and e‑mails don't need to be personal information.

Contacts. Again, this database is built on the idea that it was the contact database for solving operational issues, abuse issues, administrative issues, but contacts should be roles, they should not be personally identifiable people. The contact name of the role object shouldn't be a person's name. And very particularly, contacts need to be able to fulfil one of the defined roles according to the purpose of the database. This is not a customer database. I know some large telcos have listed their entire customer list in the RIPE database; everybody that connects to the Internet with a single IP address is in there. They have got hundreds of thousands of person objects. This is not what the RIPE database was built for and I don't think we are ever going to define the purpose as being that you can put every single customer's details into this database. So, unless a contact can actually solve a technical issue, an administrative question or an abuse problem, they should not be in this database, with or without personal information. And again, phone numbers and e‑mails of contacts don't need to be personal data. At the moment, the address is a mandatory attribute of the role object, but do we really need the address of an abuse desk? Is anybody going to write them a letter or call in because they happen to be in the neighbourhood? I don't think we need addresses for contacts.

I also added in this issue about verification. As I said earlier, with the MNT ref, anybody can reference your role object, your maintainer, and pretend that you are responsible for their data. The MNT ref gets around that aspect of it, but then you could create a role object and copy the information from my role object, so you can still use my phone number and my e‑mail address. That's why I think this information should be verified. If you are going to enter a phone number or an e‑mail address into this database, you should be able to verify that it is the correct phone number or e‑mail address. And basically if you can't verify it, you shouldn't be entering it in the database.

This change is not going to happen overnight. It's going to take some time, probably years, to achieve full compliance with this policy if it's approved. But we have to start somewhere. As Ed said, 1.9 million personal data sets in this database, who on earth are these 1.9 million people? How many of them know that they are in there? How many of them do we really need? Probably ten of them.

So, we have this mindset, and we have always had this mindset that it's fine putting personal data, it's the RIPE database, it's the registry, Joe Bloggs, whatever, wants to be seen in there, it's kind of a bit of a badge of honour, or used to be, to have your personal object in the database. But in most cases we don't need it; it shouldn't be there. And with the quantity we have and it's still growing over time, we still are getting more and more and more personal data sets put into this database, it's just the wrong place to be putting it. So we need to change the mindset. That is much more difficult than a technical tweak. Unfortunately, your brains are not connected to our systems at the moment so we can't just reprogramme you. But we have to start.

Just one administrative issue, obviously I'm a co‑chair of the Database Working Group, I am putting forward a policy proposal, I will have no involvement at all in deciding whether any consensus is achieved on this policy proposal. The co‑chair William will be doing that, and Erik from address policy has kindly agreed to support William, because Erik has a lot of experience dealing with policy proposals. So we wanted to make that clear, that there is this cooperation going on between the Working Group co‑chairs to help determine whether there's a consensus or not. So, questions?

PETER HESSLER: I have a bunch of comments and I will try and be concise. First one is the question about law enforcement agencies, and my response is get a warrant.

As far as personal data in the database, just because we did publish it doesn't mean we can stop ‑‑ we can't stop publishing it. Deplatforming information does work; we have seen that quite a bit, especially looking into white supremacists: removing their platform does remove their reach, and removing personal information from the RIPE database does help quite a lot. As far as the requirement to set a role and not a person object, I have had situations, especially with small LIRs, where the department is Dave, the department is Bob, it is a single person and there is not even a department e‑mail or phone number for that. They would have to publish their personal e‑mail address or personal work e‑mail ‑‑ phone number, in the database, for this, they can't ‑‑

DENIS WALKER: I would question that, because even if Dave is the abuse desk, the tech desk or whatever, do I need to know it's Dave or do I need to know that this is the phone number or e‑mail address for the technical desk and whoever happens to answer that phone will be able to answer the technical ‑‑ I don't need to know it's Dave.

PETER HESSLER: True, you do see it's David at company.example versus abuse desk at company ‑‑

DENIS WALKER: You can set up e‑mail addresses that are not personal. VoIP routed to somebody's phone ‑‑

PETER HESSLER: Some companies you cannot do that, and there was no way at all you could arrange this and I tried, and it's completely failed. Because in those cases it was my work information. And I have privacy concerns that I have established on the mailing lists.

The other suggestion is the phone and e‑mail confirmation, I think this is a bonkers idea. As an example, I have never had a fax machine at the company I worked at and I don't know if it's still true but I have had some objects where a fax number was required to be published and I had to create 00000 etc., just to comply with these ‑‑ with this requirement. It is clearly false. But if I didn't have a bogus fax number I could not update the object or create the object and those are required for operational purposes.

DENIS WALKER: We should look at whether we should require fax numbers in the future.

PETER HESSLER: That's very true but there are other cases where a phone number is not relevant in any case. Or an e‑mail address may not be relevant. And requiring validation of this is silly. Like, you call someone, I don't recognise the number, I am not going to answer. You e‑mail somebody, I assume it's spam.

DENIS WALKER: Are you saying it's perfectly fine for you to enter my phone number if you happen to know it?

PETER HESSLER: It is better to not have a phone number field that requires validation and somebody calls you how do you know what they are asking is the truth? How do you know their responses are accurate?


CYNTHIA REVSTROM: I mostly agree with pretty much all of Peter's points. However, I guess it's not really the validation that I have an issue with ‑‑ I am not sure about fax, and also I am not sure if we want to keep supporting person objects being created that still require a phone number; that feels a bit dumb, for the reasons mentioned. Sadly, I think making it clear that this is one person's individual e‑mail address is still useful, to be able to mark that in the RIPE database, so making everything role objects even when it is just an individual's personal data, I don't think that's something we want to do. I think we want to know when there is personal data.

DENIS WALKER: I think we want to know that there isn't any personal data in the mailing list.

CYNTHIA REVSTROM: If there is which there in some cases will always be, I think we at least want to know that there is. We don't want to pretend like it's not. And then ‑‑

DENIS WALKER: That's why I say we need a mindset change.

CYNTHIA REVSTROM: To remove the phone number requirement and remove the address requirements for role and person ‑‑ I can't think of many cases where that would be needed. Make it optional: allow people to put in an address but don't require it, because, like, who would send post to an abuse contact?

Leo: I broadly agree with you. The database is hugely polluted by a lot of irrelevant person data. I also broadly agree with you that this is going to be a massive project. It's not just a change of mindset; it's a change in a lot of software, at a lot of organisations around the region that automate, and so this isn't just a RIPE NCC thing; this is a RIPE community and RIPE NCC member/customer thing, and it requires a lot of work, and we should look at that significant workflow issue before we even start looking at the technology. And finally, we also need to look at the fact that phone numbers might well be going away in the not hugely distant future, and postal addresses, well, the daily postal delivery is already going away in parts of the region and we can't expect that postal delivery is going to stay forever, so we need to look at the whole contact model.

DENIS WALKER: Yeah, I agree.

ELVIS VELEA: The data registered in the RIPE database becomes outdated and stale; a lot of it remains after the resource has been modified or deleted, and a lot of that data is also personal data. So I think if you come up with a solution to delete ‑‑ maybe, I don't know, require the LIR to confirm that the object in the database, i.e. inet6num, route, route6, ROA, whatever, is still needed, once a year, once every five years, we will figure out a way, but I think we need to somehow find a way to clean the database, because it's full of objects that were created in 1995 and have not been used since 1996.

DENIS WALKER: Well, you can't be sure it hasn't been used. It might actually still be in use and perfectly valid. There are objects that have been there 20 years.

ELVIS VELEA: Still up to date and still valid and still needed in the RIPE database ‑‑

DENIS WALKER: I think, following on from what Leo has just said, if we are going to start asking all our members to annually confirm that all their objects in the RIPE database are still valid and should still be there, I think you had better put your hard hat on and duck while the bricks are flying.

ELVIS VELEA: If you don't do it once a year, you do it when the ARC happens ‑‑

DENIS WALKER: I take your point.

SPEAKER: Matthias. So I do acknowledge that removing full addresses for individuals makes sense from a privacy perspective. However, we need to keep in mind that there are multiple people with the same name, and at least having something like a city on there might help in identifying who is actually meant by the object. I do believe that RIPE actually removed the address line on at least some organisations owned by people. So, yeah, that might be a compromise and make sense to do.

DENIS WALKER: Thank you.

CYNTHIA REVSTROM: I just want to clarify I think while yes, it does make it difficult to identify which exact person it is, I think it's kind of the point, we don't want you to be able to identify one exact person because it's just contact information, not information meant to identify an individual. Thank you.


PETER HESSLER: I did want to comment on that specifically. So as you know I mentioned on the mailing list I am keeping my address private for personal safety reasons. I personally have no issues with RIPE NCC having my home address as private registration information and then keeping it private internally. I also generally don't have a problem with having a city or regional information available, so I could be identified as being that Peter Hessler for other questions in other searches that could be made based on that.


PETER HESSLER: So I feel my, especially living in Berlin, you know, quite a few million people living in ‑‑

DENIS WALKER: If you lived in a little village, that might be a different situation.

PETER HESSLER: True. And there should be a balance between the two, and I'm not entirely sure of the best way of doing that. "Peter Hessler, Germany" could be something, along with a unique identifier that somebody who needs to go to the RIPE NCC for more information could provide, and if they have a valid warrant or other reasonable information, then they could get it.

DENIS WALKER: Thank you. So ‑‑

WILLIAM SYLVESTER: We have one question online and then we are going to wrap up. This is Florence from MPI, regarding the "get a warrant" point: please keep the possibility to publish public contact data, as I prefer to get a call from the police rather than them coming in with a warrant and taking all devices with them, which is what police do in Germany when they get a warrant on IT crime.



DENIS WALKER: Please keep your comments and thoughts coming, thank you.