These are unedited transcripts and may contain errors.

DNS Working Group session, 26th of September, 2012, at 11 a.m.:

CHAIR: Good morning. It's about time to start, and so, a little bit of time to go. Hi, I am Jaap, one of the Chairs of this Working Group; the other two, Jim and Peter, are running around somewhere. This is the first session of the DNS Working Group, and first let it be known that everything is recorded and filmed, there are secret microphones on your chair, so if there are things you don't want to be known by the rest of the world, don't talk.

Also, the other thing is that we have a mailing list, and if you don't know what it is, it's the usual stuff. I don't know why it's there. The minutes have been posted on the mailing list. We didn't see any comments, so I guess we can say that we now approve the minutes of last time. And apologies for the slightly late appearance of the minutes. Again, if you are talking at the microphone, please state your name loud and clearly. And then I guess this is the final agenda, although as usual, things are slightly different; we are using dynamic updates for getting the agenda items up and it is not really up-to-date yet, we have to swap OpenDNSSEC and the benchmarking effort. If there are no other questions, I really would like to ask Shane Kerr to do his bit. Oh, there is a small problem: some of the slides actually ended up in the wrong sessions, but we hope we have them all completed now.

SHANE KERR: I am Shane Kerr, I work at a company called ISC, we make software called BIND which does DNS. This presentation was actually going to be given by Ondrej Sury but he had to leave the meeting early. And what I am going to talk about today is some work some of the people making DNS software have been doing to try to come up with some standard ways to measure performance in DNS software.

So, my understanding of how this whole activity got started is that CZ.NIC has produced a new authoritative-only software called Knot, and as part of this activity they did some benchmarking to see how fast it goes. My understanding is that the developers talked to Ondrej and said our software is really fast, and then Ondrej said, well, but why should I believe you, of course you say that your software is fast? Which is a really good question.

So, Ondrej scratched his head and thought, yeah, that's a good question, maybe we should come up with some sort of more interoperable way to benchmark these things.

So Ondrej talked to, I believe, Joe, and they talked about this and said, well, why don't we get together and come up with some benchmarks and ways to measure the performance. So we had a chat over dinner with some DNS developers from, well, you can see on the list here: we had BIND developers from ISC, people from NLnet Labs, and some of the developers from CZ.NIC who make the new Knot name server.

We also had a few people there who run large name servers, ccTLDs and root name server operators, and we had a nice dinner and agreed that we should get some performance benchmarking going, and probably we should also do some conformance checking on the protocol level as well. The idea here is not to do a comprehensive check, but just to check the things that people actually use and make sure that we behave in somewhat standard ways and things like that.

So, the next thing that happened was NLnet Labs went ahead and said you can all come and hang out at our place for a couple of days, and so they invited a bunch of DNS developers and we spent two days together in a room in Amsterdam talking about this stuff. We had developers from Knot, which is CZ.NIC; YADIFA, the .eu folks; and of course the NLnet Labs developers were there too, and I was there from ISC and we make BIND.

So, we spent a long time going over the way we currently test our software, because everyone has slightly different approaches, and talking about the current problems and status of our software, and then we agreed on our next steps. Those are basically: we set up a git repository, we are going to do this in an open and transparent way; we also defined what we think are going to be our initial scenarios; and we will come up with some software to do this testing. They are not too fancy, basically what you would expect: we want to do a delegation-only zone, something that would be useful for ccTLD operators and root server operators and things like that, and then a scenario for a company that does DNS hosting, so basically what they are thinking here is lots of smallish zones instead of one or a small number of really big zones. Those wouldn't be delegation-only, they would have lots of A records and MX records and things like that.

And we also thought compliance tests would be nice, but that's not going to be our initial focus; looking at the protocol for problems is important, but the initial focus is to come up with a couple of simple tests. We talked through a number of different ways this could be done. I think what we are going to end up doing is having a test suite that you can download and run on your own systems, but I would really like to see an easier way for an administrator to compare the performance of servers than downloading and installing four or five different name servers and testing them. So I think it would be really good if some organisation were to do benchmarking, say, every year or every six months or something like that, of the latest version of all the popular name server software and just put it on a web page somewhere. We don't have any volunteers for that; it probably shouldn't be any of the people involved in developing this software, just due to conflict of interest reasons. We don't know where that's going to fit.

What are the next steps? Well, our intention is to have everyone contribute to this git repository; I haven't checked it, so I don't know how much code is in there now. And we are going to be talking later today to see what our next steps are with this. It would be nice to have more involvement from vendors. The people who make PowerDNS, which is open source DNS software, were invited to our last meeting in Amsterdam, but they were invited like a day or two before we started so they weren't able to attend; we would still like to get them involved. Since it's open source, I think there is no reason not to participate in this effort, and it would also be good to have some commercial vendors, so if you work for a company that makes a proprietary DNS product, you can also participate in this effort. I don't know what the feelings of commercial vendors are as far as this kind of thing. I know some companies have contractual requirements that you can't publish benchmarks of their software, so if you have things like that, talk to your legal department and tell them they are stupid and get that taken out so you can join us. Otherwise no one will ever know how fast and great your stuff is. I think that's about it.

So I just wanted to give you an update. Hopefully by the next meeting we will have some stuff you can download, some numbers we can publish and things like that. That's it. Are there questions?

JIM REID: Thank you, just some guy. Shane, is there anywhere we can find information about the development of these benchmarking efforts? Do you have a mailing list or website or anything where the public can see what is going on?



SHANE KERR: It's not meant to be a closed effort, but, well, I personally don't want it to be something where everyone who runs a DNS server in the whole world feels like they get a vote. This is really a vendor activity, so if you have specific requirements or numbers that you want to see, that are crucial for this to be useful to you, I would love to talk to you about it.

JIM REID: That's what I was thinking; we could articulate some requests or demands for benchmarks you would be interested in seeing.

SHANE KERR: I expect we will set up a web page and mailing list at some point. As I tried to indicate, it's all very ad hoc and very informal.

JIM REID: OK. Thanks.

DANIEL KARRENBERG: I did a name server tester like a decade ago. Two questions. One is: which direction are you going in here? Is this just sort of local benchmarking on one system, against a name server that runs on the same system, or more like the DISTEL thing, which tried to set up a network, run queries against the name server that ran on one box and find out what the answers were and what the performance was? That's the first question.

The second question is: are you also looking for realistic test loads that might come from actual name server operators who can give you, hey, this is what we are seeing right now? For instance, I am thinking about reflection amplification attacks and things like that.

SHANE KERR: I will answer your second question first, which is that of course real world query traffic is the true measure, and we would like that. However, I think it's important that people be able to reproduce these benchmarks in their own environment, so I'd like to make sure that we avoid private query data that can't be published. I think right now we are thinking of synthetic data that tries to look like real world data. I don't know of a way to get real world data that doesn't have any privacy concerns. We can try to anonymise it, but that's always a risky thing. Unless we need to, I think we are going to try to avoid that. We may end up going a different way. And as far as the actual test scenario, how it's going to run, I don't know; I envision it as software that you can download as a tarball and run in your own lab. You probably need several machines for authoritative testing, maybe three or four machines, in order to get the query load high enough, and that's basically it. And then, as I said earlier, we would like to have these published regularly, but I am not sure how we are going to do that.
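As a sketch of what "synthetic data that tries to look like real world data" could mean: query streams are often modelled with a Zipf-like popularity distribution, a few very hot names and a long tail of rare ones. This is only an illustration of that idea, not the group's actual tooling; the names and parameters are invented.

```python
import random

def synthetic_queries(n_names=1000, n_queries=10000, s=1.0, seed=42):
    """Generate a synthetic DNS query stream with a Zipf-like
    popularity distribution: the name of rank r is queried with
    probability proportional to 1/r**s."""
    rng = random.Random(seed)
    # Hypothetical names, as might appear under a hosting-style zone.
    names = ["label%d.example." % i for i in range(n_names)]
    weights = [1.0 / (r ** s) for r in range(1, n_names + 1)]
    return rng.choices(names, weights=weights, k=n_queries)

queries = synthetic_queries()
```

Because the stream is seeded, a benchmark run built on it is reproducible, which is the point Shane makes about people being able to repeat the tests in their own environment.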

AUDIENCE SPEAKER: Emilio from RIPE NCC. I have a clarification from Ondrej to the question by Jim: he said there is more information available on the mailing list, dns-benchmarking at list dot dns-org.

SHANE KERR: There is a mailing list which I am probably on. Sorry about that.

CHAIR: Thanks. It looks like we are done, and that answered the last question as well, so thanks, Shane.

SHANE KERR: Thank you.

CHAIR: The next presentation will be from Sara Dickinson about OpenDNSSEC and the slides are up. I love dynamic updates.

SARA DICKINSON: So, what I am going to present today is a project update for the OpenDNSSEC project. I will start with a little bit of background: what is OpenDNSSEC? It's a turnkey solution for DNSSEC. Its role is to automate the zone and key signing management tasks involved with running a DNSSEC set-up. It doesn't have a name server in it, so it doesn't serve the zone, but what it does do is take an unsigned zone and generate a signed zone for you. It's open source software and distributed under a BSD licence.

The key features are that it's flexible; it's a policy-driven concept, we have something called the KASP, the key and signing policy, and that allows you to define a range of parameters as to how you want your zones signed. It's scalable: you can use it on a single small zone, or it will scale to thousands of zones, or to a single very large zone with millions of records in it, and it can also support key sharing between those zones to minimise the number of keys that you need to manage. It's inherently secure and has out-of-the-box PKCS#11 support, and as part of the project we provide a software implementation of a security module that can be used alongside the OpenDNSSEC solution.
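The KASP described here is expressed as an XML policy file (kasp.xml). As a rough illustration only, a policy sketch might look like the following; the element names follow OpenDNSSEC's documented kasp.xml layout, but the values are invented examples, not recommendations from the talk.

```xml
<!-- Illustrative KASP policy sketch; values are made-up examples. -->
<KASP>
  <Policy name="default">
    <Description>Example signing policy</Description>
    <Signatures>
      <Resign>PT2H</Resign>           <!-- how often to re-sign -->
      <Validity>
        <Default>P14D</Default>       <!-- RRSIG validity period -->
        <Denial>P14D</Denial>         <!-- validity for NSEC(3) sigs -->
      </Validity>
    </Signatures>
    <Keys>
      <KSK>
        <Algorithm length="2048">8</Algorithm>  <!-- 8 = RSASHA256 -->
        <Lifetime>P1Y</Lifetime>
      </KSK>
      <ZSK>
        <Algorithm length="1024">8</Algorithm>
        <Lifetime>P30D</Lifetime>
      </ZSK>
    </Keys>
  </Policy>
</KASP>
```

Zones are then attached to a named policy, which is how one policy can drive the signing of thousands of zones.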

In terms of the organisation behind the product, we have a number of organisations that contribute various resources to the project. We have an architecture board that's made up of individuals from some of those organisations and also from organisations that use OpenDNSSEC. We have a project team who are in the trenches doing the day-to-day work, and alongside that, we also have a not-for-profit company whose remit is to try and secure funding to support the project going into the future; it also operates services such as training, consultancy and long-term software support.

So, to move on to updates. The latest stable release of OpenDNSSEC is 1.3; we are currently on the 1.3.10 release. 1.3 has been out for just over a year and it's stable production code running quite happily. The companion SoftHSM release for that is 1.3.3. What we are currently working on is the 1.4 release; we are at the beta stage of that. The major developments in 1.4 are, firstly, that we have added input and output adapters to enable zone transfer through either file, AXFR or IXFR, and that's integrated into the product now, so that's quite an extension over earlier versions.

A second thing we have done is remove the integrated auditor that was available in the product up until 1.3. One of the major motivations behind this was to remove the dependency on Ruby from the product. Also, there is a growing number of tools that you can use to externally validate your zone, and there is an argument that you are better off validating your zone with an entirely different product than the one you used to generate it.

Also, in 1.4 there is a PIN storage facility, so for the paranoid among you, you no longer need to have your HSM PIN in clear text in your configuration file. And something that wasn't on the original roadmap for 1.4 but has been added in is a multi-threaded enforcer option. This is currently only supported with the MySQL database back end; we hope to resolve that. This is really targeted at increasing the performance of OpenDNSSEC. Obviously it's quite system-dependent, but we would encourage people to pick up this beta, play with it and see what that option can do for you.

The next stage in 1.4 is that we are targeting a release candidate, probably in mid-October. Alongside that effort there is also development on SoftHSM 2.0; we are approaching the alpha stage with that. The major change there is in the internal architecture, which has been designed with the ability to have pluggable crypto libraries; currently in 2.0 that framework supports both Botan and OpenSSL. There are also a number of improvements to the security of the application, and if you are interested, there is a whole heap of information on our wiki page, which should be linked from there.

In terms of the future roadmap, the next significant chunk of work is the 2.0 release, which is scheduled for next year. OpenDNSSEC's architecture has two components: there is an enforcer component, which processes the policy instructions into specific signing instructions for a given zone, and a separate component which takes those and does the signing. In 2.0 we are refactoring that with a number of goals. One is, again, increased performance; we are looking at how this product scales. The second is more flexibility in roll-over mechanisms: in the ability to roll algorithms, and in the ability to use a single key as both your zone signing and key signing key. Another feature which has been quite highly requested is support for just passing through unsigned zones, and that's going to be possible with 2.0 as well.

The core of 2.0 uses some interesting theory in trying to model key roll-overs in a very robust fashion; if you want some light bedtime reading, there is a white paper linked there which describes in some detail the theory behind 2.0 and how we achieve the level of flexibility in terms of algorithm management that we have been looking for. Also, there is an appeal for alpha and beta testers, so if there are people out there who are specifically interested in either the performance side of this product or in exercising the different algorithm options, please do get in touch with us.

Beyond that, the kind of things that we are looking at are extending the adapters to support input and output from databases as well; we are looking at supporting dynamic updates in a future release; and we also would like to create a common API for the system. As always, we are keen to have your feature requests, so get in touch with us and please take a look at our website; there is a lot of information there about the product and the team behind it. So, I do encourage you to read that. That's everything for today. Are there any questions?

CHAIR: Thank you, Sara. Questions?

JOHN BOND: Hi, John Bond, RIPE NCC. You mentioned that you have removed the auditor, will you continue development on that as a separate product or is it just dead now?

SARA DICKINSON: In terms of being shipped with OpenDNSSEC, it's no longer going to be part of that product. I mean, the code is still there, but the issue is that as the product evolves, there is a maintenance effort if you want to keep the auditor, and particularly with the move to 2.0 the feeling is that effort is too high to warrant it, really.

CHAIR: Any more? No. Personally, I am doing the port for FreeBSD, trying to have it ready the moment it comes out as an official release so people can pick it up directly. Anyway, that's it; thank you, Sara.


CHAIR: And now we move to the RIPE NCC DNS update. A lot of things have changed, small things, big things, and Romeo is going to tell us everything about it.

ROMEO ZWART: So good morning, everyone. I am one of the small changes, or big changes, whatever, in this part of the RIPE NCC.

I am not going to dwell on that, but you will have noticed a difference from my predecessor Wolfgang. I have prepared some slides for you this morning and there is a lot of regular stuff: some updates on K-root and the other DNS services and what we do there, the regular current status.

I am going to talk a bit about changes in the DNS delegation checking software that we have implemented, which I think improves significantly upon what was there before.

Then, obviously, we have to spend a bit of time on the thing we had in July, the reverse DNS outage. And then I think the most interesting part of the session is the last bullet, some ideas that we would like to float with you on how we deploy K-root. I am going to push through the first three bullets on this list at a fairly high pace, at least I intend to, and focus on the last two bullets and stay within the given time.

So as far as K-root operations are concerned, we could say this is business as usual now. Operation is stable, we have been running 18 instances for a while, and as you can see, the regular daily traffic pattern is not uncommon compared to what we have seen over previous periods and what has been reported at previous RIPE meetings. We performed some regular maintenance and system updates over the summer, and there is more of that planned for the remainder of the year.

And as far as future plans are concerned, the most interesting one is related to an idea that we have floated with a few people, discussed with a few people: the member K-root instance, and I will be coming back to that extensively in the last bit of the presentation.

With regard to other DNS services, obviously there is a bit more than K-root. As you most likely are aware, in the last year we have migrated onto an anycast cluster which is distributed over more locations than just Amsterdam, and we have migrated all of our other DNS services to that cluster as well, so that includes reverse DNS, secondary services for ccTLDs, etc. We are planning an additional third instance to have a better geographical spread as well as increasing the resilience within the platform that's already there.

Some statistics about the other services that we have been running for a while. For secondaries for ccTLDs, we have 76 zones that use our secondary service. That's not a big change from last time; actually, we have lost one, and that's related to the next bullet, where we have now completed the migration to the new cluster that we have in place, the anycast cluster I mentioned just before. The secondary zone that we lost was not maintained in a way that was useful for us. Well, actually, that's the wrong way to express what I wanted to say: the ccTLD that we were supporting wasn't updated in the way that we felt was necessary, so we couldn't migrate it to the cluster, and it kept us running for more than a little while.

As of July, we have concluded that, and the migration to the new server platform, the anycast cluster, is now complete.

With regard to ENUM, there is a bit more on that in the ENUM Working Group tomorrow, but the very, very high level summary is that we now have 25 delegations there, and we have added two over the last months.

DNS update, sorry, DNSSEC status: we currently have, as you see here, 126 signed zones, and that includes the reverse zones as well as some zones in ENUM. Basically, all the relevant zones now have a chain of trust, and that is running along smoothly. There is a key roll-over planned for November; as you probably are aware, the frequency of that was changed a couple of months ago, so it's now a yearly event, and we have planned it for November.

And the total count of signed reverse zones we now have is 641; compared to what we have seen previously, that means we still see a modest growth, but it's not really taking off in high numbers.

We have changed our DNS delegation checking software. As you may know, the DNS delegation checks are run automatically whenever you do a delegation update in the RIPE database. The same software is also available via a web URL. We used to have some home-grown code that was hard to maintain, code from several years back, and we have now moved on to code built by the Swedish registry, which is a lot more modern, a lot cleaner, and basically much more maintainable than the previous code that we had.

As I said, the delegation checker software is used automatically whenever you do a delegation update in the database, but it can also be used by you via a web front end; that goes to the same back end.
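At its core, a delegation check of this kind compares the NS set the parent publishes for a zone with the NS set the child zone itself serves. A minimal sketch in Python, using static example data rather than live queries; real checkers such as the Swedish registry's tool test far more (glue, SOA consistency, DNSSEC, and so on), and all names here are hypothetical.

```python
def check_delegation(parent_ns, child_ns):
    """Compare the NS RRset the parent publishes for a delegation
    with the NS RRset the child zone itself serves.  Names are
    normalised to lowercase without the trailing dot, since DNS
    names are case-insensitive."""
    parent = {ns.lower().rstrip(".") for ns in parent_ns}
    child = {ns.lower().rstrip(".") for ns in child_ns}
    return {
        "match": parent == child,
        "only_in_parent": sorted(parent - child),
        "only_in_child": sorted(child - parent),
    }

# Hypothetical data, as it might be obtained from live queries:
result = check_delegation(
    ["ns1.example.net.", "ns2.example.net."],
    ["NS1.example.net.", "ns2.example.net.", "ns3.example.net."],
)
```

A mismatch like the one above ("only_in_child" non-empty) is the classic lame-ish delegation symptom such checks are meant to flag before a database update is accepted.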

OK, so now for the July incident. We had a slight mishap there, or, well, we don't need to mince words, we actually fucked up there badly, and everyone is aware of that. We are aware of that as well, and that's why I put it very, very explicitly, to make sure there is no mistake about our part in this; that is really clear. We have presented you with a very detailed outline of the sequence of events that led to the damage that was done during that period. I don't intend to reiterate that completely here; it's been discussed and presented on RIPE Labs, and pointers have been sent to the DNS Working Group. If you are interested in all of the detail, please go there, or come and talk to us, that's fine as well, but I don't want to go over the complete list here. The short summary, as you will see on the screen right now, is that the root cause of things was an unexplained loss of zones in the provisioning environment, in the provisioning system. We have investigated that thoroughly and we have not been able to come up with the root cause of that particular failure, and that's of course a concern, but that is the situation right now. However, the fact that it led to a significant impact was basically, as it says on the second bullet, that there was a chain of events that included several failures on our part: human error, operational failure, procedural stuff that really wasn't what it should have been. That is what actually led to the damage that we saw. Now, of course, one thing not explicitly mentioned here is that on top of it all, we also communicated very badly. So, what did we learn from this?

Well, not just a few things; we learned a lot. And a lot of that is just plain common sense. When you read this, and when you read the full report that will be available on RIPE Labs soon, you will most likely say, yeah, that's what we thought as well, but unfortunately this was the case: there were several technical and procedural errors that took place, and, as I said, we have learned from that, from the communication level down to the very technical level, and there are a few bits and pieces on the slide. This is not exhaustive; the report that we will provide will be more extensive than this. Just to pick out a few points, and I have put this at the top because I think it contributed to the impact as well: we communicated very badly, and we didn't realise that at that particular point in time. So what we have done is improve our communication procedures within the organisation, and we have thought about how we can be better reachable from the outside. Axel will also be mentioning this in the Services Working Group: we will improve the reachability of the RIPE NCC in general, and also of our team, on a 24/7 basis.

Then, a little bit further down the page, there were organisational issues; there were really painful things like missing backups that were basically caused by unclear procedures between two departments, and we have improved on that as well; we have established much clearer understandings there. So down to the bottom level, and I will let you read it here, or you will have the slides available on the server and you can have a look at them there. At the technical level, there are also a lot of counter-measures we took that will prevent this incident, or any similar future incident, from recurring in this way.

As I said, the full report will be available during the coming months on RIPE Labs.

Some stuff that we have been thinking about:

Many of you will likely know that we have deployed K-root instances in basically a split model. We have a number of global instances that provide K-root service, well, globally, to a wide service region and to many organisations. But we also provide K-root local instances, where we basically have a BGP session with peers and where we mandate a no-export rule. That helps us, or rather it forces the local instance to remain local within that particular AS.
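For illustration, the no-export arrangement described here boils down to tagging the K-root announcement with the well-known NO_EXPORT community (65535:65281), so the peer's border routers will not propagate it beyond their own AS. A hypothetical BIRD-style export filter might sketch it like this; the filter name and the surrounding peer configuration are invented, and 193.0.14.0/24 is K-root's published service prefix.

```
# Illustrative BIRD-style export filter for a local K-root instance.
# Sketch only: filter name and peer setup are hypothetical.
filter kroot_local {
    if net = 193.0.14.0/24 then {
        bgp_community.add((65535, 65281));  # well-known NO_EXPORT
        accept;
    }
    reject;
}
```

The member-instance idea that follows tightens this further: rather than relying on a community across an eBGP session, the announcement would travel over iBGP and so stay inside the member's own AS by construction.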

That led to a bit of discussion, and we have seen some requests from people who asked us if we could loosen that restriction on the no-export community. We have considered that, and we are now thinking about adding a new model, but, I am sorry, that particular point will come back on my next slide.

So what we are discussing here, an idea that we have been thinking about and would like to hear opinions on, is to provide a next level of smaller instance that actually services one single AS. The service is provided to a single AS. We do not intend to communicate with that single AS over eBGP; we intend to communicate by iBGP, and the member, the RIPE member that we provide this service for, will only provide the K-root instance to its own customer base.

So, the scope of that instance will be even smaller than the local instances we already had. We are interested to hear your opinion on this, and there will be some discussion about it, again on RIPE Labs; we will send a pointer to the mailing list about this, and we are happy to hear your thoughts.

Then, yes, OK, so there are a number of advantages that we see in this model. One of them being that we think it's much easier to actually deploy, and to better spread, the K-root instances. We see an advantage in a very clear separation of responsibilities, and we think that this also provides a growth path to local instances as well.

This could, for instance, apply to some of the organisations; we don't expect all organisations to fit this model, but some of the organisations that have come to us requesting a local instance might actually have an opportunity to join us here, starting with a member instance at first, and then growing into a local instance when there is opportunity and a need for that.

Again, we will work on a mailing list discussion on this; I already mentioned that. And if you have input for us with regard to this idea, we are here of course this whole week: you can talk to me and other people in the GII team, you can talk to Daniel or Vesna, please feel free to approach us on this. If you have ideas, and if I don't blabber on too long, I hope there will be some opportunity to talk about it here.

Then, there is another idea, basically unrelated to the previous one, but I already hinted at it with the previous slide. We have some situations where people who are running a local node have asked us about the restriction of the no-export community, because they want to provide the service to a broader audience than just their own organisation. We have had concerns, and we still have concerns, about potential impacts; obviously, routing issues are conceivable in networks that are not in our control, so we have some reservations. But we are now thinking of actually moving on and talking to individual organisations to provide that loosened-up model, where the no-export restriction could be lifted in close cooperation and a closely managed change-over from one model to the next. Again, we would like your input on this, and I see I have come to the end of my slot time, so I welcome your questions on this, if there is time for that, of course.

CHAIR: Very quick question, maybe?

AUDIENCE SPEAKER: Not really a question, but I have a suggestion for the local nodes for K-root. Would it be possible to mandate that networks that have a local K-root node do validation? So if there are any resolvers in the network, they should do validation, otherwise you don't get a K-root instance. And BCP 38.

ROMEO ZWART: That may be a good idea.

AUDIENCE SPEAKER: The reason why you can distribute K-root is because you do DNSSEC, so the content of the zone cannot be changed that easily any more when you do validation. But in the local network where you put the local node, you should really mandate validation, and then they cannot mess with it any more.

ROMEO ZWART: OK, well, with regard to messing with it, maybe to clarify one point: the intention is not that the local node is run by the local organisation; the intention is that it's still run from the RIPE NCC by the GII team, so in that sense messing with the information is not really the concern. But still, having validation in place is a good thing to look into.

AUDIENCE SPEAKER: It's a good path for...

WILFRIED WOEBER: From Vienna University, and this time speaking for the National Research and Education Network. We are definitely interested in becoming part of that game, whether the label is local instance or member instance, I really don't care. With the caveat: I fully understand your intention to minimise the potential for routing leaks and for accidents; some of you may remember my presentation a couple of meetings ago with the dancing root server things that I found via Atlas, which turned out to be an artifact of leaking more specific root information. But with that introduction, your current model of thinking would definitely exclude us, because our network is built around a set of ASes, and there are both globally visible ASes and some private ASes around. For some services, we actually have the concept of running the services out of what we call a service AS, and then announcing the services within that service AS to our customers, and we have this funny mixture of some of our customers living in the core AS for ACOnet, and the rest of the customers, mostly the bigger ones and the multi-homed ones, living in a globally visible AS. I already submitted that on Labs; I wanted to bring it up here in this environment because this issue might also hit others, not just us. Thank you.

ROMEO ZWART: The point is, of course, we see that, and it's exactly the reason that we want to do it on a cooperative and well-managed basis. We see that there will be routing complexities with many organisations, and we would like to take that on an individual, organisational basis, case by case.


CHAIR: Thank you. And without further ado, staying with DNSSEC, next will be Geoff, talking about DNSSEC measurements.

GEOFF HUSTON: How many of you were at yesterday's lightning talk? I am going to go really quick through the first part. So the questions, you know, how many DNS resolvers out there do we see as being DNSSEC capable? What proportion of users are using those resolvers and where are the users?

So, we did a two-level test inside an object that was generated by dynamic JavaScript or, in this case, Flash. One object, that top one, d dot t5; t5 itself is a sub-domain that's signed, DS records, DNSKEY records, etc., .nx, so that's a full-blown DNSSEC-signed thing at the t5 level. It's all a wildcard, and the wildcard is signed. So what this does, effectively, is give us dynamic URLs down here, and that particular section there is unique. Every time you went and got these things, you got a different section. And if you look behind it, that's the time of day in seconds, and that's a hashing number, which is not a bad hash, it's not brilliant, but when combined with the time of day it is pretty good. That will withstand most forms of high volume.
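The unique-section scheme Geoff describes, a time-of-day component combined with a short hash, can be sketched roughly as below. The function name, label format, and `t5.example.net` parent are illustrative assumptions, not the actual experiment code.

```python
import hashlib
import time

def unique_label(seed, now=None):
    # Combine the time of day in seconds with a short hash of a
    # per-client seed. Neither alone is a great identifier, but the
    # combination withstands most high-volume collisions.
    # (Hypothetical format, not the experiment's real code.)
    if now is None:
        now = int(time.time())
    digest = hashlib.md5("{}-{}".format(seed, now).encode()).hexdigest()[:8]
    return "u{}s{}".format(now, digest)

# Every fetch gets a fresh, cache-defeating label under the signed zone:
label = unique_label("client-abc", now=1348657200)
fqdn = label + ".t5.example.net"   # placeholder parent domain
```

Because every label is new, each test forces a fresh resolution all the way to the authoritative server, which is what lets the query logs be matched against individual clients.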

The first one validates. The second one looks precisely the same, except we stuffed around with the DS records, and in this case the DS record is not the hash of the DNSKEY. So that second domain does not DNSSEC-validate, deliberately.

So we embedded this in Flash code, and we pushed the Flash code through an advertising network. We bid as low as we possibly could on clicks because we are cheap, but the beauty of the ads is you only pay when people click, so we made the ad as boring as crap: don't click, don't use it. For the first part, I will talk about results in the first seven days. So in seven days we saw 57,268 folk querying our authoritative server. We had two NS records, but both came to the same machine; those were the separate IP addresses that went to that machine.

We took the assumption that if you are doing DNSSEC validation, you will query for the DNSKEY resource record, and if you are not doing it, you won't.

So, 2,316 over the life of that experiment. That domain never existed before the 10th; it did not exist. So if you did validation, you queried for DNSKEY there: 2,316 queried, that's 4%. Kind of OK.

We then looked at the folk who only served one or two clients. Don't forget we have got DNS query logs and web logs, so we can match the end user with the resolver that eventually starts to query us. So, we know how many clients are behind each resolver, so we separated out all that had only one or two, and there were 40,000 of them. It's quite a lot, actually. 1,136 did DNSKEY fetches, 2.8%. Then we looked at what was left, 16,822 resolvers, 7%, which would kind of lead you to suspect that the professionals in this industry, the folk looking after bigger networks, have turned on DNSSEC. Wouldn't you?

Well, you are wrong. When we look at the very, very biggest, and there is the top 25, we looked at the number of unique IDs each of those resolvers queried, and where I saw multiple resolvers for an AS, I bunched them up. So, the top one was Google, AS 15169, and there is a collection of resolvers, approximately 400 of them. A large number of them do DNSSEC, or at least they query for DNSKEYs, and that is certainly the largest: 47,973 unique IDs queried for DNSKEY. And there is the rest of the list. There are some very big ones out there in Hong Kong, Taiwan, Korea; only Google was seen fetching DNSKEYs. So the very big guys, with one exception, do not. Interesting.
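The classification Geoff walks through, DNSKEY queries as the presumed validation signal, split by how many end clients sit behind each resolver, can be sketched as a small log-analysis pass. The log and client-map formats here are assumptions for illustration, not the experiment's actual data model.

```python
def classify(query_log, client_map):
    # query_log: (resolver_ip, qtype) pairs seen at the authoritative
    # server; client_map: resolver_ip -> set of end clients, matched
    # up from the web logs. Both formats are hypothetical.
    validating = {ip for ip, qtype in query_log if qtype == "DNSKEY"}
    small = {ip for ip, cl in client_map.items() if len(cl) <= 2}
    big = set(client_map) - small

    def rate(group):
        # Fraction of the group presumed validating: a resolver that
        # ever asked for DNSKEY counts, since caching means one query
        # is all you can expect to see.
        return len(group & validating) / len(group) if group else 0.0

    return rate(small), rate(big)
```

Run over the real logs, the two rates would correspond to the 2.8% (one-or-two-client resolvers) and 7% (the rest) figures in the talk.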

So, I have the HTTP records as well; I have the clients as well as the resolvers. 770,000 unique addresses for clients; 69,560 fetched ... the simplest way I can answer is that 9% of clients are performing DNSSEC validation. I will come back to that here, because I want to delve into that number.

Some really strange outcomes, though. Libya: 330 separate experiments, 242 used resolvers that did DNSKEY resolution, so Libya is the densest deployment of DNSSEC today. And you can see the list there. Some of the numbers are so big that they are incredibly convincing. Vietnam: 3,371 queries, 1,003 did DNSSEC. You also see Google's algorithm about ad placement: we were cheap, so which countries accept cheap ads? Algeria, Vietnam, Ireland. We were bidding as low as you could bid, so obviously they took us to all the cheap places, and Indonesia, 13,900: that's where cheap ads go, which is interesting. On the other side, too, this is where cheap ads go: Republic of Korea, 77,571. I promised the nice gentleman from Greece I would say nothing about Greece. Nothing. So that's the other end of the list, right?

So it's kind of the top and bottom.

And you can divide it into ASes, and that's the top ASes where we are seeing a large number of clients, of which a large number actually perform DNSSEC queries, so it's all over the world. Which I found really weird, you know: why is Azerbaijan Telecom there? And then you think: actually, when we look at the roll-out of mobile telephony, it happened so much more in the so-called developing world than the developed, because the copper infrastructure at the time was so poor that mobiles were brilliant. So maybe it is because they are rolling this stuff out and they are actually doing it right. This is cool.

So, that's the RIPE region, and that's where I finished yesterday. So this is new stuff. We have now been running it for 12 days, a little under two weeks, and I have now got 1.7 million clients doing this. That's a lot of tests. And don't forget, one URL had DNSSEC and one URL didn't. So, a huge number of folk told me: yeah, we got your test, but we didn't retrieve either of them. 15% of folk, 275,000, said stuff you, Geoff. I don't know why. They ran the script and told me the results, but they said: we didn't fetch anything. OK. Fine, I can agree with that.

These are the folk who did what I expected if they were doing DNSSEC, and that kind of ties in with the 9%, doesn't it? That 8% of folk retrieved the one that was DNSSEC-valid and did not retrieve the one that was DNSSEC-invalid. And these are the loonies: 5.39% said, we love invalid, and this valid shit, no, we are not going to touch that. They did precisely the opposite. And the rest, 70%, pulled both, which is kind of what I expected. So I am getting a bit suspicious now about these numbers, so I start to look a little bit harder. 5% of folk did the opposite. So what is going on? Have you ever looked at a browser, deep inside a browser? Because what you actually see is a mini operating system. Because web pages now are incredibly complex, with large parts to them, the browsers themselves normally have five or even up to ten or more helper processes, and when you give them a collection of objects to retrieve, they fan the work out amongst the various little helpers, and it goes down separate ports to retrieve the objects, so there are separate pipes. And if someone is in front of you on your pipe, retrieving some latest crap from YouTube or whatever, your job, that particular fetch, waits; and another pipe might have drained, and your job is squeezed through. So what I see is that the order in which I actually schedule the tests is very rarely the order in which the web log sees them happening. Things get shuffled around. So the tasks, when I pass them to the browser, come in one order; depending on the activity, they might complete in another order.

Now, the other thing that I have noticed, from all of you actually, is your attention span is really small. Totally small. Anything that takes more than five seconds, you are gone. Seriously. Even YouTube, you don't watch to the end. Every one of you just walks away. So, when you have got a lot that's downloading and you walk away, all those jobs just terminate; they get reset and they are just finished, because the browser doesn't do extraneous loading. So that's why I am seeing tests curtailed. So, going back to that number, those 94,655 are basically attention deficit disorder: the stuff got shuffled around. And that also leads me to suspect that that first number also has a pretty huge noise factor in it, that sometimes it's DNSSEC and the rest of the time you just didn't wait long enough, because that's what you do. So, that's the first problem.

The second problem: I looked, about ten minutes ago, I rapidly did these slides, at how many resolvers this network from RIPE gives me, and it gives me two, and there they are. Right? Now, why do you have two? Why not one? What is the religion in DNS that gives you two resolvers? Do you like having different answers so you can compare them? Is that what you do? It's a serious question: why do you do this? Resiliency, right, so when you don't get an answer from the first, you go to the second, yes? What is not getting an answer? Well, the first thing is, I don't get an answer. What other things lead you to try the second? ServFail. Someone said it; if not, I said it for you. If you get a ServFail response, that's a clear signal that that resolver has balked and it's time to move on. Right? Right. Wrong. Because when a DNSSEC-validating resolver can't validate, what does it send back to its client? ServFail. What will the client do? Try the next. So, what if the next one does not do DNSSEC validation? Perfectly fine answer, let's just go there. So, you know, I have got a problem here, and then it gets a little bit harder, because how can I actually tell if the resolver is actually performing DNSSEC validation? I am not the client, you know; you there in attention deficit disorder land are the client, and there might be a number of resolvers before I get to see a query. So I am seeing the one at the end of the chain, and it might be a forwarder, and how do I know it's doing validation, because I am just the name server? The only thing I can do is take it as a strong clue that if I get these requests, then it's doing validation. And it's not going to happen every time, because you cache, and that's a good thing, so I am just looking for even one, and then I know that resolver is probably doing something.
And of course, I am down one level, so I will see the DS resource record request as well, so if I see the combination of the two, something validation-like is happening.

So in this case, I am really not sure if it's a recursive resolver or a forwarder, but I am willing to take the punt. Do a few more tests; at the moment that's the best we can do.
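The failover behaviour Geoff describes, a stub client treating ServFail as "try the next resolver" and so silently bypassing a validating resolver's refusal, can be sketched as follows. The resolver interface is a made-up stand-in for illustration, not a real stub API.

```python
def stub_resolve(name, resolvers):
    # Typical stub behaviour: walk the configured resolver list and
    # accept the first non-SERVFAIL answer. A validating resolver
    # signals a bogus answer with SERVFAIL, so a non-validating
    # resolver later in the list happily supplies the answer anyway.
    for query in resolvers:
        status, answer = query(name)
        if status != "SERVFAIL":
            return status, answer
    return "SERVFAIL", None

def validating(name):
    # Rejects the deliberately broken zone, as a validator should.
    return ("SERVFAIL", None)

def non_validating(name):
    # Returns the (spoofable) answer without checking signatures.
    return ("NOERROR", "192.0.2.1")

result = stub_resolve("bad.t5.example.net", [validating, non_validating])
# the bogus answer gets through despite the first resolver validating
```

This is why counting validating resolvers overstates the protection clients actually get: validation only holds if every resolver in the client's list validates.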

So, what does this mean when I say what proportion of users are using DNSSEC-validating resolvers, what did that statement claim? Well, it meant that 9% of the clients passed their queries to resolvers who performed some kind of DNSSEC validation, but we also observe that most clients pass queries to 2.1 different resolvers. Right? So, what I do know is that a maximum of 9% of folk will not fetch objects that lie behind a DNSSEC-invalid chain; a maximum of 9% obey DNSSEC validation. But I really think that that number is rubbery, and a more accurate thing to say is approximately 4% of the world out there, plus or minus an uncertainty factor of 10%, so somewhere between minus 6% and 14% of the world, pick a number, you know, is realistically the number we see for DNSSEC validation. So it's perhaps not as good as the original number says, it might be a whole lot worse, and it's quite hard to tell. But, you know, part of the reason, I need to go to the next slide, part of it is that some resolvers are just insane, and this is one of the better ones, that's completely insane. We did a test that had four URLs: one good, one bad, and two behind name servers that only did v6, because I am testing path MTU and path behaviour. These two resolvers, somewhere in Sweden, decided that having a v6 name server was an obvious trigger for insanity. And then it just sat there; it didn't query the DNSKEY records, it had found those out. It then sat there and queried the same two A resource records from the server, exactly the identical query, 93,237 times over the ensuing eight hours. Thank you. I really appreciated the traffic. Thank you.


CHAIR: Very quick question.

LORENZO COLITTI: A question on your 15%, 9%, 4% slide, can you go back to that? Yes.

GEOFF HUSTON: There was something about Google as well, if you really wanted to know. We found a whole bunch of resolvers: 113 did, 291 didn't, and they are clustered up; some do and some don't. Google is schizoid at the moment. Maybe there is a more general expression, I don't know.

LORENZO COLITTI: So we do similar A/B testing as part of our v6 breakage measurements, and we have found, on day after day, on network after network, in experiment after experiment, that those two numbers, the 8 and the 5, vary however they want, but the difference is constant. It just goes whoa like this. And so you might be able to extend your experiment and subtract one from the other, because statistics ... and by the way, I did talk to some statisticians and said surely subtraction is not the right thing, and they said just assume they are independent and then you can subtract them, and what do you know, it worked. So ...

GEOFF HUSTON: 4% plus or minus 10% is where I think you are leading me.

LORENZO COLITTI: No, no, what I am saying is, if you extend the experiment over multiple days you may find, like we did, that 3 is a number that you can trust, which is 8 minus 5. And if you can design your experiment so that that difference tells you something useful, then you might have better data. As for why the failure, which is supposedly the product of two failure probabilities and therefore should be lower than those two singularly taken, I have no idea.
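The subtraction Lorenzo suggests is simple arithmetic on the percentages from Geoff's slide; the estimator itself is the speakers' heuristic, not an established statistic, and the figures are the ones quoted in the talk.

```python
valid_only = 8.0     # % of clients fetching only the DNSSEC-valid URL
invalid_only = 5.39  # % fetching only the deliberately broken URL

# If curtailed page loads (the attention-deficit effect) hit both
# arms of the A/B test equally, subtracting the wrong-way arm removes
# that shared noise, leaving the part attributable to validation.
signal = valid_only - invalid_only
print("trustable validation signal ~ %.2f%%" % signal)
```

The point is not the exact value but that the difference between the two arms is stable across days, while each arm on its own drifts with the noise.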

GEOFF HUSTON: It's the combination of incredibly complicated browsers and users with just absolutely no longevity of attention, who keep on curtailing this; my experiment dies halfway in so many times. So, whenever you see this ad, and it's an ad that says something about APNIC ...

LORENZO COLITTI: Wave check at the end.

GEOFF HUSTON: Whenever you see the ad, let it run. If you see the ad where we are measuring v6, let it run and don't click, but let it run as well, because I need the answer.

JIM REID: Have you any plans to do things like fingerprinting of these resolvers, to try and identify what software is running on them?

GEOFF HUSTON: Yes, I have been pointed to a couple of folk who have fingerprinting software and now that we have the complete packet traces of all of these 77,000 resolvers and growing, yes, I would dearly like to do fingerprinting. We are forcing them to do TCP so we are getting even better fingerprinting evidence, I would be keen to understand who is doing what out there in resolver land.

AUDIENCE SPEAKER: Your last slide said something about two servers in Sweden querying ...

GEOFF HUSTON: Is that why you are up there?

AUDIENCE SPEAKER: Are they really?

GEOFF HUSTON: I am going the wrong way.

AUDIENCE SPEAKER: Seriously, if they are mine, I will fix it. But I was going to ask you what record they are querying for?

GEOFF HUSTON: They are querying the A record all the way down in that domain, so it's the A record of it: 1000 dot u-something dot s-something dot t5 dot ... domain, not net; it's not the NS or DNSKEY. And it started querying all four: for the two that were behind conventional v4 NSes, one query each, done; but for the two that were behind v6-only NSes, it decided, no, whatever the answer, I don't like it, I am going to ask again and again and again.

CHAIR: Your name was?

HEATHER SCHILLER: Heather Schiller.

PAUL VIXIE: On this slide, I wanted to let you know you should publish this more widely, because the particular version of BIND 8 that's doing this is vulnerable to two stack-smashing attacks; if you publish this, someone will fix those servers.


GEOFF HUSTON: Whoever is watching the video and running these servers, you just heard it, fix it now. Thank you.

CHAIR: Thank you, Geoff. We go to the last part of the programme, and this is a panel discussion; no session without a panel discussion nowadays. And we have a dedicated ... Peter Koch, introducing this particular discussion.

PETER KOCH: On we don't remember what, right? So, yeah, we thought we were going to have another panel; someone lost their glasses here, I guess, I can't see you. So we thought, after Geoff gave this interesting presentation about measuring the state of resolution and validation of DNSSEC in the world, we'd look at how to improve the situation, if there is anything to improve at all. So without further ado I would like to invite the four panelists to the stage, which are Antoine, Roland, Patrik Falstrom, and Andrei Robachevsky from the Internet Society. I will do a short, very short introduction and then hand over to the panelists; we will frame some questions, maybe even provocative ones, I don't know, and we will also try to involve the audience at the end. So the reason why we picked these people, well, they were the cheapest ones of course, and we have a tight budget. As you may know, there has been a sudden increase in DNSSEC signatures in the Netherlands, or especially in the .nl top level domain, and there is an authoritative side and a recursive side, and we thought asking our colleagues to the stage would be a good idea here, because they obviously have done something, or were part of a bigger thing, that increased DNSSEC awareness. Patrik, on the far side there, started this discussion in some smoke-filled room maybe, so this is why he is there, and he is also a registrar. And Andrei, representing the Internet Society: the Internet Society has some experience with launch days, getting industry together to enable the deployment of new core technology, so we are going to exploit their insights. I would suggest we start with Patrik then, and go to Roland and Antoine, and finally, since Andrei has prepared some slides, we do that at the end of the introductory round.

PATRIK FALSTROM: Thank you very much. Regarding DNSSEC, one thing that myself and a few others have been talking about for quite some time is that we have a similar problem to IPv6: why should I sign my zone when it only helps others? It's very difficult to find a business model where I am making an investment to help others. On top of that, DNSSEC consists of two things: people should sign their zones, and people should validate. And regardless of who is doing wrong, and this is where I see some similarities with the discussions of the people that are dealing with IP packets, which I don't do very often, the problem they have is that when there is an error, the press is very often blaming the wrong party, and that's another uphill battle to fight. So to some degree, for certain things regarding DNSSEC deployment, it's much easier if everyone does it at the same point in time, because it's easier for people to understand what is happening. But when writing on beer coasters in Prague, not the smoke-filled room, but there were liquids there for people, we said, yes, it would be interesting, but we need to talk about what this is, and I think it is validation that we should try to start doing at the same point in time.

ROLAND VAN RIJSWIJK: I was dragged up on to the stage by Peter this morning, so I am a bit unprepared, but we are one of the parties that has been doing validation for a long time; I think we were the first big network in the Netherlands to enable validation on our public resolvers. And our experience so far has been quite good: we have had zero help desk calls, no users panicking because they couldn't reach certain websites, even when NASA fucked up. So, unlike Comcast, who were swamped with calls, our users assumed that somebody had made a mistake and that in a couple of days the site would work again. So actually I would say, and I say this a lot when I give presentations, turn on validation, because there is no excuse not to do it any more; it just works, and the number of validation failures is very low. And with the big deployment in .nl, we have been collaborating with the guys from SIDN, sending them reports on the validation failures we see. If you deploy on a large scale you are bound to get some problems; they have been chasing them actively, and for over one million signed domain names we managed to keep the number of failures down to 0.005 percent of the signed domain names.

ANTOINE: I mean, the reason why we got to this, it has been a long track already. But we have done cooperation right from the beginning, and that's where we are today. First, there is a lot of explaining to do; of course we had an incentive, which helps a lot, money talks for some people, to get their initial investment back. Now 20% of our zone is doing DNSSEC and it's still growing, and of course our next step will be validation, and we are very interested in, and looking into, incentives to get this validation going. One of the things I just wanted to say at the microphone was: if you are going to want to have local root instances, or you want to have local instances of the .nl zone, turn on validation, because that will be a mandate for us to deploy such networks. But I am interested to hear the opinion of others on what we can do to motivate people to do validation. I think now that we have enough domains that can be validated, stuff like DANE and other things will help people do validation, or want to do validation.

ANDREI ROBACHEVSKY: I work for the Internet Society. I have prepared some slides, but this is not a proposal for a world DNSSEC day; we tried to distill some factors that contributed to the successful World IPv6 events, that made those more than just a marketing exercise and in a way changed the Internet. And I'd like to thank ... Roberts, because most of this is his wisdom and he was the driving force behind the World IPv6 events from the Internet Society side.

So, let me just show you a few slides, then. So what is important when you plan any such event for technologies and services that have this effect that Patrik just mentioned, where the costs and benefits are not in the same hands and the risks are also somehow displaced? I think one thing you need to have is a clear objective: what are we trying to solve with this? And I think it should be clear not only to the participants but also to the general public. For instance, if you look at World IPv6 Day, the clear objective was that a big, large-scale website accessible on IPv6 doesn't break users and their experience, and World IPv6 Launch helped to break this chicken-and-egg problem where content was saying there were no clients and clients were saying there was no content. It should have a measurable result; if the result is not measurable, this will slide into a marketing exercise. So again, if we look back at what happened in the IPv6 realm, it was very clear that the sites for the Day were 24 hours up on IPv6, and for the Launch there were also some clearly measurable metrics that people had to achieve in order to claim that they successfully participated: in this case there was a permanent turn-on for websites and at least 1% of the client base accessing those sites via IPv6.

And well, of course, it helps that it's a concrete date in the calendar; for many people, well, for their internal processes it also helps, and it helps for public awareness and preparedness.

And what is very important is that it's driven by industry leaders. We heard many times that people did this because they wanted to be in the same group of leaders as Google, Facebook, Yahoo, and it's very important that the leaders actually recognise the problem and that they are the driving force behind those events.

And well, of course, it's important to understand who the leaders are, right? Can we clearly identify them, are they recognisable and can they, that's very important, can they agree to cooperate and well, maybe another side of cooperation, can they agree to share the risks, i.e. doing something at the same time and, therefore, bearing the same risks as their competitor?

And of course, it's very important to think what the result of this exercise is, and will the Internet be different once we did this day or launch or something world, related, in this case to DNSSEC?

I think that's the last slide. So this is some food for thought; this is something that we thought could be useful when you consider a world event like this.

PETER KOCH: I can't resist the temptation; go ahead, Jim.

JIM REID: I think one of the issues about DNSSEC deployment that is perhaps a hard thing to get across to people is that validation might not be as painful as they fear; it may be fear of the unknown that means people are not using it yet. And I think Roland's is a very important message: we were the first to deploy and the sky didn't fall in. But I think it would be very useful to try and document that, and have it explained and promulgated so people can look to the example of what was done in SURFnet as something they could apply in their own networks, inside their own organisations. And it would be interesting to try and get some information about costs, if there are costs: have you had to throw more boxes at the resolution because of the overheads of validation, and what are the overheads in terms of computing cycles? And did you choose the one true validation style from the root, or alternatives such as DLV and things like that, as a way of trying to get around the problem that other TLDs are not doing signing yet?

PATRIK FALSTROM: I was sort of suspecting this discussion would come up, so I asked the largest access provider in Sweden, which was also one of the first ones: so what is your load, and what happened on your resolvers when you turned on validation? And they said: nothing. OK, can you document anything? There is nothing to document. OK. So, but, do you know how much more resource you need to do validation compared to not doing it? Well, to be able to tell, we would need to be able to turn off validation, or start to run multiple name servers, one with validation and one without. We don't want to do that. This works. Move on.

PETER KOCH: Adding to ...


PETER KOCH: How much money did you make, Roland, by turning on validation?

ROLAND VAN RIJSWIJK: How much did it cost?

PETER KOCH: How much did it cost and how much money did you make? Obviously you had an incentive, so please address both sides of the spectrum here.

ROLAND VAN RIJSWIJK: So how much did it cost us? Nothing. Because the extra load is negligible, how much money did we make, we are nonprofit but I got bonus miles for flying around the world and telling people about this.

PATRIK FALSTROM: The cost is not really zero, because there is human labour, and you need to have proper processes, and you might have different errors coming up, but none of it is the kind of cost that people are asking about. The real cost, potentially, for, say, ISPs that want to deploy, is in training, because their staff needs to be aware that once you start doing validation you may see new errors that need new ways to troubleshoot. But I would say you could do that easily in a one- or two-day course, and they would know more than enough to be able to deal with it.

ANDREW SULLIVAN: Andrew Sullivan. I work for Dyn. There is a basic disanalogy between DNSSEC and IPv6 that I think we are not paying attention to here. I don't care if intermediate resolvers do validation for me; in fact, I am going to ignore them anyway, because quite frankly I don't trust my upstream. I want my end point to be doing validation, because if it's not, I can be spoofed by my ISP, and I am more likely to be misdirected by my ISP than by anybody else. I think that's one of the central problems. What we are trying to do in this exercise is not an IPv6 day where we turn it on and it's for real; it's to get every end point in the world to be using DNSSEC to do the validation out there. So this is a very, very different kind of problem from getting validators to start validating, at least as far as I can tell from my point of view. The key thing here that we need to be pressing on is getting the application case strong and getting the OS vendors to be providing libraries so that applications can do useful stuff with this. I mean, right now, if you are a browser, you have to implement your entire resolver and the entire stack inside the browser in order to do anything useful with DANE, and I don't think that's a good idea. So we need to solve this problem in a very different way than the IPv6 world has.

PETER KOCH: I will be a bit rude here, because this is probably a topic for a different panel, a panel talking about end point validation versus centralised, ISP-based validation. The situation we wanted to address here is first steps, and reaching a point where you have broad validation and higher numbers of users is definitely achieved by ISPs and enterprise resolvers taking the lead; if they don't do it, it's probably unlikely there is anything happening at the end site.

PATRIK FALSTROM: As I tried to say from the beginning, deploying DNSSEC can mean multiple things, and of course, from my perspective, having application validation is one thing; having resolver validation is one thing; having registries actually agreeing on what kind of data the registrars should send to them is one thing; having the DNS hosting providers signing the zones is one thing; having easier communication is one thing. There is a long list of various different kinds of things, and the only thing I said is that I think we should start with the low hanging fruit, and I think it's very sad that some ISPs and some resolver operators that had started to do validation turned it off. This is very low hanging fruit, because it's very easy to do, so we can start there. That isn't to say we should not aim for all the other pieces of the puzzle as well.

PETER KOCH: Any other comments from the panel?

LORENZO COLITTI: So in the IPv6 case, we had a similar fear-of-the-unknown problem. What happened is that we did what you said, and we said let's fix the low hanging fruit, let's put v6 in all implementations and turn it on by default, and what happened is that we created a blocker, right? We shot ourselves in the foot, because we had a bunch of implementations doing stuff by default, and then we got the brokenness problem. It seems to me you are ignoring the other side of the equation; you are underestimating the chicken-and-egg problem. You can't do one side of the equation by itself: even though it costs nothing, there is no benefit, and there could be scaling concerns. What if signing takes off? Your CPU usage might go through the roof. Wait, and if you want to convince the content guys: say I am, I don't know, MegaBank, I may be able to sell my security guys on doing DNSSEC, that it's going to protect us from phishing or whatever, and then I have a question: what is going to happen if I turn it on? Does anyone have data on that? When we went into World IPv6 Day there was one source of data, you know, good or bad, about brokenness; we had a number in mind.

ROLAND VAN RIJSWIJK: So, to go back to the IPv6 experiment, if I remember correctly, the first experiment was to deliberately turn IPv6 on for some people, right? I remember that we specifically submitted information to Google to indicate that we would be willing to receive AAAA records for all their services. If you were, as a first experiment, to do something similar for DNSSEC, where you have a set of sort of pioneering ISPs, maybe not the pioneers doing it now, and, for instance, a Google or a Facebook or a Yahoo willing to have a separate set of name servers with a signed domain sitting there, and those people talk, you could repeat the same experiment and learn a lot from it.

PATRIK FALSTROM: They do see that the CPU usage is going up. Like all ISPs in Sweden ?? all the resolvers in Sweden are doing validation ?? so I think in Sweden it's a pretty interesting thing to look at the growth of the number of signed domains, compared to the growth of the number of domains, alongside the growth of CPU usage, and then you could compare that with other ISPs in the world to see whether the growth of CPU usage is faster just because of the higher uptake of DNSSEC.
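For context, the "low hanging fruit" of turning on resolver validation, as the Swedish ISPs described above have done, is a small configuration change. A minimal sketch for BIND 9 (the software mentioned at the start of this session) might look like the following; it assumes a BIND 9 version with built?in root trust anchor support, and is illustrative rather than a complete resolver configuration:

```conf
// named.conf excerpt: enable DNSSEC validation on a recursive resolver.
// "auto" uses the built-in root zone trust anchor and keeps it
// up to date automatically (RFC 5011 key rollover tracking).
options {
    recursion yes;
    dnssec-validation auto;
};
```

A quick way to check that validation is active is to query a signed name with `dig +dnssec` through the resolver and look for the `ad` (authenticated data) flag in the response header.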

LORENZO COLITTI: Don't forget TTLs, because we have 300 second TTLs for example.

PATRIK FALSTROM: Yes. I see those kinds of things as easier than many of the others. Another thing could be to first look at the signing part: to be able to sign zones you need to have agreement between all TLDs in the world on, for example, what data the registrars send to the registries, and that discussion has been going on in the IETF for years, and the registries do ask for different information, so I don't see any conclusion there yet.

PETER KOCH: Last year ??

AUDIENCE SPEAKER: If I just make ??

PETER KOCH: In the interest of time I need to cut the queue here.

ANDREI ROBACHEVSKY: What Lorenzo said about the chicken and egg problem ?? I would like to second that. Looking for low hanging fruit might be a good or attractive approach, but I think the next step, if we want to create something that resonates and has impact, is to do something meaningful, and I think the low hanging fruit in this case might not be meaningful, because there is not much to validate; so it might be easier to do, but it will have no meaning and no resonance. On the other side, if you look at who the real drivers for DNSSEC are, and whether those drivers feel the urgency ?? I think that is the important question, and it probably is not as easy to answer in this room right now. It was answered in the IPv6 case, as far as I know, through a series of discussions, while all parts of this chicken and egg problem came to understand the problem and the urgency of this problem, and that is very important.

PETER KOCH: OK, thanks. The Chair is signalling me and we are running over time. I would like to thank the panelists, and obviously we need to continue this at another occasion. Thank you.

CHAIR: Yes, talking about chicken and eggs, it's lunchtime.