We all know it: the pool of unallocated IPv4 addresses is steadily running out, and the future path is to migrate to IPv6. It is no longer a matter of if, but rather a question of when to be ready for the new protocol. It is vital that we can accurately predict the IPv4 depletion date so that stakeholders can plan their migration to IPv6 accordingly.
The methods used to predict the IPv4 depletion date vary from relatively straightforward curve fitting to complex simulation approaches. Perhaps the best-known and most comprehensive study is Geoff Huston’s IPv4 address report. His estimates are often used by the industry as the leading prediction of the depletion date. My efforts in the area of IPv4 depletion have focused on creating a publicly available prediction tool and report.
Huston’s calculations and data have been the main inspiration for my work; without his collection of data, my work would not be possible. I had an opportunity to discuss IPv4 depletion with him at the ARIN XXIII meeting in San Antonio, TX in April ’09. He gave me some insight into how IANA works, and we discussed some issues I had recently found with the 188/8 block. Drawing on each other’s insights, we both made changes to our predictions afterwards.
It is beneficial for the Internet community that we now have two IPv4 depletion estimates that point in the same direction, even if they differ somewhat. We both agree that the IANA pool will be depleted in less than two years from now. But why is there a seven-month difference between my calculations and Huston’s?
To verify my calculations, I had to look into what differed between us; this article describes what I found. I also thought it would be a good idea to inform the community that we are going to run out of IPv4 addresses earlier than the most-referenced prediction claims.
My intent has never been to find issues with somebody else’s work. The findings described in this article are a result of my curiosity when I realized that my estimates came out earlier than Huston’s. Furthermore, I was troubled by the fact that Huston’s predicted “time until depletion” grew as days went by. See figure 1.
| # | Suggested IANA depletion date | Date of prediction | Days until depletion |
|---|---|---|---|
| 1 | 2 Apr. 2011 | 28 Feb. 2009 | 763 |
| 2 | 22 Aug. 2011 | 30 Apr. 2009 | 844 |
In my opinion, the “time until depletion” can come to a standstill for a while if the consumption rate is low. But I had to question the validity of a projection that for a long time keeps moving the “time until depletion” further into the future.
The IPv4 depletion tool
The IPv4 depletion tool has been available online at www.ipv4depletion.com for about a year. However, it was only recently, first at the Google IPv6 Implementors Conference in March ’09 and then at the Rocky Mountain IPv6 Summit in April ’09, that the tool was announced to a broader audience. The goal of the tool is to make as many variables as possible selectable by the user. This flexibility allows anybody to create their own estimate without detailed knowledge of the underlying mathematics and allocation policies. I am also tracking the output of the tool with my default settings and comparing it to other experts’ research.
When I built my IPv4 depletion tool I had to make a choice: I could either base the tool on Huston’s algorithms and source code, or I could implement a completely new calculation engine. I decided to create a new engine, so that I could compare my findings with Huston’s and see if and how they differ.
The tool has been active for approximately a year now, and the results consistently point to an IANA depletion date by the end of next year. When tracking my predictions over a period of time, one can see how large allocations show up as steps, while days without any significant allocations slowly move the prediction date forward. See figure and table 2 below. The fact that each change in the prediction can be traced back to an actual allocation event attests to the validity of the prediction.
| # | Date | Network | Requester | Size (number of IPs) | Impact (days) |
|---|---|---|---|---|---|
| 1 | 18 Dec. 2008 | 188.8.131.52/10 | Verizon Wireless | 4,194,304 | 18 |
| 2 | 12 Feb. 2009 | 184.108.40.206/11 | China Unicom Shandong province | 2,097,152 | 12 |
| 3 | 9 Apr. 2009 | 220.127.116.11/11 | China TieTong Telecommunications | 2,097,152 | 5 |
| 4 | 6 May 2009 | 18.104.22.168/10 | China Mobile | 4,194,304 | 18 |
Figure / Table 2, Example of large allocations and their impact on the overall prediction.
Top-down / bottom-up approach
My projections are calculated individually per RIR in a bottom-up approach. The first thing I calculate is the usage and pool of IPv4 addresses in each region. If a certain threshold is met, the program simulates that the RIR in question allocates additional space from IANA.
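The bottom-up loop can be sketched roughly as follows. This is my own illustrative Python sketch, not the tool’s actual code; the pool sizes, daily rates and thresholds are made-up placeholders, and the two-blocks-per-request behaviour is an assumption.

```python
# Hypothetical sketch of the bottom-up approach, NOT the tool's real code.
# Each RIR consumes from its own pool; when the pool falls below its
# threshold, the RIR requests /8 blocks from IANA (two at a time here,
# which is an assumption). All pools, rates and thresholds are made up.

BLOCK = 2**24  # addresses in one /8

def simulate_bottom_up(rirs, iana_blocks):
    """rirs: {name: {"pool": addresses, "rate": addresses/day,
    "threshold": addresses}}; returns the day of the final IANA allocation."""
    day = 0
    while iana_blocks > 0:
        day += 1
        for r in rirs.values():
            r["pool"] -= r["rate"]                 # members consume space
            if r["pool"] < r["threshold"] and iana_blocks > 0:
                take = min(2, iana_blocks)         # request two /8s
                iana_blocks -= take
                r["pool"] += take * BLOCK
    return day

rirs = {
    "APNIC":   {"pool": 3 * BLOCK, "rate": 300_000, "threshold": BLOCK},
    "RIPE":    {"pool": 3 * BLOCK, "rate": 250_000, "threshold": BLOCK},
    "ARIN":    {"pool": 4 * BLOCK, "rate": 200_000, "threshold": BLOCK},
    "LACNIC":  {"pool": 2 * BLOCK, "rate":  60_000, "threshold": BLOCK},
    "AfriNIC": {"pool": 2 * BLOCK, "rate":  20_000, "threshold": BLOCK},
}
days = simulate_bottom_up(rirs, iana_blocks=25)
print(days)
```

Because each RIR is simulated on its own, a slow region simply stops influencing the result once it no longer needs more IANA space.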
Huston calculates the depletion dates in a top-down approach. The first thing he calculates is the summarized address consumption for all RIRs. He then breaks this up into the individual RIRs based on a calculated ratio between them. The problem with that approach is that it incorrectly lets every allocation between any RIR and a member, in every region, influence the overall IANA depletion date. This is not how it works in reality.
For example, I don’t assume that AfriNIC will make any more requests to IANA. Therefore, any allocations in that region have no impact on the overall IANA depletion date in my model (unless there is insane growth in African allocations forcing AfriNIC to request more space from IANA). With my calculations, APNIC comes out as the RIR that will make the final allocation from IANA. So right now, the complete system is defined by the allocations being made in the APNIC region. This can however change if one of the RIRs has an unexpected drop or uptick in allocations. RIPE is predicted to make its final allocation just two weeks before APNIC. If RIPE’s allocation rate continues to be low, we might soon see them defining the system.
Another issue with a top-down approach that uses IANA data is that it estimates the IANA consumption with a continuous line, where in reality the IANA consumption is made up of discrete allocation events. The continuous line will create a significant rounding error for the IANA depletion date. See figure 3 below. Huston’s calculation is, however, not affected by this, as he is not using the IANA consumption statistics in his model.
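To see why a continuous consumption line disagrees with discrete allocation events, consider this toy calculation. All numbers are invented, and the two-/8s-per-event chunk size is an assumption; the units are hundredths of a /8 so the arithmetic stays exact.

```python
# Toy numbers only: compare a continuous consumption line with discrete
# two-/8 allocation events. Units are hundredths of a /8 block.

rate = 4       # consumption: 0.04 of a /8 per day
pool = 2500    # 25 /8 blocks remaining
chunk = 200    # IANA hands out two /8s per allocation event (assumption)

# Continuous model: the line hits zero after pool / rate days.
continuous_days = pool / rate

# Discrete model: demand accumulates, but blocks only leave the pool
# when a full request's worth of demand has built up.
demand, left, day = 0, pool, 0
while left > 0:
    day += 1
    demand += rate
    if demand >= chunk:
        left -= chunk
        demand -= chunk
discrete_days = day

print(continuous_days, discrete_days)  # the two exhaustion dates disagree
```

With a 25-block pool and two-block events, the pool empties on a step boundary rather than where the straight line crosses zero, and the gap between the two dates is the rounding error.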
I’m using a mix of linear and exponential models. Exponential models have an unfairly bad reputation for growing too quickly. However, when fitted correctly to the data one is estimating, they produce very accurate and relevant results. Several phenomena in biology, physics and economics (bacterial growth, nuclear chain reactions and Madoff’s Ponzi scheme) can be described by an exponential model.
Furthermore, an exponential model is consistent with Reed’s law on how networks can be valued, and the size of the Internet will grow proportionally to the value of the Internet: the higher the value of the network, the more devices and users will be connected.
Huston is using a polynomial model. Although Metcalfe used a polynomial function in his law describing the value of a network , I tend to agree with Dr. Reed’s criticism of the polynomial model as an estimator for the value of a network. I don’t think it is relevant to use a polynomial model in this context.
There are also some additional problems with a polynomial model that need to be taken care of if one chooses to use it. The model can bend and create a decelerating growth that is slower than linear growth. This seems to have happened lately with Huston’s prediction. For example, in figure 18b (http://www.potaroo.net/tools/ipv4/fig18.png) you can see that the light blue line representing polynomial growth is slower than the green line representing linear growth. This gives you an idea of what can happen with polynomial models.
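The bending effect is easy to reproduce: fit a 2nd-order polynomial to a series whose growth stalls near the end, and its negative leading coefficient drags the long-range projection below a plain linear fit. The data below are made up purely to show the effect.

```python
# Made-up series: steady growth that stalls in the last few samples.
import numpy as np

x = np.arange(10.0)
y = np.array([0, 2, 4, 6, 8, 10, 12, 13, 13.5, 13.7])

lin = np.polyfit(x, y, 1)    # [slope, intercept]
quad = np.polyfit(x, y, 2)   # [a, b, c]; a comes out negative here

# Far from the data, the quadratic's negative leading coefficient pulls
# its projection below the straight line.
print(np.polyval(lin, 20.0), np.polyval(quad, 20.0))
```

A model that turns over like this will keep pushing the projected depletion date outward, which matches the behaviour shown in table 1.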
Least Square fitting
I have implemented three different least-squares fitting algorithms in my program: one linear, one exponential and one polynomial (the polynomial one is, however, not being used, because of the issues identified and described above).
Huston has only implemented a linear least-squares fitting model in his program. He uses the first-order differential of his input to create a 2nd-order polynomial fitting model. This is mathematically incorrect, as he loses one degree of freedom: his result is of the form Ax² + Bx and not Ax² + Bx + C.
To make up for the lost C in the formula above, he estimates C by minimizing the error against the original data series (that is how I understand his code). This glues his future estimates nicely to the historical data, but it is again not mathematically correct. Instead, he should have implemented a polynomial least-squares fit algorithm that would have given him all three variables.
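The degree-of-freedom point can be checked numerically: differencing a quadratic removes its constant term, so a line fitted to the differences can recover A and B but never C, whereas a direct 2nd-order least-squares fit recovers all three. A small sketch (my own, not Huston’s code):

```python
# Noiseless quadratic to make the algebra visible.
import numpy as np

a, b, c = 0.5, 2.0, 100.0
x = np.arange(20.0)
y = a * x**2 + b * x + c

# First differences of a quadratic: y(x+1) - y(x) = 2a*x + (a + b).
# The constant c cancels, so a line fitted to the differences can only
# recover a and b.
dy = np.diff(y)
slope, intercept = np.polyfit(x[:-1], dy, 1)
print(slope / 2, intercept - slope / 2)  # recovers a and b, but c is gone

# A direct 2nd-order least-squares fit recovers all three coefficients.
print(np.polyfit(x, y, 2))
```

Whatever value is chosen for c above, the first two printed estimates never change, which is exactly the lost degree of freedom.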
There is a similar problem in the exponential function that he uses to break up the summarized view into the individual RIRs. Instead of implementing an exponential least-squares fit model, he takes the logarithm of his values and runs those through his linear fit model. This will not produce the expected result, as larger Y-values are incorrectly penalized.
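The difference between the log-transform shortcut and a true exponential least-squares fit can be demonstrated on synthetic data. This is a toy sketch with invented numbers; the grid search stands in for a proper nonlinear solver.

```python
# Synthetic exponential with additive noise; all numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(30.0)
y = 20.0 * np.exp(0.1 * x) + rng.normal(0.0, 3.0, x.size)  # stays positive

# Shortcut: linear fit to log(y), which minimizes *relative* error and
# therefore weights the small early values as heavily as the large ones.
k_log, lnA = np.polyfit(x, np.log(y), 1)

# True exponential least squares, here via a crude grid search over the
# rate with the amplitude solved in closed form for each candidate.
best = None
for k in np.linspace(0.02, 0.20, 1801):
    e = np.exp(k * x)
    A = (y @ e) / (e @ e)              # optimal amplitude for this rate
    err = float(np.sum((y - A * e) ** 2))
    if best is None or err < best[0]:
        best = (err, k)
k_direct = best[1]

print(k_log, k_direct)  # two different estimates of the same growth rate
```

The two estimates of the growth rate generally differ, because the log-transformed fit is minimizing a different error metric than the exponential least-squares fit.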
Smoothing the data
I’m using the raw data from the individual delegated files from each RIR.
Huston smooths his data with three passes of a sliding-window smoothing function. Smoothing before least-squares fitting should not be done, as it destroys the Gaussian properties of the data: the residuals are no longer independent. Books on the subject of regression analysis warn against smoothing the data before applying a least-squares fit.
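The loss of residual independence is easy to demonstrate: smooth white noise around a linear trend with a 5-point sliding window, refit, and the lag-1 autocorrelation of the residuals jumps from roughly zero to strongly positive. A toy sketch with invented data:

```python
# White noise around a linear trend; all numbers invented.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = np.arange(n, dtype=float)
y = 2.0 * x + rng.normal(0.0, 10.0, n)

def lag1_autocorr(r):
    r = r - r.mean()
    return float((r[:-1] @ r[1:]) / (r @ r))

def fit_residuals(xs, ys):
    coeffs = np.polyfit(xs, ys, 1)
    return ys - np.polyval(coeffs, xs)

# One pass of a 5-point sliding-window average ("valid" avoids edge artifacts).
smoothed = np.convolve(y, np.ones(5) / 5, mode="valid")
xs = x[2:-2]  # x values matching the smoothed series

raw_ac = lag1_autocorr(fit_residuals(x, y))
smooth_ac = lag1_autocorr(fit_residuals(xs, smoothed))
print(raw_ac, smooth_ac)  # near zero vs. strongly positive
```

A single pass is enough to correlate neighbouring residuals; three passes, as in Huston’s pipeline, only strengthen the effect.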
RIR pool estimates
My model for when and how much each RIR will allocate from IANA implements the policy as it is written, with a dynamic model for when an RIR requests more space. I argue that under the policy we are going to see the RIRs allocate earlier and earlier as their demand grows, and that the combined RIR pool will be over 20 blocks at the depletion date.
We might also see a small rush for the last IANA blocks, resulting in a pumped-up RIR pool at the IANA depletion date. Discussing an IANA-to-RIR rush is perhaps more controversial than discussing a “big/bad ISP”-to-RIR rush. However, it is not a completely unrealistic thought that the RIRs will be pretty trigger-happy to claim their perceived fair share of addresses as the IANA pool diminishes.
Huston uses a static low threshold for when RIRs request more space from IANA. This static low-threshold model seems to underestimate the RIR pool sizes at the IANA depletion date: in his model, the pools at the IANA depletion date are merely around 17.5 blocks. We can see this in the last sawtooth of the green line in figure 29g (http://www.potaroo.net/tools/ipv4/fig29g.png), which dips down to about 17.5 blocks. I argue that this estimate is too low and that it affects the estimate of the IANA depletion date by delaying it by about three months.
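The effect of the threshold choice can be illustrated with a toy simulation (my own sketch, not either author’s code; all units and rates are invented, and the identical RIRs and two-block grants are simplifying assumptions): a low static threshold makes the RIRs request late and end with small pools, while a demand-scaled threshold makes them request earlier, emptying IANA sooner and leaving larger combined RIR pools.

```python
# Toy model: five identical RIRs drain their pools at one unit/day and
# request two blocks from IANA when they fall below a threshold. Only
# the threshold rule differs between the two runs.

BLOCK = 100  # one /8 in arbitrary units

def run(threshold):
    pools = [BLOCK] * 5
    iana = 25 * BLOCK
    day = 0
    while iana > 0:
        day += 1
        for i in range(5):
            pools[i] -= 1
            if pools[i] < threshold and iana > 0:
                grant = min(2 * BLOCK, iana)
                iana -= grant
                pools[i] += grant
    return day, sum(pools)  # IANA depletion day, combined RIR pools

static_day, static_pool = run(threshold=10)        # low fixed threshold
dynamic_day, dynamic_pool = run(threshold=9 * 30)  # ~9 months of demand

print(static_day, static_pool, dynamic_day, dynamic_pool)
```

In this sketch the dynamic threshold both advances the IANA depletion day and leaves a much larger combined RIR pool, which is the direction of the three-month difference argued above.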
Regional set aside policies
Each region has decided on, or is discussing, a policy where a fairly large amount of IPv4 addresses is set aside and will only be allocated in smaller chunks, with a waiting period between each request. This means in practice that you cannot take for granted that those last IPv4 addresses will be available to you. My tool takes those policies into account; Huston’s prediction does not.
With correct mathematics applied, one can conclude that current research on IPv4 depletion overestimates the remaining time until the free IPv4 pool is depleted.
The remaining IANA pool (at the time this article was written) consists of 25 unallocated /8 blocks and 5 reserved /8 blocks that are being saved for the so-called N=1 policy. Given that only 25 blocks remain at IANA, and that RIRs typically receive two /8 blocks per request, one can conclude that there are actually only 13 remaining allocations to be made from IANA to the RIRs. Most research and estimates of the forthcoming IPv4 depletion date come with a disclaimer such as “Do not believe any dates this program tells you” or “This article does not attempt to encompass such ambitious forms of prediction”. But as we get closer to the IANA depletion date, it becomes easier to predict what the end game will look like. With the right mathematical models, it is far from impossible to make an accurate estimate of the depletion date. With correct mathematics applied to the problem, those disclaimers are no longer needed.
The nice thing about statistics is that it doesn’t require a huge sample size to make an accurate prediction. Election exit polls, for example, rely on this: unless there is a very close call, exit polls tend to be very reliable. When the same type of statistics is applied to the IPv4 depletion problem, the outcome is a very reliable prediction of the depletion date.
I can promise that we will not make it past the end of 2011. In fact, I don’t even believe that we are going to make it to the end of next year: my research suggests that the IANA pool will be depleted by 30 October 2010. And there is no disclaimer, other than “don’t sue me”.
A daily updated estimate can be found under the report tab at www.ipv4depletion.com.
About the author
Stephan Lagerholm is an IPv6 and IT security expert with over 11 years of international and management experience. His background includes leadership positions at the largest networking and security systems integrator in Scandinavia, and managing the design of hundreds of complex IT networks.
Stephan is the founder of Scandinode (www.scandinode.com), a consulting organization based in Dallas, TX that provides networking and security advice and researches IPv6 and the depletion of IPv4. One of his recent engagements was with InfoWeapons Inc., a worldwide leader that creates next-generation, fully IPv6-compliant DNS and DHCP products. He is CISSP certified and holds a Master of Science degree in Computer Science and Mathematics from Uppsala University in Sweden. Stephan is the chairman of the Texas IPv6 Task Force (txv6tf.org).
References
1. Tony Hain – http://www.tndh.net/~tony/ietf/ipv4-pool-combined-view.pdf
2. Murphy/Wilson – http://www.ripe.net/ripe/meetings/ripe-55/presentations/murphy-simlir.pdf
3. The IPv4 address report – http://www.potaroo.net/tools/ipv4/index.html
4. Google IPv6 Implementors Conference – http://sites.google.com/site/ipv6implementors/conference2009/agenda
5. Rocky Mountain IPv6 Summit – http://www.rmv6tf.org/IPv6Summit.htm
6. Reed’s law – http://hbr.harvardbusiness.org/2001/02/the-law-of-the-pack/ar/1
7. Metcalfe’s law – http://en.wikipedia.org/wiki/Metcalfes_Law
8. Potaroo ipv4.c, line 1124 – http://www.potaroo.net/tools/ipv4/ipv4.c
9. Least-squares fitting of an exponential – http://mathworld.wolfram.com/LeastSquaresFittingExponential.html
10. Smoothing of data – “Fitting Models to Biological Data Using Linear and Nonlinear Regression” by Harvey Motulsky and Arthur Christopoulos, p. 20.
11. IANA allocation policy – http://www.icann.org/en/general/allocation-IPv4-rirs.html
12. Set-aside policies – see ARIN proposal 2009-2 and LACNIC policy 2008-04 for examples.
13. IANA N=1 policy – http://www.icann.org/en/general/allocation-remaining-ipv4-space.htm