As more websites move to encrypt their content and user data, more questions are raised over the future of Deep Packet Inspection. But advances in heuristic classification mean that DPI systems will still be able to function in an encrypted world.
Every time that I participate in a speaking panel on Deep Packet Inspection (DPI), someone asks, “How do you cope with encryption?” This is a good question since we’re seeing broader deployment of secure web browsing on the Internet; in fact, Facebook publicly moved parts of their site to “https://” in 2011, and Google now defaults to “https://” for their users.
What does this mean for DPI and all the investment that operators have made in DPI technologies? Are all of the DPI-based traffic shaping systems in the market suddenly going to become useless? Will we need to come up with new techniques to manage traffic?
Fundamentally, encryption will not be the death knell for DPI, but it will force greater innovation as operators seek to manage increasing traffic volumes, and to deliver the customer experience their subscribers demand.
The Drivers for DPI
DPI is a technology that can be used for multiple purposes, but the most popular to date has been traffic shaping systems used to manage congestion, first on fixed networks, and more recently on mobile networks.
It’s a cliché to talk about the rate at which mobile data is growing, but this growth remains the dominant driver of mobile infrastructure development and deployment decisions. With the amount of mobile data almost doubling every year, even long-anticipated solutions like Long Term Evolution (LTE) only help—they are not the cure. LTE provides “only” a 3X improvement in spectral efficiency—enough to satisfy 18 months of the projected growth—but is years in the making and will take years more to fully deploy. Other pieces of the solution include small cells, Wi-Fi offload, and opening up new spectrum, but each of these is likely to provide only part of the solution.
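The "18 months" figure can be checked with a little arithmetic. The sketch below assumes demand grows by a constant factor each year and that the capacity gain arrives all at once; the function name and the growth factors are illustrative, not operator data.

```python
import math


def months_of_growth_absorbed(capacity_gain: float, annual_growth: float = 2.0) -> float:
    """Months until demand growing `annual_growth`x per year uses up a one-off capacity gain."""
    # Solve annual_growth ** (months / 12) == capacity_gain for months.
    return 12 * math.log(capacity_gain) / math.log(annual_growth)


# A 3X spectral-efficiency gain against traffic that exactly doubles every year.
lte_months = months_of_growth_absorbed(3.0)
print(f"{lte_months:.1f} months")
```

With exact doubling the result is about 19 months, which is in line with the rough year and a half quoted above. A slightly faster growth rate shortens the figure and a slower one lengthens it.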
Given this reality, it is inevitable that there will be congestion in the network—because operators can’t roll out new technologies and spectrum fast enough to meet the demand. Once this congestion hits, an operator has a limited set of options:
• Do nothing, and allow the users to fight it out for bandwidth. In this scenario, a single user downloading a large file can cause poor performance for everyone else in the same cell site.
• Perform simplistic traffic management and allocate each user a fixed share of the spectrum. This is better than nothing, but is not very efficient.
• Deploy a DPI-based traffic shaping platform and intelligently adjust the bandwidth to each user and to each application individually. This allows individual applications to be prioritised against each other, and can improve the quality-of-experience (QoE) for every subscriber in the cell, including the heavy users.
These traffic shaping platforms have seen strong market adoption and are expected to be a $1.6bn market by 2015.
DPI-based Platforms: Different Implementations
Current DPI or traffic shaping platforms can be separated into two categories: those built around a general purpose packet processing platform, and those built with dedicated ASICs or FPGAs.
The ASIC/FPGA-based systems have an intuitive appeal: for a given fixed problem, they can be optimised to perform well on that problem and can offer appealing performance and price points. While they come with a long development cycle and a high development cost, these drawbacks are often overlooked in the zeal to have the densest or highest performing system. This approach is also more common when the DPI functionality is integrated into some other piece of equipment such as a router or a mobile network node like a GGSN or LTE gateway. These systems weren’t designed for DPI, so the functionality is shoe-horned into some limited power and space budget.
A different approach uses a general purpose packet processing platform: a blade-based server that has been adapted for packet processing by adding load balancing, special packet routing software, and multi-core processors. Here the developer uses the processor cores to execute the DPI and packet shaping algorithms, making this a software exercise, not a silicon development one. This approach has been popular in the standalone traffic shaping market, which aims to offer the highest performance possible and can take advantage of the pace of silicon change, with new chips coming out every year.
Encryption has been discussed in the DPI community for years, but it was always seen as a theoretical problem with a couple of famous exceptions (e.g., Skype going to great lengths to conceal itself). 2011 was a turning point: the year started with Facebook announcing that many of their services would be offered over encrypted web sessions by default. Facebook’s decision was triggered by the release of a proof-of-concept hacking tool called Firesheep that allowed users to snoop Facebook traffic on open Wi-Fi networks and impersonate other users. This was followed throughout 2011 by other high profile services like Twitter and Google moving to encrypt their sessions as well.
It is fairly clear that this is a one-way evolution. Significant barriers to encryption have been the hardware cost and the time it takes to encrypt and decrypt traffic. But these are shrinking every year with Moore’s Law, as even desktop and mobile CPUs get dedicated instructions added to accelerate encryption. Furthermore, once a web service has added encryption, it’s hard to imagine a reason that they would later remove it, so we can expect to see a steady increase in the percentage of encrypted traffic as service after service adds encryption.
Adapting DPI Platforms to an Encrypted World
To come back to the question that opened this article: No, in the general case, DPI platforms cannot break the encryption and look inside the packets.
To see how a DPI platform can function in an environment where most of the traffic is encrypted, it is helpful to think back to the main purposes of commercial DPI platforms today: to understand which users are consuming the available bandwidth, and then to make intelligent decisions about which traffic to prioritise. Although strict encryption prevents the DPI platform from looking into the packet, there are still plenty of clues for the DPI platform to look at: the source and destination of the traffic, the packet size, and the pattern of packets. For example, a stream of small packets every 20 milliseconds in both directions is almost always a VoIP call. Traffic to and from the Facebook servers is, by definition, Facebook traffic. It’s also possible to correlate separate flows: even if everything is encrypted, if the platform sees a request to a server at CNN, followed by a request to Akamai, it can reasonably assume that Akamai is serving CNN content and thus apply the appropriate rules. This is called “heuristic” or “inferred application” classification, and it can reach levels of accuracy similar to those of the traditional DPI approach.
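As a rough illustration of the first two clues, here is a minimal Python sketch that labels a flow from packet timing, packet size, and server address alone. Everything in it is an assumption made for the example: the `Flow` structure, the size and timing thresholds, and the `KNOWN_SERVICES` table, which uses reserved documentation prefixes rather than any real service's address ranges. Flow-to-flow correlation is left out to keep it short.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from statistics import median

# Hypothetical operator-maintained table; the prefixes are documentation ranges.
KNOWN_SERVICES = {
    ip_network("192.0.2.0/24"): "social-network",
    ip_network("198.51.100.0/24"): "news-site",
}


@dataclass
class Flow:
    server_ip: str
    # Each packet: (timestamp in seconds, size in bytes, direction "up" or "down").
    packets: list = field(default_factory=list)


def looks_like_voip(flow, max_size=300, gap_range=(0.015, 0.025), min_packets=10):
    """True if both directions carry small packets at a steady ~20 ms cadence."""
    for direction in ("up", "down"):
        pkts = [(t, s) for t, s, d in flow.packets if d == direction]
        if len(pkts) < min_packets:
            return False
        if max(s for _, s in pkts) > max_size:
            return False
        times = [t for t, _ in pkts]
        gap = median(b - a for a, b in zip(times, times[1:]))
        if not gap_range[0] <= gap <= gap_range[1]:
            return False
    return True


def classify(flow):
    """Infer an application label without reading any payload."""
    if looks_like_voip(flow):
        return "voip"
    addr = ip_address(flow.server_ip)
    for prefix, service in KNOWN_SERVICES.items():
        if addr in prefix:
            return service
    return "unknown"


# Synthetic flows: a two-way 20 ms stream, and a one-way bulk transfer.
call = Flow("203.0.113.7", [(i * 0.020, 160, d) for i in range(50) for d in ("up", "down")])
download = Flow("192.0.2.10", [(i * 0.001, 1400, "down") for i in range(50)])
print(classify(call), classify(download))
```

A production classifier would track many more features and tune its thresholds against real traffic, but the shape is the same: the decision rests entirely on metadata that encryption leaves visible.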
With this information, the DPI platform can make the same decisions that it would have if the packets were unencrypted: control the amount of bandwidth that each user is allocated, and within that bandwidth help the user prioritise interactive services like VoIP and video streaming while de-prioritising less sensitive services like big downloads or backup sessions.
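Once each flow carries an inferred label, the shaping decision itself is simple. The sketch below assumes a fixed per-user cap and a strict priority order between application classes. The `PRIORITY` ranking and the kbps figures are illustrative, and real shapers typically use weighted queuing rather than strict priority, so that bulk traffic is never starved completely.

```python
# Illustrative ranking: lower number is served first.
PRIORITY = {"voip": 0, "video": 1, "web": 2, "download": 3, "backup": 3}


def allocate(user_cap_kbps, demands):
    """Split one user's bandwidth cap across application classes by priority.

    demands maps application class -> requested kbps; returns class -> granted kbps.
    """
    remaining = user_cap_kbps
    grants = {}
    # Unknown classes get a middle priority rather than being dropped.
    for app in sorted(demands, key=lambda a: PRIORITY.get(a, 2)):
        grants[app] = min(demands[app], remaining)
        remaining -= grants[app]
    return grants


grants = allocate(2000, {"backup": 5000, "voip": 100, "video": 1500})
print(grants)
```

In this example the call and the video stream get everything they ask for, and the backup session receives only the 400 kbps that is left over.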
This approach is more compute-intensive than traditional DPI—it takes more CPU cycles to track flows, look at packet sizes and packet arrival times, and then correlate different flows than to just look inside the packet—but it’s still possible. Developers with FPGA and ASIC-based platforms are in a tough spot, though: the ASICs can’t be changed once they are in the field, and the task is more complex than FPGAs can be expected to handle because they are good at fixed function but poor at heuristic correlation.
Developers on Commercial Off-the-Shelf (COTS)-based packet processing platforms have an easier time: the same multicore CPU that was looking inside the packet can instead run heuristic code to infer the application, so systems that are already deployed can be repurposed to handle encrypted traffic with just a new software load.
The death of DPI?
The death of DPI has been predicted multiple times. I’ve no doubt that very prediction will be proffered at this year’s Mobile World Congress.
Some believed the functionality would be absorbed into adjacent network nodes. Others argued that users wouldn’t put up with it. Most recently, there have been those who believe that encryption will render DPI useless. The shift to heuristic-based application classification, however, coupled with the use of general purpose packet processing platforms, provides a solid path forward that preserves existing investment and delivers the same benefits in a timeframe that meets the needs of operators already struggling with traffic congestion.
Mike Coward, VP Strategy & Innovation, Radisys