NetSys Projects

You can find our most recent projects here. We are also on GitHub.

Fault Tolerant Middlebox

Middleboxes – such as proxies, WAN Optimizers, and intrusion detection systems (IDSes) – are often stateful, keeping logs of active connections, port mappings, packet caches, and other data about users, connections, and services. When middleboxes fail, lost state can lead to reset connections, lost logs, and security concerns. Hence, like other systems, it is desirable to design middleboxes for high availability, with automatic failover when a device suffers a hardware or electrical failure; such failover should ensure that no state is lost. However, middleboxes are a challenging target for HA, first because their state changes rapidly (sometimes even updating per-packet) and second, because latency expectations for packet service are typically under a millisecond to avoid inflating flow completion times. To this end, we present FTMB, a record-and replay approach to middlebox failover which records middlebox state without imposing a heavy latency penalty on traffic.


By moving network appliance functionality from proprietary hardware to software, Network Function Virtualization promises to bring the advantages of cloud computing to network packet processing. However, the evolution of cloud computing (particularly for data analytics) has greatly benefited from application-independent methods for scaling and placement that achieve high efficiency while relieving programmers of these burdens. NFV has no such general management solutions. To this end, we present E2 – a scalable and application-agnostic scheduling framework for packet processing.

E2’s home site is available on GitHub

Providing Guaranteed Software Dataplane Performance

Consolidated software dataplanes becoming increasingly commonplace. While software dataplanes are well capable of high-performance packet processing in isolation, their performance is adversely affected under co-execution. Due to out-of-order execution and heavy use of pipelines in CPUs, there is no easy approach to predict performance degradation without simplistic assumptions. In this work, we are trying to understand if in-hardware QoS support for shared resources is sufficient to guarantee steady performance in presence of all possible competing processes. As a first step, we seek to find out to what extent the (newly introduced) last-level cache QoS support in Intel Haswell Xeon processors help us maximize the overall performance of co-running software dataplanes while maintaining guaranteed performance for each of them individually.


Modern NICs implement various features in hardware, such as protocol offloading, multicore supports, traffic control, and self virtualization. This approach exposes several issues: protocol dependence, limited hardware resources, and incomplete/buggy/non-compliant implementation. Even worse, the slow evolution of hardware NICs due to increasingly overwhelming design complexity cannot keep up in time with the new protocols and rapidly changing network architectures. We introduce the SoftNIC architecture to fill the gap between hardware capabilities and user demands. Our current SoftNIC prototype implements sophisticated NIC features on a few dedicated processor cores, while assuming only streamlined functionalities in hardware. The preliminary evaluation results show that most NIC features can be implemented in software with minimum performance cost, while the flexibility of software provides further potential benefits.

BESS’s source code is available on GitHub.

Recursively Cautious Congestion Control

Any congestion control mechanism has two primary goals - to fill the pipe and to do no harm to other flows in the network. These two goals conflict with eachother – the former requires aggressiveness, whereas the latter requires caution. Traditional approaches use the same mechanism (the sending rate) to achieve these two conflicting goals. For example, TCP cautiously probes for bandwidth using slow-start, starting with a small initial window and then ramping up, in order to fill the pipe. As a result, it often takes flows several round-trip times to fully utilize the available bandwidth.
RC3 simply decouples these two goals by sending additional packets from the flow using several layers of low priority service, to fill the pipe, while TCP runs as usual at higher priority. It can therefore, quickly take advantage of available capacity from the very first RTT to achieve near-optimal throughputs and smaller flow completion times while preserving TCP-friendliness and fairness. In common wide-area scenarios, RC3 results in a 40% reduction in average flow completion times, with strongest improvements – more than 70% reduction in flow completion time – seen in medium to large sized (100KB - 3MB) flows.

Our Hotnets’13 paper can be found here.

Authors: Radhika Mittal, Justine Sherry, Sylvia Ratnasamy, Scott Shenker

CANDID - Classifying Assets in Networks by Determining Importance and Dependencies

CANDID is a passive NetFlow-based network traffic analysis platform targeted at inferring relationships and dependencies among services running on hosts in enterprise networks. These networks present challenges of great scale, complexity, and nonstop dynamism, which hinder the ability for network administrators to maintain insight into the complex relationships that exist in these networks. Consequently, administrators do not always know how best to proceed if a network failure occurs. CANDID strives to empower administrators by illuminating these relationships, such that they will be prepared to remedy complex service failures. The solutions we present take the first steps towards understanding these complex in-network relationships, with a special focus on inferring one class of dependencies and detecting load balanced services. The current focal point of our work is two radically different, yet complementary, strategies for inferring the presence of load balancing for pairs of systems. We leverage a case study using real NetFlow data from the network located at Lawrence Berkeley National Lab to validate our strategies. Promising results indicate this problem space is rich with unanswered research questions and is worthy of further exploration.

Our first CANDID publication, in the form of a M.S. Thesis, is available here.

Authors: Scott Marshall, Sylvia Ratnasamy, and Vern Paxson


Software bugs are inevitable in software-defined networking control software, and troubleshooting is a tedious, time-consuming task. In this paper we discuss how to improve control software troubleshooting by presenting a technique for automatically identifying a minimal sequence of inputs responsible for triggering a given bug, without making assumptions about the language or instrumentation of the software under test. We apply our technique to five open source SDN control platforms—Floodlight, NOX, POX, Pyretic, ONOS—and illustrate how the minimal causal sequences our system found aided the troubleshooting process.

Our paper draft can be found here. We have also made our tool publicly available. Slides from Colin’s talk at the Stanford CTO summit are available here.

Authors: Colin Scott, Andreas Wundsam, Barath Raghavan, Aurojit Panda, Zhi Liu, Sam Whitlock, Ahmed El-Hassany, Andrew Or, Jefferson Lai, Eugene Huang, Kyriakos Zarifis, and Scott Shenker.


Sparrow is a high throughput, low latency distributed cluster scheduler. Sparrow is designed for applications that require frequent research allocations due to launching very short jobs (e.g., jobs composed of 100ms tasks). To ensure that scheduling does not become a bottleneck, Sparrow distributes scheduling over several loosely coordianted machines. Each scheduler uses a constant-time scheduling algorithm based on on-demand feedback acquired by probing slave machines. Sparrow can perform task scheduling in milliseconds, two orders of magnitude faster than existing approaches. The Sparrow source code is publicly available here, and a tech report is available here.

Authors: Kay Outerhout, Patrick Wendall, Matei Zaharia, Ion Stoica.


BSD Sockets has been a de facto standard API for network programming. While it provides a simple and portable way to perform network I/O, it shows suboptimal performance for “message-oriented” network workloads, where connections are short or messages are small. This problem is exacerbated by its poor scalability on multi-core processors. In this work, we explore the benefits of a clean-slate design of network APIs aimed at achieving both high performance and ease of programming. We present MegaPipe, a new API for efficient, scalable network I/O, and evaluate its efficiency and effectiveness with a proof-of-concept implementation.

Our OSDI 2012 paper is available here. Source code is available upon request and will be publicly available soon.

Authors: Sangjin Han, Scott Marshall, Byung-Gon Chun, and Sylvia Ratnasamy


Ensuring basic connectivity in the network is generally handled by the control plane. However control plane convergence times are several orders of magnitude larger than the rate at which switches forward packets, which means that after a failure the network might be disrupted for a while, even though the network is not partitioned. Traditionally, networks have handled this problem by precomputing a set of backup paths, over which traffic can be redirected in the case of link failures. The most widely deployed example of such a mechanism is MPLS FRR, however MPLS fast-reroute can only handle a limited number of link failures. A natural question that arises is whether one could design a static mechanism, capable of dealing with any arbitrary set of link failures? We have shown that a static mechanism cannot handle an arbitrary set of failures.

Given this result, one must rely on a dynamic mechanism to guarantee ideal connectivity (i.e. packets are delivered as long as a network is connected, and barring congestion related drops). Previous work in this area, has resulted in algorithms that both require unbounded space in packet headers, and NP-complete computations to provide this guarantee. We propose a new algorithm, Data Driven Connectivity, which guarantees ideal connectivity, uses a single bit in the packet header, and can be carried out at line rate. Our NSDI’13 paper on this topic can be found here.

Authors: Junda Liu, Aurojit Panda, Ankit Singla, P. Brighten Godfrey, Michael Schapira, Scott Shenker


Middleboxes – such as firewalls, proxies, and WAN optimizers – have become almost ubiquitous in modern enterprises. In a survey of 57 enterprise network administrators, we found that these devices, while popular, are costly, error prone, and hard to manage. To ease these challenges, we developed APLOMB: a service for outsourcing middleboxes to the cloud entirely. With APLOMB, enterprise clients tunnel all of their Internet traffic to and from a cloud provider; the traffic undergoes middlebox processing at the cloud before being forwarded out to the Internet at large. Our implementation is built from open source components including Vyatta and OpenVPN; we hope to have a publicly-available service soon. Our SIGCOMM 2012 paper is available here.

Authors: Justine Sherry, Shaddi Hasan, Colin Scott, Arvind Krishnamurthy, Sylvia Ratnasamy, Vyas Sekar


Modern networks deploy middleboxes to support numerous advanced processing capabilities such as firewalling, traffic compression, and caching. Despite the widespread deployment of these features, they are nevertheless invisible to the end hosts using the network. We designed ‘network calls’ (netcalls) to allow end hosts to make function calls to the network processing their traffic, allowing the end hosts to invoke and configure the numerous advanced features provided by the networks their traffic traverses. For example, we built a web server that, upon detecting it is under attack using application-layer knowledge, adds additional filters to the firewalls in its network. A key challenge to netcalls is that we want to allow advanced configuration to end hosts not only in their local network, but in any network their traffic traverses. Thus, the netcalls architecture lies primarily in two components: an intra-domain protocol for end hosts to invoke function calls with their network provider, and an inter-domain protocol, by which providers invoke features in each others networks on behalf of their clients. Source code forthcoming. Text describing the netcalls design can be found here.

Authors: Justine Sherry, Daniel Kim, Seshadri Mahalingam, Amy Tang, Steve Wang, and Sylvia Ratnasamy


POX is a Python framework for writing network control software. At its most minimal, it is an OpenFlow controller. It targets research and education, and favors ease of use over most other concerns.

POX’s home site is at, and the source code is available on github.