Stanford Networking Seminar

12:15PM, Thursday January 19, 2012
Gates 104


Online Testing of Distributed Systems
 

Dejan Kostic
EPFL


About the talk:
It is notoriously difficult to make distributed systems reliable. This becomes even harder in the case of widely deployed systems that are heterogeneous (multiple implementations) and federated (multiple administrative entities). In this talk, I will argue that a key step in making these systems reliable is detecting deviations from the desired behavior by predicting future system behavior in a novel way: by running the code itself from the current state using either model checking or a variant of symbolic execution. The systems we have built materialize this vision. For example, CrystalBall uses live model checking and execution steering to guard against unknown programming faults (bugs). DiCE goes one step further in that it detects faults in heterogeneous, federated distributed systems in which it is impossible to retrieve code, state, or configuration files from other participants. We demonstrate the ease of integrating DiCE with a BGP router and a DNS server, the building blocks of two vital Internet services. Our evaluation shows that our systems quickly and successfully detect three important classes of faults, resulting from configuration mistakes, policy conflicts, and programming errors. In the last part of this talk, I will describe how we applied the lessons we learned from testing distributed systems to an exciting new domain: automatic testing of unmodified OpenFlow applications. One would think that the presence of a logically centralized controller in OpenFlow networks makes it easy to avoid bugs. Our work shows that this is not the case: we found 11 bugs in the three applications we considered.

About the speaker:
Dejan Kostic obtained his Ph.D. in Computer Science at Duke University. He spent the last two years of his studies and a brief stay as a postdoctoral scholar at the University of California, San Diego. He received his Master of Science degree in Computer Science from the University of Texas at Dallas, and his Bachelor of Science degree in Computer Engineering and Information Technology from the University of Belgrade (ETF), Serbia. Since 2006, he has been working as a tenure-track assistant professor at the School of Computer and Communication Sciences at EPFL (Ecole Polytechnique Fédérale de Lausanne), Switzerland. In 2010, he received a European Research Council (ERC) Starting Investigator Award. His interests include Distributed Systems, Computer Networks, Operating Systems, and Mobile Computing.