12:15PM, Thursday, February 22nd 2007.
Gates 104

Beyond Bloom Filters: Approximate Representations of Concurrent State Machines (and Some New Constructions for Bloom Filters)

Michael Mitzenmacher
Harvard University

Slides: [ppt]

About the talk:
Many networking applications require fast state lookups in a concurrent state machine, which tracks the state of a large number of flows simultaneously. We consider the question of how to compactly represent such concurrent state machines. To achieve compactness, we consider data structures for Approximate Concurrent State Machines(ACSMs) that can return false positives, false negatives, or a ``don't know'' response, a generalization of the standard notion of Bloom Filters. We describe three techniques based on Bloom filters and hashing, and evaluate them using both theoretical analysis and simulation. Our analysis leads us to an extremely efficient hashing-based scheme with several parameters that can be chosen to trade off space, computation, and the impact of errors. Our hashing scheme can also be used to provide an alternative structure with the same functionality and nearly the same complexity as a counting Bloom filter but that uses much less space. For example, for a 1\% false positive rate, our structure uses roughly half the space of a counting Bloom filter. We explain how ACSMs can be used for applications such as video congestion control and real-time detection of P2P traffic. Along the way, we'll find some new constructions of Bloom filters that are more efficient for reasonable parameters and well-suited for hardware implementation.

Joint work with Flavio Bonomi, Rina Panigrahy, Sushil Singh, George Varghese


About the speaker:
Michael Mitzenmacher is a Professor of Computer Science in the Division of Engineering and Applied Sciences at Harvard University. Michael has authored or co-authored over 100 conference and journal publications on a variety of topics, including Internet algorithms, hashing, load-balancing, erasure codes, error-correcting codes, compression, bin-packing, and power laws. His work on low-density parity-check codes shared the 2002 IEEE Information Theory Society Best Paper Award. His first textbook on probabilistic techniques in computer science, co-written with Eli Upfal, was published in 2005 by Cambridge University Press.

Michael Mitzenmacher graduated summa cum laude with a degree in mathematics and computer science from Harvard in 1991. After studying math for a year in Cambridge, England, on the Churchill Scholarship, he obtained his Ph. D. in computer science at U.C. Berkeley in 1996. He then worked at Digital Systems Research Center until joining the Harvard faculty in 1999.