What's on IPTV's neighbourhood?

Thursday, 10 September 2009

SIGCOMM Day 4


Session 6: Routing and Forwarding (Chair: Jia Wang, AT&T Research)

Stable and Flexible iBGP

Ashley Flavel (University of Adelaide), Matthew Roughan (University of Adelaide)

· Routing oscillation can decrease performance and lead to a high level of update churn.

· iBGP has also been shown to oscillate.

· More than eight years of research has not solved the problem of iBGP oscillation.

· Various solutions have been proposed but they all have problems: complicated to implement, restrict routing flexibility, or lack guarantees of stability.

· Their proposal is a very simple adaptation to the BGP decision process.

· They prove algebraically that it prevents iBGP oscillation.

LIPSIN: Line speed Publish/Subscribe Inter-Networking

Petri Jokela (Ericsson), Andras Zahemszky (Ericsson), Somaya Arianfar (Ericsson), Pekka Nikander (Ericsson), Christian Esteve (University of Campinas)

· Most Internet applications are internally publish/subscribe in nature

· Supporting efficient publish/subscribe requires 1) data-oriented naming, 2) efficient multicast, and 3) in-network caching.

· Deployment of native IP-based multicast has failed, and overlay- based multicast systems are inherently inefficient.

· Scalable and efficient publish/subscribe will require substantial architectural changes, such as moving from endpoint-oriented systems to information-centric architectures.

· Proposal: a novel multicast forwarding fabric, suitable for large-scale topic-based publish/subscribe. Fabric more energy efficient than the currently used ones.

PLUG: Flexible lookup modules for rapiddeployment of new protocols in high-speed routers

Lorenzo De Carli (University of Wisconsin-Madison), Yi Pan (University of Wisconsin-Madison), Amit Kumar (University of Wisconsin-Madison), Cristian Estan (University of Wisconsin-Madison), Karthikeyan Sankaralingam (University of Wisconsin-Madison)

· New protocols for the data link and network layer are being proposed to address limitations of current protocols (scalability, security, manageability).

· High-speed routers and switches that implement these protocols traditionally perform packet processing using ASICs. Why? High speed, low power, low chip area. But inflexible.

· Proposal: a flexible lookup module, PLUG (Pipelined Lookup Grid).

· They show that PLUG presents similar results to that of specialized lookup modules (in terms of throughput, area, power, and latency).

Session 7: Network Management (Chair: Thomas Karagiannis, Microsoft Research)

Modeling and Understanding End-to-End Class of Service Policies in Operational Networks

Yu-Wei Eric Sung (Purdue University), Carsten Lund (AT&T Labs Research), Mark Lyn (AT&T Labs), Sanjay Rao (Purdue University), Subhabrata Sen (AT&T Labs Research)

· Extensive use of service differentiation in Virtual Private Networks (VPNs) operated for business enterprises.

· The resulting Class of Service (CoS) designs embed complex policy decisions (bandwidth availability, cost).

· These complex high-level policies are realized through low-level router configurations (tedious process, error-prone ).

· They have proposed a formal approach to modeling CoS policies from router configuration files.

· They also present a practical and computationally efficient tool that can determine the CoS treatments received by an arbitrary set of flows across multiple routers

· They have validated their system using router configuration data from a cross-section of 150 diverse enterprise VPNs.

Towards Automated Performance Diagnosis in a Large IPTV Network

Ajay Mahimkar (The University of Texas at Austin), Zihui Ge (AT&T Labs - Research), Aman Shaikh (AT&T Labs - Research), Jia Wang (AT&T Labs - Research), Jennifer Yates (AT&T Labs - Research), Yin Zhang (The University of Texas at Austin), Qi Zhao (AT&T Labs - Research)

· IPTV is increasingly being deployed and offered as a commercial service.

· IPTV distribution network: hierarchical structure (instead of mesh), stringent requirements on reliability and performance, different distribution protocols (IP multicast), serious scalability challenges in managing millions of network elements.

· Characterisation and troubleshoot performance issues in one of the largest IPTV networks in North America.

· Measurement data used: device usage, error logs, user activity logs, video quality alarms, customer trouble tickets.

· They developed a novel diagnosis tool called Giza tailored to the enormous scale and hierarchical structure of the IPTV network.

· Multi-resolution data analysis to detect and localize regions experiencing problems.

· They make use of several statistical data mining techniques to troubleshoot the identified problems and diagnose their root causes.

Detailed Diagnosis in Enterprise Networks

Srikanth Kandula (Microsoft Research), Ratul Mahajan (Microsoft Research), Patrick Verkaik (UC San Diego), Sharad agarwal (Microsoft Research), Jitu Padhye (Microsoft Research), Paramvir Bahl (Microsoft Research)

· They analysed trouble tickets from small enterprise networks, and concluded that their operators need detailed fault diagnosis.

· They built a system (NetMedic) that enables detailed diagnosis by looking at info from operating systems and applications.

· Main idea: use the joint behaviour of two components in the past to estimate the likelihood of them impacting one another in the present.

· Prototype is effective at diagnosing faults that we inject in a live environment.

Session 8: Network Measurement (Chair: Gianluca Iannaccone, Intel Labs Berkeley)

Every Microsecond Counts: Tracking Fine-Grain Latencies with a Lossy Difference Aggregator

Ramana Rao Kompella (Purdue University), Kirill Levchenko (UC San Diego), Alex C. Snoeren (UC San Diego), George Varghese (UC San Diego)

· Network applications have stringent end-to-end latency requirements

· Fine-grain measurement demands cannot be met effectively by existing technologies, such as SNMP, NetFlow, or active probing.

· They propose a hash-based primitive LDA (Lossy Difference Aggregator) to measure latencies down to tens of microseconds and losses as infrequent as one in a million.

· They define a compact data structure that computes the average and standard deviation of latency and loss rate in a coordinated streaming environment.

· The system was compared with Poisson-spaced active probing with similar overheads. The LDA mechanism delivers orders of magnitude smaller relative error.

· Active probing requires 60 times as much bandwidth to deliver similar levels of accuracy.

Spatio-Temporal Compressive Sensing and Internet Traffic Matrices

Yin Zhang (University of Texas at Austin), Matthew Roughan (University of Adelaide), Walter Willinger (AT&T Labs -- Research), Lili Qiu (University of Texas at Austin)

· How to fill missing valies in a matrix (coulbe be a traffic matrix, delay matrix, social proximity matrix)

· They focus on traffic matrices.

· Missing values are quite common – direct measurement in infeasible, measurement unreliable, anomalies appear, and future traffic has not yet appeard

· Many network tasks are sensitive to missing values

· Ideas: 1) Exploit low rank nature of Traffic Matrices (TM); 2) exploit spatio-temporal properties (TM rows and columns close to each other are often close in value); 3) Exploit local strucytures in TMs

· Compressive sensing is used (generic technique for dealing with missing values that exploits the presence of structure and redundancy). However, existing compressive-sensing solutions perform poorly for traffic matrix interpolation.

· In the paper a novel spatio-temporal compressive sensing framework was developed. Key components: 1) a new technique called Sparsity Regularized Matrix Factorization (SRMF) that leverages the low-rank nature of traffic matrices and their spatio-temporal properties; 2) a mechanism for combining low-rank approximations with local interpolation procedures.

· They used real datasets from 3 networks.

· Compared to all other algorithms, this one is always better. Even with 98% missing values, the error is only of about 20%.

Passive Aggressive Measurement with MGRP

Pavlos Papageorgiou (University of Maryland, College Park), Justin McCann (University of Maryland, College Park), Michael Hicks (University of Maryland, College Park)

· Active proboing is very expensive – bandwidth expensive, and probes can interfere with the data

· With custom active measurement you shape the application data for measurement. This is efficient but not modular. And it is not reusable

· They propose MGRP – main intuition: MGRP piggybacks application data inside active probes.

· MGRP properties: 1) end to end measurement architecture (schedules probes for tx, and piggybacks application data on probes), 2) transparent to applications, 3) independent of measurement algorithms, 4) easy to adapt existing measurement tools, 5) can piggyback data across applications

· It enables aggressive probing with passive-like overhead.

· Piggybacking reduces bandwidth wasted by probes and enables measurement tools to be more aggressive faster and more accurate.

· Any measurement algorithm can now be written as if active, but implemented as passive.

Session 9: Performance Optimization (Chair: Ratul Mahajan, Microsoft Research)

ROAR: Increasing the Flexibility and Performance of Distributed Search

Costin Raiciu (UCL), Felipe Huici (UCL, NEC Labs), Mark Handley (UCL), David S. Rosenblum (UCL)

· We rely in distributed search everyday.

· Characteristics : big data set that doesn’t fit in one server. To reduce latency you use multiple servers, and put a bit of the data in each. But we also replicate the data. We create clusters of servers.

· PR = N (Number of Query Partitions * Replication = Number of Servers)

· P affects system behaviour. It dictates how much data each node stores, and it impacts latency.

· Partitioning determines latency and cost.

· Question: how to choose P, without knowing the workload? Choose the biggest value? Not desirable. CPU load increases...

· P is very difficult to change with existing solutions.

· How does google change P? In a way that requires over-provisioning, and lots of copies of the data (an estimate of 20TB/date center).

· The proposal: ROAR. Key observation: we do not need clusters to ensure each query meets all the data.

· ROAR uses consistent hashing. It uses a single parameter: P.

· ROAR copies zero data to increase P (while google’s algorithm has to copy a lot of data)

· Minimal data is copied when P decreases.

· They demonstrate the system using a privacy-preserving search application built upon ROAR.

· ROAR Changes P efficiently.

· The system scales. They tested this with 1000 servers in Amazon EC2.

Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication

Vijay Vasudevan (Carnegie Mellon University), Amar Phanishayee (Carnegie Mellon University), Hiral Shah (Carnegie Mellon University), Elie Krevat (Carnegie Mellon University), David Andersen (Carnegie Mellon University), Greg Ganger (Carnegie Mellon University), Garth Gibson (Carnegie Mellon University and Panasas, Inc), Brian Mueller (Panasas, Inc.)

· Timeout delay in TCP is big – 200 ms could be a big problem for datacenters.

· Some datacenter applications are very sensitive to these 200ms timeouts.

· Solution: enable microsecond retx (being safe in the wide area).

· Datacenter environment: 1-10Gbps, commodity Ethernet switches, and 10-100 microseconds latency.

· In a datacenter environment, 1 TCP timeout means more than 1000 RTTs...

· Long periods of idle links due to these high timeouts

· Other TCP variants did not prevent TCP timeouts.

· Timeouts also increase latency

· The solution: Microsecond TCP retransmissions

· 1st solution: simple one line change in Linux (eliminate minRTO). It improves throughput, but still not microsecond RTO.

· Requirements for microsecond RTO: TCP must track RTT in microseconds (timestamp option). Efficient high resolution kernel timers (use HPET for efficient interrupt signalling).

· They were able to provide high throughput sustained for up to 47 servers.

· Is it safe? One has to look at the interaction with delayed ACK, and the performance in the wide area.

· Better performance with delayed ACK disabled.

· Is it safe for the wide area? Potential concerns: stability (could we cause congestion collapse?), and performance (do we oftenb timeeout unnecessarily?)

· Stability is preserved, because timeouts retain exponential backoff and spurious timeouts slow rate of transfer.

· Today’s TCP has mechanisms to detect spurious timeouts (using timestamp) and to recover from spurious timeour (forward RTO), and they are both implemented widely.

· They analysed wide area performance without minRTO, to see if it harms the wide area. They didn’t find any difference in throughput. Reason? Few total timeouts (supuious or legitimate).

Matchmaking for Online Games and Other Latency-Sensitive P2P Systems

Sharad Agarwal (Microsoft Research), Jacob R. Lorch (Microsoft Research)

· Latency is the main factor to be aware of in multiplayer online games.

· Predicting latency would help.

· They have an extensive matchmaking trace.

· Current approaches: geolocation, and network coordinate system. They mix both to get Htrae.

· Weakness of geolocation: inflexible. Weakness of network coordinate systems: sensitive to initial conditions (takes a while to converge).

· They have refined their original system that basically mixed geolocation for the initial condition and a network coordinate system to have flexibility. Refinements: 1) autonomous system correction (to change the height of the node); 2) symmetric updates to speed up convergence.

· Trace replays for evaluation: 30 days, 3.5 million machines, 50 million probes.

· Absolute error median for Htrae is 15ms.

· Htrae can find the best server 70% of the time.

· Deployed systems have to guess a lot, so Htrae is way better.

About me

e-mail: fvramos at gmail dot com