EECE571R: Topics in Distributed Computing - Quality of Service

 

Instructor: Matei Ripeanu

 

Mailing list: eece571r@ece.ubc.ca

 

Course description

 

The design of complex large-scale computing systems that provide controlled quality of service is an outstanding challenge for networking and distributed systems research. This graduate-level course uses an inclusive definition of quality of service (QoS) in computing systems: we will investigate issues related to providing predictable performance at multiple levels of the computing stack (operating system, network, middleware, application layer), using different elements of the computing infrastructure (e.g., network, storage) for different end-user applications (e.g., multimedia delivery, scientific workflows).

The course will cover the fundamentals of queuing theory; operating system and network support for controlled QoS; QoS-enabled middleware and applications; and the interplay between low-level QoS metrics and the quality of experience perceived by application users. Advances in all these directions are key ingredients for recent efforts to build cyber-infrastructure. Students will be exposed to a range of quality of service technologies from networking (IntServ, DiffServ, RSVP), operating systems (fair scheduling), and distributed systems (SLAs, advance reservations), and to their integration with massive computing systems.

 

Course structure

Three hours of class per week, with time divided roughly equally between traditional lectures and student presentations/group discussions of recent research results.

 

Course outline (tentative weekly topics)

1. Introduction. Overview of current research problems, technologies, and applications.

2. Advanced QoS technologies. Capacity management. Pricing.

3. Operating system support for QoS.

4. Network-level support for QoS (DiffServ, RSVP, IntServ).

5. QoS-enabled middleware.

6. Quality of service negotiation. Monitoring. Agreement violation detection and conflict resolution.

7. QoS for distributed systems.

8. QoS for data storage.

9. Quality of Service vs. Quality of Experience.

10. Experience with deployed systems (I): QoS in distributed computing (Grids).

11. Experience with deployed systems (II): QoS in multimedia applications (VoIP and IP‑TV).

12. Project presentations.

 

Team project

Each team (2-3 members) examines a particular distributed systems topic, focusing on quality of service related issues. While a set of projects will be proposed, students are encouraged to define a project of their own: characterize an existing system, propose and evaluate techniques to improve existing systems, or prototype a new system. It is critical that students explain why a particular approach is used and what it contributes, with a rationale grounded in scientific or engineering knowledge and supported by a literature search. The project is evaluated through both a written report in standard IEEE publication format and an oral presentation.

 

References

Books (recommended):

1.     The Grid: Blueprint for a New Computing Infrastructure, Ian Foster, Carl Kesselman editors, 2nd Edition, Morgan Kaufmann, 2004.

2.     Reliable Distributed Systems: Technologies, Web Services, and Applications, Kenneth Birman, Springer, 2005

3.     Quantitative System Performance: Computer System Analysis Using Queueing Network Models, Edward D. Lazowska, John Zahorjan, G. Scott Graham, Kenneth C. Sevcik (available online)

 

Journals:

1.     ACM Transactions on Storage

2.     IEEE Transactions on Parallel and Distributed Systems

3.     IEEE/ACM Transactions on Networking

 

Conferences:

1.     USENIX Conference on File and Storage Technologies (FAST)

2.     USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI)

3.     USENIX Symposium on Operating Systems Design and Implementation (OSDI)

4.     ACM SIGCOMM

5.     IEEE Conference on Computer Communications (INFOCOM)

6.     IEEE/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing – SC)

 

Grading

Research paper reviews, class participation: 50%

Project report and presentation: 50%