log in  |  register  |  feedback?  |  help  |  web accessibility
PhD Defense: On Scheduling and Communication Issues in Data Centers
Sheng Yang
Monday, November 9, 2020, 5:00-7:00 pm Calendar
  • You are subscribed to this talk through .
  • You are watching this talk through .
  • You are subscribed to this talk. (unsubscribe, watch)
  • You are watching this talk. (unwatch, subscribe)
  • You are not subscribed to this talk. (watch, subscribe)
The proliferation of datacenters to handle the rapidly growing amount of data being managed in the cloud, necessitates the design, management and effective utilization of the thousands of machines that constitute a data center. Many modern big data applications require access to a large number of machines and data sets for training neural nets or for other processing.

In this thesis, we present research challenges and progress along two fronts.

The first challenge addresses the need to schedule communication between the machines in a much more effective manner, as several running applications compete for network bandwidth. We address a basic question known as coflow scheduling to optimize the average completion time of tasks that are running across different  machines in the data center and to effectively handle their communication needs. In addition, we also study a related model that addresses communication needs of tasks that process data on multiple data centers and handles communication requirements of such tasks across a wide area network with possibly widely varying bandwidth across different pairs of machines.

The second challenge is from the user perspective - since access to resources such as those provided by Amazon AWS can be expensive at scale, cloud computing providers often sell under utilized resources at a significant discount via a spot instance market. However, these instances are not dedicated and while they offer a cheaper alternative, there is a chance that the user's job will be interrupted to process higher priority tasks. Certain non-critical applications are not significantly impacted by delays due to interruptions, and we develop a framework to study some basic scheduling questions.

In all of these topics, the problems we study are NP-hard and our focus is on developing good approximation algorithms. In addition, our work has shown that the algorithms developed in this framework are practical and efficient and can be easily deployed in practice.

Examining Committee: 
                           Chair:              Dr. Samir Khuller            
                           Dean's rep:      Dr. Uzi Vishkin 
                          Members:         Dr. Mosharaf Chowdhury
                                                Dr. David Mount
                                                Dr. Neil Spring   

Sheng Yang is a PhD candidate in CS department at University of Maryland, College Park, advised by Prof. Samir Khuller. He obtained his bachelor degree in 2015  from Tsinghua University Yao Class. His research interests include approximation algorithms, graph theory, scheduling, computer networks, and cloud computing. He focuses on theoretical problems originating from practice, and designing algorithms that achieve both good theoretical bounds and practical efficiency.

This talk is organized by Tom Hurst