On-demand Edge Inference Scheduling with Accuracy and Deadline Guarantee

  • Yechao She*
  • Minming Li
  • Yang Jin
  • Meng Xu
  • Jianping Wang
  • Bin Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

To meet the increasing demand for machine-learning-based applications, pushing inference services to the network edge has become a trend. This work aims to design an on-demand edge inference scheduler with accuracy and deadline guarantees for repetitive tasks. Specifically, we consider an edge server that is preinstalled with multiple early-exit Deep Neural Networks (DNNs), where each DNN-exit pair provides an inference service of a different quality. We also consider the diversity of tasks' quality-of-service requirements and their associated utilities. We aim to maximize the system's total utility by optimizing service assignment and time scheduling subject to resource, accuracy, and deadline constraints. We formulate this problem as an integer linear program and show that it is NP-hard even in the offline case. The problem is challenging because service assignment and time scheduling are coupled. To derive low-complexity scheduling solutions, we introduce a task-service graph and convert the problem into a service assignment selection problem with schedulability constraints. We then design a polynomial-complexity algorithm with a $\frac{\rho}{\delta}$ approximation ratio for the offline problem, where $\rho$ is the task-wise utility ratio and $\delta$ is the maximum number of concurrent tasks. To handle the online problem, we propose an online heuristic algorithm. Simulation results show that the proposed algorithms outperform state-of-the-art baseline algorithms.
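As a rough illustration of the task-service graph idea mentioned in the abstract, the sketch below links each task to the DNN-exit pairs that could serve it when considered in isolation. It is not the paper's construction: the `Service` and `Task` fields, the function name `build_task_service_graph`, the service names, and all numbers are hypothetical, and the filter only checks a task's accuracy requirement and whether the service latency fits between release and deadline. Schedulability across concurrent tasks, which the abstract identifies as the hard part, is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """One DNN-exit pair (hypothetical fields, not taken from the paper)."""
    name: str
    accuracy: float  # expected inference accuracy in [0, 1]
    latency: float   # processing time on the edge server, in ms


@dataclass(frozen=True)
class Task:
    """One inference request (hypothetical fields, not taken from the paper)."""
    name: str
    release: float       # arrival time, in ms
    deadline: float      # absolute deadline, in ms
    min_accuracy: float  # accuracy requirement in [0, 1]


def build_task_service_graph(tasks, services):
    """Map each task name to the services that could serve it in isolation.

    An edge (task, service) exists when the service meets the task's accuracy
    requirement and its latency fits inside the task's release-deadline window.
    """
    graph = {}
    for task in tasks:
        window = task.deadline - task.release
        graph[task.name] = [
            s.name
            for s in services
            if s.accuracy >= task.min_accuracy and s.latency <= window
        ]
    return graph


# Illustrative data only; none of these values come from the paper.
services = [
    Service("resnet-exit1", accuracy=0.80, latency=5.0),
    Service("resnet-exit2", accuracy=0.90, latency=12.0),
    Service("resnet-final", accuracy=0.95, latency=30.0),
]
tasks = [
    Task("t1", release=0.0, deadline=10.0, min_accuracy=0.75),
    Task("t2", release=0.0, deadline=40.0, min_accuracy=0.92),
    Task("t3", release=5.0, deadline=15.0, min_accuracy=0.90),
]

graph = build_task_service_graph(tasks, services)
for task_name, candidates in graph.items():
    print(task_name, candidates)
```

In this toy instance, a task with no outgoing edge (such as `t3`) cannot be served by any single DNN-exit pair within its window, so a scheduler would have to reject it regardless of how the other tasks are ordered.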

Original language: English
Title of host publication: 2023 IEEE/ACM 31st International Symposium on Quality of Service, IWQoS 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350399738
State: Published - 2023
Externally published: Yes
Event: 31st IEEE/ACM International Symposium on Quality of Service, IWQoS 2023 - Orlando, United States
Duration: 19 Jun 2023 to 21 Jun 2023

Publication series

Name: IEEE International Workshop on Quality of Service, IWQoS
Volume: 2023-June
ISSN (Print): 1548-615X

Conference

Conference: 31st IEEE/ACM International Symposium on Quality of Service, IWQoS 2023
Country/Territory: United States
City: Orlando
Period: 19/06/23 to 21/06/23
