LL-ICM: Image Compression for Low-Level Machine Vision via Large Vision-Language Model

  • Yuan Xue
  • Qi Zhang*
  • Chuanmin Jia
  • Shiqi Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Image Compression for Machines (ICM) aims to compress images for machine vision tasks, but current methods mostly target the demands of high-level tasks. In the real world, however, the quality of the original images is often not guaranteed, and compression degrades downstream task performance even further. Low-level (LL) restoration tasks should therefore also be considered in ICM. In this paper, we propose the first ICM framework for LL machine vision tasks, namely LL-ICM, which jointly optimizes compression and LL processing performance. Moreover, LL-ICM leverages a large vision-language model (VLM) to solve different LL tasks within a single model, which is particularly useful when the distortion type of the original image is uncertain. As illustrated in Fig. 1(a), LL-ICM consists of a neural image codec and a VLM-based LL processing module. Given an original image with distortions, LL-ICM first compresses it as X. Then, we extract a generalized feature F from X, which is encoded as two representations: a distortion type f and a caption s. After that, the LL processing module receives X and its representations to generate the restored version of X, i.e., XH.
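
The data flow described in the abstract (compress, extract a feature, encode it as a distortion type and a caption, then restore) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: every function and class name below is hypothetical, and the toy stand-ins (coarse quantization for the neural codec, mean brightness for the feature F, a brightness threshold for the VLM-based encoding, a fixed gain for the restoration) are assumptions chosen so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image, pixel values in [0, 1]


@dataclass
class Representations:
    distortion_type: str  # "f" in the abstract
    caption: str          # "s" in the abstract


def ll_icm_pipeline(
    original: Image,
    codec: Callable[[Image], Image],
    extract_feature: Callable[[Image], float],
    encode: Callable[[float], Representations],
    restore: Callable[[Image, Representations], Image],
) -> Image:
    """Chain the four stages named in the abstract (all callables are stand-ins)."""
    x = codec(original)           # compressed image X
    feature = extract_feature(x)  # generalized feature F
    reps = encode(feature)        # distortion type f and caption s
    return restore(x, reps)       # restored version of X


# --- Hypothetical toy stand-ins, NOT the components used in the paper ---

def toy_codec(img: Image) -> Image:
    # Assumption: coarse quantization to multiples of 0.25 mimics lossy coding.
    return [[round(p * 4) / 4 for p in row] for row in img]


def toy_feature(img: Image) -> float:
    # Assumption: mean brightness stands in for the generalized feature F.
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def toy_encode(feature: float) -> Representations:
    # Assumption: a fixed threshold stands in for the VLM-based encoding.
    if feature < 0.3:
        return Representations("low-light", "a dark, underexposed image")
    return Representations("none", "a well-exposed image")


def toy_restore(img: Image, reps: Representations) -> Image:
    # Assumption: a gain of 2 stands in for the LL processing module.
    if reps.distortion_type == "low-light":
        return [[min(1.0, p * 2) for p in row] for row in img]
    return img


restored = ll_icm_pipeline(
    [[0.10, 0.20], [0.05, 0.15]],
    toy_codec, toy_feature, toy_encode, toy_restore,
)
print(restored)
```

Passing the stages as callables keeps the sketch agnostic about the actual codec and VLM. Note that the restoration step sees only the compressed image and its two representations, never the original.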

Original language: English
Title of host publication: Proceedings - DCC 2025
Subtitle of host publication: 2025 Data Compression Conference
Editors: Ali Bilgin, James E. Fowler, Joan Serra-Sagrista, Yan Ye, James A. Storer
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 408
Number of pages: 1
ISBN (Electronic): 9798331534714
State: Published - 2025
Externally published: Yes
Event: 2025 Data Compression Conference, DCC 2025 - Snowbird, United States
Duration: 18 Mar 2025 to 21 Mar 2025

Publication series

Name: Data Compression Conference Proceedings
ISSN (Print): 1068-0314

Conference

Conference: 2025 Data Compression Conference, DCC 2025
Country/Territory: United States
City: Snowbird
Period: 18/03/25 to 21/03/25
