TY - GEN
T1 - Leveraging Conv-Attention for Efficient and High-Quality JPEG AI Image Coding
AU - Wang, Meng
AU - Esenlik, Semih
AU - Zhang, Zhaobin
AU - Wu, Yaojun
AU - Zhang, Kai
AU - Zhang, Li
AU - Wang, Shiqi
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - In this paper, we present Conv-Attention, a decoder-friendly attention mechanism, in an effort to advance the practical application of artificial intelligence-based image coding. More specifically, the proposed method is tailored for JPEG AI, the latest neural-network-based image coding standard. Profiling the decoding complexity of JPEG AI shows that the attention module accounts for a significant proportion of it, which is mainly attributable to the intricate network structure and the involvement of less efficient operations. The Conv-Attention model is composed of plain convolution and activation computations and is equipped with a sub-scaling and up-scaling design so that non-adjacent features can be well captured, reducing decoding complexity while maintaining the synthesis and attentive capability. Simulation results with the JPEG AI reference software verify the effectiveness of the proposed method: the decoding complexity is reduced by 80% with negligible coding performance loss. The proposed method was adopted at the 100th JPEG meeting.
UR - https://www.scopus.com/pages/publications/85194854591
U2 - 10.1109/DCC58796.2024.00012
DO - 10.1109/DCC58796.2024.00012
M3 - Conference contribution
AN - SCOPUS:85194854591
T3 - Data Compression Conference Proceedings
SP - 43
EP - 52
BT - Proceedings - DCC 2024
A2 - Bilgin, Ali
A2 - Fowler, James E.
A2 - Serra-Sagrista, Joan
A2 - Ye, Yan
A2 - Storer, James A.
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 Data Compression Conference, DCC 2024
Y2 - 19 March 2024 through 22 March 2024
ER -