Special Session 7: Retrieval-Augmented Generation and Fine-Grained Object Analysis
Description:
Fine-grained object retrieval and analysis remain
cornerstone challenges in computer vision, pattern
recognition, and multimodal AI. These tasks are critical for
applications such as fashion recommendation, species
identification, cross-modal search, and industrial
automation. While advances in deep learning have improved
semantic representation, emerging paradigms like
Retrieval-Augmented Generation (RAG) now offer
transformative potential by integrating retrieval systems
with generative models. This fusion enables context-aware,
interpretable, and scalable solutions for fine-grained
tasks, particularly when handling unlabeled data, unseen
domains, or multimodal interactions.
However, challenges persist in balancing retrieval precision
with generative diversity, optimizing cross-modal alignment,
and ensuring efficiency in real-world systems. This special
session seeks to unite cutting-edge research on fine-grained
retrieval, matching, ranking, and RAG-driven methodologies,
bridging gaps between traditional retrieval systems and
generative AI.
We invite submissions that advance the theory, algorithms,
and applications of fine-grained object analysis and
retrieval-augmented systems. Emphasis will be placed on
innovations that address multi-level semantics, cross-modal
alignment, and domain adaptation, with a focus on
scalability and real-world utility. Submissions exploring
novel integrations of retrieval and generation (RAG) are
particularly encouraged.
Session organizers
Assoc. Prof. Ying Li, Nanjing Normal University, China
Assoc. Prof. Tao Yao, Ludong University & Southwest Jiaotong
University, China
The topics of interest include, but are not limited to:
• Fine-grained object retrieval and ranking in images, video, 3D data, or cross-modal contexts (text-to-image, image-to-text).
• Retrieval-Augmented Generation (RAG) for enhancing fine-grained analysis, including context-aware feature synthesis and hybrid retrieval-generation pipelines.
• Adversarial/generative models for robust representation learning in low-data or open-world scenarios.
• Self-supervised, meta-, and unsupervised learning for adapting retrieval/RAG systems to unseen domains or classes.
• Manifold learning, graph neural networks, and transformers for modeling complex semantic hierarchies in fine-grained data.
• Efficient indexing, real-time retrieval, and lightweight RAG architectures for large-scale applications.
• Cross-modal alignment and multimodal fusion techniques for joint retrieval and generation.
• Novel datasets, benchmarks, and evaluation metrics for fine-grained retrieval and RAG systems.
• Applications in e-commerce, healthcare, biodiversity monitoring, autonomous systems, and augmented reality.
Submission method
Submit your full paper (no fewer than 8 pages) or an
abstract only, without publication (200-400 words), via the
Online Submission System, then choose Special Session 7
(Retrieval-Augmented Generation and Fine-Grained Object Analysis).
Introduction of session organizers
Assoc. Prof. Ying Li
Nanjing Normal University, China
Ying Li received the B.E. degree in electronics and information engineering, and the M.S. and Ph.D. degrees in signal and information processing from the Dalian University of Technology, Dalian, China, in 2012, 2015, and 2019, respectively. From 2015 to 2017, she was a Visiting Graduate Student with the Department of Computer Science, The University of Texas at San Antonio (UTSA), San Antonio, TX, USA. She is currently an Associate Professor with the School of Computer and Electronic Information and the School of Artificial Intelligence, Nanjing Normal University, Nanjing, China. Her research interests include multimedia retrieval and computer vision. In recent years, she has led three research projects funded by the NSFC and Jiangsu Province, and she has published over 20 papers in international conferences and journals such as ACM MM, PR, ACM TOMM, and IEEE TCSVT. She serves as an Associate Editor for the journal Pattern Recognition and served as an Area Chair for ACM Multimedia 2023. She has also held roles including Session Chair for IJCAI 2020 and Reviewer for top-tier journals and conferences (e.g., TPAMI, TMM, TNNLS, CVPR, ICCV, ACM MM).
Assoc. Prof. Tao Yao
Ludong University & Southwest Jiaotong University, China
Tao Yao received the Ph.D. degree from the Dalian University of Technology, China, in 2017. He is currently an Associate Professor with the Department of Information and Electrical Engineering, Ludong University, and is also a researcher with the Yantai Research Institute of New Generation Information Technology, Southwest Jiaotong University. His research interests include multimedia retrieval, computer vision, and machine learning.