Special Session 7

Special Session 7: Retrieval-Augmented Generation and Fine-Grained Object Analysis

Description: Fine-grained object retrieval and analysis remain cornerstone challenges in computer vision, pattern recognition, and multimodal AI. These tasks are critical for applications such as fashion recommendation, species identification, cross-modal search, and industrial automation. While advances in deep learning have improved semantic representation, emerging paradigms like Retrieval-Augmented Generation (RAG) now offer transformative potential by integrating retrieval systems with generative models. This fusion enables context-aware, interpretable, and scalable solutions for fine-grained tasks, particularly when handling unlabeled data, unseen domains, or multimodal interactions.

However, challenges persist in balancing retrieval precision with generative diversity, optimizing cross-modal alignment, and ensuring efficiency in real-world systems. This special session seeks to unite cutting-edge research on fine-grained retrieval, matching, ranking, and RAG-driven methodologies, bridging gaps between traditional retrieval systems and generative AI.

We invite submissions that advance the theory, algorithms, and applications of fine-grained object analysis and retrieval-augmented systems. Emphasis will be placed on innovations that address multi-level semantics, cross-modal alignment, and domain adaptation, with a focus on scalability and real-world utility. Submissions exploring novel integrations of retrieval and generation (RAG) are particularly encouraged.

Session organizers
Assoc. Prof. Ying Li, Nanjing Normal University, China
Assoc. Prof. Tao Yao, Ludong University & Southwest Jiaotong University, China

The topics of interest include, but are not limited to:
• Fine-grained object retrieval and ranking in images, video, 3D data, or cross-modal contexts (text-to-image, image-to-text).
• Retrieval-Augmented Generation (RAG) for enhancing fine-grained analysis, including context-aware feature synthesis and hybrid retrieval-generation pipelines.
• Adversarial/generative models for robust representation learning in low-data or open-world scenarios.
• Self-supervised, meta-, and unsupervised learning for adapting retrieval/RAG systems to unseen domains or classes.
• Manifold learning, graph neural networks, and transformers for modeling complex semantic hierarchies in fine-grained data.
• Efficient indexing, real-time retrieval, and lightweight RAG architectures for large-scale applications.
• Cross-modal alignment and multimodal fusion techniques for joint retrieval and generation.
• Novel datasets, benchmarks, and evaluation metrics for fine-grained retrieval and RAG systems.
• Applications in e-commerce, healthcare, biodiversity monitoring, autonomous systems, and augmented reality.

Submission method
Submit your full paper (at least 8 pages) or your abstract without publication (200-400 words) via the Online Submission System, then choose Special Session 7 (Retrieval-Augmented Generation and Fine-Grained Object Analysis).
Template Download

Introduction of session organizers

Assoc. Prof. Ying Li
Nanjing Normal University, China

Ying Li received the B.E. degree in electronics and information engineering, and the M.S. and Ph.D. degrees in signal and information processing from the Dalian University of Technology, Dalian, China, in 2012, 2015, and 2019, respectively. From 2015 to 2017, she was a Visiting Graduate Student with the Department of Computer Science, The University of Texas at San Antonio (UTSA), San Antonio, TX, USA. She is currently an Associate Professor with the School of Computer and Electronic Information and the School of Artificial Intelligence, Nanjing Normal University, Nanjing, China. Her research interests include multimedia retrieval and computer vision. In recent years, she has led three research projects funded by the NSFC and Jiangsu Province, and she has published over 20 papers in international conferences and journals such as ACM MM, PR, ACM TOMM, and IEEE TCSVT. She serves as an Associate Editor for the journal Pattern Recognition and as Area Chair for ACM Multimedia 2023. She has also held roles including Session Chair for IJCAI 2020 and Reviewer for top-tier journals and conferences (e.g., TPAMI, TMM, TNNLS, CVPR, ICCV, ACM MM).

Assoc. Prof. Tao Yao
Ludong University & Southwest Jiaotong University, China

Tao Yao received the Ph.D. degree from the Dalian University of Technology, China, in 2017. He is currently an Associate Professor with the Department of Information and Electrical Engineering, Ludong University, and a researcher with the Yantai Research Institute of New Generation Information Technology, Southwest Jiaotong University. His research interests include multimedia retrieval, computer vision, and machine learning.