# Dynamic Correction Decoding for Hallucination Mitigation

## Overview

**Dynamic Correction Decoding (DCD)** is an innovative approach to mitigate hallucinations in multimodal large language models (MLLMs). The system actively uses visual information to detect inconsistencies and dynamically correct generated text during the decoding process.

## Key Features

- **Real-time Hallucination Detection**: Cross-modal consistency checking using vision-language alignment scores
- **Dynamic Correction Module**: Context-aware alternative token generation when inconsistencies are detected
- **Multimodal Integration**: Seamlessly combines visual and textual information for improved accuracy

## Technical Highlights

- Reduced factual errors by 40-50% across multiple vision-language benchmarks
- Maintained generation fluency while significantly improving reliability
- Production-ready implementation suitable for deployment

## Applications

This technique is particularly valuable for:

- Image captioning systems
- Visual question answering
- Document understanding and OCR
- Medical imaging report generation

## Publications

- **MLLM Can See? Dynamic Correction Decoding for Hallucination Mitigation** (2024)
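
## Example Sketch

The detect-then-correct behaviour described under Key Features can be illustrated with a toy decoding step. This is a minimal sketch under stated assumptions, not the published implementation: the function name `decode_step_with_correction`, the `alignment_score` callback, the `threshold` value, and the probability-times-alignment rescoring rule are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict


def decode_step_with_correction(
    candidates: Dict[str, float],
    alignment_score: Callable[[str], float],
    threshold: float = 0.5,
) -> str:
    """Choose the next token, correcting it if the image does not support it.

    candidates:      token -> language-model probability (hypothetical input).
    alignment_score: token -> visual support in [0, 1] (hypothetical stand-in
                     for a vision-language alignment model).
    threshold:       minimum support below which the top token is treated as
                     inconsistent with the image (assumed value).
    """
    # Rank candidate tokens by language-model probability, highest first.
    ranked = sorted(candidates, key=lambda tok: candidates[tok], reverse=True)
    top = ranked[0]

    # Detection: keep the top token when the visual evidence supports it.
    if alignment_score(top) >= threshold:
        return top

    # Correction: rescore every candidate by probability times visual support
    # and return the best one (an assumed rescoring rule).
    return max(ranked, key=lambda tok: candidates[tok] * alignment_score(tok))


# Toy example: the text prior prefers "dog", but the image supports "cat".
probs = {"dog": 0.6, "cat": 0.3, "car": 0.1}
support = {"dog": 0.1, "cat": 0.9, "car": 0.2}
chosen = decode_step_with_correction(probs, lambda tok: support[tok])
```

In this toy setup the highest-probability token is replaced only when its visual support falls below the threshold, so well-grounded tokens pass through unchanged and fluency is affected only at steps flagged as inconsistent.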