Deep Network Representations as Reliable Indicators of Synthetic Content in Audiovisual and Clinical Contexts

Karol JĘDRASIAK and Julia BIJOCH

WSB University, Poland

Abstract

This study introduces an interpretable framework for detecting synthetic audiovisual content using deep neural representations, applied to the DeepFake RealWorld (DFRW) dataset (46 371 clips; 77% with audio). Visual, acoustic, and cross-modal embeddings from ResNet, Vision Transformer, SlowFast, Wav2Vec2, and ECAPA-TDNN were evaluated with frequency-based metrics (Δp ≥ 0.15, PR ≥ 1.5). The strongest indicators were facial embedding variance (Δp = 0.29, PR = 3.4), Mahalanobis distance (Δp = 0.25), and audiovisual coherence (Δp = 0.23), all of which remained stable within 15% under compression and re-capture. In teledentistry and telemedicine, such explainable-AI markers enhance authenticity verification of digital evidence and strengthen medico-legal reliability.
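As a minimal sketch of how the reported statistics could be computed: assuming Δp denotes the difference in a marker's prevalence between synthetic and real clips, PR their prevalence ratio, and the Mahalanobis distance is taken from an embedding to a reference (real-class) distribution. The abstract does not spell out these definitions or the underlying frequencies, so the formulas, function names, and example numbers below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def frequency_metrics(p_synthetic: float, p_real: float) -> tuple[float, float]:
    """Assumed definitions: Δp = |p_synthetic - p_real|, PR = p_synthetic / p_real,
    where each p is the fraction of clips in which the marker fires."""
    delta_p = abs(p_synthetic - p_real)
    pr = p_synthetic / p_real
    return delta_p, pr

def mahalanobis_distance(x: np.ndarray, mean: np.ndarray, cov: np.ndarray) -> float:
    """Distance of one embedding x from a reference distribution (mean, cov),
    e.g. an embedding cloud estimated from authentic clips."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Illustrative prevalences (hypothetical, not from the paper's data):
dp, pr = frequency_metrics(p_synthetic=0.41, p_real=0.12)
print(f"Δp = {dp:.2f}, PR = {pr:.2f}")

# With an identity covariance, Mahalanobis reduces to Euclidean distance:
d = mahalanobis_distance(np.array([3.0, 4.0]), np.zeros(2), np.eye(2))
print(f"Mahalanobis distance = {d:.1f}")
```

Under a screening rule like the one quoted (Δp ≥ 0.15 and PR ≥ 1.5), a marker would be kept only if both inequalities hold for its measured prevalences.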

Keywords: Deepfake detection; deep neural representations; multimodal coherence; audiovisual forensics; explainable AI; telemedicine; teledentistry; data integrity