Virtual camera detection: Catching video injection attacks in remote biometric systems

Authors: Daniyar Kurmankhojayev, Andrei Shadrikov, Dmitrii Gordin, Mikhail Shkorin, Danijar Gabdullin, Aigerim Kambetbayeva, Kanat Kuatov

Published: 2025-12-11 14:01:06+00:00

AI Summary

This study introduces a machine learning-based approach for Virtual Camera Detection (VCD) to counter video injection attacks in remote facial biometric systems. The approach trains a model on metadata collected during authentic user sessions, focusing on camera behavior rather than visual cues, to distinguish between real and virtual camera inputs. Empirical results demonstrate the model's effectiveness in identifying video injection attempts and enhancing the security of Face Anti-Spoofing (FAS) systems.

Abstract

Face anti-spoofing (FAS) is a vital component of remote biometric authentication systems based on facial recognition, increasingly used across web-based applications. Among emerging threats, video injection attacks -- facilitated by technologies such as deepfakes and virtual camera software -- pose significant challenges to system integrity. While virtual camera detection (VCD) has shown potential as a countermeasure, existing literature offers limited insight into its practical implementation and evaluation. This study introduces a machine learning-based approach to VCD, with a focus on its design and validation. The model is trained on metadata collected during sessions with authentic users. Empirical results demonstrate its effectiveness in identifying video injection attempts and reducing the risk of malicious users bypassing FAS systems.


Key findings
All three models (CatBoost, HGB, Ensemble) achieved AUC-ROC scores above 0.9, showing that each can reliably distinguish real from virtual camera inputs. At an Attack Presentation Classification Error Rate (APCER) of 10^-1 (10%), the system kept the Bona Fide Presentation Classification Error Rate (BPCER) at 14.6%, a workable balance between security and usability. Stricter APCER thresholds, however, led to significantly higher BPCERs, so VCD may need to be one layer of a multi-layered anti-spoofing framework rather than a standalone defense.
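To make the operating-point figures concrete, the sketch below shows one way to compute BPCER at a fixed APCER from raw classifier scores. It is not the authors' evaluation code. It assumes that a higher score means "more likely a virtual camera" and that a session is flagged as an attack when its score reaches the threshold; the function names and toy scores are hypothetical.

```python
def apcer_bpcer(attack_scores, bona_fide_scores, threshold):
    """Return (APCER, BPCER) when scores >= threshold are flagged as attacks."""
    # APCER: share of attack sessions that slip through as bona fide.
    apcer = sum(s < threshold for s in attack_scores) / len(attack_scores)
    # BPCER: share of bona fide sessions wrongly flagged as attacks.
    bpcer = sum(s >= threshold for s in bona_fide_scores) / len(bona_fide_scores)
    return apcer, bpcer


def bpcer_at_apcer(attack_scores, bona_fide_scores, target_apcer):
    """Pick the threshold with the lowest BPCER whose APCER stays within target."""
    candidates = sorted(set(attack_scores) | set(bona_fide_scores))
    candidates.append(float("inf"))  # threshold that flags nothing
    best = None
    for t in candidates:
        apcer, bpcer = apcer_bpcer(attack_scores, bona_fide_scores, t)
        if apcer <= target_apcer and (best is None or bpcer < best[1]):
            best = (t, bpcer)
    return best


# Toy scores for illustration only (not data from the paper).
toy_attack = [0.9, 0.8, 0.7, 0.6, 0.55, 0.95, 0.85, 0.75, 0.65, 0.2]
toy_bona_fide = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.15, 0.25, 0.35]
threshold, bpcer = bpcer_at_apcer(toy_attack, toy_bona_fide, target_apcer=0.1)
print(f"threshold={threshold}, BPCER={bpcer:.1%}")
```

Lowering `target_apcer` can only shrink the set of admissible thresholds, so the resulting BPCER can only stay the same or rise; this is the trade-off described in the key findings.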
Approach
The authors developed a machine learning model that analyzes metadata collected from browser API interactions with the camera driver. This metadata includes camera capabilities, response times to configuration changes (e.g., frame rate, height, width), and reported vs. actual settings, which reveal patterns distinctive to physical versus virtual cameras.
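As an illustration of how such metadata could become model inputs, the following sketch flattens one session record into a feature dictionary. The record layout (`capabilities`, `probes`, `response_ms`, and so on) and the feature names are assumptions made for this example; the summary does not describe the authors' actual schema or feature set.

```python
from statistics import mean

# Settings named in the summary as examples of configuration changes.
SETTINGS = ("frameRate", "width", "height")


def extract_features(session):
    """Flatten a (hypothetical) session record into a flat feature dict."""
    features = {}
    capabilities = session.get("capabilities", {})
    for name in SETTINGS:
        # Capability range the camera advertises for this setting.
        cap = capabilities.get(name, {})
        features[f"{name}_cap_min"] = cap.get("min")
        features[f"{name}_cap_max"] = cap.get("max")

        # Probes: each one requested a change and recorded the outcome.
        probes = [p for p in session.get("probes", []) if p["setting"] == name]
        if probes:
            # How long the camera took to respond to the change.
            features[f"{name}_mean_response_ms"] = mean(
                p["response_ms"] for p in probes
            )
            # Gap between what the camera reported and what was measured.
            features[f"{name}_mean_abs_mismatch"] = mean(
                abs(p["reported"] - p["actual"]) for p in probes
            )
        else:
            features[f"{name}_mean_response_ms"] = None
            features[f"{name}_mean_abs_mismatch"] = None
    return features


# Hypothetical session record, for illustration only.
example_session = {
    "capabilities": {"frameRate": {"min": 1, "max": 30}},
    "probes": [
        {"setting": "frameRate", "reported": 15, "actual": 15, "response_ms": 40.0},
        {"setting": "frameRate", "reported": 30, "actual": 25, "response_ms": 60.0},
    ],
}
example_features = extract_features(example_session)
```

Missing values are kept as `None` rather than imputed, on the assumption that a gradient-boosting model with native missing-value handling consumes the features downstream.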
Datasets
The authors collected a proprietary dataset of over 30,000 authentication sessions across several platforms (Android, iOS, Linux, MacIntel, Win32) and browsers (Chrome, Firefox), including both bona fide and attack sessions.
Model(s)
Histogram Gradient Boosting (HGB), Categorical Boosting (CatBoost), and an ensemble combining both.
Author countries
Kazakhstan