Honeyfile Camouflage: Hiding Fake Files in Plain Sight
Authors: Roelien C. Timmer, David Liebowitz, Surya Nepal, Salil S. Kanhere
Published: 2024-05-08 02:01:17+00:00
AI Summary
This paper addresses the challenge of camouflaging honeyfiles (fake files used for intrusion detection) within real file systems, focusing on how the honeyfiles are named. It develops two metrics that quantify filename camouflage using cosine distances in semantic vector spaces: one based on simple averaging and one based on clustering with mixture fitting. Both are evaluated on a publicly available GitHub software repository dataset.
Abstract
Honeyfiles are a particularly useful type of honeypot: fake files deployed to detect malicious behaviour and infer information from it. This paper considers the challenge of naming honeyfiles so they are camouflaged when placed amongst real files in a file system. Based on cosine distances in semantic vector spaces, we develop two metrics for filename camouflage: one based on simple averaging and one on clustering with mixture fitting. We evaluate and compare the metrics, showing that both perform well on a publicly available GitHub software repository dataset.
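To make the averaging idea concrete, here is a minimal sketch of a mean-cosine-distance score for a candidate filename. It is an illustration under stated assumptions, not the metric defined in the paper: the abstract does not give the exact formula, so the function names (`embed`, `cosine_distance`, `mean_distance_score`) are hypothetical, the "lower score means better camouflage" convention is assumed, and character trigram counts stand in for a real semantic embedding model.

```python
import math
from collections import Counter


def embed(filename, n=3):
    """Toy stand-in for a semantic embedding (an assumption, not the
    paper's vector space): counts of character n-grams in the name."""
    text = f"^{filename.lower()}$"  # mark the start and end of the name
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two sparse vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def mean_distance_score(candidate, real_filenames):
    """Average cosine distance from the candidate to every real filename.
    Under the convention assumed here, a lower score means the candidate
    sits closer to the existing names and so blends in better."""
    if not real_filenames:
        raise ValueError("need at least one real filename")
    cand_vec = embed(candidate)
    total = sum(cosine_distance(cand_vec, embed(name)) for name in real_filenames)
    return total / len(real_filenames)


# Hypothetical directory listing and two candidate honeyfile names.
real = ["test_utils.py", "test_models.py", "utils.py", "models.py"]
blending = mean_distance_score("test_views.py", real)
standing_out = mean_distance_score("passwords_backup.xlsx", real)
```

With this listing, `blending` comes out smaller than `standing_out`, because `test_views.py` shares trigrams with the existing names while `passwords_backup.xlsx` shares none. A plain average can be misleading when the real filenames fall into several distinct groups, which is presumably the motivation for the second, clustering-based metric mentioned in the abstract; that variant is not sketched here.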