Abstract:
This paper presents a lightweight heuristic framework for static analysis of Microsoft Word Office Open XML (OOXML) documents, aimed at early identification of potentially malicious files without executing their content. The proposed method analyses the internal structure of decompressed document containers to extract structural and textual indicators associated with suspicious behaviour. These indicators are aggregated using a weighted heuristic scoring model that estimates the overall risk of a document while accounting for correlations between multiple artifacts. The approach is designed to maintain low computational overhead, enabling deployment as a pre-execution inspection mechanism close to the end-user environment. Experimental evaluation demonstrates that the proposed method effectively differentiates between benign and suspicious files while maintaining high detection confidence.
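
As an illustration of the kind of pipeline summarised above, the following minimal sketch opens an OOXML container as a ZIP archive, collects a few indicators, and combines them into a weighted score with a bonus for co-occurring artifacts. The indicator names, weights, pair bonus, and function names (`extract_indicators`, `risk_score`) are assumptions made for this example only; they are not the indicators or parameters of the proposed method.

```python
from __future__ import annotations

import io
import zipfile

# Hypothetical indicators and weights, chosen only for illustration.
WEIGHTS = {
    "vba_project": 0.5,      # a vbaProject.bin part is present
    "external_target": 0.3,  # a relationship declares TargetMode="External"
    "dde_field": 0.4,        # a DDEAUTO field instruction appears in an XML part
}

# Hypothetical bonus added when two indicators occur together,
# standing in for the correlation between artifacts.
PAIR_BONUS = {frozenset({"vba_project", "external_target"}): 0.2}


def extract_indicators(data: bytes) -> set[str]:
    """Collect indicators from an OOXML container without executing its content."""
    found: set[str] = set()
    with zipfile.ZipFile(io.BytesIO(data)) as container:
        for name in container.namelist():
            if name.lower().endswith("vbaproject.bin"):
                found.add("vba_project")
            if name.endswith((".xml", ".rels")):
                text = container.read(name).decode("utf-8", errors="ignore")
                if name.endswith(".rels") and 'TargetMode="External"' in text:
                    found.add("external_target")
                if "DDEAUTO" in text:
                    found.add("dde_field")
    return found


def risk_score(indicators: set[str]) -> float:
    """Sum the indicator weights, add pair bonuses, and cap the result at 1.0."""
    score = sum((WEIGHTS[name] for name in indicators), 0.0)
    for pair, bonus in PAIR_BONUS.items():
        if pair <= indicators:  # both indicators of the pair are present
            score += bonus
    return min(score, 1.0)
```

Under these assumptions, a document would be inspected by calling `risk_score(extract_indicators(data))` on the raw file bytes and comparing the result with a chosen threshold.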
