Master's Thesis

Hardware Security – Automated Firmware Analysis of OT Devices for Known Vulnerabilities

University of Rostock, May 2025

Keywords: CVE-2015-4590, YARA Signatures, Ghidra, Reverse Engineering, Control Flow Graph, STM32 L476RG, ArduinoJson, ARM Cortex-M4, Graph Isomorphism

Overview

This thesis developed an automated approach to detect known vulnerabilities in OT firmware binaries without source code access. Using CVE-2015-4590 (a buffer overflow in ArduinoJson library versions < 4.5) as a case study, I demonstrated two detection methods:

  • Signature-based detection: Extract machine code patterns from vulnerable functions using Ghidra, create YARA rules for binary matching
  • Graph-based detection: Compare Control Flow Graph structures across different architectures to identify vulnerable code patterns

The goal: enable OT security teams to scan firmware for known CVEs before deployment, even when firmware updates are delayed or unavailable.

Why It Matters

  • OT firmware risks: Many embedded devices run outdated libraries; firmware updates are rare or never applied
  • Limited scanning options: Traditional antivirus/IDS solutions don't work well on compiled embedded firmware
  • Need for automation: Manual reverse engineering doesn't scale across hundreds of devices
  • Offline detection: Analyze firmware before deployment without touching production networks

Methodology

1. Vulnerability Selection & Reproduction

  • Selected CVE-2015-4590: Buffer overflow in ArduinoJson library (versions < 4.5), located in extractFrom() function
  • Compiled vulnerable firmware for STM32 L476RG (ARM Cortex-M4) using PlatformIO with ArduinoJson v4.4
  • Added code that exercises the extractFrom() path, ensuring the vulnerable function is present in the compiled binary

2. Reverse Engineering with Ghidra

  • Loaded compiled .bin firmware into Ghidra for disassembly
  • Located extractFrom() function in ARM assembly
  • Analyzed machine code to identify unique byte sequence marking the vulnerability (bne instruction with specific offset)
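
The bne lookup in the last step can be illustrated with a small scanner. This is a minimal sketch, not the thesis tooling: it assumes the branch uses the 16-bit Thumb encoding (upper byte 0xD1, lower byte a signed offset in halfwords), and the function name find_thumb_bne is hypothetical. A linear scan like this also flags data or the second half of 32-bit Thumb-2 instructions, so hits still need to be confirmed in the disassembler.

```python
import struct

def find_thumb_bne(code: bytes, base: int = 0):
    """Return (address, branch_target) pairs for 16-bit Thumb `bne` halfwords.

    Assumption: the 16-bit conditional branch encoding 1101 cond imm8 with
    cond = 0001 (NE), i.e. the little-endian halfword 0xD1ii.
    """
    hits = []
    for off in range(0, len(code) - 1, 2):      # Thumb code is halfword-aligned
        (halfword,) = struct.unpack_from("<H", code, off)
        if halfword >> 8 != 0xD1:
            continue
        imm8 = halfword & 0xFF
        if imm8 >= 0x80:                         # sign-extend the 8-bit offset
            imm8 -= 0x100
        address = base + off
        target = address + 4 + (imm8 << 1)       # PC reads as address + 4
        hits.append((address, target))
    return hits

# Toy input: nop, bne with imm8 = 5, nop
sample = bytes.fromhex("00bf05d100bf")
hits = find_thumb_bne(sample)
```

Passing the firmware load address as base (for example the flash base used when importing the .bin) yields absolute addresses that can be compared with the listing in Ghidra.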

3. YARA Signature Development

  • Extracted 8-byte machine code signature from vulnerable block
  • Created YARA rules with full function signature + optimized 8-byte pattern
  • Added function size (84 bytes) as additional matching criterion to reduce false positives
  • Tested against 30 vulnerable builds and 30 patched firmware images
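
The matching logic behind these steps can be sketched in plain Python. This is an illustration under stated assumptions rather than the thesis's YARA rules: the 8-byte pattern below is a placeholder (the real signature bytes are not given here), "??" stands for a wildcard byte as in YARA hex strings, and the function start and size are assumed to come from an external source such as a disassembler export.

```python
def parse_hex_pattern(text):
    """Turn 'DE AD ?? EF' into [0xDE, 0xAD, None, 0xEF]; None is a wildcard."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]

def scan(data, pattern):
    """Return every offset in data where the pattern matches."""
    last_start = len(data) - len(pattern)
    return [
        off
        for off in range(last_start + 1)
        if all(p is None or data[off + i] == p for i, p in enumerate(pattern))
    ]

def function_matches(firmware, start, size, pattern, expected_size=84):
    """Match only if the function has the expected size AND contains the pattern."""
    if size != expected_size:
        return False
    return bool(scan(firmware[start:start + size], pattern))

# Placeholder signature, NOT the real CVE-2015-4590 bytes.
PATTERN = parse_hex_pattern("DE AD ?? EF 01 02 03 04")

# Synthetic image: 10 bytes of padding, then an 84-byte "function".
function_body = bytes(20) + bytes.fromhex("dead42ef01020304") + bytes(56)
firmware = bytes(10) + function_body

offsets = scan(firmware, PATTERN)
size_ok = function_matches(firmware, 10, 84, PATTERN)
size_mismatch = function_matches(firmware, 10, 80, PATTERN)
```

Combining the short pattern with the size check mirrors the idea in the list above: the size criterion filters out chance matches of the byte pattern in unrelated code.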

4. Control Flow Graph Analysis

  • Generated CFGs for extractFrom() across STM32 L476RG (ARM) and ESP8266 (Xtensa LX106) architectures
  • Applied graph isomorphism testing to detect structural similarity despite different instruction sets
  • Validated: Vulnerable builds (v4.4) on both architectures are isomorphic; patched version (v4.5) is not
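
The isomorphism test can be illustrated with a brute-force check on small directed graphs. This is a sketch under assumptions: the source does not state which algorithm or library was used, the toy CFGs below are invented and are not the real extractFrom() graphs, and the permutation search is only practical for functions with a handful of basic blocks.

```python
from itertools import permutations

def degree_profile(nodes, edges):
    """Sorted list of (in-degree, out-degree) pairs, used as a cheap pre-filter."""
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    return sorted((indeg[n], outdeg[n]) for n in nodes)

def cfgs_isomorphic(edges_a, edges_b):
    """True if two directed graphs (given as edge lists) have the same shape.

    Nodes are taken from the edges, so isolated blocks are ignored.
    """
    set_a, set_b = set(edges_a), set(edges_b)
    nodes_a = sorted({n for edge in set_a for n in edge})
    nodes_b = sorted({n for edge in set_b for n in edge})
    if len(nodes_a) != len(nodes_b) or len(set_a) != len(set_b):
        return False
    if degree_profile(nodes_a, set_a) != degree_profile(nodes_b, set_b):
        return False
    # Try every relabeling of B's nodes; fine for small functions only.
    for perm in permutations(nodes_b):
        mapping = dict(zip(nodes_a, perm))
        if all((mapping[s], mapping[d]) in set_b for s, d in set_a):
            return True
    return False

# Invented toy CFGs: same shape, different block names per architecture.
arm_v44 = [("entry", "loop"), ("loop", "check"), ("check", "loop"), ("check", "exit")]
xtensa_v44 = [("b0", "b1"), ("b1", "b2"), ("b2", "b1"), ("b2", "b3")]
# Hypothetical patched shape with one extra early-exit edge.
arm_v45 = arm_v44 + [("loop", "exit")]

same_shape = cfgs_isomorphic(arm_v44, xtensa_v44)
patched_differs = not cfgs_isomorphic(arm_v44, arm_v45)
```

Because only the edge structure is compared, the check is independent of the instruction set, which is what makes it usable across ARM and Xtensa builds.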

Results

YARA Detection Accuracy

76.7% true positive rate (23 of 30 vulnerable builds detected) with 0% false positives (0 of 30 patched firmware images flagged)
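
The reported rates follow directly from the counts. As a quick worked check (the helper name detection_rates is hypothetical):

```python
def detection_rates(true_pos, vulnerable_total, false_pos, patched_total):
    """Return (true positive rate, false positive rate) as fractions."""
    return true_pos / vulnerable_total, false_pos / patched_total

# Counts from the evaluation: 23 of 30 vulnerable builds detected,
# 0 of 30 patched images flagged.
tpr, fpr = detection_rates(23, 30, 0, 30)
missed = 30 - 23  # vulnerable builds the signature did not match

summary = f"TPR {tpr:.1%}, FPR {fpr:.1%}, missed {missed}"
```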

Optimized Signature

Reduced the signature to 8 bytes, giving a collision probability of about 1 in 1.8×10¹⁹ (2⁶⁴ possible 8-byte values); the function size (84 bytes) serves as an additional matching criterion
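
The collision figure can be reproduced with a short calculation. The sketch assumes every byte is uniformly random and independent, which real firmware is not, so the result is a rough lower bound on chance matches rather than a measured rate. The 1 MiB image size is an illustrative assumption.

```python
SIGNATURE_LENGTH = 8                      # bytes in the optimized signature
keyspace = 256 ** SIGNATURE_LENGTH        # 2**64 possible 8-byte values

def expected_chance_matches(image_size, sig_len=SIGNATURE_LENGTH):
    """Expected number of random matches when sliding the signature over an image."""
    positions = max(image_size - sig_len + 1, 0)
    return positions / 256 ** sig_len

keyspace_text = f"{keyspace:.1e}"         # about 1.8e+19, matching the figure above
chance_1mib = expected_chance_matches(2 ** 20)  # assumed 1 MiB firmware image
```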

Cross-Architecture Detection

CFG-based graph isomorphism successfully identified vulnerable functions across ARM (STM32) and Xtensa (ESP8266) architectures

Detection Reliability Variance

GCC 12.x and 13.x: Accurate detection
GCC 14.x: Not accurate (compiler optimizations change the emitted machine code, so the byte signature no longer matches)

Tools & Technologies

Ghidra

NSA's reverse engineering framework for firmware disassembly, decompilation, and machine code analysis

YARA

Pattern matching tool for malware analysis, used to create byte-signature detection rules

STM32 L476RG & ESP8266

ARM Cortex-M4 and Xtensa LX106 microcontrollers used for cross-architecture validation

PlatformIO

IDE for embedded development, used to compile vulnerable and patched firmware builds

Key Contributions

  • Signature-based detection: Developed compact YARA rules that survive compiler variations (GCC 12.x, 13.x)
  • Graph-based detection: Demonstrated CFG isomorphism as a cross-architecture detection method
  • Automated approach: Proposed pipeline from CVE to architecture-specific signatures
  • Real-world validation: Tested against 30 vulnerable and 30 patched firmware builds with documented accuracy metrics

Future Work

The research proposes a fully automated YARA signature generation pipeline: CVE → Vulnerable Library → Architecture-Specific Signatures. This would enable security teams to automatically generate detection rules for new CVEs without manual reverse engineering, scaling vulnerability detection across thousands of OT devices.
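
The final stage of such a pipeline, turning extracted bytes into a rule, could look like the sketch below. It only renders YARA rule text from a byte signature. The rule name, the metadata fields, and the signature bytes are placeholders, and the earlier stages (building the vulnerable library and extracting the bytes) are assumed to happen elsewhere.

```python
def render_yara_rule(name, cve, arch, signature, wildcard_positions=()):
    """Render a YARA rule whose hex string is built from the signature bytes.

    Positions listed in wildcard_positions are emitted as '??'.
    """
    tokens = [
        "??" if i in wildcard_positions else f"{byte:02X}"
        for i, byte in enumerate(signature)
    ]
    hex_string = " ".join(tokens)
    return (
        f"rule {name}\n"
        "{\n"
        "    meta:\n"
        f'        cve = "{cve}"\n'
        f'        arch = "{arch}"\n'
        "    strings:\n"
        f"        $sig = {{ {hex_string} }}\n"
        "    condition:\n"
        "        $sig\n"
        "}\n"
    )

# Placeholder bytes, NOT the real signature from the thesis.
rule_text = render_yara_rule(
    name="CVE_2015_4590_cortex_m4",
    cve="CVE-2015-4590",
    arch="ARM Cortex-M4",
    signature=bytes.fromhex("dead42ef01020304"),
    wildcard_positions={2},
)
```

Generating one rule per architecture from the same CVE record keeps the per-architecture byte patterns separate while sharing the metadata.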