Secure Data Analytics in Apache Spark with Fine-grained Policy Enforcement and Isolated Execution

Byeongwook Kim

Network and Distributed System Security (NDSS) Symposium 2025 · Day 3 · Confidential Computing 2

Cloud-based Apache Spark has become a cornerstone of big data processing and collaborative analytics, but the convenience and scalability of cloud platforms come with significant security and privacy challenges. This talk, "Secure Data Analytics in Apache Spark with Fine-grained Policy Enforcement and Isolated Execution," introduces a novel architecture for cloud-based Spark that enables secure data analytics while adhering to stringent privacy regulations.

AI review

Competent systems security research that combines TEE-based confidential computing with a regex-based policy enforcement layer on Spark logical plans, making a reasonable contribution to the big data security space. The ideas are sound and the threat model is honest, but neither component is individually novel, and the combination does not add up to more than the sum of its parts.
