Differentially Private Streaming Metrics with Laplace Noise in Apache Flink

Authors

  • Sai Charan Ponnoju, Fidelity Investments, USA
  • Prabhu Muthusamy, Cognizant Technology Solutions, Canada
  • Chiranjeevi Devi, LinkedIn Corp, USA

Keywords:

differential privacy, Laplace mechanism, Apache Flink, streaming metrics, data utility, event-time processing

Abstract

This study explores integrating differential privacy based on the Laplace mechanism into high-throughput streaming data pipelines built on Apache Flink. Calibrated Laplace noise is added to per-user metrics, with the goal of keeping computation fast and statistical accuracy high across billion-record event streams. We analyze the trade-offs among utility, privacy, and computational cost. We develop an adaptive noise-scaling strategy that adjusts the noise used to achieve ε-differential privacy according to user behavior and time. Aggregation windows aligned with watermarks preserve both the privacy guarantee and event-time semantics, which reduces processing time and utility distortion. The strategy was evaluated on real telemetry in geo-distributed cloud clusters, where it performed as intended. The framework simplifies the computation of privacy-sensitive metrics while providing strong differential privacy guarantees and low relative error.
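As a rough illustration of the basic idea summarized above, the following minimal sketch applies the standard Laplace mechanism to per-window event counts. It is plain Python rather than Flink code, and it does not reproduce the paper's adaptive noise-scaling strategy. The names `laplace_noise`, `noisy_window_counts`, and `max_per_user` are hypothetical, as are the assumptions that windows are fixed-size tumbling windows over event time and that each user's contribution per window is capped so the count has a known sensitivity.

```python
import random
from collections import defaultdict


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_window_counts(events, window_ms, epsilon, max_per_user, rng=None):
    """Return {window_start_ms: noisy_count} for (user_id, event_time_ms) pairs.

    Assumption: each user contributes at most `max_per_user` events to a
    window, so the sensitivity of one window count is `max_per_user` and
    the Laplace scale is sensitivity / epsilon. Epsilon here covers one
    window only; releasing many windows composes the privacy cost.
    """
    rng = rng or random.Random()

    # Cap each user's contribution per tumbling event-time window.
    per_user = defaultdict(int)
    for user_id, ts in events:
        window_start = ts - ts % window_ms
        key = (window_start, user_id)
        if per_user[key] < max_per_user:
            per_user[key] += 1

    # Sum the capped contributions into one true count per window.
    counts = defaultdict(int)
    for (window_start, _user), n in per_user.items():
        counts[window_start] += n

    # Add calibrated Laplace noise to every window count.
    scale = max_per_user / epsilon
    return {w: c + laplace_noise(scale, rng) for w, c in sorted(counts.items())}


if __name__ == "__main__":
    sample = [("a", 100), ("a", 200), ("a", 300), ("b", 150), ("b", 1200)]
    print(noisy_window_counts(sample, window_ms=1000, epsilon=1.0, max_per_user=2))
```

Smaller values of `epsilon` give a larger noise scale and therefore stronger privacy at the cost of accuracy. In a streaming engine, the noise would be added once per window, at the point where the watermark closes that window.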

Published

03-05-2022

How to Cite

[1]
Sai Charan Ponnoju, Prabhu Muthusamy, and Chiranjeevi Devi, “Differentially Private Streaming Metrics with Laplace Noise in Apache Flink”, American J Auton Syst Robot Eng, vol. 2, pp. 417–451, May 2022, Accessed: Dec. 12, 2025. [Online]. Available: https://ajasre.org/index.php/publication/article/view/78