Network Digital Twins for High-Performance Computing

This talk outlines a holistic framework for network digital twins (NDTs) in high-performance computing, covering construction, optimization, and application. The approach integrates federated learning for privacy-preserving model updates and reinforcement learning for closed-loop control, enabling real-time adaptation to traffic and system dynamics.

Topics Covered:
HPC network digital twin construction: topology and telemetry ingestion, calibration
Distributed and regional twin orchestration
Federated learning for privacy-preserving NDT updates
Reinforcement learning for closed-loop HPC network control
Case studies: workload forecasting, congestion-aware routing, anomaly detection
Portability to edge and 6G-class environments

November 2025 · Z. Zhang

Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks

Abstract: Optimizing edge caching is crucial for the advancement of next-generation (nextG) wireless networks, ensuring high-speed and low-latency services for mobile users. Existing data-driven optimization approaches often lack awareness of the distribution of random data variables and focus solely on optimizing cache hit rates, neglecting potential reliability concerns, such as base station overload and unbalanced cache issues. This oversight can result in system crashes and degraded user experience. To bridge this gap, we introduce a novel digital twin-assisted optimization framework, called D-REC, which integrates reinforcement learning (RL) with diverse intervention modules to ensure reliable caching in nextG wireless networks. We first develop a joint vertical and horizontal twinning approach to efficiently create network digital twins, which are then employed by D-REC as RL optimizers and safeguards, providing ample datasets for training and predictive evaluation of our cache replacement policy. By incorporating reliability modules into a constrained Markov decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints, minimizing the risk of network failures. Theoretical analysis demonstrates comparable convergence rates between D-REC and vanilla data-driven methods without compromising caching performance. Extensive experiments validate that D-REC outperforms conventional approaches in cache hit rate and load balancing while effectively enforcing predetermined reliability intervention modules. ...

May 2024 · Z. Zhang, Y. Liu, Z. Peng, M. Chen, D. Xu, S. Cui