ScaDL 2023

Scope of the workshop

Deep Learning (DL) has recently received tremendous attention in the research community because of the impressive results obtained on a large number of machine learning problems. The success of state-of-the-art deep learning systems relies on training deep neural networks over massive amounts of training data, which typically requires a large-scale distributed computing infrastructure. Running these jobs in a scalable and efficient manner, on cloud infrastructure or dedicated HPC systems, has given rise to several research topics specific to DL. The sheer size and complexity of deep learning models, combined with large training sets, makes it hard for training to converge in a reasonable amount of time. Addressing this demands advances along multiple research directions, such as model/data parallelism, model/data compression, distributed optimization algorithms for DL convergence, synchronization strategies, efficient communication, federated learning, and specialized hardware acceleration. Distributed DL becomes even more challenging when one considers additional trustworthiness desiderata such as privacy, adversarial robustness, and fairness.

Areas of interest

In this workshop, we solicit research papers on distributed deep learning that aim to achieve efficiency and scalability for deep learning jobs over distributed and parallel systems. Papers focusing on algorithms, on systems, or on both are welcome. We invite authors to submit papers on topics including but not limited to:

Deep learning on cloud platforms, HPC systems, and edge devices
Model-parallel and data-parallel techniques
Asynchronous SGD for training DNNs
Communication-efficient training of DNNs
Scalable and distributed graph neural networks, and sampling techniques for graph neural networks
Federated deep learning, both horizontal and vertical, and its challenges
Model/data/gradient compression
Learning in resource-constrained environments
Coding techniques for straggler mitigation
Elasticity for deep learning jobs/spot market enablement
Hyper-parameter tuning for deep learning jobs
Hardware acceleration for deep learning, including digital and analog accelerators
Scalability of deep learning jobs on large clusters
Deep learning on heterogeneous infrastructure
Efficient and scalable inference
Data storage/access in shared networks for deep learning
Communication-efficient distributed fair and adversarially robust learning
Distributed learning techniques applied to speed up neural architecture search

ScaDL Research Directions

ScaDL seeks to advance the following research directions:

Asynchronous and Communication-Efficient SGD: Stochastic gradient descent is at the core of large-scale machine learning. Parallelizing SGD gradient computation across multiple nodes increases the data processed per iteration, but exposes SGD to communication and synchronization delays and to unpredictable node failures in the system. Thus, there is a critical need to design robust and scalable distributed SGD methods that achieve fast error convergence in spite of such system variability.
High performance computing aspects: Deep learning is highly compute intensive. Algorithms for kernel computations on commonly used accelerators (e.g., GPUs), together with efficient techniques for communicating gradients and loading data from storage, are critical for training performance.
Model and Gradient Compression Techniques: Techniques such as pruning weights and reducing the size of weight tensors help reduce compute complexity. Lower-bit representations (quantization) and sparsification allow more efficient use of memory and communication bandwidth.
Distributed Trustworthy AI: New techniques are needed to meet the goal of global trustworthiness (e.g., fairness and adversarial robustness) efficiently in a distributed DL setting.
Emerging AI Hardware Accelerators: With the proliferation of new hardware accelerators for AI, such as in-memory computing (Analog AI) and neuromorphic computing, novel methods and algorithms need to be introduced to adapt to the underlying properties of the new hardware (for example, the non-idealities of phase-change memory (PCM) and its cycle-to-cycle statistical variations).
The intersection of Distributed DL and Neural Architecture Search (NAS): NAS is increasingly being used to automate the synthesis of neural networks. However, given the huge computational demands of NAS, distributed DL is critical to make NAS computationally tractable (e.g., differentiable distributed NAS).
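
As an illustration of the model and gradient compression direction above, the following is a minimal sketch of two commonly used ideas: top-k sparsification (send only the largest-magnitude gradient entries) followed by uniform 8-bit quantization of the kept values. It is not taken from any ScaDL paper or from a specific library; the function names, the choice of k, and the single shared scale are assumptions made for illustration only.

```python
import random


def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries; return (indices, values)."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    kept = sorted(order[:k])
    return kept, [grad[i] for i in kept]


def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale (assumed scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def decompress(indices, codes, scale, size):
    """Rebuild a dense gradient; entries that were dropped become zero."""
    dense = [0.0] * size
    for i, code in zip(indices, codes):
        dense[i] = code * scale
    return dense


# Hypothetical gradient of 1000 entries, standing in for one worker's update.
random.seed(0)
grad = [random.gauss(0.0, 1.0) for _ in range(1000)]

indices, values = top_k_sparsify(grad, k=10)
codes, scale = quantize_int8(values)
restored = decompress(indices, codes, scale, len(grad))
```

In this sketch a worker would transmit 10 indices, 10 small integers, and one scale instead of 1000 floats. Practical schemes usually also accumulate the dropped residual locally and add it to the next step's gradient, which is omitted here for brevity.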

The intersection of distributed/parallel computing and deep learning is becoming critical, and the topics above demand focused attention that broader forums may not be able to provide. The aim of this workshop is to foster collaboration among researchers from the distributed/parallel computing and deep learning communities, and to give them a venue to share open problems and results from current approaches at the intersection of these areas.

AI-SPRINT at ScaDL 2023

AI-SPRINT project representatives are joining ScaDL 2023 as a General Chair and as a member of the Steering Committee:

Danilo Ardagna, Associate Professor at Politecnico di Milano and AI-SPRINT Scientific Coordinator, is a member of the Steering Committee
Daniele Lezzi, Senior Researcher at Barcelona Supercomputing Center and member of the technical team in AI-SPRINT, is one of the General Chairs


Read more about the conference on the official website