Performance Models support the AI-SPRINT design and runtime components in selecting an appropriate configuration to:

  1. avoid application performance violations,
  2. avoid under- or over-estimation of continuum resource utilisation, and
  3. predict the execution time of Deep Learning components on a target configuration.

The best regression model is built for each task and then used to predict the execution time of inference components/pipelines or training jobs. These predictions support the selection of the most appropriate system configuration for executing them, fulfilling QoS requirements while minimising operational costs.
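The per-task model selection described above can be illustrated with a minimal sketch. This is not the actual a-MLLibrary API; it assumes hypothetical profiling data (features such as core count, GPU memory, and batch size against measured execution times) and uses scikit-learn to pick, by cross-validated error, the regression model that best predicts execution time for a new configuration.

```python
# Illustrative sketch only; the a-MLLibrary implementation differs.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical profiling data: features = (cores, GPU memory GB, batch size),
# target = measured execution time in seconds.
X = np.array([[2, 4, 32], [4, 8, 32], [8, 16, 64], [16, 16, 64],
              [4, 4, 16], [8, 8, 32], [2, 8, 16], [16, 8, 64]])
y = np.array([120.0, 70.0, 40.0, 25.0, 95.0, 45.0, 110.0, 30.0])

# Candidate regression models; the best one is kept for this task.
candidates = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Select the model with the lowest cross-validated mean absolute error.
best_name, best_model = min(
    candidates.items(),
    key=lambda kv: -cross_val_score(
        kv[1], X, y, cv=4, scoring="neg_mean_absolute_error"
    ).mean(),
)
best_model.fit(X, y)

# Predict the execution time of the task on an unseen target configuration.
predicted_time = best_model.predict(np.array([[8, 16, 32]]))[0]
```

In the real library, the model family and hyperparameters are chosen per task, so different components of the same application may end up with different best regressors.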


Open source/proprietary

The a-MLLibrary software is available at https://github.com/a-MLLibrary/a-MLLibrary and https://gitlab.polimi.it/ai-sprint/a-mllibrary, while the performance profiling tool a-GPUBench (for profiling TensorFlow training jobs) is available at https://gitlab.polimi.it/ai-sprint/a-gpubench. Both are licensed under the Apache License, Version 2.0.



The tool is a separate component. The current implementation is written in Python and relies on standard Python libraries. The tool is used by the SPACE4AI-D, SPACE4AI-R, AI models architecture search, and GPU scheduler components to estimate the execution time of inference or training tasks under different deployments.
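How a consumer such as SPACE4AI-D might use these estimates can be sketched as follows. This is a hedged illustration, not the actual SPACE4AI-D logic: it assumes a hypothetical list of candidate deployments with predicted execution times (as produced by the performance models) and hourly costs, and picks the cheapest one that meets a QoS deadline.

```python
# Hypothetical candidate deployments: (name, predicted_time_s, cost_per_hour).
# The predicted times stand in for the performance models' output.
candidates = [
    ("2-core VM", 120.0, 0.10),
    ("8-core VM", 40.0, 0.40),
    ("GPU node", 12.0, 1.20),
]

def cheapest_feasible(candidates, deadline_s):
    """Return the lowest-cost deployment whose predicted execution time
    meets the QoS deadline, or None if no candidate is feasible."""
    feasible = [c for c in candidates if c[1] <= deadline_s]
    return min(feasible, key=lambda c: c[2]) if feasible else None

choice = cheapest_feasible(candidates, deadline_s=60.0)
# With a 60 s deadline, the 8-core VM is the cheapest feasible option.
```

The real design- and runtime-time components solve a richer optimisation problem, but the principle is the same: predicted execution times prune infeasible configurations, and cost is minimised over what remains.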