Developing artificial intelligence in the cloud: the AI_INFN Platform
Abstract
The INFN CSN5-funded project AI_INFN (“artificial intelligence at INFN”) aims to promote the adoption of ML and AI within INFN by providing comprehensive support, including state-of-the-art hardware and cloud-native solutions within INFN Cloud. This facilitates efficient sharing of hardware accelerators without hindering the institute’s diverse research activities. AI_INFN is moving from a virtual-machine-based model to a flexible Kubernetes-based platform that offers JWT-based authentication, a multitenant JupyterHub interface, a distributed file system, customizable conda environments, and specialized monitoring and accounting systems. The platform also supports virtual nodes in the cluster, which offload computing payloads to remote resources through the Virtual Kubelet technology, with InterLink as the provider. This setup can manage workflows across various providers and hardware types, which is crucial for scientific use cases that require dedicated infrastructures for different parts of the workload. The results of initial tests to validate its production applicability are presented, together with emerging case studies and integration scenarios.

