Spinning up an engine and using it in RZT aiOS
Modified on: 2020-12-28 12:29:08 +0530
An engine is required to run pipelines that need a large amount of computing resources. Although pipelines and blocks can be executed directly in a Jupyter notebook kernel, real-world machine learning and deep learning tasks need scalable, distributed computing infrastructure such as Spark and Horovod (for distributed GPU training). An admin user can add and manage computing resources from the "Infrastructure" management portal.
Role for performing this task
- Log in to RZT aiOS as an Admin, and click the Settings icon in the lower-left corner.
The Settings dialog is displayed.
- Click the "INFRASTRUCTURE" option on the left panel. A list of all provisioned engines and their current status is displayed. To add a new engine, click the icon.
- The Provision Engine dialog is displayed. Enter a name and description for the engine. Based on your computing requirements, select a machine type from the left panel. The cost and capacity (CPU cores, RAM, and GPU) of the selected machine are displayed on the right panel. Selecting the "Experiment mode" checkbox allows the infrastructure to be shared across multiple jobs instead of allocating a dedicated resource to each job. Select the required option and click "PROVISION".
- The engine is added to the list and its status is shown as "Provisioning".
- Once provisioning is complete, the status changes to "Running" and the newly provisioned engine is available to run your pipelines from the UI and from Jupyter notebooks.
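If you drive pipelines from a notebook, you may want to wait for the engine to leave the "Provisioning" state before submitting work. The following is only a minimal sketch of that wait loop. This article does not describe how to query engine status programmatically, so `get_status` is a hypothetical callable that you would have to supply yourself, and `fake_get_status` is a stand-in used purely for illustration. Only the status strings "Provisioning" and "Running" come from the steps above.

```python
import time
from typing import Callable


def wait_until_running(get_status: Callable[[], str],
                       poll_seconds: float = 5.0,
                       timeout_seconds: float = 600.0) -> bool:
    """Poll a status source until it reports "Running" or the timeout expires.

    get_status is a hypothetical callable (not an RZT aiOS API) that returns
    the engine status as a string, e.g. "Provisioning" or "Running".
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if get_status() == "Running":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)


# Stand-in status source for illustration: two "Provisioning" reads, then "Running".
_fake_statuses = iter(["Provisioning", "Provisioning", "Running"])


def fake_get_status() -> str:
    return next(_fake_statuses)


ready = wait_until_running(fake_get_status, poll_seconds=0.01, timeout_seconds=1.0)
print("Engine ready:", ready)
```

The timeout keeps a notebook cell from blocking indefinitely if provisioning stalls; the default values here are arbitrary and should be tuned to how long your machine type usually takes to provision.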