As a Razorthink Platform User or Admin, you want to access an Amazon S3 (Simple Storage Service) drive for storing and retrieving data. A preconfigured data source can be accessed from a Jupyter notebook and from Block code.
Roles for performing this task
- Log in to Razorthink Platform as a User or Admin. Click the Settings icon in the bottom-left corner.
- Click on Data Source Credentials. The list of all configured data sources is displayed. Click the Add Data Source Credential icon.
- The Add New Configuration dialogue box is displayed. Select S3 from the Configuration dropdown. Enter the data source name, access key, and secret key in the appropriate fields and click on the Create button.
- The data source credential configuration is saved and can be used to create a data source inside a project.
- Open the project in which you want to add the data source and click on the data icon. The list of all configured data sources is displayed. Click on the Add Datasource icon.
- In the Add New Datasource dialogue box, select S3 as the type. Enter all other details and click on the ADD button.
- The data source is created and can be accessed from a Jupyter notebook and from Block code.
Using the connected Datasource
- Open a Pipeline in the Pipeline Builder, and add the Block called 'S3CSVReader'. The Block inputs are:
- 'data_source_name' - the name of the data source you just created
- 'bucket_name' - the name of the S3 bucket you are using
- 'data_path' - path to your CSV file in that bucket
The Block in the Pipeline canvas will look something like this:
That's all you need to do to use a file from S3! Now you can just go ahead and run the Pipeline with your other Blocks.
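To see how the 'bucket_name' and 'data_path' inputs relate to S3 itself, here is a minimal sketch of reading a CSV object by bucket and key. It is not the S3CSVReader implementation, which this page does not show. The function name `read_csv_from_s3` and the `FakeS3Client` class are hypothetical names introduced for illustration. The function expects a client that offers `get_object(Bucket=..., Key=...)` the way a boto3 S3 client does; the in-memory fake stands in for it so the sketch runs without credentials or network access.

```python
import csv
import io


def read_csv_from_s3(s3_client, bucket_name, data_path):
    """Fetch a CSV object from a bucket and return its rows as a list of dicts.

    s3_client is assumed to behave like a boto3 S3 client: get_object()
    returns a dict whose "Body" entry is a readable byte stream.
    """
    response = s3_client.get_object(Bucket=bucket_name, Key=data_path)
    text = response["Body"].read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


class FakeS3Client:
    """Hypothetical in-memory stand-in for an S3 client (illustration only)."""

    def __init__(self, objects):
        # Maps (bucket, key) pairs to the raw bytes stored at that location.
        self._objects = objects

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


# Example bucket and path are made up for this sketch.
fake = FakeS3Client({("my-bucket", "data/sample.csv"): b"id,name\n1,alice\n2,bob\n"})
rows = read_csv_from_s3(fake, "my-bucket", "data/sample.csv")
print(rows)
```

In a notebook with real credentials, a boto3 client could be passed in place of the fake. Inside a Pipeline, the Block takes the data source name instead, so the access key and secret key stay in the saved credential configuration.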