Environment Preparation
Before we start extracting data from SAP, we need to create a repository to hold the extracted data.
In this example, we will use Azure Data Factory to extract the data and store it in the Blob Storage of a data lake.
This section walks through the steps required to prepare that environment:
- Log on to the Azure Portal and look for Storage Accounts
- Create a new Storage Account and provide the required information. Select the same region where SAP was deployed:
  - Resource Group: SAP_ADF (create a new resource group)
  - Storage Account Name: sapadfXXXX (must be unique across Azure; replace XXXX with any four digits)
  - Region: East US
  - Redundancy: LRS
- On the newly created storage account, select Containers on the left pane and click on + Container
- Name it cntsapadf and accept the default access level of Private, since we don’t want to expose business data to the internet.
With these four steps, we have created a repository to store the data.
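For readers who prefer scripting over the portal, the storage steps above can be sketched with the Azure CLI. The snippet below is a minimal sketch, not the method used in this walkthrough: it only checks the names against Azure's naming rules and prints the equivalent `az` commands without running them. The `1234` suffix and the helper names (`storage_commands` and the two validators) are illustrative assumptions.

```python
import re
import shlex

# Values taken from the steps above. "1234" is a placeholder for your own
# four digits; "eastus" is the CLI name of the East US region.
RESOURCE_GROUP = "SAP_ADF"
STORAGE_ACCOUNT = "sapadf1234"
CONTAINER = "cntsapadf"
LOCATION = "eastus"


def is_valid_storage_account_name(name):
    """Storage account names: 3-24 characters, lowercase letters and digits only."""
    return re.fullmatch(r"[a-z0-9]{3,24}", name) is not None


def is_valid_container_name(name):
    """Container names: 3-63 characters of lowercase letters, digits and
    single hyphens, starting and ending with a letter or digit."""
    return re.fullmatch(r"(?=.{3,63}$)[a-z0-9]+(-[a-z0-9]+)*", name) is not None


def storage_commands(group, account, container, location):
    """Return the az commands (as argument lists) that mirror the portal steps."""
    if not is_valid_storage_account_name(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    if not is_valid_container_name(container):
        raise ValueError(f"invalid container name: {container!r}")
    return [
        ["az", "group", "create", "--name", group, "--location", location],
        ["az", "storage", "account", "create", "--name", account,
         "--resource-group", group, "--location", location,
         "--sku", "Standard_LRS"],  # LRS redundancy, as selected in the portal
        ["az", "storage", "container", "create", "--name", container,
         "--account-name", account,
         "--public-access", "off"],  # private container, no anonymous access
    ]


if __name__ == "__main__":
    for cmd in storage_commands(RESOURCE_GROUP, STORAGE_ACCOUNT, CONTAINER, LOCATION):
        print(shlex.join(cmd))
```

The printed commands can be pasted into a shell where you are already logged in with `az login`. Validating the names first catches the most common failure, such as uppercase letters or underscores in the storage account name, before anything is sent to Azure.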
Let’s create the Azure Data Factory instance now:
- Log on to the Azure Portal and look for Data factories
- Create a new Data factory and provide the required information. Select the same region where SAP was deployed:
  - Resource Group: SAP_ADF
  - Name: sapadfXXXX (replace XXXX with any four digits)
  - Region: East US
- Click Next or select the Git configuration tab and check Configure Git later, since we will not be storing our pipelines in a repository in this example. Then click Create.
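The Data Factory instance can also be created from the command line. The sketch below assumes the Azure CLI `datafactory` extension is installed (`az extension add --name datafactory`). It validates the factory name against the Data Factory naming rules and prints the equivalent command. The `1234` suffix and the helper names are illustrative assumptions. No Git options are passed, which is meant to correspond to choosing Configure Git later in the portal.

```python
import re
import shlex


def is_valid_factory_name(name):
    """Data factory names: 3-63 characters of letters, digits and single
    hyphens, starting and ending with a letter or digit."""
    return re.fullmatch(r"(?=.{3,63}$)[A-Za-z0-9]+(-[A-Za-z0-9]+)*", name) is not None


def factory_command(group, factory, location):
    """Return the az command (as an argument list) that creates the factory."""
    if not is_valid_factory_name(factory):
        raise ValueError(f"invalid data factory name: {factory!r}")
    return ["az", "datafactory", "create",
            "--resource-group", group,
            "--name", factory,
            "--location", location]


if __name__ == "__main__":
    # "sapadf1234" stands in for sapadfXXXX; "eastus" is the CLI name of East US.
    print(shlex.join(factory_command("SAP_ADF", "sapadf1234", "eastus")))
```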
Now we will configure the Integration Runtime on our Bastion Host to have access to the SAP data:
- On the Data Factory overview page, click on Open Azure Data Factory Studio
- In the new tab that opens, create a New Pipeline
- Name it adfsap_pipeline in the pane that opens on the right.
- Click on the Toolbox icon on the left pane, then click on Integration runtimes and click on New. Select the Self-Hosted option
- On the Integration runtime setup step, select Self-Hosted again
- Name it SAPIntegrationRuntime and click Create
- Since we are not performing these actions on the Bastion Host itself, we will choose Option 2: Manual Setup.
- Copy the Key 1 authentication key to a scratch area; it will be used during the runtime installation.
- Right-click and copy the URL from Step 1 (we will use it to download the software on the Bastion Host)
- Connect to the Bastion Host via RDP and download the latest version, using the URL copied earlier.
- Install the downloaded software.
- During setup, paste the authentication key copied earlier and click Register.
- For this example, select Enable remote access from intranet and click Next.
- The Integration Runtime will contact the Data Factory using the authentication key provided and register itself. Click on Launch Configuration Manager and confirm that everything is OK.
- With the Bastion Host set up, go back to the Azure Portal and make sure the Integration Runtime is showing up in the Data Factory. Click the Refresh button and it should show the newly registered SAPIntegrationRuntime.
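The portal side of the Integration Runtime setup (create the runtime, fetch the authentication key, confirm registration) can also be scripted. The following is a minimal sketch, assuming the Azure CLI `datafactory` extension is installed. It builds the three commands and includes a small helper that reads the JSON returned by `get-status`. The factory name `sapadf1234`, the helper names, and the exact shape of the sample status payload are illustrative assumptions.

```python
import json
import shlex


def runtime_commands(group, factory, runtime):
    """Return the az commands (as argument lists) for the self-hosted runtime."""
    base = ["az", "datafactory", "integration-runtime"]
    common = ["--resource-group", group,
              "--factory-name", factory,
              "--name", runtime]
    return {
        # Creates the self-hosted runtime definition in the factory.
        "create": base + ["self-hosted", "create"] + common,
        # Returns the authentication keys to paste into the installer.
        "auth_key": base + ["list-auth-key"] + common,
        # Reports the runtime state once the node has registered.
        "status": base + ["get-status"] + common,
    }


def is_online(status_json):
    """Return True if the get-status output reports the runtime as Online.

    Assumes the output carries the state under properties.state.
    """
    status = json.loads(status_json)
    return status.get("properties", {}).get("state") == "Online"


if __name__ == "__main__":
    commands = runtime_commands("SAP_ADF", "sapadf1234", "SAPIntegrationRuntime")
    for cmd in commands.values():
        print(shlex.join(cmd))
```

After registering the node on the Bastion Host, the output of the `status` command can be passed to `is_online` as a scripted alternative to clicking Refresh in the portal.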
Now, finally, we are ready to start extracting data from SAP!
The next two steps are similar to each other and are divided by data provider.