In this section, we’ll use a dev-endpoint in AWS Glue to query data in S3 that was exported from QLDB. Since we’re querying the data stored in S3, our queries will not impose any load on our QLDB ledger or interfere with ongoing transactions.
Glue dev endpoints can incur significant cost if left running. Be sure to disable or delete the endpoint as soon as development work is complete.
We’ll create all of our Glue components with CloudFormation in your desired region.
This template will be used to create the following items:
|Region|Launch CloudFormation Template|
|US East (Virginia)|Launch Stack in us-east-1|
|US East (Ohio)|Launch Stack in us-east-2|
|US West (Oregon)|Launch Stack in us-west-2|
|Europe (Frankfurt)|Launch Stack in eu-central-1|
Alternatively, download the template file to your local workstation and create a CloudFormation stack by uploading it.
You will see the Quick create stack page as shown below. In the Stack name block, leave the Stack name as
Leave the Parameters block as-is; no inputs are needed.
Check the “I acknowledge that CloudFormation might create IAM resources.” box and click
The stack will take several minutes to create. When creation finishes, its status changes to CREATE_COMPLETE.
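If you would rather watch the stack from a script than refresh the console, the status check can be sketched roughly as follows. This is a minimal sketch, not part of the lab: the helper name `wait_for_stack` and its polling parameters are hypothetical, and the stack name is whatever you used when launching the template. It relies only on the CloudFormation `describe_stacks` call, which reports a `StackStatus` for each stack.

```python
import time


def wait_for_stack(cfn_client, stack_name, poll_seconds=15, max_polls=80):
    """Poll CloudFormation until the stack is no longer CREATE_IN_PROGRESS.

    cfn_client is assumed to behave like boto3.client("cloudformation").
    Returns the first status other than CREATE_IN_PROGRESS, for example
    CREATE_COMPLETE on success.
    """
    for _ in range(max_polls):
        stacks = cfn_client.describe_stacks(StackName=stack_name)["Stacks"]
        status = stacks[0]["StackStatus"]
        if status != "CREATE_IN_PROGRESS":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"stack {stack_name!r} still creating after {max_polls} polls")
```

With boto3 installed and credentials configured, this could be called as `wait_for_stack(boto3.client("cloudformation", region_name="us-east-1"), "<your stack name>")`. boto3 also ships a built-in `stack_create_complete` waiter that serves the same purpose.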
Return to the main page of the AWS console by clicking the AWS logo in the upper-left corner of any console page. Go to the Glue page by typing Glue in the Find Services box or by clicking AWS Glue under the Analytics section of All Services.
On the left of the console, select
You will see an Endpoint name of QLDBLabEndpoint and it should have the Provisioning status of
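The endpoint's status can also be read programmatically. The following is a minimal sketch, assuming a client that behaves like `boto3.client("glue")`; the helper name `dev_endpoint_status` is hypothetical. It uses the Glue `get_dev_endpoint` call and returns the `Status` field unchanged, without assuming which status strings the service reports.

```python
def dev_endpoint_status(glue_client, endpoint_name="QLDBLabEndpoint"):
    """Return the status string Glue reports for a dev endpoint.

    glue_client is assumed to behave like boto3.client("glue").
    The default endpoint name is the one this lab's stack creates.
    """
    response = glue_client.get_dev_endpoint(EndpointName=endpoint_name)
    return response["DevEndpoint"]["Status"]
```

With boto3 installed and credentials configured, `dev_endpoint_status(boto3.client("glue", region_name="us-east-1"))` would return the current status for the region you deployed to.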
Now, let’s make sure the notebook is ready. On the left of the console, click Notebooks under Dev endpoints.
Make sure SageMaker notebooks is selected and the “Status” is Ready for the Notebook name of
Now check the box next to the Notebook name and then click
OK on the popup.
You should see your Jupyter Notebook as shown below. If you want to learn more about using PySpark, you can go through the examples in the Glue Examples directory.
Let’s add a notebook specific to QLDB exports. Download the notebook file linked below; you might need to hold down the option key on your keyboard to download it.
Once downloaded, click Upload on the top right of the main Jupyter page. Select the qldb-id-lab3.ipynb file you just downloaded and then click Upload again. Once uploaded, click on the file name to open the notebook.
You should see the notebook as shown below. Work through the notebook to complete this lab. Because long-running Glue endpoints are costly, please clean up this lab once you are done.
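To double-check that no dev endpoint is left running after cleanup, a short script can list what remains in the region. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the client is assumed to behave like `boto3.client("glue")`, and the lab's own cleanup steps take precedence. Because the endpoint was created by the CloudFormation stack, deleting that stack is the more usual route; `delete_dev_endpoint` is included only as a fallback for an endpoint that lingers.

```python
def list_dev_endpoint_names(glue_client):
    """Return the names of all Glue dev endpoints in the client's region."""
    names = []
    kwargs = {}
    while True:
        page = glue_client.get_dev_endpoints(**kwargs)
        names.extend(ep["EndpointName"] for ep in page.get("DevEndpoints", []))
        token = page.get("NextToken")
        if not token:
            return names
        # More results remain; request the next page.
        kwargs["NextToken"] = token


def delete_dev_endpoint(glue_client, endpoint_name="QLDBLabEndpoint"):
    """Delete a single dev endpoint by name (fallback cleanup only)."""
    glue_client.delete_dev_endpoint(EndpointName=endpoint_name)
```

After cleanup, `list_dev_endpoint_names(boto3.client("glue"))` should no longer include QLDBLabEndpoint.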