QLDB Streams

A QLDB stream captures every document revision that is committed to your journal and delivers this data to Amazon Kinesis Data Streams in near-real time. It is a continuous flow of data from your ledger’s journal to a Kinesis data stream resource.

QLDB streams are useful in many scenarios, such as event-driven architectures, real-time analytics, and historical analytics. This lab covers another use case: replication to purpose-built databases.

This section refers to both QLDB streams and Kinesis streams. The two are different: a QLDB stream delivers every data record it produces to a Kinesis stream.


Configure the QLDB stream

From the QLDB service page, click Streams in the Amazon QLDB menu on the left.

Take note of the top level streams description.

The QLDB stream will continuously write data to an Amazon Kinesis data stream.

So, in addition to creating a QLDB stream, a Kinesis stream will also need to be created in this step.

Click on Create QLDB stream.

In the Stream information block, give the stream a unique name, such as StreamingLab-QLDBStream.

In the Source data block, select the QLDB ledger that was created in the last section. If the Start date and time (UTC) field shows the current time, leave it as the default. If not, enter the current UTC time.

The current UTC time can be found at the following link: timeanddate.com/worldclock/timezone/utc
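The current UTC time can also be printed locally instead of looking it up online. The following is a minimal sketch using only the Python standard library. The helper name current_utc_string and the output format are assumptions chosen for readability, not something the console requires.

```python
from datetime import datetime, timezone


def current_utc_string(now=None):
    """Return the given (or current) time converted to UTC as 'YYYY-MM-DD HH:MM:SS'."""
    # An optional timezone-aware datetime can be passed in, which makes testing easy.
    if now is None:
        now = datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(current_utc_string())
```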

Now, in the Write data to Kinesis Data Streams block, click on Create Kinesis stream.


Kinesis streams

A new tab should have opened in your browser, taking you to the Create data stream page for Kinesis.

Now, in the Data stream configuration block, enter a unique Data stream name, such as StreamingLab-KinesisStream.

Copy or write down the Kinesis data stream name. It will be used throughout the rest of the lab.

In the Data stream capacity block, enter 50 for the Number of open shards. Fifty shards is overprovisioned for this workshop scenario, but it is important to always test QLDB streams in a multi-shard scenario. This will also increase the number of concurrent connections to our Aurora Serverless cluster.

A single-shard Kinesis stream will not scale. It is best practice to use a multi-shard Kinesis stream.

Click on Create data stream.
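The same Kinesis stream could also be created programmatically rather than through the console. The following is a minimal sketch, assuming boto3 is installed and AWS credentials and a default region are configured. The helper names build_create_stream_request and create_kinesis_stream are hypothetical. The stream name and shard count are the values used in this lab.

```python
def build_create_stream_request(stream_name, shard_count=50):
    """Build the keyword arguments for the Kinesis CreateStream call."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return {"StreamName": stream_name, "ShardCount": shard_count}


def create_kinesis_stream(stream_name, shard_count=50):
    """Create the stream, wait until it is active, and return its ARN.

    Needs boto3 and AWS credentials; not exercised by the builder above.
    """
    import boto3  # imported lazily so the builder works without boto3

    kinesis = boto3.client("kinesis")
    kinesis.create_stream(**build_create_stream_request(stream_name, shard_count))
    # Poll until the stream has finished creating.
    kinesis.get_waiter("stream_exists").wait(StreamName=stream_name)
    summary = kinesis.describe_stream_summary(StreamName=stream_name)
    return summary["StreamDescriptionSummary"]["StreamARN"]


if __name__ == "__main__":
    print(build_create_stream_request("StreamingLab-KinesisStream"))
```

The returned ARN is what a QLDB stream needs as its destination, so it is worth keeping alongside the stream name.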

Back to QLDB streams

The Kinesis stream is created, but we still need to finish creating the QLDB stream. Head back to the open tab where the QLDB stream is being configured.

In the Write data to Kinesis Data Streams block, click on Browse and select the Kinesis stream you just created. It might take a few minutes for the new Kinesis stream to register, and you might need to use the refresh button until it appears.

For convenience, the necessary IAM role for QLDB streams was created by the RDS CloudFormation template used in the first section. Click on the Choose an IAM role dropdown and select the role named QLDBStreamRole-${AWS::Region}.

Now, click on Create QLDB stream.
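For reference, the console steps above correspond roughly to a single API call. The following is a minimal sketch using the boto3 qldb client's stream_journal_to_kinesis operation, assuming boto3 is installed and credentials are configured. The helper names are hypothetical, and the ledger name, role ARN, and Kinesis stream ARN are placeholders that must be replaced with the values from your own account.

```python
from datetime import datetime, timezone


def build_qldb_stream_request(ledger_name, role_arn, kinesis_stream_arn,
                              stream_name="StreamingLab-QLDBStream",
                              start_time=None):
    """Build the keyword arguments for the QLDB StreamJournalToKinesis call."""
    # Default to the current UTC time, mirroring the console's start time field.
    if start_time is None:
        start_time = datetime.now(timezone.utc)
    return {
        "LedgerName": ledger_name,
        "RoleArn": role_arn,
        "InclusiveStartTime": start_time,
        "KinesisConfiguration": {"StreamArn": kinesis_stream_arn},
        "StreamName": stream_name,
    }


def create_qldb_stream(ledger_name, role_arn, kinesis_stream_arn):
    """Create the QLDB stream and return its stream ID (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    qldb = boto3.client("qldb")
    response = qldb.stream_journal_to_kinesis(
        **build_qldb_stream_request(ledger_name, role_arn, kinesis_stream_arn)
    )
    return response["StreamId"]
```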

Control, Block Summary, and Revision Details

An Amazon QLDB stream writes three types of data records to a given Amazon Kinesis Data Streams resource: control, block summary, and revision details. All three record types are written in the binary representation of the Amazon Ion format.

Control records indicate the start and completion of your QLDB streams. Whenever a revision is committed to your journal, a QLDB stream writes all of the associated journal block data in block summary and revision details records.

In this lab, the focus is on REVISION_DETAILS records, because the goal is to replicate data from QLDB to Amazon Aurora in specific projections or views.

A revision details record represents a document revision that is committed to your journal. The payload contains all of the attributes from the committed view of the revision, along with the associated table name and table ID.

The following is an example of a revision details record with sample data. Notice that the tableName, data, and metadata fields are all in the record.

{
  qldbStreamArn:"arn:aws:qldb:us-east-1:123456789012:stream/exampleLedger/IiPT4brpZCqCq3f4MTHbYy",
  recordType:"REVISION_DETAILS",
  payload:{
    tableInfo:{
      tableName:"VehicleRegistration",
      tableId:"Ad3A07z0ZffC7Gpso7BXyO"
    },
    revision:{
      blockAddress:{
        strandId:"ElYL30RGoqrFCbbaQn3K6m",
        sequenceNo:60807
      },
      hash:{{qJID/amu0gN3dpG5Tg0FfIFTh/U5yFkfT+g/O6k5sPM=}},
      data:{
        VIN:"1N4AL11D75C109151",
        LicensePlateNumber:"LEWISR261LL",
        State:"WA",
        City:"Seattle",
        PendingPenaltyTicketAmount:90.25,
        ValidFromDate:2017-08-21,
        ValidToDate:2020-05-11,
        Owners:{
          PrimaryOwner:{PersonId:"7z2OpEBgVCvCtwvx4a2JGn"},
          SecondaryOwners:[]
        }
      },
      metadata:{
        id:"K0FpsSLpydLDr7hi6KUzqk",
        version:0,
        txTime:2019-09-18T17:00:14.602Z,
        txId:"9RWohCo7My4GGkxRETAJ6M"
      }
    }
  }
}
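To illustrate how a consumer might pick out the fields mentioned above, the following is a minimal sketch. It assumes the binary Ion record has already been decoded into plain Python dictionaries (for example by an Ion library); that decoding step is not shown. The function name extract_revision is hypothetical, and the sample is a shortened copy of the record above.

```python
def extract_revision(record):
    """Return (table_name, data, metadata) for a REVISION_DETAILS record, else None."""
    if record.get("recordType") != "REVISION_DETAILS":
        return None  # control and block summary records are skipped here
    payload = record["payload"]
    revision = payload["revision"]
    # Assumption: 'data' may be missing (e.g. for a deleted document), so use .get().
    return payload["tableInfo"]["tableName"], revision.get("data"), revision["metadata"]


# Shortened, already-decoded version of the sample record shown above.
sample = {
    "recordType": "REVISION_DETAILS",
    "payload": {
        "tableInfo": {"tableName": "VehicleRegistration",
                      "tableId": "Ad3A07z0ZffC7Gpso7BXyO"},
        "revision": {
            "data": {"VIN": "1N4AL11D75C109151", "State": "WA"},
            "metadata": {"id": "K0FpsSLpydLDr7hi6KUzqk", "version": 0},
        },
    },
}

if __name__ == "__main__":
    table_name, data, metadata = extract_revision(sample)
    print(table_name, data["VIN"], metadata["id"], metadata["version"])
```

The table name tells a replication target which table to write to, while the document id and version in metadata identify which revision of which document is being applied.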

While the QLDB stream is being created, head on over to the next section.