Bulk loading postgreSQL records to Cloudant with RQL

Georges Lagardère
4 min readJul 23, 2021

Move data easily from relational to noSQL database

Introduction

This post illustrates how we can use RQL to move data from a relational database to a noSQL document database.

I will use source data taken from a PostgreSQL table loaded with data from the TCP benchmark H and bulk load it into an IBM Cloudant instance (Cloudant is an Apache CouchDB managed service).

For this example, I will use RAW Labs platform, which can be accessed here at https://just-ask.raw-labs.com

Tutorials and documentation about RQL is available at: http://docs.raw-labs.com and the platform is described at: https://www.raw-labs.com/platform.

The RQL code source can be found here.

PostgreSQL setup in Just-Ask

It is possible to configure access to multiple RDBMS at https://just-ask.raw-labs.com/settings/rdbms

For this example I have created a PostgreSQL data source named PGSLQ-TPC-H10 and I will use the lineitem table.

In the Just-Ask scratchpad, entering the following command will help understand the table structure:

This produces the following result:

The JSON description can be downloaded and here is what the lineitem table columns look like:

Cloudant setup

In the cloud.ibm.com portal I have created a basic Cloudant service and a database named tpc-h

Cloudant provides an endpoint to be used via REST calls for all database operations (create database, query document, insert document, update document…).

For instance:

will return information about the ‘tpc-h’ document database that I have created.

The endpoint can be found on the Cloudant service page and is in the form of https://……….-bluemix.cloudantnosqldb.appdomain.cloud/

The token can be retrieved by the following command

And the apikey is found on the IBM cloud service management page.

Bulk inserting documents

In order to insert multiple documents in the database we need to use a post instruction as explained here.

To achieve this, we need to convert the records that we will read from the postgreSQL table into JSON documents and produce something equivalent to the following curl command (ie /tcp-h/_bulk_docs)

The HTTP() RQL function will be used in Just-Ask to call the Cloudant endpoint.

First we need to retrieve records from the lineitem postgreSQL table. We will limit them to 20.

Because of the various data types that the table contains (long, int, decimal, string and date) I have created some helper functions to cast all types into JSON like string tuples where either we have “key”:”string” or “key”: number

I have also created one to enclose a string with curly brackets in the JSON document style.

The MKSTRING() RQL function will help to merge the collection of tuples with a comma separator. The square brackets [] encompassing the tuples defines them as an RQL collection.

The above statement produces an output of records formatted as JSON documents as shown in the Just-Ask output tab.

In order to make a bulk insert we need to group them into the “docs” collection in the body of the POST message to be sent via the Cloudant API.

Another small helper function will complete the surrounding text and the use of MKSTRING() will create the string of comma separated documents.

Now we can call the HTTP() RQL function with the resulting data variable which contains the correct body message.

Et voilà, 20 records where just uploaded as documents into the tpc-h database on Cloudant.

This example is quite straightforward, in most cases documents have nested items and are not just flat records. The same mechanism could apply to generate such elements whatever the complexity.

--

--