Data Virtualization in OutSystems 11 with SQL Server Polybase

Data Virtualization in OutSystems 11 with SQL Server Polybase

I recently had a meeting where we discussed integration options for a MongoDB collection in an OutSystems 11 on-premise environment and during the discussion, we talked about Polybase, a data virtualization feature of Microsoft SQL Server.

What is Data Virtualization

Data Virtualization refers to a virtual data layer that shows external data as if it were local, without moving the data itself.

Data Fabric for OutSystems Developer Cloud (ODC) is such a virtual data layer. With Data Fabric, you can connect to one or more external data sources and select individual data items (entities) that are then accessible in your applications just like any other local entity. You can also join external and local entities in your aggregates, giving you a seamless experience when working with application data. Data Virtualization is not limited to relational database management systems; it also includes API-based data like Salesforce and SAP.

💡
As of now, Data Fabric for ODC supports a limited set of data sources, with more to be added in the future.

Wouldn't it be great if all external data, whether from a relational database, a document-oriented database, or even web service data, could seamlessly integrate into OutSystems as regular entities? This is what data virtualization is all about, and Data Fabric is the key technology for OutSystems Developer Cloud that makes this possible.

But what about OutSystems 11?

External Data in OutSystems 11

In short, true Data Virtualization does not exist in OutSystems 11. To integrate with external data, you would use Integration Studio for connecting to external RDBMS, Integration Builder for additional data sources like MongoDB, REST consume operations for integrating with web services, pre-built Forge connector components, or a custom code extension. However, this integration is not as seamless as with Data Fabric in ODC. For example, external entities added with Integration Studio appear in Service Studio just like local entities, but you cannot directly join external entities with local ones, which may sometimes require a sync pattern. Other integration types may look and behave differently, so it is not really seamless.

In addition, if you are running OutSystems 11 on-premise, you may not be able to use Integration Builder at all. This is because Integration Builder is a cloud-hosted product that requires a connection to one of your frontend servers in the development environment. That said, integrating with external data can be challenging in O11 and can consume a lot of your developer resources, as you cannot delegate this task to non-developers.

Most likely, when running OutSystems 11 on-premise, your DBMS is Microsoft SQL Server. Starting with version 2016, SQL Server includes a feature called Polybase, which adds Data Virtualization capabilities to your SQL Server instance.

Data Virtualization with Polybase

Polybase is a data virtualization feature integrated into SQL Server. Similar to Data Fabric for ODC, Polybase allows you to connect to various data sources, making the connected data appear as tables in a SQL Server database. You can then use these external tables just like regular database tables in your queries. However, writing (INSERT, UPDATE, and DELETE) to these external tables depends on whether the selected data source supports it, as not all data sources allow writing to connected data.

Availability of Polybase data sources depends on the SQL Server version you are using. For example, the Simple Storage Service (S3) data source, which lets you query objects in a bucket and attach stored CSV or Parquet file data to a SQL Server database, is only available in SQL Server version 2022.

Read the Polybase documentation to see which connectors are available for your SQL Server version.

Virtualizing a MongoDB Collection using Polybase

In this article, we will explore how to connect a sample MongoDB collection as external data to a SQL Server database. After connecting, the collection will appear in the SQL Server database as an external table, behaving like any other SQL Server database table. Finally, we will configure a Database Connection in OutSystems Service Center and use Integration Studio (not Builder) to create an integration module for the table.

In this article, I assume you are already familiar with MongoDB and Microsoft SQL Server, as I will not cover the basics.

Prerequisites

  • Make sure you have a MongoDB instance that is accessible from your OutSystems Frontend Servers and that it accepts authentication using a username and password with the default authentication mechanism.

  • Install the Polybase feature for SQL Server on your instance.

💡
Polybase only supports the default authentication mechanism using a username and password. SCRAM is not supported, so you cannot use MongoDB Cloud Offerings where SCRAM is the default authentication mechanism.

Create MongoDB Database and Collection

Create a database and collection in your MongoDB instance and import the sample JSON data.

  • Create Database sample

  • Create Collection contacts

  • Import sample contacts data to created collection

The easiest way to import sample JSON data into a collection is by using MongoDB Compass.

Enable Polybase

Connect to your SQL Server instance using SQL Server Management Studio (or any other tool that allows you to execute T-SQL).

Run the following T-SQL script to enable the Polybase feature

EXEC sp_configure @configname = 'polybase enabled', @configvalue = 1;
RECONFIGURE;

Create a SQL Server Database

Polybase external data tables are linked to a specific SQL Server database, so the next step is to create a new database. The default options are sufficient.

  • Create a SQL Server Database mongodb.

Create Database Scoped MongoDB Credentials

After that, you need to run the following commands to store your MongoDB credentials (username and password) that will be used to access the sample contacts collection. The credentials are encrypted and stored in the contacts database using a configured master encryption key.

To create a master encryption key use the following script.

USE mongodb;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong master key password>'
💡
Make sure to securely store your master encryption key password, such as in a password vault.

Then create the MongoDB credentials in the mongodb database using

USE mongodb;
CREATE DATABASE SCOPED CREDENTIAL MongoCredentials WITH IDENTITY = '<MongoDB username>', Secret = '<user password>';

Create External Data Source

Now we connect the MongoDB Collection using the stored credentials by creating a data source with the following

USE mongodb;
CREATE EXTERNAL DATA SOURCE MongoSource
WITH (
    LOCATION = 'mongodb://<mongodb instance address>:27017',
    CONNECTION_OPTIONS='tls=false;ssl=false',
    CREDENTIAL = MongoCredentials
);

Please note that the location should point to your MongoDB instance and use the specified credentials name you created in the Create Database Scoped MongoDB Credentials.

Create External Table

The final step in SQL Server is to create the external table schema and map the MongoDB collection attributes to SQL columns and value types.

Here is on example record from the sample data imported to MongoDB

 {
  "_id": {
    "$oid": "66dfda89564afea89634e59a"
  },
  "firstName": "John",
  "lastName": "Doe",
  "middleName": "Michael",
  "addresses": [
    {
      "type": "Home",
      "street": "123 Main St",
      "city": "Anytown",
      "state": "CA",
      "zipCode": "12345"
    },
    {
      "type": "Business",
      "street": "456 Office Blvd",
      "city": "Workville",
      "state": "NY",
      "zipCode": "67890"
    }
  ],
  "phoneNumber": "555-123-4567",
  "emailAddress": "john.doe@email.com"
}

And the following script maps the collection attributes to columns

USE mongodb;
CREATE EXTERNAL TABLE Person(
    [_id]                         NVARCHAR(24)    COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    [firstName]                     NVARCHAR(64)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [lastName]                     NVARCHAR(64)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [middleName]                 NVARCHAR(64)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [phoneNumber]                 NVARCHAR(64)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [emailAddress]                 NVARCHAR(128)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [contacts_addresses_type]     NVARCHAR(16)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [contacts_addresses_street]     NVARCHAR(128)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [contacts_addresses_city]     NVARCHAR(128)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [contacts_addresses_state]     NVARCHAR(128)    COLLATE SQL_Latin1_General_CP1_CI_AS,
    [contacts_addresses_zipCode] NVARCHAR(10)    COLLATE SQL_Latin1_General_CP1_CI_AS
) WITH (
    LOCATION='sample.contacts',
    DATA_SOURCE=MongoSource
)

Notice how the attributes are referenced in the script. The addresses attribute is an array of objects, and the naming convention used is <collection name>_<array_attribute>_<attribute>. You can find more details about this object flattening in the Polybase documentation.

For each element in the addresses array, a complete Person record is created.

Query the created external table with:

SELECT * FROM [mongodb].[dbo].[Person]

which should produce the following result

Take note of the multiple records for John Doe, with one record for each associated address.

With our external table set up and working, we can now move on to the OutSystems side.

Create a Database Connection in Service Center

Before we can use Integration Studio to integrate our Person table, we first need to configure a new database connection in Service Center.

  • In Service Center, create a new Database Connection for the MongoDB database.

Create an Integration Module in Integration Builder

  • Open Integration Studio and connect to your environment.

  • Use MongoDBPersons for the extension name.

  • Right-click on Entities and select "Connect to external table or view."

  • In the "Select the Database Connection" screen, choose the database connection you created earlier.

  • Select "mongodb" in the "Select the Database" screen.

  • In the "Select the Tables and Views" screen, select the Person table on the left and click the "Add»" button.

  • Leave the default logical name in the next screen and finish the table selection.

You can make further changes to the Person entity if needed, like renaming individual attributes.

💡
When you're done, publish the extension to your environment. After the module is published, click the Configure button and set the Database Connection for the extension in Service Center.

Now you can use the integrated Person entity in your application by referencing it.

Summary

In this article, we explored how to use the Microsoft SQL Server Polybase feature to add a MongoDB collection as an external table and create an integration module using OutSystems Integration Studio. The collection data remains in MongoDB, and any changes to the data are immediately reflected.

The advantage of this approach is that MongoDB data appears as a regular entities in OutSystems, not just server/service actions. However, Polybase's MongoDB support is read-only, so you will still need additional actions for write operations.

As mentioned earlier, this is not true data virtualization like with Data Fabric. External entities in OutSystems 11 (when using Polybase, you could say these are external, external entities 😏) cannot be directly joined with application entities.

Thank you for reading. I hope you enjoyed it. Follow me on LinkedIn to get notifications whenever I publish something new.