Implementing a Microsoft Fabric Data Warehouse with CI Collation

Blogs

Couchbase: The NoSQL Database Powerhouse
November 20, 2024
SQL Server Data Compression: Unlocking Storage and Performance Gains
November 20, 2024

Implementing a Microsoft Fabric Data Warehouse with CI Collation

A data warehouse is a centralized repository designed for efficient querying and analytics. When working with Microsoft Fabric, ensuring optimal performance and usability is crucial, especially in environments with diverse data sources. One key consideration is collation settings, which determine how string data is sorted and compared.

By default, all Fabric warehouses are configured with the case-sensitive (CS) collation Latin1_General_100_BIN2_UTF8. This means that string comparisons and sorting are treated as distinct based on case.

Fabric also supports creating warehouses with case-insensitive (CI) collation, specifically Latin1_General_100_CI_AS_KS_WS_SC_UTF8. Case-Insensitive (CI) Collation allows string comparisons to ignore case differences, enhancing usability for end-users by reducing errors caused by case mismatches in queries. This configuration is especially useful in scenarios involving mixed-case data inputs or systems requiring seamless integration of datasets with varying case standards.

Note: Currently, the only available method to create a case-insensitive data warehouse in Microsoft fabric is by using the REST API.

Setting Up Visual Studio Code for REST API Requests

Prerequisites:

  • Ensure you have a Fabric workspace with either an active or trial capacity.
  • Download and install Visual Studio Code to set up the application.
  • Install the REST Client extension from the Visual Studio Marketplace.

Invoke the REST API using Visual Studio Code

To create a warehouse using the REST API, send a POST request to the following endpoint:
Endpoint:
POST https://api.fabric.microsoft.com/v1/workspaces/<workspace-id>/items

  • Base URL: https://api.fabric.microsoft.com/v1
  • Path: /workspaces/<workspace-id>/items
  • Method: POST
  •  Purpose: Creates a new warehouse within a specified workspace.
  • Workspace ID: Replace <workspace-id> with the unique identifier of target workspace

Steps to Create and Input API Request in a .http File in VS Code

Step1: Create a new text file in VS Code with the .http extension
Step2: Input the request details in the file body

 POST https://api.fabric.microsoft.com/v1/workspaces/<workspaceID>/items HTTP/1.1
   Content-Type: application/json
   Authorization: Bearer <bearer token>
  {
     “type”: “Warehouse”,
     “displayName”: “<Warehouse name here>”,
     “description”: “<Warehouse description here>”,
     “creationPayload”: {
     “defaultCollation”: “Latin1_General_100_CI_AS_KS_WS_SC_UTF8”
   }
   }

Step3: Replace the placeholder values as follows:

  • <workspaceID>: Find the workspace GUID in the URL after the /groups/ section or retrieve it by running SELECT @@SERVERNAME in an existing warehouse
  • <bearer token>: Obtain this by following these steps:
    1. Open your Microsoft Fabric workspace in a browser
    2. Press F12 to open Developer Tools.
    3. Select the Console tab.
    4. Type the command powerBIAccessToken and press Enter. The bearer token, a long string of alphanumeric characters, will be displayed in the console. Copy that                token to clipboard.
    5. Paste the token in place of <bearer token>
  •  <Warehouse name here>: Enter the preferred warehouse name.
  • <Warehouse description here>: Provide the description for warehouse.

Step 4: Click the Send Request link displayed above the POST command in the VS Code editor.
Step 5: A response with a status code of 202 Accepted should be received, along with additional details about the POST request.
The new CI collation warehouse has been created and can be checked in the Fabric portal.

Run the SQL command:
SELECT name, collation_name FROM sys.databases;

The newly created warehouse will appear in the list along with the collation name: Latin1_General_100_CI_AS_KS_WS_SC_UTF8.

Conclusion
Creating a data warehouse in Microsoft Fabric with CI collation enabled enhances the flexibility and usability of string comparisons and sorting, especially in environments with mixed-case data. By using the REST API, it is possible to create a warehouse with the desired case-insensitive collation, specifically Latin1_General_100_CI_AS_KS_WS_SC_UTF8, which improves consistency and reduces errors caused by case mismatches in queries.
The steps outlined for setting up Visual Studio Code, invoking the REST API, and configuring the warehouse request are straightforward, ensuring an efficient way to create a warehouse that meets your requirements. Upon successful creation, the new warehouse can be confirmed in the Fabric portal, and the collation settings can be verified by running a simple SQL query.
This approach streamlines the process of managing warehouses in Microsoft Fabric, making it easier to integrate and query data without concerns about case sensitivity.


Chandana R L

Leave a Reply

Your email address will not be published. Required fields are marked *