Leveraging SHORTCUTS for Data Management in Microsoft Fabric


Shortcuts are objects in OneLake that point to other storage locations. A location can be internal or external to OneLake: internal locations live in OneLake within Fabric, while external locations include Azure storage accounts (ADLS Gen2), Amazon S3, Dataverse and Google Cloud Storage. The storage location that a shortcut points to is called the target, and the location where the shortcut appears is called the shortcut path. Deleting a shortcut does not affect the target data. However, if the target data is moved, renamed or deleted, the shortcut can break.
You can access shortcuts from Apache Spark notebooks, Apache Spark jobs, SQL and Real-Time Intelligence.
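
To make the Spark access concrete, here is a minimal sketch of how a shortcut could be addressed from a notebook. It assumes the OneLake ABFS URI layout abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/...; the helper name onelake_path and the workspace, Lakehouse and shortcut names are hypothetical, so compare the result with the path that Lakehouse explorer shows for your own shortcut.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_path(workspace: str, lakehouse: str, section: str, shortcut_path: str) -> str:
    """Build an ABFS URI for a shortcut that lives under Tables or Files."""
    if section not in ("Tables", "Files"):
        raise ValueError("section must be 'Tables' or 'Files'")
    relative = shortcut_path.strip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/{section}/{relative}"


# Hypothetical names, for illustration only.
path = onelake_path("SalesWorkspace", "SalesLakehouse", "Tables", "customers")
print(path)

# Inside a Fabric notebook, where a `spark` session is already available,
# a shortcut to Delta data is read like any other Delta folder:
#     df = spark.read.format("delta").load(path)
```

Because the shortcut behaves like a folder at the shortcut path, the read does not need to know whether the target sits in OneLake, Amazon S3 or Google Cloud Storage.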

Note: As of today, shortcuts can be created in a Fabric Lakehouse or a KQL database. Warehouses do not support shortcuts yet.

Benefits of shortcuts in Fabric:

  1. No data duplication and data movement
    Shortcuts in Microsoft Fabric introduce a new and effective way to manage data. They let you access and use data from different sources without making copies. This means you can easily work with data across various domains and clouds while avoiding extra data movement and duplication.
  2. Live connections to existing data
    OneLake shortcuts let you create live connections between OneLake and existing target data sources, whether internal or external to Azure.
  3. Cross-cloud data analysis
    Shortcuts let users integrate and analyze data across different platforms, such as Amazon S3 and Google Cloud Storage.
  4. Consolidated data access across workspaces
    Shortcuts let you consolidate data from various items or workspaces without altering its ownership. This means you can reuse data repeatedly without creating duplicates, which optimizes storage and improves overall efficiency.
  5. Simplified data synchronization
    Shortcuts eliminate the need to set up and monitor data movement jobs across different sources. Because a shortcut always points at the live target, your data stays in sync across domains and clouds without the complexity of traditional synchronization methods.

In Fabric, shortcuts can be created in a Lakehouse or a KQL database. The rest of this article focuses on how to create shortcuts in a Lakehouse.

Shortcuts are created in a Lakehouse using the ‘New shortcut’ option from the menu in Lakehouse explorer. A Lakehouse has two top-level folders, Tables and Files, and a shortcut can be created under either. Under Tables, a shortcut can only be created at the top level of the Tables folder, not inside the subdirectories of a table. Under Files, a shortcut can be created at any level of the folder hierarchy.
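
The placement rule can be expressed as a small check. This is only a sketch of the rule as described in this article; the function name can_create_shortcut is hypothetical, and the Lakehouse itself enforces the real rule.

```python
from pathlib import PurePosixPath


def can_create_shortcut(parent_folder: str) -> bool:
    """Return True if a shortcut may be created directly inside parent_folder."""
    parts = PurePosixPath(parent_folder.strip("/")).parts
    if not parts:
        return False  # a shortcut must live under Tables or Files
    if parts[0] == "Tables":
        return len(parts) == 1  # only the top level of Tables is allowed
    if parts[0] == "Files":
        return True  # any depth under Files is allowed
    return False


for folder in ("Tables", "Tables/customers", "Files", "Files/raw/2024"):
    print(folder, can_create_shortcut(folder))
```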

Types of Shortcuts

Internal OneLake shortcuts: Internal shortcuts link directly to storage locations within Fabric, such as a Warehouse or a Lakehouse.

Steps to create internal shortcuts:
1. In Lakehouse explorer, select the ‘New shortcut’ option.
2. Choose Microsoft OneLake under internal sources.
3. Select the source, such as a Warehouse or a Lakehouse.
4. Choose the objects (tables) for which you want to create shortcuts, then click ‘Next’.
5. Click ‘Create’.
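
The same steps can also be scripted. The sketch below only builds the request body and sends nothing. It assumes the shape of the Create Shortcut operation in the Fabric REST API (a POST to the shortcuts collection of the Lakehouse item, with a oneLake target); treat the endpoint, the field names and every ID here as assumptions to verify against the current API reference. The helper name internal_shortcut_body is hypothetical.

```python
import json

# Assumed endpoint shape; verify against the Fabric REST API reference:
#   POST https://api.fabric.microsoft.com/v1/workspaces/{workspaceId}/items/{itemId}/shortcuts


def internal_shortcut_body(
    name: str,
    shortcut_folder: str,
    target_workspace_id: str,
    target_item_id: str,
    target_path: str,
) -> dict:
    """Build the JSON body for a shortcut whose target is inside OneLake."""
    return {
        "path": shortcut_folder,  # where the shortcut appears, e.g. "Tables"
        "name": name,             # name of the shortcut itself
        "target": {
            "oneLake": {
                "workspaceId": target_workspace_id,
                "itemId": target_item_id,  # the source Lakehouse or Warehouse
                "path": target_path,       # the table or folder being pointed at
            }
        },
    }


# Placeholder IDs, for illustration only.
body = internal_shortcut_body(
    name="customers",
    shortcut_folder="Tables",
    target_workspace_id="<source-workspace-id>",
    target_item_id="<source-lakehouse-id>",
    target_path="Tables/customers",
)
print(json.dumps(body, indent=2))
```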

(Screenshot: shortcuts created in Lakehouse.)

External shortcuts: External shortcuts link to external storage locations, such as ADLS Gen2, Amazon S3, Dataverse and Google Cloud Storage. These shortcuts let us use data from sources outside OneLake in Fabric without copying it.

The sections below explain shortcuts for Amazon S3 and Google Cloud Storage.

Amazon S3: Amazon S3 organizes data into objects within containers known as buckets. An object consists of a file and its metadata, while a bucket serves as the container for these objects. To store data in Amazon S3, first create a bucket by giving it a name and selecting an AWS Region. After that, you can upload your files to the bucket, where they are stored as objects.

Steps to create Amazon S3 shortcuts:
1. In Lakehouse explorer, select the ‘New shortcut’ option from the Files folder.
2. Choose Amazon S3 under external sources.
3. Provide the connection settings and connection credentials:
    URL: The connection string for your Amazon S3 bucket.
    Connection name: A name for the Amazon S3 connection.
    Authentication kind: The Identity and Access Management (IAM) policy. The policy must have read and list permissions.
    Access key ID: The IAM user key.
    Secret access key: The IAM secret key.
4. Select a bucket or directory.
5. Optionally rename the shortcut, then click ‘Create’.
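
A malformed URL is an easy mistake in step 3, so it can help to check the value before pasting it into the dialog. The sketch below assumes the URL field takes a virtual-hosted-style S3 endpoint of the form https://<bucket>.s3.<region>.amazonaws.com; that format is an assumption here, and the bucket name and helper names are hypothetical.

```python
import re
from typing import NamedTuple

# Virtual-hosted-style endpoint: https://<bucket>.s3.<region>.amazonaws.com
# Bucket names are 3-63 characters: lowercase letters, digits, dots, hyphens.
_S3_URL = re.compile(
    r"^https://(?P<bucket>[a-z0-9][a-z0-9.-]{1,61}[a-z0-9])"
    r"\.s3\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/?$"
)


class S3Endpoint(NamedTuple):
    bucket: str
    region: str


def parse_s3_url(url: str) -> S3Endpoint:
    """Split an S3 bucket URL into bucket and region, or raise ValueError."""
    match = _S3_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a virtual-hosted-style S3 URL: {url!r}")
    return S3Endpoint(match["bucket"], match["region"])


endpoint = parse_s3_url("https://my-bucket.s3.us-east-1.amazonaws.com")
print(endpoint.bucket, endpoint.region)
```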

(Screenshot: Amazon S3 shortcut.)

Google Cloud Storage: A service for storing data in Google’s cloud. It stores unstructured data of any type, with individual files up to 5 TB in size, in containers known as buckets.

Steps to create Google Cloud Storage shortcuts:
1. In Lakehouse explorer, select the ‘New shortcut’ option from the Files folder.
2. Choose Google Cloud Storage under external sources.
3. Provide the connection settings and connection credentials:
    URL: The connection string for your GCS bucket.
    Connection name: A user-defined name for the connection.
    Authentication kind: Fabric uses Hash-based Message Authentication Code (HMAC) keys to access Google Cloud Storage. These keys are associated with a user or service account.
    Access ID: The access key associated with a user or service account.
    Secret: The secret for the access key.
4. Select a bucket or directory.
5. Optionally rename the shortcut, then click ‘Create’.
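
HMAC keys work as a pair: the access ID identifies the caller, and the secret is used to compute a keyed hash over a request, which the service can recompute to verify the caller. The sketch below shows only that primitive using Python's standard library. It is not the actual request-signing scheme that Google Cloud Storage or Fabric uses, and the secret and message are made-up values.

```python
import hashlib
import hmac


def sign(secret: str, message: str) -> str:
    """Return the hex HMAC-SHA256 of message, keyed with secret."""
    return hmac.new(
        secret.encode("utf-8"), message.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def verify(secret: str, message: str, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(secret, message), signature)


# Made-up values, for illustration only.
secret = "example-secret"
message = "GET /example-bucket/data.parquet"
signature = sign(secret, message)

print(verify(secret, message, signature))        # unchanged message
print(verify(secret, message + "x", signature))  # tampered message
```

Anyone holding the secret can produce valid signatures, which is why the Secret field should be treated like a password.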

Note: If you choose the ‘New shortcut’ option under Tables in Lakehouse explorer, the shortcut is created at the table level. If the Lakehouse cannot recognize the target data as a table, the shortcut is placed inside an ‘Unidentified’ folder under Tables.

(Screenshot: Google Cloud Storage shortcut.)

Conclusion:

Shortcuts provide a new approach for referencing external data sources in Fabric. You can access data from any Fabric workspace, Azure Data Lake Storage account, Amazon S3 bucket or Google Cloud Storage bucket, regardless of the engine in use. Shortcuts allow you to reference external Delta folders as tables in your Lakehouse. They are more flexible, secure and consistent than external tables, which are limited to certain engines and require more configuration and maintenance. The benefit is not only about reducing unnecessary data duplication or movement: shortcuts offer a better way to access external data in Fabric, enhancing efficiency, clarity and collaboration in data analytics.

 

 


Chandana R L
