Time To Live (TTL) in Managed Virtual Network in Azure Data Factory


Managed Virtual Network (VNet)

Before we explore Time to Live (TTL), it’s essential to understand what a Managed Virtual Network (VNet) is within Azure Data Factory.

When you create an Azure integration runtime within a Data Factory managed virtual network, the integration runtime is provisioned with the managed virtual network. It uses private endpoints to connect to supported data stores securely.

Creating an integration runtime within a managed virtual network ensures the data integration process is isolated and secure.

A VNet is a private network in Azure that allows you to securely connect Azure resources, such as VMs, to each other, the internet, or your on-premises network. When you configure Azure Data Factory to operate within a managed VNet, you gain several benefits:

  • Enhanced security by isolating your data factory from the public internet.
  • Access control through network security groups (NSGs) to control inbound and outbound traffic.
  • Improved performance as data flows within the VNet, reducing latency.
  • Connectivity to on-premises data sources via Azure ExpressRoute or VPN Gateway.
  • A managed virtual network along with managed private endpoints protects against data exfiltration.
  • Deep Azure networking knowledge isn’t required to do data integrations securely. Instead, getting started with secure ETL is much simpler for data engineers.
  • With a managed virtual network, you can offload the burden of managing the virtual network to Data Factory. You don’t need to create a subnet for an integration runtime that could eventually use many private IPs from your virtual network and would require prior network infrastructure planning.

Currently, the managed virtual network is only supported in the same region as the Data Factory region.

Importance of TTL in Managed Virtual Network

A managed virtual network offers customers a secure and controllable data integration solution. However, because of architectural constraints, compute must be provisioned inside the managed virtual network every time an activity runs. This can lengthen queue times, which is especially noticeable for small jobs executed in sequence. To address this, Microsoft introduced the Time to Live (TTL) feature. It lets users reserve compute so that it stays allocated for the TTL period following the last activity execution.

Time to Live (TTL) in a managed virtual network refers to the duration for which resources and connections are kept alive within the VNet. When a resource or connection reaches the TTL expiration, it is automatically closed or released. This mechanism helps in managing resources efficiently, preventing unnecessary resource consumption and potential security risks.
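The warm-or-cold decision described above can be sketched in a few lines of Python. This is a minimal illustration, not Data Factory's implementation: the `ReservedCompute` class, its field names, and the assumption that compute is still available exactly at the TTL boundary are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReservedCompute:
    """Hypothetical model of compute kept alive by a TTL (times in minutes)."""

    ttl_minutes: float
    last_finished_at: Optional[float] = None  # None means nothing has run yet

    def record_finish(self, finished_at: float) -> None:
        # The TTL window restarts each time an activity finishes.
        self.last_finished_at = finished_at

    def is_warm(self, now: float) -> bool:
        # Warm only if a previous activity finished within the TTL window.
        if self.last_finished_at is None:
            return False
        return now - self.last_finished_at <= self.ttl_minutes


compute = ReservedCompute(ttl_minutes=5)
compute.record_finish(finished_at=10)
print(compute.is_warm(now=13))  # 3 minutes idle: compute is still reserved
print(compute.is_warm(now=16))  # 6 minutes idle: compute has been released
```

An activity that arrives while `is_warm` is true can reuse the reserved compute; otherwise it waits for new compute to be provisioned.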

TTL Settings When Configuring a Managed Virtual Network:

Virtual Network Configuration: Enabling a Managed Virtual Network ensures that the Azure Integration Runtime compute is provisioned within it, and can access data securely using Private Endpoints.

Interactive authoring: This capability is used during authoring for functions such as Test connection, Browse and Preview data, Import parameter, and Import schema inside a managed virtual network.

  • Time to live: The allowed idle time for interactive authoring. It specifies how long the compute stays alive after an interactive authoring run completes if there are no other active actions. The default is 60 minutes.

Under Advanced settings:

Copy compute scale: When enabled, the compute will terminate after the specified time interval of inactivity.

  • Compute size for copy: Specifies the compute capacity of the copy activity executor.
  • Time to live: The allowed idle time for copy activity. It specifies how long the compute stays alive after a copy activity run completes if there are no other active jobs. The default is 5 minutes.

 

Pipeline and external compute scale: The compute will terminate after the specified time interval of inactivity.

  • Compute size for pipeline: Specifies the compute capacity of the pipeline activity executor. By default, the compute size is Small (1 node).
  • Compute size for external: Specifies the compute capacity of the external activity executor. By default, the compute size is Small (1 node).
  • Time to live: The allowed idle time for pipeline and external activities. It specifies how long the compute stays alive after a pipeline or external activity run completes if there are no other active jobs. The default is 60 minutes.
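For reference, the settings above can be collected in one place. The sketch below is a plain Python summary of the defaults stated in this post. The class and field names are hypothetical and are not the Data Factory API or ARM template schema. The copy compute size is left unset because the post does not state a default for it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ManagedVnetTtlSettings:
    """Hypothetical container for the settings described above."""

    interactive_authoring_ttl_minutes: int = 60
    copy_compute_size: Optional[str] = None  # default not stated in the post
    copy_ttl_minutes: int = 5
    pipeline_compute_size: str = "Small"  # 1 node
    external_compute_size: str = "Small"  # 1 node
    pipeline_external_ttl_minutes: int = 60

    def ttl_summary(self) -> Dict[str, int]:
        # Map each compute type to its idle time in minutes.
        return {
            "interactive authoring": self.interactive_authoring_ttl_minutes,
            "copy": self.copy_ttl_minutes,
            "pipeline and external": self.pipeline_external_ttl_minutes,
        }


defaults = ManagedVnetTtlSettings()
for name, minutes in defaults.ttl_summary().items():
    print(f"{name}: {minutes} min")
```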

 

Key Benefits of TTL in Managed Virtual Network:

  • Resource Optimization: By letting users reserve compute within the managed VNet, TTL avoids provisioning and releasing resources for every activity execution, which reduces overhead.
  • Reduced Queue Times: With TTL, users avoid the delay of provisioning compute for each activity execution. Subsequent executions reuse the reserved compute, resulting in shorter queue times and faster job completion.
  • Improved Efficiency for Sequential Jobs: TTL is particularly useful for workflows with multiple sequential jobs. Instead of provisioning compute for each job individually, later jobs run on the compute that an earlier job left reserved, provided they start within the TTL period.
  • Enhanced Cost Management: Reserved compute is released automatically once the TTL period elapses, so it does not stay allocated indefinitely. Choosing a TTL that matches the typical gap between runs keeps compute from being reserved longer than needed.
  • Consistent Performance: With TTL, users can maintain consistent performance levels across their data integration tasks. By pre-allocating resources within the managed VNet, TTL ensures that compute resources are readily available when needed, leading to predictable and reliable performance.
  • Streamlined Workflow: Users set the TTL once and every subsequent execution within that period benefits from the reserved compute, with no per-activity provisioning step to manage.
  • Enhanced Security: Reserved compute stays inside the managed VNet, so activities keep running in the same isolated environment.
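The trade-off between shorter queue times and longer reservation can be made concrete with a small calculation. The sketch below is a simplified model under stated assumptions: jobs run strictly in sequence, a cold start always costs the same hypothetical `provision` time, and the function name `ttl_tradeoff` and all numbers are made up for illustration.

```python
from typing import List, Tuple


def ttl_tradeoff(gaps: List[float], ttl: float, provision: float) -> Tuple[int, float, float]:
    """Return (cold_starts, queue_minutes, idle_reserved_minutes).

    gaps: idle minutes between the end of one job and the start of the next.
    ttl: minutes the compute stays reserved after a job finishes.
    provision: assumed minutes needed to provision compute on a cold start.
    """
    # The first job always starts cold; later jobs are cold when the gap exceeds the TTL.
    cold_starts = 1 + sum(1 for gap in gaps if gap > ttl)
    queue_minutes = cold_starts * provision
    # Compute sits idle for each gap, capped at the TTL, plus one full TTL after the last job.
    idle_reserved = sum(min(gap, ttl) for gap in gaps) + ttl
    return cold_starts, queue_minutes, idle_reserved


gaps = [1, 2, 10, 1]  # hypothetical gaps between five sequential jobs
for ttl in (0, 5, 15):
    cold, queued, idle = ttl_tradeoff(gaps, ttl, provision=2)
    print(f"TTL {ttl:>2} min: {cold} cold starts, {queued} min queued, {idle} min idle")
```

In this toy schedule, raising the TTL from 0 to 5 minutes cuts cold starts from five to two at the price of 14 idle reserved minutes, while a 15-minute TTL removes only one more cold start and roughly doubles the idle time.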

Anamika Sar
