Overview

Although a platform may not limit the number of items that can be stored in a single directory, its API may limit the number of items that can be downloaded or uploaded. When DryvIQ attempts to download more content from the source than the platform API limit allows, errors may occur. The DryvIQ Platform engine also has a limit on how many items it can crawl and download from a single directory, which depends on the environment setup, such as memory allocation.

  • A job in DryvIQ does not use a fixed amount of memory.

  • Memory usage for individual jobs varies based on a number of factors. The most significant is the number of files and how those files are distributed (all in one folder, throughout sub-folders, etc.).

  • The main factors for memory usage on a DryvIQ node are the number of concurrent jobs, the Parallel Writes Per Job setting for each job, and the memory impact of the specific jobs.

Based on source and destination platform limits and the potential for excessive memory usage, the DryvIQ system configuration defaults to a maximum of 10,000 items per container to ensure a successful transfer.

Recommendations

To avoid these errors, configure the directory item limit policy for your job. This option is configurable, with a default system maximum of 10,000 items for a single directory, which corresponds with most platforms' recommendations. While some platforms, such as Box, may allow upwards of 15,000 items in a single directory, raising this limit may result in reduced performance and unexpected errors. To improve the performance of your transfer, use the DryvIQ system default of a maximum of 10,000 items per container. DryvIQ identifies the directories that exceed this limit and flags them with the following message:

Other activity | Error activity

The path exceeds the maximum number of 10,000 children: /path

During remediation, evaluate the source content and move items into sub-folders until each directory contains 10,000 items or fewer.
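
As an aid to remediation, the following is a minimal sketch of how oversized directories could be found before a job runs. It assumes the source content is reachable as a local or mounted file system, which will not be true for every platform. The function name `find_oversized_directories` is hypothetical and is not part of DryvIQ.

```python
import math
import os

DEFAULT_LIMIT = 10_000  # DryvIQ default maximum items per container


def find_oversized_directories(root, limit=DEFAULT_LIMIT):
    """Return (path, child_count, min_subfolders) for each directory over the limit.

    Hypothetical helper: it only inspects a local or mounted file system.
    """
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Direct children only: files plus immediate sub-folders.
        children = len(dirnames) + len(filenames)
        if children > limit:
            # Smallest number of sub-folders that keeps each one at or below the limit.
            oversized.append((dirpath, children, math.ceil(children / limit)))
    return oversized
```

Each result gives the directory path, its current number of direct children, and the minimum number of sub-folders needed to bring every sub-folder to the limit or below.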

What can happen when increasing the Directory Limit?

DryvIQ strongly recommends using the default directory limit of 10,000. Increasing this value can have the following impacts:

  • Performance issues caused by excessive memory usage

  • Limit errors from the source and destination platforms, which cause failures in your DryvIQ job. (See the Platform Limitation section below.)

Addressing Memory Issues

If memory issues occur after increasing the Directory Item Limit or Parallel Writes Per Job, the only mitigation is to reduce the number of concurrent jobs or break up the source content into multiple jobs. DryvIQ does not self-limit: it keeps using memory until it reaches the environment maximum. Reaching the environment maximum may result in a non-graceful termination of DryvIQ, which could cause jobs to re-transfer files, permissions, or metadata. Larger jobs stopped in this manner enter recovery mode, use all the available memory again, and are stopped again, in a loop that reduces throughput.
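
One way to break up source content into multiple jobs is to generate one job definition per source folder from a shared template. The sketch below assumes the job JSON layout used in the sample on this page (`name`, `transfer.source.target.path`, `transfer.destination.target.path`). The function name `split_job_by_paths` and the example paths are hypothetical.

```python
import copy


def split_job_by_paths(template, path_pairs):
    """Build one job definition per (source_path, destination_path) pair.

    Hypothetical helper: `template` is a job dictionary shaped like the
    sample job JSON, and it is left unmodified.
    """
    jobs = []
    total = len(path_pairs)
    for index, (source_path, destination_path) in enumerate(path_pairs, start=1):
        job = copy.deepcopy(template)
        job["name"] = f"{template['name']} ({index} of {total})"
        job["transfer"]["source"]["target"]["path"] = source_path
        job["transfer"]["destination"]["target"]["path"] = destination_path
        jobs.append(job)
    return jobs
```

Running fewer of the resulting jobs at the same time keeps the number of concurrent jobs, and therefore memory usage on the node, lower than a single job over the whole source.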

Expected Behavior

The parent folder will transfer to the destination and permissions will be applied, but the job will report failures with the following log entry:

Other activity | Error activity

The path exceeds the maximum number of 10,000 children: /path

Default Behavior

If "max_items_per_container" is not configured in the job JSON, the default 10,000 limit will still apply.
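
The fallback described above can be illustrated with a short sketch. This is not DryvIQ's implementation. It only shows how the effective limit could be read from a job definition, assuming the key sits under `transfer` as in the sample job JSON. The function name `effective_item_limit` is hypothetical.

```python
import json

SYSTEM_DEFAULT_LIMIT = 10_000  # default system maximum per container


def effective_item_limit(job_json, system_default=SYSTEM_DEFAULT_LIMIT):
    """Return the per-container item limit that applies to a job definition.

    Hypothetical helper: falls back to the system default when
    "max_items_per_container" is absent from the job's "transfer" block.
    """
    job = json.loads(job_json)
    return job.get("transfer", {}).get("max_items_per_container", system_default)
```

For example, a job whose `transfer` block omits the key resolves to 10,000, while a job that sets `"max_items_per_container": 15000` resolves to 15,000.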

Change Limit When Creating a Job

See “Additional Configuration Options” for Scripting.

Configuration Options in the appSettings.json

See Configuration Options. The default value can also be changed using the "transfers:max_items_per_container" configuration key.

Reporting in the User Interface

A Limit Transfer error will be presented on the overview.

Platform Limitation

If you configure DryvIQ above 10,000, refer to your source and destination platforms' documentation for potential restrictions on those platforms.


Job Configuration Through REST API

{
    "name": "Example Job Configuration - Max Items Per Container",
    "kind": "transfer",
    "transfer": {
        "transfer_type": "copy",
        "audit_level": "trace",
        "batch_mode": "always",
        "conflict_resolution": "latest",
        "delete_propagation": "ignore_both",
        "failure_policy": "continue",
        "large_item": "skip",
        "lock_propagation": "ignore",
        "max_items_per_container": 10000,
        "performance": {
            "parallel_writes": {
                "requested": 4
            }
        },
        "permissions": {
            "policy": "add",
            "links": true,
            "failures": "exceptions"
        },
        "preserve_owners": true,
        "timestamps": true,
        "empty_containers": "create",
        "duplicate_names": "rename",
        "item_overwrite": "overwrite",
        "restricted_content": "convert",
        "segment_transform": true,
        "versioning": {
            "preserve": "native",
            "select": "all"
        },
        "group_map": {
            "id": "{{group_map_id}}",
            "type": "group_map"
        },
        "account_map": {
            "id": "{{account_map_id}}",
            "type": "account_map"
        },
        "filter": {
            "source": [
                {
                    "action": "exclude",
                    "rules": [
                        {
                            "type": "filter_shared"
                        }
                    ],
                    "type": "filter_rule"
                }
            ],
            "destination": [
                {
                    "action": "exclude",
                    "rules": [
                        {
                            "type": "filter_shared"
                        }
                    ],
                    "type": "filter_rule"
                }
            ]
        },
        "source": {
            "connection": { "id": "{{cloud_connection_source}}" },
            "impersonate_as": { "email": "jsmith@yourdomain.com" },
            "target": {
                "path": "/SourcePath"
            }
        },
        "destination": {
            "connection": { "id": "{{cloud_connection_destination}}" },
            "impersonate_as": { "email": "jsmith@mydomain.onmicrosoft.com" },
            "target": {
                "path": "/DestinationPath"
            }
        },
        "simulation_mode": false
    },
    "schedule": {
        "mode": "manual"
    },
    "stop_policy": {
        "on_success": 5,
        "on_failure": 5,
        "on_execute": 25
    },
    "category": {
        "name": "Category 1"
    }
}
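
A job definition like the one above could be submitted from a script. The sketch below only builds the HTTP request, using the Python standard library. The endpoint path `v1/jobs`, the bearer-token header, and the function name `build_create_job_request` are assumptions that are not confirmed by this page, so check them against the API reference for your DryvIQ version. Placeholders such as `{{group_map_id}}` must be replaced with real IDs before the job is sent.

```python
import json
import urllib.request


def build_create_job_request(base_url, token, job):
    """Build (but do not send) a POST request that carries a job definition.

    Assumptions: jobs are created at <base_url>/v1/jobs and the API
    accepts a bearer token. Adjust both to match your environment.
    """
    body = json.dumps(job).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/jobs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Keeping the build and send steps separate makes it possible to inspect the payload, including `max_items_per_container`, before anything reaches the server.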
