Allow Link Detection and Remediation on Supported Files
Overview
Link detection and remediation can be turned on when defining the behaviors for the job during job creation. It will run for both simulation and transfer jobs. Link detection scans files and identifies any links in the files. The Job reports will display link information when available for the job. Once all the job runs have completed, you can then execute link remediation to update the links, so they don’t have to be edited manually.
Link detection only scans the latest version of each file and reports the links detected. It does not scan previous versions.
File Handling
When doing the content analysis for link detection, DryvIQ needs a seekable stream. To obtain that, DryvIQ downloads the file into memory if it is small enough or into a temp location on the processing node if the file is too large. DryvIQ analyzes that stream, resets it, and uploads the file to the destination. After the transfer is complete, DryvIQ removes the temp file if one was needed for file analysis.
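The buffer-analyze-reset-upload flow described above can be sketched in Python. This is only an illustration of the pattern, not DryvIQ's implementation: the function names, the `analyze` and `upload` callbacks, and the 1 MB threshold are all assumptions made for the example. It relies on the standard library's `tempfile.SpooledTemporaryFile`, which keeps data in memory until a size limit is exceeded and then rolls over to a temp file on disk.

```python
import tempfile

# Hypothetical threshold; the cutoff DryvIQ actually uses is not documented here.
MAX_IN_MEMORY_BYTES = 1024 * 1024


def buffer_to_seekable(chunks, max_in_memory=MAX_IN_MEMORY_BYTES):
    """Copy an iterable of byte chunks into a seekable buffer.

    Data stays in memory until it exceeds max_in_memory, then spills
    to a temp file on disk.
    """
    buf = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    for chunk in chunks:
        buf.write(chunk)
    buf.seek(0)
    return buf


def analyze_then_upload(chunks, analyze, upload, max_in_memory=MAX_IN_MEMORY_BYTES):
    """Analyze a downloaded file, rewind the stream, then upload it."""
    with buffer_to_seekable(chunks, max_in_memory) as stream:
        result = analyze(stream)   # e.g. link detection reads the stream
        stream.seek(0)             # reset so the upload starts at byte 0
        upload(stream)
    # Leaving the "with" block closes the buffer, which also removes
    # the temp file if one was created.
    return result
```

Passing a very small `max_in_memory` forces the on-disk path, which is a convenient way to exercise both branches with the same input.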
Supported File Types
Link detection currently only identifies links in the following file types:
Files with the DOCX extension (available in Microsoft Word 2007 and newer)
Files with the PPTX extension (available in Microsoft PowerPoint 2007 and newer)
Files with the XLSX extension (available in Microsoft Excel 2007 and newer)
Google Docs
Google Sheets
Google Slides
Supported Links
Hyperlinks: These are links to websites or documents. Hyperlinks can be http/https/ftp/ftps URLs or links to files.
In Microsoft Word, Excel, and PowerPoint files, these links are created using the Link option on the Insert tab or by right-clicking on selected text/cell and selecting Link from the shortcut menu.
References to other Excel spreadsheets: In Microsoft Excel files, these are links to cells in other Microsoft Excel files. These links are made by creating a formula that references a cell or range of cells in another Microsoft Excel file. The cells are formatted similar to the following examples:
=[AnotherSpreadsheet.xlsx]SheetName!A1
='C:\Absolute\Path\To\[AnotherSpreadsheet.xlsx]SheetName'!B1
Linked documents/objects: In Microsoft PowerPoint files, this is content that has been imported into the presentation. This content is imported using the Object option on the Insert tab or using the Paste Special option to insert a link to a Microsoft Word Document Object.
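To make the Excel external-reference format above concrete, here is a rough Python sketch that pulls the path, workbook, sheet, and cell out of formulas shaped like the two examples. The regular expression, the `ExternalRef` type, and the `find_external_refs` function are hypothetical and written only for this illustration; they handle the two example shapes and are not a full Excel formula parser or a description of how DryvIQ parses formulas.

```python
import re
from typing import List, NamedTuple


class ExternalRef(NamedTuple):
    path: str       # directory portion, empty when the formula has none
    workbook: str   # file name inside the square brackets
    sheet: str
    cell: str       # single cell or range, e.g. "A1" or "A1:B2"


# Assumed pattern covering the two example shapes only.
_CELL = r"\$?[A-Za-z]{1,3}\$?\d+"
_EXTERNAL_REF = re.compile(
    r"(?:'(?P<path>[^\[\]']*))?"      # optional opening quote plus path
    r"\[(?P<workbook>[^\[\]]+)\]"     # [Workbook.xlsx]
    r"(?P<sheet>[^'!\[\]]+)'?"        # sheet name, optional closing quote
    r"!(?P<cell>" + _CELL + r"(?::" + _CELL + r")?)"
)


def find_external_refs(formula: str) -> List[ExternalRef]:
    """Return every external workbook reference found in a formula string."""
    return [
        ExternalRef(m.group("path") or "", m.group("workbook"),
                    m.group("sheet"), m.group("cell"))
        for m in _EXTERNAL_REF.finditer(formula)
    ]
```

For the first example formula, the sketch yields an empty path with workbook `AnotherSpreadsheet.xlsx`, sheet `SheetName`, and cell `A1`. For the second, the path is the quoted directory that precedes the bracketed workbook name.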
Unsupported Links
Unformatted links: DryvIQ will not count unformatted links (URLs that are added as plain text in the document).
IncludeText fields: In Microsoft Word files, link detection does not support links added through IncludeText fields using the Insert Quick Parts option.
Job filter exclusions take precedence over Link Detection. Therefore, if a job filter exclusion is set to ignore DOCX, PPTX, or XLSX files, Link Detection will also ignore these files.
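The difference between a formatted hyperlink and an unformatted (plain text) URL can be illustrated with a short Python sketch. In the Office Open XML format, external hyperlinks inserted through the Link option are commonly recorded as relationship entries in the package, while a URL typed as plain text lives only in the document body. The sketch below reads those relationship entries from a DOCX package using only the standard library. It is an assumption-laden illustration, not DryvIQ's detection logic, and the demo package it builds contains just the two parts the example needs, so it is not a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_TAG = "{http://schemas.openxmlformats.org/package/2006/relationships}Relationship"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def docx_hyperlink_targets(stream):
    """Return the targets of hyperlink relationships in a DOCX package."""
    with zipfile.ZipFile(stream) as package:
        try:
            rels_xml = package.read("word/_rels/document.xml.rels")
        except KeyError:
            return []  # the main document part has no relationships
    root = ET.fromstring(rels_xml)
    return [
        rel.get("Target")
        for rel in root.iter(REL_TAG)
        if rel.get("Type") == HYPERLINK_TYPE
    ]


def build_demo_package():
    """Build a tiny in-memory package: one hyperlink and one plain-text URL."""
    rels = (
        '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        f'<Relationship Id="rId1" Type="{HYPERLINK_TYPE}" '
        'Target="https://example.com/formatted" TargetMode="External"/>'
        "</Relationships>"
    )
    body = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body><w:p><w:r><w:t>See https://example.com/plain</w:t></w:r></w:p></w:body>"
        "</w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("word/document.xml", body)
        package.writestr("word/_rels/document.xml.rels", rels)
    buf.seek(0)
    return buf
```

Calling `docx_hyperlink_targets(build_demo_package())` returns only the relationship target. The URL typed into the body text is not reported, which mirrors the unformatted-link behavior described above.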
Link Detection Impact on Job Performance
Simulation Jobs: When link detection is enabled on Simulation jobs, the simulation job execution will be longer because the document must be downloaded into memory to detect links. (Files are not normally loaded into memory during a simulation job because they are not actually being migrated.) DryvIQ estimates a 5-10% impact.
Transfer Jobs: As noted above, DryvIQ scans for links while it already holds the document for migration, so the impact on job time is minimal. The document's size has a negligible impact on link detection time unless the file is very large (gigabytes in size). Link detection uses a nominal amount of CPU to detect links. Memory is not affected.
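For planning purposes, the 5-10% estimate for simulation jobs can be turned into a simple range calculation. The function name and the 10-hour baseline below are hypothetical, and the result is only as good as the estimate itself.

```python
def simulation_duration_range(baseline_hours, low=0.05, high=0.10):
    """Return (best case, worst case) duration with link detection enabled.

    low and high default to the 5-10% estimate quoted for simulation jobs.
    """
    return (baseline_hours + baseline_hours * low,
            baseline_hours + baseline_hours * high)


# Hypothetical baseline: a simulation that takes 10 hours without link detection.
best, worst = simulation_duration_range(10)
print(f"Expected duration: {best:.1f} to {worst:.1f} hours")
```

With a 10-hour baseline, the estimated range works out to roughly 10.5 to 11 hours.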
Viewing Link Information
When enabled, link detection will identify the links in files and make the information available for review on the individual Job reports and the roll-up reports. Information is available on the Content Insights, Items, and Links pages.
It is important to note that link counts for spreadsheets will not always match depending on how the link was added to a cell. If the links are added to multiple cells at the same time, DryvIQ reads the link as one link shared across cells. In this instance, all shared links count as one link. If the links are added to multiple cells separately (one cell at a time), DryvIQ counts each cell as separate. In this instance, each link is counted individually.
Content Insights
The bottom of the Content Insights page for jobs that have Link Remediation enabled will display a “Link remediation status overview” section. This section lists the number of files without links, the number of links identified that need to be remediated, the number of links for which remediation has been completed, the number of links where remediation was not completed and needs to be executed again, and the number of links for which remediation failed. Specific details about the individual links can be viewed on the Items page and Links page.
This information can be exported to a csv file for further review using the Export this report link. The export includes the following information.
Field | Description |
---|---|
source_id | The ID assigned to the file on the source platform. |
source_name | The filename on the source platform. The source and destination file names may not match if DryvIQ needed to sanitize the filename due to character or length restrictions for the destination platform. |
source_path | The path where the file is located on the source platform. |
destination_id | The ID assigned to the file on the destination platform. |
destination_name | The filename on the destination platform. The source and destination file names may not match if DryvIQ needed to sanitize the filename due to character or length restrictions for the destination platform. |
destination_path | The path where the file is located on the destination platform. |
link | The URL for the link detected. |
count | The number of times the link was found in the file. Link counts for spreadsheets will not always match depending on how the link was added to a cell. If the links are added to multiple cells at the same time, DryvIQ reads the link as one link shared across cells. In this instance, all shared links count as one link. If the links are added to multiple cells separately (one cell at a time), DryvIQ counts each cell as separate. In this instance, each link is counted individually. |
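Once exported, the report can be summarized with a few lines of Python. The sketch below totals the `count` column per source file. It assumes the CSV has a header row using the field names from the table above and one row per detected link per file. The sample rows and the function name are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def total_links_per_file(csv_text):
    """Sum the 'count' column for each (source_path, source_name) pair."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["source_path"], row["source_name"])
        totals[key] += int(row["count"])
    return dict(totals)


# Hypothetical export content; real exports will contain your own data.
SAMPLE_EXPORT = """source_id,source_name,source_path,destination_id,destination_name,destination_path,link,count
101,Budget.xlsx,/Finance,901,Budget.xlsx,/Finance,https://example.com/a,2
101,Budget.xlsx,/Finance,901,Budget.xlsx,/Finance,https://example.com/b,1
102,Plan.docx,/Projects,902,Plan.docx,/Projects,https://example.com/a,4
"""

for (path, name), total in total_links_per_file(SAMPLE_EXPORT).items():
    print(f"{path}/{name}: {total} link(s)")
```

Keep in mind the caveat about spreadsheet counts: a link shared across cells is counted once, so totals for XLSX files may be lower than the number of cells that display the link.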
Items
A link remediation status is assigned to every file included in a migration even if link detection isn’t enabled for a job. You can configure the Items page to display the status by changing the third or fourth column header to Link remediation status.
The column will display the link remediation status for every file. There are five statuses:
Nothing to remediate: No links were detected in the file.
Remediation needed: Links were detected in the file and require remediation to be executed to update the links.
Complete: Remediation was executed and finished processing. Regular URLs and unsupported URLs will also be considered “Complete” as there is no action to take against them.
Retry: Remediation was triggered but was not completed. Link remediation needs to be executed again to remediate the link.
Failed: At least one link in the file failed to be remediated. Failed files will not be reprocessed during subsequent link remediation executions unless the status is changed to “Retry.”
You also have the option of filtering the Items page based on a specific link remediation status. This allows you to narrow the results to display only files that need to be remediated, retried, etc.
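The five statuses and their effect on later remediation runs can be modeled in a small Python sketch. The enum and helper names are hypothetical. The rule encoded here comes from the status descriptions above: files marked Remediation needed or Retry still need remediation to be executed, while Failed files are skipped unless their status is changed to Retry.

```python
from enum import Enum


class LinkRemediationStatus(Enum):
    NOTHING_TO_REMEDIATE = "Nothing to remediate"
    REMEDIATION_NEEDED = "Remediation needed"
    COMPLETE = "Complete"
    RETRY = "Retry"
    FAILED = "Failed"


# Statuses that still require link remediation to be executed.
_AWAITING_EXECUTION = {
    LinkRemediationStatus.REMEDIATION_NEEDED,
    LinkRemediationStatus.RETRY,
}


def awaits_remediation(status):
    """True if a file with this status still needs remediation to run."""
    return status in _AWAITING_EXECUTION


def filter_by_status(items, status):
    """Mimic the Items page filter: keep (name, status) pairs that match."""
    return [name for name, item_status in items if item_status is status]
```

For example, `awaits_remediation(LinkRemediationStatus.FAILED)` is false, which reflects that failed files are not reprocessed until their status is changed to Retry.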
Links
The Links page provides information about each link identified. There will be an entry for each link identified; therefore, you will see the source item listed multiple times if multiple links were identified within the file. You can edit the second, third, and fourth columns to display the information most relevant to your review. Review the table below for a summary of the available column options.
Filtering by Remediation Status
The Filter by option on the Links page provides the ability to filter by remediation status. This allows you to view only files that have a specific status.
None: The file has no remediation status because no links were detected in the file.
Remediated: Remediation was executed and finished processing.
Ignored: DryvIQ was unable to match the link to the target file, or the link does not require remediation, such as a URL to an external website. You should verify that the link is correct and that the item it targets has been included in a transfer job so DryvIQ has tracking data for it. You can retry remediation for the link.
Unsupported: The link is unsupported and cannot be remediated. See Unsupported Links for a list of unsupported link types.
Retry: Remediation was triggered but was not completed. Link remediation needs to be executed again to remediate the link.
Failed: At least one link in the file failed to be remediated. Failed files will not be reprocessed during subsequent link remediation executions unless the status is changed to “Retry.”
Remediating Links
You must manually trigger link remediation for the job(s) that contain links. When link remediation runs, it will remediate the linked URL so it matches the new location of the linked file.
Supported Link Formats for Remediation
Link remediation will remediate supported links upon execution. For certain platforms, however, links must be in specific formats in order for link remediation to work. Information for those platforms is provided below.
Box
When remediating links from Box, only links in the following formats are supported:
https://<tenant>.app.box.com/file/<platform id>
https://<tenant>.app.box.com/folder/<platform id>
https://<tenant>.app.box.com/integrations/officeonline/openOfficeOnline?fileId=<platform id>&sharedAccessCode=
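To check ahead of time whether a Box link falls into one of the supported shapes, a rough Python sketch like the following may help. It is not DryvIQ's matching logic. The function name is hypothetical, the platform ID is treated as an opaque non-empty string, and the `sharedAccessCode` parameter is not validated.

```python
from urllib.parse import parse_qs, urlsplit

_BOX_HOST_SUFFIX = ".app.box.com"


def parse_box_link(url):
    """Return (item kind, platform id) for a supported Box link, else None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Require https and a non-empty tenant in front of ".app.box.com".
    if parts.scheme != "https" or not host.endswith(_BOX_HOST_SUFFIX):
        return None
    if len(host) == len(_BOX_HOST_SUFFIX):
        return None

    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) == 2 and segments[0] in ("file", "folder"):
        return segments[0], segments[1]

    if segments == ["integrations", "officeonline", "openOfficeOnline"]:
        file_ids = parse_qs(parts.query).get("fileId", [])
        if file_ids:
            return "file", file_ids[0]
    return None
```

A link such as `https://acme.app.box.com/file/12345` (a made-up tenant and ID) is recognized as a file link, while a link in any other shape returns `None` and would be worth reviewing manually.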
Microsoft
When remediating links from Microsoft OneDrive for Business, only links in the following format are supported:
https://<tenant>-my.sharepoint.com/:w:/r/personal/<User's OneDrive path>/_layouts/15/Doc.aspx?sourcedoc=%7B<Platform Id>%7D&file=<Filename>&action=default&mobileredirect=true
When remediating links from Microsoft Office 365, only links in the following formats are supported:
http://sharepoint.com/123/abc.txt
https://skysyncdesktop.sharepoint.com/123/456
http://sharepoint.com/_layouts/15/doc.aspx?sourcedoc=%7bf79c7ceb-c458-4c5d-bc51-4e76a280fd4a%7d&action=edit
https://skysyncdesktop.sharepoint.com/:x:/r/_layouts/15/Doc.aspx?sourcedoc=%7BD1825663-F6D6-4277-BE01-F5E8B67CA932%7D&file=Book.xlsx&action=default&mobileredirect=true
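Several of the Microsoft formats above carry the item's ID in a percent-encoded `sourcedoc` query parameter (`%7B` and `%7D` are the encoded `{` and `}`). The Python sketch below shows one way to read that ID and the optional `file` parameter with the standard library. The function name is hypothetical, and this only demonstrates how the URL is structured, not how DryvIQ resolves the link.

```python
from urllib.parse import parse_qs, urlsplit


def parse_sourcedoc_link(url):
    """Return (document id, file name or None) from a Doc.aspx-style link.

    Returns None when the URL has no 'sourcedoc' query parameter.
    """
    query = parse_qs(urlsplit(url).query)  # parse_qs also percent-decodes values
    source_docs = query.get("sourcedoc")
    if not source_docs:
        return None
    document_id = source_docs[0].strip("{}")  # drop the surrounding braces
    file_names = query.get("file")
    return document_id, (file_names[0] if file_names else None)


example = (
    "https://skysyncdesktop.sharepoint.com/:x:/r/_layouts/15/Doc.aspx"
    "?sourcedoc=%7BD1825663-F6D6-4277-BE01-F5E8B67CA932%7D"
    "&file=Book.xlsx&action=default&mobileredirect=true"
)
print(parse_sourcedoc_link(example))
```

For the example URL, which is taken from the list above, the sketch returns the ID `D1825663-F6D6-4277-BE01-F5E8B67CA932` and the file name `Book.xlsx`.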
Executing Link Remediation
Choose the job(s) by selecting the box in front of the job name.
Click More options and select Execute link remediation in the menu that displays.
The job will be queued to run.
Once the job is finished running, the link remediation status will be Complete if remediation was successful for all identified links.
If the link remediation status is Retry, link remediation was triggered but did not complete. You need to execute link remediation against the job again.
If the link remediation status is Failed, at least one link could not be remediated. You need to edit the link manually.
The link detection information on the Content Insights, Items, and Links pages will be updated to reflect the current link information.
Link Remediation Impact
Link remediation does not affect the transfer times or speed of migration jobs because it is a separate process executed after migration, once link detection is complete. It does, however, entail making additional calls to the destination and source platforms, so platforms with caps or overage charges may be impacted. Link remediation does add time to the overall migration project because it adds a separate process that requires execution. The time link remediation takes is roughly equivalent to the extra time it would take to do another delta run on a document count basis. For example, remediating links in 1,000 files in a job takes about as much time as running a delta run with modifications to 1,000 files. This should be factored in when planning your project if you plan on using link detection and remediation.