...
The models below are general architectural examples that serve as a good starting point for solution design. SkySync can be easily scaled up if necessary and prudent.
Model A – Low Volume / Modest Throughput
This model represents deployments that include the transfer or synchronization of (3) million file and folder objects or fewer. Achievable throughput will be somewhat modest and generally isn’t the top priority for the solution. This will be an easy configuration with minimal planning requirements. While this whitepaper can help and deployments can be fully managed by typical administrative staff, SkySync recommends that the Client Solutions Silver Launch Package be utilized to ensure solution success.
...
Model A SkySync Architecture Guidance
...
This is a single server model with no clustering configured
8+GB RAM, 60GB+ system drive, dual core processor or better
Windows Server 2008 SP2 or newer, fully patched, with .NET framework 4.5 or newer
SkySync installation generally consists of accepting the default answers which results in the creation and usage of a local SQL CE database
SQL CE supports a maximum database size of 4GB
If the 4GB SQL CE database size is reached, it is possible to convert the database to a full SQL Server database format
When SkySync is deployed to use SQL CE for the database, a maximum of (2) jobs may be run simultaneously
This model is intended for ease of implementation as opposed to high throughput and/or redundancy
...
Model B – Moderate Volume and/or Moderate Throughput
This model represents deployments that include the transfer or synchronization of (10) million file and folder objects or fewer at a moderate throughput. This model essentially represents the scale of what can be reasonably accomplished with a single server solution that is provisioned with better than standard resources. This solution will require some planning around database configuration and overall server specifications. Again, this whitepaper will be useful in deploying this solution model, but SkySync strongly encourages all customers to take advantage of the Client Solutions Gold Launch Package to ensure solution success.
Model B SkySync Architecture Guidance
...
This is a single server model with no clustering configured
32+GB RAM, 60GB+ system drive (solid state if possible), 4 or 8 core processor or better
Windows Server 2008 SP2 or newer, fully patched, with .NET framework 4.5 or newer
Before SkySync installation, SQL Server Express or full SQL Server Standard/Enterprise should be deployed on the SkySync processing server
Database Planning and Tuning Concepts (below) should also be implemented
SQL Server Express supports a maximum database size of 10GB
SQL Server Standard/Enterprise maximum database size is not a factor for this model
When SkySync is deployed to use SQL Server Express/Standard/Enterprise for the database, a maximum of (6) jobs may be run simultaneously by default
SkySync Tuning Concepts (below) may be employed to increase throughput when processing server resources are available and rate limiting is not a factor
This model is intended for advanced, single server implementation as opposed to the simpler default install that uses SQL CE
...
Model C – High Volume and/or High Throughput
This model represents deployments that include the transfer or synchronization of more than (10) million file and folder objects and/or transfers or synchronizations that require a very high throughput. Achievable throughput will be significantly determined by available bandwidth, network latency, SQL Server I/O and other resources, number of processing servers/threads and the level of API rate limiting on either the source or destination. This solution will require very careful planning around database configuration, SkySync cluster configuration, cluster specifications, and cluster location with respect to source and destination location. To be able to achieve the highest levels of throughput, SkySync Client Solutions will need to be engaged to assist with planning, deployment and cluster tuning processes.
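The thresholds that separate the three models can be summarized in a short sketch. The function name and the throughput labels are hypothetical, chosen only for illustration; the object counts come from the model descriptions in this section. Real solution design will weigh more factors than these two inputs.

```python
def choose_model(object_count: int, throughput: str = "modest") -> str:
    """Pick a starting-point deployment model ("A", "B" or "C").

    object_count: file and folder objects to transfer or synchronize.
    throughput:   "modest", "moderate" or "high" (hypothetical labels).
    """
    if object_count < 0:
        raise ValueError("object_count must be non-negative")
    if throughput not in ("modest", "moderate", "high"):
        raise ValueError("throughput must be modest, moderate or high")

    # Model C: more than 10 million objects and/or very high throughput.
    if throughput == "high" or object_count > 10_000_000:
        return "C"
    # Model B: up to 10 million objects and/or moderate throughput.
    if throughput == "moderate" or object_count > 3_000_000:
        return "B"
    # Model A: 3 million objects or fewer, modest throughput.
    return "A"


if __name__ == "__main__":
    print(choose_model(2_500_000))            # A
    print(choose_model(8_000_000))            # B
    print(choose_model(500_000, "high"))      # C
```

Treat the result as a first guess to refine during planning, not as a sizing guarantee.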
Model C SkySync Architecture Guidance
...
“Standard” storage performance for a given organization’s disk subsystem will be sufficient when there are approximately 1 to 4 processing servers. With more servers than that, or when a very large volume of file and folder objects (tens of millions) will be under transfer management, storage volume performance must be better than standard. For these extreme scale SkySync solutions with (5) or more clustered processing servers, it’s good to start thinking about Solid State Drive (SSD) class or Tier 1 class storage. Some organizations consider SSD class performance to be Tier 0. The point here is that very high IOPS support and very low latency become the most important factors in ensuring high sustained transfer throughput and overall solution stability throughout the SkySync processing cluster. The SkySync transfer engine can be massively parallelized and, given the amount of data logging that occurs during data transfer operations, this can put tremendous READ/WRITE pressure on the SQL Server database disk subsystem.
Guidance for disk performance is simple for extreme scale solutions. If throughput is the highest priority, then the SkySync SQL Server should be provisioned with the best available disk the organization has access to. If the disk is strong enough, then additional processing servers can be added to the cluster. If not, then adding additional processing servers will actually make transfer performance or cluster stability worse. So the key then becomes how to understand whether or not the disk subsystem is “keeping up” with requests.
The easiest way to confirm that the SQL disk I/O subsystem is servicing requests fast enough is to monitor the Disk Queue Length in Resource Monitor. To do this, launch the Windows Task Manager and navigate to the “Performance” tab, where basic system performance can be viewed. Then click the “Open Resource Monitor” link at the bottom of the Task Manager.
Once the Resource Monitor is open, navigate to the “Disk” tab and observe the disk queue length for the logical disk associated with the volume where the SkySync database and TempDB database data files are stored. The general rule of thumb is that Disk Queue Length should be less than the total number of physical drives that comprise the volume. That can get complicated with certain storage arrays where the number of disks for a given volume is obfuscated or, in the case of an SSD volume, there may only be a single drive.
In general, the ideal Disk Queue Length is less than 1. Anything in the single digit range during heavy processing is generally OK. If Disk Queue Length is in the low double-digit range, then the disk is a little underpowered and is beginning to impact performance. But if Disk Queue Length continues to climb, or is in the hundreds or thousands range, then the disk subsystem is certainly insufficient to handle incoming requests. In this situation, the disk subsystem will need to be upgraded to improve transfer throughput. If this is not possible, then it is recommended to shut down one or more SkySync processing servers while monitoring Disk Queue Length until reasonable numbers are achieved and maximum transfer throughput is identified.
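The Disk Queue Length guidance above can be expressed as a small classification helper. This is a minimal sketch: the function names and category labels are hypothetical, and the cutoff of 100 between "underpowered" and "insufficient" is an assumption, since the text only describes "low double-digit" and "hundreds or thousands" ranges.

```python
def classify_disk_queue_length(queue_length: float) -> str:
    """Map an observed Disk Queue Length to the guidance in the text."""
    if queue_length < 0:
        raise ValueError("queue_length must be non-negative")
    if queue_length < 1:
        return "ideal"
    if queue_length < 10:
        return "ok"            # single digits under heavy processing
    if queue_length < 100:     # assumed boundary, see lead-in
        return "underpowered"  # starting to impact performance
    return "insufficient"      # upgrade disk or remove processing servers


def within_drive_rule(queue_length: float, physical_drives: int) -> bool:
    """Rule of thumb: queue length below the number of physical drives."""
    if physical_drives < 1:
        raise ValueError("physical_drives must be at least 1")
    return queue_length < physical_drives


if __name__ == "__main__":
    for sample in (0.4, 6, 25, 800):
        print(sample, classify_disk_queue_length(sample))
```

A single sample says little. Readings taken repeatedly during heavy processing are what show whether the queue keeps climbing.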
...
As data and log files grow in SQL Server, during the extent operation SQL Server will, by default, overwrite any data in the new segment with zeros. This is a security measure designed to ensure that data from old files that once occupied the same space on disk cannot be read by SQL administrators. This is an edge case scenario and a very minor security risk, but because of it Instant File Initialization is not enabled by default, which allows the “zero overwrite” to occur.
This becomes a relevant concept as SQL databases grow by substantial amounts (another topic addressed below). An extent operation of several hundred megabytes or even gigabytes can take many seconds or even minutes to occur under the default condition when zeros must be written. During this time, the tables in the database are blocked. Blocked tables are very bad for SkySync’s highly tuned transfer scheduler engine. This condition can cause unexpected behavior as job processing threads try to figure out how to continue.
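To get a feel for why large extent operations can take seconds or minutes, a rough estimate is the growth size divided by the sustained write throughput of the volume. The sketch below is only that arithmetic. The function name and the 100 MB/s throughput figure are hypothetical examples, not measurements of any particular system.

```python
def zero_fill_seconds(growth_mb: float, write_mb_per_s: float) -> float:
    """Estimate how long zero-writing a file growth takes, in seconds."""
    if growth_mb < 0:
        raise ValueError("growth_mb must be non-negative")
    if write_mb_per_s <= 0:
        raise ValueError("write_mb_per_s must be positive")
    return growth_mb / write_mb_per_s


if __name__ == "__main__":
    # Assumed sustained write speed of 100 MB/s (illustrative only).
    for growth_mb in (512, 8192):
        seconds = zero_fill_seconds(growth_mb, 100)
        print(f"{growth_mb} MB growth: about {seconds:.1f} s of zero writes")
```

Under that assumed write speed, a growth of several hundred megabytes takes seconds and a multi-gigabyte growth takes over a minute, which matches the range described above.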
There are (2) ways to combat this condition. The best way to ensure fast extent operations is to implement Instant File Initialization. Another way, also recommended as a best practice, is to pre-create multiple data files that are pre-sized in anticipation of the amount of data that may potentially be stored in the database. This latter option will be addressed below in further detail.
The issue with the second option is that it can be difficult for inexperienced SkySync Administrators and DBAs to determine just how big those pre-sized files should be. So, the tendency is to over-allocate which can waste precious high performance disk space. Essentially, the best solution is a combination of the two. Pre-create and conservatively pre-size the SkySync and TempDB database data files to minimize extent operations. Then enable Instant File Initialization to ensure that if there must be extent operations, they happen quickly.
Steps to Enable Instant File Initialization
...
5) Restart the SQL Server service
Pre-Creating and Pre-Sizing SQL Data Files
SQL Server experts suggest that multiple database files can have a meaningful impact on database performance. Paul Randal is one of the leading authorities on all things SQL Server. His company, SQLSkills, maintains a website full of useful and reliable information regarding SQL Server. Among the many articles in his “In Recovery…” blog is this one which highlights the benefits of properly scaling out database data files. It also draws attention to potential performance pitfalls of not doing it correctly.
When a database is first created in SQL Server through the user interface, the database will be created with a single “mdf” data file by default. It is possible to create this database with the primary “mdf” data file and then multiple “ndf” data files as well. This gives the database an opportunity to store content across multiple data files.
Configuring SQL data files in this way allows for improved data access performance because it allows SQL Server to multi-thread disk I/O operations. This is particularly helpful when the administrator has the freedom to store each of the data files on a unique, high performance disk volume. This technique can be used to enhance the ability of the SkySync solution to scale out with many processing servers.
The general rule of thumb for the number of data files that should be created is ¼ to ½ the number of physical cores available to SQL Server. For an 8 core SQL Server, deploying (2) to (4) data files is ideal. For a 16 core SQL Server (4) to (8) files would be good. For fewer cores, trend towards the “1/2” number. For a machine with 16 cores or more, trending towards the “1/4” number is generally typical.
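The rule of thumb above, together with pre-sizing, can be sketched as follows. This is an illustration under stated assumptions: the function names, the database name, the directory, and the size and growth values are all hypothetical. The generated statements use the standard T-SQL ALTER DATABASE ... ADD FILE form and should be reviewed by a DBA before being run.

```python
from typing import List, Tuple


def data_file_range(physical_cores: int) -> Tuple[int, int]:
    """Return (low, high) data file counts: 1/4 to 1/2 of the cores."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    low = max(1, physical_cores // 4)
    high = max(1, physical_cores // 2)
    return low, high


def add_file_statements(database: str, directory: str, total_files: int,
                        size_mb: int, growth_mb: int) -> List[str]:
    """Build T-SQL for the secondary ("ndf") files.

    Assumes the primary "mdf" file already exists, so files 2..total_files
    are added. Inputs are assumed trusted; nothing is escaped here.
    """
    statements = []
    for index in range(2, total_files + 1):
        logical_name = f"{database}_data{index}"
        file_path = f"{directory}\\{logical_name}.ndf"
        statements.append(
            f"ALTER DATABASE [{database}] ADD FILE "
            f"(NAME = N'{logical_name}', FILENAME = N'{file_path}', "
            f"SIZE = {size_mb}MB, FILEGROWTH = {growth_mb}MB);"
        )
    return statements


if __name__ == "__main__":
    low, high = data_file_range(8)
    print(f"8 cores: {low} to {high} data files")
    for sql in add_file_statements("SkySync", r"D:\SQLData", high, 4096, 512):
        print(sql)
```

Generating the statements instead of executing them keeps the sketch free of any database driver and leaves the sizing decisions visible for review.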
...
Given the highly transactional nature of any migration platform, database maintenance plans become very important. Without proper maintenance, indexes can quickly become fragmented resulting in reduced performance.
Info |
---|
It is beyond the scope of this document to provide specific scripts for database maintenance. However, it is important to run index defragmentation and reorganization scripts at least twice per week to ensure that SkySync tables are properly maintained. If maintenance scripts are not standardized in the organization, Ola Hallengren provides very useful scripts that can rebuild indexes across all tables in a database but only when necessary. |
...
Like any migration solution, the data in the database can generally be considered transient. In other words, even if the database is lost, organizational content is not lost. This means that while database backups are useful to minimize any lost migration or synchronization processing time, recent log backups are not necessary.
Info |
---|
For these reasons, it is recommended that the SkySync database be configured for Simple Recovery model. There is no need to waste storage space and processing resources on data logging. |
...
With the SkySync database operating in the Simple Recovery model, database backups are straightforward. For high throughput solutions, a nightly full backup is generally sufficient. However, if the Recovery Point Objective (RPO) for the organization indicates that a full day of migration/synchronization processing is too much time loss, then additional differential backups can be scheduled at intervals throughout the day.
Ideally, database backups should be executed with compression enabled to minimize backup storage size and backup duration. However, this will come at a cost of increased CPU utilization on the SQL Server which is important to consider.
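As an illustration of the backup approach described above, the sketch below builds T-SQL BACKUP DATABASE statements for a nightly full backup and for intra-day backups, which SQL Server takes as differential backups. The function name, database name, and backup paths are hypothetical. Whether compression is available depends on the SQL Server edition in use, so treat that option as an assumption to verify.

```python
def backup_statement(database: str, backup_path: str,
                     differential: bool = False,
                     compression: bool = True) -> str:
    """Build a T-SQL BACKUP DATABASE statement (inputs assumed trusted)."""
    options = []
    if differential:
        options.append("DIFFERENTIAL")  # changes since the last full backup
    if compression:
        options.append("COMPRESSION")   # smaller and faster, more CPU
    sql = f"BACKUP DATABASE [{database}] TO DISK = N'{backup_path}'"
    if options:
        sql += " WITH " + ", ".join(options)
    return sql + ";"


if __name__ == "__main__":
    # Nightly full backup (hypothetical path).
    print(backup_statement("SkySync", r"E:\Backups\SkySync_full.bak"))
    # Intra-day differential backup for a tighter RPO.
    print(backup_statement("SkySync", r"E:\Backups\SkySync_diff.bak",
                           differential=True))
```

If CPU headroom on the SQL Server is limited, passing compression=False produces the same statements without the COMPRESSION option.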
...
2) Click on the performance tab in the SkySync configuration window:
3) Increase or decrease parallel writes as necessary by (1) or (2) at a time.
...