Microsoft introduced support for SQL Server AlwaysOn Availability Groups with BizTalk 2016; and it’s a fair bit to digest. Availability Groups is available in SQL Server 2016 Enterprise Edition, and introduces some valuable and important options for disaster recovery and high availability. Implementing these features to provide resiliency and protection for BizTalk requires even more understanding and a lot of complexity.
Overview
SQL Server AlwaysOn Availability Groups (AAG from here on in) is essentially an extension of what we previously know as database mirroring; with a healthy dose of steroids! It utilises (and requires) Windows Server Failover Clustering, however, doesn’t necessarily require SQL Server to be ‘clustered’ in the typical failover instance style most are used to (these days referred to AlwaysOn FCI). Instead, you can have a combination of standalone instances and/or AlwaysOn FCI instances, with the AAG function acting as an umbrella over the top. This means you don’t need to have exotic hardware to provide the usual requirements of shared disk access, for example (this is not a requirement for AAG at all; unless you combine it with FCI, discussed later).
AAG delivers allows for “synchronous commits” for up to 3 nodes (i.e. your active and two passive nodes), which ensures any transactions are committed across your AAG nodes, resulting in a highly optimised disaster recovery solution – read: an RPO (recovery point objective, or data loss objective) approaching zero.
Furthermore, AAG enables “listeners” to provide network access points which allow for automatic failover (think network load balancing). This complements the DR functionalities, enabling the promotion of a new featureset, being HADR Enabled (High Availability/Disaster Recovery).
What’s interesting here, is that these technologies are necessarily new; SQL Server has evolved each of the previously mentioned feature sets which make up AAG over many releases, dating back to SQL Server 2008 (and possibly further back). SQL 2012 embraced the new “AlwaysOn” message and combined these technologies as a mature offering, not without its limits.
Does AAG & BizTalk work on SQL Server versions prior to 2016?
No. Prior to SQL Server 2016, cross database transactions on AAG & database mirroring implementations was not supported. This is a key requirement for BizTalk. From Microsoft for SQL Server 2014 (https://msdn.microsoft.com/en-us/library/ms366279(v=sql.120).aspx)
Cross-database transactions and distributed transactions are not supported by Always On Availability Groups or by database mirroring. This is because transaction atomicity/integrity cannot be guaranteed for the following reasons:
- For cross-database transactions: Each database commits independently. Therefore, even for databases in a single availability group, a failover could occur after one database commits a transaction but before the other database does. For database mirroring this issue is compounded because after a failover, the mirrored database is typically on a different server instance from the other database, and even if both databases are mirrored between the same two partners, there is no guarantee that both databases will fail over at the same time.
For distributed transactions: After a failover, the new principal server/primary replica is unable to connect to the distributed transaction coordinator on the previous principal server/primary replica. Therefore, the new principal server/primary replica cannot obtain the transaction status.
In Theory
Microsoft’s guidance is to create a number of SQL Server instances; a minimum of 8 (YES 8!) to create the Availability Groups with MSDTC support. That in itself is a challenge if you’re not used to automating your installs; an automated install with 4 instances on two servers running in parallel took approximately 2 hours just to install the SQL Server database engine components for this many instances.
This requires some serious infrastructure.
Learnings & Considerations
You can’t simply implement AAG & BizTalk without properly considering the entire solution. With so many moving parts, and various levels of documentation, it’s important to design first, validate your design, improve it, and move forward.
- Automate! The complexity of implementing AAG does not allow for mismatches here and there; while you might get away with it, investigating errors down the track is much more involved.
- Break it. Don’t assume AAG will magically continue to work if a data centre drops its’ network link. Test your DR strategy; AAG doesn’t replace the need for backups, even though it enables an offsite DR instance.
- Keep up to date. Not just with BizTalk and SQL Server, but your Windows Server version also! Windows Server 2008 works, but has some serious flaws in WSFC that will crumble your implementation if, for example, you span multiple datacentres; WSFC is much more robust and complete in Windows Server 2012 with dynamic quorum calculation.
- If you enable AAG on an instance, don’t mix it with other non-AAG databases. Include your databases in the AAG or put them on another cluster – you’ll find out the hard way if your instance fails over, that non-AAG databases aren’t replicated to your passive node!
- OnPremise, you’re looking at a minimum of 2 VMs for the SQL Server components – and that’s if we forget about SSIS & SSAS. 32GB of RAM and 8vCPUs is a minimum; with 4 instances + other services running on a single server, we have to really think about resource contention – CPU & RAM is just one part of the equation – what about your DISK? Ideally, you’ll have separate spindles for each instance’s Data/Log/TempDb locations.