Active-Active

For Application Services + SQL

Executive Summary

Active–active redundancy ensures high availability by running multiple active instances, each fully capable of serving consumers independently of the others. Consistency and failover are managed by external mechanisms, often through application-level APIs that selectively synchronize only trusted objects and block suspicious changes. This independence limits the impact of a breach—an attack on one instance doesn't spread to the others—and enhances efficiency, as fewer standby resources can protect more primary systems.

Features below are examined in the context of Milestone XProtect VMS.

Architecture in a Milestone VMS environment

The architecture comprises two complete and independent XProtect solution instances, each featuring all XProtect services, SQL, and a separate media stack (recording service + media database). Devices stream to both VMSs simultaneously. Optionally, the VMS instances can be federated.

Active-Active Solution with Milestone XProtect

Failover

Two active VMS instances run in parallel, each capable of delivering all core services—authentication, configuration, video, events, and APIs. Clients can consume all services from a single instance or selectively draw different services/subservices from separate instances if one becomes unavailable. In this model, failover is client-driven, granular to the (sub)service level, and occurs as quickly as the client can redirect its requests.
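A minimal sketch of client-driven, per-service failover follows. It assumes two instances labeled `vms-a` and `vms-b` and a caller-supplied health check. The function names, instance labels, and service labels are hypothetical and are not part of any XProtect API.

```python
from typing import Callable, Dict, List

# Hypothetical instance labels, listed in order of preference.
INSTANCES: List[str] = ["vms-a", "vms-b"]


def pick_instance(service: str,
                  is_healthy: Callable[[str, str], bool],
                  preferred: List[str] = INSTANCES) -> str:
    """Return the first instance whose copy of `service` is healthy."""
    for instance in preferred:
        if is_healthy(instance, service):
            return instance
    raise RuntimeError(f"no healthy instance for service {service!r}")


def route_services(services: List[str],
                   is_healthy: Callable[[str, str], bool]) -> Dict[str, str]:
    """Choose an instance independently for every service."""
    return {service: pick_instance(service, is_healthy) for service in services}


# Example: only the video service on vms-a is down.
down = {("vms-a", "video")}
routes = route_services(
    ["authentication", "video", "events"],
    lambda instance, service: (instance, service) not in down,
)
# Authentication and events stay on vms-a; only video moves to vms-b.
```

Because the choice is made per service, an outage in one service on one instance does not force the client to abandon the other services on that instance.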

Some failover types are discussed below.

Authentication Failover

Authentication failover is managed by an external load-balancing gatekeeper, which routes all client requests to the primary and automatically redirects them to the secondary if the primary becomes unavailable.

Stream Failover

Near-instantaneous live and playback failover is achieved because at least two media stacks remain active, with redundancy-aware logic built directly into the Client (Smart Client/PSIM). See Active-Active Media Stack.

Rapid Stream Failover

HA Events (Subservice Failover)

Each VMS instance has an independent event stream. When both VMS instances function perfectly, the event streams are identical. In practice, however, various internal and external causes (camera sources, server loads, and client loads) may cause some events to appear in one stream but not the other. For example, a motion detection event may be raised in one VMS instance but not in the other because the second instance has lost the stream from the camera. In such cases, redundancy-aware software that processes events from both streams provides event-level HA.

HA Events
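One way a redundancy-aware consumer could combine the two event streams is sketched below. The `Event` fields, the `merge_event_streams` name, and the two-second matching tolerance are assumptions made for illustration, not a description of how any particular client works.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    instance: str     # which VMS instance reported the event (hypothetical label)
    source: str       # for example, a camera identifier
    kind: str         # for example, "motion"
    timestamp: float  # seconds since an arbitrary epoch


def merge_event_streams(stream_a: Iterable[Event],
                        stream_b: Iterable[Event],
                        tolerance: float = 2.0) -> List[Event]:
    """Union two event streams, dropping near-duplicate reports.

    Two events are treated as the same occurrence when they share source and
    kind and their timestamps differ by at most `tolerance` seconds.
    """
    merged: List[Event] = []
    last_kept: Dict[Tuple[str, str], float] = {}
    for event in sorted([*stream_a, *stream_b], key=lambda e: e.timestamp):
        key = (event.source, event.kind)
        previous = last_kept.get(key)
        if previous is not None and event.timestamp - previous <= tolerance:
            continue  # duplicate of an event that was already kept
        last_kept[key] = event.timestamp
        merged.append(event)
    return merged
```

With this approach an event survives if either instance reported it, so a camera stream lost by one instance does not remove that camera's events from the merged view.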

Consistency

An application-aware service ensures database consistency by using the VMS's API to retrieve known objects from one database and deposit them into another. The method, which we call Selective Object Synchronization, provides precise control over exactly which content, in the application's context, the user wants to keep consistent.

Active-Active SQL with Selective Object Synchronization

For example, the user may not want to keep Role changes automatically synchronized. They may wish to verify changes manually before committing to the other replicas. The service makes this possible.

Selective Object Synchronization blocking Role Changes
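The idea of synchronizing some object types while holding others for manual review can be sketched as follows. Plain dictionaries stand in for the two databases. In a real deployment the reads and writes would go through the VMS API, which is not shown here. The names `selective_sync`, `Store`, and `blocked_types` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Key = Tuple[str, str]    # (object_type, object_id)
Store = Dict[Key, dict]  # stand-in for objects retrieved through the VMS API


def selective_sync(source: Store,
                   target: Store,
                   blocked_types: Set[str]) -> Tuple[List[Key], List[Key]]:
    """Copy changed objects from source to target, except blocked types.

    Returns (applied, held): keys written to the target, and keys that
    differ but were withheld for manual review.
    """
    applied: List[Key] = []
    held: List[Key] = []
    for key, props in source.items():
        if target.get(key) == props:
            continue  # already consistent
        object_type, _ = key
        if object_type in blocked_types:
            held.append(key)
        else:
            target[key] = dict(props)  # shallow copy so the stores stay independent
            applied.append(key)
    return applied, held


# Example: a camera setting is synchronized, a Role change is held back.
primary = {("Camera", "cam1"): {"fps": 15},
           ("Role", "operators"): {"members": ["alice", "mallory"]}}
replica = {("Camera", "cam1"): {"fps": 10},
           ("Role", "operators"): {"members": ["alice"]}}
applied, held = selective_sync(primary, replica, blocked_types={"Role"})
```

The held list gives an operator a concrete set of pending changes to verify before committing them to the other replica.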

Latency

Near-instant synchronization relies on the VMS application's ability to inform external listeners of changes quickly. In XProtect, certain changes (such as camera password updates and stream parameter adjustments) raise events that are immediately available to listeners; many other changes raise no such events. Changes of the latter kind can only be detected and synchronized through polling, so synchronization cannot be truly instant. Nonetheless, even on large systems of up to a thousand cameras, a full synchronization takes less than an hour, and multiple runs can be scheduled daily to limit how long the instances stay out of step.
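For changes that raise no event, a polling pass has to work out what changed since the previous pass. One common way to do that, sketched here under the assumption that each object can be read as a JSON-serializable dictionary, is to compare content fingerprints between polls. The function names and the `seen` cache are hypothetical, and deleted objects are not handled.

```python
import hashlib
import json
from typing import Dict, List


def fingerprint(props: dict) -> str:
    """Stable hash of an object's properties."""
    payload = json.dumps(props, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def poll_for_changes(current: Dict[str, dict],
                     seen: Dict[str, str]) -> List[str]:
    """Return ids whose properties changed since the previous poll.

    `seen` maps object id to the fingerprint from the last poll and is
    updated in place, so the next poll reports only newer changes.
    """
    changed: List[str] = []
    for object_id, props in current.items():
        digest = fingerprint(props)
        if seen.get(object_id) != digest:
            changed.append(object_id)
            seen[object_id] = digest
    return changed
```

Under this scheme, a change that raises no event waits at most one polling interval plus the duration of the synchronization run before it reaches the other instance.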

Cybersecurity

An active-active setup offers enhanced cybersecurity for two key reasons: data center isolation and cyber-secure consistency management.

Datacenter Isolation

Each data center hosting a full VMS stack operates independently, with no shared clustering or storage with the other center. This isolation blocks lateral movement during an attack and ensures a clean, uncompromised environment is always available for recovery.

Insulated Datacenters provide better cybersecurity

Selective Object Synchronization

The mechanism of achieving consistency through an application-aware service provides unique cybersecurity advantages.

  • Ransomware Encryption: The attack is not propagated because the API returns no objects from an encrypted database, leaving nothing to synchronize.

  • Stealth Procedure Injection: Application APIs expose no DDL, so the procedure can’t propagate.

  • Schema Tampering: API calls fail when attempting to retrieve objects, and corruption is not replicated even when object-level controls are disabled.

  • Standby Disk Overfill: Because the primary and standby servers are no longer tied together by an ever-growing transaction log, a full disk on the standby becomes a localized maintenance issue rather than a cascading system failure.

Efficiency

Active–active systems can achieve M:N efficiency: N standby VMS instances can cover M simultaneous primary instance failures, with N < M. Although no solution on the market achieves N < M at this time, it is not difficult to envision such a deployment for XProtect should the demand arise.

2:1 Efficiency
