Targeted massive incident notification system for a globally distributed computation network

Vadim Efimov
15m
A typical 'cloud' service delivered over a globally distributed computation network commonly has an availability SLA below 100%, which means that incidents causing a service outage or service degradation are possible. With a multi-service approach and global segmentation, such an incident in most cases does not affect the whole system and all services at the same time. Instead, multiple incidents occur in different global regions and different services, each affecting only a subset of customers. Notifying all customers about every incident creates a negative image of the service provider's availability. This paper introduces a targeted incident notification system that sends an incident notification only to customers who are currently experiencing a service outage or degradation. An implementation of this system at RingCentral, a global 'cloud' telecommunications service provider, is presented.
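
The targeting idea can be illustrated with a minimal sketch. It is not the RingCentral implementation, which this abstract does not describe. It assumes, purely for illustration, that each customer is served from one region and subscribes to a set of services, and that an incident is scoped to one region and one service. All names (`Customer`, `Incident`, `affected_customers`, the region and service labels) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical customer record: one serving region, a set of services.
    name: str
    region: str
    services: frozenset


@dataclass(frozen=True)
class Incident:
    # Hypothetical incident scope: a single region and a single service.
    region: str
    service: str


def affected_customers(customers, incident):
    """Return only the customers whose region and services match the incident."""
    return [
        c for c in customers
        if c.region == incident.region and incident.service in c.services
    ]


# Illustrative data only; not taken from any real deployment.
CUSTOMERS = [
    Customer("acme", "eu-west", frozenset({"voice", "fax"})),
    Customer("globex", "us-east", frozenset({"voice"})),
    Customer("initech", "eu-west", frozenset({"sms"})),
]

incident = Incident(region="eu-west", service="voice")
recipients = [c.name for c in affected_customers(CUSTOMERS, incident)]
print(recipients)
```

In this sketch only one of the three customers is selected for the notification; the other two, who are in a different region or do not use the affected service, receive nothing.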