Principles of cloud-scale routing protocols: lessons learned from the GCP outage

  1. The control plane must provide consistency and partition tolerance (CP).
  2. The data plane must provide partition tolerance and availability (AP).
  3. Isomorphism.

CAP theorem

In theoretical computer science, the CAP theorem, also named Brewer’s theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:

  • Consistency: Every read receives the most recent write or an error
  • Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write
  • Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes

The facts of existing design

Destination-based routing is CP mode

Destination-based forwarding requires all nodes to hold consistent forwarding tables. Inconsistent forwarding tables cause the well-known “micro loop” and “black hole” failures. In other words, CP mode cannot provide availability (CP without A).
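The failure modes above can be reproduced with a toy model. The following is a minimal sketch, not a real forwarding plane: the node names, the four-node topology, and the `forward` helper are all hypothetical and exist only to show how stale per-node tables turn into a micro loop or a black hole.

```python
def forward(tables, src, dst):
    """Follow per-node next hops from src to dst and report the outcome."""
    path = [src]
    node = src
    while node != dst:
        next_hop = tables.get(node, {}).get(dst)
        if next_hop is None:
            # The node has no entry for the destination: traffic is dropped.
            return "black hole", path
        if next_hop in path:
            # The packet returns to a node it already visited.
            return "micro loop", path + [next_hop]
        path.append(next_hop)
        node = next_hop
    return "delivered", path


# Hypothetical square topology A-B-D and A-C-D, destination D.
consistent = {"A": {"D": "B"}, "B": {"D": "D"}, "C": {"D": "D"}}

# Link B-D fails. B has already converged (D via A), but A still points at B.
stale_a = {"A": {"D": "B"}, "B": {"D": "A"}, "C": {"D": "D"}}

# Link B-D fails and B has withdrawn its route, but A has not converged yet.
no_route_b = {"A": {"D": "B"}, "B": {}, "C": {"D": "D"}}

print(forward(consistent, "A", "D"))
print(forward(stale_a, "A", "D"))
print(forward(no_route_b, "A", "D"))
```

Both failures disappear once every node holds the same view of the topology, which is why plain destination-based forwarding depends on consistency.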

Traditional routing protocols and SDN are CA mode

Traditional routing protocols such as BGP rely on a centralized route reflector (RR), which cannot be partitioned across multiple nodes. Massive-scale data centers that deploy BGP+EVPN therefore have to run EBGP between the spine and leaf layers to avoid this issue.

Segment Routing is AP mode

When Clarence Filsfils designed Segment Routing, TI-LFA assumed that the network may be inconsistent and focused on availability and partition tolerance (AP mode). TI-LFA separates the topology into a P space and a Q space. Within each space it assumes consistency and uses destination-based forwarding, and it inserts a label for the PQ node to connect the two spaces.
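To make the P space and Q space idea concrete, here is a minimal sketch that computes both sets for one protected link. It assumes symmetric link costs and a hypothetical five-node ring; the function names are illustrative and it is not the algorithm of any particular router implementation. A node is placed in the P space when no shortest path from S to it crosses the protected link S-E, and in the Q space when no shortest path from it to E crosses that link.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra: cost of the shortest path from source to every node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # outdated heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def pq_spaces(graph, s, e):
    """Return (P space, Q space, PQ nodes) for the protected link s-e.

    Assumes symmetric costs, so dist(n, e) equals dist(e, n).
    """
    link_cost = graph[s][e]
    from_s = shortest_distances(graph, s)
    from_e = shortest_distances(graph, e)
    # P: s reaches n strictly cheaper than any path that uses the link s-e.
    p_space = {n for n in graph if from_s[n] < link_cost + from_e[n]}
    # Q: n reaches e strictly cheaper than any path that uses the link s-e.
    q_space = {n for n in graph if from_e[n] < link_cost + from_s[n]}
    return p_space, q_space, p_space & q_space


# Hypothetical ring S-A-B-C-E-S with unit costs; the protected link is S-E.
ring = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "E": 1},
    "E": {"C": 1, "S": 1},
}

p_space, q_space, pq = pq_spaces(ring, "S", "E")
print(sorted(p_space), sorted(q_space), sorted(pq))
```

In this ring, B is the only PQ node. If link S-E fails, S can push B's segment and send the packet toward A, and B then forwards to E along its normal shortest path. The sketch does not cover the case where the two spaces do not intersect, which needs additional segments.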

Inconsistency-tolerant forwarding keeps the network basically available.

Design principles for cloud-scale routing protocols

Principle 1: The control plane must be CP.

The control plane must provide consistency and partition tolerance (CP), but it does not guarantee availability. A distributed key-value store such as etcd could serve as the next-generation control plane.
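The trade-off behind a CP control plane can be shown with a toy quorum rule. The sketch below is not the etcd API; `QuorumStore` and its methods are hypothetical and only model the behavior of a majority-based store: a partition that cannot see a majority of members refuses to answer rather than return possibly inconsistent data.

```python
class QuorumStore:
    """Toy CP key-value store: every operation needs a majority of members."""

    def __init__(self, cluster_size):
        self.cluster_size = cluster_size
        self.data = {}

    def has_quorum(self, reachable_members):
        # Strict majority, e.g. 3 out of 5.
        return reachable_members > self.cluster_size // 2

    def put(self, key, value, reachable_members):
        if not self.has_quorum(reachable_members):
            # Consistency is preferred over availability.
            raise RuntimeError("no quorum: write rejected")
        self.data[key] = value

    def get(self, key, reachable_members):
        if not self.has_quorum(reachable_members):
            raise RuntimeError("no quorum: read rejected")
        return self.data[key]


store = QuorumStore(cluster_size=5)

# Majority side of a partition (3 of 5 members reachable): the write succeeds.
store.put("route/10.0.0.0/24", "leaf-1", reachable_members=3)

# Minority side (2 of 5 members reachable): the store gives up availability.
try:
    store.put("route/10.0.0.0/24", "leaf-2", reachable_members=2)
    minority_write_accepted = True
except RuntimeError:
    minority_write_accepted = False

print(store.get("route/10.0.0.0/24", reachable_members=3), minority_write_accepted)
```

The minority side returning an error is exactly the "not available" part of CP, which is acceptable for the control plane only if the data plane can keep forwarding without it.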

Principle 2: The data plane must be AP->BASE.

The data plane must provide partition tolerance and availability (AP). BASE (Basically Available, Soft State, Eventual Consistency) is our design goal for the data plane. Segment Routing combined with a smart link-state protocol could be used to achieve this goal.
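One way to read BASE for a data plane is sketched below. It assumes a hypothetical `SoftStateFib` class with versioned route updates; none of these names come from Segment Routing or from any draft. The table keeps answering lookups from its last known state when the control plane is unreachable (basically available, soft state), and it converges once newer updates arrive, ignoring delayed older ones (eventual consistency).

```python
class SoftStateFib:
    """Toy forwarding table that keeps forwarding on last known state."""

    def __init__(self):
        self.routes = {}  # destination -> (next_hop, version)

    def apply_update(self, destination, next_hop, version):
        """Accept an update only if it is newer than the stored entry."""
        current = self.routes.get(destination)
        if current is None or version > current[1]:
            self.routes[destination] = (next_hop, version)
            return True
        return False  # late or duplicate update, ignored

    def lookup(self, destination, control_plane_reachable):
        """Answer from local state even when the control plane is down."""
        entry = self.routes.get(destination)
        if entry is None:
            return None
        next_hop, version = entry
        return {
            "next_hop": next_hop,
            "version": version,
            "stale": not control_plane_reachable,
        }


fib = SoftStateFib()
fib.apply_update("10.0.0.0/24", "leaf-1", version=1)

# Control plane partitioned away: forwarding continues, flagged as stale.
during_partition = fib.lookup("10.0.0.0/24", control_plane_reachable=False)

# Partition heals: the newer update wins, a delayed older one is ignored.
fib.apply_update("10.0.0.0/24", "leaf-2", version=3)
late_update_applied = fib.apply_update("10.0.0.0/24", "leaf-9", version=2)
after_heal = fib.lookup("10.0.0.0/24", control_plane_reachable=True)

print(during_partition, late_update_applied, after_heal)
```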

Principle 3: Isomorphism.

When we design an SD-WAN system, the whole system must work as one big router. When we design a cloud-scale SDN system, it must work as one big switch.

GCP outage root cause

Lessons from the GCP outage

The RCA clearly shows that Google’s internal routing protocol operates in CP mode (consistent but not available under network partitions). It may use the Chubby lock service to elect a leader in this massive-scale distributed system.

The best cloud scale routing system

Based on these thoughts and principles, I built a prototype called “Ruta”. It uses etcd as the control plane (draft-zartbot-srou-signalling) and Segment Routing over UDP as the data plane (draft-zartbot-sr-udp).

CONCLUSION

Based on the CAP theorem, we analyzed state-of-the-art routing protocols and SDN designs along with their limitations, and proposed three principles for next-generation cloud-scale routing protocols:

  1. The control plane must provide consistency and partition tolerance (CP).
  2. The data plane must provide partition tolerance and availability (AP).
  3. Isomorphism.
