SONiC departs from the traditional network operating system model, in which configuration commands directly update hardware tables. Instead, SONiC transforms user intent (configuration) into system state through shared data models, with control plane responsibilities spread across loosely coupled services.
This article examines SONiC from a system design perspective, using VLAN creation as an example to trace how user intent (vlan add) flows through the layers of the SONiC NOS and becomes the state of the networking system.
While following the lifecycle of a VLAN across Redis databases and control plane services, the article explains why the SONiC architecture optimizes for large-scale operations rather than control plane latency.
Redis Database: the Control-Plane Backbone
In system design terms, SONiC manages a pipeline of increasing specificity. It moves from User Intent to Logical Representation to Hardware Programming. Redis acts as both the state store and the event bus, allowing independently developed services to converge on a consistent system state.
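The dual role of Redis can be sketched with a toy in-memory model. This is a minimal illustration, not the swss-common API: the ToyDB class and its methods are hypothetical stand-ins, and the "|" key separator is assumed from the CONFIG_DB key used later in this article.

```python
from collections import defaultdict


class ToyDB:
    """In-memory stand-in for one Redis database (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.tables = {}                      # key -> fields: the state store
        self.subscribers = defaultdict(list)  # table name -> callbacks: the event bus

    def subscribe(self, table, callback):
        self.subscribers[table].append(callback)

    def hset(self, key, fields):
        # Persist first, then notify, so a service that starts late
        # can still read the current state directly.
        self.tables.setdefault(key, {}).update(fields)
        table = key.split("|", 1)[0]
        for callback in self.subscribers[table]:
            callback(key, dict(self.tables[key]))


config_db = ToyDB("CONFIG_DB")
events = []
config_db.subscribe("VLAN", lambda key, fields: events.append((key, fields)))
config_db.hset("VLAN|Vlan100", {"vlanid": "100"})
print(events)
```

The subscriber is notified of the change, and the same data remains readable from the store afterwards, which is what lets independently developed services converge on the same state.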
VLAN creation flow:
The following section traces a VLAN creation as it propagates through SONiC's control plane, highlighting ownership boundaries and state transitions.
Step 1: Intent Injection and Desired State
The management interface (here, the CLI) writes the intent to CONFIG_DB (Redis database 4):
config vlan add 100
redis-cli -n 4 HGETALL "VLAN|Vlan100"
Result: vlanid = 100
At this stage, the intent is declarative and the hardware has not been touched.
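As a rough sketch of what "declarative" means here, the command can be reduced to a key and a field set with no side effects on hardware. The intent_from_command helper is hypothetical and is not the SONiC CLI implementation; the 1 to 4094 range check is an assumption based on the usable IEEE 802.1Q VLAN ID range, and a plain dict stands in for Redis database 4.

```python
def intent_from_command(command):
    """Map 'config vlan add <id>' to a CONFIG_DB key and fields (toy parser)."""
    parts = command.split()
    if len(parts) != 4 or parts[:3] != ["config", "vlan", "add"]:
        raise ValueError(f"unsupported command: {command!r}")
    vlan_id = int(parts[3])
    if not 1 <= vlan_id <= 4094:  # assumed range check, see lead-in
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    return f"VLAN|Vlan{vlan_id}", {"vlanid": str(vlan_id)}


config_db = {}  # stands in for Redis database 4
key, fields = intent_from_command("config vlan add 100")
config_db[key] = fields
print(config_db)
```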
Step 2: Intent Validation and Normalization
Vlanmgr (the vlanmgrd daemon) monitors CONFIG_DB for VLAN-related updates. It validates the configuration, performs internal checks, handles Linux kernel operations, and writes the processed configuration to the relevant tables in APPL_DB (Redis database 0):
redis-cli -n 0 HGETALL "VLAN_TABLE:Vlan100"
Result: admin_status = up, mtu = 9100, mac = 00:01:02:03:04:05
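The validation and normalization step can be approximated as a pure function from a CONFIG_DB entry to an APPL_DB entry. This is a simplified sketch, not vlanmgrd's actual logic: normalize_vlan is a hypothetical name, the default values are copied from the example output above, and the Linux kernel operations are omitted entirely.

```python
# Defaults copied from the example APPL_DB output above (assumed, not read from SONiC source).
DEFAULTS = {"admin_status": "up", "mtu": "9100"}


def normalize_vlan(config_key, config_fields, system_mac):
    """Validate a CONFIG_DB VLAN entry and return the APPL_DB key and fields."""
    table, _, name = config_key.partition("|")
    vlan_id = config_fields.get("vlanid", "")
    # Internal consistency check: the key name must match the vlanid field.
    if table != "VLAN" or name != f"Vlan{vlan_id}":
        raise ValueError(f"inconsistent VLAN entry: {config_key} {config_fields}")
    # APPL_DB uses a different table name and a ':' separator.
    return f"VLAN_TABLE:{name}", {**DEFAULTS, "mac": system_mac}


appl_key, appl_fields = normalize_vlan(
    "VLAN|Vlan100", {"vlanid": "100"}, "00:01:02:03:04:05"
)
print(appl_key, appl_fields)
```

Note how the output carries fields the user never typed: the normalized object is more specific than the intent that produced it.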
Step 3: Intent to Hardware Translation (Orchestration)
Orchagent subscribes to APPL_DB updates and translates normalized objects into SAI operations, which are serialized into ASIC_DB via the SAI Redis interface.
Step 4: Hardware Realization through ASIC_DB (e.g. SONiC-VS)
redis-cli -n 1 HGETALL "ASIC_STATE:SAI_OBJECT_TYPE_VLAN:oid:0x260000000005dd"
Result: SAI_VLAN_ATTR_VLAN_ID "100"
These entries are vendor agnostic: ASIC_DB (Redis database 1) stores the exact SAI state expected in the ASIC.
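The translation from an APPL_DB object to an ASIC_DB entry like the one above can be sketched as follows. This is an illustration only: to_asic_entry is a hypothetical function, the caller supplies the object index instead of an allocator, and the OID layout (object type in the upper bits, index in the lower bits) is an assumption inferred from the example OID, not a statement about how sairedis allocates identifiers.

```python
# Assumed from the leading byte of the example OID 0x260000000005dd.
SAI_OBJECT_TYPE_VLAN = 0x26


def to_asic_entry(appl_key, index):
    """Translate an APPL_DB VLAN key into an ASIC_DB key and SAI attributes."""
    table, _, name = appl_key.partition(":")
    if table != "VLAN_TABLE" or not name.startswith("Vlan"):
        raise ValueError(f"not a VLAN object: {appl_key}")
    vlan_id = int(name[len("Vlan"):])
    oid = (SAI_OBJECT_TYPE_VLAN << 48) | index  # hypothetical OID layout
    key = f"ASIC_STATE:SAI_OBJECT_TYPE_VLAN:oid:0x{oid:x}"
    return key, {"SAI_VLAN_ATTR_VLAN_ID": str(vlan_id)}


key, attrs = to_asic_entry("VLAN_TABLE:Vlan100", index=0x5DD)
print(key, attrs)
```

Nothing in the resulting entry names a vendor or a chip, which is the point of expressing hardware state in SAI terms.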
Step 5: Execution Boundary – Syncd
Syncd does the following:
- Reads the request that libsairedis wrote to ASIC_DB
- Executes the real SAI API call against the ASIC driver
- Updates ASIC_DB with the response (serialized attributes from hardware)
Once Syncd successfully executes the SAI operations, the VLAN is realized in the ASIC forwarding tables. At this point, the original user intent has fully converged into hardware state.
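The execution boundary can be modeled as a loop that drains queued requests, calls the driver, and records the outcome. This is a toy model under stated assumptions: FakeAsic stands in for a vendor SAI implementation, syncd_drain is a hypothetical name, and a deque stands in for the request path between libsairedis and syncd. The two status strings are real SAI status names, but how the real syncd handles them is not modeled here.

```python
from collections import deque


class FakeAsic:
    """Stand-in for a vendor SAI driver (hypothetical)."""

    def __init__(self):
        self.vlans = set()

    def create_vlan(self, vlan_id):
        if vlan_id in self.vlans:
            return "SAI_STATUS_ITEM_ALREADY_EXISTS"
        self.vlans.add(vlan_id)
        return "SAI_STATUS_SUCCESS"


def syncd_drain(requests, asic, asic_db):
    """Execute queued create requests and record the ones that succeed."""
    statuses = []
    while requests:
        key, attrs = requests.popleft()
        status = asic.create_vlan(int(attrs["SAI_VLAN_ATTR_VLAN_ID"]))
        if status == "SAI_STATUS_SUCCESS":
            asic_db[key] = attrs  # state is recorded only once the driver accepts it
        statuses.append(status)
    return statuses


request = (
    "ASIC_STATE:SAI_OBJECT_TYPE_VLAN:oid:0x260000000005dd",
    {"SAI_VLAN_ATTR_VLAN_ID": "100"},
)
asic, asic_db = FakeAsic(), {}
statuses = syncd_drain(deque([request, request]), asic, asic_db)
print(statuses, asic.vlans)
```

Only this component touches the driver; everything upstream of it deals in serialized state.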
Step 6: Observed State and Statistics – STATE_DB and COUNTERS_DB
- STATE_DB reflects operational state (e.g. link status)
- COUNTERS_DB reports statistics and resources
By separating desired state from observed state, SONiC enables reconciliation after crashes, restarts, or partial failures without replaying commands.
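Reconciliation without command replay amounts to diffing two snapshots. The sketch below assumes both desired and observed state are available as plain dictionaries keyed by object name; the reconcile function and the sample data are hypothetical and do not reflect SONiC's warm-restart implementation.

```python
def reconcile(desired, observed):
    """Compute the changes needed to make observed state match desired state."""
    to_create = {k: v for k, v in desired.items() if k not in observed}
    to_update = {
        k: v for k, v in desired.items() if k in observed and observed[k] != v
    }
    to_remove = [k for k in observed if k not in desired]
    return to_create, to_update, to_remove


desired = {"Vlan100": {"vlanid": "100"}, "Vlan200": {"vlanid": "200"}}
observed = {"Vlan100": {"vlanid": "100"}, "Vlan300": {"vlanid": "300"}}
to_create, to_update, to_remove = reconcile(desired, observed)
print(to_create, to_update, to_remove)
```

The function needs no history of which commands were issued or in what order, only the two states, which is why a restarted service can recover from the databases alone.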
Key Design Invariants:
- All configuration (user intent) originates in CONFIG_DB, and hardware state is derived from that intent
- Validation, orchestration, and hardware programming are intentionally separated to prevent tight coupling across domains
- Vendor-specific ASIC behavior is encapsulated behind the SAI boundary, and SONiC control plane components are abstracted from it
- Each control plane component must be able to restart and restore its state from Redis without reconfiguration
- STATE_DB and COUNTERS_DB provide visibility, not direct control over or override of user intent
Cost of Disaggregation:
SONiC’s control plane is designed to be modular, and that modularity comes with its own challenges.
A simple “VLAN add” operation results in multiple Redis writes, pub/sub notifications, and cross-container communication. From a latency and simplicity standpoint, this is objectively more expensive than a legacy NOS in which a single process updates the hardware tables. This overhead is intentional: it is the cost of disaggregation.
SONiC trades per-operation efficiency for:
- Fault isolation across modules
- Independent service restarts without impacting the system
- Clear ownership boundaries for system state and control
- Vendor agnostic hardware programming through SAI
Design Rationale:
SONiC’s architecture makes little sense if evaluated as the operating system of a single pizza box switch. It makes sense when evaluated for hyperscale systems.
The VLAN walkthrough illustrates that SONiC treats switch configuration as a distributed systems problem rather than a local device optimization problem. The design assumes that correctness, recoverability, and operational scale matter more than minimizing the number of control plane transactions per configuration change. The cost of disaggregation becomes increasingly favorable as the size of the switch fleet grows.

