Hi, I'm Cedric — but most people know me as cedi.
I'm a Site Reliability Engineering Tech Lead, working mainly on distributed systems, chaos engineering, and platform resilience at scale. Basically, if it's complex, distributed, and the lights have to stay on no matter what, I'm into it.
- Building reliable, large-scale systems with a focus on resilience, SLOs, and automation
- Leading teams and setting technical direction in high-stakes, high-scale environments
- Designing chaos experiments, improving release workflows, and modernizing infrastructure
- Evangelizing good SRE practices through talks, docs, and community work
- Home lab with a Raspberry Pi K3s cluster, Ceph storage, and a Stratum 1 NTP/PTP time server
- Cluster API-managed cloud Kubernetes cluster running a full Grafana LGTM stack
- Kernel recompilation just for fun (and for weird hardware drivers)
- Low-level distributed systems algorithms to explore gossip and consensus protocols
- Be excellent to each other 🤝
- Focus on fundamentals > chasing hype
- Alert on symptoms, not vitals
- Incidents are opportunities to learn
- There is no single "root cause"
- "How Complex Systems Fail" is required reading
- Your beloved system architecture exists mostly in your head and behaves (and fails) differently than you'd expect. (See the "Above the Line / Below the Line" framework.)