Saratchandra Patnaik
Backend & Distributed Systems Engineer
I specialize in backend engineering, cloud infrastructure, and systems reliability — with production depth in distributed streaming systems, concurrent programming in C++, edge ML inference, and AI-driven observability tooling.
At Amagi Media Labs I owned reliability for 15+ microservices serving live broadcast clients on AWS EKS. I hold an MS in Computer Science from Arizona State University and currently research fault analysis and software verification in distributed systems.

Experience
Graduate Research Assistant
Arizona State University · Tempe, AZ
Software Verification, Validation & Testing
- Research fault analysis and correctness properties for distributed system components, contributing to active work in software verification, validation, and automated testing.
- Prepare publication-ready manuscripts through technical synthesis, literature analysis, and structured academic writing in collaboration with faculty.
- Develop course materials on formal verification techniques and automated testing frameworks for graduate-level instruction.
Software Implementation Engineer
Amagi Media Labs · Bengaluru, India
AWS EKS · Python · FastAPI · Kubernetes · Docker · ArgoCD · Terraform · Linux
Owned reliability and operations for large-scale broadcast streaming infrastructure on AWS EKS — debugging live failures, building failover logic, and running the automation layer that kept 15+ production microservices stable for major media clients.
- Cut media workflow latency from 8 minutes to 30 seconds (a 93.75% reduction) by building asynchronous Python/FastAPI frame and metadata pipelines with CV models.
- Reduced manual debugging time by 60% by designing an AI-powered observability pipeline that fed server logs to LLMs to automate root-cause analysis.
- Led the first successful deployment of Amagi-native ML speech-to-text services (Capsequo and Akashvani) and drove org-wide adoption by training teams on the deployment workflows.
- Improved deployment frequency by 30% for a Kubernetes platform hosting 15+ microservices by owning AWS EKS releases, implementing ArgoCD GitOps, and maintaining 99.9% rollout stability.
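The core of the log-to-LLM root-cause pipeline can be sketched as follows. This is a minimal illustration, not the production system: the error pattern, the context window size, and the prompt wording are assumptions, and the actual model call is left out.

```python
import re

# Assumed marker pattern for "interesting" log lines; the real pipeline's
# filters and the LLM call itself are not shown here.
ERROR_PAT = re.compile(r"\b(ERROR|CRITICAL|Traceback)\b")

def extract_error_context(log_lines, window=2):
    """Pull each error line plus a few surrounding lines of context."""
    picked = set()
    for i, line in enumerate(log_lines):
        if ERROR_PAT.search(line):
            picked.update(range(max(0, i - window),
                                min(len(log_lines), i + window + 1)))
    return [log_lines[i] for i in sorted(picked)]

def build_rca_prompt(service, log_lines):
    """Assemble the root-cause-analysis prompt an LLM would receive."""
    context = "\n".join(extract_error_context(log_lines))
    return (f"Service: {service}\n"
            f"Relevant log excerpt:\n{context}\n"
            "Identify the most likely root cause and the first failing component.")
```

Filtering logs down to error neighborhoods before prompting keeps token usage bounded and focuses the model on the failure, rather than the full log stream.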
Stream Reliability & Incident Response
Incident: A critical client's live stream fell back to rescue content due to an audio silence condition — a silent failure that required tracing logs across EC2 instances, Kubernetes pods, and multiple microservices to diagnose.
Finding: The provider had primary and secondary input streams but no automated health check or switching mechanism. Audio silence propagated undetected until the playout system gave up and triggered rescue content.
Fix: Implemented threshold-based failover logic that monitors audio presence and switches to the healthy secondary stream when silence persists beyond a defined window — eliminating rescue fallback and hardening the client's stream against input failures.
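The failover logic above can be sketched as a small state machine. This is a simplified model, not the production code: the RMS-based silence check, the threshold value, and the window count are illustrative assumptions.

```python
import math

SILENCE_RMS_THRESHOLD = 1e-3   # assumed level below which a window counts as silent
SILENCE_WINDOW_LIMIT = 5       # assumed consecutive silent windows before failing over

def rms(samples):
    """Root-mean-square level of one audio window (floats in [-1, 1])."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

class FailoverMonitor:
    """Switches from the primary input to the secondary when silence persists."""

    def __init__(self):
        self.active = "primary"
        self.silent_windows = 0

    def observe(self, samples):
        if rms(samples) < SILENCE_RMS_THRESHOLD:
            self.silent_windows += 1
        else:
            self.silent_windows = 0          # any audible window resets the counter
        if self.active == "primary" and self.silent_windows >= SILENCE_WINDOW_LIMIT:
            self.active = "secondary"        # fail over to the healthy input
        return self.active
```

Requiring several consecutive silent windows, rather than reacting to a single one, avoids flapping on brief gaps while still switching well before the playout system would give up and trigger rescue content.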
Software Engineer Intern
Blueplanet Solutions Inc. · India
MySQL · PHP · JavaScript · Linux
- Database Optimization: Analyzed MySQL execution plans and refactored queries into indexed stored procedures, cutting query execution time by 50% and enabling sub-second retrieval for 1,000+ user profiles.
- Root Cause Analysis: Traced intermittent memory leaks to unclosed DB connections in legacy PHP by parsing web server logs; patched the connection handling, improving stability by 30% and eliminating recurring crashes.
- Frontend: Built an async search UI with JavaScript (AJAX) and PHP to replace full-page reloads, improving time-to-result by 60%.
Skills
Cloud & Infrastructure
Backend Engineering
Reliability & DevOps
Systems Programming
AI & ML
Data & Storage
Languages
Projects
Multithreaded UDP Packet Processing Server
Problem: Build a telecom-grade server that receives encrypted binary UDP packets at high throughput and processes them without dropping or reordering under concurrent load.
Approach: Producer-consumer architecture with a thread pool and mutex-locked queues. Producer threads read raw UDP datagrams off the socket; consumers decrypt and process binary payloads in parallel. Statistics are published via POSIX shared memory IPC for external monitoring.
Result: Sustained high-throughput packet ingestion with ordered processing guarantees and zero data loss under concurrent load.
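The producer-consumer shape of the server looks roughly like this. The original is C++ with explicit mutex-locked queues; this Python sketch uses `queue.Queue` (which is internally locked) in its place, simulates datagrams instead of reading a real socket, and stands in a toy XOR for the actual decryption step.

```python
import queue
import threading

NUM_CONSUMERS = 4
SENTINEL = None  # enqueued once per worker to shut the pool down

def decrypt(payload: bytes) -> bytes:
    """Placeholder for the real decryption step (illustrative XOR only)."""
    return bytes(b ^ 0x5A for b in payload)

def consumer(in_q: queue.Queue, results: dict, lock: threading.Lock):
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        seq, payload = item
        plain = decrypt(payload)
        with lock:                     # results map is shared across consumers
            results[seq] = plain

def run_pipeline(datagrams):
    """Fan (seq, payload) datagrams out to a consumer pool, then
    reassemble the decrypted payloads in sequence order."""
    in_q = queue.Queue()
    results, lock = {}, threading.Lock()
    workers = [threading.Thread(target=consumer, args=(in_q, results, lock))
               for _ in range(NUM_CONSUMERS)]
    for w in workers:
        w.start()
    for dg in datagrams:               # producer: socket reads in the real server
        in_q.put(dg)
    for _ in workers:
        in_q.put(SENTINEL)
    for w in workers:
        w.join()
    return [results[seq] for seq in sorted(results)]   # ordered output
```

Tagging each datagram with a sequence number lets consumers process payloads in any order while the final reassembly step restores ordering, which is how parallel decryption and ordered delivery coexist.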
AWS IoT Greengrass Edge Face Recognition
Problem: Run real-time face recognition on an edge device with no pip access — cloud-based inference introduced too much latency and the Greengrass Lambda environment had no package manager.
Approach: Packaged raw PyTorch MTCNN and FaceNet models into a custom deployment bundle that runs in a pip-free Lambda environment on Greengrass. Edge inference events stream asynchronously to the cloud via MQTT and SQS, fully decoupling edge processing from cloud consumption.
Result: Sub-second face recognition at the edge with a cloud-synchronized event stream and no internet dependency at inference time.
CUDA GPU Accelerated Image Processing
Problem: CPU-based 2D Gaussian filtering was the bottleneck for image processing workloads — single-threaded and unable to scale with image resolution.
Approach: Parallel CUDA kernel with shared memory tiling to eliminate redundant global memory reads, loop unrolling to maximize instruction throughput, and grid-stride loops to handle arbitrary image sizes. Validated output fidelity against CPU reference using PSNR and SSIM.
Result: 20× speedup over CPU — full-resolution frames processed in milliseconds instead of seconds.
Personal AI Agent — RAG System
Problem: Build a composable AI assistant for vehicle diagnostics, ATS resume optimization, and document Q&A — without locking the system to a single LLM provider.
Approach: Clean Architecture to decouple domain logic from model providers (OpenAI, Anthropic). ChromaDB handles vector retrieval for document grounding; GPT-4 Vision processes image inputs for diagnostics. Each capability is an isolated use case wired through a shared retrieval layer. React frontend, FastAPI backend.
Result: Fully swappable model backends with consistent retrieval quality across task types and a usable dashboard for diagnostics and resume analysis.
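The provider-swapping idea from the Approach can be sketched with a small port-and-adapter pattern. The class and function names here are hypothetical stand-ins: the fakes replace real OpenAI/Anthropic clients and the toy retriever replaces ChromaDB vector search.

```python
from typing import Protocol

class LLMProvider(Protocol):
    """Port: any model backend only needs a complete() method."""
    def complete(self, prompt: str) -> str: ...

class FakeOpenAI:
    """Stand-in for a real OpenAI client adapter."""
    def complete(self, prompt: str) -> str:
        return f"[openai] {prompt}"

class FakeAnthropic:
    """Stand-in for a real Anthropic client adapter."""
    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"

class DocumentQA:
    """Use case: ground the question with retrieved chunks, then ask the model.
    Domain logic never imports a concrete provider."""
    def __init__(self, provider: LLMProvider, retriever):
        self.provider = provider
        self.retriever = retriever

    def ask(self, question: str) -> str:
        context = "\n".join(self.retriever(question))
        return self.provider.complete(f"Context:\n{context}\n\nQ: {question}")

def toy_retriever(question: str):
    # Stands in for a ChromaDB similarity search
    return ["chunk-1", "chunk-2"]
```

Because the use case depends only on the `LLMProvider` protocol, swapping model backends is a one-line change at the composition root, which is the property the Result claims.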