Support Engineer (US Time Zone)
Experience: 1–3 years
Location: Remote
Work Hours: US Time Zone
About the Role
We’re looking for a coding-forward Support Engineer to own L2 investigations and fixes across our Intel OPEA-based GenAI services. You’ll dive into microservices (retriever/embedding/reranker/agent), APIs, and infra, reproducing issues, shipping small patches, and partnering with Platform/Dev teams to keep customer workloads healthy and fast. OPEA uses a composable, microservice architecture for enterprise GenAI (e.g., RAG blueprints, agents, OpenAI-compatible inference endpoints), which you’ll support and extend in production.
What You’ll Do
- Own L2 incidents end-to-end: triage, root-cause, hotfix (small code changes), and drive long-term fixes for OPEA services (e.g., retriever/embedding/reranker services, agents, inference gateway).
- Debug microservices & APIs: reproduce issues locally with Docker Compose/Kubernetes; verify health checks (a minimal health-check sketch follows this list); trace requests across components (LLM, vector DB, tool/agent).
- Code to unblock customers: write focused patches and scripts (Python/TypeScript, FastAPI/Node) for data prep, adapters, and service hardening.
- Pipeline reliability: monitor and tune RAG/agent pipelines (token/latency budgets, timeouts, batching, retries, circuit-breakers).
- Observability first: build/run dashboards and alerts (logs, metrics, traces; OpenTelemetry where applicable).
- CI/CD & IaC: maintain build/deploy for OPEA components; contribute to Terraform/Helm changes with DevOps.
- Compatibility & model routing: validate OpenAI-compatible endpoints, model switches, and fallbacks (on-prem/cloud).
- Docs & learning loops: keep high-signal runbooks, RCAs, and “best known methods” for recurring issues.
- Participate in US-hours on-call rotations; provide crisp stakeholder updates.
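To give a flavor of the day-to-day triage work, here is a minimal sketch of probing a service health endpoint with a timeout and bounded retries. The URL, port, and `/v1/health_check` route are illustrative assumptions, not a documented OPEA contract:

```python
"""Probe a microservice health endpoint with a timeout and bounded retries.

Illustrative only: the service URL and /v1/health_check route are assumptions.
"""
import time
import requests

HEALTH_URL = "http://localhost:7000/v1/health_check"  # hypothetical retriever service


def probe(url: str, attempts: int = 3, timeout_s: float = 2.0, backoff_s: float = 1.0) -> bool:
    """Return True once the service answers 200 within the timeout."""
    for attempt in range(1, attempts + 1):
        try:
            resp = requests.get(url, timeout=timeout_s)
            if resp.status_code == 200:
                print(f"attempt {attempt}: healthy ({resp.elapsed.total_seconds():.3f}s)")
                return True
            print(f"attempt {attempt}: HTTP {resp.status_code}")
        except requests.RequestException as exc:
            print(f"attempt {attempt}: {exc}")
        time.sleep(backoff_s * attempt)  # linear backoff between retries
    return False


if __name__ == "__main__":
    raise SystemExit(0 if probe(HEALTH_URL) else 1)
```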
Required Skills
- 1–3 years in Support/Platform/Dev/DevOps roles with significant coding in Python (preferred) or TypeScript/Node.
- Solid microservices debugging: REST/gRPC, auth, queues, caching, concurrency, rate limits.
- Containers & orchestration: Docker, Docker Compose; working knowledge of Kubernetes.
- Linux fluency and shell scripting.
- Cloud familiarity: AWS/Azure/GCP (networking, IAM, storage, managed K8s).
- Version control & CI/CD: Git + a common CI (GitHub Actions/Jenkins).
- Strong troubleshooting, crisp written/verbal comms, and customer empathy.
Nice to Have
- OPEA ecosystem familiarity (GenAIComps microservices like retriever/embedding/reranker; Agent service built on LangChain/LangGraph).
- Vector databases (Milvus/pgvector/FAISS), RAG patterns, prompt/tool/agent debugging.
- OpenAI-compatible API experience; gateway/proxy patterns; token accounting (see the fallback sketch after this list).
- Observability: Grafana/Prometheus, ELK/Datadog, OpenTelemetry traces.
- Infra & MLOps: Helm/Terraform; KServe/Ray/Airflow basics.
- Intel stack awareness (Xeon, Gaudi accelerators, OpenVINO) helpful but not required.
- Jira/ServiceNow/Zendesk for incident workflows; Agile practices.
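To illustrate the OpenAI-compatible routing and fallback work mentioned above, here is a minimal sketch using the `openai` Python SDK to try a primary endpoint and fall back to a secondary one. The base URLs, model names, and API key are hypothetical placeholders:

```python
"""Try a primary OpenAI-compatible endpoint, fall back to a secondary one.

The base URLs, model names, and key below are hypothetical placeholders.
"""
from openai import OpenAI, OpenAIError

ENDPOINTS = [
    # (base_url, model) pairs tried in order: on-prem first, cloud fallback.
    ("http://gateway.internal:8080/v1", "local-llm"),
    ("https://api.example.com/v1", "hosted-llm"),
]


def chat_with_fallback(prompt: str) -> str:
    last_error: Exception | None = None
    for base_url, model in ENDPOINTS:
        client = OpenAI(base_url=base_url, api_key="dummy-key", timeout=10.0)
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content or ""
        except OpenAIError as exc:
            last_error = exc  # record the failure and try the next endpoint
    raise RuntimeError(f"all endpoints failed: {last_error}")


if __name__ == "__main__":
    print(chat_with_fallback("ping"))
```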
What Success Looks Like
- Can reproduce and fix common OPEA microservice issues locally (compose/k8s), validate via health endpoints, and contribute small PRs.
- Ship/run dashboards + actionable alerts for latency, error budgets, and throughput across RAG/agent paths (a minimal metrics sketch follows this list).
- Improve customer-visible SLOs (availability, P50/P95 latency) through code/config changes.
- Author clean runbooks and RCAs that prevent repeat incidents.
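As a concrete example of the dashboards-and-alerts expectation, here is a minimal sketch using `prometheus_client` to expose latency and error metrics that a Grafana/Prometheus stack could alert on. The metric names, labels, and port are illustrative choices, not a fixed schema:

```python
"""Expose request latency and error counts for Prometheus to scrape.

Metric names, labels, and the port are illustrative, not a fixed schema.
"""
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUEST_LATENCY = Histogram(
    "rag_request_latency_seconds", "End-to-end RAG request latency", ["stage"]
)
REQUEST_ERRORS = Counter(
    "rag_request_errors_total", "Failed RAG requests", ["stage"]
)


def handle_request() -> None:
    """Simulated retriever call, timed and error-counted."""
    with REQUEST_LATENCY.labels(stage="retriever").time():
        time.sleep(random.uniform(0.01, 0.2))  # stand-in for real work
        if random.random() < 0.05:
            REQUEST_ERRORS.labels(stage="retriever").inc()
            raise RuntimeError("simulated retriever failure")


if __name__ == "__main__":
    start_http_server(9100)  # metrics served at :9100/metrics
    while True:
        try:
            handle_request()
        except RuntimeError:
            pass
```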