VSOL is a digital enabler with a mission to help public and private organizations evolve their businesses through data and technology. We provide an end-to-end service, from consulting to execution, that drives the growth and innovation of our clients. As VSOL is in a phase of rapid expansion, we offer a dynamic, creative environment that accelerates your personal and professional development. We are looking for talented individuals eager to develop in international markets while contributing to the company’s future in a constructive and supportive manner.
Responsibilities:
The Manager – Technical Operations is a critical leadership role responsible for ensuring the stability, reliability, and performance of the organization’s technical infrastructure and systems. This position is accountable for the Technical Support function, which covers delivering 24/7 operational support, managing system and network operations, implementing monitoring solutions, and maintaining centralized tools and services. The role requires a blend of deep technical expertise in Linux/Unix/Windows systems, networking, and monitoring platforms, together with proven people management capabilities:
- Act as technical sponsor and signatory for systems onboarded into managed services, reviewing and approving solution architectures with a focus on operability, supportability, reliability, resilience, and scalability.
- Lead Production Readiness Reviews (PRRs) prior to go-live, define and validate SLIs, SLOs, and SLAs, and ensure architectural decisions align with contractual and operational commitments.
- Act as L3/L4 technical escalation owner for major or complex production incidents, lead technical triage, root cause analysis (RCA), and post-incident reviews.
- Define and maintain technical and operational standards for managed services, ensuring monitoring, alerting, logging, and tracing meet service requirements.
- Review and approve operational runbooks, escalation paths, and recovery procedures, and assess the technical impact of changes on production stability and SLAs/SLOs.
- Provide architectural guidance on observability design (metrics, logs, traces) and guide teams on effective troubleshooting patterns for distributed systems.
- Coordinate across engineering, DevOps, SRE, infrastructure teams, and external vendors during incident response and drive corrective and preventive actions.