Senior Cloud Architect

Ho Chi Minh

IT

Full-time

Senior Cloud Infrastructure & Reliability Engineer

We are seeking an experienced cloud engineer to take full responsibility for the design, operation, and continuous improvement of large-scale cloud platforms. This role focuses on building resilient, secure, and cost-efficient systems that remain highly available under heavy traffic and business-critical conditions.

You will own cloud reliability end to end, from infrastructure design and automation to incident response, security, and compliance. This is a hands-on leadership role for someone who thrives in high-ownership environments and is comfortable being accountable for uptime and recovery.


Key Responsibilities

Cloud Architecture & Operations
Design, deploy, and operate scalable cloud environments across major cloud providers, ensuring high availability, fault tolerance, and optimized operating costs. Take clear ownership of system reliability and performance.

Automation & Delivery Pipelines
Build and maintain automated deployment pipelines that support frequent, low-risk releases. Ensure infrastructure and application changes are repeatable, auditable, and safe.

Observability & Resilience
Implement monitoring, alerting, and automated recovery mechanisms to support near-zero downtime operations. Continuously improve visibility into system health and performance.

Business-Critical Continuity
Design and maintain disaster recovery strategies, multi-region redundancy, and automated failover mechanisms across application layers to ensure uninterrupted service delivery during outages.

Incident Ownership
Lead incident response during critical events, including investigation, coordination, and resolution. Perform root-cause analysis and implement long-term fixes to reduce recurrence and recovery time.

Infrastructure as Code Leadership
Define and manage infrastructure using Infrastructure as Code practices to ensure consistency, traceability, and rapid recovery across environments.

Security & Compliance
Establish and enforce cloud security standards, including identity management, encryption, logging, and access controls. Support compliance initiatives by implementing controls, automating evidence collection, and maintaining audit readiness.

Data Protection & Reliability
Ensure databases and stateful systems are resilient through proper backup strategies, replication, automated failover, and regular recovery testing.

Optimization & Improvement
Continuously improve system scalability, performance, reliability, security posture, and cost efficiency, with measurable outcomes.

Cross-Team Enablement
Partner with engineering, product, and go-to-market teams to embed reliability, security, and cost-awareness into everyday development. Mentor teams on cloud best practices and automation.


What We’re Looking For

  • Strong hands-on experience designing and operating cloud infrastructure in AWS, Azure, or similar platforms

  • Proven ability to build and maintain highly reliable, scalable, and secure systems

  • Practical experience with Infrastructure as Code tools and automation workflows

  • Solid background in CI/CD design and cloud-native deployment practices

  • Familiarity with monitoring, alerting, and automated recovery approaches

  • Clear understanding of redundancy, failover, and disaster recovery principles

  • Working knowledge of cloud security and compliance frameworks (e.g., SOC-type or ISO-aligned environments)

  • A collaborative mindset with the ability to mentor others and raise overall platform reliability
