The Alibaba Cloud Cloud Native Serverless Team is a leading innovation force within Alibaba Cloud, dedicated to empowering developers and enterprises with cutting-edge serverless technologies. Focused on building scalable, cost-efficient, and fully managed serverless solutions, the team drives the evolution of cloud-native architectures by abstracting infrastructure complexity and enabling seamless integration with modern application development paradigms. The team delivers industry-leading serverless solutions that compete directly with AWS Lambda and the offerings of other global cloud providers.
Cloud Product Operations & Reliability
● Oversee stability maintenance, performance tuning, and high-availability architecture design for serverless system components. Ensure 24/7 reliability of mission-critical systems.
● Manage the lifecycle of containerized workloads on serverless clusters: implement deployments, auto-scaling, version upgrades, and resource optimization in serverless environments.
Incident Response & Root Cause Analysis
● Lead troubleshooting of incidents related to serverless, middleware, and cloud products (e.g., key-value storage issues, message backlogs, service registration failures) through log analysis, distributed tracing, and monitoring systems.
● Develop diagnostic tools using Go/Rust to resolve production issues, performance bottlenecks, and compatibility challenges.
Automation & Operational Excellence
● Build automation tools to standardize serverless system deployment, monitoring, and disaster recovery.
● Implement chaos engineering experiments, capacity planning strategies, and failover mechanisms to enhance system resilience.
Collaboration & Best Practices
● Partner with teams to optimize cloud product adoption strategies and deliver architecture design consultation.
● Create comprehensive technical documentation and drive standardization of serverless operations.
Minimum qualifications:
● Bachelor's degree or higher in Computer Science, with 3+ years of experience in SRE or serverless operations.
● Deep understanding of SRE principles: balancing reliability metrics (SLIs/SLOs) with engineering velocity.
● Proven ability to diagnose complex distributed system failures under pressure.
● Excellent communication skills to drive cross-team collaboration and technical documentation.
Preferred qualifications:
● Experience modifying the source code of cloud-based products for performance optimization; serverless experience is preferred.
● Expertise in large-scale distributed systems (10k+ topics, 1k+ node clusters).
● Kubernetes certifications (CKA/CKAD) or cloud provider certifications.
The pay range for this position at commencement of employment is expected to be between $104,400 and $171,000/year. However, base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience.
If hired, employee will be in an “at-will position” and the Company reserves the right to modify base salary (as well as any other discretionary payment or compensation program) at any time, including for reasons related to individual performance, Company or individual department/team performance, and market factors.