Cloud Solutions Specialist
Toku’s mission is to help companies engage with their customers efficiently. We have helped companies move all their voice communications to the cloud, and we recently began building products that help companies keep engaging with their customers no matter where their employees work.
Toku is now entering scale-up mode. We want to keep building momentum for our products across the APAC region and helping customers with their communications needs. As we build our Operations foundation, we are looking for an experienced Systems Engineer who can help enhance and maintain our hardware and software. Want to be part of our journey?
What would you be doing?
The Cloud Solutions Specialist is responsible for the configuration, reliability, and efficiency of cloud systems. He optimizes the capacity and performance of infrastructure, using knowledge of coding and scripting to automate recurring tasks, eliminate issues, and enable scalable, distributed systems. He also supports system installation and upgrades, performs continuous infrastructure monitoring, and ensures security and compliance when leveraging cloud platforms.
He possesses a high level of proficiency in scripting and programming languages. He is familiar with cloud platforms, scaling, and management of IT infrastructure. He works well with a variety of internal and external stakeholders. He can work on an on-call and shift basis, with the ability to prioritize effectively and operate under pressure.
The Cloud Solutions Specialist enjoys hands-on problem-solving and is driven by investigating challenging and complex problems. He is a resourceful and self-directed individual who performs independently with minimal guidance. He is also an analytical thinker who demonstrates strong interpersonal skills in cross-team collaboration.
What would you be responsible for?
- Develop processes and standards for system or application reliability in areas of availability, performance, latency, capacity, emergency response, capacity planning, change management, security, and monitoring
- Translate business needs into cloud architectural requirements
- Design scalable, robust systems using cloud architecture
- Create procedures and documentation for site reliability and incident management
- Build and run large-scale, massively distributed, and fault-tolerant systems
- Perform provisioning of cloud resources
- Configure infrastructure environment for software development and prototyping
- Conduct pre-deployment testing of systems to ensure reliability and scalability
- Implement operational cost control mechanisms for cloud infrastructure
- Identify and resolve deployment issues
- Patch and update systems to fix bugs and address security vulnerabilities
- Oversee configuration of operational systems to ensure alignment with technical and security requirements
- Conduct measurement and monitoring of overall performance, system health, system availability, and latency
- Provide proactive updates or alerts on infrastructure availability to relevant stakeholders
- Address gaps in performance or availability based on identified metrics
- Carry out testing and release procedures to ensure the rigour of infrastructure and services
- Resolve service operation issues and prevent recurrences
- Perform regular tuning of infrastructure and services
- Conduct capacity planning for cloud infrastructure and systems performance analysis
- Identify opportunities to enhance operational workflows, systems, and processes through automated deployment
We would love to hear from you if you have:
- Bachelor’s degree in Computer Science, Information Systems or a related field
- In-depth knowledge of Linux (RedHat, CentOS, Debian, Amazon Linux, etc.) and Windows Server 2019
- Experience with cloud services (AWS, Microsoft Azure)
- Experience with Oracle, MySQL, and MS SQL Server is a plus
- Experience in Docker and Kubernetes
- Expertise in Shell, Perl, and Python scripting, and in automation tools such as Terraform and Ansible
- Solid knowledge of protocols such as DNS, HTTP, LDAP, SMTP, and SNMP
- Experience managing server virtualization technologies such as VMware, KVM, or Microsoft Hyper-V
- Additional Linux certifications (RHCT, RHCE, etc.) and Microsoft certifications will be considered an advantage
- Strong problem-solving, communication, and documentation skills
- Must be organized, flexible, and able to adapt to a rapidly changing environment
- Positive, self-motivated individual who can complete tasks independently
- Ability to create and maintain strong working relationships with colleagues and customers
- Ability to work to tight deadlines under pressure
What would you get?
- Flexible working locations
- Training and Development
- Discretionary Yearly Bonus & Salary Review
- Healthcare Coverage based on location
- 20 days Paid Annual Leave (excluding bank holidays)
If you would love to experience working in a start-up growing at an accelerated speed, and you think you tick most of the requirements, join us!