The Role:
Our Platform Management engineers oversee the health, scalability and security of our cloud infrastructure. They take hands-on responsibility for maintaining and evolving the systems, services and tools shared across the engineering teams. In addition, they use their in-depth understanding of cloud development to advise, coach and assist the engineering teams as they evolve to respond to the ever-growing needs of our customers.
We’re looking for an experienced DevOps Engineer to join us and help maintain a healthy, scalable platform that can meet the ever-increasing demands of our rapidly growing customer base.
Responsibilities:
- DevOps Concepts and Execution: Comprehensive knowledge of DevOps methodologies, core principles, and best practices, encompassing continuous integration, continuous delivery, and automation.
- Cloud Service Providers: Skilled in using cloud platforms such as AWS and GCP, including compute, storage, networking, and serverless services.
- Containerization and Orchestration Tools: Proficient in container technologies like Docker and orchestration platforms such as Kubernetes for effective deployment, scaling, and application management.
- Infrastructure Automation (IaC): Experienced with infrastructure as code tools like Terraform, CloudFormation, or Ansible to automate the setup and management of infrastructure.
- CI/CD Pipeline Management: Capable of establishing and maintaining continuous integration and continuous delivery pipelines using tools such as Jenkins, GitLab CI/CD, CircleCI, and others.
- Scripting and Development Languages: Strong skills in scripting languages like Bash, Python, or Groovy, and proficiency in other programming languages relevant to DevOps, such as Java and Ruby.
- Monitoring and Logging Solutions: Knowledgeable about monitoring tools like Prometheus, Grafana, and the ELK stack, and skilled in setting up efficient monitoring and alerting systems.
- Security and Compliance Practices: Understanding of security best practices, identity and access management (IAM), and the ability to secure applications and infrastructure effectively.
- Networking Knowledge: Familiar with networking basics, including TCP/IP, DNS, load balancing, and firewalls.
- Collaboration and Communication Tools: Proficient in using tools like Slack, JIRA, Confluence, or similar platforms to enhance team communication and coordination.
- Problem-Solving and Troubleshooting: Capable of analyzing complex technical issues, diagnosing problems, and implementing effective solutions.
Required Skills:
- At least 6 years of technology experience.
- 4+ years of working experience in Python, Kubernetes, and AWS/GCP.
- Hands-on experience managing build tools like Jenkins, Bitbucket Pipelines, and GitHub Actions.
- Ability to identify inefficiencies in developer workflows and tools, and to implement solutions that improve performance and scalability.
- Hands-on experience using IaC tools like Terraform/Terragrunt.
- Experience working on Linux-based infrastructure.
- Excellent understanding of Ruby, Python, and Java.
- Expert technical troubleshooting skills and/or the ability to implement processes and controls to find root causes.
- Proven ability to learn new technologies quickly.
- Experience handling operations, deployment, and security of multiple SaaS/B2C products.
- Hands-on experience with log collection designs like EFK and metric collection tools like New Relic, Prometheus/Grafana, or SignalFx.
- Hands-on experience in the design, creation, and consumption of RESTful APIs and microservice architectures on public clouds, preferably AWS/GCP.
- Excellent analytical, communication, and coding skills are a must.