The Role
As an AI DevOps Engineer III, you'll enhance AI/ML services, automate processes, and collaborate closely with data scientists and engineers. Responsibilities include optimizing performance, implementing compliance tools, developing observability dashboards, creating CI/CD pipeline frameworks, and providing training on business support tools.
Summary Generated by Built In
The team
The Vonage AI group is looking for a sharp DevOps engineer to work on AI/ML-based services specializing in audio, text, and conversational AI systems. These services enhance existing products and drive innovation through new products within the company.
What will you do?
- Drive operational excellence with automation of business processes that enhance productivity of Engineering teams
- Work closely with data scientists and developers on performance optimization
- Proactively identify opportunities for business process improvements, recommending and driving implementation of tools and technologies
- Collaborate closely with cross-functional teams to integrate tools and services for Security and Compliance
- Build and maintain internally developed and vendor tools utilized to support business operations
- Design and implement central observability dashboards assimilating data from various Monitoring, Alert Management and Analytics platforms
- Develop and maintain tools to support Incident Management with integration across Okta, Slack, Jira, Confluence and Google Docs
- Develop secure and compliant CI/CD pipeline frameworks which can be replicated and adopted by cross functional engineering teams
- Build tools to generate periodic reports on service availability, performance, top incidents and other key organizational metrics
- Provide training and support to engineering teams on business support tools
What You Will Bring:
- Willingness and ability to learn Tools & Technologies quickly and apply them to improve business processes
- Experience building internal platform engineering tools with Python or Go
- Expertise working with Docker, Kubernetes, Helm and Argo
- Experience with IaC tools like Terraform, Pulumi or similar
- Experience with code repositories and code deployment tools such as GitHub, GitHub Actions, Azure DevOps and Argo CD
- Extensive experience with AWS components and services like EKS, EC2, VPC, CloudWatch, S3, IAM, Lambda, API Gateway, SQS
- Experience developing integrations with Collaboration and Issue tracking tools like Slack, Jira, Confluence, Google Docs, Google Sheets
- Experience using and developing integrations with Monitoring, Alert Management and Analytics platforms like Opsgenie, Nagios, Grafana, Prometheus, AWS CloudWatch, Elasticsearch, Kibana and Tableau
- Expertise with Linux operating systems and a good understanding of TCP/IP networking and IT security concepts
- Knowledge of user identity management, single sign-on and role-based access concepts
- Ability to present/lead technical discussions with cross functional Development, IT and Security teams
- Self-directed, able to work independently, with the attitude that everything can be automated
Good to have:
- MLOps / LLMOps experience
- Experience working on AI/ML projects, including building infrastructure and deploying and running self-hosted models
#LI-JB1
#LI-Remote
Top Skills
Go
Python
The Company
What We Do
We’re making communications more flexible, intelligent, and personal, to help enterprises the world over stay ahead. We provide unified communications, contact centers and programmable communications APIs, built on the world's most flexible cloud communications platform.
