Responsibilities
This is a key role that requires engineering knowledge, production experience, and hands-on implementation ability. You will contribute in areas such as:
● Ensure the highest levels of performance, availability, and scalability for our systems.
● Work closely with the development team to integrate new deployment processes and strategies.
● Seek out problems and opportunities in DevOps enablement and infrastructure, and solve them.
● Help develop and maintain a state-of-the-art platform-as-a-service solution using current technologies and approaches (e.g. Kubernetes, Docker, microservices).
● Help develop the best possible continuous delivery pipelines, supporting features such as automated promotion to production, automated canary releases, and blue-green deployments.
● Implement monitoring and logging solutions that enable the production systems to be monitored 24/7.
● Respond to requests from engineering by building self-service solutions.
● Make sure that any technical solution you put in place is robust and scalable, and that failover/BCP systems are in place.
● Implement robust security measures for infrastructure, including monitoring and responding to attacks on our systems.
● Guide other SRE team members on large, complex projects.
● Work collaboratively with the engineering team, proposing technical solutions and taking on the challenges they raise.
Requirements
● Strong computer engineering foundation from work experience and a related academic degree.
● 5+ years of experience in IT operations, infrastructure, and systems engineering.
● Must have experience maintaining data centers and managing large networks.
● Hands-on experience with containerisation and container orchestration (e.g. Docker and Kubernetes)
● Must have experience in Linux system administration and performance tuning on RHEL/CentOS/Debian/Ubuntu distributions.
● Must have experience with load balancers, clustering, and failover technologies.
● Must have experience with CI/CD and configuration management tools such as Jenkins, GitLab, and Ansible.
● Must have scripting experience in Bash.
● Must have experience configuring service discovery and both cloud-based and non-cloud-based monitoring tools (Consul, Nagios/Icinga, Cacti, Stackdriver, New Relic).
● Experience designing failover clusters using Nginx, Varnish, MongoDB, Postgres, TimescaleDB, Redis, Kafka, and Elasticsearch is a plus.
● Hands-on experience with AWS or GCP is a plus.
● Strong team player with the ability to learn, communicate, and guide other members on new technology.
What We Do
Two95 International Inc. is a global technology firm specializing in enterprise solutions that revolve around BPM, Mobility, Cloud, Analytics, E-commerce, and Social Business. Our client base includes several Fortune 500 and mid-market companies across industries and geographies.
With 20 years of knowledge and know-how in the IT field, we were named an INC500 fastest-growing company in North America in 2013. In addition to being ranked 11th in Human Resources by INC500, we were nominated as the 3rd fastest-growing company in South Jersey by SJBM. We are ranked among the Top 20 IT Companies in New Jersey based on year-on-year growth over the last 3 years. We have a seasoned team of highly qualified personnel, with offices in New Jersey, Canada, and India.