Senior Cloud Network Operations Engineer

Posted 17 Hours Ago
Heredia, Heredia
Mid level
Big Data • Machine Learning • Software • Analytics • Big Data Analytics
The Role
The Senior Cloud Network Operations Engineer will monitor critical infrastructure, triage alerts, investigate incidents, perform root cause analysis, develop monitoring tools, and communicate with stakeholders to improve platform reliability and stability at Databricks.

P-363

While candidates in the listed location(s) are encouraged for this role, candidates in other locations will be considered. This role is hybrid.

We're growing fast and attracting the best talent in the world. Bricksters — as we call ourselves — are a special mix of smart, curious, quick thinkers. If you ask a Brickster what they love about working here, you'll likely hear about our culture.

We are seeking an experienced Network Operations Center engineer to join our team. The successful candidate will be responsible for monitoring critical Databricks infrastructure and developing monitoring tools and alerting dashboards. They will also work closely with stakeholders to investigate and resolve incidents, perform root cause analysis, and propose solutions to increase the reliability and stability of the Databricks platform.

The impact you will have:

  • Monitor critical infrastructure, triage alerts to proactively identify incidents, and work with stakeholders to resolve incidents.
  • Investigate incidents and propose solutions to improve platform reliability and stability.
  • Perform root cause analysis for recurring incidents and provide proactive solutions.
  • Develop tooling or automate processes to improve platform monitoring and alerting.
  • Contribute to software development efforts to improve overall service reliability and stability.
  • Communicate with internal stakeholders, including executive staff, to provide incident analysis.
  • Participate in war rooms and temporary communication channels during outages.
  • Demonstrate cross-functional leadership and establish ownership of incidents and outages.
  • Multitask across several incidents and/or projects at once.

What we look for:

  • 3 years of experience as a NOC, SRE, or DevOps engineer
  • Knowledge of cloud technologies such as Azure, AWS, and GCP
  • Hands-on experience with monitoring, logging, and alerting tools
  • Hands-on experience with containers and orchestration technologies
  • Automation and scripting skills
  • Linux systems administration skills
  • Knowledge of incident management
  • Excellent communication skills
  • Technical degree or equivalent experience
  • Willingness to learn the Databricks products

About Databricks

Databricks is the data and AI company. More than 10,000 organizations worldwide — including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.

Benefits

At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks.

Our Commitment to Diversity and Inclusion

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Compliance

If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.

Top Skills

AWS
Azure
GCP
The Company
New York, NY
2,200 Employees
On-site Workplace
Year Founded: 2013

What We Do

As the leader in Unified Data Analytics, Databricks helps organizations make all their data ready for analytics, empower data science and data-driven decisions across the organization, and rapidly adopt machine learning to outpace the competition. By providing data teams with the ability to process massive amounts of data in the Cloud and power AI with that data, Databricks helps organizations innovate faster and tackle challenges like treating chronic disease through faster drug discovery, improving energy efficiency, and protecting financial markets.
