Principal Reliability Engineer - Bozeman, MT

  • Oracle
  • Bozeman, MT, USA
  • Apr 25, 2022
Full time Engineering Software Tech

Job Description

Analyze, design, develop, troubleshoot, and debug software programs for commercial or end-user applications. Write code, complete programming, and perform testing and debugging of applications.

As a member of the software engineering division, you will analyze and integrate external customer specifications. Specify, design and implement modest changes to existing software architecture. Build new products and development tools. Build and execute unit tests and unit test plans. Review integration and regression test plans created by QA. Communicate with QA and porting engineering to discuss major changes to functionality.

Work is non-routine and very complex, involving the application of advanced technical/business skills in the area of specialization. Leading contributor individually and as a team member, providing direction and mentoring to others. BS or MS degree or equivalent experience relevant to the functional area. 7 years of software engineering or related experience.

If you are a Colorado resident, please contact us or email us to receive compensation and benefits information for this role. Please include this Job ID: 163740 in the subject line of the email.


My Oracle Support (MOS) provides an enterprise support solution for Oracle's customers. MOS is undergoing a re-write and will be going live in late 2022. The re-write of MOS will run on Oracle Cloud Infrastructure (OCI), leveraging a number of Oracle technologies and services, and has a microservice-based mid-tier running on Kubernetes. We are building out a MOS SRE team to empower our users with a support solution that is based on the latest technologies, is highly available, and provides stellar performance to ensure our customers a world-class support experience. We are seeking experienced SREs to assist in improving our solution and ensuring its reliable operation. Specifically, we are searching for engineers who bring fresh ideas, demonstrate a unique and informed viewpoint, and enjoy collaborating with cross-functional teams to develop an industry-leading solution and provide positive user experiences.


Our SREs are close partners with service owners in all aspects of service operations and ownership. We have a strong culture of innovation, collaboration, and teamwork inspired by the DevOps philosophy. Be part of a team of smart, motivated, and diverse people given the autonomy and support to do their best work. It is a dynamic and flexible workplace where you'll belong and be encouraged.

  • Design, write, ship, and motivate the creation of software and systems to increase observability, product reliability and organizational efficiency
  • Work closely with development and testing teams on maintaining operational health of core services 
  • Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation
  • Establish end-to-end monitoring and alerting on all critical components of the application
  • Monitor and manage uptime, end-to-end performance and operability of all service processes and dependent infrastructure to meet SLAs
  • Solve complex problems related to infrastructure cloud services and prevent their recurrence
  • Contribute to making our infrastructure simple, reliable, and easy to operate
  • Document your system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it; turn that knowledge into repeatable actions, and then into automation
  • Troubleshoot complicated cross-platform issues spanning OS, networking, and database layers in a cloud-based SaaS environment; handle live production incidents and debug application and infrastructure issues
  • Perform periodic on-call duties, respond to production incidents, and support development teams in addressing customer incidents
  • Be results-driven; thrive in a development environment that is agile, collaborative, and in start-up mode, even when faced with ambiguity


  • Minimum of a BS in Computer Science or an Engineering field, or equivalent experience
  • 5+ years of related experience
  • Experience in monitoring and analyzing infrastructure performance using standard performance monitoring tools such as Prometheus, Alertmanager, and Grafana
  • Ability to operate independently, make decisions, take action and take responsibility
  • Effective communication and interpersonal skills; ability to work with and coordinate across multiple teams
  • Have a software-centric mindset
  • Strong programming and scripting skills
  • Excellent troubleshooting skills
  • Experience architecting software

About Us

Diversity and Inclusion:
An Oracle career can span industries, roles, countries, and cultures, giving you the opportunity to flourish in new roles and innovate, while blending work life in. Oracle has thrived through 40+ years of change by innovating and operating with integrity while delivering for the top companies in almost every industry.
In order to nurture the talent that makes this happen, we are committed to an inclusive culture that celebrates and values diverse insights and perspectives, a workforce that inspires thought leadership and innovation.
Oracle offers a highly competitive suite of Employee Benefits designed on the principles of parity, consistency, and affordability. The overall package includes certain core elements such as Medical, Life Insurance, access to Retirement Planning, and much more. We also encourage our employees to engage in the culture of giving back to the communities where we live and do business.
At Oracle, we believe that innovation starts with diversity and inclusion, and that to create the future we need talent from various backgrounds, perspectives, and abilities. We ensure that individuals with disabilities are provided reasonable accommodation to successfully participate in the job application and interview process, and in potential roles to perform crucial job functions.
That’s why we’re committed to creating a workforce where all individuals can do their best work. It’s when everyone’s voice is heard and valued that we’re inspired to go beyond what’s been done before.