Production Support Engineer
Requirements
Your Qualifications and Skills:
- 2+ years of experience in a Production Support, Application Support, Technical Operations, Site Reliability Engineering (SRE), Support Helpdesk or Engineering role focused on production system operations.
- Hands-on experience using monitoring and observability platforms (e.g., Splunk, Datadog, ELK) to investigate live incidents.
- Solid experience with database systems, including the ability to write and execute complex SQL queries for data analysis and issue resolution.
- Experience coordinating between CX or other non-technical teams and engineering, and comfort with both technical and non-technical communication.
- Proficiency in at least one scripting language (e.g., Python, Bash) for automation and ad-hoc analysis.
- Bonus: Experience with incident management tools (e.g., PagerDuty, OpsGenie) and platforms such as Jira Service Management or Zendesk.
Benefits
Benefits of working at Rover:
- Long-term incentive plan with a company performance-based cash payout
- Pension plan
- Private medical insurance
- 25 days PTO
- Meal allowance and flexible compensation plan (transport and nursery)
- Gym membership
- €450 to cover the costs associated with the adoption of a pet
- Annual €150 wellness reimbursement
- Flexible work hours: sometimes you'll need to be in at certain times, but on the whole we're pretty flexible when it comes to managing workload and time
- Snacks and fresh fruit in our kitchen to keep you going
- Regular team activities, events, game nights, and more
- Dog-friendly office
Original posting
Who we are:
At Rover, pets and their people are at the heart of everything we do. We connect pet parents with trusted pet care across the U.S., Canada, Europe, and Australia.
Headquartered in Seattle and Barcelona, we're a values-driven, fast-growing tech company focused on building safe and personalized experiences tailored to the needs of each unique pet. We're investing in AI as a business accelerator and provide every team member with access to AI tools in service of creating better experiences for our community.
We are proud to be recognized as a great place to work, having been named among the 100 Best Companies to Work For in Seattle Business Magazine and Washington's Best Workplaces in the Puget Sound Business Journal. At Rover we're committed to creating an accessible, inclusive, and welcoming community, which starts with our employees.
Want to make an impact? Join our pack and come work (and play!) with us.
Meet the Site Reliability and Production Support Team
What we are looking for
Your Responsibilities:
- Act as the primary technical point of contact when CX escalates customer-impacting issues, translating business impact into clear technical problem statements.
- Triage incoming incidents, assess severity and urgency, and communicate status updates across stakeholders in a clear and timely manner.
- Manage the incident lifecycle from initial report through resolution, communicating status updates clearly to relevant stakeholders (CX, Product, Engineering).
- Develop and maintain comprehensive runbooks and knowledge base articles for common issues and standard operational procedures.
- Troubleshoot and debug complex production issues utilizing logging platforms (e.g., Splunk, ELK stack), monitoring tools (e.g., Datadog, Prometheus, Grafana), and database query tools (SQL, NoSQL) to diagnose the root cause of problems.
- Perform code-level analysis when necessary to pinpoint defects or architectural weaknesses contributing to production instability.
- Collaborate effectively with Product Development teams to prioritize, document, and hand off confirmed bugs and large-scale systemic issues for permanent resolution.
- Act as the escalation point for the CX team when issues require deeper technical investigation or coordination with engineering teams.
Our style:
- We are proud to be professional software developers building high-quality, scalable, and supportable solutions.
- We are curious and passionate about learning, providing the right environment and resources for professional growth.
- We are committed to building, fostering and maintaining a culture of inclusivity and diversity both on our teams and in our products.
- We embrace progressive engineering practices including automated testing and a continuous deployment pipeline.
- We are serious about the quality of our production operation, and have thorough system, application and user interaction monitoring and anomaly detection.
- We are passionate about data-driven decision-making.
- We are friendly, supportive and respectful, and we pay attention to the impact and quality of our work as well as keeping work/life balance.
- And, dogs in the office. Bring yours, too!
Application managed by Rover