Site Reliability Engineer – Hotel Search
Our team develops the next-generation technologies that change how millions of users search for their ideal hotel at the best rate. We build and run the heart of trivago – our hotel search engine. We are looking for a full-time Cloud Engineer to help our on-site team in Düsseldorf operate, automate and improve the existing systems so that they run as efficiently as possible and deliver the best user experience to our customers.
The current team consists of seven Cloud Engineers and one engineering manager. We are responsible for operating the complete backend of our main product – our hotel search engine. Backend in this context means client-facing APIs, the search engine itself, a storage engine for prices and, finally, services that connect to all our advertisers. Every request on trivago goes through this system. We recently migrated the majority of the backend from on-premises to the cloud. We operate these microservices on Google Cloud with a managed Kubernetes Engine. Services communicate via gRPC over a service mesh powered by Istio. Prometheus powers our monitoring, and for distributed tracing we use Jaeger. A Kafka cluster manages the data transfer, and we automate our setups via Terraform.
We believe in the power and community of Open Source, and our motivation is to tackle operational challenges with a Software Engineering mindset. This is only a brief overview of our current setup. We have many ideas for how to move forward and be better than yesterday, and we look forward to hearing which initiatives you will bring to the team!
What you’ll do:
- Work in a cross-functional team with software engineers, technical project managers, and data scientists to design, implement, and improve our systems and procedures, and to evaluate system performance through advanced monitoring and observability.
- Help us finish the migration of our business-critical applications to a container-based infrastructure.
- Be part of a fully paid On-Call rotation.
- Troubleshoot malfunctions, support team members by investigating application misbehavior, implement solutions and document them in transparent, blameless postmortems.
- Implement and maintain an intelligent monitoring, metrics and alerting system for our services.
- Take ownership, contribute your ideas and help us to stay one step ahead: You will be encouraged to challenge our current processes and consider what we can do differently while always keeping business priorities and value creation in mind.
- Be an ambassador for cloud technologies, spread the word and share knowledge among your colleagues in peer exchanges, guild meetings and meet-ups.
What you’ll need:
- Relevant industry experience in (agile) Software Engineering, Site Reliability Engineering or Operations.
- A Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical discipline or equivalent experience.
- Experience working with highly scalable systems, preferably in the cloud.
- Good knowledge of shell scripting and Linux-based systems and tools.
- Experience in one or more of the following programming languages: Go (preferably), Java, Kotlin, C or C++.
- A pragmatic, value-oriented mindset that drives results in a fast-paced environment.
- A good enough level of English (our company language) to be comfortable speaking it daily.
What we’d love you to have:
- Hands-on experience with technologies from our application stack, like Docker, Google Cloud, Terraform, gRPC, Kafka, Istio, Prometheus, Linux, Kubernetes, Envoy, Jaeger or similar Cloud Native projects.
- Experience in applying the DevOps and Site Reliability Engineering mindset.
- Experience in being On-Call.
- Experience in working in a cross-functional team with software engineers, technical project managers and data scientists.
- A good understanding of container-based infrastructures, resource schedulers, advanced monitoring and observability systems, and modern infrastructure best practices.
- An open mind and the desire to learn about modern application and infrastructure best practices in areas like Monitoring, Networking, Containers and other Cloud Native topics.
- A proactive personality, good communication skills, and confidence in presenting ideas and findings to stakeholders.
What you can expect from life at trivago:
- Growth: We help you grow as trivago grows through support for personal and professional development, constant new challenges, regular peer-feedback, mentorship and world-class training.
- Autonomy: Every talent has the ability to make an impact independently by driving topics thanks to our strong entrepreneurial mindset, our horizontal workflow and self-determined working hours.
- International environment: Our agile, international culture and environment with talents from 50+ nations encourages mutual trust and creates a safe space to discuss openly and act freely.
- Collaborative spaces: Our state-of-the-art campus in Düsseldorf offers interactive spaces where we can easily collaborate, exchange ideas, take a break and work out together.
- Relocation: We offer our international talents support with relocation costs, work permit and visa questions, free language classes, subsidies for public transport, flat search, company pension and insurance.
- trivago N.V. is proud to foster a workplace free from discrimination. We strongly believe that diversity of experience, perspectives, and background will lead to a better environment for our employees and a better product for our users.
- To find out more about life at trivago follow us on social media @lifeattrivago.
- To learn more about tech at trivago, check out our blog: https://tech.trivago.com/
- Want to learn more about trivago’s business model to prepare for your interview? Visit https://company.trivago.com/our-product/
How to apply
If you like the sound of this position, please apply here