At trivago, we generate a huge volume of logs, and we have our own custom setup for shipping them, mostly using Protocol Buffers. Eventually we end up with some fields in Elasticsearch (ES) that contain partial (or full) URLs. For instance, in our specific case we store the query component of the URL in a field called query and the path component in a field named url_path. Sample values for these fields could be:
Posts about Monitoring
The Web Performance Impact Of Lossy Network Conditions
tl;dr: continuously monitor your CDN and origin servers on layer 3 with tools like MTR. Layer 3 issues on external middleware can have a significant impact on layer 7 web performance. In a recent rollout of a new cloud service, we monitored the impact of this service on web performance, UX and business metrics. For all cloud regions and origin servers, we had Synthetic and Real User Monitoring for our site in place.
Nomad - our experiences and best practices
Hello from trivago’s performance & monitoring team. One important part of our job is to ship more than a terabyte of logs and system metrics per day from various data sources into Elasticsearch, several time series databases and other data sinks. We do so by reading most of the data from multiple Kafka clusters and processing it with nearly 100 Logstashes. Our clusters currently consist of ~30 machines running Debian 7 with bare-metal installations of the aforementioned services.
Splitting a Monitoring Monolith into Separate Components
Ever heard about Microservices? Those tiny little pieces of code that are used to split a big pile of magic into smaller pieces of magic? Well, they’re not that tiny after all and require lots of preliminary work to use them properly. Have a look at this post to hear about my journey of splitting an existing monolith written in PHP into several microservices written in Go.
Cluecumber Report Maven Plugin for Cucumber test reporting
We were not as happy as we could be with out Cucumber test reporting solution - so we decided to build a new and shiny one from scratch.
Continuous Performance Monitoring for PHP - The tale of Blackfire at trivago
We’re a data-driven company. At trivago we love measuring everything. Collecting metrics and making decisions based on them comes naturally to all our engineers. This workflow also applies to performance, which is key to succeeding on the modern Internet.
Introducing Protector - a Circuit Breaker for Time Series Databases
At trivago we store a subset of our realtime metric data in InfluxDB and we are quite impressed by the load it can handle. Despite all the joy, we had to learn some lessons the hard way. It is pretty easy to overload the database or the web browser by executing queries that return too many datapoints.
Better Log Parsing with Logstash and Google Protocol Buffers
At trivago we rely heavily on the ELK stack for our log processing. We stream our webserver access logs, error logs, performance benchmarks and all kinds of diagnostic data into Kafka and process it from there into Elasticsearch using Logstash.
Elasticsearch and Kibana for Selenium Automation
The advances and growth of our Selenium-based automated testing infrastructure generated an unexpected number of test results to evaluate. We had to rethink our reporting systems. Combining the power of Selenium with Kibana’s graphing and filtering features totally changed our way of working.