Customer success is a critical dimension in our daily operations. We aim to reduce any sort of user friction across our product portfolio. We historically handled customer complaints in a reactive manner with a focus on:
Being easily reachable for all audiences (mail, chatbox, phone); and
Ensuring quick response times.
When user volumes for some products started accelerating, we noticed that many of these user complaints were technical in nature. Edge cases no one had anticipated started producing error scenarios, which led to time-consuming fishing expeditions through the raw logs. Rather than spending more time on handling an increasing number of incoming requests, we started looking for ways to prevent them altogether and to spot broken code, crashes and failed API calls before user complaints poured in.
Sentry as an error tracking platform
With QIT online as the first test case, we introduced Sentry as an error and performance tracking tool. It covered several of our needs:
Automatic reporting of errors and exceptions
Automatic identification of performance issues
Pre-built Software Development Kits (SDKs) for integration
Support for the languages and frameworks used in our portfolio (Node, TypeScript, Symfony, Python), plus availability of source maps
Integration in our daily operational model
Once we settled on Sentry, we standardized our approach and applied the same principles to all products. Both for the staging and production environment, each product in our portfolio now has the following setup in Sentry:
APP: front-end. Typical error examples:
TypeError: null is not an object (evaluating '<something>')
TypeError: Cannot read properties of undefined (reading '<property>')
Error: Rendered more hooks than during the previous render.
API: back-end. Typical error examples:
QueryFailedError: column "<column>" is of type uuid but expression is of type text
QueryFailedError: invalid input syntax for type uuid: "undefined"
Error: ENOENT: no such file or directory, unlink '<file_path>'
Error logging of technical back-end issues
Performance tracking: specific endpoint speed and loading time
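The per-product, per-environment setup above can be pictured as a small helper that produces the options for each Sentry project. This is a minimal sketch under assumptions: the helper name, the product slug, the environment variable naming scheme and the sample rates are hypothetical. Only the option names (dsn, environment, release, tracesSampleRate) correspond to standard options of the Sentry SDK's init call.

```typescript
// Hypothetical helper: builds the options object one would pass to the
// Sentry SDK's init() call for a given product, component and environment.
type Component = "app" | "api";
type Environment = "staging" | "production";

interface SentryOptions {
  dsn: string | undefined;
  environment: Environment;
  release: string;
  tracesSampleRate: number;
}

function buildSentryOptions(
  product: string,
  component: Component,
  environment: Environment,
  version: string,
  env: Record<string, string | undefined>,
): SentryOptions {
  // Assumed naming scheme: one DSN per product and component,
  // for example SENTRY_DSN_QIT_API.
  const dsnKey = `SENTRY_DSN_${product.toUpperCase()}_${component.toUpperCase()}`;
  return {
    dsn: env[dsnKey],
    environment,
    // The release string lets errors be tied to a specific deployment.
    release: `${product}-${component}@${version}`,
    // Assumed rates: trace everything on staging, sample 20% in production.
    tracesSampleRate: environment === "production" ? 0.2 : 1.0,
  };
}
```

Keeping the options in one place like this is one way to make sure every product gets the same APP and API configuration on both staging and production.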
It’s one thing to set up an error tracking system; it’s another to act on it. To make error tracking part of our daily operational model, we introduced:
Slack integration: through the Sentry-Slack integration, all errors are automatically posted in the communication channel of the relevant product. This allows us to immediately prioritize and select an assignee. To avoid clutter in the Slack channels, Sentry offers a wide range of configuration options to determine which types of errors are worth notifying the team about. It also includes a throttling mechanism so the alerts don't spam the Slack channels: for example, a new error sends an alert only once, or once per configurable time frame.
Response time thresholds: we have set a response time threshold for each transaction. By measuring the actual response time the user faced as well as the number of times the threshold was exceeded, Sentry provides an automatic “user misery” score. It allows us to act first on those requests with a higher score.
Release logging: we log release dates in Sentry and map them to the commits in our GitHub code repository as well as to the people who worked on that particular release. This way we can track whether errors were triggered before or after a specific release, making it easier to pinpoint the exact issue.
Stress testing: in combination with tools like k6 (Node) and Locust.io (Python), we regularly stress-test our products and simulate extreme usage.
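Two of the rules above lend themselves to a short illustration: alerting on an error at most once per time window, and ranking transactions by how many users were affected by slow responses. The TypeScript below is a simplified sketch, not Sentry's implementation. The class and function names, the window length and the "four times the threshold" cut-off for a frustrated user are assumptions made for illustration, and Sentry's actual user misery calculation is more involved.

```typescript
// Sketch 1: notify at most once per time window for each distinct error.
class AlertThrottle {
  private lastSent: Map<string, number> = new Map();
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Returns true when an alert for this error should be posted now.
  shouldNotify(fingerprint: string, nowMs: number): boolean {
    const previous = this.lastSent.get(fingerprint);
    if (previous !== undefined && nowMs - previous < this.windowMs) {
      return false; // already alerted inside the current window
    }
    this.lastSent.set(fingerprint, nowMs);
    return true;
  }
}

// Sketch 2: simplified misery score for one transaction.
interface Sample {
  userId: string;
  durationMs: number;
}

// Fraction of unique users who saw at least one response slower than
// frustrationFactor times the threshold (assumed factor: 4).
function miseryScore(
  samples: Sample[],
  thresholdMs: number,
  frustrationFactor: number = 4,
): number {
  const users = new Set<string>();
  const miserable = new Set<string>();
  for (const s of samples) {
    users.add(s.userId);
    if (s.durationMs > thresholdMs * frustrationFactor) {
      miserable.add(s.userId);
    }
  }
  return users.size === 0 ? 0 : miserable.size / users.size;
}
```

Sorting transactions by this score in descending order gives the "act first on the higher score" ordering, while the throttle keeps a recurring error from flooding the channel.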
Error monitoring as a quality principle in our operating model
As a product studio, we deal with multiple products, users and related transactions on a daily basis. Over the past few months, we have made error monitoring an integral part of our operating model. By preventing user complaints and boosting performance, we proactively reduce user friction.
Want to discuss how your product can benefit from proactive error logging? Contact Sebastiaan via email@example.com for more information.