Codenil

Why Your HTTP Response Handling Is Hurting Observability (and How to Fix It)

Published: 2026-05-17 21:32:36 | Category: Programming

When building API clients in Elixir, it's tempting to handle response parsing and error translation inside your function body after making the HTTP call. While this approach works on the surface, it creates a hidden gap in your observability—failures that occur after a successful HTTP response are silently ignored by your telemetry stack. This article explores why placing response handling inside middleware (rather than after Tesla.get or similar) is critical for accurate monitoring, and how to refactor your code to capture every failure.

What is the common mistake developers make when handling HTTP responses?

Many Elixir developers write API clients where response handling logic—like validating structure, extracting fields, and translating errors—occurs in a function called after the HTTP request finishes, often at the end of a pipeline. For example, a fetch_user/1 might look like client() |> Tesla.get("/users/1") |> handle_response(). While this pattern centralizes error translation, it means that any failure rooted in the response content (like malformed JSON or invalid field values) never triggers the request-level telemetry that your APM or logging system expects. The HTTP response itself succeeded (status 200), so the framework treats it as a success, even though your application deems it a failure.

Source: dev.to

Why does placing response handling outside of middleware cause telemetry gaps?

The key structural difference between a pipeline chain (|>) and Tesla middleware is where failure can be signaled. Middleware runs inside the request lifecycle, so it can turn a response into an error before the telemetry middleware records the outcome. With Tesla.Middleware.Telemetry, an {:error, reason} result shows up under the :error key in the metadata of the [:tesla, :request, :stop] event, and a raised exception emits [:tesla, :request, :exception]; your observability tooling picks up both. A regular function called after Tesla.get cannot do this, because it runs after the request has already been recorded as successful. So if your handle_response/1 determines that the body is invalid (e.g., a field expected to be a boolean is the string "true"), your logs, metrics, and alerts will miss that failure entirely. Real problems are undercounted, and your system looks healthier than it is.
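The gap can be reproduced in a small script. The following is a minimal sketch, not the author's code: the stub adapter, the Demo.StopHandler module, and the "active" field are assumptions made for illustration, and it relies on Tesla.Middleware.Telemetry emitting a [:tesla, :request, :stop] event whose metadata carries an :error key only for error results.

```elixir
Mix.install([{:tesla, "~> 1.9"}, {:telemetry, "~> 1.0"}])

defmodule Demo.StopHandler do
  # Hypothetical handler: forwards the :error metadata (nil when absent)
  # to the process passed in as the handler config.
  def handle_event(_event, _measurements, metadata, pid) do
    send(pid, {:request_stop, Map.get(metadata, :error)})
  end
end

# Stub adapter (assumption): always answers 200, but "active" is the
# string "true" instead of a boolean.
adapter = fn env -> {:ok, %{env | status: 200, body: %{"active" => "true"}}} end

client = Tesla.client([Tesla.Middleware.Telemetry], adapter)

:telemetry.attach(
  "demo-stop",
  [:tesla, :request, :stop],
  &Demo.StopHandler.handle_event/4,
  self()
)

# Post-hoc handling in the style described above: it runs after the
# telemetry middleware has already emitted its :stop event.
handle_response = fn
  {:ok, %{status: 200, body: %{"active" => active}}} when is_boolean(active) -> {:ok, active}
  {:ok, %{status: 200}} -> {:error, :invalid_body}
  {:ok, _env} -> {:error, :server_error}
  {:error, reason} -> {:error, reason}
end

result = client |> Tesla.get("/users/1") |> handle_response.()

recorded_error =
  receive do
    {:request_stop, error} -> error
  after
    1_000 -> :no_event
  end

IO.inspect(result, label: "caller sees")
IO.inspect(recorded_error, label: "error recorded by telemetry")
```

Under these assumptions the caller ends up with an error tuple while the telemetry event carries no error at all, which is exactly the mismatch described above.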

Can you give real-world examples of APIs that produce unexpected responses?

Absolutely. Tutorials often assume clean APIs like GitHub, but real APIs are far messier. I've seen:

  • Production JSON that fails to parse—actually broken syntax, not just unexpected fields.
  • A field documented as a boolean returning the JSON string "true" for true, and "No" for false.
  • A string field returning "N/A" or an empty string to signal missing data, even when the schema says it's always present.

These examples show that parsing response content is just as fallible as parsing the HTTP envelope. A connection timeout fails the request; a malformed boolean should too. But if your failure logic lives outside middleware, the telemetry never sees it.
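One way to treat such values as failures rather than quietly coercing them is a pair of strict accessors. This is a minimal sketch in plain Elixir; the Acme.Fields module, its function names, and the error tuple shapes are hypothetical and chosen only for illustration.

```elixir
defmodule Acme.Fields do
  # Hypothetical helper module; nothing here comes from a library.

  # Accept only real JSON booleans. Strings such as "true" or "No" are
  # reported as errors instead of being coerced.
  def fetch_boolean(body, key) when is_map(body) do
    case Map.fetch(body, key) do
      {:ok, value} when is_boolean(value) -> {:ok, value}
      {:ok, other} -> {:error, {:invalid_field, key, other}}
      :error -> {:error, {:missing_field, key}}
    end
  end

  # Treat "" and "N/A" as missing data, even if the schema claims the
  # field is always present.
  def fetch_string(body, key) when is_map(body) do
    case Map.fetch(body, key) do
      {:ok, value} when value in ["", "N/A"] -> {:error, {:missing_field, key}}
      {:ok, value} when is_binary(value) -> {:ok, value}
      {:ok, other} -> {:error, {:invalid_field, key, other}}
      :error -> {:error, {:missing_field, key}}
    end
  end
end

IO.inspect(Acme.Fields.fetch_boolean(%{"active" => "true"}, "active"))
IO.inspect(Acme.Fields.fetch_string(%{"name" => "N/A"}, "name"))
```

Returning tagged tuples keeps the decision about what counts as a failure in one place, so the same checks can later run from inside the request pipeline.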

How does the typical handle_response/1 pattern fall short?

Consider a shared handle_response/1 function that every endpoint in your module calls. In fetch_user/1, you might have:

case client() |> Tesla.get("/users/#{id}") |> handle_response() do
  {:ok, body} -> {:ok, decode_user(body)}
  {:error, :not_found} -> {:error, :not_found}
  _ -> {:error, :server_error}
end

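This looks clean, but the problem is subtle: handle_response/1 runs after Tesla has already recorded a successful HTTP transaction. Any content problem it detects collapses into the catch-all {:error, :server_error}, and if decode_user(body) fails (e.g., because body has a missing field), it either raises in the caller or has its result wrapped in {:ok, ...}. In every one of these cases the request-level telemetry has already reported success, so your APM tool never receives a failure event. You've effectively hidden the real failure from your monitoring.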

What is the correct approach using middleware to ensure proper observability?

The fix is to move validation and parsing into a custom Tesla middleware. Middleware runs inside the request/response pipeline, so it can turn a response with invalid content into {:error, ...} before the result leaves the pipeline. Ordering matters: list the custom middleware after Tesla.Middleware.Telemetry so that telemetry wraps it and records its result, and before Tesla.Middleware.JSON if it needs the decoded body, because responses travel back through the middleware list in reverse order. For example, create a module Acme.SuccessDecoder that implements call/3 and inspects the response body. If the JSON is fine but a required field is missing, return {:error, :invalid_body}. Then plug that middleware into your client. Now every failure, whether from a network timeout or a badly formed field, is counted the same way in your logs and metrics.
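A middleware along these lines might look as follows. This is a sketch under stated assumptions rather than a definitive implementation: the validation rule (a boolean "active" field), the stub adapter, and the Acme.StopHandler module are hypothetical, and the JSON middleware is left out because the stub already returns a decoded map.

```elixir
Mix.install([{:tesla, "~> 1.9"}, {:telemetry, "~> 1.0"}])

defmodule Acme.SuccessDecoder do
  @behaviour Tesla.Middleware

  @impl Tesla.Middleware
  def call(env, next, _opts) do
    # Run the rest of the pipeline first, then inspect what came back.
    case Tesla.run(env, next) do
      {:ok, %Tesla.Env{status: status, body: body} = env} when status in 200..299 ->
        if valid_body?(body), do: {:ok, env}, else: {:error, :invalid_body}

      other ->
        other
    end
  end

  # Hypothetical rule for illustration: "active" must be a real boolean.
  defp valid_body?(%{"active" => active}) when is_boolean(active), do: true
  defp valid_body?(_body), do: false
end

defmodule Acme.StopHandler do
  # Hypothetical handler: reports the :error metadata (nil when absent).
  def handle_event(_event, _measurements, metadata, pid) do
    send(pid, {:request_stop, Map.get(metadata, :error)})
  end
end

# Stub adapter (assumption): a 200 response whose "active" field is a string.
adapter = fn env -> {:ok, %{env | status: 200, body: %{"active" => "true"}}} end

# Telemetry is listed first, so it wraps the decoder and sees its result.
client = Tesla.client([Tesla.Middleware.Telemetry, Acme.SuccessDecoder], adapter)

:telemetry.attach(
  "acme-stop",
  [:tesla, :request, :stop],
  &Acme.StopHandler.handle_event/4,
  self()
)

result = Tesla.get(client, "/users/1")

recorded_error =
  receive do
    {:request_stop, error} -> error
  after
    1_000 -> :no_event
  end

IO.inspect(result, label: "caller sees")
IO.inspect(recorded_error, label: "error recorded by telemetry")
```

With the decoder inside the pipeline, the caller and the telemetry handler should agree that the request failed. Swapping the two middleware entries would reintroduce the gap, since telemetry would then record the outcome before the decoder runs.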

Are there differences between Tesla and Req that matter for this pattern?

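For the purposes of this article, Tesla and Req are close to equivalent: both offer pipeline-style extension points. In Req these are request, response, and error steps. The built-in ones live in Req.Steps, and you add your own response step with Req.Request.append_response_steps/2 (plugins conventionally wrap this in an attach function of their own). The same concept applies: attach a step that validates the response body and, if something's off, returns an exception in place of the response, so the call yields {:error, exception}. Whether you use Tesla or Req, the principle remains: handle response content inside the pipeline, not after it. One caveat: the telemetry benefit only carries over if your instrumentation wraps the step pipeline. Events emitted lower in the stack, for example by the HTTP adapter, fire before response steps run, so check where your events originate.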
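For Req, the equivalent might be sketched as a response step. The Acme.ValidateBody module, the validation rule, and the stub adapter are assumptions for illustration; retry and decode_body are switched off only to keep the stubbed example deterministic.

```elixir
Mix.install([{:req, "~> 0.5"}])

defmodule Acme.ValidateBody do
  # Hypothetical response step. It receives {request, response} and returns
  # either {request, response} or {request, exception}.
  def run({request, %Req.Response{status: 200, body: %{"active" => active}} = response})
      when is_boolean(active) do
    {request, response}
  end

  def run({request, %Req.Response{status: 200}}) do
    {request, RuntimeError.exception("invalid body: \"active\" must be a boolean")}
  end

  # Non-200 responses are passed through untouched in this sketch.
  def run({request, response}), do: {request, response}
end

# Stub adapter (assumption): a 200 response whose "active" field is a string.
adapter = fn request ->
  {request, Req.Response.new(status: 200, body: %{"active" => "true"})}
end

req =
  Req.new(
    url: "https://example.invalid/users/1",
    adapter: adapter,
    retry: false,
    decode_body: false
  )
  |> Req.Request.append_response_steps(validate_body: &Acme.ValidateBody.run/1)

result = Req.request(req)
IO.inspect(result, label: "caller sees")
```

Because the step returns an exception struct instead of a response, the caller gets an error tuple from Req itself rather than from code that runs afterwards.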

For a step-by-step refactoring guide, check our previous question on implementing middleware.