2026-04-13 · Ryan Bolden · Part of: The 11 Things That Will Break Your AI in Production

The database write that poisoned every patient

It was 11 PM on a Tuesday. The monitoring system fired an alert: response times had spiked from 200 milliseconds to 30 seconds across every endpoint. Then another alert: database connection errors. Then another: patients hitting error pages. Within three minutes, the entire platform was down.

The root cause was a single INSERT statement. One database write. One missing column.

Here is what happened. A new feature required adding a record to a table. The developer — me — wrote the INSERT statement, tested it in development, and deployed it. In development, the INSERT worked fine. The column existed. The record was created. Everything passed.

In production, the column did not exist. The migration that was supposed to add the column had not been applied. The INSERT failed with a database error.

This should have been a simple error. One failed operation. One error log. One fix.

Instead, it took down the entire platform for every patient across every tenant. Because of session poisoning.

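Here is the mechanism. In a multi-tenant system using an ORM like SQLAlchemy, database sessions can end up shared across requests for performance. (SQLAlchemy's built-in pool holds connections, not Session objects, so sharing sessions this way is an application-level pattern layered on top of the ORM.) When a request starts, it borrows a session from the pool. When it finishes, it returns the session. A later request gets the same session.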

When our INSERT failed, the session entered an error state: its transaction was no longer valid, and it would reject further work until it was rolled back. But the error was caught by a try-except block that logged the error and continued, without rolling back the session. The session was returned to the pool in that broken state. The next request that borrowed that session (a patient checking their appointment) inherited the failure. That request also failed. And the next one. And the next one.
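A minimal sketch of that mechanism, assuming a toy model rather than SQLAlchemy itself: `ToySession`, `PoisonedSessionError`, and `handle_request` are hypothetical names invented for illustration, and the pool is a plain list. The only behavior it tries to capture is that a session with a failed write refuses all further work until someone calls `rollback()`.

```python
class PoisonedSessionError(RuntimeError):
    """Raised when a session is reused after a failed write with no rollback."""


class ToySession:
    """Toy stand-in for an ORM session; not SQLAlchemy's real implementation."""

    def __init__(self, name):
        self.name = name
        self.needs_rollback = False

    def execute(self, statement, fail=False):
        if self.needs_rollback:
            raise PoisonedSessionError(f"{self.name}: rollback required before reuse")
        if fail:
            # Simulates the INSERT hitting a column that does not exist.
            self.needs_rollback = True
            raise RuntimeError(f"{self.name}: column does not exist")
        return f"{self.name}: ok ({statement})"

    def rollback(self):
        self.needs_rollback = False


def handle_request(pool, statement, fail=False):
    """Borrow a session, run one statement, swallow errors, return the session."""
    session = pool.pop(0)
    try:
        return session.execute(statement, fail=fail)
    except Exception as exc:
        # The bug: log and continue, with no session.rollback().
        return f"error: {exc}"
    finally:
        pool.append(session)  # goes back to the pool in whatever state it is in


pool = [ToySession("s1"), ToySession("s2")]
results = [
    handle_request(pool, "INSERT new record", fail=True),  # s1 fails and is poisoned
    handle_request(pool, "SELECT appointment"),            # s2 is still healthy
    handle_request(pool, "SELECT appointment"),            # s1 again: inherited failure
]
```

In this sketch the third request does nothing wrong and still fails, because it was handed `s1`. Each time the failing INSERT runs on another session, that session is poisoned as well, which is how the whole pool goes bad.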

Within seconds, every session in the pool was poisoned. Every request, for every patient, across every tenant, was failing. Not because of a systemic infrastructure failure. Because of one bad INSERT and one missing rollback.

The fix was two lines of code: explicit session rollback on any error, regardless of whether the error was expected. But those two lines only work if you understand that session poisoning is possible, and many developers do not, because it rarely shows up in development. Development environments typically use a new session per request. They do not run with shared session pools under concurrent load. The failure mode is invisible until production.
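One way the rollback-on-any-error rule could look, again as a sketch against a toy session rather than our production code: `ToySession`, `borrowed_session`, and `handle_request` are hypothetical names, and the pool is a plain list. The point is that the rollback lives where the session is borrowed, so no caller can forget it.

```python
from contextlib import contextmanager


class ToySession:
    """Toy stand-in for an ORM session; not SQLAlchemy's real implementation."""

    def __init__(self, name):
        self.name = name
        self.needs_rollback = False

    def execute(self, statement, fail=False):
        if self.needs_rollback:
            raise RuntimeError(f"{self.name}: rollback required before reuse")
        if fail:
            self.needs_rollback = True
            raise RuntimeError(f"{self.name}: column does not exist")
        return f"{self.name}: ok ({statement})"

    def rollback(self):
        self.needs_rollback = False


@contextmanager
def borrowed_session(pool):
    """Lend out a session and make sure it is clean when it comes back."""
    session = pool.pop(0)
    try:
        yield session
    except Exception:
        session.rollback()  # the fix: roll back on any error, expected or not
        raise
    finally:
        pool.append(session)


def handle_request(pool, statement, fail=False):
    try:
        with borrowed_session(pool) as session:
            return session.execute(statement, fail=fail)
    except Exception as exc:
        # Logging stands in here; the session has already been rolled back.
        return f"error: {exc}"


pool = [ToySession("s1")]
first = handle_request(pool, "INSERT new record", fail=True)
second = handle_request(pool, "SELECT appointment")
```

With a single session in the pool, the failed INSERT still fails, but the request after it succeeds on the same session. The error stays one failed operation instead of spreading.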

We also added a second layer of defense: before any INSERT or UPDATE in production, we verify the actual production schema. Not the migration files. Not the development database. The actual production schema, queried in real time. Migrations are aspirational. Production is real. If the column does not exist in the production schema, the code does not attempt the write.
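A small sketch of the schema check, assuming SQLite purely so the example runs on the standard library. `PRAGMA table_info` is SQLite-specific; on another database the equivalent catalog query (for example `information_schema.columns` where it is supported) would take its place. The table `appointments`, the column `reminder_sent`, and the helpers `column_exists` and `guarded_insert` are hypothetical.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `column` exists on `table` in the live database."""
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    # Each row is (cid, name, type, notnull, dflt_value, pk); row[1] is the name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def guarded_insert(conn, table, values):
    """Insert `values` only if every target column exists right now."""
    missing = [col for col in values if not column_exists(conn, table, col)]
    if missing:
        return False, missing  # skip the write instead of failing mid-transaction
    cols = ", ".join(values)  # names were just confirmed against the live schema
    marks = ", ".join("?" for _ in values)
    conn.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})", list(values.values()))
    conn.commit()
    return True, []


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE appointments (id INTEGER PRIMARY KEY, patient_id INTEGER)")

row = {"patient_id": 7, "reminder_sent": 0}
skipped = guarded_insert(conn, "appointments", row)  # migration not applied yet

conn.execute("ALTER TABLE appointments ADD COLUMN reminder_sent INTEGER")
written = guarded_insert(conn, "appointments", row)  # column now exists
```

The first call returns the list of missing columns without touching the table. The second call performs the write once the schema actually contains the column.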

This incident also led us to implement a third defense: runtime monitoring of session pool health. If the error rate on any session exceeds a threshold, that session is killed and replaced rather than returned to the pool. The monitoring system now catches session poisoning within 30 seconds — before it can propagate to the full pool.
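The replace-instead-of-return idea can be sketched as follows. This is an illustration under stated assumptions, not our monitoring system: `TrackedSession` and `MonitoredPool` are hypothetical names, the 50% threshold and four-request minimum are made-up values, and the sketch counts requests rather than measuring a time window.

```python
class TrackedSession:
    """Hypothetical wrapper that counts outcomes for one pooled session."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.requests = 0
        self.errors = 0

    def record(self, ok):
        self.requests += 1
        if not ok:
            self.errors += 1

    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0


class MonitoredPool:
    """Pool that discards a session when its error rate crosses a threshold."""

    def __init__(self, size, max_error_rate=0.5, min_requests=4):
        self.max_error_rate = max_error_rate  # illustrative value
        self.min_requests = min_requests      # avoid judging on too few samples
        self._next_id = 0
        self.replaced = []
        self.sessions = [self._new_session() for _ in range(size)]

    def _new_session(self):
        self._next_id += 1
        return TrackedSession(self._next_id)

    def borrow(self):
        return self.sessions.pop(0)

    def give_back(self, session, ok):
        session.record(ok)
        unhealthy = (
            session.requests >= self.min_requests
            and session.error_rate() > self.max_error_rate
        )
        if unhealthy:
            # Kill and replace instead of returning a suspect session.
            self.replaced.append(session.session_id)
            session = self._new_session()
        self.sessions.append(session)


pool = MonitoredPool(size=1)
for ok in [True, False, False, False]:
    session = pool.borrow()
    pool.give_back(session, ok)
```

After three failures in four requests, session 1 is recorded in `pool.replaced` and a fresh session 2 takes its place in the pool.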

Three layers of defense, all born from one night where a single missing column took down an entire healthcare platform. If you are building a multi-tenant system — in healthcare or any other industry — and you have not explicitly designed your session management for failure isolation, you are carrying the same risk we carried that Tuesday night. The bomb is silent until it goes off. We know because ours did.

This is one piece of a larger framework we built and operate in production. The full picture — and how it applies to your business — is in the playbook.


Written by Ryan Bolden · Founder, Riscent · ryan@riscent.com