[PECOBLR-1655] Implemented the functionality of pool_pre_ping #53
[PECOBLR-1655] Implemented the functionality of pool_pre_ping #53msrathore-db wants to merge 7 commits intomainfrom
Conversation
… to see if the connection is alive. If not it'll recycle the session
|
@msrathore-db This maybe a better solution. |
…ing the default behaviour
Fix CI failures caused by incompatible newer poetry versions by pinning poetry to version 2.2.1 in all GitHub Actions workflows. This follows the same fix applied in databricks-sql-python PR #735. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Enhanced is_disconnect() to catch databricks.sql.exc.Error (base class) and detect "closed connection" errors. This fixes pool_pre_ping failing when attempting to ping a closed connection. The default do_ping() tries to create a cursor, which raises Error with message "Cannot create cursor from closed connection" on closed connections. Now properly detects this as a disconnect error. Fixes: - test_pool_pre_ping_with_closed_connection - test_is_disconnect_handles_runtime_errors Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
@msrathore-db Any ETA on this PR? We are seeing |
We should be able to get it merged and released this week. Thanks |
src/databricks/sqlalchemy/base.py
Outdated
| # This catches errors like "Cannot create cursor from closed connection" | ||
| if isinstance(e, Error): | ||
| error_msg = str(e).lower() | ||
| return "closed" in error_msg or "cannot create cursor" in error_msg |
There was a problem hiding this comment.
is having closed a definite case for closed connection? There can be error message like cannot be closed etc. Seems to be error prone.
There was a problem hiding this comment.
Modified to be precise after checking all error occurences in the python SQL connector. Thanks
| from databricks.sql.exc import InterfaceError, DatabaseError, Error | ||
|
|
||
| # InterfaceError: Client-side errors (e.g., connection already closed) | ||
| if isinstance(e, InterfaceError): |
There was a problem hiding this comment.
InterfaceError can also be raised for programming errors like invalid params. Will this be an expected behavior then?
There was a problem hiding this comment.
That's not the case with the current connector. All the interface errors are raised when connection is closed.
gopalldb
left a comment
There was a problem hiding this comment.
LG, minor comments on error handling
- InterfaceError: check for "closed" in message instead of blanket True - Add RequestError handling for transport/network-level disconnect errors - Add "unexpectedly closed server side" to DatabaseError disconnect checks - Keep base Error fallback with precise "closed connection"/"closed cursor" matching for older connector versions (v4.0.0-v4.0.5) that raise Error instead of InterfaceError - Expand tests to cover all connector error messages with exact code paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
447e1bf to
0145f62
Compare
| - name: Install Poetry | ||
| uses: snok/install-poetry@v1 | ||
| with: | ||
| version: "2.2.1" |
There was a problem hiding this comment.
The pyproject.toml uses 2.0.8, why is this using 2.2.1 ?
| # RequestError (subclass of DatabaseError via OperationalError): | ||
| # transport/network-level errors indicating connection is unusable. | ||
| # Check before DatabaseError since RequestError is a subclass. | ||
| if isinstance(e, RequestError): | ||
| return True | ||
|
|
||
| # DatabaseError: server-side errors indicating session/operation gone | ||
| if isinstance(e, DatabaseError): | ||
| return ("invalid" in error_msg and "handle" in error_msg) or ( | ||
| "unexpectedly closed server side" in error_msg | ||
| ) | ||
|
|
||
| # Base Error class: older connector versions raise Error (not InterfaceError) | ||
| if isinstance(e, Error): | ||
| return "closed connection" in error_msg or "closed cursor" in error_msg |
There was a problem hiding this comment.
- How are you even deciding these conditions?
- All these string in checks are they based on some logic or its just arbitrary ?
The only error that is possible in the case where the connection is itself broken is RequestError
As far as I can see DatabaseError is returned when you have thrift status code, or when server sends some response which is not the expectation when connection is closed
Ideally there should be only one error condition flow that must indicate Connection is closed, only solve for that, don't solve for arbitary errors and random string checks
There was a problem hiding this comment.
The string checks aren't arbitrary, they match exact error messages from specific code paths in the connector. But I agree the approach is fragile, especially for DatabaseError where the message comes from the server response.
The three cases we need to handle for pool_pre_ping are:
- Error/InterfaceError from client.py:cursor() when self.open is False — connection closed locally
- RequestError — transport/network failure (clean isinstance check, no strings)
- DatabaseError from thrift_backend._check_response_for_error() when server returns INVALID_HANDLE_STATUS — session expired on server
For (3), the connector raises the same DatabaseError for both ERROR_STATUS and INVALID_HANDLE_STATUS, so string matching is the only option today. The better fix would be a dedicated exception class for INVALID_HANDLE_STATUS in the connector — then we can replace all string checks with isinstance checks. I can raise that as a follow-up in databricks-sql-python.
Also the base error class check is needed for some versions of driver where interfaceError was not defined, more specifically
- v4.0.0–v4.0.5: raises Error (base class)
- v4.0.6+: raises InterfaceError
There was a problem hiding this comment.
I am saying why to even make this so complex, pool pre ping is just a health check. Simply try to execute a query if there is an error throw true. So sqlalchemy will use a new connection.
Then when they try again, if there is some real error let it bubble up
There was a problem hiding this comment.
It's needed for the is_disconnect calls for mid query failures where we shouldn't be closing the connection pool for every failure type. Let's park it for now. For current mitigation, I'll override do_ping in the PR and we can look into the complete resolution later with this PR.
|
Parking now until we find a cleaner way to implement is_disconnect without tight coupling with databricks-sql-connector. Fix for the issue is raised in the PR |
Summary
Implements
is_disconnect()method to enable connection health checking and prevent "Invalid SessionHandle" errors when connections expire or are closed by the server.Closes #47
Problem
Users experienced
Invalid SessionHandleerrors when connections expired or were closed by the server. SQLAlchemy'spool_pre_ping=Truefeature wasn't working because the dialect didn'timplement
is_disconnect()to classify disconnect errors.When connections died (either before checkout or during query execution), SQLAlchemy couldn't determine if errors were due to disconnects, leaving dead connections in the pool and causing
repeated errors.
Solution
Implements
is_disconnect()method that classifies exceptions as disconnect errors, enabling both pessimistic and optimistic disconnect handling as described in SQLAlchemy's pooling documentation.