FEAT: Pass all connection string params to mssql-py-core for bulk copy#439
FEAT: Pass all connection string params to mssql-py-core for bulk copy#439bewithgaurav wants to merge 21 commits intomainfrom
Conversation
- Add extract_auth_type() fallback for Windows Interactive where process_connection_string returns auth_type=None (DDBC handles auth natively but bulkcopy needs it for Azure Identity) - Preserve auth_type in fallthrough return so bulkcopy can attempt fresh token acquisition even when connect-time token fails - Fix get_raw_token docstring to match actual credential behavior - Use 'is None' instead of '== None' in test assertion
Raises ValueError with supported types instead of KeyError when an unsupported auth_type is passed to _acquire_token.
…llthrough return, clean up extract_auth_type parsing
… bewithgaurav/bulkcopy-accesstoken-support
Drop app, workstationid, language, connect_timeout, mars_connection from key_map — the parser rejects these keywords so they never reach connstr_to_pycore_params. Also clean up bool_keys/int_keys to match.
048ec5d to
4953f9c
Compare
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/helpers.pyLines 271-279 271 # mars_connection) are silently dropped here. In the normal connect()
272 # path the parser validates keywords first (validate_keywords=True),
273 # but bulkcopy parses with validation off, so this mapping is the
274 # authoritative filter in that path.
! 275 key_map = {
276 # auth / credentials
277 "uid": "user_name",
278 "pwd": "password",
279 "trusted_connection": "trusted_connection",Lines 303-311 303 "packet size": "packet_size",
304 "connectretrycount": "connect_retry_count",
305 "connectretryinterval": "connect_retry_interval",
306 }
! 307 int_keys = {
308 "packet_size",
309 "connect_retry_count",
310 "connect_retry_interval",
311 "keep_alive",Lines 311-329 311 "keep_alive",
312 "keep_alive_interval",
313 }
314
! 315 pycore_params: dict = {}
316
! 317 for connstr_key, pycore_key in key_map.items():
! 318 raw_value = params.get(connstr_key)
! 319 if raw_value is None:
! 320 continue
321
322 # First-wins: match ODBC behaviour — first synonym in the
323 # connection string takes precedence (e.g. Addr before Server).
! 324 if pycore_key in pycore_params:
! 325 continue
326
327 # ODBC values are always strings; py-core expects native types for int keys.
328 # Boolean params (trust_server_certificate, multi_subnet_failover) are passed
329 # as strings — all Yes/No validation is in connection.rs for single-locationLines 327-345 327 # ODBC values are always strings; py-core expects native types for int keys.
328 # Boolean params (trust_server_certificate, multi_subnet_failover) are passed
329 # as strings — all Yes/No validation is in connection.rs for single-location
330 # consistency with Encrypt, ApplicationIntent, IPAddressPreference, etc.
! 331 if pycore_key in int_keys:
332 # Numeric params (timeouts, packet size, etc.) — skip on bad input
! 333 try:
! 334 pycore_params[pycore_key] = int(raw_value)
! 335 except (ValueError, TypeError):
! 336 pass # let py-core fall back to its compiled-in default
337 else:
338 # String params (server, database, encryption, etc.) — pass through
! 339 pycore_params[pycore_key] = raw_value
340
! 341 return pycore_params
342
343
344 # Settings functionality moved here to avoid circular imports📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.3%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.helpers.py: 77.4%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.cursor.py: 84.7%🔗 Quick Links
|
| # mars_connection) are intentionally excluded — the parser rejects them. | ||
| key_map = { | ||
| # auth / credentials | ||
| "uid": "user_name", |
There was a problem hiding this comment.
this map is being used to convert all the connection string params into pycore context
we can go ahead with using the conn string params as is inside pycore but that will require changing a lot of tests, hence added a map
| # bulkcopy() is called long after the original connection. | ||
| pycore_context = dict(context) | ||
| # Translate parsed connection string into the dict py-core expects. | ||
| pycore_context = connstr_to_pycore_params(params) |
There was a problem hiding this comment.
have kept cursor.py to be just passing the params and not at all containing any "validation" logic since we'll take care of that in pycore
| --arg patch_coverage_pct "${{ env.PATCH_COVERAGE_PCT }}" \ | ||
| --arg low_coverage_files "${{ env.LOW_COVERAGE_FILES }}" \ | ||
| --arg patch_coverage_summary "${{ env.PATCH_COVERAGE_SUMMARY }}" \ | ||
| --arg low_coverage_files "$LOW_COVERAGE_FILES" \ |
There was a problem hiding this comment.
minor code coverage fix, this PR had some codeblocks which needed to be escaped, since they were getting into the bash terminal
|
code coverage isn't reflecting correct numbers since the tests are a part of mssql-py-core as of now |
Use jq --rawfile instead of ${{ env.* }} expansion for
PATCH_COVERAGE_SUMMARY and LOW_COVERAGE_FILES to prevent
raw Python source in diff hunks from being interpreted as
shell commands.
4953f9c to
0b1941c
Compare
…ate and multi_subnet_failover as strings
There was a problem hiding this comment.
Pull request overview
Refactors the bulk copy (“py-core”) connection context construction so that parsed ODBC connection-string parameters are centrally translated into the parameter format expected by mssql_py_core, reducing inline mapping logic in cursor.bulkcopy().
Changes:
- Added
connstr_to_pycore_params()helper to map/convert parsed connection-string params into py-core snake_case keys (with basic int coercion). - Updated
Cursor._bulkcopy()to buildpycore_contextvia the new helper and simplified sensitive-key cleanup. - Adjusted PR code coverage workflow to pass multiline env values into
jqvia shell variables.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
mssql_python/helpers.py |
Introduces connstr_to_pycore_params() for bulkcopy py-core parameter translation. |
mssql_python/cursor.py |
Uses the new helper to build py-core bulkcopy connection context and streamlines sensitive key cleanup. |
.github/workflows/pr-code-coverage.yml |
Tweaks jq --arg inputs to use shell vars for better handling of multiline env values. |
Comments suppressed due to low confidence (1)
mssql_python/cursor.py:2651
- In the Azure AD token flow, pycore_context is built from connstr_to_pycore_params(params) before adding access_token, which means user_name/password from the original connection string may also be included (and retained until finally). Previously, SQL credentials were only added in the non-AAD path. Consider explicitly removing user_name/password when _auth_type is set (or when setting access_token) to minimize sensitive-data exposure and avoid ambiguous/dual auth inputs being passed to mssql-py-core.
# Translate parsed connection string into the dict py-core expects.
pycore_context = connstr_to_pycore_params(params)
# Token acquisition — only thing cursor must handle (needs azure-identity SDK)
if self.connection._auth_type:
# Fresh token acquisition for mssql-py-core connection
from mssql_python.auth import AADAuth
try:
raw_token = AADAuth.get_raw_token(self.connection._auth_type)
except (RuntimeError, ValueError) as e:
raise RuntimeError(
f"Bulk copy failed: unable to acquire Azure AD token "
f"for auth_type '{self.connection._auth_type}': {e}"
) from e
pycore_context["access_token"] = raw_token
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
_normalize_params and connstr_to_pycore_params now skip a canonical key that was already set by an earlier synonym. Matches ODBC Driver 18 fFromAttrOrProp first-wins semantics confirmed via live test against sqlcconn.cpp. Add TestSynonymFirstWins (12 tests) covering server/addr/ address, trustservercertificate snake_case, and packetsize with-space synonym groups.
db08e21 to
dffadd6
Compare
… bewithgaurav/bulkcopy-connstr-to-pycore
sumitmsft
left a comment
There was a problem hiding this comment.
I am approving this with one question. Although I know the answer but just want to confirm:
Previously bulkcopy effectively defaulted TrustServerCertificate to “yes” when not specified (it used params.get("trustservercertificate", "yes") ... and passed a boolean to py-core).
Now we only forward trust_server_certificate if the keyword is present, which means bulkcopy depends on mssql-py-core’s default when it’s absent.
Since ODBC 18.5 is “secure by default” (Encrypt on; TrustServerCertificate off), can you confirm:
- What does mssql-py-core do when trust_server_certificate is not provided?
- Does this match ODBC 18.5’s effective defaults and our non-bulkcopy connection behavior?
Work Item / Issue Reference
Summary
This pull request refactors and simplifies how ODBC connection string parameters are translated for use with the bulk copy feature in
mssql_python. The key change is the introduction of a new helper function,connstr_to_pycore_params, which centralizes and standardizes the parameter mapping logic. This reduces code duplication, improves maintainability, and ensures sensitive data is handled more securely.Bulk copy connection context refactor:
connstr_to_pycore_paramsinmssql_python/helpers.pyto translate ODBC connection string parameters into the format expected by the py-core bulk copy path, including type conversion and key normalization.mssql_python/cursor.pyto useconnstr_to_pycore_paramsfor building thepycore_contextdictionary, replacing the previous inline mapping and logic. [1] [2]Security and code quality improvements:
popcalls.Documentation and maintainability:
cursor.pyindicating the mapping from ODBC connection-string keywords.