feat(python): Improve Jupyter notebook support with SQL magic commands and examples #1430
Draft
littleKitchen wants to merge 1 commit into apache:main
…s and examples

This PR implements all items from the checklist in issue apache#1398:

## Implementation Checklist

- [x] Add example .ipynb notebooks to python/examples/
  - getting_started.ipynb - Basic connection and queries
  - dataframe_api.ipynb - DataFrame transformations
  - distributed_queries.ipynb - Multi-stage query examples
- [x] Document notebook support in Python README
  - Added comprehensive Jupyter section with examples
- [x] Create ballista.jupyter module with magic commands
  - Full implementation with BallistaMagics class
- [x] Add %ballista connect/status/tables/schema line magics
  - connect: Connect to Ballista cluster
  - status: Show connection status
  - tables: List registered tables
  - schema: Show table schema
  - disconnect: Disconnect from cluster
  - history: Show query history
- [x] Add %%sql cell magic
  - Line magic for single-line queries
  - Cell magic for multi-line queries
  - Variable assignment support
  - --no-display and --limit options
- [x] Add explain_visual() method for query plan rendering
  - Generates DOT/SVG visualization
  - Supports Jupyter _repr_html_
  - Fallback when graphviz not installed
- [x] Add progress indicator support for long-running queries
  - collect_with_progress() method
  - Callback support for custom progress handling
  - Jupyter-aware display
- [x] Consider JupySQL integration
  - Documented as alternative in README

## Additional Features

- ExecutionPlanVisualization class for plan rendering
- tables() method on BallistaSessionContext
- Optional jupyter dependency in pyproject.toml
- Comprehensive test coverage (45 tests passing)

Closes apache#1398
Contributor
there are also some ruff errors which need to be fixed
Contributor
hey @littleKitchen i'm converting this PR to draft as it has been reviewed and further action from the author is needed. please have a look when you get a chance

Summary
This PR implements the improvements outlined in #1398 to enhance the Jupyter notebook experience for Ballista.
Implementation Checklist
All items from the issue have been implemented:
- ✅ Add example .ipynb notebooks to python/examples/
  - `getting_started.ipynb` - Basic connection and queries
  - `dataframe_api.ipynb` - DataFrame transformations
  - `distributed_queries.ipynb` - Multi-stage query examples
- ✅ Document notebook support in Python README
- ✅ Create ballista.jupyter module with magic commands
  - `BallistaMagics` class
- ✅ Add %ballista connect/status/tables/schema line magics
  - `connect`: Connect to Ballista cluster
  - `status`: Show connection status
  - `tables`: List registered tables
  - `schema`: Show table schema
  - `disconnect`: Disconnect from cluster
  - `history`: Show query history
- ✅ Add %%sql cell magic
  - Line magic for single-line queries (`%sql SELECT ...`)
  - Cell magic for multi-line queries (`%%sql`)
  - Variable assignment support (`%%sql my_result`)
  - `--no-display` and `--limit N` options
- ✅ Add explain_visual() method for query plan rendering
  - `_repr_html_` for inline display
  - `.save()` method for exporting to files
- ✅ Add progress indicator support for long-running queries
  - `collect_with_progress()` method on DataFrame
- ✅ Consider JupySQL integration
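For readers curious how the `%%sql` argument line (an optional result variable plus `--no-display` and `--limit N`) could be interpreted, here is a minimal sketch using only the standard library. It is not the PR's code: `SqlMagicArgs` and `parse_sql_magic_line` are hypothetical names chosen for illustration, and the actual `BallistaMagics` implementation may parse its arguments differently.

```python
import argparse
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class SqlMagicArgs:
    """Parsed options from the text that follows `%%sql` (hypothetical type)."""

    variable: Optional[str]  # name to bind the result to, if one was given
    display: bool            # False when --no-display was passed
    limit: Optional[int]     # row cap from --limit N, if one was given


def parse_sql_magic_line(line: str) -> SqlMagicArgs:
    """Parse e.g. 'my_result --limit 10' into structured options.

    Hypothetical helper: it only illustrates the option set listed in the
    checklist and is not taken from the ballista.jupyter module.
    """
    parser = argparse.ArgumentParser(prog="%%sql", add_help=False)
    # Optional positional: the variable that should receive the query result.
    parser.add_argument("variable", nargs="?", default=None)
    parser.add_argument("--no-display", action="store_true")
    parser.add_argument("--limit", type=int, default=None)
    # shlex.split handles quoting the same way a shell would.
    ns = parser.parse_args(shlex.split(line))
    return SqlMagicArgs(
        variable=ns.variable,
        display=not ns.no_display,
        limit=ns.limit,
    )
```

Calling `parse_sql_magic_line("my_result --limit 10")` yields a variable name of `my_result`, display left on, and a limit of 10. Note that `argparse` exits the process on malformed input by default, so a real magic would likely want to catch `SystemExit` and report the error in the notebook instead.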
Additional Improvements
- `ExecutionPlanVisualization` class for plan rendering with DOT/SVG conversion
- `tables()` method on `BallistaSessionContext` for listing registered tables
- `jupyter` dependency group in `pyproject.toml`

Usage Examples
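To make the DOT/SVG conversion and the "fallback when graphviz is not installed" behaviour more concrete, the sketch below shows one way such rendering could work. It is an assumption-laden illustration, not the PR's `ExecutionPlanVisualization` class: the tuple-based plan representation, the function names `plan_to_dot`, `render_svg`, and `plan_to_html`, and the operator labels in the example are all hypothetical.

```python
import html
import shutil
import subprocess
from typing import Optional

# Hypothetical plan representation: (label, [child nodes]).
# A real plan would come from the query engine, not from hand-built tuples.


def plan_to_dot(root) -> str:
    """Convert a (label, children) tree into Graphviz DOT source."""
    lines = ["digraph plan {", "  node [shape=box];"]
    next_id = 0

    def visit(node, parent_id):
        nonlocal next_id
        node_id = next_id
        next_id += 1
        label, children = node
        escaped = label.replace('"', '\\"')  # keep quotes from breaking DOT
        lines.append(f'  n{node_id} [label="{escaped}"];')
        if parent_id is not None:
            lines.append(f"  n{parent_id} -> n{node_id};")
        for child in children:
            visit(child, node_id)

    visit(root, None)
    lines.append("}")
    return "\n".join(lines)


def render_svg(dot_source: str) -> Optional[str]:
    """Return SVG from the `dot` executable, or None if it is unavailable."""
    dot_bin = shutil.which("dot")
    if dot_bin is None:
        return None
    try:
        result = subprocess.run(
            [dot_bin, "-Tsvg"],
            input=dot_source,
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return result.stdout


def plan_to_html(root) -> str:
    """SVG when graphviz is present, otherwise the DOT source in a <pre> block."""
    dot_source = plan_to_dot(root)
    svg = render_svg(dot_source)
    if svg is not None:
        return svg
    return f"<pre>{html.escape(dot_source)}</pre>"


# Illustrative three-operator plan; the labels are placeholders.
example_plan = ("ProjectionExec", [("FilterExec", [("ParquetExec", [])])])
example_html = plan_to_html(example_plan)
```

An object that returns a string like `example_html` from its `_repr_html_` method would display inline in Jupyter, which is the mechanism the checklist refers to. Degrading to escaped DOT text keeps the output useful on machines without graphviz.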
Testing
All 45 tests pass:
Closes #1398