Upgrading Flask, Python, several requirements, resolved breaking changes #411

Open
Evan-Leon wants to merge 3 commits into main from feature/requirement-upgrades

Evan-Leon (Collaborator) commented Feb 27, 2026

Runtime & Dependency Modernization

Summary

This PR upgrades the runtime and dependency stack, migrates from csvkit to agate, and restores full test coverage under Python 3.14. The library now runs on a supported and stable technical foundation.


Python Upgrade

  • Upgraded Python 3.10 → 3.14
  • Resolved compatibility issues introduced by newer Python versions
  • Updated affected APIs and version-sensitive logic

Dependency Modernization

  • Major upgrades:

    • Flask
    • Flask-Babel
    • Flask-Assets
    • NLTK
    • NumPy / SciPy
  • Removed deprecated and unused libraries

  • Replaced legacy encoding detection with charset-normalizer

  • Added openpyxl for deterministic Excel handling

  • Eliminated inactive OAuth / Google-related components


agate Migration (Data Layer Refactor)

Replaced csvkit.table.Table with agate.Table.

Key refactors:

  • Updated column and row access patterns
  • Reworked type handling to align with agate’s immutable column model
  • Adjusted statistical computations for agate’s typing system
  • Removed assumptions about list-like column structures

This restores functionality after removal of the outdated csvkit dependency.
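One practical consequence of the typing change can be sketched with the standard library alone: agate's Number columns hold decimal.Decimal values, with None for empty cells, so statistics code that previously assumed floats needs to skip nulls and convert explicitly. The helper name column_mean and the sample values are hypothetical and are not taken from this PR.

```python
from decimal import Decimal
from statistics import mean


def column_mean(values):
    """Average a column of Decimal values, skipping None (empty cells)."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    # statistics.mean keeps the Decimal type when every input is a Decimal.
    return mean(present)


# Hypothetical column contents, shaped like what a Number column yields.
counts = (Decimal("3"), None, Decimal("5"))
average = column_mean(counts)

# Decimal and float cannot be mixed in arithmetic (TypeError),
# so convert once at the boundary where a float is required.
as_float = float(average)
```

Converting at a single boundary, rather than throughout the computation, keeps the Decimal precision for as long as possible.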


File & Encoding Handling

  • Standardized binary vs text read modes

  • Corrected UTF-8 handling inconsistencies

  • Improved fallback behavior for unknown encodings

  • Added fixture coverage for:

    • ASCII
    • UTF-16
    • Windows-1252
    • MacRoman
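The binary-read-then-decode pattern behind these bullets might look roughly like the sketch below. It is a standard-library stand-in, not the code in this PR: the PR relies on charset-normalizer for detection, whereas this example only checks for a byte-order mark and then tries a fixed list of encodings. The function name decode_upload and the candidate order are assumptions made for illustration.

```python
import codecs


def decode_upload(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding_label) for bytes that were read in binary mode."""
    # A byte-order mark is the most reliable signal, so check it first.
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig"), "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16"), "utf-16"

    # Strict UTF-8 next (ASCII is a subset), then a common legacy encoding.
    for encoding in ("utf-8", "cp1252"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue

    # Fallback for unknown encodings: never raise, mark undecodable bytes.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"
```

Trial decoding like this cannot reliably tell single-byte encodings such as Windows-1252 and MacRoman apart, which is one reason to prefer a statistical detector for real uploads.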

Graph Processing Fixes

Updated logic to align with new data structures:

  • Corrected agate row access
  • Rebuilt edge list extraction
  • Fixed NetworkX attribute assignment
  • Resolved Decimal JSON serialization issues
  • Ensured GEXF export correctness
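For the Decimal serialization item, one common approach is a custom JSON encoder, sketched here under assumptions: the class name DecimalEncoder and the sample edge dictionary are hypothetical, and the PR may solve the problem differently (for example by converting values before building the graph).

```python
import json
from decimal import Decimal


class DecimalEncoder(json.JSONEncoder):
    """JSON encoder that emits Decimal values as plain JSON numbers."""

    def default(self, o):
        if isinstance(o, Decimal):
            # Whole finite numbers become ints; everything else becomes a float.
            if o.is_finite() and o == o.to_integral_value():
                return int(o)
            return float(o)
        return super().default(o)


# Hypothetical edge record with Decimal attributes.
edge = {
    "source": "a",
    "target": "b",
    "weight": Decimal("2"),
    "score": Decimal("0.5"),
}
payload = json.dumps(edge, cls=DecimalEncoder, sort_keys=True)
```

Without the cls argument, json.dumps raises TypeError on the Decimal values, which matches the kind of failure this bullet describes.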

Test Suite

  • Repaired failures caused by API and dependency upgrades
  • Updated assertions for agate and Python 3.14 behavior
  • Removed obsolete expectations
  • Added missing encoding fixtures

All tests pass under Python 3.14.


Outcome

  • Supported Python runtime
  • Modernized dependency stack
  • Completed agate migration
  • Passing test suite
  • Stable core library ready for follow-up feature remediation

Evan-Leon added the bug, enhancement, and dependencies labels Feb 27, 2026
rahulbot (Collaborator) commented Mar 5, 2026

I pulled the branch, created a new conda env, and installed the requirements. I then ran start.sh and saw this error, which looped over and over:

[2026-03-05 13:50:36 -0600] [60700] [INFO] Starting gunicorn 25.1.0
[2026-03-05 13:50:36 -0600] [60700] [INFO] Listening at: http://127.0.0.1:8000 (60700)
[2026-03-05 13:50:36 -0600] [60700] [INFO] Using worker: sync
[2026-03-05 13:50:36 -0600] [60700] [INFO] Control socket listening at /Users/r.bhargava/Documents/northeastern/projects/data-basic-website/gunicorn.ctl
[2026-03-05 13:50:36 -0600] [60701] [INFO] Booting worker with pid: 60701
[2026-03-05 13:50:36 -0600] [60701] [WARNING] Reloader is on. Use in development only!
objc[60701]: +[NSString initialize] may have been in progress in another thread when fork() was called.
objc[60701]: +[NSString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
[2026-03-05 13:50:36 -0600] [60700] [ERROR] Worker (pid:60701) was sent SIGKILL! Perhaps out of memory?
[2026-03-05 13:50:36 -0600] [60702] [INFO] Booting worker with pid: 60702
[2026-03-05 13:50:36 -0600] [60702] [WARNING] Reloader is on. Use in development only!
objc[60702]: +[NSString initialize] may have been in progress in another thread when fork() was called.
objc[60702]: +[NSString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
[2026-03-05 13:50:36 -0600] [60700] [ERROR] Worker (pid:60702) was sent SIGKILL! Perhaps out of memory?
[2026-03-05 13:50:36 -0600] [60703] [INFO] Booting worker with pid: 60703
[2026-03-05 13:50:36 -0600] [60703] [WARNING] Reloader is on. Use in development only!
objc[60703]: +[NSString initialize] may have been in progress in another thread when fork() was called.

An AI assistant suggested a fix in start.sh that sets an env var:

# Fix for macOS fork() issue with gunicorn
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

That made it run.
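If the workaround should apply only on macOS, the environment can be built conditionally before the server process is launched. The sketch below is a hypothetical Python launcher helper, not part of start.sh or this PR; the function name gunicorn_env is an assumption. The variable generally needs to be present in the environment before the process starts, so it is added to the environment passed to the child process.

```python
import os
import sys


def gunicorn_env(base_env, platform=sys.platform):
    """Return a copy of base_env with the macOS fork-safety workaround applied."""
    env = dict(base_env)
    if platform == "darwin":
        # Only affects macOS; an explicit setting from the caller is kept as-is.
        env.setdefault("OBJC_DISABLE_INITIALIZE_FORK_SAFETY", "YES")
    return env


# Hypothetical usage: pass the result as env= when spawning the server, e.g.
#   subprocess.run([...server command...], env=gunicorn_env(os.environ))
launch_env = gunicorn_env(os.environ)
```

On other platforms the environment is returned unchanged, so the same launcher can be used everywhere.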

rahulbot (Collaborator) left a comment

With that change, and after creating a new mongo database (for version upgrade) on my local MacOS dev machine I am able to see all demos functioning and verify WTFCSV xlsx upload and WordCounter text entry work. 👏🏽

jsaikali (Collaborator) commented Mar 6, 2026

@rahulbot can you confirm that all the languages still work fine for you? We weren't able to get more than EN and CY to show up as options, even before we started developing. We tried running the translation scripts, but we must have done something wrong!

Just want to make sure language functionality didn’t break!

jsaikali (Collaborator) commented Mar 6, 2026

> With that change, and after creating a new mongo database (for version upgrade) on my local MacOS dev machine I am able to see all demos functioning and verify WTFCSV xlsx upload and WordCounter text entry work. 👏🏽

Awesome!! Thanks for verifying!
We didn't run into the same loop error, so it may be environment or OS dependent. We can add the export statement to the start script, but that may not be best practice.
We could instead add a note to the README telling people to add the variable to their .env file if they hit this issue. Alternatively, did you try dropping the --reload flag to see whether the error still occurs?
Let us know what you prefer! We can just add it to start.sh if the alternatives are not ideal.

rahulbot (Collaborator) commented Mar 7, 2026

I'd say add a note to readme, since it seems likely to be specific to my OS/IDE setup.
I do see all the language options and clicking them switches the interface as expected.
[Screenshot: Cursor_and_DataBasic_io]

jsaikali requested a review from rahulbot March 9, 2026 14:54

rahulbot (Collaborator) left a comment

I think this is ready to merge now.

