sync with open source how #118

Draft
lesterhaynes wants to merge 8372 commits into linkedin:li_trunk from apache:master

@lesterhaynes

Please add a meaningful description for your change here


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

pabloem and others added 21 commits January 13, 2026 15:57
* Update users.yml - fix bad role label

@ksobrenat32 - is this correct? looks like only `beam_writer` failed.

fyi @yalah5084 the previous change did not add the privileges so I'm trying to fix it here.

* Update users.yml
…s. (#37294)

Co-authored-by: Claude <cvandermerwe@google.com>
* Update neo4j resource manager

* Fix a test log spam in it/common

* Add back javadoc
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.39.0 to 0.40.0.
- [Commits](golang/sys@v0.39.0...v0.40.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-version: 0.40.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.48.0 to 0.49.0.
- [Commits](golang/net@v0.48.0...v0.49.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.49.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…type_hints() (#37299)

* [Python] Fix AttributeError in ExternalTransform.expand by using get_type_hints() (Fixes #37289)

* Fix lint: wrap long comment at line 808

* Fix yapf formatting and ignore local venv
* Remove standalone uses of the apitools HttpError class

* Update sdks/python/apache_beam/testing/pipeline_verifiers_test.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* linting and formatting

* more linting and formatting

* even more linting

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Add LRU cache eviction to CachingStateProvider

Fixes #37213

Implements LRU (Least Recently Used) cache eviction to prevent
unbounded memory growth in long-running workers. Adds configurable
maxCacheSize parameter (default: 1000 entries) and maintains LRU
order using JavaScript Map's insertion order.

- Add maxCacheSize constructor parameter with default value of 1000
- Implement evictIfNeeded() to remove oldest entry when cache is full
- Implement touchCacheEntry() to move accessed items to end (LRU)
- Add comprehensive test coverage in state_provider_test.ts

This addresses the TODO comment in the code and improves reliability
for production workloads.
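
The count-based eviction described above can be sketched as follows. This is a Python analogue for illustration, not the TypeScript code from the commit: `OrderedDict` stands in for the insertion-ordered JavaScript `Map`, and the names `LruCache`, `get`, and `put` are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Hashable, List, Optional


class LruCache:
    """Hypothetical count-bounded LRU cache (illustration only)."""

    def __init__(self, max_cache_size: int = 1000) -> None:
        self._max_cache_size = max_cache_size
        # Oldest entry sits at the front, most recently used at the end.
        self._entries: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._entries:
            return None
        # "Touch" the entry: moving it to the end marks it most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._entries:
            self._entries.move_to_end(key)
        elif len(self._entries) >= self._max_cache_size:
            # Cache is full: drop the least recently used entry (the front).
            self._entries.popitem(last=False)
        self._entries[key] = value

    def keys(self) -> List[Hashable]:
        """Keys ordered from least to most recently used."""
        return list(self._entries)
```

Reading a key refreshes its position, so with `max_cache_size=2`, putting `a` and `b`, reading `a`, and then putting `c` evicts `b` rather than `a`.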

* Address review comments: size-based LRU eviction for CachingStateProvider

- Fixed bug: removed incorrect evictIfNeeded() call in promise callback
- Removed unnecessary this_ variable (arrow functions capture this)
- Changed from count-based to size-based eviction (similar to Python statecache.py)
- Added estimateSize() to calculate memory weight of cached values
- Default cache weight: 100MB
- Updated tests to work with weight-based eviction

* Fix prettier formatting

* Address review comments: circular references, eviction ordering, tests

- Fixed sizeof function to handle circular references using visited Set
- Fixed eviction ordering: add to cache first, then evict (fixes edge case)
- Added test for oversized item that exceeds maxCacheWeight
- Implemented custom sizeof instead of object-sizeof package (has Node.js compatibility issues)
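
A minimal Python sketch of the weight-based variant, under stated assumptions: `estimate_size` and `WeightedLruCache` are hypothetical names, `sys.getsizeof` is a Python-specific stand-in for the custom sizeof in the commit, and "100MB" is assumed to mean 100 * 1024 * 1024 bytes. It shows the visited-set guard against circular references and the add-first, evict-second ordering.

```python
import sys
from collections import OrderedDict
from typing import Any, Hashable, List, Optional, Set


def estimate_size(value: Any, _visited: Optional[Set[int]] = None) -> int:
    """Rough recursive size estimate in bytes (hypothetical helper)."""
    visited = set() if _visited is None else _visited
    if id(value) in visited:
        return 0  # Already counted; this also breaks reference cycles.
    visited.add(id(value))
    size = sys.getsizeof(value)
    if isinstance(value, dict):
        for k, v in value.items():
            size += estimate_size(k, visited) + estimate_size(v, visited)
    elif isinstance(value, (list, tuple, set, frozenset)):
        for item in value:
            size += estimate_size(item, visited)
    return size


class WeightedLruCache:
    """Hypothetical LRU cache bounded by total estimated weight."""

    def __init__(self, max_cache_weight: int = 100 * 1024 * 1024) -> None:
        self._max_cache_weight = max_cache_weight
        self._entries: "OrderedDict[Hashable, tuple]" = OrderedDict()
        self.total_weight = 0

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # Mark as most recently used.
        return self._entries[key][0]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._entries:
            self.total_weight -= self._entries.pop(key)[1]
        weight = estimate_size(value)
        # Add first, then evict, so the new entry is part of the accounting.
        self._entries[key] = (value, weight)
        self.total_weight += weight
        self._evict_if_needed()

    def _evict_if_needed(self) -> None:
        while self.total_weight > self._max_cache_weight and self._entries:
            _, (_, weight) = self._entries.popitem(last=False)
            self.total_weight -= weight

    def keys(self) -> List[Hashable]:
        return list(self._entries)
```

In this sketch an item heavier than `max_cache_weight` evicts every older entry and then itself, leaving the cache empty. That is a choice of the sketch; the commit message does not say how the actual implementation treats the oversized case.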

* Address Gemini comments: fix race condition, optimize evictIfNeeded

- Fixed critical race condition in promise callback: only update cache if
  the entry is still the same promise we're resolving
- Optimized evictIfNeeded: use entries() iterator and removed redundant checks
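
The race-condition guard can be illustrated with a small Python `asyncio` sketch. It is an analogue of the idea, not the TypeScript code: an `asyncio` task plays the role of the promise, and `InFlightCache`, `get`, `evict`, and `contains` are hypothetical names. The resolved value replaces the cached entry only if the cache still holds the same in-flight task.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Hashable


class InFlightCache:
    """Hypothetical cache that stores either a value or an in-flight task."""

    def __init__(self) -> None:
        self._entries: Dict[Hashable, Any] = {}

    async def get(self, key: Hashable,
                  fetch: Callable[[], Awaitable[Any]]) -> Any:
        if key in self._entries:
            entry = self._entries[key]
            if isinstance(entry, asyncio.Future):
                return await entry  # Join the lookup already in flight.
            return entry
        task = asyncio.ensure_future(fetch())
        self._entries[key] = task
        value = await task
        # Guard: the entry may have been evicted or replaced while we waited.
        # Only store the value if the cache still holds this exact task.
        if self._entries.get(key) is task:
            self._entries[key] = value
        return value

    def evict(self, key: Hashable) -> None:
        self._entries.pop(key, None)

    def contains(self, key: Hashable) -> bool:
        return key in self._entries
```

Without the identity check, a lookup that resolves after its entry was evicted would silently re-insert the value and bypass the eviction accounting.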
…37325)

Bumps [keras](https://github.com/keras-team/keras) from 3.12.0 to 3.13.1.
- [Release notes](https://github.com/keras-team/keras/releases)
- [Commits](keras-team/keras@v3.12.0...v3.13.1)

---
updated-dependencies:
- dependency-name: keras
  dependency-version: 3.13.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
kennknowles and others added 30 commits February 24, 2026 13:29
* Remove relative change threshold condition for identifying valid change points.

* Remove unused epsilon variable from `_find_valid_change_points`.
* Fix Playground Frontend Test

* Comment out integration test

* Enable integration tests

* Comment out standalone_miscellaneous_ui tests

* Add link to the issue
* add first draft of yaml column

* remove pubsublite
…37706)

* Add an instruction for using docker buildx

* Also update instructions that containers must be pushed.

* formatting

* formatting
* Fix DebeziumIO resuming from worker restart

* Move startTime recording into setup to fix NPE in restarted worker

* Fix DebeziumIO poll loop not exiting when record list isn't empty

* Make pollTimeout configurable

* Include first poll in stopWatch

* Fix pipeline terminate when poll returns null, which is valid per spec

* Adjust MaxNumRecord logic after previous fix. Previously, if there were
  N=MaxNumRecord records in total, the pipeline would not finish until record
  N+1 appeared. The test relied on null poll to return actually
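
The poll-loop behavior these commits describe can be sketched in Python as below. This is an illustrative model only, not the DebeziumIO Java code: `drain_records`, its parameters, and the injectable `clock` are all hypothetical. It shows a null poll treated as "nothing available yet", the timer started before the first poll, and the loop ending as soon as the record limit is reached.

```python
import time
from typing import Any, Callable, List, Optional


def drain_records(
    poll: Callable[[], Optional[List[Any]]],
    max_num_records: int,
    poll_timeout_secs: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[Any]:
    """Collect up to max_num_records, or stop once the timeout elapses."""
    records: List[Any] = []
    started = clock()  # Started before the first poll, so that poll is timed.
    while len(records) < max_num_records:
        if clock() - started >= poll_timeout_secs:
            break
        batch = poll()
        if batch is None:
            # A null poll means no records right now, not end of stream.
            continue
        remaining = max_num_records - len(records)
        records.extend(batch[:remaining])
    # The loop exits when exactly max_num_records have been collected,
    # without waiting for one more record to arrive.
    return records
```

For example, with polls returning `None`, `[1, 2]`, `None`, `[3, 4]` and a limit of 3, the function returns `[1, 2, 3]` after the fourth poll.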
…7554)

* Enable pickling main by reference in cloudpickle vendor

* Reverted changes to paths not taken for the main by ref

* yapf

* undo unnecessary move
* Add Pyrefly configuration for Beam Python

* apache license

* Update sdks/python/pyrefly.toml

Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>

---------

Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>
Updated title and header for clarity.
* Add Observability Metrics

* Update script

* update readme

* fix readme

* update readme

* update port

* Update examples/terraform/envoy-ratelimiter/deploy.sh

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix gemini review

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…Elements (Part 2/3) (#37565)

* Add length-aware batching to BatchElements and ModelHandler

- Add length_fn and bucket_boundaries parameters to ModelHandler.__init__
  to support length-aware bucketed keying for ML inference batching
- Add WithLengthBucketKey DoFn to route elements by length buckets
- Update BatchElements to support length-aware batching when
  max_batch_duration_secs is set, reducing padding waste for
  variable-length sequences (e.g., NLP workloads)
- Default bucket boundaries: [16, 32, 64, 128, 256, 512]
- Add comprehensive tests validating bucket assignment, mixed-length
  batching, and padding efficiency improvements (77% vs 68% on bimodal data)
- All formatting (yapf) and lint (pylint 10/10) checks passed

* Refine length bucketing docs and fix boundary inclusivity

Expands parameter documentation for clarity and replaces bisect_left with bisect_right to ensure bucket boundaries are inclusive on the lower bound. Updates util_test.py assertions accordingly.
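
The effect of switching to `bisect_right` can be seen in a small standalone sketch. The boundaries are the defaults listed above; the function names `length_bucket` and `group_by_length_bucket` are hypothetical and are not the Beam API.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Sequence

DEFAULT_BUCKET_BOUNDARIES = [16, 32, 64, 128, 256, 512]


def length_bucket(
    element: Any,
    length_fn: Callable[[Any], int] = len,
    bucket_boundaries: Sequence[int] = DEFAULT_BUCKET_BOUNDARIES,
) -> int:
    """Return the bucket index for an element based on its length.

    With bisect_right, a length equal to a boundary falls into the bucket
    that starts at that boundary, so lower bounds are inclusive.
    """
    return bisect_right(bucket_boundaries, length_fn(element))


def group_by_length_bucket(
    elements: Iterable[Any],
    length_fn: Callable[[Any], int] = len,
    bucket_boundaries: Sequence[int] = DEFAULT_BUCKET_BOUNDARIES,
) -> Dict[int, List[Any]]:
    """Group elements of similar length so batches need less padding."""
    buckets: Dict[int, List[Any]] = defaultdict(list)
    for element in elements:
        key = length_bucket(element, length_fn, bucket_boundaries)
        buckets[key].append(element)
    return dict(buckets)
```

With the default boundaries, a length of 15 maps to bucket 0 and a length of 16 maps to bucket 1. Under `bisect_left`, 16 would have landed in bucket 0 instead.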
* Add num of models reporting

* Export memory estimate as well

* Fix lint

* switch to distribution

* Apply suggestion from @gemini-code-assist[bot]

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Store the namespace instead of strings

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Fix GroupBy snippet tests for issue #30778

- Add missing strawberry entry to GROCERY_LIST in all snippet files
- Fix test() block scoping (moved inside with beam.Pipeline() context)
- Remove beam.Map(print) from pipeline, add to else branch for standalone use
- Fix check_simple_aggregate_result to compare Row objects directly
- Set skip_due_to_30778 = False now that tests are passing
- Restore [START]/[END] snippet markers and 2-space indentation

* Fix yapf formatting issues in GroupBy snippet files

* Fix skip flag, restore markers, remove to_grocery_row, fix check_simple_aggregate_result

* Remove skip_due_to_30778 entirely and restore groupby_table markers

* Fix pylint W0106 by assigning beam.Map(print) result to _ in else branches
#37720)

* Fix Python PostCommit Flink runner log spam switching to simple logger

* Fix duplicate
* Fix ml_base dependency

* install twice
)

* add causedByDrain to DoFnRunner.onTimer interface and all implementations. Mostly passthrough.
)

Bumps [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go) from 1.38.0 to 1.40.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.38.0...v1.40.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-version: 1.40.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Claude <cvandermerwe@google.com>