Fix: Avoid materializing large ranges in foreach loops #201

Open
fglock wants to merge 2 commits into master from fix/foreach-range-memory

fglock (Owner) commented Feb 15, 2026

Summary

Fixes a memory issue in which the compiler materialized entire ranges (e.g., 1..50_000_000) when iterating over them in foreach loops, causing out-of-memory errors under low heap limits.

Problem

The compiler had an INSTANCEOF PerlRange optimization in the isGlobalUnderscore path, but the standard path for lexical loop variables was missing this check. This caused getArrayOfAlias() to be called on ranges, materializing all elements into memory.
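
To make the cost difference concrete, here is a minimal Java sketch contrasting lazy iteration with materialization. LazyRange and RangeMemoryDemo are hypothetical stand-ins invented for illustration; they are not the PerlOnJava PerlRange class, and materialize() only approximates what getArrayOfAlias() is described as doing above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a lazy integer range; NOT the PerlOnJava PerlRange class.
final class LazyRange implements Iterable<Long> {
    private final long start;
    private final long end; // inclusive upper bound

    LazyRange(long start, long end) {
        this.start = start;
        this.end = end;
    }

    // Lazy path: the iterator keeps only a cursor, so retained memory stays constant.
    @Override
    public Iterator<Long> iterator() {
        return new Iterator<Long>() {
            private long next = start;

            @Override
            public boolean hasNext() {
                return next <= end;
            }

            @Override
            public Long next() {
                if (next > end) throw new NoSuchElementException();
                return next++;
            }
        };
    }

    // Eager path (assumed analogue of materializing a range): one list slot per value.
    List<Long> materialize() {
        List<Long> all = new ArrayList<>();
        for (long v = start; v <= end; v++) all.add(v);
        return all;
    }
}

final class RangeMemoryDemo {
    // Walks the range without ever building a list.
    static long countLazily(long start, long end) {
        long count = 0;
        for (Long ignored : new LazyRange(start, end)) count++;
        return count;
    }

    public static void main(String[] args) {
        // Only the cursor is retained, so a small heap should suffice here.
        System.out.println(countLazily(1, 50_000_000L));
    }
}
```

Calling materialize() on the same bounds would instead retain 50 million list entries at once, which is the kind of allocation that exhausts a small heap.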

Solution

Added the same INSTANCEOF check to the standard foreach path:

  • If the iterable is a PerlRange: call .iterator() directly (lazy, no materialization)
  • Otherwise: call .iterator() on the base type (preserves array aliasing)
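
The two branches above can be sketched in Java roughly as follows. Every name here (Cell, SketchBase, SketchRange, SketchArray, ForeachDispatch) is hypothetical and is not taken from the PerlOnJava sources. The PR text does not show how the non-range branch obtains its iterator, so the sketch simply assumes it keeps an alias-based path.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Mutable scalar cell standing in for a Perl lvalue (hypothetical).
final class Cell {
    long value;
    Cell(long value) { this.value = value; }
}

// Hypothetical base type for anything a foreach loop can iterate over.
abstract class SketchBase {
    abstract Iterator<Cell> iterator();
    abstract List<Cell> getArrayOfAlias();
}

final class SketchRange extends SketchBase {
    static int materializations = 0; // demo-only counter of eager expansions
    private final long start, end;

    SketchRange(long start, long end) { this.start = start; this.end = end; }

    // Lazy: creates one fresh, modifiable cell per step.
    @Override
    Iterator<Cell> iterator() {
        return new Iterator<Cell>() {
            private long next = start;
            public boolean hasNext() { return next <= end; }
            public Cell next() {
                if (!hasNext()) throw new NoSuchElementException();
                return new Cell(next++);
            }
        };
    }

    // Eager: builds every element up front (the behavior the fix avoids).
    @Override
    List<Cell> getArrayOfAlias() {
        materializations++;
        List<Cell> all = new ArrayList<>();
        for (long v = start; v <= end; v++) all.add(new Cell(v));
        return all;
    }
}

final class SketchArray extends SketchBase {
    final List<Cell> cells = new ArrayList<>();

    SketchArray(long... values) { for (long v : values) cells.add(new Cell(v)); }

    @Override Iterator<Cell> iterator() { return cells.iterator(); }
    @Override List<Cell> getArrayOfAlias() { return cells; } // same cells, so writes alias
}

final class ForeachDispatch {
    // Type check first: ranges go straight to the lazy iterator.
    static Iterator<Cell> iteratorFor(SketchBase iterable) {
        if (iterable instanceof SketchRange) {
            return iterable.iterator();
        }
        // Assumption: non-range iterables keep an alias-preserving path.
        return iterable.getArrayOfAlias().iterator();
    }

    static long sum(SketchBase iterable) {
        long total = 0;
        for (Iterator<Cell> it = iteratorFor(iterable); it.hasNext(); ) total += it.next().value;
        return total;
    }

    // Writes through the loop variable; for arrays this changes the stored elements.
    static void incrementAll(SketchBase iterable) {
        for (Iterator<Cell> it = iteratorFor(iterable); it.hasNext(); ) it.next().value++;
    }
}
```

In this sketch, summing a SketchRange never touches getArrayOfAlias(), while incrementAll on a SketchArray still modifies the array's own cells, which mirrors the aliasing behavior the PR says is preserved.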

Testing

  • ✅ Works with 4MB heap: for my $v (1..50_000_000) { $x++ }
  • ✅ All unit tests pass
  • ✅ Array aliasing still works correctly
  • ✅ Range loop variables remain modifiable lvalues
  • ✅ Compiler now matches interpreter efficiency

Verification

Disassembly now shows the INSTANCEOF check:

INVOKESTATIC org/perlonjava/runtime/PerlRange.createRange (...)
DUP
INSTANCEOF org/perlonjava/runtime/PerlRange
IFEQ L1
INVOKEVIRTUAL org/perlonjava/runtime/RuntimeBase.iterator ()

Brings the compiler's memory usage to parity with the interpreter for large ranges.

🤖 Generated with Claude Code

The compiler was calling getArrayOfAlias() on ranges in some code paths,
which materialized the entire range (e.g., 50 million elements for 1..50_000_000).

The isGlobalUnderscore path already had an INSTANCEOF PerlRange check to use
.iterator() directly, but the standard path (for lexical loop variables) was
missing this optimization.

This fix adds the same INSTANCEOF optimization to the standard foreach path,
ensuring that ranges always use lazy iteration without materializing.

Benefits:
- Massive memory savings for large ranges
- No semantic changes - range iterators already return proper lvalues
- Array aliasing still works correctly for non-range iterables

The interpreter already had this optimization via the ITERATOR_CREATE opcode,
so this brings the compiler to parity with the interpreter.