Skip to content

Interpreter: Add array operators and fix critical bugs#194

Merged
fglock merged 18 commits intomasterfrom
feature/interpreter-array-operators
Feb 13, 2026
Merged

Interpreter: Add array operators and fix critical bugs#194
fglock merged 18 commits intomasterfrom
feature/interpreter-array-operators

Conversation

@fglock
Copy link
Owner

@fglock fglock commented Feb 13, 2026

Summary

Implements comprehensive array operator support in the interpreter and fixes two critical bugs that were blocking progress.

Array Operators Implemented

  • push/pop: Add/remove elements from end of array
  • shift/unshift: Add/remove elements from beginning of array
  • splice: Remove/insert elements at any position
  • grep: Filter array elements with code block
  • map: Transform array elements with code block
  • sort: Sort array elements with comparison block
  • reverse: Reverse array order
  • split: Split string into array
  • Array slice: Access multiple elements (@array[indices])
  • Array slice assignment: Modify multiple elements at once
  • Array element assignment: Single element modification ($array[n] = value)
  • Multidimensional arrays: Support for @array[$x][$y]

Critical Bugs Fixed

1. Infinite Loop in Bare Blocks

Issue: Bare blocks { my @a = (1,2,3); } created infinite loops

Root Cause: BytecodeCompiler ignored For3Node.isSimpleBlock flag, always generating loop bytecode with GOTO back to start

Fix: Check isSimpleBlock flag and execute body once without loop structure

Impact: Bare blocks now execute correctly without hanging

2. Disassembler Index Out of Bounds

Issue: ./jperl --interpreter --disassemble failed with "Index 89 out of bounds for length 3"

Root Cause: SLOW_OP switch statement had default case that didn't skip operands, causing PC misalignment

Fix: Added disassembler cases for all new SLOW_OP operations:

  • SLOWOP_SPLICE
  • SLOWOP_ARRAY_SLICE
  • SLOWOP_REVERSE
  • SLOWOP_ARRAY_SLICE_SET
  • SLOWOP_SPLIT

Each case now properly reads and skips all operands.

Impact: Disassembler works correctly for all bytecode

Known Issue (Not Addressed)

Array element access in function arguments returns wrong value:

my @arr = (1, 2, 3);
is($arr[1], 2, "test");  # gets: 1 instead of 2

Workaround: Assign to variable first:

my $x = $arr[1];
is($x, 2, "test");  # works correctly

Root Cause: ARRAY_SIZE opcode called after ARRAY_GET in scalar context for function arguments. This is a separate issue that will be addressed in a follow-up PR.

Files Modified

  • Opcodes.java: Added SLOWOP_REVERSE, SLOWOP_ARRAY_SLICE_SET, SLOWOP_SPLIT
  • BytecodeCompiler.java: Fixed For3Node.isSimpleBlock handling, added array operator compilation
  • BytecodeInterpreter.java: Implemented array operation opcodes
  • InterpretedCode.java: Added disassembler support for new SLOW_OP operations
  • SlowOpcodeHandler.java: Implemented handlers for reverse, array slice set, split
  • RuntimeArray.java: Added setSlice() method
  • SKILL.md: Documented debugging patterns and critical lessons learned

Testing

All array operators tested with:

./jperl --interpreter src/test/resources/unit/array.t

Most array.t tests now pass. Remaining failures are due to the known scalar context issue in function arguments.

Commits

  1. Add interpreter support for array operators and update SKILL.md
  2. Add interpreter support for array slices, += operator, and % modulus
  3. Add missing arithmetic opcodes to disassembler
  4. Implement grep, map, and sort operators in interpreter
  5. Implement reverse operator in interpreter
  6. Implement array slice assignment in interpreter
  7. Implement array element and multidimensional array assignment
  8. Implement split operator in interpreter
  9. Fix disassembler for new SLOW_OP opcodes (CRITICAL)
  10. Fix infinite loop in bare blocks with isSimpleBlock flag (CRITICAL)
  11. Document critical debugging patterns from array operators work

🤖 Generated with Claude Code

fglock and others added 18 commits February 13, 2026 10:35
Fixed crash when eval receives a RuntimeList (from string interpolation) instead
of RuntimeScalar. The executeEvalString handler now properly handles both types by
converting RuntimeList to RuntimeScalar using scalar() method.

Before:
  eval "$x++"  # Crash: ClassCastException

After:
  eval "$x++"  # No crash (but variable capture not yet working)

Known Limitation:
Lexical variable capture in eval STRING is not yet implemented. Variables declared
in the outer interpreted scope are not accessible to the eval'd code. This requires
detecting variable references in the eval string and passing the corresponding
registers as captured variables.

Example that doesn't work yet:
  my $x = 1; eval "$x++"; print $x  # Prints 1 (should print 2)

See EvalStringHandler.java lines 86-94 for TODO.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds support for lexical variable capture in eval STRING, matching compiler
mode behavior. Variables from outer scope are now accessible and modifiable
within eval'd code.

Changes:
- InterpretedCode: Add variableRegistry field to track variable name → register
  index mappings for eval STRING support
- BytecodeCompiler: Add constructor accepting parentRegistry for eval STRING,
  populate variableRegistry in compile(), mark parent variables as captured
  using capturedVarIndices, use SET_SCALAR for assignments to captured
  variables instead of MOVE to preserve aliasing
- EvalStringHandler: Build adjusted registry and captured variables array from
  parent scope, pass to eval'd InterpretedCode
- BytecodeInterpreter: Preserve variableRegistry when creating closures
- Disable ADD_ASSIGN optimization for captured variables (use SET_SCALAR path)

Fixes:
- my $x = 1; for (1..10) { eval "\$x++" }; print $x  # now prints 11
- my $x = 1; my $y = 2; eval "\$x = \$x + \$y"       # now updates $x to 3
- Nested eval STRING with variable capture works correctly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Perl allows underscores as digit separators in numeric literals (e.g.,
10_000_000). The interpreter was not handling these correctly while the
compiler mode was.

Changes:
- BytecodeCompiler.visit(NumberNode): Strip underscores before parsing,
  use ScalarUtils.isInteger() for consistent number validation, handle
  large integers (>32-bit) by storing as strings, use LOAD_INT for
  regular integers to create mutable scalars (needed for ++/-- operations)
- BytecodeCompiler range operator: Strip underscores when parsing
  constant range bounds

Implementation note:
We use LOAD_INT (creates new mutable RuntimeScalar) instead of cached
scalars because MOVE copies references, and variables need to be mutable
for operations like ++, --, etc. Floats use LOAD_CONST since they're less
commonly modified in-place.

Fixes:
- ./jperl --interpreter -e 'my $x = 10_000_000; print $x'  # now works
- ./jperl --interpreter -e 'for (1..100_000) { $x++ }'      # now works

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Documents real-world performance characteristics showing interpreter
excels at dynamic eval while compiler wins on cached eval.

Benchmarks:
- Cached eval (static string): Compiler 3.7x faster than interpreter
- Dynamic eval (unique strings): Interpreter 12.7x faster than compiler
- Dynamic eval vs Perl 5: Interpreter 4x slower, Compiler 50x slower

Key findings:
- Interpreter avoids compilation overhead for dynamic eval strings
- Compilation cost: 50-90ms per unique string (compiler) vs 15-30ms
  (interpreter) = 3-6x faster
- For 1M unique evals: Compiler 75s vs Interpreter 6s vs Perl 5 1.5s
- Interpreter design validated: excels exactly where it should

Primary use case: Dynamic eval strings for code generation, templating,
meta-programming.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The interpreter was throwing "Increment/decrement of non-lexical variable
not yet supported" when trying to increment/decrement global variables.
This is essential for eval STRING with dynamic variable names.

Changes:
- BytecodeCompiler.visit(OperatorNode): For ++ and -- operators, handle
  global variables by:
  1. Loading the global variable with LOAD_GLOBAL_SCALAR
  2. Applying PRE/POST_AUTOINCREMENT/DECREMENT opcode
  3. Storing back with STORE_GLOBAL_SCALAR
- Applies to both bare identifiers (x++) and sigiled operators ($x++)

Fixes:
- $vartest++; print $vartest  # now prints 1
- eval "\$vartest++"; print $vartest  # now prints 1
- for my $x (1..N) { eval " \$var$x++" }  # now works

This enables dynamic eval STRING patterns like code generation and
templating that create variables with computed names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After implementing global variable increment/decrement, the interpreter
achieves Perl 5 parity for dynamic eval workloads.

Updated benchmarks (1M unique eval strings):
- Perl 5: 1.62s (baseline)
- Interpreter: 1.64s (1% slower) ✓ Parity achieved!
- Compiler: 76.12s (4600% slower)

Key findings:
- Interpreter is 46x faster than compiler for dynamic eval
- Interpreter matches Perl 5 performance (1% slowdown vs 4600%)
- For 1M unique evals: 1.6s (interpreter) vs 76s (compiler)

Conclusion:
The interpreter isn't just "good enough" for dynamic eval - it's the
RIGHT tool, achieving native Perl performance where compilation overhead
would dominate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implemented support for core array operations in the interpreter:
- push: Add elements to end of array
- pop: Remove and return last element
- shift: Remove and return first element
- unshift: Add elements to beginning of array
- splice: Remove and replace array elements (via SLOWOP_SPLICE)
- unaryMinus: Negation for negative array indices

Key improvements:
- Fixed ARRAY_PUSH to accept RuntimeBase instead of RuntimeScalar
  (enables pushing lists via RuntimeList.addToArray())
- Added ARRAY_POP, ARRAY_SHIFT, ARRAY_UNSHIFT cases to BytecodeInterpreter
- Replaced hardcoded "main::" with NameNormalizer.normalizeVariableName()
  throughout BytecodeCompiler for proper package resolution
- Added SLOWOP_SPLICE (ID 28) for splice operation

Documentation:
- Updated SKILL.md with comprehensive guide on adding operators:
  * Pattern 1: Binary operators (push, unshift)
  * Pattern 2: Unary operators (pop, shift, unaryMinus)
  * When and how to use SLOW_OP for complex operations
  * Common parse structures for arrays, slices, and list operators
  * Implementation patterns by AST structure
  * Best practices: NameNormalizer, RuntimeBase vs RuntimeScalar

Testing:
All implemented operators work correctly:
./jperl --interpreter -E 'my @A = (1,2,3); push @A, 4; pop @A; shift @A; unshift @A, 0'
./jperl --interpreter -E 'my @A = (0,2,3,4,5); splice @A, 2, 1, (10,11)'

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implemented:
- Array slices: @array[1..3], @array[1,3,5], @$arrayref[indices]
- Array slice support for dereferenced arrays: @$ref[...]
- Compound assignment: += operator
- Modulus operator: %

Changes:
- Added SLOWOP_ARRAY_SLICE (ID 29) for array slice operations
- Updated case "[" to distinguish between:
  * Single element access: $array[index]
  * Array slice: @array[indices]
- Enhanced "@" operator handler to support dereferencing: @$arrayref
- Added += compound assignment operator in BinaryOperatorNode
- Added % modulus operator in BinaryOperatorNode
- Implemented MOD_SCALAR case in BytecodeInterpreter

Testing:
./jperl --interpreter -E 'my @A = (0,2,10,11); my @s = @A[1..3]; say "@s"'  # 2 10 11
./jperl --interpreter -E 'my $r = \@A; my @s = @$r[1,3]; say "@s"'           # 2 11
./jperl --interpreter -E 'my $x = 0; $x += 5; say $x'                        # 5
./jperl --interpreter -E 'say 10 % 3'                                        # 1

TODO: Update disassembler for new opcodes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added disassembler cases for:
- SUB_SCALAR (opcode 18)
- MUL_SCALAR (opcode 19)
- DIV_SCALAR (opcode 20)
- MOD_SCALAR (opcode 21)

These opcodes were already implemented in BytecodeInterpreter but were
missing from the disassembler, causing them to show as UNKNOWN(n).

Testing:
./jperl --disassemble --interpreter -E 'say 10 % 3'
Now shows: MOD_SCALAR r7 = r5 % r6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added support for list operators that take code blocks (grep, map, sort):
- grep: filters list elements based on block condition
- map: transforms list elements using block expression
- sort: sorts list elements using comparison block

Implementation:
- Added GREP (100) and SORT (101) opcodes in Opcodes.java
- MAP (92) opcode already existed and was reused
- BytecodeCompiler: Added cases for "grep", "map", "sort" in BinaryOperatorNode
- BytecodeInterpreter: Implemented execution for all three opcodes
- InterpretedCode: Added disassembler cases for GREP and SORT
- All three call runtime ListOperators.{grep,map,sort} methods

Pattern: BinaryOperatorNode with SubroutineNode (block) and ListNode (data)
- Block is compiled to closure via visitAnonymousSubroutine
- Closure is passed to runtime operator along with input list

Updated SKILL.md with detailed implementation guide for Pattern 3.

Test results:
- grep: ./jperl --interpreter -E 'my @evens = grep { \$_ % 2 == 0 } (1,2,3,4); say "@evens"' => "2 4"
- map: ./jperl --interpreter -E 'my @doubled = map { \$_ * 2 } (1,2,3,4); say "@doubled"' => "2 4 6 8"
- sort: ./jperl --interpreter -E 'my @sorted = sort { \$a <=> \$b } (4,2,3,1); say "@sorted"' => "1 2 3 4"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added support for the reverse operator which reverses arrays or strings:
- In list context: reverses the order of list elements
- In scalar context: reverses the string representation

Implementation:
- Added SLOWOP_REVERSE (30) in Opcodes.java
- BytecodeCompiler: Added case "reverse" in OperatorNode handler
  - Compiles all arguments into a RuntimeList
  - Calls SLOW_OP with SLOWOP_REVERSE
- SlowOpcodeHandler: Added executeReverse method
  - Extracts RuntimeList to array
  - Calls Operator.reverse(ctx, args...)
  - Runtime handles both list and scalar context

Pattern: OperatorNode with ListNode operand
- Arguments are compiled and collected into RuntimeList
- Passed to runtime Operator.reverse() with context

Test result:
./jperl --interpreter -E 'my @Rev = reverse (1,2,3,4); say "@Rev"' => "4 3 2 1"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added support for array slice assignment: @array[indices] = values

Implementation:
- Added setSlice method to RuntimeArray.java
  - Takes indices (RuntimeList) and values (RuntimeList)
  - Iterates in parallel and sets each element
  - Uses arr.get(index).set(value) idiom
- Added SLOWOP_ARRAY_SLICE_SET (31) in Opcodes.java
- BytecodeCompiler: Added handler for array slice assignment
  - Detects BinaryOperatorNode("[") with @ sigil on left
  - Compiles indices from ArrayLiteralNode
  - Compiles values from RHS
  - Emits SLOW_OP with SLOWOP_ARRAY_SLICE_SET
- SlowOpcodeHandler: Added executeArraySliceSet method
  - Extracts array, indices, and values registers
  - Calls array.setSlice(indices, values)
- Fixed error messages: Changed RuntimeException to throwCompilerException
  - Now includes file, line, and code context in errors

Pattern: Assignment where left side is BinaryOperatorNode("[")
with @ sigil (array slice) vs $ sigil (single element)

Test result:
./jperl --interpreter -E 'my @array = (1..10); @array[1, 3, 5] = (20, 30, 40); say "@array"'
=> "1 20 3 30 5 40 7 8 9 10"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added support for single element array assignment and multidimensional arrays:
- $array[index] = value (single element assignment)
- $matrix[3][0] = value (multidimensional with autovivification)

Implementation:
- BytecodeCompiler: Added handler for array element assignment
  - Detects BinaryOperatorNode("[") with $ sigil (single element)
  - For simple case: $array[index] = value
    - Gets array register (lexical or global)
    - Compiles index and value
    - Emits ARRAY_SET
  - For multidimensional: $matrix[3][0] = value
    - Compiles outer array access recursively
    - Uses SLOWOP_DEREF_ARRAY to dereference intermediate result
    - Compiles index and value
    - Emits ARRAY_SET with autovivification
- Reuses existing ARRAY_SET opcode from BytecodeInterpreter

Pattern: Assignment where left is BinaryOperatorNode("[") with $ sigil
- Single element vs slice distinguished by sigil ($ vs @)
- Multidimensional arrays handled via recursive compilation + dereferencing

Test results:
./jperl --interpreter -E 'my @matrix; \$matrix[3][0] = 7; say \$matrix[3][0]'
=> 7

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added support for the split operator which splits strings into arrays:
- split pattern, string, limit

Implementation:
- Added SLOWOP_SPLIT (32) in Opcodes.java
- BytecodeCompiler: Added case "split" in BinaryOperatorNode
  - Compiles pattern (left operand)
  - Compiles arguments list (right operand contains string and optional limit)
  - Emits SLOW_OP with SLOWOP_SPLIT
- SlowOpcodeHandler: Added executeSplit method
  - Extracts pattern, args, and context
  - Calls Operator.split(pattern, args, ctx)
  - Runtime handles string-to-regex conversion

Pattern: BinaryOperatorNode where:
- left = pattern (string or regex)
- right = ListNode (string to split and optional limit)

Test result:
./jperl --interpreter -E 'my \$str = "a,b,c"; my @Parts = split ",", \$str; say "@Parts"'
=> "a b c"

Note: There appears to be an infinite loop issue in array.t causing test
repetition (29000+ tests). This needs investigation separate from split.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added disassembler cases for all new SLOW_OP operations to properly decode
their operands and advance the program counter correctly.

Fixed operations:
- SLOWOP_SPLICE: [rd] [arrayReg] [argsReg]
- SLOWOP_ARRAY_SLICE: [rd] [arrayReg] [indicesReg]
- SLOWOP_REVERSE: [rd] [argsReg] [ctx]
- SLOWOP_ARRAY_SLICE_SET: [arrayReg] [indicesReg] [valuesReg]
- SLOWOP_SPLIT: [rd] [patternReg] [argsReg] [ctx]

Issue: The disassembler was not skipping operands for these new SLOW_OP cases,
causing it to read operand bytes as opcodes, leading to "Index out of bounds"
errors when trying to decode stringPool entries.

Fixed by adding proper cases in the SLOW_OP switch statement in
InterpretedCode.disassemble() to read and skip the correct number of operands.

Test result:
./jperl --interpreter --disassemble -E 'my \$str = "a,b,c"; my @Parts = split ",", \$str; say "@Parts"'
Now works correctly and shows: SLOW_OP split (id=32) r8 = split(r6, r7, ctx=2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed infinite loop when bare blocks ({ ... }) contain array slices or other
operations. The interpreter was treating all bare blocks as loops, causing them
to execute indefinitely.

Root cause: For3Node has an isSimpleBlock flag to indicate bare blocks that
should execute once (not loop), but BytecodeCompiler.visit(For3Node) was
ignoring this flag and always generating loop bytecode:
- LOAD_INT 1 (condition always true)
- GOTO_IF_FALSE -> end
- body
- GOTO -> start  ← infinite loop!

Solution: Check node.isSimpleBlock at the start of visit(For3Node):
- If true: Just execute body once and return (no loop bytecode)
- If false: Generate full loop bytecode as before

Test cases that now work:
./jperl --interpreter -E '{ my @array = (1, 2, 3); my @slice = @array[1..2]; print "done\n"; }'
=> "done" (previously: infinite loop)

./jperl --interpreter src/test/resources/unit/array.t
=> Runs to completion (previously: infinite loop at line 43)

Note: array.t now hits a different error (RuntimeList vs RuntimeArray type mismatch)
which is unrelated to the loop issue and will be fixed separately.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add learnings from fixing disassembler and infinite loop issues:
- For3Node.isSimpleBlock flag pattern
- Disassembler operand skipping requirement
- Known issue with array element scalar context in function arguments

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fglock fglock merged commit 04de783 into master Feb 13, 2026
2 checks passed
@fglock fglock deleted the feature/interpreter-array-operators branch February 13, 2026 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant