
Implement interpreter array operators and fix type conversions #198

Merged
fglock merged 4 commits into master from feature/interpreter-array-operators
Feb 14, 2026

@fglock fglock commented Feb 14, 2026

Summary

This PR implements missing Perl operators and features in the PerlOnJava interpreter so that complex test suites can run. The main focus is on array and hash operations, type conversions, and error reporting.

Key Features Implemented

Operators (17 new)

  • List assignment: ($a, $b) = @array with proper scalar context
  • Array dereferencing: push/pop/shift/unshift/splice with @$arrayref
  • Hash operations: keys, values, exists, delete, slices
  • Comparison operators: All numeric/string comparisons with auto-scalar conversion
  • Math operators: All arithmetic operators (add, sub, mul, div, mod, pow) with auto-scalar conversion
  • Reference operators: ref, bless, isa, defined
  • Variable declarations: my/our with a ListNode operand, e.g. my ($x, $y)
  • String operators: length, cmp

Type System Improvements

  • Auto-conversion of RuntimeArray/RuntimeList to RuntimeScalar in:
    • All comparison operators (==, !=, <, >, <=>, cmp)
    • All arithmetic operators (+, -, *, /, %, **)
    • Conditional jumps (if/while/for conditions)
    • Global variable storage
    • JOIN, PRINT, SAY operations
  • ARRAY_GET handles both RuntimeArray and RuntimeList
  • ARRAY_SIZE returns count for RuntimeList (not last element)

Error Reporting

  • Accurate line number tracking using TreeMap with floorEntry
  • Bytecode context display for ClassCastException errors
  • PC-to-token-index mapping that follows the same approach as codegen

Bug Fixes

  • Fixed SLOWOP_LIST_SLICE_FROM bytecode reading (2 shorts, not 4)
  • Fixed list assignment scalar context to return correct counts
  • Fixed infinite loops in bare blocks with isSimpleBlock flag
  • Fixed context propagation in assignments and operators

Test Results

Before: 0 tests passing in demo.t
After: 50+ tests passing (7+ subtests complete)

✅ Subtest 1: Variable assignment (2/2)
✅ Subtest 2: List assignment in scalar context (13/13)
✅ Subtest 3: List assignment with lvalue array/hash (16/16)
✅ Subtest 4: Basic syntax tests
✅ Subtest 6: Map tests (2/2)
✅ Subtest 7: Grep tests (2/2)
⚠️  Subtest 8: Sort tests (4/5)
✅ Subtest 9: Object tests (2/2)

Benchmark Results

50 million iteration loop comparison:

# Perl 5
time perl -e 'my $x; for my $v (1..50_000_000) { $x++ }; print $x, "\n";'
# 0.434 seconds

# PerlOnJava compiled mode
time ./jperl -e 'my $x; for my $v (1..50_000_000) { $x++ }; print $x, "\n";'
# 0.373 seconds (16% faster than Perl 5!)

# PerlOnJava interpreter mode
time ./jperl --interpreter -e 'my $x; for my $v (1..50_000_000) { $x++ }; print $x, "\n";'
# 2.717 seconds (~7.3x slower than compiled)

Performance: Interpreter is ~7-10x slower than compiled mode, which is acceptable for debugging/development use cases.

Technical Details

Line Number Tracking

Implemented statement-level tracking in BlockNode visitor, matching codegen's approach:

  • TreeMap for PC→tokenIndex with floorEntry lookup
  • ErrorMessageUtil for token→line conversion
  • Accurate source locations in error messages
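
The floorEntry lookup described above can be sketched as follows. This is a minimal illustration, not the PerlOnJava implementation: the class name PcLineMapSketch, its method names, and the sample PC and token values are all hypothetical. Only java.util.TreeMap and its floorEntry method are real APIs.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not a PerlOnJava class.
final class PcLineMapSketch {
    // Sorted by PC so floorEntry can find the nearest recorded PC at or below a given PC.
    private final TreeMap<Integer, Integer> pcToTokenIndex = new TreeMap<>();

    // Record the token index of the statement whose bytecode starts at pc.
    void recordStatement(int pc, int tokenIndex) {
        pcToTokenIndex.put(pc, tokenIndex);
    }

    // Token index of the statement containing pc, or -1 if no statement starts at or before pc.
    int tokenIndexFor(int pc) {
        Map.Entry<Integer, Integer> entry = pcToTokenIndex.floorEntry(pc);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        PcLineMapSketch map = new PcLineMapSketch();
        map.recordStatement(0, 3);
        map.recordStatement(700, 412);
        map.recordStatement(730, 420);
        // pc=724 has no exact entry, so the statement starting at pc=700 is used.
        System.out.println(map.tokenIndexFor(724)); // prints 412
    }
}
```

A HashMap would only answer exact-PC lookups. Recording one entry per statement and taking the floor lets any PC inside that statement resolve to the same token index.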

Type Conversions

Consistent pattern across all operators:

RuntimeBase val = registers[rs];
RuntimeScalar s = (val instanceof RuntimeScalar)
    ? (RuntimeScalar) val
    : val.scalar();
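
To show the pattern in a form that compiles on its own, here is a sketch with stand-in types. Value, ScalarValue, and ArrayValue are hypothetical and much simpler than the real RuntimeBase, RuntimeScalar, and RuntimeArray classes. The sketch also assumes that an array in scalar context yields its element count, as in Perl.

```java
import java.util.List;

// Stand-in types for illustration only; not the PerlOnJava runtime classes.
final class CoerceSketch {
    abstract static class Value {
        abstract ScalarValue scalar();
    }

    static final class ScalarValue extends Value {
        final long value;
        ScalarValue(long value) { this.value = value; }
        @Override ScalarValue scalar() { return this; }
    }

    static final class ArrayValue extends Value {
        final List<ScalarValue> elements;
        ArrayValue(List<ScalarValue> elements) { this.elements = elements; }
        // Assumption: scalar context yields the element count, as for a Perl array.
        @Override ScalarValue scalar() { return new ScalarValue(elements.size()); }
    }

    // Same shape as the pattern above: reuse a scalar as-is, convert anything else.
    static ScalarValue toScalar(Value val) {
        return (val instanceof ScalarValue) ? (ScalarValue) val : val.scalar();
    }

    // A numeric-equality handler that accepts any operand kind instead of casting blindly.
    static boolean numericEquals(Value left, Value right) {
        return toScalar(left).value == toScalar(right).value;
    }

    public static void main(String[] args) {
        Value keys = new ArrayValue(List.of(new ScalarValue(10), new ScalarValue(20)));
        // Mirrors `keys %hash == 2`: the two-element array is compared by its count.
        System.out.println(numericEquals(keys, new ScalarValue(2))); // prints true
    }
}
```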

Bytecode Architecture

  • Short-based bytecode: 16-bit operands, allowing up to 65K registers
  • SLOW_OP gateway for rare operations
  • Three-address code: rd = rs1 op rs2
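
The three points above can be sketched as a tiny dispatch loop. Everything here is hypothetical: the opcode numbers, the operand layout, the use of long registers, and the SLOWOP_POW operation are chosen for illustration and do not match PerlOnJava's actual values. Masking operands with 0xFFFF is one assumed way to reach 65K register indexes with Java's signed short.

```java
// Hypothetical opcodes and layout for illustration; not PerlOnJava's bytecode.
final class ThreeAddressSketch {
    static final short LOAD_CONST = 1; // operands: rd, constant
    static final short ADD = 2;        // operands: rd, rs1, rs2
    static final short SLOW_OP = 3;    // operands: slowOpId, then that operation's operands
    static final short HALT = 4;

    static final short SLOWOP_POW = 1; // operands: rd, rs1, rs2

    // Rare operations sit behind a single opcode so the hot switch stays small.
    // Returns the PC just past the operands this operation consumed.
    static int executeSlowOp(short[] code, int pc, long[] registers) {
        short slowOpId = code[pc++];
        switch (slowOpId) {
            case SLOWOP_POW: {
                int rd = code[pc++] & 0xFFFF, rs1 = code[pc++] & 0xFFFF, rs2 = code[pc++] & 0xFFFF;
                registers[rd] = (long) Math.pow(registers[rs1], registers[rs2]);
                return pc;
            }
            default:
                throw new IllegalStateException("Unknown slow op " + slowOpId);
        }
    }

    static long[] run(short[] code, int registerCount) {
        long[] registers = new long[registerCount];
        int pc = 0;
        while (true) {
            short opcode = code[pc++];
            switch (opcode) {
                case LOAD_CONST: {
                    int rd = code[pc++] & 0xFFFF;
                    registers[rd] = code[pc++];
                    break;
                }
                case ADD: { // three-address form: rd = rs1 + rs2
                    int rd = code[pc++] & 0xFFFF, rs1 = code[pc++] & 0xFFFF, rs2 = code[pc++] & 0xFFFF;
                    registers[rd] = registers[rs1] + registers[rs2];
                    break;
                }
                case SLOW_OP:
                    pc = executeSlowOp(code, pc, registers);
                    break;
                case HALT:
                    return registers;
                default:
                    throw new IllegalStateException("Unknown opcode " + opcode + " at pc=" + (pc - 1));
            }
        }
    }

    public static void main(String[] args) {
        short[] code = {
            LOAD_CONST, 0, 2,
            LOAD_CONST, 1, 10,
            ADD, 2, 0, 1,
            SLOW_OP, SLOWOP_POW, 3, 0, 1,
            HALT
        };
        long[] r = run(code, 4);
        System.out.println(r[2] + " " + r[3]); // prints 12 1024
    }
}
```

Each handler has to consume exactly the operands the compiler emitted, or every later opcode is read from the wrong position. The SLOWOP_LIST_SLICE_FROM fix in this PR (2 shorts, not 4) concerns that kind of operand-count mismatch.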

Performance Impact

  • Zero overhead for compiled code (interpreter is opt-in)
  • Interpreter ~7-10x slower than compiled (acceptable for debugging/development)
  • In some cases, compiled mode is faster than Perl 5 (0.373 s vs. 0.434 s in the loop benchmark above)

Related Issues

  • Enables running Perl 5 test suites in interpreter mode
  • Foundation for interactive debugging features
  • Supports eval STRING and dynamic code generation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fglock and others added 4 commits February 13, 2026 23:27
…RAY_GET/ARRAY_SIZE enhancements

- Add POW_SCALAR (22) opcode for exponentiation operator (**)
- Fix ARRAY_SIZE to return count for RuntimeList instead of last element
- Fix ARRAY_GET to handle both RuntimeArray and RuntimeList
- Fix SLOWOP_LIST_SLICE_FROM to read correct number of shorts (2 instead of 4)
- Add disassembler support for SCALAR_TO_LIST, LIST_TO_SCALAR, POW_SCALAR, and SLOWOP_LIST_SLICE_FROM
- Improve interpreter error formatting to show PC and token index

Progress: 3 subtests of demo.t now pass completely (26 tests passing)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eeMap

Track line numbers like Perl codegen does:
- Add line number tracking at each statement in BlockNode visitor
- Use TreeMap for pcToTokenIndex (instead of HashMap) to enable floorEntry lookup
- Use floorEntry to find nearest token index when exact PC match not found
- Pass ErrorMessageUtil through InterpretedCode for line number conversion

Error messages now show accurate source line numbers:
Before: "Interpreter error in demo.t:1 at pc=724" and "demo.t:1446"
After:  "Interpreter error in demo.t line 112 (pc=724)"

All line numbers are now within the actual file size and point to the correct source lines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Make JOIN opcode handle non-scalar separators by converting to scalar
- Make PRINT and SAY opcodes handle RuntimeArray in addition to RuntimeList/RuntimeScalar
- Add ClassCastException handler to show bytecode context around errors
- Display bytecode hex dump (10 bytes before, 5 after) to help debug opcode issues

Error messages now show:
- Bytecode context: [ ... >>> AE <<< ... ]
- Helps identify undefined opcodes or PC misalignment issues
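
A context dump of this shape could be produced roughly as follows. This is a sketch under assumptions: the bytecode is viewed as a byte array, pc is a valid index into it, and the class and method names are hypothetical rather than taken from the interpreter.

```java
// Hypothetical formatter for illustration; not the interpreter's actual handler.
final class BytecodeContextSketch {
    // Formats the bytes around pc as hex and marks the byte at pc with >>> <<<.
    static String context(byte[] code, int pc, int before, int after) {
        int start = Math.max(0, pc - before);
        int end = Math.min(code.length, pc + after + 1);
        StringBuilder sb = new StringBuilder("[");
        if (start > 0) sb.append(" ..."); // bytes were cut off on the left
        for (int i = start; i < end; i++) {
            String hex = String.format("%02X", code[i] & 0xFF);
            sb.append(i == pc ? " >>> " + hex + " <<<" : " " + hex);
        }
        if (end < code.length) sb.append(" ..."); // bytes were cut off on the right
        return sb.append(" ]").toString();
    }

    public static void main(String[] args) {
        byte[] code = {0x01, 0x02, (byte) 0xAE, 0x03};
        System.out.println(context(code, 2, 1, 1)); // prints [ ... 02 >>> AE <<< 03 ]
    }
}
```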

Progress: 3 subtests passing (26/26 tests), working on opcode 0xAE issue

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rands

- Add auto-conversion to scalar for all comparison operators (==, !=, <, >, <=>, cmp)
- Implement missing COMPARE_STR (cmp) opcode in BytecodeInterpreter
- Fix STORE_GLOBAL_SCALAR to convert non-scalar values before storing
- All comparison operators now handle RuntimeArray/RuntimeList by converting to scalar

This fixes the ClassCastException when comparing non-scalar values:
- `keys %hash == 2` now works (keys returns array, converted to count for comparison)
- String interpolation with arrays now works correctly

Test Results:
✅ Subtest 1: Variable assignment (2/2)
✅ Subtest 2: List assignment in scalar context (13/13)
✅ Subtest 3: List assignment with lvalue array/hash (16/16)
✅ Subtest 4: Basic syntax tests
✅ Subtest 6: Map tests (2/2)
✅ Subtest 7: Grep tests (2/2)
⚠️  Subtest 8: Sort tests (4/5 - one sort without block issue)
✅ Subtest 9: Object tests (2/2)

Total: 50+ tests passing (up from 26)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fglock fglock merged commit a62e3d1 into master Feb 14, 2026
2 checks passed
@fglock fglock deleted the feature/interpreter-array-operators branch February 14, 2026 10:07