Skip to content

Conversation

@RanVaknin
Copy link
Contributor

@RanVaknin RanVaknin commented Feb 10, 2026

[NEEDS MORE TEST COVERAGE TO MERGE]

Overview

This PR targets two bottlenecks in UpdateExpressionConverter.toExpression as part of the Update operation request pipeline. toExpression scales and becomes the most noticable bottleneck the bigger the input is: ~40% cpu time in SMALL datasets, and ~74% in HUGE_FLAT datasets.

Interactive Flamegraph

image

1. String.format overhead (~18% CPU time)

groupExpressions uses String.format to build the update expression for each attribute. Can replace with a simple string concatenation which the JVM optimizes heavily.

image

2. Nested Stream.concat() in mergeExpressionNames() and mergeExpressionValues() (15-50% CPU time)

image

Current implementation:

private static Map<String, String> mergeExpressionNames(UpdateExpression expression) {
    return streamOfExpressionNames(expression)
        .reduce(Expression::joinNames)
        .orElseGet(Collections::emptyMap);
}

private static Stream<Map<String, String>> streamOfExpressionNames(UpdateExpression expression) {
    return Stream.concat(
        expression.setActions().stream().map(SetAction::expressionNames),
        Stream.concat(
            expression.removeActions().stream().map(RemoveAction::expressionNames),
            Stream.concat(
                expression.deleteActions().stream().map(DeleteAction::expressionNames),
                expression.addActions().stream().map(AddAction::expressionNames)
            )
        )
    );
}

The problem is that Stream.reduce() operation calls Expression.joinNames() for each action type (SET, ADD, REMOVE, DELETE). This method creates a new HashMap and copies all existing entries on every iteration:

public static Map<String, String> joinNames(Map<String, String> map1, Map<String, String> map2) {
    Map<String, String> result = new HashMap<>(map1);  // <- Allocate + copy all entries
    map2.forEach((key, value) -> {
        String oldValue = result.put(key, value);
        if (oldValue != null && !oldValue.equals(value)) {
            throw new IllegalArgumentException(...);
        }
    });
    return Collections.unmodifiableMap(result);
}

Example: With N attributes distributed across 4 action types, Stream.reduce() creates 3 intermediate HashMaps. For example, with 20 attributes (5 per action type), it copies 5 entries, then 10, then 15, before producing the final 20 entry result. The cost scales poorly with dataset size.

Instead we can iterate directly on each action and use a single HashMap:

private static Map<String, String> mergeExpressionNames(UpdateExpression expression) {
    Map<String, String> merged = new HashMap<>();
    
    for (SetAction action : expression.setActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    for (RemoveAction action : expression.removeActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    for (DeleteAction action : expression.deleteActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    for (AddAction action : expression.addActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    
    return merged.isEmpty() ? Collections.emptyMap() : Collections.unmodifiableMap(merged);
}

private static void mergeNamesInto(Map<String, String> target, Map<String, String> source) {
    if (source == null || source.isEmpty()) {
        return;
    }
    source.forEach((key, value) -> {
        String oldValue = target.get(key);
        if (oldValue != null && !oldValue.equals(value)) {
            throw new IllegalArgumentException(
                String.format("Attempt to coalesce two expressions with conflicting expression names. "
                            + "Expression name key = '%s'", key));
        }
        target.put(key, value);
    });
}

This eliminates intermediate HashMap allocations and theredundant entry copying. The same proposed optimization also applies to mergeExpressionValues().


Results:

toExpression went from ~40% CPU time to ~15%

  • groupExpressions went from 18% CPU time to 6%
  • mergeExpressionValues and mergeExpressionNames went down from 18% to ~4%
image
Operation Size Master (μs) String Fix (μs/op) Δ +Stream flattening (μs/op) Total Δ
Update TINY 1.38 1.395 +1.09% 1.366 -1.01%
Update SMALL 9.174 8.653 -5.68% 7.202 -21.50%
Update HUGE 39.23 37.51 -4.38% 34.438 -12.22%
Update HUGE_FLAT 232.677 224.088 -3.69% 81.248 -65.08%

@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant