expand: cgu=3 for binary size by oech3 · Pull Request #10863 · uutils/coreutils

oech3 · 2026-02-10T16:35:47Z

~~1349360 -> 1210472 byte. cgu=2 caused performance regression.~~
#10863 (comment)

github-actions · 2026-02-10T16:46:42Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/follow-name (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/pr/bounded-memory is no longer failing!

xtqqczze · 2026-02-11T00:57:33Z

why not codegen-units = 1?

oech3 · 2026-02-11T02:12:40Z

see comment at Cargo.toml

xtqqczze · 2026-02-11T02:28:17Z

Looking at rust-lang/rust#148670, it looks like the issue is due to stack spill on x86? if so can we use codegen-units = 1 for aarch64?

oech3 · 2026-02-11T08:01:29Z

I cannot bench it. We should fix perf regression with cgu=1 for all targets instead.

github-actions · 2026-02-11T09:12:25Z

GNU testsuite comparison:

Congrats! The gnu test tests/pr/bounded-memory is no longer failing!

oech3 · 2026-02-11T15:28:25Z

I tried adding `#[inline(never)]` to some functions, but it regressed performance in most cases.

xtqqczze · 2026-02-13T15:49:44Z

> if so can we use codegen-units = 1 for aarch64?

Unfortunately it doesn't appear to be possible to specify target_arch cfg with profile.release

oech3 · 2026-02-13T15:52:39Z

Also, we can't apply a cgu=N workaround to the coreutils binary.

xtqqczze · 2026-02-13T15:54:18Z

> I tried adding `#[inline(never)]` to some functions, but it regressed performance in most cases.

We would likely need to manually outline cold code.

xtqqczze · 2026-02-13T16:05:40Z

coreutils/src/uu/expand/src/expand.rs

Line 427 in 93ae934

    
           // Fast path: if there are no tabs, backspaces, and (in UTF-8 mode or no carriage returns),

Here: might be worth moving cold code into an inner slow_path function.
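As a rough illustration of the outlining idea, here is a minimal sketch. It is not the actual `expand.rs` code: `expand_line`, `expand_line_slow`, and the simplified tab handling are hypothetical stand-ins, and `tab_width` is assumed to be non-zero. Whether `#[cold]` and `#[inline(never)]` help here would still need to be benchmarked.

```rust
/// Hypothetical per-line entry point (not the real uutils function).
/// Assumes `tab_width > 0`.
fn expand_line(line: &[u8], tab_width: usize, out: &mut Vec<u8>) {
    // Fast path: no tabs, so the line is copied through unchanged.
    if !line.contains(&b'\t') {
        out.extend_from_slice(line);
        return;
    }
    expand_line_slow(line, tab_width, out);
}

/// Slow path kept out of line so the fast-path caller stays small.
#[cold]
#[inline(never)]
fn expand_line_slow(line: &[u8], tab_width: usize, out: &mut Vec<u8>) {
    let mut column = 0;
    for &byte in line {
        if byte == b'\t' {
            // Pad with spaces up to the next tab stop.
            let pad = tab_width - (column % tab_width);
            out.resize(out.len() + pad, b' ');
            column += pad;
        } else {
            out.push(byte);
            column += 1;
        }
    }
}
```

With this sketch, `expand_line(b"a\tb", 8, &mut out)` appends `a`, seven spaces, and `b`, while a line without tabs never reaches the slow function.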

ChrisDryden · 2026-02-13T16:30:17Z

I'm trying to go through these compiler-flag PRs, and my general take is that this is an outcome-driven decision: if we have measurements and benchmarks that show an improvement, we should always go for it. But if we don't have benchmarks, spending time on these flags is probably not the best use of effort right now relative to other things, because we are still changing a lot of functionality. I'm now seeing a number of PRs where code changes inside a function that has compiler hints, and there's no easy way to tell whether the hints should stay, since it's not benchmarked.

xtqqczze · 2026-02-13T16:31:19Z

Also, to outline cold code, we should refactor error handling. For example:

map_err_context(|| translate!("expand-error-failed-to-write-output"))

should be:

map_err_context(ExpandError::FailedToWriteOutput)
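A minimal sketch of what such a refactor could look like, using only the standard library. The variant name `FailedToWriteOutput` comes from the comment above; the payload, the message text, and the `write_chunk` helper are assumptions for illustration, and plain `map_err` stands in for uucore's `map_err_context`.

```rust
use std::fmt;
use std::io::{self, Write};

/// Hypothetical error type; only the variant name is taken from the thread.
#[derive(Debug)]
enum ExpandError {
    FailedToWriteOutput(io::Error),
}

impl fmt::Display for ExpandError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The message is only formatted when the error is displayed,
            // which keeps that work off the hot path.
            Self::FailedToWriteOutput(e) => write!(f, "failed to write output: {e}"),
        }
    }
}

impl std::error::Error for ExpandError {}

/// Hypothetical helper showing the call site.
fn write_chunk<W: Write>(out: &mut W, chunk: &[u8]) -> Result<(), ExpandError> {
    // A tuple-variant constructor is an ordinary function, so it can be
    // passed directly instead of a closure that builds a string.
    out.write_all(chunk).map_err(ExpandError::FailedToWriteOutput)
}
```

The design point is that the call site only wraps the `io::Error`; any translation or string building can then live in the `Display` impl, which is reached only on the error path.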

oech3 · 2026-02-13T16:31:42Z

This reduces binary size.

xtqqczze · 2026-02-13T16:33:20Z

> This reduces binary size.

what is binary size with codegen-units = 1?

oech3 · 2026-02-13T16:40:54Z

cgu=16: 1211048 bytes
cgu=3: 1210536 bytes
cgu=1: 1208152 bytes (bad perf)

oech3 · 2026-02-13T16:45:48Z

Performance with cgu=1 is important because #10863 (comment)

github-actions · 2026-02-13T16:50:52Z

GNU testsuite comparison:

Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/factor/t27 is no longer failing!
Congrats! The gnu test tests/factor/t29 is no longer failing!
Congrats! The gnu test tests/factor/t34 is no longer failing!
Congrats! The gnu test tests/pr/bounded-memory is no longer failing!
Note: The gnu test tests/cut/cut-huge-range is now being skipped but was previously passing.
Note: The gnu test tests/dd/no-allocate is now being skipped but was previously passing.
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.
Congrats! The gnu test tests/csplit/csplit-heap is now passing!
Congrats! The gnu test tests/cut/bounded-memory is now passing!

ChrisDryden · 2026-02-13T16:55:41Z

It's more a question of where to focus the effort. Some of the fixes that reduce the need for a dependency, especially for the locale stuff, would probably have orders of magnitude more impact on binary size than working on compiler flags. The flag work is more fragile in the long run than the dependency work, because the compiler can change, and when we update Rust the fixes may no longer apply, whereas streamlining the dependency chain holds across versions. As an example, according to cargo bloat, only 2.6% of the release size comes from uu_expand; the rest comes from other dependencies.

oech3 · 2026-02-13T16:59:03Z

If someone finds code that improves both readability and cgu=1 performance, is that OK with you?

oech3 marked this pull request as ready for review February 10, 2026 16:53

oech3 mentioned this pull request Feb 11, 2026

expand: Fix perf regression with codegen-units=2 #10874

Open

expand: cgu=3 for binary size

9c3167a

oech3 force-pushed the expand3 branch from 3bf2e71 to 9c3167a Compare February 13, 2026 16:36
