Convert non-ASCII TLD filenames to Punycode in public-suffix data #1689

Merged
henriquemoody merged 1 commit into 3.0 from copilot/fix-installation-error-non-ascii
Feb 9, 2026

Contributor

Copilot AI commented Feb 9, 2026

Installation fails on platforms where unzip doesn't handle Unicode filenames in ZIP archives (notably macOS). The public suffix data contains files named 香港.php, ไทย.php, ישראל.php, and СРБ.php.

Changes

  • Renamed data files to uppercase Punycode equivalents:

    • 香港.php → XN--J6W193G.php
    • ไทย.php → XN--O3CW4H.php
    • ישראל.php → XN--4DBRK0CE.php
    • СРБ.php → XN--90A3AC.php
  • PublicDomainSuffix validator: Converts the TLD to Punycode with idn_to_ascii() before loading the data file:

    $punycoded = idn_to_ascii($tld, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    $tldForFile = $punycoded !== false ? $punycoded : $tld;
    $dataSource = DataLoader::load('domain/public-suffix/' . mb_strtoupper($tldForFile) . '.php');
  • UpdateDomainSuffixesCommand: Generate Punycode filenames for non-ASCII TLDs to ensure future updates from Mozilla's public suffix list maintain compatibility.

  • Added symfony/polyfill-intl-idn dependency: Ensures the idn_to_ascii() function is available even when the PHP intl extension is not installed, by providing a pure-PHP fallback implementation.

International domain validation continues to work transparently (e.g., ทหาร.ไทย, 個人.香港).
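
The lookup and filename rules described above can be sketched as a single helper. This is a minimal illustration, not the code merged in this PR: `publicSuffixFilename()` is a hypothetical name, and the sketch assumes `idn_to_ascii()` is provided by either the `intl` extension or `symfony/polyfill-intl-idn`, and that `mbstring` is available for `mb_strtoupper()`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of respect/validation): maps a TLD to the
 * ASCII-only data filename convention described in this PR.
 */
function publicSuffixFilename(string $tld): string
{
    $ascii = $tld;

    // idn_to_ascii() is assumed to come from ext-intl or the Symfony polyfill.
    if (function_exists('idn_to_ascii')) {
        $punycoded = idn_to_ascii($tld, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

        // On failure idn_to_ascii() returns false; keep the original TLD then.
        if ($punycoded !== false) {
            $ascii = $punycoded;
        }
    }

    // Data files are stored with uppercase names, e.g. XN--J6W193G.php.
    return mb_strtoupper($ascii) . '.php';
}

foreach (['com', '香港', 'ไทย', 'ישראל', 'СРБ'] as $tld) {
    echo $tld, ' => ', publicSuffixFilename($tld), PHP_EOL;
}
```

With `idn_to_ascii()` available, the loop should print the filenames listed under Changes (for example `XN--J6W193G.php` for 香港). ASCII TLDs such as `com` pass through unchanged apart from uppercasing.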

Original prompt

This section details on the original issue you should resolve

<issue_title>Installation fails due to non-ASCII filenames in public-suffix data</issue_title>
<issue_description>## Description

Installing respect/validation via Composer fails when unzip cannot handle non-ASCII (Unicode) filenames in the data/domain/public-suffix/ directory.

Error

Failed to extract respect/validation: (50) /usr/bin/unzip -qq .../tmp-48b1dadfa049a506943a53f060b2c123.zip -d .../0c2c0bef

.../Respect-Validation-ddb5ef6/data/domain/public-suffix/������.php:  write error (disk full?).  Continue? (y/n/^C) fchmod (file attributes) error: Bad file descriptor
warning:  cannot set modif./access times for .../Respect-Validation-ddb5ef6/data/domain/public-suffix/������.php
          No such file or directory
warning:  .../Respect-Validation-ddb5ef6/data/domain/public-suffix/������.php is probably truncated

Cause

The following public suffix data files use non-ASCII characters in their filenames:

  • 香港.php (Hong Kong)
  • ไทย.php (Thailand)
  • ישראל.php (Israel)
  • СРБ.php (Serbia)

The system unzip utility on some platforms (notably macOS) does not properly handle Unicode filenames in ZIP archives, causing extraction to fail.

Possible solutions

  • Rename the files to use their ASCII/Punycode equivalents (e.g., xn--j6w193g.php instead of 香港.php) and map them internally
  • Rename the files to use transliterated or descriptive ASCII names</issue_description>

<agent_instructions>Make sure you update the UpdateDomainSuffixesCommand and PublicDomainSuffix</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@alganet Punycode seems like a good solution.


Copilot AI changed the title from "[WIP] Fix installation failure due to non-ASCII filenames" to "Convert non-ASCII TLD filenames to Punycode in public-suffix data" Feb 9, 2026
Copilot AI requested a review from henriquemoody February 9, 2026 15:16
Member

@henriquemoody henriquemoody left a comment

Install symfony/polyfill-intl-idn as a dependency so users don't need to have intl installed

Contributor Author

Copilot AI commented Feb 9, 2026

Install symfony/polyfill-intl-idn as a dependency so users don't need to have intl installed

Added in 6483a2b. The polyfill ensures idn_to_ascii() is available even when the intl extension is not installed.

Copilot AI requested a review from henriquemoody February 9, 2026 15:36
@henriquemoody henriquemoody force-pushed the copilot/fix-installation-error-non-ascii branch from 6483a2b to ff4631b Compare February 9, 2026 16:21
Some systems and tools (e.g., certain archive extractors, Windows
environments, or CI pipelines) do not properly handle non-ASCII
characters in file paths. The public suffix data files for
internationalized TLDs (such as ישראל, СРБ, 香港, and ไทย) were stored
using their native Unicode names, which caused installation failures
on those systems.

This commit converts those filenames to their Punycode equivalents
(e.g., XN--4DBRK0CE.php instead of ישראל.php) using `idn_to_ascii()`.
Both the data generation command (`UpdateDomainSuffixesCommand`) and the
runtime validator (`PublicDomainSuffix`) are updated to use the same
Punycode-based file lookup, ensuring consistency. A polyfill dependency
(`symfony/polyfill-intl-idn`) is added so that `idn_to_ascii()` is
available even when the `intl` PHP extension is not installed.

Assisted-by: Claude Code (Claude Opus 4.6)
Co-authored-by: Henrique Moody <henriquemoody@gmail.com>
@henriquemoody henriquemoody force-pushed the copilot/fix-installation-error-non-ascii branch from ff4631b to ce50d33 Compare February 9, 2026 16:29
@henriquemoody henriquemoody marked this pull request as ready for review February 9, 2026 16:31
@henriquemoody henriquemoody merged commit eedce8f into 3.0 Feb 9, 2026
7 checks passed
@henriquemoody henriquemoody deleted the copilot/fix-installation-error-non-ascii branch February 9, 2026 16:35
Development

Successfully merging this pull request may close these issues.

Installation fails due to non-ASCII filenames in public-suffix data