Skip to content

Night Shift: Unicode display names for i18n users#22

Open
EtanHey wants to merge 1 commit intomasterfrom
nightshift/2026-02-26-3370
Open

Night Shift: Unicode display names for i18n users#22
EtanHey wants to merge 1 commit intomasterfrom
nightshift/2026-02-26-3370

Conversation

@EtanHey
Copy link
Owner

@EtanHey EtanHey commented Feb 26, 2026

User description

Automated improvement by Golems Night Shift.

Unicode display names for i18n users


PR Type

Enhancement


Description

  • Enable Unicode display names for international users (Hebrew, Arabic, Korean, Persian)

  • Update regex pattern from ASCII-only to Unicode-aware character validation

  • Improve code formatting and consistency throughout leaderboard module

  • Update UI placeholder text to reflect Unicode support


Diagram Walkthrough

flowchart LR
  A["ASCII-only regex<br/>[a-zA-Z0-9\s]"] -->|Replace with| B["Unicode-aware regex<br/>[\p{L}\p{N}\s]u"]
  B -->|Allows| C["International scripts<br/>Hebrew, Arabic, Korean, Persian"]
  C -->|Enables| D["Global user support"]
Loading

File Walkthrough

Relevant files
Enhancement
leaderboard.ts
Unicode regex and code formatting improvements                     

convex/leaderboard.ts

  • Changed display name validation regex from /^[a-zA-Z0-9\s]+$/ to
    /^[\p{L}\p{N}\s]+$/u to support Unicode letters and numbers
  • Updated error message comment to reflect Unicode character support
  • Applied consistent code formatting: multi-line function parameters,
    proper indentation, quote standardization
  • Fixed trailing whitespace issues throughout the file
+58/-24 
welcome.tsx
Update UI text for Unicode support                                             

src/routes/_authed/welcome.tsx

  • Updated UI placeholder text from "3-20 characters, letters and spaces
    only" to "3-20 characters, letters, numbers, and spaces"
  • Applied consistent code formatting with multi-line function calls
+5/-3     

Note

Low Risk
Small, localized input-validation change; main risk is allowing additional characters that may affect downstream rendering/search, but no auth or data model changes.

Overview
Updates setDisplayName validation to allow Unicode letters/numbers (with spaces) instead of ASCII-only, enabling i18n display names on leaderboards.

Adjusts the welcome screen copy to match the new rule and includes minor formatting-only refactors across leaderboard query helpers (no functional behavior changes beyond display-name validation).

Written by Cursor Bugbot for commit a2daa87. This will update automatically on new commits. Configure here.

Summary by CodeRabbit

  • New Features
    • Display names can now include Unicode letters, numbers, and spaces. The input validation hint has been updated to reflect this expanded character support.

…ian)

The setDisplayName regex only accepted ASCII [a-zA-Z0-9\s], blocking all
non-Latin scripts. Changed to Unicode-aware [\p{L}\p{N}\s] so users of
the app's target languages can set proper display names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel
Copy link

vercel bot commented Feb 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
songscript Building Building Preview, Comment Feb 26, 2026 2:06am

@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

📝 Walkthrough

Walkthrough

The changes update display name validation to accept Unicode letters and numbers in addition to spaces, replacing the previous ASCII-only constraint. Related user-facing help text is updated to reflect the expanded character set. Remaining modifications are formatting adjustments.

Changes

Cohort / File(s) Summary
Display Name Validation
convex/leaderboard.ts
Regex pattern for display name validation broadened from ASCII alphanumerics to Unicode letters and numbers with spaces. Error messages reformatted for clarity. Structural formatting adjustments applied throughout.
Welcome UI Text
src/routes/_authed/welcome.tsx
User-facing description text updated from "3-20 characters, letters and spaces only" to "3-20 characters, letters, numbers, and spaces" to align with expanded validation rules. Mutation invocation reformatted across lines.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰✨ Unicode hops into view,
Letters and numbers, both old and new,
From ASCII bounds we now break free,
A broader world for names to be! 🌍🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main functional change: expanding display name validation to accept Unicode characters for international users, which aligns with the regex update from ASCII to Unicode character classes.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch nightshift/2026-02-26-3370

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-code-review
Copy link

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Control whitespace allowed

Description: The new Unicode validation regex uses \s (/^[\p{L}\p{N}\s]+$/u), which matches tabs and
newlines in JavaScript, allowing multi-line or control-whitespace display names that can
be used for UI spoofing/layout manipulation or log/notification injection; consider
limiting whitespace to a normal space (e.g., [ ]) and/or explicitly rejecting \r, \n, and
\t.
leaderboard.ts [373-380]

Referred Code
// Validate display name contains valid characters (Unicode letters, numbers + spaces)
const validPattern = /^[\p{L}\p{N}\s]+$/u;
if (!validPattern.test(displayName)) {
  return {
    success: false,
    error: "Display name can only contain letters, numbers, and spaces",
  };
}
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Overbroad whitespace allowed: The new Unicode display name validation regex ^[\p{L}\p{N}\s]+$ allows any \s whitespace
(e.g., tabs/newlines), permitting multi-line or control-like spacing in user-generated
display names which can enable UI spoofing or log/formatting injection issues.

Referred Code
// Validate display name contains valid characters (Unicode letters, numbers + spaces)
const validPattern = /^[\p{L}\p{N}\s]+$/u;
if (!validPattern.test(displayName)) {
  return {
    success: false,
    error: "Display name can only contain letters, numbers, and spaces",
  };
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logging: The setDisplayName mutation updates user profile data but the new/changed code adds no
audit log entry containing actor user ID, timestamp, action, and outcome.

Referred Code
export const setDisplayName = mutation({
  args: {
    displayName: v.string(),
  },
  handler: async (ctx, { displayName }) => {
    const userId = await requireAuth(ctx);

    // Validate display name length
    if (displayName.length < 3 || displayName.length > 20) {
      return { success: false, error: "Display name must be 3-20 characters" };
    }

    // Validate display name contains valid characters (Unicode letters, numbers + spaces)
    const validPattern = /^[\p{L}\p{N}\s]+$/u;
    if (!validPattern.test(displayName)) {
      return {
        success: false,
        error: "Display name can only contain letters, numbers, and spaces",
      };
    }



 ... (clipped 19 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing failure handling: New/modified async DB operations (e.g., ctx.db.get inside Promise.allSettled) do not add
any logging/telemetry or explicit handling for rejected results beyond setting null,
making production debugging and error reconstruction difficult.

Referred Code
const lineProgress = await ctx.db
  .query("lineProgress")
  .withIndex("by_user", (q: any) => q.eq("userId", userId))
  .filter((q: any) => q.eq(q.field("learned"), true))
  .collect();

const uniqueSongIds = [...new Set(lineProgress.map((l: any) => l.songId))];
const songSettled = await Promise.allSettled(
  uniqueSongIds.map((id: any) => ctx.db.get(id)),
);
const songMap = new Map(
  uniqueSongIds.map((id: any, i: number) => [
    id,
    songSettled[i].status === "fulfilled"
      ? (songSettled[i] as PromiseFulfilledResult<any>).value
      : null,
  ]),
);

const languageScores = new Map<
  string,


 ... (clipped 10 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable autofix in the Cursor dashboard.

// Validate display name contains valid characters (alphanumeric + spaces)
const validPattern = /^[a-zA-Z0-9\s]+$/;
// Validate display name contains valid characters (Unicode letters, numbers + spaces)
const validPattern = /^[\p{L}\p{N}\s]+$/u;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unicode names reject combining marks

Medium Severity

setDisplayName validates with ^[\p{L}\p{N}\s]+$, which excludes Unicode mark characters. Many valid names require combining marks, so users in several scripts can still be blocked despite the Unicode rollout. This makes the new i18n display-name support incomplete in convex/leaderboard.ts.

Fix in Cursor Fix in Web

@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Refactor inefficient leaderboard queries to scale

The current leaderboard implementation suffers from a severe N+1 query problem
by calculating scores for each user inside a loop, leading to poor scalability.
It is recommended to refactor this by denormalizing scores onto user records or
using database aggregations for more efficient retrieval.

Examples:

convex/leaderboard.ts [264-280]
    for (const user of users) {
      const { score, topLanguage } = await calculateProgressScore(
        ctx,
        user._id,
      );

      userScores.push({
        displayName: user.displayName!,
        progressScore: score,
        topLanguage,

 ... (clipped 7 lines)
convex/leaderboard.ts [176-241]
    for (const user of users) {
      // Use user's ID for practice data lookup
      const streak = await calculateCurrentStreak(ctx, user._id);

      // Detect user's primary language from their practice data
      const lineProgress = await ctx.db
        .query("lineProgress")
        .withIndex("by_user", (q) => q.eq("userId", user._id))
        .filter((q) => q.eq(q.field("learned"), true))
        .collect();

 ... (clipped 56 lines)

Solution Walkthrough:

Before:

async function getProgressLeaderboard(ctx, ...) {
  // 1. Get all users
  const users = await ctx.db.query("users").collect();
  const userScores = [];

  // 2. Loop N times for N users
  for (const user of users) {
    // 3. Perform multiple DB queries for EACH user inside this function
    const { score } = await calculateProgressScore(ctx, user._id);
    userScores.push({ displayName: user.displayName, progressScore: score });
  }

  // 4. Sort and return
  return userScores.sort(...).slice(...);
}

After:

// Suggestion: Denormalize scores onto the user table,
// and update them via a separate process (e.g., after a practice session).

async function getProgressLeaderboard(ctx, ...) {
  // 1. Fetch users already sorted by their pre-calculated score
  // This is a single, efficient, indexed query.
  const topUsers = await ctx.db
    .query("users")
    .withIndex("by_progressScore") // Assumes an index on a new 'progressScore' field
    .order("desc")
    .filter(q => q.neq(q.field("displayName"), undefined))
    .take(limit);

  return topUsers;
}
Suggestion importance[1-10]: 10

__

Why: The suggestion correctly identifies a critical N+1 query performance issue in the leaderboard logic, which will severely impact scalability and is a major design flaw in the modified file.

High
General
Trim input before validation

Introduce a trimmed version of the displayName at the beginning of the
validation block and use this trimmed variable for all subsequent length and
pattern checks.

convex/leaderboard.ts [368-385]

+const trimmedName = displayName.trim();
 // Validate display name length
-if (displayName.length < 3 || displayName.length > 20) {
+if (trimmedName.length < 3 || trimmedName.length > 20) {
   return { success: false, error: "Display name must be 3-20 characters" };
 }
 // Validate display name contains valid characters (Unicode letters, numbers + spaces)
 const validPattern = /^[\p{L}\p{N}\s]+$/u;
-if (!validPattern.test(displayName)) {
+if (!validPattern.test(trimmedName)) {
   return { success: false, error: "Display name can only contain letters, numbers, and spaces" };
 }
 // Reject names with only spaces
-if (displayName.trim().length === 0) {
+if (trimmedName.length === 0) {
   return { success: false, error: "Display name cannot be only spaces" };
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: This is a critical improvement for data sanitization, ensuring that all validations are performed on a trimmed displayName, which prevents storing leading/trailing whitespace and simplifies the validation logic.

Medium
Validate and trim input client-side

In the handleContinue function, add client-side validation to check the display
name's length and character set before making the API call.

src/routes/_authed/welcome.tsx [35-45]

 const handleContinue = async () => {
   setError(null);
+  const trimmed = displayName.trim();
+  const validPattern = /^[\p{L}\p{N}\s]+$/u;
+  if (trimmed.length < 3 || trimmed.length > 20 || !validPattern.test(trimmed)) {
+    setError("Display name must be 3-20 characters, letters, numbers, and spaces");
+    return;
+  }
   setLoading(true);
   try {
-    await setDisplayNameMutation.mutateAsync({ displayName });
+    await setDisplayNameMutation.mutateAsync({ displayName: trimmed });
     navigate({ to: "/dashboard" });
   } catch (e: any) {
     setError(e.message);
   } finally {
     setLoading(false);
   }
 };

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

__

Why: Adding client-side validation improves user experience by providing immediate feedback and reduces unnecessary network requests for invalid data, complementing the existing server-side validation.

Medium
Restrict whitespace in display names

In the display name validation regex, replace the overly permissive \s character
class with a literal space character to prevent unwanted whitespace like
newlines or tabs.

convex/leaderboard.ts [373-380]

 // Validate display name contains valid characters (Unicode letters, numbers + spaces)
-const validPattern = /^[\p{L}\p{N}\s]+$/u;
+const validPattern = /^[\p{L}\p{N} ]+$/u;
 if (!validPattern.test(displayName)) {
   return {
     success: false,
     error: "Display name can only contain letters, numbers, and spaces",
   };
 }
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that \s is too permissive for a display name and could lead to UI formatting issues. Replacing it with a literal space is a good practice for data hygiene.

Low
  • More

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@convex/leaderboard.ts`:
- Around line 373-379: The display name validation regex (validPattern) is too
permissive and misses Unicode combining marks; update the pattern used where
displayName is tested so it accepts Unicode letters, numbers and combining marks
but only literal space characters (not tabs/newlines/other whitespace).
Specifically, replace the character class [\p{L}\p{N}\s] with one that adds
\p{M} for combining marks and uses a literal space instead of \s, and keep the
start/end anchors and /u flag as used in validPattern to ensure line terminators
and control whitespace are rejected while preserving i18n support.

In `@src/routes/_authed/welcome.tsx`:
- Line 100: Add a new test file WelcomePage.test.tsx next to welcome.tsx that
exercises the WelcomePage component’s state, side effects, mutation calls and
form validation (including the UX text "3-20 characters, letters, numbers, and
spaces"); mount WelcomePage with React Testing Library, mock the
mutation/network layer used by the component (e.g., the create/update hook or
API client), simulate user input and submit flows to assert success and failure
branches, assert the validation message at the relevant input shows for invalid
values and disappears for valid values, and assert any side-effect behavior
(redirects, toasts, or mutation calls) by spying/mocking the functions
referenced in the component (e.g., the submit handler and mutation hook used
inside WelcomePage).

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a964b0 and a2daa87.

📒 Files selected for processing (2)
  • convex/leaderboard.ts
  • src/routes/_authed/welcome.tsx
📜 Review details
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TanStack Start framework with Bun runtime for the application

Tests must pass locally via bun run test before committing code, as Husky pre-commit hooks will block commits with failing tests

Files:

  • convex/leaderboard.ts
  • src/routes/_authed/welcome.tsx
convex/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

Use Convex for database queries, mutations, and authentication configuration

Files:

  • convex/leaderboard.ts
**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

All new helpers and utilities MUST have corresponding test files

Files:

  • convex/leaderboard.ts
convex/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

convex/**/*.ts: NEVER create .js files in the convex/ folder - only .ts files belong there
Before starting Convex dev server, always run rm -f convex/*.js to clean compiled JavaScript files
Use Convex schema.ts for database schema definitions and songs.ts for queries and mutations

Files:

  • convex/leaderboard.ts
**/*.tsx

📄 CodeRabbit inference engine (CLAUDE.md)

Components with logic MUST have corresponding test files following the naming convention ComponentName.test.tsx

Files:

  • src/routes/_authed/welcome.tsx
🧠 Learnings (5)
📚 Learning: 2026-01-23T18:12:49.193Z
Learnt from: CR
Repo: EtanHey/songscript PR: 0
File: prd-json/AGENTS.md:0-0
Timestamp: 2026-01-23T18:12:49.193Z
Learning: Applies to prd-json/**/convex/**/leaderboard*.{ts,tsx,js} : Users can play without displayName, but must set displayName before appearing on the leaderboard

Applied to files:

  • convex/leaderboard.ts
  • src/routes/_authed/welcome.tsx
📚 Learning: 2026-01-23T18:12:38.519Z
Learnt from: CR
Repo: EtanHey/songscript PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-23T18:12:38.519Z
Learning: Applies to convex/*.ts : Use Convex for database queries, mutations, and authentication configuration

Applied to files:

  • convex/leaderboard.ts
  • src/routes/_authed/welcome.tsx
📚 Learning: 2026-01-23T18:12:49.193Z
Learnt from: CR
Repo: EtanHey/songscript PR: 0
File: prd-json/AGENTS.md:0-0
Timestamp: 2026-01-23T18:12:49.193Z
Learning: Applies to prd-json/**/convex/**/*{score,progress,leaderboard}*.{ts,tsx,js} : Use Progress Score Formula: (words_learned × multiplier) + (lines_completed × multiplier × 0.5)

Applied to files:

  • convex/leaderboard.ts
📚 Learning: 2026-01-23T18:12:49.193Z
Learnt from: CR
Repo: EtanHey/songscript PR: 0
File: prd-json/AGENTS.md:0-0
Timestamp: 2026-01-23T18:12:49.193Z
Learning: Applies to prd-json/**/convex/**/*.{ts,tsx,js} : Apply language difficulty multipliers based on FSI rankings: Standard languages (Spanish, French, Italian, Dutch) use 1.0x, Hard languages (Hebrew, Persian, Russian, Hindi) use 1.5x, Super-Hard languages (Arabic, Japanese, Korean, Chinese) use 2.0x

Applied to files:

  • convex/leaderboard.ts
📚 Learning: 2026-01-28T12:17:53.877Z
Learnt from: CR
Repo: EtanHey/songscript PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-28T12:17:53.877Z
Learning: Applies to convex/**/*.ts : Use Convex schema.ts for database schema definitions and songs.ts for queries and mutations

Applied to files:

  • src/routes/_authed/welcome.tsx
🔇 Additional comments (2)
src/routes/_authed/welcome.tsx (1)

18-20: Formatting-only hook call changes look safe.

No behavioral delta here; this is a readability-only reformat.

Also applies to: 24-24

convex/leaderboard.ts (1)

22-25: Formatting-only updates are clean.

These edits improve readability without changing behavior.

Also applies to: 57-57, 72-73, 88-89, 93-97, 99-106, 125-127, 160-163, 175-175, 190-191, 195-197, 204-208, 248-250, 263-263, 265-268, 298-300, 342-345, 392-395

Comment on lines +373 to +379
// Validate display name contains valid characters (Unicode letters, numbers + spaces)
const validPattern = /^[\p{L}\p{N}\s]+$/u;
if (!validPattern.test(displayName)) {
return { success: false, error: "Display name can only contain letters, numbers, and spaces" };
return {
success: false,
error: "Display name can only contain letters, numbers, and spaces",
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

JavaScript Unicode regex \s whitespace characters tabs newlines

💡 Result:

In JavaScript (ECMAScript), regex \s matches all ECMAScript “WhiteSpace” code points plus all “LineTerminator” code points (so it includes tabs and newlines). [2][3]

Practically, it’s equivalent to this set (as listed by MDN):
[\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff] [1]

So yes, \s matches:

  • Tabs: \t (U+0009) and vertical tab \v (U+000B) [1][2]
  • Newlines / line breaks: \n (LF, U+000A), \r (CR, U+000D), plus Unicode line/paragraph separators U+2028, U+2029 [1][2]
  • Other whitespace: space U+0020, NBSP U+00A0, many Unicode “space separator” chars (e.g. U+2000–U+200A), etc. [1][2]

Sources:
[1] MDN – Character classes (\s equivalence list) (developer.mozilla.org)
[2] ECMAScript spec – WhiteSpace & LineTerminator code points (tc39.es)
[3] ECMAScript spec – CharacterClassEscape :: s definition (tc39.es)

Citations:


🌐 Web query:

Unicode combining marks \p{M} Hindi Arabic script names

💡 Result:

  • \p{M} in Unicode-aware regex engines means General_Category = Mark, i.e. the union Mn (Nonspacing_Mark) + Mc (Spacing_Mark) + Me (Enclosing_Mark). In other words, “combining marks” in the broad Unicode sense. [1]

Devanagari (Hindi) — examples of \p{M} character names

Common combining marks used in Hindi/Devanagari include (not exhaustive):

  • U+094D DEVANAGARI SIGN VIRAMA (Hindi name: halant) [2]
  • U+0901 DEVANAGARI SIGN CANDRABINDU [3]
  • U+0902 DEVANAGARI SIGN ANUSVARA [3]
  • U+093C DEVANAGARI SIGN NUKTA [3]
  • Many dependent vowel signs are also marks (Unicode includes both spacing and nonspacing dependent vowel signs in Devanagari). [4]

Arabic — examples of \p{M} character names

Arabic vowel/phonetic combining marks (“tashkil”) include:

  • U+064B ARABIC FATHATAN
  • U+064C ARABIC DAMMATAN
  • U+064D ARABIC KASRATAN
  • U+064E ARABIC FATHA
  • U+064F ARABIC DAMMA
  • U+0650 ARABIC KASRA
  • U+0651 ARABIC SHADDA
  • U+0652 ARABIC SUKUN [5]
    Unicode also encodes additional Arabic combining marks such as ARABIC HAMZA ABOVE (U+0654) and ARABIC HAMZA BELOW (U+0655) (among others). [6]

Practical regex filters (Unicode-property regex)

  • “Any combining mark”: \p{M} [1]
  • “Devanagari combining marks”: \p{Script=Devanagari}&&\p{M}
  • “Arabic combining marks”: \p{Script=Arabic}&&\p{M}

Sources: [1] UAX #44 General_Category (Mark = Mn/Mc/Me); [2] Unicode Devanagari names list (Virama/halant); [3] Unicode Standard Ch.12 (Devanagari combining marks: candrabindu, anusvara, nukta); [4] Unicode Standard Ch.12 (dependent vowel signs as marks); [5] Unicode Arabic names list (tashkil U+064B..U+0652); [6] UTR #53 list of Arabic modifier combining marks (includes hamza above/below, etc.).


🏁 Script executed:

# First, let's look at the context of this code in the leaderboard.ts file
cd convex && head -390 leaderboard.ts | tail -25

Repository: EtanHey/songscript

Length of output: 990


Display name validation regex needs Unicode combining marks and must reject line terminators.

The current pattern /^[\p{L}\p{N}\s]+$/u has two problems:

  1. \s is over-permissive: It allows tabs, newlines, and Unicode line/paragraph separators (U+2028, U+2029) — not just spaces — which could introduce formatting control characters into display names.

  2. Missing \p{M}: Omitting combining marks rejects valid names in scripts like Hindi (Devanagari) and Arabic, where combining marks (e.g., Devanagari VIRAMA, Arabic FATHA) are essential to proper character representation.

Update to:

-    const validPattern = /^[\p{L}\p{N}\s]+$/u;
+    // Letters + combining marks + numbers + space separators (no tabs/newlines)
+    const validPattern = /^[\p{L}\p{M}\p{N}\p{Zs}]+$/u;

This allows only regular space characters (not tabs/newlines) and includes combining marks needed for i18n support.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Validate display name contains valid characters (Unicode letters, numbers + spaces)
const validPattern = /^[\p{L}\p{N}\s]+$/u;
if (!validPattern.test(displayName)) {
return { success: false, error: "Display name can only contain letters, numbers, and spaces" };
return {
success: false,
error: "Display name can only contain letters, numbers, and spaces",
};
// Validate display name contains valid characters (Unicode letters, numbers + spaces)
// Letters + combining marks + numbers + space separators (no tabs/newlines)
const validPattern = /^[\p{L}\p{M}\p{N}\p{Zs}]+$/u;
if (!validPattern.test(displayName)) {
return {
success: false,
error: "Display name can only contain letters, numbers, and spaces",
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@convex/leaderboard.ts` around lines 373 - 379, The display name validation
regex (validPattern) is too permissive and misses Unicode combining marks;
update the pattern used where displayName is tested so it accepts Unicode
letters, numbers and combining marks but only literal space characters (not
tabs/newlines/other whitespace). Specifically, replace the character class
[\p{L}\p{N}\s] with one that adds \p{M} for combining marks and uses a literal
space instead of \s, and keep the start/end anchors and /u flag as used in
validPattern to ensure line terminators and control whitespace are rejected
while preserving i18n support.

/>
<p className="text-xs text-gray-500 mt-1">
3-20 characters, letters and spaces only
3-20 characters, letters, numbers, and spaces
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# First, let's find if WelcomePage.test.tsx exists anywhere
echo "=== Searching for WelcomePage.test.tsx ==="
fd 'WelcomePage\.test\.tsx$' 2>/dev/null || echo "No test file found"

echo ""
echo "=== Searching for test files in welcome route directory ==="
fd '\.test\.tsx$' src/routes/_authed/ 2>/dev/null || echo "No test files found in that directory"

echo ""
echo "=== Locating welcome.tsx component ==="
fd 'welcome\.tsx$' src/routes/_authed/ 2>/dev/null

echo ""
echo "=== Checking component structure at src/routes/_authed/welcome.tsx ==="
wc -l src/routes/_authed/welcome.tsx 2>/dev/null && echo "File exists, reading..."

Repository: EtanHey/songscript

Length of output: 367


🏁 Script executed:

# Read the welcome.tsx file to see the component definition and line 100
cat -n src/routes/_authed/welcome.tsx | head -120

Repository: EtanHey/songscript

Length of output: 4765


🏁 Script executed:

# Check for any test files in the project that might test this component
echo "=== Searching for test files that reference WelcomePage ==="
rg -l 'WelcomePage' --type=tsx --type=ts src/ 2>/dev/null | grep -i test

Repository: EtanHey/songscript

Length of output: 121


Create WelcomePage.test.tsx to cover this component's logic and validation flow.

The WelcomePage component contains state management, side effects, mutations, and form validation (including the UX text at line 100). A corresponding test file WelcomePage.test.tsx is required in the same directory per coding guidelines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/_authed/welcome.tsx` at line 100, Add a new test file
WelcomePage.test.tsx next to welcome.tsx that exercises the WelcomePage
component’s state, side effects, mutation calls and form validation (including
the UX text "3-20 characters, letters, numbers, and spaces"); mount WelcomePage
with React Testing Library, mock the mutation/network layer used by the
component (e.g., the create/update hook or API client), simulate user input and
submit flows to assert success and failure branches, assert the validation
message at the relevant input shows for invalid values and disappears for valid
values, and assert any side-effect behavior (redirects, toasts, or mutation
calls) by spying/mocking the functions referenced in the component (e.g., the
submit handler and mutation hook used inside WelcomePage).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant