
fix: filter low-reputation authors from evergreen sitemap#3676

Merged
nimrodkra merged 4 commits into main from fix/seo-p1-canonical-and-sitemap
Mar 8, 2026

@nimrodkra
Member

Summary

  • Add author reputation filter to buildEvergreenSitemapQuery to match the existing filter in buildPostsSitemapQuery
  • Prevents noindex posts from appearing in the evergreen sitemap

Context

Ahrefs audit found a noindex post (es6-handbook-the-complete-guide-dsyovcz-z) listed in the evergreen sitemap. The webapp sets noindex on posts by authors with reputation ≤ 10 (pages/posts/[id]/index.tsx:306). The posts.xml sitemap already filtered these via (u.id is null or u.reputation > 10), but evergreen.xml did not — allowing old, high-upvote posts by low-reputation authors to slip through.

Related apps PR: dailydotdev/apps#5677

Test plan

  • Added test: verifies low-reputation author's post is excluded from evergreen sitemap
  • Existing sitemap tests pass

🤖 Generated with Claude Code

…temap

Remove the isFallback check in canonicalFromRouter that suppressed
canonical URLs during ISR fallback rendering. Filter posts with
flags->'noindex' from sitemap generation to prevent noindex pages
from appearing in sitemaps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nimrodkra
Member Author

@claude please review this pr

@claude

claude bot commented Mar 8, 2026

Claude finished @nimrodkra's task in 58s


PR Review

  • Read changed files and gather context
  • Review implementation in src/routes/sitemaps.ts
  • Review tests in __tests__/sitemaps.ts
  • Post review feedback

Summary: This is a clean, targeted fix. The change is minimal and correctly mirrors the existing filter in buildPostsSitemapQuery.

Implementation (src/routes/sitemaps.ts)

The two added lines in buildEvergreenSitemapQuery are exactly right:

.leftJoin(User, 'u', 'p."authorId" = u.id')
// ...
.andWhere('(u.id is null or u.reputation > 10)')

This is a faithful copy of the existing pattern in buildPostsSitemapQuery (lines 124, 130). The LEFT JOIN ensures posts without an author are still included — only posts by explicitly low-reputation authors are filtered out. No concerns here.
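
To make the LEFT JOIN semantics concrete, here is a minimal in-memory sketch of the same predicate. It is an illustration only, not the project's TypeORM code: `isSitemapEligible`, the `Author` and `Post` shapes, and the sample ids are all hypothetical names introduced for this example.

```typescript
// Hypothetical in-memory model of the sitemap filter; not the project's query builder code.
interface Author {
  id: string;
  reputation: number;
}

interface Post {
  id: string;
  authorId: string | null;
}

// Matches the literal in `u.reputation > 10`.
const REPUTATION_THRESHOLD = 10;

// Mirrors `(u.id is null or u.reputation > 10)` evaluated after a LEFT JOIN on authorId.
function isSitemapEligible(post: Post, authors: Map<string, Author>): boolean {
  const author = post.authorId === null ? undefined : authors.get(post.authorId);
  // No matching user row: the LEFT JOIN yields NULL for u.id, so the post is kept.
  if (author === undefined) {
    return true;
  }
  // A matching user row exists: keep the post only above the threshold.
  return author.reputation > REPUTATION_THRESHOLD;
}

const authors = new Map<string, Author>([
  ['low-rep-user', { id: 'low-rep-user', reputation: 5 }],
  ['edge-user', { id: 'edge-user', reputation: 10 }],
  ['high-rep-user', { id: 'high-rep-user', reputation: 100 }],
]);

const posts: Post[] = [
  { id: 'evergreen-lowrep', authorId: 'low-rep-user' },
  { id: 'evergreen-edge', authorId: 'edge-user' },
  { id: 'evergreen-norep', authorId: null },
  { id: 'evergreen-highrep', authorId: 'high-rep-user' },
];

const eligibleIds = posts
  .filter((post) => isSitemapEligible(post, authors))
  .map((post) => post.id);

console.log(eligibleIds); // [ 'evergreen-norep', 'evergreen-highrep' ]
```

Note that an author with reputation exactly 10 is excluded, which lines up with the webapp setting noindex for reputation ≤ 10.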

Tests (__tests__/sitemaps.ts)

The test covers the key scenario well:

  • Creates a user with reputation: 5 (below the threshold of 10)
  • Inserts two posts that would both otherwise qualify for the evergreen sitemap (91 days old, 100 upvotes)
  • Asserts the low-rep post is excluded and the no-author post is included

Minor nit: The time calculation 91 * 24 * 60 * 60 * 1000 uses magic numbers. Per the project's conventions (CLAUDE.md), time constants from src/common/constants.ts should be used (e.g., 91 * ONE_DAY_IN_SECONDS * 1000). This is low-priority test code, but worth keeping consistent.
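
A minimal sketch of the suggested replacement, for illustration. The constant is redefined locally so the snippet is self-contained; its value here is an assumption based on the name (the real one lives in src/common/constants.ts), and `daysAgo` is a hypothetical helper, not something from the test file.

```typescript
// Assumed value, inferred from the name; the real constant is in src/common/constants.ts.
const ONE_DAY_IN_SECONDS = 24 * 60 * 60;

// Hypothetical helper: returns the Date that is `days` days before `now`.
function daysAgo(days: number, now: Date = new Date()): Date {
  return new Date(now.getTime() - days * ONE_DAY_IN_SECONDS * 1000);
}

// Fixed reference time so the result is reproducible.
const referenceTime = new Date(Date.UTC(2026, 2, 8)); // 2026-03-08T00:00:00Z
const createdAt = daysAgo(91, referenceTime);

// Same duration as the magic-number form 91 * 24 * 60 * 60 * 1000.
const elapsedMs = referenceTime.getTime() - createdAt.getTime();
console.log(elapsedMs === 91 * 24 * 60 * 60 * 1000); // true
```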

Overall

The fix is correct, well-scoped, and the test validates the intended behavior. No blocking issues.

The slug is generated from title + id (not shortId). Fix test
assertions to match the actual generated slug format. Also replace
magic time numbers with ONE_DAY_IN_SECONDS constant.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
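
For readers following the slug fix in the commit above, here is a rough sketch of how a title + id slug could produce the strings the test asserts on. This is an assumption, not the project's actual slug generator: `toSlug` is a hypothetical function, and the post titles are guessed from the asserted slugs.

```typescript
// Hypothetical slug builder: kebab-cased title followed by the post id.
// The real implementation may normalize titles differently.
function toSlug(title: string, id: string): string {
  const titlePart = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, ''); // trim leading and trailing hyphens
  return `${titlePart}-${id}`;
}

// Titles below are assumptions chosen to match the asserted slugs.
const lowRepSlug = toSlug('Evergreen low rep', 'evergreen-lowrep');
const noAuthorSlug = toSlug('Evergreen no author', 'evergreen-norep');

console.log(lowRepSlug); // evergreen-low-rep-evergreen-lowrep
console.log(noAuthorSlug); // evergreen-no-author-evergreen-norep
```

The point is only that the id, not the shortId, ends up as the trailing segment, which is why the original assertions did not match.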
@pulumi

pulumi bot commented Mar 8, 2026

🍹 The Update (preview) for dailydotdev/api/prod (at 822c4b0) was successful.

✨ Neo Explanation

This is a standard application release that rolls out a new container image to all 7 production deployments and 33 cron jobs, while also running fresh database and ClickHouse schema migration jobs for the new version.

Root Cause Analysis

A new version of the API application has been built and is being deployed to production. Every workload in the cluster is being updated to point to the new container image, and the versioned migration Jobs from the previous release are being replaced with new ones for this release.

Dependency Chain

The new application version cascades uniformly across the entire fleet:

  • 7 Deployments (main API, background workers, WebSocket server, private API, Temporal worker, personalized digest, worker jobs) — all rolling out the new container image via Kubernetes rolling update, so no downtime is expected.
  • 33 CronJobs — their job templates are updated to reference the new image, so the next scheduled run of each cron will use the new version.
  • 2 one-time migration Jobs — the previous release's database migration job (api-db-migration) and ClickHouse migration job (api-clickhouse-migration) are deleted, and new equivalents for this release are created. These run typeorm migration:run and runClickhouseMigrations.js against the production databases before (or alongside) the new pods coming up.

Risk analysis

No stateful resources (databases, storage buckets, persistent volumes) are being replaced or deleted. The migration Jobs do touch the production database and ClickHouse schemas, but they are additive-migration patterns and not destructive replacements. Deployments use rolling updates, so no downtime is expected.

Resource Changes

    Name                                                       Type                           Operation
~   vpc-native-update-trending-cron                            kubernetes:batch/v1:CronJob    update
~   vpc-native-user-profile-updated-sync-cron                  kubernetes:batch/v1:CronJob    update
~   vpc-native-validate-active-users-cron                      kubernetes:batch/v1:CronJob    update
~   vpc-native-update-views-cron                               kubernetes:batch/v1:CronJob    update
+   vpc-native-api-clickhouse-migration-e95b60ea               kubernetes:batch/v1:Job        create
~   vpc-native-update-source-tag-view-cron                     kubernetes:batch/v1:CronJob    update
~   vpc-native-post-analytics-history-day-clickhouse-cron      kubernetes:batch/v1:CronJob    update
~   vpc-native-private-deployment                              kubernetes:apps/v1:Deployment  update
~   vpc-native-squad-posts-analytics-refresh-cron              kubernetes:batch/v1:CronJob    update
~   vpc-native-user-profile-analytics-clickhouse-cron          kubernetes:batch/v1:CronJob    update
~   vpc-native-user-profile-analytics-history-clickhouse-cron  kubernetes:batch/v1:CronJob    update
~   vpc-native-update-source-public-threshold-cron             kubernetes:batch/v1:CronJob    update
~   vpc-native-ws-deployment                                   kubernetes:apps/v1:Deployment  update
~   vpc-native-sync-subscription-with-cio-cron                 kubernetes:batch/v1:CronJob    update
~   vpc-native-update-highlighted-views-cron                   kubernetes:batch/v1:CronJob    update
-   vpc-native-api-clickhouse-migration-fb269ce0               kubernetes:batch/v1:Job        delete
~   vpc-native-clean-gifted-plus-cron                          kubernetes:batch/v1:CronJob    update
~   vpc-native-worker-job-deployment                           kubernetes:apps/v1:Deployment  update
~   vpc-native-daily-digest-cron                               kubernetes:batch/v1:CronJob    update
~   vpc-native-hourly-notification-cron                        kubernetes:batch/v1:CronJob    update
+   vpc-native-api-db-migration-e95b60ea                       kubernetes:batch/v1:Job        create
~   vpc-native-update-tag-recommendations-cron                 kubernetes:batch/v1:CronJob    update
~   vpc-native-generic-referral-reminder-cron                  kubernetes:batch/v1:CronJob    update
~   vpc-native-post-analytics-clickhouse-cron                  kubernetes:batch/v1:CronJob    update
~   vpc-native-check-analytics-report-cron                     kubernetes:batch/v1:CronJob    update
~   vpc-native-update-tags-str-cron                            kubernetes:batch/v1:CronJob    update
~   vpc-native-generate-search-invites-cron                    kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-zombie-images-cron                        kubernetes:batch/v1:CronJob    update
~   vpc-native-temporal-deployment                             kubernetes:apps/v1:Deployment  update
~   vpc-native-expire-super-agent-trial-cron                   kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-stale-user-transactions-cron              kubernetes:batch/v1:CronJob    update
~   vpc-native-user-posts-analytics-refresh-cron               kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-zombie-users-cron                         kubernetes:batch/v1:CronJob    update
~   vpc-native-personalized-digest-cron                        kubernetes:batch/v1:CronJob    update
~   vpc-native-bg-deployment                                   kubernetes:apps/v1:Deployment  update
-   vpc-native-api-db-migration-fb269ce0                       kubernetes:batch/v1:Job        delete
~   vpc-native-agents-digest-cron                              kubernetes:batch/v1:CronJob    update
~   vpc-native-calculate-top-readers-cron                      kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-zombie-opportunities-cron                 kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-zombie-user-companies-cron                kubernetes:batch/v1:CronJob    update
~   vpc-native-personalized-digest-deployment                  kubernetes:apps/v1:Deployment  update
... and 5 other changes

@nimrodkra
Member Author

@claude let's run a 2nd round of review for this pr

@claude

claude bot commented Mar 8, 2026

Claude finished @nimrodkra's task in 1m 20s


PR Review (Round 2)

  • Read changed files
  • Review implementation
  • Review tests
  • Post findings

Summary: The round 1 nit (magic numbers) has been resolved — the test already uses ONE_DAY_IN_SECONDS. Commit a3b6429 also corrected the slug assertions in the test. The PR is clean and ready to merge.

Implementation (src/routes/sitemaps.ts)

No new concerns. The two lines added to buildEvergreenSitemapQuery (lines 142, 149) are a faithful, minimal mirror of the existing pattern in buildPostsSitemapQuery (lines 124, 130). User was already imported — no import changes needed.

Tests (__tests__/sitemaps.ts)

All round 1 feedback has been addressed:

  • ✅ Magic number replaced with ONE_DAY_IN_SECONDS * 1000
  • ✅ Slug assertions corrected (evergreen-no-author-evergreen-norep, evergreen-low-rep-evergreen-lowrep)

The test is well-scoped: it validates the two meaningful cases — a post by a low-rep author is excluded, and a post with no author is included. No further test coverage is needed for this change.

One minor observation (non-blocking): the test verifies reputation: 5 is excluded but doesn't verify a high-reputation author is still included. This would make the boundary logic explicit:

// Optional: would fully document the threshold behavior
{ id: 'evergreen-highrep', ..., authorId: 'high-rep-user', reputation: 100 }
expect(res.text).toContain('/posts/evergreen-high-rep-evergreen-highrep');

This is entirely optional — the existing test is sufficient to prove the filter works.

Overall

The fix is correct, minimal, and consistent with the existing posts.xml filter. No issues. LGTM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
nimrodkra enabled auto-merge (squash) March 8, 2026 11:06
nimrodkra merged commit 6d24d70 into main Mar 8, 2026
9 checks passed
nimrodkra deleted the fix/seo-p1-canonical-and-sitemap branch March 8, 2026 11:16