fix: Fixed an issue which causes random scheduling of playbooks#1969
fix: Fixed an issue which causes random scheduling of playbooks#1969igorhrcek wants to merge 1 commit intorobusta-dev:masterfrom
Conversation
WalkthroughModifies Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/robusta/core/schedule/scheduler.py (1)
157-159: Dead code: CronScheduleRepeat branch is now unreachable.Since all
CronScheduleRepeatjobs now return early (line 143 for NEW, line 145 otherwise), this branch can never execute. Remove it to avoid confusion.if isinstance(job.scheduling_params, DynamicDelayRepeat): next_delay_idx = min(job.state.exec_count, len(job.scheduling_params.delay_periods) - 1) next_delay = job.scheduling_params.delay_periods[next_delay_idx] - elif isinstance(job.scheduling_params, CronScheduleRepeat): - now = time.time() - next_delay = croniter(job.scheduling_params.cron_expression, now).get_next() - now else: # FIXED_DELAY_REPEAT type next_delay = job.scheduling_params.seconds_delay
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/robusta/core/schedule/scheduler.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/robusta/core/schedule/scheduler.py (1)
src/robusta/core/schedule/model.py (3)
CronScheduleRepeat(25-26)JobStatus(7-10)DynamicDelayRepeat(21-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: run_tests
🔇 Additional comments (2)
src/robusta/core/schedule/scheduler.py (2)
135-145: LGTM! Clean fix for the stale timestamp issue.The cron-specific path correctly computes delay from current time using
croniter, avoiding the stalelast_exec_time_secproblem. Themax()guard for NEW jobs preserves deployment race condition protection while allowing running jobs to execute at their correct scheduled times.
147-152: LGTM!The refactored NEW job handling for non-cron types correctly preserves the original behavior.
Problem
For cron-scheduled jobs, the scheduler used
last_exec_time_secto compute the next run. This could be stale (e.g., after restarts, long delays, or missed executions), leading to:This PR solves the following issues:
This patch has been part of my production for 2 weeks now and I haven't observed any issues.
Solution
The fix changes cron delay calculation to:
croniter(job.scheduling_params.cron_expression, now).get_next()- now computes the delay from now to the next occurrence.last_exec_time_secfor cron schedules.JobStatus.NEW), still enforceINITIAL_SCHEDULE_DELAY_SEC(120 seconds) to avoid race conditions during deployments when old and new runners might both see the job.Impact