Skip to content

refactor: use in operator instead of regex operator in crontab query to fix mssql regression #900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 18, 2025

Conversation

alirafiei75
Copy link
Contributor

@alirafiei75 alirafiei75 commented May 16, 2025

Description

This pull request refactors _get_crontab_exclude_query method of DatabaseScheduler class to use a list of valid numeric hours for crontab query instead of regex pattern

Related Issues

Fixes: #897

Copy link

codecov bot commented May 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.67%. Comparing base (673dbc5) to head (e05c85a).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #900      +/-   ##
==========================================
- Coverage   88.24%   87.67%   -0.57%     
==========================================
  Files          32       32              
  Lines        1012     1014       +2     
  Branches      105       82      -23     
==========================================
- Hits          893      889       -4     
- Misses        101      107       +6     
  Partials       18       18              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cclauss cclauss requested a review from Copilot May 17, 2025 05:00
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the _get_crontab_exclude_query method in DatabaseScheduler to replace a regex-based hour filter with an explicit __in lookup over a generated list of valid hour strings.

  • Removed the numeric_hour_pattern regex filter.
  • Introduced a valid_hours list combining non-padded and zero-padded representations for hours 0–23.
  • Updated the queryset to use hour__in=valid_hours instead of hour__regex.
Comments suppressed due to low confidence (1)

django_celery_beat/schedulers.py:323

  • Add unit tests to verify that the filter correctly matches both padded and non-padded hour values in the database, ensuring no valid entries are missed.
hour__in=valid_hours

@auvipy auvipy self-requested a review May 17, 2025 08:21
Copy link
Member

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we verify / fix this with modified tests please? the copilot suggestion seems to break the tests

@auvipy
Copy link
Member

auvipy commented May 17, 2025

=================================== FAILURES ===================================
___________ test_DatabaseScheduler.test_crontab_timezone_conversion ____________

self = <t.unit.test_schedulers.test_DatabaseScheduler object at 0x7f2fbbbd74d0>
mock_get_tz =
mock_aware_now =

  @pytest.mark.django_db
  @patch('django_celery_beat.schedulers.aware_now')
  @patch('django.utils.timezone.get_current_timezone')
  def test_crontab_timezone_conversion(self, mock_get_tz, mock_aware_now):
      # Set up mocks for server timezone and current time
      from datetime import datetime
      server_tz = ZoneInfo("Asia/Tokyo")
  
      mock_get_tz.return_value = server_tz
  
      # Server time is 17:00 Tokyo time
      mock_now_dt = datetime(2023, 1, 1, 17, 0, 0, tzinfo=server_tz)
      mock_aware_now.return_value = mock_now_dt
  
      # Create tasks with the following crontab schedules:
      # 1. UTC task at hour 8 - equivalent to 17:00 Tokyo time (current hour)
      #    - should be included
      # 2. New York task at hour 3 - equivalent to 17:00 Tokyo time
      #    (current hour) - should be included
      # 3. UTC task at hour 3 - equivalent to 12:00 Tokyo time
      #    - should be excluded (outside window)
  
      # Create crontab schedules in different timezones
      utc_current_hour = CrontabSchedule.objects.create(
          hour='8', timezone='UTC'
      )
      ny_current_hour = CrontabSchedule.objects.create(
          hour='3', timezone='America/New_York'
      )
      utc_outside_window = CrontabSchedule.objects.create(
          hour='3', timezone='UTC'
      )
  
      # Create periodic tasks using these schedules
      task_utc_current = self.create_model(
          name='utc-current-hour',
          crontab=utc_current_hour
      )
      task_utc_current.save()
  
      task_ny_current = self.create_model(
          name='ny-current-hour',
          crontab=ny_current_hour
      )
      task_ny_current.save()
  
      task_utc_outside = self.create_model(
          name='utc-outside-window',
          crontab=utc_outside_window
      )
      task_utc_outside.save()
  
      # Run the scheduler's exclusion logic
      exclude_query = self.s._get_crontab_exclude_query()
  
      # Get excluded task IDs
      excluded_tasks = set(
          PeriodicTask.objects.filter(exclude_query).values_list(
              'id', flat=True
          )
      )
  
      # Current hour tasks in different timezones should not be excluded
      assert task_utc_current.id not in excluded_tasks, (
          "UTC current hour task should be included"
      )
      assert task_ny_current.id not in excluded_tasks, (
          "New York current hour task should be included"
      )
  
      # Task outside window should be excluded
  assert task_utc_outside.id in excluded_tasks, (
          "UTC outside window task should be excluded"
      )

E AssertionError: UTC outside window task should be excluded
E assert 15 in {11}
E + where 15 = <PeriodicTask: utc-outside-window: * 3 * * * (m/h/dM/MY/d) UTC>.id

t/unit/test_schedulers.py:1133: AssertionError

@alirafiei75
Copy link
Contributor Author

yes I will. the copilot suggestion was wrong

@alirafiei75
Copy link
Contributor Author

As we need both padded and non-padded hours ("1" and "01"), we cannot do it in just one list comprehension, or if we do it would not be readable i think

@auvipy
Copy link
Member

auvipy commented May 17, 2025

can you apply the improvement and also add unit tests to verify this please?

@auvipy
Copy link
Member

auvipy commented May 17, 2025

Yup, that looks better

@auvipy auvipy changed the title refactor: use in operator instead of regex operator in crontab query refactor: use in operator instead of regex operator in crontab query to fix mssql regression May 18, 2025
@auvipy auvipy merged commit 26100b8 into celery:main May 18, 2025
27 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

REGEXP_LIKE incompatible with SQL Server
2 participants