As others have already mentioned, cron (crond) or a systemd timer is a common and simple way to run PHP code regularly. Always be mindful of the user permissions under which the code is executed: make sure the code does NOT run as "root", because anyone who gains write access to your PHP code could easily take over the entire server!
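For example (a sketch — the script path and the `www-data` user are assumptions), you would put the entry into the crontab of an unprivileged user rather than root's:

```crontab
# Edit the unprivileged user's crontab, not root's:
#   crontab -u www-data -e

# m h dom mon dow  command
0 3 1 * *  /usr/bin/php /var/www/app/cron/monthly_reputation.php
```

This runs the job at 03:00 on the first of each month, as `www-data` instead of root.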
If you're not using a framework that implements an event queue, it’s also a good idea to design cron jobs so that they can tolerate not being executed exactly at the scheduled time, or failing partway through. If a job runs incompletely or not at all, a later run should be able to catch up on the missed work.
Let me give an example: Suppose we have a forum where users receive “reputation points” at the end of the month if they posted at least once during the previous month.
A simple approach would be to run the job exactly once at the end of the month, selecting via an SQL query all users who posted at least once in the last month, and adding 10 reputation points to each of them.
However, if for some reason the job fails after processing the first 5 out of 20 users, only the first 5 would receive their points, leaving the remaining 15 without any points. If the job were to be restarted, those first 5 users would receive the points again, as there’s no way to distinguish who has already received their points for that month and who hasn’t.
A better approach would be to extend the data structure by adding a field in the database that tracks the last time points were successfully awarded to each user. The SQL query could then be modified to select users who posted in the previous month AND for whom the new field “date_last_monthly_reputation_granted” is either NULL or contains a date older than the current month.
When awarding the 10 points, the date in the new field is updated at the same time. This way, if a failure occurs and the job is restarted, it can continue processing from where it left off.
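In SQL terms that could look roughly like this (a sketch, MySQL-flavored; the table and column names `users`, `posts`, `reputation`, and `created_at` are assumptions for this example):

```sql
-- One-time schema change: track when points were last granted
ALTER TABLE users
  ADD COLUMN date_last_monthly_reputation_granted DATE NULL;

-- Award the points and record the run in a single statement, so a user
-- can never be credited twice for the same month:
UPDATE users u
SET u.reputation = u.reputation + 10,
    u.date_last_monthly_reputation_granted = CURRENT_DATE
WHERE (u.date_last_monthly_reputation_granted IS NULL
       OR u.date_last_monthly_reputation_granted
          < DATE_FORMAT(CURRENT_DATE, '%Y-%m-01'))
  AND EXISTS (
      SELECT 1 FROM posts p
      WHERE p.user_id = u.id
        AND p.created_at >= DATE_FORMAT(CURRENT_DATE - INTERVAL 1 MONTH, '%Y-%m-01')
        AND p.created_at <  DATE_FORMAT(CURRENT_DATE, '%Y-%m-01')
  );
```

Because awarding the points and stamping the date happen in one statement, a restarted job simply no longer matches the users that were already processed.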
If you're dealing with large datasets, it might also be a good idea not to select and process all users who meet the criteria at once. Instead, you could select the next 100 users, for example, and run the job every few minutes. This ensures that users are processed gradually without PHP trying to load all users into memory at once.
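The batched variant is a simple LIMIT on the selection (again a sketch with the same assumed table and column names as in the example above):

```sql
-- Fetch the next batch of at most 100 users who posted last month and
-- have not yet received this month's points:
SELECT u.id
FROM users u
WHERE (u.date_last_monthly_reputation_granted IS NULL
       OR u.date_last_monthly_reputation_granted
          < DATE_FORMAT(CURRENT_DATE, '%Y-%m-01'))
  AND EXISTS (
      SELECT 1 FROM posts p
      WHERE p.user_id = u.id
        AND p.created_at >= DATE_FORMAT(CURRENT_DATE - INTERVAL 1 MONTH, '%Y-%m-01')
        AND p.created_at <  DATE_FORMAT(CURRENT_DATE, '%Y-%m-01')
  )
ORDER BY u.id
LIMIT 100;
```

Each run processes its 100 users and stamps their date field, so the next run's query automatically returns the following batch until no eligible users remain.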
Additionally, it's important to handle concurrency to prevent multiple instances of the same job from running simultaneously, which could lead to data corruption or inconsistent results. One simple solution is a file-based locking mechanism like flock, which ensures that only one instance of the job is running at any given time. By locking a specific file at the start of the job and releasing it upon completion, you can prevent overlapping executions. If another instance of the job tries to run while the file is locked, it will either wait or terminate, depending on how you've configured it. This ensures that the job processes data safely and sequentially, even in environments where the cron scheduler might trigger multiple instances.
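On Linux you get this without writing any locking code yourself via the flock(1) wrapper from util-linux (a sketch — the lock file path and job name are assumptions; the snippet below uses `echo` and `sleep` stand-ins so it can be run as-is):

```shell
#!/bin/sh
LOCKFILE=/tmp/monthly_reputation.lock

# First instance: -n means "don't wait". The lock is free, so the command
# runs (in real use this would be: /usr/bin/php /path/to/job.php).
flock -n "$LOCKFILE" echo "job started"

# Simulate a long-running job holding the lock in the background ...
(
    flock -n 9 || exit 1
    sleep 2
) 9>"$LOCKFILE" &
sleep 1

# ... a second instance now fails fast instead of running concurrently:
flock -n "$LOCKFILE" echo "second instance ran" || echo "lock busy, exiting"
wait
```

In the crontab you would then schedule something like `*/5 * * * * flock -n /tmp/monthly_reputation.lock /usr/bin/php /var/www/app/cron/monthly_reputation.php`; replacing `-n` with `-w 60` makes the second instance wait up to 60 seconds instead of exiting.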
However, this introduces another challenge: monitoring for stale jobs or lockfiles. If a job hangs or crashes without releasing the lock, the system could be blocked indefinitely. Proper monitoring is essential to catch such scenarios and clean up stale lockfiles. In general, monitoring cron jobs is a story in itself. It’s a good idea to redirect STDOUT and STDERR to a log file for easy tracking of job execution. Additionally, setting up another cron job or using logrotate to regularly truncate and manage that log file helps prevent it from growing indefinitely and consuming excessive disk space.
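In the crontab the redirection could look like `*/5 * * * * /usr/local/bin/reputation-job.sh >> /var/log/reputation-job.log 2>&1` (paths assumed), and a small logrotate snippet dropped into /etc/logrotate.d/ keeps the file from growing forever — a sketch:

```
# /etc/logrotate.d/reputation-job  (hypothetical name; adjust paths)
/var/log/reputation-job.log {
    # rotate weekly and keep four compressed old logs
    weekly
    rotate 4
    compress
    # don't complain about a missing file, skip rotation if it's empty
    missingok
    notifempty
    # truncate in place so a still-running job keeps its file handle
    copytruncate
}
```

`copytruncate` matters here: without it, a job that keeps the log open would continue writing into the rotated (renamed) file.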