Guessing the next occurrence in a date/time sequence

Analyzing a sequence of dates to determine the best guess for when the next event will occur is something best left to statistical modeling tools, but I’m going to try to show you a very basic way to do exactly that using PHP and an array of date/time stamps.


First off, the date/time stamp sample array:

$date_stamps = array(
    "5/10/2013 9:30",
    "5/11/2013 9:50",
    "5/12/2013 10:20",
    "5/13/2013 10:59",
    "5/14/2013 11:22",
    "5/15/2013 11:51",
    "5/16/2013 12:28",
    "5/17/2013 12:57",
    "5/18/2013 13:13");

Since the object of this exercise is to guess the next occurrence, we are going to need a function that analyzes the differences between consecutive times and then gives us an average of those differences.

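 function get_average($time_array)
 {
      $count = count($time_array);
      // With fewer than two stamps there is no gap to average.
      if($count < 2)
      {
           return 0;
      }
      $total = 0; // running sum of the gaps, in minutes
      $last = "";
      for($t=0;$t<$count;$t++)
      {
           $time = $time_array[$t];
           if($t == 0)
           {
                // First stamp: nothing to compare against yet.
                $last = $time;
           } else {
                $diff = get_diff($last, $time);
                $last = $time;
                $total += $diff;
           }
      }
      // $count stamps give $count-1 gaps.
      $average = ($total/($count-1));
      return (int) round($average);
 }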

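 // Returns the gap between two date/time strings in minutes.
 function get_diff($time1, $time2)
 {
      // strtotime() converts each string to a Unix timestamp in seconds.
      $t1 = strtotime($time1);
      $t2 = strtotime($time2);
      $diff = ($t2 - $t1)/60;
      return $diff;
 }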

In the “get_average” function above, the for loop runs through the array of date/time stamps and calls “get_diff”. On its first pass the loop only sets the variable $last to the first date/time stamp. This gives us a starting point; without it, “get_diff” would compare against an empty value and return an error or a number based on a date outside our sample range, which would make our guess extremely inaccurate instead of just somewhat inaccurate. The get_diff function compares the previous date/time sample with the current one and returns the difference in minutes, which is then added to $total. Once every item in the array has been processed, $total is divided by the count of the date/time stamp array minus one (1). We subtract one (1) because the first value in the array is not a difference itself; it is only the starting point for calculating the difference to the next value.
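Because each gap is measured from the previous stamp, the gaps add up to the span between the first and last stamps, so the average can be cross-checked from just those two values. Here is a minimal sketch of that check. The function name average_gap_minutes is hypothetical and not part of the code above, and the time zone is pinned to UTC only so the result does not depend on the server's settings.

```php
<?php
date_default_timezone_set('UTC'); // assumption: pin the zone for repeatable output

// Hypothetical helper: average gap in minutes, computed from the first and
// last stamps only. With n stamps there are n-1 gaps between them.
function average_gap_minutes(array $stamps): int
{
    $count = count($stamps);
    if ($count < 2) {
        throw new InvalidArgumentException('Need at least two date/time stamps.');
    }
    $first = strtotime($stamps[0]);
    $last = strtotime($stamps[$count - 1]);
    // Seconds -> minutes, then divide by the number of gaps.
    return (int) round((($last - $first) / 60) / ($count - 1));
}

$date_stamps = array(
    "5/10/2013 9:30",
    "5/11/2013 9:50",
    "5/12/2013 10:20",
    "5/13/2013 10:59",
    "5/14/2013 11:22",
    "5/15/2013 11:51",
    "5/16/2013 12:28",
    "5/17/2013 12:57",
    "5/18/2013 13:13");

print(average_gap_minutes($date_stamps)); // 1468
```

The loop version is still worth keeping if you ever want the individual gaps, but this makes a handy test for it.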

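 function guess_next($date_array)
 {
      $diff_avg = get_average($date_array);
      // end() returns the last element; the array is passed by value, so the caller's copy is untouched.
      $last_date = new DateTime(end($date_array));
      // 'PT<n>M' is an ISO 8601 duration of n minutes.
      $last_date->add(new DateInterval('PT'.$diff_avg.'M'));
      $next = $last_date->format('m/d/Y H:i');
      return $next;
 }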

We now have the average increase in time to base our prediction on, and we can use the function above to tie it all together. We take the last date in the date/time stamp array, add the average to it, format the result in a readable format, and we are done. For the sample data the average gap is 1,468 minutes (24 hours and 28 minutes), so the next occurrence will most likely occur around “05/19/2013 13:41”.
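As a quick sanity check on that result, the arithmetic can be reproduced by hand: the eight gaps in the sample total 11,743 minutes, which averages to 1,467.875 and rounds to 1,468. A small sketch of that check, with the time zone pinned to UTC as an assumption so the output is reproducible:

```php
<?php
date_default_timezone_set('UTC'); // assumption: pin the zone for repeatable output

// Gaps between consecutive sample stamps, in minutes, worked out by hand.
$gaps = array(1460, 1470, 1479, 1463, 1469, 1477, 1469, 1456);

// 11743 / 8 = 1467.875, which rounds to 1468.
$average = (int) round(array_sum($gaps) / count($gaps));

// Add the average gap to the last stamp in the sample.
$last_date = new DateTime('5/18/2013 13:13');
$last_date->add(new DateInterval('PT' . $average . 'M'));
$check = $last_date->format('m/d/Y H:i');

print($check); // 05/19/2013 13:41
```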

Obviously this isn’t going to be super accurate, but it will provide a “best guess” based on the data provided. If anyone has a better algorithm for guessing the next occurrence in a date/time array, I would love to see it.
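One way to get a feel for how rough the guess is, without reaching for real statistics, is to look at the shortest and longest gaps in the sample and report a window instead of a single time. This is only a sketch of that idea, not a better algorithm. The name guess_window is hypothetical, and UTC is assumed so the output is repeatable.

```php
<?php
date_default_timezone_set('UTC'); // assumption: pin the zone for repeatable output

// Hypothetical helper: earliest and latest guess, using the shortest and
// longest gap seen between consecutive stamps.
function guess_window(array $stamps): array
{
    $count = count($stamps);
    if ($count < 2) {
        throw new InvalidArgumentException('Need at least two date/time stamps.');
    }
    $gaps = array();
    for ($i = 1; $i < $count; $i++) {
        // Gap in seconds between this stamp and the previous one.
        $gaps[] = strtotime($stamps[$i]) - strtotime($stamps[$i - 1]);
    }
    $last = strtotime($stamps[$count - 1]);
    return array(
        date('m/d/Y H:i', $last + min($gaps)),
        date('m/d/Y H:i', $last + max($gaps))
    );
}

$date_stamps = array(
    "5/10/2013 9:30",
    "5/11/2013 9:50",
    "5/12/2013 10:20",
    "5/13/2013 10:59",
    "5/14/2013 11:22",
    "5/15/2013 11:51",
    "5/16/2013 12:28",
    "5/17/2013 12:57",
    "5/18/2013 13:13");

list($earliest, $latest) = guess_window($date_stamps);
print($earliest . " to " . $latest); // 05/19/2013 13:29 to 05/19/2013 13:52
```

For the sample data the window is about 23 minutes wide, which gives a rough sense of how far off the single guess could be.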

Thanks for reading!


Full code:

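 function get_average($time_array)
 {
      $count = count($time_array);
      // With fewer than two stamps there is no gap to average.
      if($count < 2)
      {
           return 0;
      }
      $total = 0; // running sum of the gaps, in minutes
      $last = "";
      for($t=0;$t<$count;$t++)
      {
           $time = $time_array[$t];
           if($t == 0)
           {
                // First stamp: nothing to compare against yet.
                $last = $time;
           } else {
                $diff = get_diff($last, $time);
                $last = $time;
                $total += $diff;
           }
      }
      // $count stamps give $count-1 gaps.
      $average = ($total/($count-1));
      return (int) round($average);
 }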

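 // Returns the gap between two date/time strings in minutes.
 function get_diff($time1, $time2)
 {
      // strtotime() converts each string to a Unix timestamp in seconds.
      $t1 = strtotime($time1);
      $t2 = strtotime($time2);
      $diff = ($t2 - $t1)/60;
      return $diff;
 }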

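 function guess_next($date_array)
 {
      $diff_avg = get_average($date_array);
      // end() returns the last element; the array is passed by value, so the caller's copy is untouched.
      $last_date = new DateTime(end($date_array));
      // 'PT<n>M' is an ISO 8601 duration of n minutes.
      $last_date->add(new DateInterval('PT'.$diff_avg.'M'));
      $next = $last_date->format('m/d/Y H:i');
      return $next;
 }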

 $date_stamps = array(
      "5/10/2013 9:30",
      "5/11/2013 9:50",
      "5/12/2013 10:20",
      "5/13/2013 10:59",
      "5/14/2013 11:22",
      "5/15/2013 11:51",
      "5/16/2013 12:28",
      "5/17/2013 12:57",
      "5/18/2013 13:13");

 print(guess_next($date_stamps));