A return to time zone troubles

A few months ago I wrote (see "Time zone troubles") about the problems my hitchhike statistics programs had with rides that span multiple time zones. I ended that entry with a paragraph starting with "I'll probably figure something out." and a "To be continued, someday…"

And that someday suddenly came to me today, 2 February 2020 (a palindromic date, 20200202), just after midnight. I had added some tracing code to the PL/I version of the program running on z/OS, and while I was looking at the output of that code, the solution to the problem hit me like a ton of bricks. I really don't know why it didn't occur to me earlier. Then again, it took me ages to get to a "Distance in a 24-hour period" table that actually was a "Distance in a calendar day" table, and looking back on solving that one also makes me remember the ton of bricks that hit me on that occasion.

OK, so you have a ride, like ride 3 of trip 223, that spans multiple days and crosses several borders, and one of those border crossings, from Lithuania to Poland, involves moving into a different time zone. The input for lift for this ride is encoded, omitting many irrelevant other fields, as

Day Distance Time Type Cnty Split dTime aTime Date
1 # 2077.0 18.57 - * 10.02
2 1 1446.0 13.00 - # 2019-06-30
3 2 631.0 5.57 - # 6.22 2019-07-01
4 1a 95.5 0.59 - LT *
5 1b 856.0 7.42 - PL *
6 1c 932.5 8.25 - D *
7 2d 84.1 0.52 - B *
8 2e 108.9 0.59 - NL *
9 ! 1 !B LT ! 11.01 10-01
10 ! 5 !B PL ! 18.35
11 !12 !B D ! 4.31
12 !13 !B B ! 5.23

A full explanation for every column can be found by clicking on the headers, but here are some short explanatory notes per row:

Row 1
Row 2
Row 3
Rows 4, 5, 6, 7, and 8
Rows 9, 10, 11, and 12

With the above in mind, lift creates two "distance/time" lines, one for dates, which would look like

+---- Day 1 ----+---- Day 2 ----+
|               |               |
+-- D: 0.0      +-- D: 1446.0   +-- D: 631.0
|               |               |
+-- T: 10:02    +-- T:  0:00    +-- T:  6:22

and one for countries, which would look like

+----- LT -----+----- PL -----+----- D -----+----- B -----+----- NL -----+
|              |              |             |             |              |
+-- D: 0.0     +-- D: 95.5    +-- D: 856.0  +-- D: 932.5  +-- D: 84.1    +-- D: 108.9
|              |              |             |             |              |
+-- T: 10:02   +-- T: 10:01   +-- T: 18:35  +-- T:  4:31  +-- T:  5:23   +-- T:  6:22

and by judiciously shifting these lines alongside each other, we create a unified line that looks like

+- Day 1 - LT -+- Day 1 - PL -+- Day 1 - D -+- Day 2 - D -+- Day 2 - B -+- Day 2 - NL -+
|              |              |             |             |             |              |
+-- D: 0.0     +-- D: 95.5    +-- D: 856.0  +-- D: 494.5  +-- D: 438.0  +-- D: 84.1    +-- D: 108.9
|              |              |             |             |             |              |
+-- T: 10:02   +-- T: 10:01   +-- T: 18:35  +-- T:  0:00  +-- T:  4:31  +-- T:  5:23   +-- T:  6:22
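One way to read "shifting these lines alongside each other" is as a merge of the two lines on cumulative distance. The Python below is a minimal sketch of that reading, not the actual lift code: the function names and the tuple layout are made up for illustration, and distances are held in tenths of a kilometre so the cut points compare exactly. It reproduces only the distances of the unified line; the times at the cut points come straight from the input.

```python
from itertools import accumulate

# Ride 3 of trip 223, distances in tenths of a km (hypothetical layout).
RIDE_3_DAYS = [("Day 1", 14460), ("Day 2", 6310)]
RIDE_3_COUNTRIES = [("LT", 955), ("PL", 8560), ("D", 9325), ("B", 841), ("NL", 1089)]


def cut_points(segments):
    """Cumulative distance at the end of each (label, length) segment."""
    return list(accumulate(length for _, length in segments))


def merge_lines(day_segments, country_segments):
    """Merge two segmentations of the same ride on cumulative distance.

    Returns a list of (day_label, country_label, length) tuples.
    """
    day_ends = cut_points(day_segments)
    country_ends = cut_points(country_segments)
    if day_ends[-1] != country_ends[-1]:
        raise ValueError("both lines must cover the same total distance")

    merged, start, d, c = [], 0, 0, 0
    while d < len(day_ends) and c < len(country_ends):
        end = min(day_ends[d], country_ends[c])
        merged.append((day_segments[d][0], country_segments[c][0], end - start))
        start = end
        # Advance whichever line ends at this cut point (possibly both).
        day_done = day_ends[d] == end
        country_done = country_ends[c] == end
        if day_done:
            d += 1
        if country_done:
            c += 1
    return merged


if __name__ == "__main__":
    for day, country, tenths in merge_lines(RIDE_3_DAYS, RIDE_3_COUNTRIES):
        print(f"{day} - {country}: {tenths / 10:.1f} km")
```

Germany is the only country cut by the day boundary, so its 932.5 km falls apart into 494.5 km on day 1 and 438.0 km on day 2, as in the unified line above.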

Merging all these "per-ride" "time/distance" lines into a long "per trip" line gives us all we need to scan for 24-hour periods, just jumping from timestamp to timestamp, making sure that the difference between two timestamps never exceeds 1,440 minutes. (Note that the actual timestamps also contain the date converted to a JDN, which is then multiplied by 1,440.)
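The timestamp and the scan can be sketched in a few lines of Python. This is a simplified illustration, not the lift code: `timestamp` and `best_24h_distance` are hypothetical names, the scan only compares recorded points (whether lift interpolates inside a segment is not stated here), and the JDN is derived from Python's proleptic Gregorian ordinal.

```python
from datetime import date

MINUTES_PER_DAY = 1440
# date.toordinal() + JDN_OFFSET gives the Julian Day Number of that date.
JDN_OFFSET = 1721425


def timestamp(day, hhmm):
    """JDN * 1,440 plus the minutes past midnight of an 'h:mm' time."""
    hours, minutes = hhmm.split(":")
    jdn = day.toordinal() + JDN_OFFSET
    return jdn * MINUTES_PER_DAY + int(hours) * 60 + int(minutes)


def best_24h_distance(points):
    """Largest distance between two points at most 1,440 minutes apart.

    points: list of (timestamp, cumulative_km) tuples sorted by timestamp.
    """
    best, left = 0.0, 0
    for right in range(len(points)):
        # Drop points from the left until the window fits in 24 hours.
        while points[right][0] - points[left][0] > MINUTES_PER_DAY:
            left += 1
        best = max(best, points[right][1] - points[left][1])
    return best
```

Because the date is folded into the timestamp, a difference across midnight needs no special handling; it is a plain subtraction of two integers.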

And that's where the problem arose…

Just looking at the above segment, one might erroneously assume, and that's what Prino had been doing ever since he created this procedure, that this ride, starting at 10:02 and ending at 6:22 the next day, took 20:20 of elapsed time. Combine this segment with the ones for the previous rides,

223/1

+- Day 1 - NL -+
|              |
+-- D: 0.0     +-- D: 25.2
|              |
+-- T:  7:50   +-- T:  8:03

223/2

+- Day 1 - LT -+
|              |
+-- D: 0.0     +-- D: 64.0
|              |
+-- T:  8:12   +-- T:  9:09

and next rides,

223/4

+- Day 2 - NL -+
|              |
+-- D: 0.0     +-- D: 34.0
|              |
+-- T:  6:42   +-- T:  6:58

223/5

+- Day 2 - NL -+- Day 2 - B --+
|              |              |
+-- D: 0.0     +-- D: 17.9    +-- D: 86.1
|              |              |
+-- T:  7:10   +-- T:  7:21   +-- T:  8:45

and it looks like the combined segment, from 223/1 (7:50) to 223/5 (7:21), is less than 24 hours (23:31) and adds up to 2,218.1 km. That's what lift has been telling me, not just for this entry, but for nearly 200(!) other similar ones.

Now, I did know that the problem existed. I had manually checked some periods close to 24 hours on trips going in a westerly direction, such as from Lithuania to Belgium, which extends the day by one hour, and found that 23:xx was actually 24:xx. But I never managed to find a workable solution.

Yes, it's possible to convert times to UTC, but then you also have to be prepared to change dates, and I tried on many occasions, without ever getting the code to work.

And then came this early morning…

I added a "put data(split_list);" statement to LIFT#NEW on z/OS, and when I looked at the resulting printout, I suddenly realized that every split only contains data for one country. In other words, the "Day 1 - LT" split above contains the country, LT, the distance for LT, 95.5 km, and the starting and ending times of that particular segment, 10:02 and 11:01. It should be blatantly obvious that those times can without any trouble be duplicated into UTC-based copies, and when those UTC-based times are used, the, as mentioned before, really big problem has turned into an ex-problem. The effort required? Ten additional lines of PL/I, and two altered statements. Wow!
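The idea can be illustrated in Python rather than PL/I. This is only a sketch under stated assumptions, not the ten lines that went into LIFT#NEW: the `Split` fields and the `elapsed_minutes` helper are hypothetical, and the UTC offsets are hard-coded summer-time values (Lithuania on UTC+3, the other four countries on UTC+2), which is an assumption that only holds for dates like those of this ride.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed summer-time offsets from UTC in hours (valid for June/July 2019).
UTC_OFFSET_HOURS = {"LT": 3, "PL": 2, "D": 2, "B": 2, "NL": 2}


@dataclass
class Split:
    """One split: a single country, its distance, and local start/end times."""
    country: str
    km: float
    start_local: datetime
    end_local: datetime

    @property
    def start_utc(self):
        # One country per split, so one offset covers both times.
        return self.start_local - timedelta(hours=UTC_OFFSET_HOURS[self.country])

    @property
    def end_utc(self):
        return self.end_local - timedelta(hours=UTC_OFFSET_HOURS[self.country])


def elapsed_minutes(first, last, use_utc=True):
    """Minutes from the start of `first` to the end of `last`."""
    if use_utc:
        delta = last.end_utc - first.start_utc
    else:
        delta = last.end_local - first.start_local
    return int(delta.total_seconds() // 60)


# First and last split of ride 3 of trip 223, times as given above.
lt = Split("LT", 95.5, datetime(2019, 6, 30, 10, 2), datetime(2019, 6, 30, 11, 1))
nl = Split("NL", 108.9, datetime(2019, 7, 1, 5, 23), datetime(2019, 7, 1, 6, 22))

if __name__ == "__main__":
    print(elapsed_minutes(lt, nl, use_utc=False))  # local clock times
    print(elapsed_minutes(lt, nl, use_utc=True))   # UTC-based copies
```

With the local clock times the ride appears to take 1,220 minutes (20:20); with the UTC-based copies it takes 1,280 minutes (21:20). Doing the subtraction on full date-and-time values also takes care of a UTC copy landing on the previous date.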

I'm a happy bunny!

Last updated on 7 February 2020 (Changed round-robin forward link)

