So you want to keep track of your rides (see also [[Statistics]]) and, once you've managed
to get safely back home, do something with the data you've collected? If yes, read on...
This work-in-progress page is based on the notes I, Prino, have used since I started
recording my rides on 16 June 1980 @ 7:47. The notes I use are pretty simple, and you can
obviously adapt them to your own requirements.
A possible form to record data
The forms I use to record my data look like this and, properly spaced, you can put three
columns with four rows on a sheet of A4 (210x297mm) paper. Fitting three columns on a US
"Letter" size sheet (215.9x279.4mm) of paper might be possible if the column widths are
reduced. I save my notes in 17-ring binders.
The key to the detail of the produced statistics is the Opmerkingen (Dutch for "Remarks")
section. I've simply called it Opmerkingen as you are unlikely to add specific
comments to all of your rides...
The fields
What follows is a description of the various fields in the above format. There is one field
that might seem to be missing, 'Waiting Time', but it should be obvious that it's
automagically included as the difference between the arrival time of one ride and the
departure time of the next ride, and in those cases where something happens in between, the
Opmerkingen section comes to the rescue.
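The implicit waiting time can be derived from two consecutive rides. A minimal sketch in Python, assuming times are written as HH.MM and both fall on the same day; the function names are illustrative and not taken from Prino's programs:

```python
def to_minutes(hhmm: str) -> int:
    """Convert a time written as 'HH.MM' into minutes since midnight."""
    hours, minutes = hhmm.split(".")
    return int(hours) * 60 + int(minutes)


def waiting_time(prev_arrival: str, next_departure: str) -> int:
    """Minutes between the end of one ride and the start of the next.

    Assumes both times fall on the same calendar day.
    """
    return to_minutes(next_departure) - to_minutes(prev_arrival)


print(waiting_time("12.34", "13.30"))  # 56
```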
Date
Pretty obvious.
If you use the ISO 8601
format (YYYY-MM-DD), it's easy to extend this to rides spanning multiple days by modifying
the format into YYYY-MM-DD/DD.
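As an illustration, the extended date notation could be read back like this. This is only a sketch: the function name is made up, and the rule that a smaller end day means "next month" is an assumption that the text above does not spell out.

```python
from datetime import date, timedelta


def parse_ride_date(text: str) -> tuple[date, date]:
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DD/DD' into (start, end) dates."""
    start_text, _, end_day = text.partition("/")
    start = date.fromisoformat(start_text)
    if not end_day:
        return start, start
    day = int(end_day)
    if day >= start.day:
        end = start.replace(day=day)
    else:
        # Assumption: a smaller end day means the ride ran into the next month.
        first_of_next = (start.replace(day=1) + timedelta(days=32)).replace(day=1)
        end = first_of_next.replace(day=day)
    return start, end


print(parse_ride_date("1980-06-30/01"))
```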
Departure/Arrival
Again pretty obvious.
The three columns contain the time, odometer and place of departure and arrival. In cases
where no odometer is available, or where it doesn't work, you can use
Google Maps to determine distances;
I've found that it is usually accurate to the nearest kilometre.
The unnamed row below 'Departure' contains the total time and distance of the ride.
Speed
The contents of this field depend on individual preferences. I put the
real speed in it, i.e. the distance divided by the actual driving time,
which is the arrival time minus the departure time minus any time recorded
for stops.
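The calculation of the real speed can be sketched as follows. The function name, the HH.MM string format and the sample figures are assumptions made for the example; it also assumes the ride does not pass midnight.

```python
def real_speed(distance_km, departure, arrival, stops=()):
    """Average speed in km/h over the actual driving time.

    departure and arrival are 'HH.MM' strings on the same day; stops is a
    sequence of (start, end) 'HH.MM' pairs recorded during the ride.
    """
    def minutes(hhmm):
        h, m = hhmm.split(".")
        return int(h) * 60 + int(m)

    driving = minutes(arrival) - minutes(departure)
    driving -= sum(minutes(end) - minutes(start) for start, end in stops)
    return distance_km / (driving / 60)


# 150 km between 10.00 and 12.30 with one 30-minute stop: 2 hours of driving.
print(real_speed(150, "10.00", "12.30", [("11.00", "11.30")]))  # 75.0
```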
Opmerkingen
This is a free-format field that you can use for any purpose you like. Why free format?
Easy: it's unlikely that you would need a specific format for all of your rides. Why, for example,
include specific headings for rides that cross borders or span more than one day, when the
number of such rides is likely to be pretty small compared to 'normal' rides?
Here's a non-exhaustive list of things I've been using this field for:
The type of the driver. (Male, female, truck, van, taxi, etc, you can make it as detailed
as you like)
The nationality of the driver, I use the
international license plate code,
but only record this information if the driver is not from the country where they pick me up.
Details of stops (reason: meal, rest, toilet, etc) and times. Recording of these times
allows you to calculate the true driving speed!
Details (i.e. time and odometer reading) at border crossings, which allows you to create
per-country totals.
Odometer readings at midnight, which allows you to create per-calendar-day totals.
Special vehicles (or even every type/model of vehicle/car).
On the last ride of a day I add a 'Total distance/time/speed' entry.
There are probably lots of other things you might want to put into it.
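To show why the odometer readings at border crossings are worth noting, here is a small sketch that splits the distance of one ride over the countries passed. The function name and the sample readings are invented for the example; the same idea works for readings taken at midnight to get per-calendar-day totals.

```python
def distance_per_country(start_odo, end_odo, start_country, crossings):
    """Split a ride's distance over countries.

    crossings is a list of (odometer_at_border, country_entered) tuples in
    the order the borders were passed.
    """
    totals = {}
    odo, country = start_odo, start_country
    for border_odo, entered in crossings:
        totals[country] = totals.get(country, 0.0) + (border_odo - odo)
        odo, country = border_odo, entered
    totals[country] = totals.get(country, 0.0) + (end_odo - odo)
    return totals


print(distance_per_country(1000.0, 1400.0, "NL", [(1120.0, "D"), (1350.0, "A")]))
# {'NL': 120.0, 'D': 230.0, 'A': 50.0}
```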
If the Opmerkingen section is too small (and I've had rides with a dozen stops and
going across five borders in two days), I usually continue on the back of the form - see
the two samples somewhere on this page.
Storing the data on a PC
This is likely to be the most important decision you will have to make. There are (at least)
three options:
a (structured) text file, to be processed by a user-written program
a spreadsheet
a database
Each of these options has its pros and cons; here are some details:
Text file
I use the text file option with a few programs I've written myself. The advantage of using
this format is the fact that it allows me total flexibility, but it has a pretty big
disadvantage in that you have to think very carefully about the format you plan to use: it
should be able to cater for future changes without you having to completely rewrite your
programs. My format, described later, was developed over about 25 years.
Despite the fact that I moved to a new format after a few years (the result of not
giving the format enough thought initially), it is rather cryptic due to the further
additions made since adopting it!
A spreadsheet
If you're well versed in spreadsheets (or if not, try LibreOffice,
it's free) you might want to consider using one to process your data. It will have the big
advantage that you can insert or delete columns in your source data and the program will
automagically update the references in all other cells and/or sheets. Combined with the
many conditional functions, you will probably be able to produce any statistics
you like, although some of the more esoteric ones my program creates will
be pretty hard (or even impossible) to replicate.
A database
What was written about spreadsheets also holds true for databases. Not having used any PC
database programs, I cannot recommend any, but there are plenty of free ones, LibreOffice Base
and MariaDB, to name just two of the better-known
ones. Creating your statistics will mean writing queries (most likely in the
fairly easy to learn language SQL), but given the non-procedural nature of this language,
some results that can be created with a self-written program or a spreadsheet may be hard
or even impossible to recreate.
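To give an idea of what such a query looks like, here is a sketch using SQLite through Python's standard sqlite3 module. SQLite is chosen only because it needs no installation; it is not one of the programs named above, and the table layout, column names and sample rides are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE rides (country TEXT, distance_km REAL, minutes INTEGER)")
con.executemany(
    "INSERT INTO rides VALUES (?, ?, ?)",
    [("NL", 40.0, 30), ("D", 250.0, 150), ("D", 110.0, 90)],
)

# Rides, total distance and average speed (km/h) per country.
rows = con.execute(
    "SELECT country, COUNT(*), SUM(distance_km), "
    "SUM(distance_km) * 60.0 / SUM(minutes) "
    "FROM rides GROUP BY country ORDER BY country"
).fetchall()
for row in rows:
    print(row)
```

Simple totals like these are easy in SQL; the statistics that depend on the order of the rides are where it gets harder.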
Prino's original program
As mentioned above, and being a programmer by profession, I selected the first method of
storing the data, a text file. The first 60(!) versions of the program were written in
Turbo Pascal
(V3.01a) and until about version 20 they used 'Version I' of a simple
CSV file
with the data. It could handle rides passing through multiple countries and spanning more
than one day, but did not know anything about ferry crossings, stops or time-zones, to name
but a few of the things that arrived later...
Given that the old format became obsolete a long time ago, I've not included any details
about it, but its output mimicked my manually created five tables per trip, containing:
a table with the distribution of the distances per ride. Initially in intervals of
100 km, but due to the overwhelming number of rides shorter than 100 km, the first
interval was soon split up into four intervals of 25 km, and the (as expected,
rather small number of) rides over 1,000 km long are grouped into intervals
of 1,000 km.
a table of distances for each type of driver.
a table of distances per country.
a table with a count of the number of drivers per nationality.
a table with various maxima, minima and averages for the rides and days of the trip, i.e.
highest/lowest/average speed per ride/day, greatest/smallest/average distance per
ride/day, longest/shortest/average time per ride/day.
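The intervals of the first table (25 km below 100 km, 100 km up to 1,000 km, 1,000 km beyond that) can be sketched like this. The function name and sample distances are made up, and treating each interval as "lower bound included, upper bound excluded" is an assumption.

```python
from collections import Counter


def distance_bin(km: float) -> tuple[int, int]:
    """Return the (lower, upper) bounds in km of the interval containing km."""
    if km < 100:
        width = 25
    elif km < 1000:
        width = 100
    else:
        width = 1000
    lower = int(km // width) * width
    return lower, lower + width


rides = [12.0, 30.5, 99.9, 100.0, 640.0, 1250.0]
histogram = Counter(distance_bin(km) for km in rides)
for bounds in sorted(histogram):
    print(bounds, histogram[bounds])
```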
Simple, uncomplicated and one might assume that most hitch-hikers would leave it at this...
Prino's current program
The current program is written in Virtual Pascal V2.1.279 (or
PL/I, should you want to run it on IBM's z/OS). It is licensed under the provisions of the
GPL V3. The
authenticity verified WinRAR archive
containing the sources and executable files can be found on Prino's
Google Drive.
Data format used by Prino's current program
The 'simple' format was used until the end of 1994. Due to the fact that I wanted to add
some additional statistics to the output files, it was changed into something a bit more
logical, although some people might find otherwise. (And they are right, it's a right-royal
mess due to more additional requirements, and I would like to simplify some of the more
esoteric uses of punctuation, but that won't happen until I get back onto the 'big iron'
with its superb debugging facilities!)
The current format, split into two parts to avoid scrolling, looks like this:
....v....1....v....2....v....3....v....4....v....5....v....6....v....7....v....8....v....9..
999, 9999,  AAA, 99999.9, HHH.MM,  999.9, NAT, TYPE, CTY, HH.MM, S, HH.MM, HH.MM, YYYY-MM-DD
|    |      |    |        |        |      |    |     |    |      |  |      |      |
a    b      c    d        e        f      g    h     i    j      k  l      m      n
..v....0....v....1....v....2....v....3....v....4....v....5....v....6....v....7....v....8....v....v...
, 999999.9, DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD , AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
| | |
o p q
and the fields used in it are:
Col 1 - 3 ("Trip")
number of the trip, up to three digits
Col 6 - 9 ("Ride")
number of the ride within the trip, up to four digits
Col 13 - 15 ("Day")
"day" of the trip, up to three digits
up to three digits for a real day
'␣␣#' to
indicate that this ride was split over two (or more) days
'nna' through 'nnz' for the parts traveled through more than one country. ('a' on
the first country must be lowercase, nn can be any number, I use
the first day of the current ride)
'!xy' to indicate the xy-th stop during this ride. 'x' must be
present, but may be a blank!
Col 18 - 24 ("Distance")
total distance for this ride, up to 99999.9 km
Col 27 - 32 ("Time")
driving time for this ride, up to 999:59
Col 34 - 40 ("Velocity")
the data in these columns can be either
the average velocity for this ride, i.e. "Distance" divided by "Time"
or
blank, if "Split" is non-blank
Col 43 - 45 ("Nationality")
the nationality of the driver, up to three characters^{[1]}
Col 48 - 51 ("Type")
the data in these columns can be either
the "type" of the driver, up to four characters
or
the type of "in-ride" wait, an exclamation ("!") mark followed by three
characters. Free format, but at the moment all descriptions must be hardcoded in
the program. There are two values that must be present for some
specific "in-ride" waits. They are:
B - border crossing
If the driver does not stop at the border, the time of crossing must be recorded in
"Departure time", and, for crossings that involve a stop,
or move into a different time-zone, "Arrival".
In case of a triple-crossing, i.e. day, border and forward-moving
time-zone (e.g. PL-LT at midnight), the 'B' must be followed
by a '#'^{[2]}.
^ - non-hitching time between rides
At the moment this time must be allocated as the last "in-ride" waiting time
of the previous ride, which needs to have its ending time adjusted, i.e. if
ride N ends at 12.34 and is followed by 56 minutes during which you
have a look at that interesting building (or whatever), you'd add a '!^'
stop to ride N, with a start-of-stop time of 12.34, an end-of-stop time
of 13.30 and adjust the end-of-ride time for ride N to 13.30.
Col 54 - 56 ("Country")
the country for this ride, in one of the following formats
the (up to) three letter country abbreviation^{[1]}, when "Split" is blank,
or
an "*" if this particular ride is split over multiple countries, where the "*" must be on the first line for this ride,
or
blank, for those lines of the ride that deal with the split day data, i.e. where "Split" is "#",
or
the (up to) three letter country abbreviation^{[1]}, i.e. "Type" is "!B" and "Split" is "!"
Col 59 - 63 ("Wait")
the wait for this ride^{[3][4]}
Col 66 ("Split")
a "split-type" indicator^{[5]}, with the following possible values:
blank
ride is not split over multiple countries, nor over multiple days
"#"
the current line contains ride-data relating to a day split
"*"
the current line contains ride-data relating to a country split
"!"
the current line contains ride-data relating to an "in-ride" wait (See "Type")
Col 69 - 73 ("Departure time")
a time field, that can contain one of the following times
the departure time^{[6]} of this particular ride,
or
the starting time of an "in-ride" wait, i.e. where "Split" is "!",
or
the time for a non-stopping border crossing^{[7]}, i.e. "Type" is "!B" and "Split" is "!"
Col 76 - 80 ("Arrival time")
a time field, that can contain one of the following times
the arrival time^{[8]} of this particular ride,
or
the ending time of an "in-ride" wait, i.e. where "Split" is "!",
or
the time for a stopping border crossing^{[7]}, i.e. "Type" is "!B" and "Split" is "!"
Col 83 - 92 ("Date")
the date, in ISO 8601 format. The date is required for the first ride of the day, and for
every multiple day line where the "Split" is "#", except the first, unless the multiple day ride is the first
ride of the day.
Col 95 - 102 ("Odometer")
the odometer at the start of this ride, up to 999999.9 km
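Because all positions are fixed, a line can be cut into fields purely by column number. The sketch below is in Python rather than Pascal and is not taken from Prino's program; the field names and the sample ride are invented, only the column positions come from the list above.

```python
# 1-based, inclusive column positions as listed above.
FIELDS = {
    "trip": (1, 3), "ride": (6, 9), "day": (13, 15), "distance": (18, 24),
    "time": (27, 32), "velocity": (34, 40), "nationality": (43, 45),
    "type": (48, 51), "country": (54, 56), "wait": (59, 63), "split": (66, 66),
    "departure": (69, 73), "arrival": (76, 80), "date": (83, 92),
    "odometer": (95, 102),
}


def parse_record(line: str) -> dict[str, str]:
    """Slice one data line into its fields; missing trailing fields become ''."""
    return {name: line[start - 1:end].strip() for name, (start, end) in FIELDS.items()}


# An invented ride, built with format widths so every field lands in its columns.
sample = (f"{1:3}, {1:4},  {1:3}, {42.5:7.1f}, {'0.45':>6},{56.7:7.1f}, "
          f"{'NL':<3}, {'M':<4}, {'NL':<3}, {'0.10':>5}, {'':1}, "
          f"{'7.47':>5}, {'8.32':>5}, 1980-06-16")
record = parse_record(sample)
print(record["distance"], record["time"], record["date"])
```

All values are returned as text; interpreting them (for instance the special "Day" and "Type" markers) is left to the caller.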
[2] Given that Prino has never encountered a triple crossing
with a backward moving time-zone (e.g. F-GB at midnight), he doesn't expect the program to
handle them.
[3] Prino never records the waiting time before the first
ride of the day, unless the first ride of the day happens to be one that continues directly
after the last ride of the previous day, without any intervening (sleep?) stop.
[4] The normal "." separator in the waiting time must be
replaced by a ":" (colon) for those waits caused by the departure from a
ferry terminal where you haven't been able to get a ride on the ferry (and may have had to
wait until the next ferry...)
[5] Multiple split lines may be present, but they must
be grouped, and the groups must be in "#", "*", "!" order!
[6] The normal "." separator in the departure time must be
replaced by
a ":" (colon) for departures after a ferry crossing in order to record the ferry crossing, or
a lowercase"x" for departures from a same-named location at the other side of the road, or
an uppercase"X" for departures from a different-named location at the other side of the road.
[7] Non-stopping (i.e. drive-through) border crossings must not record
an arrival time, unless the time-zone changes.
[8] The normal "." separator in the arrival time of a time-zone changing border crossing must be
replaced by
a "+" for a crossing in easterly direction, and
a "-" for a crossing in westerly direction.
[9] The places of departure and arrival must be UTF-8 encoded and their length is (currently) limited to 47 bytes!
[10] Lines should not contain trailing blanks, but completely blank lines, consisting of just a CR/LF, are allowed.
[11] Although the data is in CSV format, all positions are fixed!
[12] Lines starting with "{" in column 1 are treated as comment or meta-data. A description of the meta-data format
can be found below.
[13] Columns containing numerical data or times must be right aligned, columns containing text must be left aligned, and if the
textual data is shorter than the column width, the separating comma may follow it without intervening blanks.
[14] The "Odometer", "Place of departure",
and "Place of arrival", are (currently) completely ignored by the main "lift" program.
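The separator conventions from notes [4], [6] and [8] mean that a time field carries two pieces of information: the time itself and the meaning of its separator. A possible way to pull these apart is sketched below; the names are illustrative and the regular expression is an assumption about what a valid field looks like.

```python
import re

# Hours, one separator character, two-digit minutes.
TIME = re.compile(r"^\s*(\d{1,3})([.:xX+-])(\d{2})\s*$")

# Meaning of the separator in a departure time, following note [6].
DEPARTURE_SEPARATORS = {
    ".": "normal departure",
    ":": "departure after a ferry crossing",
    "x": "same-named location at the other side of the road",
    "X": "different-named location at the other side of the road",
}


def split_time(field: str) -> tuple[int, str]:
    """Return (minutes, separator) for a field such as '12.34' or '12:34'."""
    match = TIME.match(field)
    if match is None:
        raise ValueError(f"not a time field: {field!r}")
    hours, separator, minutes = match.groups()
    return int(hours) * 60 + int(minutes), separator


minutes, separator = split_time("12:34")
print(minutes, DEPARTURE_SEPARATORS[separator])
```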
Meta data
As I didn't want to make the format of the data even more complex than it already is, I decided to allow for comments and
meta-data to be embedded into the input file. Both comments and meta-data start with a "{" and can be up to 255 characters long.
The (currently) defined, but only partly used types of meta-data are:
Time-zone information
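The meta-data contained in lines starting with "{Z-" or "{Z+"
provide information about the time-zones for the countries passed in a trip. The difference between
the two variants is that "{Z-" completely replaces the current time-zone information, while "{Z+"
adds to it, possibly overwriting existing entries.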
All lines containing comments and meta-data share their first character, a "{", which also happens to be the
character that starts a comment in Pascal, which happens to be the language my programs are written in...
Col 1
"{", the start-of-comment/meta-data indicator
Col 2
the following characters, in the order they came into existence, indicate whether the line contains meta-data, or is simply a comment:
"Z" - the line contains time-zone information
"<" - the line indicates the start of partner data
">" - the line indicates the end of partner data
"I" - the line contains interruption information
"*" - the line contains an in-trip split-year indicator
"H" - the line contains a Google short URL to Google Maps tracing a ride or series of rides
These lines contain time-zone information. The format of these comments is:
Pos     Description
1..2    Time-zone identifier, '{Z'
3       - : the remaining data on the line will completely replace the current
            time-zone info.
        + : the remaining data on the line will be added to the current time-zone
            information, possibly overwriting existing information. This option can
            be used in multi-zonal countries to update the time-zone for the country.
4..6    three letter abbreviation for the country
7       blank
8..10   zone difference from a default (your?) country
11      blank
Pos 4..11 can be repeated up to 31 times. Should a trip pass through more than 31 countries or should you wish to include
all countries in one single place, additional '{Z+...' lines must be used.
The program can handle up to 256 countries, which is more than the current number of countries on Earth, but it still
requires a change to handle fractional time-zones, for countries like Iran (UTC +3.30) and India (UTC +5.30).
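Reading such a time-zone line could look like the sketch below. It assumes that each country takes a group of eight positions starting at position 4, and that the zone difference is a signed whole number; neither the function names nor the sample values come from Prino's program.

```python
def parse_timezone_line(line: str) -> tuple[bool, dict[str, int]]:
    """Parse a '{Z-' or '{Z+' line into (replace_existing, {country: offset})."""
    if not line.startswith(("{Z-", "{Z+")):
        raise ValueError("not a time-zone meta-data line")
    replace_existing = line[2] == "-"
    zones = {}
    body = line[3:]
    for start in range(0, len(body), 8):
        group = body[start:start + 8]
        country, offset = group[:3].strip(), group[4:8].strip()
        if country:
            zones[country] = int(offset)  # assumption: whole numbers only
    return replace_existing, zones


def apply_timezone_line(current: dict[str, int], line: str) -> dict[str, int]:
    """Return the time-zone table after processing one '{Z' line."""
    replace_existing, zones = parse_timezone_line(line)
    return zones if replace_existing else {**current, **zones}


table = apply_timezone_line({}, f"{{Z-{'NL':<3} {'+0':>3} {'GB':<3} {'-1':>3}")
table = apply_timezone_line(table, f"{{Z+{'FIN':<3} {'+1':>3}")
print(table)
```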
{< aaaa bbbb cccc - lorum ipsum
....v....1....v....2....v....3...
{> - lorum ipsum
These lines allow the rides of a second person, provided they are an exact subset of
the rides of the first person, to be extracted into their own file. The essential parts of the lines are:
Pos       Description
1..2      Identifier, "{<" - Start of second-person data / "{>" - End of second-person data
04..07    value to subtract from main file trip number to create the trip number for the second person
09..12    value to subtract from main file ride number to create the ride number for the second person
14..17    value to subtract from main file day number to create the day number for the second person
20..EOL   Name of second person, used to extract specific records (on "{<" record)
06..EOL   Name of second person, used to extract specific records (on "{>" record)
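The renumbering for the second person comes down to three subtractions. A sketch, with invented function names and sample values, using only the positions given in the table above:

```python
def parse_partner_start(line: str) -> dict:
    """Parse a '{<' line into renumbering offsets and the partner's name."""
    if not line.startswith("{<"):
        raise ValueError("not a start-of-partner-data line")
    return {
        "trip_offset": int(line[3:7]),    # positions 04..07
        "ride_offset": int(line[8:12]),   # positions 09..12
        "day_offset": int(line[13:17]),   # positions 14..17
        "name": line[19:].strip(),        # positions 20..EOL
    }


def renumber(trip: int, ride: int, day: int, offsets: dict) -> tuple[int, int, int]:
    """Trip, ride and day numbers as they appear in the second person's file."""
    return (trip - offsets["trip_offset"],
            ride - offsets["ride_offset"],
            day - offsets["day_offset"])


offsets = parse_partner_start(f"{{< {12:4} {340:4} {5:4} - Example")
print(offsets["name"], renumber(13, 345, 6, offsets))
```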
{W
These lines allow for alternate descriptions of the departure and arrival locations. I (Prino) use them
to add consistent descriptions and English translations to my original history data. The data is used by the DAT2CSV and
H-H2WIKI programs, and should be in the format of the normal data, but only the departure and arrival locations should
be used, the rest of the line should be blank.
Note: Any line starting with '{' that does not fit into any of the above categories is ignored completely,
i.e. treated as a comment!
The results of Prino's current program
The current program produces rather a lot more output than the five tables per trip! In fact it now produces four files, and
an optional post-processor program that translates the output into .RTF format creates two additional files with
two tables sorted in various other orders.
The summary output file: 'summ.h-h'
This file contains no less than 86 tables (some of them broken into several parts because they would otherwise require A3 or
A2 size paper). Here's the full list, the examples given are based on the first two trips of my hitch-hiking career:
two tables of general totals for every trip
per individual trip
as a running cumulative total
a table of totals for all distances
a table of totals for all types
a table of totals for all countries
a table of totals for all nationalities
a table of totals for all speeds
three tables of totals for all waits
waits split up in waiting time intervals
a statistical analysis of waiting times
waits split up in reason per wait
two tables of ferry related waits
waits after ferry crossings
time spent on ferries
three tables of pick-ups
per nationality per country
per country per type
per nationality per type
a table with the distribution of departure times per weekday
a table with the first and last ride for all distances
a table with the first and last ride for all types
a table with the first and last ride for all countries
a table with the first and last ride for all nationalities
a table with the first and last ride for all speeds
two tables of waits per trip, split in short and long waits
per individual trip
as a running cumulative total
a table of waits per country, split in short and long waits
a table of waits per weekday, split in short and long waits
a table of waits per month, split in short and long waits
a table of waits per year, split in short and long waits
a max/min/average summary for all rides
a max/min/average summary for all days
a max/min/average summary for all types
a max/min/average summary for all nationalities
a max/min/average summary for all countries
a table of rides per country, split in internal and border crossing rides
four tables for the max/min speed & max/min rides for a given number of distances
four tables for the max/min speed & max/min distance for a given number of rides
four tables for the maximum number of rides exceeding a number of selected velocities, maximized for the number of rides and the distance,
one set of two tables for absolute speed
one set of two tables for average speed
four tables for the maximum number of rides exceeding a number of selected lengths, maximized for the number of rides and the distance,
one set of two tables for absolute distance
one set of two tables for average distance
a max/min/average summary for all rides per year
a max/min/average summary for all days per year
a table of totals for all distances per trip
a table of totals for all speeds per trip
a table of totals for all distances per day
a table of totals for all speeds per day
a table with the first and last day for all distances
a table with the first and last day for all speeds
four tables for the max/min speed & max/min days for a given number of distances
four tables for the max/min speed & max/min distance for a given number of days
four tables for the maximum number of days exceeding a number of selected velocities, maximized for the number of days and the distance,
one set of two tables for absolute speed
one set of two tables for average speed
four tables for the maximum number of days exceeding a number of selected lengths, maximized for the number of days and the distance,
one set of two tables for absolute distance
one set of two tables for average distance
a table with totals per weekday
a table with totals per month
a table of general totals per year
a table with first/last ride/trip per year
a table with usage of days per year
a table of totals for consecutive days
a table of totals for 24 hour periods
a table of totals for 365 day periods
a table of minimum number of rides needed for selected numbers of nationalities
two tables (one per trip, one per year) with the number of types, countries and nationalities encountered during the
trip/year, split in a total and a "new" column
four tables (two per type, two per nationality) with
the longest run of consecutive rides for a single type or nationality
the longest run of consecutive rides without a type or nationality
a table of pickup times per 4-hour interval per country
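Most of these tables are plain totals, but a few depend on the order of the rides. As an example, the longest runs of consecutive rides with and without a given type (or nationality) could be found as sketched below; the function names and the sample sequence are invented.

```python
from itertools import groupby


def longest_run_with(values, wanted):
    """Length of the longest run of consecutive rides equal to wanted."""
    runs = (len(list(group))
            for matches, group in groupby(values, lambda v: v == wanted)
            if matches)
    return max(runs, default=0)


def longest_run_without(values, unwanted):
    """Length of the longest run of consecutive rides not equal to unwanted."""
    runs = (len(list(group))
            for differs, group in groupby(values, lambda v: v != unwanted)
            if differs)
    return max(runs, default=0)


types = ["M", "M", "F", "M", "M", "M", "T"]
print(longest_run_with(types, "M"), longest_run_without(types, "F"))  # 3 4
```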
The trip/type/country/nationality/year output file: 'lift.h-h'
This file contains
four pages for every trip, containing the following tables:
on page 1:
a table with totals per day
a table of totals for all distances
a table of totals for all types
a table of totals for all countries
a table of totals for all nationalities
a table of totals for all speeds
a max/min/average summary for all rides and days
on page 2:
a table of totals for all waits
a table of the statistical waiting time distribution
a table of all in-ride waits per category
a table of waits per country, split in short and long waits
on page 3:
three tables of pick-ups
per nationality per country
per country per type
per nationality per type
on page 4:
a max/min/average summary for all types
a max/min/average summary for all nationalities
a max/min/average summary for all countries
two tables detailing distances per country
a table listing the (partial) country distances in the order they were passed
a table that just summarizes the distance per country
a 'Totals per type' separator page, followed by one page for every type, containing the following five tables:
a table of totals for all distances
a table of totals for all countries
a table of totals for all nationalities
a table of totals for all speeds
a max/min/average summary for the type
The table with totals per type is not included on the per-type pages, as it would contain just a single line
with the totals for that particular type. Instead the type is added into the heading of the totals-for-all-distances table.
a 'Totals per country' separator page, followed by one page for every country, containing the following four tables:
a table of totals for all waits, the country is added to the heading of this table
a table of the statistical waiting time distribution
a table with the distribution of departure times
a max/min/average summary for the country, containing two rows, one for the non border-crossing rides, and one for the border-crossing rides
a 'Totals per nationality' separator page, followed by one page for every nationality, containing the following tables:
a table of totals for all distances
a table of totals for all types
a table of totals for all countries
a table of totals for all speeds
a max/min/average summary for the nationality
The table with totals per nationality is not included on the per-nationality pages, as it would contain
just a single line with the totals for that particular nationality. Instead the nationality is added into the heading of
the totals-for-all-distances table.
a 'Totals per year' separator page, followed by five pages for every year, containing the following tables:
on page 1:
a table of totals for all distances
a table of totals for all types
a table of totals for all countries
a table of totals for all nationalities
a table of totals for all speeds
a max/min/average summary for all rides and days
The year is added into the heading of the totals-for-all-distances table.
on page 2:
a table of totals for all waits
a table of the statistical waiting time distribution
a table of all in-ride waits per category
a table of waits per country, split in short and long waits
on page 3:
three tables of pick-ups
per nationality per country
per country per type
per nationality per type
on page 4:
a max/min/average summary for all types
a max/min/average summary for all nationalities
a max/min/average summary for all countries
a table that summarizes the distance per country, split in non border-crossing and border-crossing rides
on page 5:
a table of totals for all distances per day
a table of totals for all speeds per day
a table with totals per weekday
a table with totals per month
a table with progressive totals for 24 hour periods
a table with the total period in days hitched during the year
However, some logical pages may overflow physical pages, which is most likely to happen with the page that contains your
most seen type, especially if you've visited a fair number of countries.
The set of programs contains an optional program to remove all data that does not relate to the current trip from this file, leaving only
five pages for the current trip
a 'Totals per type' separator page, followed by one page for every type that appeared in the current trip,
a 'Totals per country' separator page, followed by one page for every country that appeared in the current trip,
a 'Totals per nationality' separator page, followed by one page for every nationality that appeared in the current trip, and
a 'Totals per year' separator page, followed by five pages per year for the year(s) of the current trip,
which is kinder to trees, if you insist on also keeping the results on paper.
The daily summary output file: 'days.h-h'
This file contains one table with a line for every calendar day of every trip, detailing
the number of the trip
the day in the trip
the distance hitched during the day
the (actual) driving time during the day
the average velocity for the day
the date
A follow-up program will process this file, putting the original single column data in four columns of 70 rows. It also
sorts the file into three additional orders, Distance, Time and Velocity. If the data is required to be in .RTF format,
this program is required.
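The rearrangement into four columns of 70 rows can be sketched as follows. This is not the follow-up program itself; the function name is made up and filling the page column by column is an assumption.

```python
def paginate(lines, rows=70, columns=4):
    """Yield pages; each page is a list of rows and each row holds up to
    `columns` entries, filled column by column (an assumption)."""
    per_page = rows * columns
    for start in range(0, len(lines), per_page):
        chunk = lines[start:start + per_page]
        yield [chunk[r::rows] for r in range(min(rows, len(chunk)))]


days = list(range(300))           # stand-ins for the lines of 'days.h-h'
pages = list(paginate(days))
print(len(pages), pages[0][0])    # 2 [0, 70, 140, 210]
```

The additional sort orders would then be a matter of sorting the lines on distance, time or velocity before calling the function.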
The formatted input data output file: 'trip.h-h'
This file merely puts the input data into a neat table (zapping the odometer and place of departure & arrival columns).
The program will paginate trips that do not fit on A4 paper.