Time
No, not that Pink Floyd song off Dark Side (although that's a great song). I'm planning to smack down some Unix and computer time knowledge in today's post. Let's keep it on the practical side of things and we'll try not to stray too far into the metaphysical.
It's more complex than you might think
It's pretty common as a programmer to have to deal with time, and if you have then you know that it's a much more complex concept than it at first appears. If you haven't strayed into an area where you're dealing with the representation of time, then you've probably never given it much thought.
When you look closely at our current system of timekeeping and examine it as a system, without thinking in terms of the natural world, you start to notice that nothing lines up into nice little buckets for the poor programmer. Sure, it all makes perfect sense when displayed against the framework of the natural world, but take that away and it just looks batshit crazy. Let's break things down into units of time and how they're measured.
- Seconds are divided into decimal fractions
- Minutes are 60 seconds
- Hours are 60 minutes
- Days are 24 hours (most of the time, ignoring leap seconds for now)
- Weeks are 7 days
- Months are anywhere from 28 to 31 days, depending on which month of the year it is and whether it's a leap year or not
- Years have 12 months and either 365 or 366 days
- A leap year is a year that is divisible by 4, but not by 100, unless it is also divisible by 400
- Roughly every 15 degrees of longitude we have a time zone, which is generally one hour off from its neighbor in local time, except for those places which use daylight saving time during the summer, and the middle of the Pacific, where you might travel across a magical boundary line one day into the past.
There aren't a lot of decimals or binaries in all of that, and there are some interesting rules to follow. It kind of leaves one in awe of the human brain to think that our mushy grey matter can intuitively grasp such an inconsistent standard, which to a computer must look like someone has been spiking the punch at a junior high prom.
One might, at first glance, think that dealing with timezones (as an example) is not really all that hard. Well, what if it's March 1st at 2 AM and you're living in UTC+5? What does that translate to in UTC? It's 9 PM on the previous day, but whether that day is February 28th or February 29th depends on what year it is.
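To make that concrete, here is a minimal sketch in Zig. The function names are made up for this illustration and don't come from any library. Taking five hours off 2 AM on March 1st lands on the last day of February, so the whole answer hinges on the leap year rule from the list above.

```zig
const std = @import("std");

/// Gregorian leap year rule: divisible by 4, but not by 100,
/// unless also divisible by 400.
fn isLeapYear(year: i32) bool {
    return @rem(year, 4) == 0 and (@rem(year, 100) != 0 or @rem(year, 400) == 0);
}

/// Hypothetical helper: the day of February (in UTC) that corresponds to
/// 02:00 on March 1st in a UTC+5 time zone.
fn utcDayOfFebruary(year: i32) u8 {
    return if (isLeapYear(year)) 29 else 28;
}

test "March 1st at 02:00 in UTC+5 is the last day of February in UTC" {
    try std.testing.expect(utcDayOfFebruary(2023) == 28);
    try std.testing.expect(utcDayOfFebruary(2024) == 29);
}
```

Same wall clock time, same offset, two different calendar dates in UTC.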
Enter the Unix epoch
The Unix epoch is nothing more than an agreed-upon moment, coinciding with the date and time of January 1st, 1970 at 00:00:00 UTC. In Unix we use timestamps to track file creation and modification times. A timestamp is a large signed integer which counts seconds from the Unix epoch. This greatly simplifies timekeeping on Unix systems, as we just have to count seconds. Need to order a bunch of blog posts in an aggregator by time of publication? The easiest way to do it is to convert each one to an i64 referenced to the epoch. So rather than a complex mathematical system to compare date-times, all we really need is a way to convert from a human-readable time representation to an i64 referenced to the epoch, and back again.
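As a rough sketch of the blog aggregator idea, assuming a made-up Post struct that stores its publication time as an i64, ordering comes down to comparing plain integers. The insertion sort is hand-rolled here only to keep the example self-contained.

```zig
const std = @import("std");

/// Hypothetical post record for illustration only.
const Post = struct {
    title: []const u8,
    published: i64, // seconds since the Unix epoch
};

/// Sort posts oldest-first with a simple insertion sort.
fn sortByPublished(posts: []Post) void {
    var i: usize = 1;
    while (i < posts.len) : (i += 1) {
        const current = posts[i];
        var j = i;
        // Shift later posts right until we find the slot for `current`
        while (j > 0 and posts[j - 1].published > current.published) : (j -= 1) {
            posts[j] = posts[j - 1];
        }
        posts[j] = current;
    }
}

test "posts sort oldest first" {
    var posts = [_]Post{
        .{ .title = "third", .published = 1483228800 },
        .{ .title = "first", .published = -86400 },
        .{ .title = "second", .published = 0 },
    };
    sortByPublished(&posts);
    try std.testing.expect(posts[0].published == -86400);
    try std.testing.expect(posts[1].published == 0);
    try std.testing.expect(posts[2].published == 1483228800);
}
```

Note that a post from before 1970 just has a negative timestamp and still sorts correctly.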
What about leap seconds?
That's a good question. For the unaware, the Earth's rotation isn't perfectly regular, so even with leap years our clocks slowly drift out of sync with the cosmos, and we occasionally add in a leap second to fix the discrepancy. But on Unix systems we fudge it a bit. Rather than having a day with 60 * 60 * 24 seconds plus 1 leap second, we generally just count the last second in the day twice. This does mean, however, that on those days there is an ambiguous window between 23:59:59 and 00:00:00, where the timestamps cannot be considered an accurate representation for ordering purposes.
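One way to see the ambiguity is the "seconds since the epoch" formula from the POSIX specification, sketched below in Zig. The function name and parameter layout are my own. As in C's struct tm, tm_year counts years since 1900 and tm_yday counts days since January 1st starting at zero. Since the formula treats every day as exactly 86400 seconds, the leap second 23:59:60 at the end of 2016 comes out with the same timestamp as midnight on January 1st, 2017.

```zig
const std = @import("std");

/// Seconds since the epoch for a UTC time, following the POSIX formula.
/// Only meant for years from 1970 onward.
fn posixSeconds(tm_year: i64, tm_yday: i64, hour: i64, min: i64, sec: i64) i64 {
    return sec + min * 60 + hour * 3600 + tm_yday * 86400 +
        (tm_year - 70) * 31536000 +
        @divTrunc(tm_year - 69, 4) * 86400 -
        @divTrunc(tm_year - 1, 100) * 86400 +
        @divTrunc(tm_year + 299, 400) * 86400;
}

test "the leap second shares a timestamp with the following midnight" {
    // 2016-12-31 23:59:60 UTC: tm_year = 116, tm_yday = 365 (leap year)
    const leap_second = posixSeconds(116, 365, 23, 59, 60);
    // 2017-01-01 00:00:00 UTC: tm_year = 117, tm_yday = 0
    const midnight = posixSeconds(117, 0, 0, 0, 0);
    try std.testing.expect(leap_second == midnight);
    try std.testing.expect(midnight == 1483228800);
}
```

Two distinct moments, one integer, which is exactly why ordering inside that window can't be trusted.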
How does this conversion work, anyway?
I'm glad that you asked. I've implemented datetime containers in Rust a few times now, but I'm a little tired of Rust and want to work in Zig for a bit. While Zig doesn't have all of the guarantees regarding memory safety that Rust has, both languages benefit from having very strong typing including algebraic data types, which goes a surprisingly long way towards turning runtime errors into compile time errors. In fact, if you make a habit of touching the heap as little as possible I think Zig comes damn near Rust in terms of enforcing safety and correctness. So let's indulge for a moment and take a peek at a datetime container in Zig, with methods to convert to and from i64 timestamps. We're going to start with this year nonsense and work down from there to months and days.
const std = @import("std");
const debug = std.debug;
const testing = std.testing;
const SECONDS_PER_DAY = @import("main.zig").SECONDS_PER_DAY;
pub const YearTag = enum(u1) {
normal,
leap,
fn new(year: i32) YearTag {
return if (@rem(year, 4) == 0 and (@rem(year, 100) != 0 or @rem(year, 400) == 0)) .leap else .normal;
}
};
This bit of code is just an enum which tells us whether we're in a leap year or not. The new function takes the year as an i32 and returns either .normal or .leap depending on the math. We're going to use this enum as a tag to create our Year type, which is a tagged union, so that whether or not a given year is a leap year is encoded in the type. As a quick aside before looking at that, I want to point out that YearTag is an enum(u1), meaning that it is represented by a one-bit integer. Combined with other language constructs such as packed structs, Zig allows you to pack the maximum amount of data into the minimum memory footprint. Aside over, let's look at our Year union.
pub const Year = union(YearTag) {
normal: i32,
leap: i32,
const Self = @This();
pub fn new(year: i32) Self {
return switch (YearTag.new(year)) {
.normal => Self{ .normal = year },
.leap => Self{ .leap = year },
};
}
// (the union stays open here; the methods below are part of it)
Since we want to break some of this math down into bite-sized chunks, we're going to use the tag on our Year union to give us a function which returns the number of days in any given year. Since we want to be able to convert to seconds, it's also useful to provide a function which gives the total number of seconds in a given year.
pub fn days(self: Self) u16 {
return switch (self) {
.normal => 365,
.leap => 366,
};
}
pub fn seconds(self: Self) i64 {
return @as(i64, self.days()) * SECONDS_PER_DAY;
}
We're going to round out our little year module with a function which gets the number portion of our data type, functions which give us the next or previous year, and a function which pretty-prints the year for us which we can use in Zig's format strings.
pub fn get(self: Self) i32 {
return switch (self) {
.normal => |year| year,
.leap => |year| year,
};
}
pub fn next(self: Self) Self {
return Self.new(self.get() + 1);
}
pub fn previous(self: Self) Self {
return Self.new(self.get() - 1);
}
pub fn format(
self: Self,
comptime fmt: []const u8,
options: std.fmt.FormatOptions,
writer: anytype,
) !void {
_ = fmt;
_ = options;
const year = self.get();
if (year > 0) {
try writer.print("{d:0>4}", .{@intCast(u32, year)});
} else {
try writer.print("{d:0>4}", .{year});
}
}
};
I want to point out that you could very well use an i32 by itself to represent the year in your datetime container. Doing it this way, by leveraging a strong type system, makes it harder to represent gibberish data. It also takes care of some of the math that we'll be dealing with later, which is going to shorten our conversion functions considerably. Let's move on then, to months.
Months
We have twelve months on our calendar, and we want to represent them in a way that is human readable while also making sense to the processor. We also want to move some of our logic into our month module so that we won't have to deal with it later, just like we did with our year module. The type I'm going to reach for this time is just an enum, not a tagged union, because we don't have an extra numerical component to represent. So, like our Year union, I'm providing methods to return the number of days, number of seconds, next and previous months.
pub const Month = enum(u4) {
january = 1,
february = 2,
march = 3,
april = 4,
may = 5,
june = 6,
july = 7,
august = 8,
september = 9,
october = 10,
november = 11,
december = 12,
const Self = @This();
pub fn days(self: Self, year: Year) u5 {
return switch (@enumToInt(self)) {
1, 3, 5, 7, 8, 10, 12 => 31,
2 => switch (year) {
.normal => 28,
.leap => 29,
},
else => 30,
};
}
pub fn seconds(self: Self, year: Year) u32 {
return @as(u32, self.days(year)) * SECONDS_PER_DAY;
}
pub fn next(self: Self) ?Self {
const num = @enumToInt(self);
return if (num < 12) @intToEnum(Self, num + 1) else null;
}
pub fn previous(self: Self) ?Self {
const num = @enumToInt(self);
return if (num > 1) @intToEnum(Self, num - 1) else null;
}
};
Now, we're almost ready for our actual DateTime container, but it's probably a good idea to give some thought to how we're going to handle time zones first.
TimeZones in Zig
Let's lay some groundwork. Our TimeZone is going to be expressed as either UTC, or as a positive or a negative offset from UTC. Whenever something can be one of several different things that says "I'm an enum!", and since we have tagged unions in Zig we can then encode the data and its meaning (the tags) right into the types. Notice that the HoursMinutes struct is not marked pub? We don't really want or need it cluttering up the public API.
pub const TimeZoneTag = enum(u1) {
utc,
offset,
};
pub const Sign = enum(u1) {
positive,
negative,
};
const HoursMinutes = struct {
hours: u4,
minutes: ?u6,
};
pub const Offset = union(Sign) {
positive: HoursMinutes,
negative: HoursMinutes,
};
That's pretty nice already, but we can add some logic into Offset which will again shorten our conversion functions later.
pub const Offset = union(Sign) {
positive: HoursMinutes,
negative: HoursMinutes,
const Self = @This();
pub fn new(hours: i8, minutes: ?u6) ?Self {
if (hours > 12 or hours < -12) {
return null;
}
if (minutes) |m| {
if (m > 59) return null;
if (hours == 0 and m == 0) return null;
} else if (hours == 0) return null;
if (hours < 0) {
const h = @intCast(u4, @as(i8, hours) * -1);
return Self{ .negative = .{ .hours = h, .minutes = minutes } };
} else {
return Self{ .positive = .{ .hours = @intCast(u4, hours), .minutes = minutes } };
}
}
pub fn asSeconds(self: Self) i64 {
return switch (self) {
.positive => |ofs| blk: {
var seconds = @as(i64, ofs.hours) * 3600;
if (ofs.minutes) |m| seconds += (@as(i64, m) * 60);
break :blk seconds;
},
.negative => |ofs| blk: {
var seconds = @as(i64, ofs.hours) * 3600;
if (ofs.minutes) |m| seconds += (@as(i64, m) * 60);
break :blk seconds * -1;
},
};
}
};
So now with the groundwork laid, we can go from those smaller types to the actual TimeZone container, a tagged union which is either .utc or .offset, where the .offset variant contains its data.
pub const TimeZone = union(TimeZoneTag) {
utc: void,
offset: Offset,
const Self = @This();
pub fn new(hours: ?i8, minutes: ?u6) ?Self {
// Treat a missing component as zero; a zero offset is simply UTC
const h = hours orelse 0;
const m = minutes orelse 0;
if (h == 0 and m == 0) return Self{ .utc = {} };
// Offset.new returns null for out-of-range values, so we never
// have to unwrap with `.?` and risk a panic on bad input
return if (Offset.new(h, minutes)) |ofs| Self{ .offset = ofs } else null;
}
pub fn format(
self: Self,
comptime fmt: []const u8,
options: std.fmt.FormatOptions,
writer: anytype,
) !void {
_ = fmt;
_ = options;
switch (self) {
.utc => try writer.writeAll("Z"),
.offset => |ofs| switch (ofs) {
.positive => |p| {
try writer.print("+{d:0>2}", .{p.hours});
if (p.minutes) |m| {
try writer.print(":{d:0>2}", .{m});
}
},
.negative => |n| {
try writer.print("-{d:0>2}", .{n.hours});
if (n.minutes) |m| {
try writer.print(":{d:0>2}", .{m});
}
},
},
}
}
};
Now with all of these smaller container types, which encode a lot of our program logic including bounds (aren't strongly typed languages great?), it's time to go for it and put it all together with days, hours, minutes and seconds and create our DateTime container.
DateTime in Zig
pub const DateTime = struct {
year: Year,
month: Month,
day: u8,
hour: ?u5,
minute: ?u6,
second: ?u6,
tz: TimeZone,
};
So yeah, we're actually using all of those types here. The previous work wasn't entirely in vain, and we now have a DateTime representation. Note that we're allowing for some variable precision here by making hour, minute and second optionals. But wait, didn't I say that there were going to be conversions to and from i64? Oh yes, I did all right. Let's tackle converting to a timestamp first. We're going to keep a running total of seconds and work through years, months, days, hours etc. The comments in the code should guide you through if it's not already clear.
fn toTimestampNaive(self: Self) i64 {
var seconds: i64 = 0;
// If the year is before 1970, we have to work
// backwards. For simplicity, we'll count *past* the date
// to the beginning of the year, then work forwards again.
if (self.year.get() < 1970) {
var year = Year.new(1970);
while (year.get() != self.year.get()) {
// get the previous year and subtract the total seconds in it
year = year.previous();
seconds -= year.seconds();
}
} else if (self.year.get() > 1970) {
var year = Year.new(1970);
// add all of this year's seconds, then move on to the next year
while (year.get() != self.year.get()) {
seconds += year.seconds();
year = year.next();
}
}
// Until we get to the current month, we'll advance through each
// month starting with January and add all of its seconds
var month = Month.january;
while (month != self.month) {
seconds += month.seconds(self.year);
month = month.next().?;
}
// The days begin numbering with 1, so on the 5th we have had four full
// days plus some remainder. So we take self.day - 1 for our calculation
seconds += @as(i64, self.day - 1) * SECONDS_PER_DAY;
if (self.hour) |h| {
seconds += @as(i64, h) * 3600; // 60 * 60 = 3600
}
if (self.minute) |m| {
seconds += @as(i64, m) * 60;
}
if (self.second) |s| {
seconds += s;
}
return seconds;
}
pub fn toTimestamp(self: Self) i64 {
var seconds = self.toTimestampNaive();
// if we have an offset apply it here
if (self.getOffset()) |ofs| seconds -= ofs.asSeconds();
return seconds;
}
NOTE: The toTimestampNaive function is going to be used later on when we need to calculate the day of the week. If we were to do this calculation after taking the TimeZone into account, then the result would be accurate for UTC but possibly inaccurate for local time.
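If the direction of the offset math in toTimestamp looks backwards, here is a small standalone sketch of the idea. The helper names are hypothetical and the offset is reduced to a plain count of seconds. A clock at UTC+05:30 reads five and a half hours ahead of UTC, so we subtract the offset from the naive local value to get back to UTC.

```zig
const std = @import("std");

/// Hypothetical helper: a UTC offset as a signed number of seconds.
fn offsetSeconds(negative: bool, hours: i64, minutes: i64) i64 {
    const magnitude = hours * 3600 + minutes * 60;
    return if (negative) -magnitude else magnitude;
}

/// Convert a naive (local wall clock) timestamp to a real UTC timestamp.
fn localToUtc(naive: i64, offset: i64) i64 {
    return naive - offset;
}

test "local wall clock time maps back to the epoch" {
    // 1970-01-01 05:30:00 at +05:30 is the epoch itself
    try std.testing.expect(localToUtc(19800, offsetSeconds(false, 5, 30)) == 0);
    // 1969-12-31 16:00:00 at -08:00 is also the epoch
    try std.testing.expect(localToUtc(-28800, offsetSeconds(true, 8, 0)) == 0);
}
```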
For our conversion going the other direction, I'm going to skip the part which deals with negative integers as those timestamps are all well in the past. Just know that for a real library it is something that would have to be dealt with, because we can't assume we're never going to see a date before 1970. We might.
This op is the reverse, so instead of starting with a var seconds = 0 we're going to start with the timestamp itself, then whittle it down until we get to our final DateTime representation.
pub fn fromTimestamp(ts: i64) Self {
if (ts < 0) {
// skipped for brevity
} else if (ts > 0) {
var seconds = ts;
// We start at 1970-01-01 00:00:00 and count forward
var year = Year.new(1970);
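// Until we reach a point where the number of seconds left is less
// than what is in the year, we take a year's worth of seconds off
// and advance through the years. The comparison is `<=` so that a
// timestamp landing exactly on January 1st rolls over to the new year.
while (year.seconds() <= seconds) {
seconds -= year.seconds();
year = year.next();
}
// Do the same with months as we did with years, but we can start
// with January. Again `<=`, so the first of a month rolls over
// instead of becoming day 32 of the month before.
var month = Month.january;
while (month.seconds(year) <= seconds) {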
seconds -= month.seconds(year);
month = month.next().?;
}
// In Zig, we actually care about the difference between remainder
// and modulo, so we're using `@divTrunc` and `@rem` compiler builtins
// to deal with the fact that we are technically doing math with signed
// integers, even though this code branch is all positive ints.
const day = @divTrunc(seconds, SECONDS_PER_DAY) + 1;
seconds = @rem(seconds, SECONDS_PER_DAY);
const hour = @divTrunc(seconds, SECONDS_PER_HOUR);
seconds = @rem(seconds, SECONDS_PER_HOUR);
const minute = @divTrunc(seconds, 60);
seconds = @rem(seconds, 60);
return Self{
.year = year,
.month = month,
.day = @intCast(u8, day),
.hour = @intCast(u5, hour),
.minute = @intCast(u6, minute),
.second = @intCast(u6, seconds),
.tz = .utc,
};
} else {
// If the timestamp is not less than zero or greater than zero, then
// by definition that means it's zero. So we return the Unix epoch.
return Self{
.year = Year.new(1970),
.month = .january,
.day = 1,
.hour = 0,
.minute = 0,
.second = 0,
.tz = .utc,
};
}
}
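For the negative branch that was skipped above, one possible starting point (a sketch, not how the full library does it) is to split the timestamp with floor division rather than truncating division. The struct and function names here are made up for the example. With @divFloor and @mod the seconds within the day always come out between 0 and 86399, even before 1970.

```zig
const std = @import("std");

const SECONDS_PER_DAY: i64 = 60 * 60 * 24;

/// Hypothetical result type: whole days relative to the epoch, plus
/// the seconds elapsed within that day.
const DaySplit = struct {
    days: i64,
    seconds_of_day: i64,
};

fn splitTimestamp(ts: i64) DaySplit {
    return DaySplit{
        .days = @divFloor(ts, SECONDS_PER_DAY),
        .seconds_of_day = @mod(ts, SECONDS_PER_DAY),
    };
}

test "one second before the epoch" {
    const split = splitTimestamp(-1);
    // 1969-12-31 23:59:59 is day -1, second 86399 of that day
    try std.testing.expect(split.days == -1);
    try std.testing.expect(split.seconds_of_day == 86399);
    // Truncating division would give day 0 and a remainder of -1 instead
    try std.testing.expect(@divTrunc(@as(i64, -1), SECONDS_PER_DAY) == 0);
    try std.testing.expect(@rem(@as(i64, -1), SECONDS_PER_DAY) == -1);
}
```

From there the day count can be walked backwards through years and months, mirroring the forward loops.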
Wrapping up
There are some other things that would be nice to have in a little code library such as this. We want to be able to compare DateTime instances, and we want to be able to send the entire struct into a format string and get something nice and human readable back. We might want to be able to create an instant using now(). And of course, we haven't dealt with days of the week yet.
For weekdays we're just going to create an enum. Actually getting the weekday is pretty easy since we know what day of the week the epoch fell on (Thursday). We just take our constant, SECONDS_PER_DAY, which is 60 * 60 * 24, and we divide the full timestamp by that number, which gives us the total days since the epoch. We then get the remainder of that number divided by 7.
pub const WeekDay = enum(u3) {
thursday = 0,
friday,
saturday,
sunday,
monday,
tuesday,
wednesday,
};
// further down, in the `DateTime` struct definition
pub fn weekday(self: Self) WeekDay {
// First convert to timestamp
const ts = self.toTimestampNaive();
// Now get the number of whole days, rounding toward negative infinity
// so that dates before the epoch land on the correct day
const days = @divFloor(ts, SECONDS_PER_DAY);
// By taking the modulus here (not the remainder, which can be negative)
// we'll get a number from 0-6, which we can then convert to a `WeekDay`
// using the `@intToEnum` Zig compiler builtin.
return @intToEnum(WeekDay, @intCast(u3, @mod(days, 7)));
}
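Here is a standalone sketch of the same weekday math that takes a raw timestamp instead of a DateTime, so it can be tried on its own. The free function is my own stand-in for the method above, and it uses a switch rather than an integer-to-enum builtin purely so that it doesn't depend on a particular compiler version.

```zig
const std = @import("std");

const WeekDay = enum(u3) {
    thursday = 0,
    friday,
    saturday,
    sunday,
    monday,
    tuesday,
    wednesday,
};

/// Hypothetical free-function version of the weekday calculation.
fn weekdayFromTimestamp(ts: i64) WeekDay {
    // Whole days since the epoch, rounded toward negative infinity
    const days = @divFloor(ts, 86400);
    // @mod always yields 0 through 6 here, even for negative day counts
    return switch (@mod(days, 7)) {
        0 => .thursday,
        1 => .friday,
        2 => .saturday,
        3 => .sunday,
        4 => .monday,
        5 => .tuesday,
        6 => .wednesday,
        else => unreachable,
    };
}

test "weekdays on either side of the epoch" {
    // The epoch itself was a Thursday
    try std.testing.expect(weekdayFromTimestamp(0) == .thursday);
    // One second earlier is still 1969-12-31, a Wednesday
    try std.testing.expect(weekdayFromTimestamp(-1) == .wednesday);
    // 2017-01-01 00:00:00 UTC was a Sunday
    try std.testing.expect(weekdayFromTimestamp(1483228800) == .sunday);
}
```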
As for comparison, Zig does not have operator overloading, so we're never going to be able to take two DateTime structs and compare them using the <, > or == operators. But we can do ordering and simulate the comparison operators a bit.
pub const Comparison = enum {
gt,
lt,
eq,
};
// further down, we want to compare two `DateTime` structs
pub fn compare(self: Self, other: Self) Comparison {
const a = self.toTimestamp();
const b = other.toTimestamp();
return if (a > b) .gt else if (a < b) .lt else .eq;
}
Note that since we apply any timezone offsets as part of the conversion to i64, we can just do a straight comparison here. A sorting algorithm for a collection of DateTime structs could be built up from this function. I'm leaving that for now as outside of the scope of this article.
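As an aside, Zig's standard library already ships an enum with the same three variants, std.math.Order, along with a std.math.order function that compares two numbers. A minimal sketch of leaning on that instead of a hand-written Comparison enum might look like the following, where compareTimestamps is a made-up name standing in for the compare method.

```zig
const std = @import("std");

/// Hypothetical stand-in for `compare`, working directly on timestamps.
fn compareTimestamps(a: i64, b: i64) std.math.Order {
    return std.math.order(a, b);
}

test "ordering timestamps with std.math.Order" {
    try std.testing.expect(compareTimestamps(0, 1483228800) == .lt);
    try std.testing.expect(compareTimestamps(0, -1) == .gt);
    try std.testing.expect(compareTimestamps(42, 42) == .eq);
}
```

Whether that is nicer than a local enum is mostly a matter of taste.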
For a now function, we can just call the Zig std.time.timestamp() function and parse that into a DateTime struct using our conversion method from earlier.
pub fn now() Self {
return Self.fromTimestamp(std.time.timestamp());
}
And finally, pretty printing in Zig format strings. Now, since we've already implemented the format function on some of our subtypes, we get to re-use that here.
pub fn format(
self: Self,
comptime fmt: []const u8,
options: std.fmt.FormatOptions,
writer: anytype,
) !void {
_ = fmt;
_ = options;
try writer.print("{s}-{d:0>2}-{d:0>2}", .{
self.year, @enumToInt(self.month), self.day,
});
if (self.hour) |h| {
try writer.print("T{d:0>2}", .{h});
if (self.minute) |m| {
try writer.print(":{d:0>2}", .{m});
if (self.second) |s| {
try writer.print(":{d:0>2}", .{s});
}
}
}
try writer.print("{s}", .{self.tz});
}
What else?
If you look closely at the format function you'll notice that its output is in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ). One could expand on this to provide formatting functions for all of the other various standards for time representation.
Something else that I left completely out of scope for this article was parsing a string into a DateTime struct. This is something that becomes a bit of a rat's nest when you consider all of the different date and time formats that are in common use. There's disagreement about whether the day or the month should be placed first, depending on what country you were born in. Some businesses use weeks rather than months, which gets complicated by the fact that the first of the year falls on a different weekday each year. There are ordinal dates, where we leave out months and weeks entirely and just count days. And there are a lot of different ways to represent time zones. Probably the best advice would be to limit the scope of any given library to a small subset of the different schemes in common use, lest you blow someone's binary size up.
Code for this article
The full code for this article sits on my Gitea instance here.
2024-01-29