Working with Time Zones & Daylight Saving Time in pandas 🕑
In most of the USA (plus a few other places in North America), Daylight Saving Time began on Sunday, March 12, 2023 at 2:00am local time.
So what is Daylight Saving Time, why should you care about it, and how is it handled by pandas? Let’s find out!
To start, we need to create some example data. We’ll use the date_range function to create 6 times starting on March 12 at 4:00am with an hourly frequency (abbreviated as "H"), and then convert it to a pandas Series:
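Here's a minimal sketch of that step. The variable names (times, ts) are just placeholders chosen for this example, and the lowercase "h" alias is used because recent pandas versions prefer it over "H":

```python
import pandas as pd

# Six hourly timestamps starting March 12, 2023 at 4:00am.
# No time zone is attached at this point.
times = pd.date_range(start="2023-03-12 04:00", periods=6, freq="h")

# Convert the DatetimeIndex to a Series
ts = pd.Series(times)
print(ts)
```

The Series has a dtype of datetime64[ns], with no time zone information anywhere in the output.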
You might notice that nowhere in the data is the time zone specified! This is known as "timezone-naive" data.
If you were collecting sales data for a local coffee shop, using timezone-naive data would likely be fine since it’s all from the same location and it’s never being collected overnight.
But if you were collecting rainfall data across a continent, it would be critical to specify the time zone of your data!
Localizing to UTC
To specify the time zone for our existing Series, we’ll use the tz_localize method and set it to "UTC":
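A sketch of what this might look like, rebuilding the example Series so the snippet runs on its own. Since the datetimes are the values of the Series (not its index), tz_localize is reached through the .dt accessor here. The names ts and ts_utc are placeholders:

```python
import pandas as pd

# Rebuild the timezone-naive example Series
ts = pd.Series(pd.date_range(start="2023-03-12 04:00", periods=6, freq="h"))

# Attach a time zone without changing the clock times
ts_utc = ts.dt.tz_localize("UTC")
print(ts_utc)
```

Localizing doesn't shift any of the values. It only declares which time zone the existing clock times belong to.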
UTC isn’t actually a time zone. Rather, it’s the time standard on which all time zones worldwide are based. UTC doesn’t change based on Daylight Saving Time, which is why it’s often used internally for data storage.
Our new Series is considered "timezone-aware" data, which is why "+00:00" has been appended to all of the times. That’s called the "UTC offset", which is the difference between a given time and UTC. But since we’ve set the time zone to UTC, the offset is always zero.
Converting to US Eastern Time
To convert our Series to US Eastern Time (which is identified as "America/New_York" in the time zone database), we’ll use the tz_convert method:
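One way this could look, again as a self-contained sketch with placeholder variable names:

```python
import pandas as pd

# Timezone-aware example Series in UTC
ts = pd.Series(pd.date_range(start="2023-03-12 04:00", periods=6, freq="h"))
ts_utc = ts.dt.tz_localize("UTC")

# Express the same instants as US Eastern local times
ts_eastern = ts_utc.dt.tz_convert("America/New_York")
print(ts_eastern)
```

Unlike tz_localize, tz_convert does change the displayed clock times, because it translates each instant from one time zone to another.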
Notice that the first three times have an offset of -05:00, and the last three times have an offset of -04:00.
That’s because on March 12 at 2:00am (when Daylight Saving Time started), the US Eastern Time Zone shifted from Eastern Standard Time (known as "EST" or "UTC-5") to Eastern Daylight Time (known as "EDT" or "UTC-4").
Thus, the clocks jump straight from 1:59am to 3:00am, and no local time from 2:00am through 2:59am exists in US Eastern Time on March 12, 2023.
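To see what that gap means in practice, here's a small sketch that tries to localize a time inside the skipped hour. By default pandas raises an error for a nonexistent local time, so this example uses the nonexistent parameter of tz_localize to choose a different behavior. The 2:30am timestamp and the variable names are made up for illustration:

```python
import pandas as pd

# 2:30am on March 12, 2023 falls inside the skipped hour in US Eastern Time
naive = pd.Timestamp("2023-03-12 02:30")

# Option 1: shift forward to the first valid local time
shifted = naive.tz_localize("America/New_York", nonexistent="shift_forward")
print(shifted)  # 2023-03-12 03:00:00-04:00

# Option 2: treat the nonexistent time as missing
missing = naive.tz_localize("America/New_York", nonexistent="NaT")
print(missing)  # NaT
```

Which option makes sense depends on your data. If a nonexistent time signals a data entry problem, letting pandas raise the error may be the better choice.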
When Daylight Saving Time ends (in the US) on November 5, 2023, there will actually be two instances of 1:00am local time:
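Here's a sketch of that, using four hourly UTC times around the November transition. The start time and variable names are chosen just for this example:

```python
import pandas as pd

# Four hourly UTC timestamps spanning the end of DST on November 5, 2023
utc_times = pd.Series(
    pd.date_range(start="2023-11-05 04:00", periods=4, freq="h", tz="UTC")
)

# Convert to US Eastern Time: 1:00am shows up twice with different offsets
eastern = utc_times.dt.tz_convert("America/New_York")
print(eastern)
```

The local times are 12:00am, 1:00am, 1:00am, and 2:00am. The first 1:00am has an offset of -04:00 (EDT) and the second has an offset of -05:00 (EST), which is how the two can be told apart.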
Thus, under current US rules, US Eastern Time is 4 hours behind UTC from mid-March to early November, and 5 hours behind UTC from early November to mid-March.
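As a quick check of those two offsets, this sketch compares one summer date and one winter date in 2023. The specific dates are arbitrary picks for illustration:

```python
import pandas as pd

# One timestamp during DST and one outside of it
summer = pd.Timestamp("2023-07-01 12:00", tz="America/New_York")
winter = pd.Timestamp("2023-12-01 12:00", tz="America/New_York")

# utcoffset() returns the difference from UTC as a timedelta
summer_hours = summer.utcoffset().total_seconds() / 3600
winter_hours = winter.utcoffset().total_seconds() / 3600
print(summer_hours, winter_hours)  # -4.0 -5.0
```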
Keep in mind that only some countries observe Daylight Saving Time, and those that do don’t all start and end DST on the same dates. 🤦‍♂️
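For example, here's a sketch that looks at a single moment on March 20, 2023 in three time zones. New York had already switched to DST by then, London hadn't yet (the UK switched on March 26), and Phoenix doesn't observe DST at all. The choice of date and zones is purely for illustration:

```python
import pandas as pd

# One instant, viewed from three different time zones
moment = pd.Timestamp("2023-03-20 12:00", tz="UTC")

new_york = moment.tz_convert("America/New_York")  # already on DST
london = moment.tz_convert("Europe/London")       # not yet on DST
phoenix = moment.tz_convert("America/Phoenix")    # never on DST

for local in (new_york, london, phoenix):
    print(local)
```

New York shows an offset of -04:00, London shows +00:00, and Phoenix shows -07:00, which is also its offset in the middle of summer.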
As such, we can be grateful that pandas handles DST automatically... all thanks to the one guy in California who maintains the IANA time zone database used by basically every computer system in the world!
If you work with datetime data in pandas, hopefully this has given you some insights about how to work with time zones. (Here’s the code from this post, which you can play around with!)
Otherwise, I hope this has at least given you a useful introduction to UTC, time zones, and Daylight Saving Time!