Analysis of email metadata #
Emails in my undergraduate university mailbox are analyzed, and a calendar plot is generated for time series analysis of email frequency.
Mail drafts are excluded by skipping any scanned directory whose path contains "Drafts.mbox", which filters out hundreds of draft emails that were discarded but never deleted by the application.
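The draft filter described above could look roughly like the following. This is a minimal sketch rather than the script used for the plots: the function names and the `.emlx` file suffix are assumptions for illustration, and only the "Drafts.mbox" check comes from the description.

```python
from pathlib import Path


def is_draft_path(path: Path) -> bool:
    """Return True if any component of the path is the Drafts mailbox."""
    return "Drafts.mbox" in path.parts


def iter_message_files(root: Path, suffix: str = ".emlx"):
    """Yield message files under root, skipping anything inside Drafts.mbox.

    The ".emlx" suffix is an assumption about how messages are stored on disk.
    """
    for path in sorted(root.rglob(f"*{suffix}")):
        if path.is_file() and not is_draft_path(path):
            yield path
```

Checking path components, rather than searching the whole path string, avoids accidentally dropping a mailbox whose name merely contains the word "Drafts".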
The timezone of the date and time parsed from the "Date" field is localized to Asia/Hong_Kong to produce meaningful plots.
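As an illustration of the localization step, the sketch below parses a "Date" header and converts it to Hong Kong time using only the Python standard library. The function name `to_hong_kong`, the UTC assumption for headers without a usable offset, and the fixed UTC+8 fallback are my assumptions, not necessarily what the original script does.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    HONG_KONG = ZoneInfo("Asia/Hong_Kong")
except ZoneInfoNotFoundError:
    # Assumption: fall back to a fixed UTC+8 offset if no tz database is installed.
    HONG_KONG = timezone(timedelta(hours=8), "HKT")


def to_hong_kong(date_header: str) -> datetime:
    """Parse an RFC 2822 "Date" header and convert it to Hong Kong time."""
    dt = parsedate_to_datetime(date_header)
    if dt.tzinfo is None:
        # A "-0000" offset yields a naive datetime; it is treated as UTC here.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(HONG_KONG)
```

For example, a header of `Mon, 03 Sep 2018 23:30:00 +0000` falls on the morning of 4 September in Hong Kong, which matters for both the calendar plot and the hour-of-day plot.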
Note that mass emails from student union associations, to which student email addresses are subscribed by default, had exceeded 10,000 over the years; I deleted them and unsubscribed in an earlier attempt to fix a mysterious episode of lagging and crashing in the Mail application.
Calendar plot of frequency of emails #
While the frequency of emails on any given day is dominated by the number of received emails, which include many mass emails, it appears to be an indirect metric of my activities at the university.
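The data behind a calendar plot is simply a count of emails per calendar day. The sketch below shows one way to build it from localized timestamps. The helper names are hypothetical, and the plotting call itself is left out since it depends on the plotting library.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def daily_counts(timestamps) -> Counter:
    """Count emails per calendar day from an iterable of datetimes."""
    return Counter(ts.date() for ts in timestamps)


def fill_missing_days(counts) -> dict:
    """Return one entry per day from the first to the last date, with zeros filled in."""
    if not counts:
        return {}
    start, end = min(counts), max(counts)
    days = (start + timedelta(days=i) for i in range((end - start).days + 1))
    return {day: counts.get(day, 0) for day in days}
```

Filling in zero-count days keeps quiet periods, such as the summers, visible as empty cells instead of gaps.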
There is not much activity in the summers compared to the fall and spring semesters. For my four summers during the undergraduate program, I did a short-term exchange summer course, a career exploration program including company visits, a local research internship for a required degree component, and a final summer course to make up credits. None of these summer activities involved many university emails, and none lasted the whole summer period from June to August.
The fall semester of 2018 was a hectic one, in which I took a normal full load of five courses, each either a Computer Science major requirement or a free elective. Other emails in that semester were most likely related to notices of residential hall activities, arrangements for various sports classes, and announcements for a course in which I was a student teaching assistant, all of which were new to me.
The spring semester of 2019 was a quiet one, in which I went on exchange studies, or rather a months-long trip away from home. The activity level can be viewed as a baseline or background noise level due to mass emails. It is similar to that of spring 2018, correctly suggesting that it was a semester with a relatively low workload.
The spring semester of 2020 was my last and was filled with announcements of special arrangements and online lecture sessions or recordings due to the novel coronavirus disease, later known as COVID-19. It can be observed from the calendar plot that the semester was shifted by close to a month, because earlier in the year the university administration had false hopes of being able to delay face-to-face classes and resume them later.
The red bar plot shows the frequency of emails by day of week. The count on a weekend day is less than half of that on a weekday. This observation is unsurprising since (i) the university mass email service runs only on weekdays; and (ii) university officers and some professors do not post announcements or reply to emails on Saturdays, and even less so on Sundays.
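The weekend versus weekday comparison can be checked with a small aggregation like the one below. It is a sketch with hypothetical function names and assumes the timestamps have already been converted to local time.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def counts_by_weekday(timestamps) -> dict:
    """Count emails per day of week (Monday first)."""
    counts = Counter(ts.weekday() for ts in timestamps)
    return {DAY_NAMES[i]: counts.get(i, 0) for i in range(7)}


def weekend_to_weekday_ratio(by_day: dict) -> float:
    """Average count on a weekend day divided by the average on a weekday."""
    weekday_mean = sum(by_day[d] for d in DAY_NAMES[:5]) / 5
    weekend_mean = sum(by_day[d] for d in DAY_NAMES[5:]) / 2
    return weekend_mean / weekday_mean if weekday_mean else float("nan")
```

A ratio below 0.5 corresponds to the statement that a weekend day sees less than half the emails of a weekday.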
The green bar plot shows the frequency of emails by hour of day.
The peak at 12am (hour 0) can be attributed to the numerous confirmation receipts for assignments I submitted close to the deadline.
Afterwards, there is generally low activity during sleeping hours. The exception is a burst of emails from 3am to 6am (hours 3, 4 and 5), during which the university mass email system sends out emails.
Most other emails, usually sent by humans, are delivered during the daytime, starting from 9am (hour 9) when the work day starts, with a dip during the lunch break for university office staff at 1pm (hour 13), and ending at 6pm (hour 18).
The frequency of emails approximately halves after office hours, with a slight gradual increase towards midnight.
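The hour-of-day plot amounts to a 24-bin histogram. The sketch below, with hypothetical helper names, also shows how the office-hours and after-hours levels could be compared. Treating hours 9 to 17 as office hours is an assumption based on the description above.

```python
from datetime import datetime


def counts_by_hour(timestamps) -> list:
    """Return a 24-element list where index h is the number of emails in hour h."""
    histogram = [0] * 24
    for ts in timestamps:
        histogram[ts.hour] += 1
    return histogram


def mean_per_hour(histogram, hours) -> float:
    """Average count over the given hours, e.g. range(9, 18) for office hours."""
    hours = list(hours)
    return sum(histogram[h] for h in hours) / len(hours)
```

Comparing `mean_per_hour(hist, range(9, 18))` with `mean_per_hour(hist, range(18, 24))` gives a rough check of how much the volume drops after office hours.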
The pie plot shows a breakdown of the domain names of all sender email addresses by frequency.
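A domain breakdown of this kind can be derived from the "From" headers as sketched below. The function names and the grouping of less frequent domains into an "other" slice are assumptions for illustration.

```python
from collections import Counter
from email.utils import parseaddr


def sender_domain(from_header: str):
    """Extract the lowercased domain from a "From" header, or None if absent."""
    _, address = parseaddr(from_header)
    if "@" not in address:
        return None
    return address.rsplit("@", 1)[1].lower()


def domain_breakdown(from_headers, top_n: int = 5) -> list:
    """Return (domain, count) pairs for the top domains plus an "other" bucket."""
    counts = Counter(
        domain for header in from_headers if (domain := sender_domain(header))
    )
    top = counts.most_common(top_n)
    other = sum(counts.values()) - sum(count for _, count in top)
    if other:
        top.append(("other", other))
    return top
```

Lowercasing the domain avoids splitting the same sender domain into several slices when the capitalization differs between emails.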
Other notes #
Plots for the most frequent "From" and "Subject" fields and plots with data for my other personal email accounts are not posted.
Now is a good time for this analysis, since I have just completed my undergraduate studies and fulfilled the graduation requirements.
As a side note, my strategy for handling emails had been to receive notifications for all emails except those automatically marked as read by filter rules, supplemented by manually unsubscribing from marketing materials. Recently, I wrote AppleScripts for use with the native Mail application that create Gmail filter rules through the Gmail API, with actions on matching emails including mark as read, archive and delete.
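To give an idea of what such a filter rule contains, the sketch below builds a request body in the shape of the Gmail API filter resource, with a `criteria` part and an `action` part. It is written in Python rather than AppleScript, the function name is hypothetical, and the mapping of actions to the system labels UNREAD, INBOX and TRASH reflects my understanding of the API, not the original scripts.

```python
def build_filter(sender: str, mark_read: bool = False,
                 archive: bool = False, delete: bool = False) -> dict:
    """Build a Gmail filter body that matches a sender and applies actions.

    Assumed label semantics: removing UNREAD marks as read, removing INBOX
    archives, and adding TRASH deletes.
    """
    action = {}
    remove = []
    if mark_read:
        remove.append("UNREAD")
    if archive:
        remove.append("INBOX")
    if remove:
        action["removeLabelIds"] = remove
    if delete:
        action["addLabelIds"] = ["TRASH"]
    return {"criteria": {"from": sender}, "action": action}
```

A body like this would then be submitted to the filter creation endpoint of the Gmail API by an authenticated client, which is omitted here.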
My room during a summer stay at a university residential hall (2018) (f/3.625, 1/8, ISO 400)
The view outside the windows of a university residential hall room (2018) (f/4, 1/4, ISO 200)
A one-second long-exposure photo of a pedestrian path on the main road where the university hall is located (2018) (f/4.375, 1/1, ISO 200)
- Jan 2021 Added photographs taken around the university hall in Hong Kong.
- Jan 2021 Updated calendar plot with a new version that shows blinking numbers in grid cells in animation.
- Dec 2020 Updated calendar plot using an updated calplot that draws month-separating lines.
- Oct 2020 Added note on mail filtering.
- Oct 2020 Re-organized content in bullet points.