During a months-long exercise to reorganize my computer files earlier this year, I cleaned up my internet bookmarks, which I have been collecting in the Chrome browser since around 2009. The bookmarks were imported in 2016 to Safari, which became my main browser.
The bookmarks in Chrome were merged into those in Safari after exporting both collections in the Netscape bookmark file format, which is easy to work with. An HTML parser was used to find all <a> tags for links and <h3> tags for folder titles.
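As a rough illustration of that parsing step, here is a minimal sketch using Python's standard `html.parser` module. It is not the script I actually used; the class name `BookmarkParser`, the helper `parse_bookmarks` and the sample file content are made up for the example, and it assumes each folder title is followed by its own `<DL>` list, as in typical browser exports.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect (folder path, title, url) rows from a Netscape bookmark file."""

    def __init__(self):
        super().__init__()
        self.folders = []    # stack of folder titles currently open
        self.bookmarks = []  # (folder path, link title, url)
        self._tag = None     # "a" or "h3" while inside that element
        self._text = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in ("a", "h3"):
            self._tag = tag
            self._text = []
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            text = "".join(self._text).strip()
            if tag == "h3":
                self.folders.append(text)  # a folder title opens a folder
            else:
                self.bookmarks.append(("/".join(self.folders), text, self._href))
            self._tag = None
        elif tag == "dl" and self.folders:
            self.folders.pop()  # the end of a list closes the current folder


def parse_bookmarks(html_text):
    parser = BookmarkParser()
    parser.feed(html_text)
    return parser.bookmarks


# Hypothetical sample in the Netscape bookmark file format.
SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
    <DT><H3>Math</H3>
    <DL><p>
        <DT><A HREF="https://example.com/integrals">Integral table</A>
    </DL><p>
    <DT><A HREF="https://example.org/">Unsorted link</A>
</DL><p>
"""

for folder, title, url in parse_bookmarks(SAMPLE):
    print(folder or "(top level)", "|", title, "|", url)
```

Keeping a stack of folder titles means the same sketch would also cope with nested legacy folders, which is what the old collection looked like before it was flattened.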
During the clean-up, thousands of unsorted bookmarks in my messy legacy folder structure were manually categorized into a newly devised flat structure of folders with no sub-folders. The process was sped up by searching by domain name. (Later, when adding a new bookmark in the browser, the designated folder can be easily selected by opening the drop-down menu and typing the folder name.)
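The search by domain name was done in the browser's bookmark manager, but the same idea can be sketched in Python with the standard `urllib.parse` module. The function name `group_by_domain` and the example URLs are hypothetical, and dropping a leading "www." is just one possible choice for treating hosts as the same site.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_domain(urls):
    """Group URLs by host name, ignoring a leading 'www.'."""
    groups = defaultdict(list)
    for url in urls:
        host = urlsplit(url).hostname or ""  # hostname is already lowercase
        if host.startswith("www."):
            host = host[len("www."):]
        groups[host].append(url)
    return dict(groups)


# Hypothetical unsorted bookmarks.
urls = [
    "https://www.example.com/a",
    "https://example.com/b",
    "https://news.example.org/item?id=1",
]
for host, links in group_by_domain(urls).items():
    print(host, len(links))
```

Links that share a host can then be moved into one folder in a single pass.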
After some quick manual categorization, a pie plot was generated to show the number of bookmarks in each folder.
Pie plot of frequency of bookmarks in each folder
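The numbers behind such a plot are just a count per folder. A minimal sketch, assuming a list with one folder name per bookmark (the function name `folder_shares` and the sample data are hypothetical, not my actual counts):

```python
from collections import Counter


def folder_shares(folder_of_each_bookmark):
    """Return (folder, count, percentage) rows, largest folder first."""
    counts = Counter(folder_of_each_bookmark)
    total = sum(counts.values())
    return [(folder, n, 100 * n / total) for folder, n in counts.most_common()]


# Hypothetical data: one folder name per bookmark.
folders = ["Articles", "Articles", "Articles", "Wikipedia"]
for folder, n, share in folder_shares(folders):
    print(f"{folder}: {n} ({share:.1f}%)")
```

The counts can then be handed to any plotting tool, for example the `pie` function in matplotlib.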
Limitation of analysis
Note that the figures are not necessarily an accurate representation of the relative importance of each of the folders for the following reasons.
- The categories that the folders represent are arbitrary in the size of their scope. Folders with lower counts of bookmarks may need merging.
- There are no unified criteria for whether a link should be included in the bookmarks.
Many examples of links
Nonetheless, the results serve as an inspiration for a write-up of some of my major Internet influences, especially those from a few years ago. While I try to include links that are representative of each folder, the following selection of over a hundred links is still quite random.
Articles includes blog articles, essays, technical references and reviews from diverse sources. Selected informative posts include Bloomberg on Hong Kong dollar peg (2020), Ars Technica on submarine cables infrastructure (2016), Priceonomics on the invention of Auto-Tune (2016), Economist on city-state Singapore (2015) and Paul Graham on hierarchy of disagreement (2008).
Wikipedia includes Wikipedia entries on various topics, out of a database whose bzip2-compressed download, including articles and metadata of media files, is currently over 17GB. There are a few Wikimedia links, such as a blood values reference chart for blood tests.
HKU includes materials on the university website collected during my undergraduate studies, including information on admission, accommodation, scholarship, exchange, courses, careers, sports, and a course mark to grade mapping table found by searching for the meaning of a course grade descriptor.
GitHub includes a lot of old GitHub repositories that I bookmarked, some GitHub gists and a few links to GitHub Docs including one on removing data permanently. There is a list of starred repositories in my user profile.
HKDSE includes resources for preparation of the public examination I took in 2016, including past and sample exam papers, blog articles on English language and revision of science concepts, and CASIO calculator programs which bring back nostalgia for my high school mathematics. Some of the resources are for other curricula, including IB sample exam papers and someone else's AP Chemistry notes which were prepared in a similar vein to the LaTeX notes I have for my studies.
.edu includes links mostly from universities and some from academic journals as well, including physics concepts, life science topics, calculus problems, computer science illustrations, pop song analysis, English text corpus, speech accent archive and Latin words in English.
Reddit includes posts on various subreddits on topics that include technology, questions and answers, and funny or interesting stuff in general. I do not visit the site as much as the rank of the folder in this list indicates.
Hacker News includes posts on a site operated and moderated by Y Combinator. Links are usually saved by up-voting the post on the site after logging in. It is a site that I visit almost daily. I tried posting a few years ago and got a rough idea of what the crowd likes. A technology background, a liberal mindset, critical thinking and meta-analysis are often expected in the comments, which vary in quality.
Server includes resources for hosting and networking, including an online probing tool by Hurricane Electric, a cheap hosting forum, nginx documentation, information security articles and various tutorials and blog articles.
Zhihu includes posts on the mainland Chinese social media site on miscellaneous topics, including technology, music, life and news. With posts of funny anecdotes that may or may not be made up, the site is more like Reddit than Quora. Some of the highly up-voted posts on politics and news events are fully in line, politically and ideologically, with narratives put forward by the ruling government. Nonetheless, some posts are well-researched and informative.
HK includes websites of Hong Kong banks and public services. Other links include Hong Kong building projects, a niche encyclopedia of Hong Kong virtual communities, a Hong Kong meteorological interest site and a CUHK Chinese lexicography tool.
YouTube includes YouTube videos of various types, including most disliked videos, unofficial music video, swimming video, talk video, documentary video, historical video and DIY video. Videos are usually saved by liking the video on the site after logging in instead of bookmarking, so the importance of the site is likely underrepresented in this list.
Stackexchange includes threads on various topics, including Unix, TeX, Apple, Security, Physics and Math.
Blogspot includes various blog articles on the Blogspot platform acquired by Google, including Google's own blogs and Chinese blogs on political news, philosophy and technology. Examples include a blog article list for programmers and a blog with a random collection of poems.
Media includes news and journalistic articles from various outlets, including BBC, The Guardian, Financial Times, New York Times, The Atlantic and Vox, which generally have a center to left bias on the political spectrum of news sources. This is consistent with the left-right position in my test results on The Political Compass.
ArchLinux includes rich resources on the Arch Linux website for its users, which are useful for general Linux users, including articles on systemd, chroot, grub, WireGuard, solid state drive, power management, NVIDIA card, Pacman tips, Arch Linux Archive, unofficial user binary repositories, booting Arch Linux on Mac, script for wakeonlan after suspend and a comparison of Linux distributions.
Hardware includes official product websites for various gadgets including printer, DSLR camera, solid state drive, memory sticks, SD card, portable hard drive, digital voice recorder, watch and other devices of various brands including Canon, Transcend, Toshiba, Western Digital, SanDisk and other less well-known brands. Other useful links include an LCD panel data sheet website and an article on choosing a GPU for deep learning.
StackOverflow includes questions and answers and code snippets for programming languages that I used, including C++, Python and Bash. Links to other sites in the Stack Exchange network, such as Server Fault and Super User, are included as well.
Apple includes links to Apple's official website for developer resources, iPhone model comparison, the check coverage tool, beta software program sign-up, iMessage deregistration and support articles on firmware password, screenshot shortcut, startup shortcut, changing display color profile and other resources. Other links include the MacRumors buyer's guide and summaries of rumored, beta or final releases of Apple hardware or software products such as iPhone and macOS.
Govhk includes websites on the .gov.hk domain of various governmental services and resources including live weather forecast, e-legislation, public library catalog, public pools, public ferries, contagious diseases, student finance, general holidays and railway survey map.
Math includes articles on Wolfram MathWorld, MathIsFun, Purplemath, math blog, math computing and macOS Grapher application. Other links include an integral table.
TW includes links on the .tw top-level domain, including various Taiwanese technology blogs and news articles.
OpenWrt includes links to the OpenWrt project website and other related resources, including building OpenWrt from source, WireGuard setup and a power consumption database of routers and other appliances.
Australia includes various resources on visiting and living in Australia, including an immigration document checklist tool, the tourist tax refund scheme, city climate records, telephone area codes and mobile network information.
Other notes
My bookmarks before 2009 were most likely lost as I replaced the old Windows desktop computer that I used when I was a child. An interesting example from my recollection is McDonald's Video Game, which is a satirical game that I used to play when Adobe Flash games were popular.
The term bookmark for an internet URL carries the metaphor that the web is a book and a webpage is a page of that book. I read a lot of articles on different topics in the form of web pages. Does that mean I read a lot of books? Does the term book mean only formally published books?
As a semi-related illustration, the following photograph shows the cover of a published book I read years ago.
Me holding a copy of the book Animal Farm by George Orwell off the bookshelf in a co-working office in Shanghai, China (2018) (f/3.625, 1/83, ISO 400)
- Jan 2021 Added original photograph of books.
- Nov 2020 Added paragraph on older bookmarks.
- Nov 2020 Improved some phrasings.
- Oct 2020 Added a few more links from bookmarks.