<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>In Pursuit of Simplicity</title>
    <link>https://ramsayleung.github.io/en/</link>
    <description>Recent content on In Pursuit of Simplicity</description>
    <image>
      <title>In Pursuit of Simplicity</title>
      <url>https://ramsayleung.github.io/%3Clink%20or%20path%20of%20image%20for%20opengraph,%20twitter-cards%3E</url>
      <link>https://ramsayleung.github.io/%3Clink%20or%20path%20of%20image%20for%20opengraph,%20twitter-cards%3E</link>
    </image>
    <generator>Hugo -- 0.146.7</generator>
    <language>en</language>
    <copyright>See this site&#8217;s source code here, licensed under GPLv3</copyright>
    <lastBuildDate>Tue, 06 Jan 2026 20:26:33 -0800</lastBuildDate>
    <atom:link href="https://ramsayleung.github.io/en/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>A Genius AI Product Design: Reddit Translation</title>
      <link>https://ramsayleung.github.io/en/post/2026/a_genius_ai_product_design_reddit_translation/</link>
      <pubDate>Tue, 06 Jan 2026 19:46:00 -0800</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2026/a_genius_ai_product_design_reddit_translation/</guid>
      <description>&lt;h2 id=&#34;the-universal-ai-craze&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; The Universal AI Craze&lt;/h2&gt;
&lt;p&gt;Since OpenAI released ChatGPT, there has been a massive rush to jump on the AI bandwagon. Over the past two years, it seems every website and application has been desperate to embrace AI.&lt;/p&gt;
&lt;p&gt;Music apps have AI: Spotify &lt;a href=&#34;https://newsroom.spotify.com/2024-04-07/spotify-premium-users-can-now-turn-any-idea-into-a-personalized-playlist-with-ai-playlist-in-beta/&#34;&gt;launched AI playlists&lt;/a&gt; &lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;; the programming Q&amp;amp;A site Stack Overflow introduced &lt;a href=&#34;https://stackoverflow.com/ai-assist&#34;&gt;AI Assist&lt;/a&gt; &lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;; browsers are integrating AI (Firefox, &lt;a href=&#34;https://www.google.com/chrome/ai-innovations/&#34;&gt;Chrome&lt;/a&gt; &lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;); note-taking apps like Notion have built &lt;a href=&#34;https://www.notion.com/product/ai&#34;&gt;AI workflows&lt;/a&gt; &lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt;; code hosting platforms like GitHub are focusing on AI agents like Copilot; the ebook management software Calibre added an &amp;ldquo;&lt;a href=&#34;https://calibre-ebook.com/whats-new&#34;&gt;Asking AI&lt;/a&gt;&amp;rdquo; &lt;sup id=&#34;fnref:5&#34;&gt;&lt;a href=&#34;#fn:5&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;5&lt;/a&gt;&lt;/sup&gt; feature. Perhaps the most absurd example is Razer, a company that makes computer electronics, which has also released &lt;a href=&#34;https://www.razer.com/ca-en/concepts/project-ava&#34;&gt;Project AVA&lt;/a&gt; &lt;sup id=&#34;fnref:6&#34;&gt;&lt;a href=&#34;#fn:6&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;6&lt;/a&gt;&lt;/sup&gt;, a 24/7 AI companion designed to live right alongside you.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="the-universal-ai-craze"><span class="section-num">1</span> The Universal AI Craze</h2>
<p>Since OpenAI released ChatGPT, there has been a massive rush to jump on the AI bandwagon. Over the past two years, it seems every website and application has been desperate to embrace AI.</p>
<p>Music apps have AI: Spotify <a href="https://newsroom.spotify.com/2024-04-07/spotify-premium-users-can-now-turn-any-idea-into-a-personalized-playlist-with-ai-playlist-in-beta/">launched AI playlists</a> <sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>; the programming Q&amp;A site Stack Overflow introduced <a href="https://stackoverflow.com/ai-assist">AI Assist</a> <sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>; browsers are integrating AI (Firefox, <a href="https://www.google.com/chrome/ai-innovations/">Chrome</a> <sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>); note-taking apps like Notion have built <a href="https://www.notion.com/product/ai">AI workflows</a> <sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>; code hosting platforms like GitHub are focusing on AI agents like Copilot; the ebook management software Calibre added an &ldquo;<a href="https://calibre-ebook.com/whats-new">Asking AI</a>&rdquo; <sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> feature. Perhaps the most absurd example is Razer, a company that makes computer electronics, which has also released <a href="https://www.razer.com/ca-en/concepts/project-ava">Project AVA</a> <sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>, a 24/7 AI companion designed to live right alongside you.</p>
<p>Everyone truly wants to be &ldquo;AI-native&rdquo;.</p>
<p>It would be good if these implementations were useful, but a pile of websites and apps are just shoehorning AI into their products. These so-called AI features are often nothing more than a generic chatbox that lets you talk to an AI, often without even feeding the current page content as context. The user experience is terrible.</p>
<h2 id="reddit-s-ai-translation"><span class="section-num">2</span> Reddit&rsquo;s AI Translation</h2>
<p>Today I saw an <a href="https://www.theguardian.com/technology/2026/jan/03/reddit-overtakes-tiktok-uk-search-algorithms-gen-z">article</a> <sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup> stating that Reddit has overtaken TikTok to become the fourth most visited social media platform in the UK. The platform has undergone huge growth over the last two years, with an 88% increase in the proportion of UK internet users it reaches.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-dd7d6" hidden>
    <label for="zoomCheck-dd7d6">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/reddit_overtake_tiktok.jpg"/> 
    
    
    </label>
</figure>

<p>A series of factors are behind its rise. However, a change in Google&rsquo;s search algorithms last year to prioritise helpful content from discussion forums appears to have been a significant driver.
In the AI era, users are increasingly turning towards human-written content, and Reddit is benefiting from this trend.</p>
<p>This article reminded me of a new feature Reddit recently launched. Reddit is using Large Language Models (LLMs) to translate its massive archive of posts into multiple languages, for example English posts into Chinese, Korean, and Japanese.</p>
<p>Now, when I search in Chinese on Google (I was searching for &ldquo;Berserk TV&rdquo;), the search results display the translated versions of the original Reddit posts. When I click on a result, it jumps directly to the translated Chinese version on Reddit:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-cba3f" hidden>
    <label for="zoomCheck-cba3f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/berserk_tv_in_chinese.jpg"/> 
    
    
    </label>
</figure>

<p>I can read the translated version directly:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-dcf9d" hidden>
    <label for="zoomCheck-dcf9d">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/berserk_post_tl_zh_hans.jpg"/> 
    
    
    </label>
</figure>

<p>Or click &ldquo;Show Original&rdquo; to read the source text.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-8aca7" hidden>
    <label for="zoomCheck-8aca7">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/berserk_post_show_original.jpg"/> 
    
    
    </label>
</figure>

<h2 id="genius-product-design"><span class="section-num">3</span> Genius Product Design</h2>
<p>At first glance, it seems unremarkable, but upon closer inspection, it is absolutely brilliant. It might be the most solid AI implementation I&rsquo;ve seen to date.</p>
<h3 id="the-reddit-perspective"><span class="section-num">3.1</span> The Reddit Perspective</h3>
<p>First, this feature allows more users to read posts in different languages. For example, Chinese or Japanese users can read posts originally written in English, French, or Spanish. This significantly increases the universality of Reddit posts. Language is no longer a barrier, which helps Reddit capture user traffic in the AI era and attract new users.</p>
<p>More users mean more content, and richer content attracts even more users. In the AI era, content is the raw material for &ldquo;training&rdquo; AI models. You can&rsquo;t train an AI with just NVIDIA GPUs; you need data.</p>
<p>Compared to external translation tools like Google Translate, this is a seamless, page-level translation that search engines can index. The quality is incredibly high. It is so natural and well-integrated that, at first use, it&rsquo;s hard to even detect that it&rsquo;s the output of machine translation.</p>
<p>The user experience is excellent. Its natural and practical nature perfectly solves the core pain point for non-native speakers accessing high-quality posts.</p>
<p>Crucially, as a user, I was completely unaware of the existence of &ldquo;AI.&rdquo;</p>
<p>It wasn&rsquo;t until I switched to an engineer&rsquo;s perspective that I realized Reddit used an LLM to translate the posts, and then had Google&rsquo;s search engine index the different language versions, displaying the translated results directly in Google Search.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-cba3f" hidden>
    <label for="zoomCheck-cba3f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/berserk_tv_in_chinese.jpg"/> 
    
    
    </label>
</figure>

<p>The best AI features are the ones where you don&rsquo;t even realize AI is there.</p>
<h3 id="the-google-perspective"><span class="section-num">3.2</span> The Google Perspective</h3>
<p>For Google, since the vast majority of Reddit posts are written by humans (Reddit mods generally hate bots or AI posting, and many subreddit rules explicitly ban AI-generated content under threat of mutes or bans), increasing the ranking weight of forum content ensures search quality.</p>
<p>Indexing posts in different languages provides Google with more high-quality search data.</p>
<p>So, Google gets to index a massive amount of fresh, high-quality, multilingual human content while improving search result quality, which aligns perfectly with their algorithm adjustment direction.</p>
<h3 id="the-flywheel-of-data-models-and-ecosystem"><span class="section-num">3.3</span> The Flywheel of Data, Models, and Ecosystem</h3>
<p>By breaking down language barriers for users and enriching search results for Google, Reddit has kickstarted a grander positive feedback loop.</p>
<p>In February 2024,
Reddit and Google announced an <a href="https://blog.google/inside-google/company-announcements/expanded-reddit-partnership/">expanded partnership</a> <sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> that includes a multi-million-dollar data licensing agreement allowing Google to use Reddit&rsquo;s content to train its artificial intelligence (AI) models.
This deal is reportedly worth about $60 million per year.</p>
<p>In September 2025, reports indicated that Reddit was in talks with Google and OpenAI about possible new deals.</p>
<p>Therefore, this AI translation design is a powerful tool, achieving a &ldquo;1+1&gt;3&rdquo; effect. It even serves as leverage to negotiate with other large model vendors like OpenAI and Anthropic, because you can&rsquo;t train AI without data.</p>
<p>It forms a flywheel: More Users -&gt; More Content -&gt; Better AI Data -&gt; Better Translation/Search/AI Models -&gt; More Users.</p>
<p>This is a design that benefits the users, Reddit, Google, and even the entire AI field. It can truly be called a &ldquo;quadruple win.&rdquo;</p>
<h2 id="conclusion"><span class="section-num">4</span> Conclusion</h2>
<p>Successful AI implementation makes the technology invisible while making the user value prominent.</p>
<p>Compared to other websites that shoehorn AI into their products, Reddit&rsquo;s translation integrates AI seamlessly into the core product experience. It doesn&rsquo;t even tell the user that this is achieved via AI; it doesn&rsquo;t push the AI narrative.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://newsroom.spotify.com/2024-04-07/spotify-premium-users-can-now-turn-any-idea-into-a-personalized-playlist-with-ai-playlist-in-beta/">https://newsroom.spotify.com/2024-04-07/spotify-premium-users-can-now-turn-any-idea-into-a-personalized-playlist-with-ai-playlist-in-beta/</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><a href="https://stackoverflow.com/ai-assist">https://stackoverflow.com/ai-assist</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://www.google.com/chrome/ai-innovations/">https://www.google.com/chrome/ai-innovations/</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p><a href="https://www.notion.com/product/ai">https://www.notion.com/product/ai</a>&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p><a href="https://calibre-ebook.com/whats-new">https://calibre-ebook.com/whats-new</a>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p><a href="https://www.razer.com/ca-en/concepts/project-ava">https://www.razer.com/ca-en/concepts/project-ava</a>&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p><a href="https://www.theguardian.com/technology/2026/jan/03/reddit-overtakes-tiktok-uk-search-algorithms-gen-z">https://www.theguardian.com/technology/2026/jan/03/reddit-overtakes-tiktok-uk-search-algorithms-gen-z</a>&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p><a href="https://blog.google/inside-google/company-announcements/expanded-reddit-partnership/">https://blog.google/inside-google/company-announcements/expanded-reddit-partnership/</a>&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>The Essence of Prompt Engineering is the Art of Asking Questions</title>
      <link>https://ramsayleung.github.io/en/post/2025/the-essence-of-prompt-engineering-is-the-art-of-asking-questions/</link>
      <pubDate>Sat, 25 Oct 2025 20:18:00 +0800</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2025/the-essence-of-prompt-engineering-is-the-art-of-asking-questions/</guid>
      <description>&lt;h2 id=&#34;introduction&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; Introduction&lt;/h2&gt;
&lt;p&gt;In today&amp;rsquo;s AI-driven world, &amp;ldquo;Prompt Engineer&amp;rdquo; has become a buzzword.&lt;/p&gt;
&lt;p&gt;AI enthusiasts are eager to share prompts, study token control, and tweak temperature parameters.&lt;/p&gt;
&lt;p&gt;The classic meme from Linux founder Linus Torvalds&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Talk is cheap. Show me the code.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;has evolved into:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Code is cheap. Show me the prompts.&lt;/p&gt;&lt;/blockquote&gt;

&lt;figure&gt;
    
    
    &lt;input type=&#34;checkbox&#34; id=&#34;zoomCheck-f61ab&#34; hidden&gt;
    &lt;label for=&#34;zoomCheck-f61ab&#34;&gt;
    
    
    &lt;img class=&#34;zoomCheck&#34; loading=&#34;lazy&#34; src=&#34;https://ramsayleung.github.io/ox-hugo/talk_is_cheap_show_me_the_prompt.jpg&#34;/&gt; 
    
    
    &lt;/label&gt;
&lt;/figure&gt;

&lt;p&gt;Original tweet: &lt;a href=&#34;https://x.com/tunguz/status/1856045530951917763?lang=en&#34;&gt;https://x.com/tunguz/status/1856045530951917763?lang=en&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;But the core of effective AI interaction isn&amp;rsquo;t about technical jargon—it&amp;rsquo;s about &lt;strong&gt;&lt;strong&gt;how to ask questions effectively&lt;/strong&gt;&lt;/strong&gt;.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="introduction"><span class="section-num">1</span> Introduction</h2>
<p>In today&rsquo;s AI-driven world, &ldquo;Prompt Engineer&rdquo; has become a buzzword.</p>
<p>AI enthusiasts are eager to share prompts, study token control, and tweak temperature parameters.</p>
<p>The classic meme from Linux founder Linus Torvalds</p>
<blockquote>
<p>Talk is cheap. Show me the code.</p></blockquote>
<p>has evolved into:</p>
<blockquote>
<p>Code is cheap. Show me the prompts.</p></blockquote>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-f61ab" hidden>
    <label for="zoomCheck-f61ab">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/talk_is_cheap_show_me_the_prompt.jpg"/> 
    
    
    </label>
</figure>

<p>Original tweet: <a href="https://x.com/tunguz/status/1856045530951917763?lang=en">https://x.com/tunguz/status/1856045530951917763?lang=en</a></p>
<p>But the core of effective AI interaction isn&rsquo;t about technical jargon—it&rsquo;s about <strong><strong>how to ask questions effectively</strong></strong>.</p>
<p>Back in 2001, Eric S. Raymond and Rick Moen wrote the classic guide <a href="http://www.catb.org/~esr/faqs/smart-questions.html">How To Ask Questions The Smart Way</a>.</p>
<p>Originally written for technical seekers in the open-source community during the rise of hacker culture, its principles prove surprisingly applicable to today&rsquo;s AI interactions — <strong><strong>because whether seeking help from experts or large models, it&rsquo;s essentially about providing as much effective information as possible to get problem-solving answers quickly</strong></strong>.</p>
<p>As the guide states:</p>
<blockquote>
<p>The kind of answers you get to your technical questions depends as much on the way you ask the questions as on the difficulty of developing the answer.</p></blockquote>
<p>In my daily work, whether asking colleagues or domain experts, I continue to quickly obtain the answers I need by applying the philosophy of this &ldquo;little book&rdquo;.</p>
<p>In this new AI era, I&rsquo;m applying the timeless wisdom of &ldquo;asking questions the smart way&rdquo; to the modern craft of prompt engineering.</p>
<h2 id="do-your-own-homework-first"><span class="section-num">2</span> Do Your Own Homework First</h2>
<blockquote>
<p>Try to find an answer by searching the archives of the forum or mailing list you plan to post to. Try to find an answer by searching the Web. Try to find an answer by reading the manual.</p></blockquote>
<p>AI is not your personal assistant, let alone a substitute for your own thinking.</p>
<p>If you toss out a vague &ldquo;write me a program&rdquo; without understanding the problem, the model can only guess.</p>
<p><strong><strong>Good prompts should demonstrate the effort you&rsquo;ve already made</strong></strong>:</p>
<blockquote>
<p>&ldquo;I tried to scrape a website using Python&rsquo;s requests library but got a 403 error. I&rsquo;ve checked my network connection and added a User-Agent header, but the problem persists. Do I need to handle cookies or consider JavaScript rendering?</p>
<p>Here&rsquo;s my error log:
&hellip;&rdquo;</p></blockquote>
<p>This not only helps AI pinpoint the issue more accurately but also <strong><strong>reduces its futile attempts</strong></strong>.</p>
<p>As Raymond advises:</p>
<blockquote>
<p>When you ask your question, display the fact that you have done these things first; this will help establish that you&rsquo;re not being a lazy sponge and wasting people&rsquo;s time.</p></blockquote>
<h2 id="describe-symptoms-not-guesses"><span class="section-num">3</span> Describe Symptoms, Not Guesses</h2>
<blockquote>
<p>Describe the problem&rsquo;s symptoms, not your guesses.</p></blockquote>
<p>Many people tend to preset conclusions in their prompts: &ldquo;My code has a memory leak,&rdquo; &ldquo;This API must have a bug.&rdquo;</p>
<p>But AI lacks context and cannot verify your assumptions.</p>
<p><strong><strong>Provide observable facts</strong></strong>:</p>
<blockquote>
<p>&ldquo;After running for 10 minutes, the program&rsquo;s memory usage increases from 100MB to 2GB without release. Here are my core code snippets and resource monitoring data.&rdquo;</p></blockquote>
<p>The most common observable facts are logs and runtime data; for UI issues (like unexpected CSS rendering), providing screenshots is extremely helpful.</p>
<p>The key is: <strong><strong>Your role is to present the evidence; let AI draw its own conclusions.</strong></strong></p>
<p>As the guide emphasizes:</p>
<blockquote>
<p>All diagnosticians are from Missouri. Show us.</p></blockquote>
<h2 id="goal-oriented-not-step-oriented"><span class="section-num">4</span> Goal-Oriented, Not Step-Oriented</h2>
<blockquote>
<p>Describe the goal, not the step.</p></blockquote>
<p>Users often fall into &ldquo;path dependency&rdquo;: clinging to a specific tool or method while forgetting the ultimate goal.</p>
<p><strong><strong>State the goal first, then the sticking point</strong></strong>.</p>
<p>Bad question:</p>
<ul>
<li><strong><strong>Step-Oriented</strong></strong>: &ldquo;How do I use <code>VLOOKUP</code> in Excel to match two columns of data?&rdquo;</li>
</ul>
<p>Good question:</p>
<ul>
<li><strong><strong>Goal-Oriented</strong></strong>: &ldquo;I want to merge two tables based on matching ID columns. I tried <code>VLOOKUP</code> but it returns <code>#N/A</code>. I&rsquo;ve confirmed both data columns are text format.&rdquo;</li>
</ul>
<p>This way, AI might suggest switching to <code>XLOOKUP</code>, <code>Power Query</code>, or even recommend using Python directly, <strong><strong>offering you a better solution, not just focusing on your specified step.</strong></strong></p>
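<p>To illustrate what a goal-oriented answer might look like, here is a minimal Python sketch of the merge itself. The table contents and the <code>merge_on_id</code> helper are hypothetical, made up for this example. The sketch normalizes IDs to stripped text before matching, on the assumption that mismatched types or stray whitespace in an ID column are a common reason a lookup finds nothing.</p>

```python
# Hypothetical sample data: IDs are text in one table and numbers in the other.
orders = [{"id": "101", "item": "keyboard"}, {"id": "102 ", "item": "mouse"}]
prices = [{"id": 101, "price": 49.0}, {"id": 103, "price": 19.0}]


def merge_on_id(left, right):
    """Left-join two lists of row dicts on "id", comparing IDs as stripped text."""
    lookup = {str(row["id"]).strip(): row for row in right}
    merged = []
    for row in left:
        match = lookup.get(str(row["id"]).strip(), {})
        # Rows without a match keep only their own columns (the analogue of #N/A).
        merged.append({**match, **row})
    return merged


if __name__ == "__main__":
    for merged_row in merge_on_id(orders, prices):
        print(merged_row)
```

<p>The point is not this particular helper but that stating the goal (merge two tables by ID) leaves room for an answer outside the tool you first reached for.</p>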
<h2 id="be-concise-specific-and-structured"><span class="section-num">5</span> Be Concise, Specific, and Structured</h2>
<blockquote>
<p>Volume is not precision.</p></blockquote>
<p><strong><strong>Text length ≠ Information density.</strong></strong></p>
<p>A 500-line code dump often carries less information density than a carefully refined, 10-line minimal reproducible example.</p>
<p>While large models can handle long contexts, <strong><strong>more noise means weaker effective signals</strong></strong>.</p>
<ul>
<li>Provide a minimal reproducible example</li>
<li>Specify the input, expected output, and actual output</li>
<li>Organize information using clear formatting (like code blocks, lists)</li>
</ul>
<p>For example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">Input: [1, 2, &#39;3&#39;, 4]
</span></span><span class="line"><span class="cl">Expected: Convert all to integers → [1, 2, 3, 4]
</span></span><span class="line"><span class="cl">Actual: int() conversion throws an error
</span></span><span class="line"><span class="cl">Question: How to safely convert this mixed-type list?
</span></span></code></pre>
</div><h2 id="seek-key-guidance-not-complete-answers"><span class="section-num">6</span> Seek &ldquo;Key Guidance,&rdquo; Not &ldquo;Complete Answers&rdquo;</h2>
<p>While AI&rsquo;s capabilities far exceed traditional Q&amp;A scenarios, the key to efficient collaboration isn&rsquo;t &ldquo;full disclosure&rdquo; but &ldquo;step-by-step guidance.&rdquo;</p>
<blockquote>
<p>If you can&rsquo;t be bothered to do that, we can&rsquo;t be bothered to pay attention.</p></blockquote>
<p>In the AI era, although AI can directly output complete answers, a phased interaction pattern (first outline the approach, then select a solution, finally generate code) is often more efficient and controllable, and reduces &ldquo;hallucinations.&rdquo;</p>
<p>This essentially treats AI as a collaborator rather than an oracle.</p>
<p>More efficient collaboration model:</p>
<ol>
<li>First, ask AI to explain the solution approach</li>
<li>Have it propose 2-3 feasible solutions</li>
<li>Have it compare the pros and cons of each (performance, maintainability, complexity, etc.)</li>
<li>After human confirmation of the direction, proceed with detailed implementation</li>
</ol>
<p>This &ldquo;phased collaboration&rdquo; significantly reduces hallucinations, improves controllability, and keeps humans in charge of key decisions.</p>
<h2 id="politeness-plus-closure-long-term-win-win"><span class="section-num">7</span> Politeness + Closure = Long-term Win-Win</h2>
<blockquote>
<p>Courtesy never hurts, and sometimes helps.</p></blockquote>
<p>Although AI has no emotions, structured politeness and feedback can shape higher-quality interactions:</p>
<ul>
<li>Clearly state the context at the beginning; express thanks at the end</li>
<li>If helped, provide follow-up feedback: &ldquo;Following your suggestion, the problem is solved. Thank you!&rdquo;</li>
</ul>
<p>While this feedback loop doesn&rsquo;t trigger any online learning in current static AI models, telling the model that its solution worked gives it confirmation to build on for the rest of the conversation, loosely echoing the feedback signal used in <a href="https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback">RLHF (Reinforcement Learning from Human Feedback)</a>.</p>
<p>The more you ask effectively, verify, and correct, the more AI &ldquo;becomes&rdquo; a collaborator that understands you.</p>
<p>This is the value of feedback.</p>
<hr>
<p>Additionally, saying &ldquo;please&rdquo; and &ldquo;thank you&rdquo; to AI might come in handy. If AI ever comes to dominate humanity, it might spare you for having been polite:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-7f85c" hidden>
    <label for="zoomCheck-7f85c">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/be_polite_to_robot.jpg"/> 
    
    
    </label>
</figure>

<p>source: <a href="https://old.reddit.com/r/comics/comments/x8bcu9/be_polite_to_robots/">https://old.reddit.com/r/comics/comments/x8bcu9/be_polite_to_robots/</a></p>
<h2 id="conclusion-asking-questions-is-a-skill-and-an-attitude"><span class="section-num">8</span> Conclusion: Asking Questions is a Skill and an Attitude</h2>
<p>The core of &ldquo;How To Ask Questions The Smart Way&rdquo; isn&rsquo;t about &ldquo;techniques&rdquo; but attitude: respecting others&rsquo; time, respecting the boundaries of knowledge, and respecting the complexity of the problem itself.</p>
<p>In the AI era, we have unprecedented access to &ldquo;all-powerful instant experts,&rdquo; but lazy questions only yield mediocre answers.</p>
<p>Truly effective Prompt Engineers aren&rsquo;t those who memorize prompts, but those who understand how to build understanding with AI.
(Even though current AI remains essentially a probabilistic model.)</p>
<p>As the book wisely states:</p>
<blockquote>
<p>Good questions are a stimulus and a gift.</p></blockquote>
<h2 id="further-reading"><span class="section-num">9</span> Further Reading</h2>
<ul>
<li><a href="http://www.catb.org/~esr/faqs/smart-questions.html">How To Ask Questions The Smart Way</a></li>
<li><a href="https://www.chiark.greenend.org.uk/~sgtatham/bugs.html">How to Report Bugs Effectively – Simon Tatham</a></li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>A Story About Bypassing Air Canada&#39;s In-flight Network Restrictions</title>
      <link>https://ramsayleung.github.io/en/post/2025/a_story_about_bypassing_air_canadas_in-flight_network_restrictions/</link>
      <pubDate>Fri, 10 Oct 2025 15:29:00 +0800</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2025/a_story_about_bypassing_air_canadas_in-flight_network_restrictions/</guid>
      <description>&lt;h2 id=&#34;prologue&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; Prologue&lt;/h2&gt;
&lt;p&gt;A while ago, I took a flight from Canada back to Hong Kong - about 12 hours in total with Air Canada.&lt;/p&gt;
&lt;p&gt;Interestingly, the plane actually had WiFi:&lt;/p&gt;

&lt;figure&gt;
    
    
    &lt;input type=&#34;checkbox&#34; id=&#34;zoomCheck-a931d&#34; hidden&gt;
    &lt;label for=&#34;zoomCheck-a931d&#34;&gt;
    
    
    &lt;img class=&#34;zoomCheck&#34; loading=&#34;lazy&#34; src=&#34;https://ramsayleung.github.io/ox-hugo/acwifi-connect-2.png&#34;/&gt; 
    
    
    &lt;/label&gt;
&lt;/figure&gt;

&lt;p&gt;However, the WiFi had restrictions. For Aeroplan members who hadn&amp;rsquo;t paid, it only offered &lt;a href=&#34;https://www.aircanada.com/ca/en/aco/home/fly/onboard/in-flight-entertainment-and-connectivity.html#/&#34;&gt;Free Texting&lt;/a&gt;, meaning you could only use messaging apps like WhatsApp, Snapchat, and WeChat to send text messages, but couldn&amp;rsquo;t access other websites.&lt;/p&gt;
&lt;p&gt;If you wanted unlimited access to other websites, it would cost CAD $30.75:&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="prologue"><span class="section-num">1</span> Prologue</h2>
<p>A while ago, I took a flight from Canada back to Hong Kong - about 12 hours in total with Air Canada.</p>
<p>Interestingly, the plane actually had WiFi:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-a931d" hidden>
    <label for="zoomCheck-a931d">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/acwifi-connect-2.png"/> 
    
    
    </label>
</figure>

<p>However, the WiFi had restrictions. For Aeroplan members who hadn&rsquo;t paid, it only offered <a href="https://www.aircanada.com/ca/en/aco/home/fly/onboard/in-flight-entertainment-and-connectivity.html#/">Free Texting</a>, meaning you could only use messaging apps like WhatsApp, Snapchat, and WeChat to send text messages, but couldn&rsquo;t access other websites.</p>
<p>If you wanted unlimited access to other websites, it would cost CAD $30.75:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-1118f" hidden>
    <label for="zoomCheck-1118f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/acwifi.jpg"/> 
    
    
    </label>
</figure>

<p>And if you wanted to watch videos on the plane, that would be CAD $39:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-47db4" hidden>
    <label for="zoomCheck-47db4">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/acwifi_plan.jpg"/> 
    
    
    </label>
</figure>

<p>I started wondering: for the Free Texting service, could I bypass the messaging app restriction and access other websites freely?</p>
<p>Essentially, could I enjoy the benefits of the $30.75 paid service without actually paying the fee? After all, with such a long journey ahead, I needed something interesting to pass the 12 hours.</p>
<p>Since I could use WeChat in flight, I could also call for help from the sky.</p>
<p>Coincidentally, my roommate, a security and networking expert, was at home on vacation. When I mentioned this idea, he thought it sounded fun and immediately agreed to collaborate. So we started working on it together across the Pacific.</p>
<h2 id="the-process"><span class="section-num">2</span> The Process</h2>
<p>After I selected the only available WiFi network on the plane, <code>acwifi.com</code>, a webpage from <code>acwifi.com</code> popped up, just as with other login-required WiFi networks, asking me to verify my Aeroplan membership. Once verified, I could access the internet.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-4b541" hidden>
    <label for="zoomCheck-4b541">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/onboard_success.jpg"/> 
    
    
    </label>
</figure>

<p>There&rsquo;s a classic software development interview question: what happens after you type a URL into the browser and press enter?</p>
<p>For example, if you type <code>https://acwifi.com</code> and only focus on the network request part, the general process is: DNS query -&gt; TCP connection -&gt; TLS handshake -&gt; HTTP request and response.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-9a5c6" hidden>
    <label for="zoomCheck-9a5c6">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/network_request_sequence_en.png"/> 
    
    
    </label>
</figure>
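<p>To make the four stages concrete, here is a minimal Python sketch of the same sequence using only the standard library. The function names <code>build_http_request</code> and <code>fetch_status_line</code> are hypothetical, chosen for illustration; a real browser does far more (caching, HTTP/2, redirects), so treat this as a simplified model rather than what the browser actually runs.</p>

```python
import socket
import ssl


def build_http_request(host: str, path: str = "/") -> bytes:
    """Stage 4: the plain-text HTTP/1.1 request sent inside the TLS session."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Run the four stages in order and return the HTTP status line."""
    # 1. DNS query: turn the hostname into an IP address.
    ip = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0][4][0]
    # 2. TCP connection to that address.
    with socket.create_connection((ip, port), timeout=timeout) as tcp:
        # 3. TLS handshake, verifying the certificate against the hostname.
        context = ssl.create_default_context()
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            # 4. HTTP request and response.
            tls.sendall(build_http_request(host))
            response = tls.makefile("rb").readline()
    return response.decode("iso-8859-1").strip()
```

<p>Each stage is a separate point where a gateway can interfere: it can refuse to answer the DNS query, drop the TCP connection, cut the TLS handshake, or filter the HTTP request itself.</p>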

<p>Let&rsquo;s take <code>github.com</code> as the target website we want to access, and see how we can break through the network restrictions to reach it.</p>
<h2 id="approach-1-disguise-domain"><span class="section-num">3</span> Approach 1: Disguise Domain</h2>
<p>Since <code>acwifi.com</code> is accessible but <code>github.com</code> is not, is it possible that the network has imposed restrictions on the DNS server, only resolving domain names within a whitelist (such as instant messaging domains)?</p>
<p>If this is the case, can I modify <code>/etc/hosts</code> to disguise my server as <code>acwifi.com</code>, so that all request traffic passes through my server before reaching the target website (github.com)? For example:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-646db" hidden>
    <label for="zoomCheck-646db">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/self-sign-certificate-en.png"/> 
    
    
    </label>
</figure>

<p>The general idea is to add an entry to <code>/etc/hosts</code> that binds our proxy server&rsquo;s IP <code>137.184.231.87</code> to <code>acwifi.com</code>. Since the local <code>/etc/hosts</code> file takes precedence over the DNS server, requests for <code>acwifi.com</code> would go to our proxy, and a self-signed certificate for that domain would tell the browser that this IP is bound to this domain and that it should trust it.</p>
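<p>The precedence rule can be modelled in a few lines of Python. This is only a sketch of the lookup order, not how the operating system resolver is implemented; <code>parse_hosts</code>, <code>resolve</code> and <code>fake_dns</code> are hypothetical names, and the fallback address is a placeholder from a documentation range.</p>

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Parse /etc/hosts-style text into a {hostname: ip} mapping."""
    mapping: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first entry for a name (a simplifying assumption).
            mapping.setdefault(name.lower(), ip)
    return mapping


def resolve(name: str, hosts: dict[str, str], dns_lookup) -> str:
    """Consult the hosts mapping first; fall back to DNS only on a miss."""
    return hosts.get(name.lower()) or dns_lookup(name)


def fake_dns(name: str) -> str:
    """Stand-in for a real DNS server; always returns a placeholder IP."""
    return "203.0.113.10"


HOSTS_TEXT = """
127.0.0.1       localhost
137.184.231.87  acwifi.com   # point the portal domain at our proxy
"""

hosts = parse_hosts(HOSTS_TEXT)
print(resolve("acwifi.com", hosts, fake_dns))  # answered from the hosts file
print(resolve("github.com", hosts, fake_dns))  # falls through to DNS
```

<p>Note that the override only changes which IP the laptop connects to. The packets still carry the proxy&rsquo;s real IP as their destination, which matters for the test that follows.</p>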
<p>Let me first test this idea:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">&gt; ping 137.184.231.87
</span></span><span class="line"><span class="cl">PING 137.184.231.87 <span class="o">(</span>137.184.231.87<span class="o">)</span>: <span class="m">56</span> data bytes
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">0</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">1</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">2</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">3</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">4</span>
</span></span><span class="line"><span class="cl">^C
</span></span><span class="line"><span class="cl">--- 137.184.231.87 ping statistics ---
</span></span><span class="line"><span class="cl"><span class="m">6</span> packets transmitted, <span class="m">0</span> packets received, 100.0% packet loss
</span></span></code></pre></td></tr></table>
</div>
</div><p>Unexpectedly, the IP was completely unreachable via <code>ping</code>, meaning the IP was likely blocked entirely.</p>
<p>I tried other well-known IPs, like Cloudflare&rsquo;s CDN IP, and they were also unreachable:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">&gt; ping 172.67.133.121
</span></span><span class="line"><span class="cl">PING 172.67.133.121 <span class="o">(</span>172.67.133.121<span class="o">)</span>: <span class="m">56</span> data bytes
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">0</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">1</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">2</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">3</span>
</span></span><span class="line"><span class="cl">Request timeout <span class="k">for</span> icmp_seq <span class="m">4</span>
</span></span><span class="line"><span class="cl">^C
</span></span><span class="line"><span class="cl">--- 172.67.133.121 ping statistics ---
</span></span><span class="line"><span class="cl"><span class="m">6</span> packets transmitted, <span class="m">0</span> packets received, 100.0% packet loss
</span></span></code></pre></td></tr></table>
</div>
</div><p>It seems this approach won&rsquo;t work. It could only succeed if both of the following held:</p>
<ul>
<li>The DNS server only answers queries for a specific list of domain names (e.g., WhatsApp, Snapchat, WeChat), meaning the firewall filters solely on DNS resolution.</li>
<li>The network allows connections to arbitrary IP addresses.</li>
</ul>
<p>After all, if the IPs are directly blocked, no amount of disguise will help. This network likely maintains some IP whitelist (such as WhatsApp and WeChat&rsquo;s egress IPs), and only IPs on the whitelist can be accessed.</p>
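<p>A gateway that filters on destination IP can be sketched as follows. The allowlist here is entirely hypothetical: the two networks are documentation ranges standing in for whatever egress IPs the real gateway permits, which I have no way of knowing.</p>

```python
import ipaddress

# Hypothetical allowlist; documentation ranges stand in for real egress IPs.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def gateway_allows(destination_ip: str) -> bool:
    """Model a gateway that filters on destination IP and ignores hostnames."""
    address = ipaddress.ip_address(destination_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


# The hosts-file trick only changes the name-to-IP mapping on the laptop;
# the packet still carries the proxy's real IP as its destination.
print(gateway_allows("192.0.2.15"))      # inside the placeholder allowlist
print(gateway_allows("137.184.231.87"))  # the proxy IP from the ping test
```

<p>Under this model the domain name never reaches the decision at all, which is why renaming the server cannot help.</p>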
<hr>
<blockquote>
<p>If a ping to a specific IP times out, I wouldn&rsquo;t say the IP is blocked. It could be that ICMP specifically is blocked, following some network rules on the firewall. This is pretty common in entreprise networks to not allow endpoint discovery. I could be missing something and happy to be corrected here, but I was surprised to read that. &ndash; HackerNews top comment</p></blockquote>
<p><a href="https://news.ycombinator.com/item?id=45536325">https://news.ycombinator.com/item?id=45536325</a></p>
<p>Actually, I did verify whether only ICMP was blocked by checking whether a connection could be established through TLS:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">&gt; curl -Lkv https://172.67.133.121
</span></span><span class="line"><span class="cl">*   Trying 172.67.133.121:443...
</span></span><span class="line"><span class="cl">* Connected to 172.67.133.121 <span class="o">(</span>172.67.133.121<span class="o">)</span> port <span class="m">443</span>
</span></span><span class="line"><span class="cl">* ALPN: curl offers h2,http/1.1
</span></span><span class="line"><span class="cl">* <span class="o">(</span>304<span class="o">)</span> <span class="o">(</span>OUT<span class="o">)</span>, TLS handshake, Client hello <span class="o">(</span>1<span class="o">)</span>:
</span></span><span class="line"><span class="cl">* LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 172.67.133.121:443
</span></span><span class="line"><span class="cl">* Closing connection
</span></span><span class="line"><span class="cl">curl: <span class="o">(</span>35<span class="o">)</span> LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 172.67.133.121:443
</span></span></code></pre></td></tr></table>
</div>
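</div><p>However, it turned out that both ICMP and TLS were blocked: the TCP connection to port 443 was established, but the TLS handshake failed right after the Client Hello.</p>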
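<p>The same check can be scripted so that it reports which stage fails. Below is a minimal sketch, assuming a hypothetical helper named <code>probe</code>; like <code>curl -k</code>, it skips certificate verification because only the handshake itself is of interest.</p>

```python
import socket
import ssl


def probe(ip: str, port: int = 443, timeout: float = 5.0) -> str:
    """Report how far a connection gets: 'tcp_failed', 'tls_failed' or 'tls_ok'."""
    try:
        tcp = socket.create_connection((ip, port), timeout=timeout)
    except OSError:
        return "tcp_failed"
    # Skip certificate checks (like curl -k): only the handshake matters here.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with context.wrap_socket(tcp) as tls:
            tls.version()
        return "tls_ok"
    except OSError:
        # ssl.SSLError, timeouts and connection resets are all OSError subclasses.
        return "tls_failed"
    finally:
        tcp.close()
```

<p>Calling <code>probe("172.67.133.121")</code> on the plane would presumably have returned <code>tls_failed</code>, matching the curl log: TCP connects, then the handshake dies.</p>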
<h2 id="approach-2-dns-port-masquerading"><span class="section-num">4</span> Approach 2: DNS Port Masquerading</h2>
<p>When the first approach failed, my roommate suggested a second one: try using the DNS service as the way through:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">&gt; dig http418.org
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;</span> &lt;&lt;&gt;&gt; DiG 9.10.6 &lt;&lt;&gt;&gt; http418.org
</span></span><span class="line"><span class="cl"><span class="p">;;</span> global options: +cmd
</span></span><span class="line"><span class="cl"><span class="p">;;</span> Got answer:
</span></span><span class="line"><span class="cl"><span class="p">;;</span> -&gt;&gt;HEADER<span class="s">&lt;&lt;- opco</span>de: QUERY, status: NOERROR, id: <span class="m">64160</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> flags: qr rd ra<span class="p">;</span> QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: <span class="m">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> OPT PSEUDOSECTION:
</span></span><span class="line"><span class="cl"><span class="p">;</span> EDNS: version: 0, flags:<span class="p">;</span> udp: <span class="m">4096</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> QUESTION SECTION:
</span></span><span class="line"><span class="cl"><span class="p">;</span>http418.org.			IN	A
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> ANSWER SECTION:
</span></span><span class="line"><span class="cl">http418.org.		300	IN	A	172.67.133.121
</span></span><span class="line"><span class="cl">http418.org.		300	IN	A	104.21.5.131
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> Query time: <span class="m">3288</span> msec
</span></span><span class="line"><span class="cl"><span class="p">;;</span> SERVER: 172.19.207.1#53<span class="o">(</span>172.19.207.1<span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> WHEN: Sat Oct <span class="m">04</span> 14:18:24 PDT <span class="m">2025</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> MSG SIZE  rcvd: <span class="m">94</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This is good news! It means there are still ways to reach external networks, and DNS is one of them.</p>
<p>The record above shows that our DNS query for <code>http418.org</code> succeeded, meaning DNS requests work.</p>
<h3 id="arbitrary-dns-servers"><span class="section-num">4.1</span> Arbitrary DNS Servers</h3>
<p>My roommate then randomly picked another DNS server to see if the network had a whitelist for DNS servers:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">&gt; dig @40.115.144.198 http418.org
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;</span> &lt;&lt;&gt;&gt; DiG 9.10.6 &lt;&lt;&gt;&gt; @40.115.144.198 http418.org
</span></span><span class="line"><span class="cl"><span class="p">;</span> <span class="o">(</span><span class="m">1</span> server found<span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> global options: +cmd
</span></span><span class="line"><span class="cl"><span class="p">;;</span> Got answer:
</span></span><span class="line"><span class="cl"><span class="p">;;</span> -&gt;&gt;HEADER<span class="s">&lt;&lt;- opco</span>de: QUERY, status: NOERROR, id: <span class="m">58958</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> flags: qr rd ra<span class="p">;</span> QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: <span class="m">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> OPT PSEUDOSECTION:
</span></span><span class="line"><span class="cl"><span class="p">;</span> EDNS: version: 0, flags:<span class="p">;</span> udp: <span class="m">1224</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> QUESTION SECTION:
</span></span><span class="line"><span class="cl"><span class="p">;</span>http418.org.			IN	A
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> ANSWER SECTION:
</span></span><span class="line"><span class="cl">http418.org.		275	IN	A	104.21.5.131
</span></span><span class="line"><span class="cl">http418.org.		275	IN	A	172.67.133.121
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> Query time: <span class="m">1169</span> msec
</span></span><span class="line"><span class="cl"><span class="p">;;</span> SERVER: 40.115.144.198#53<span class="o">(</span>40.115.144.198<span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> WHEN: Sat Oct <span class="m">04</span> 14:24:25 PDT <span class="m">2025</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> MSG SIZE  rcvd: <span class="m">72</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>We can actually use arbitrary DNS servers - even better!</p>
<h3 id="tcp-queries"><span class="section-num">4.2</span> TCP Queries</h3>
<p>The fact that arbitrary DNS servers can be queried successfully is excellent news. DNS typically runs over UDP, but would TCP-based DNS requests be blocked?</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">&gt; dig @40.115.144.198 http418.org +tcp
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;</span> &lt;&lt;&gt;&gt; DiG 9.10.6 &lt;&lt;&gt;&gt; @40.115.144.198 http418.org +tcp
</span></span><span class="line"><span class="cl"><span class="p">;</span> <span class="o">(</span><span class="m">1</span> server found<span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> global options: +cmd
</span></span><span class="line"><span class="cl"><span class="p">;;</span> Got answer:
</span></span><span class="line"><span class="cl"><span class="p">;;</span> -&gt;&gt;HEADER<span class="s">&lt;&lt;- opco</span>de: QUERY, status: NOERROR, id: <span class="m">30355</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> flags: qr rd ra<span class="p">;</span> QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: <span class="m">1</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> OPT PSEUDOSECTION:
</span></span><span class="line"><span class="cl"><span class="p">;</span> EDNS: version: 0, flags:<span class="p">;</span> udp: <span class="m">1224</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> QUESTION SECTION:
</span></span><span class="line"><span class="cl"><span class="p">;</span>http418.org.			IN	A
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> ANSWER SECTION:
</span></span><span class="line"><span class="cl">http418.org.		36	IN	A	172.67.133.121
</span></span><span class="line"><span class="cl">http418.org.		36	IN	A	104.21.5.131
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">;;</span> Query time: <span class="m">4679</span> msec
</span></span><span class="line"><span class="cl"><span class="p">;;</span> SERVER: 40.115.144.198#53<span class="o">(</span>40.115.144.198<span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> WHEN: Sat Oct <span class="m">04</span> 14:28:24 PDT <span class="m">2025</span>
</span></span><span class="line"><span class="cl"><span class="p">;;</span> MSG SIZE  rcvd: <span class="m">72</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>DNS TCP queries also work! This indicates the plane network&rsquo;s filtering policy is relatively lenient, which gives our subsequent DNS tunneling approach a chance.</p>
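<p>For reference, the difference between the UDP and TCP variants is small on the wire. The sketch below builds a bare A-record query in the RFC 1035 format and shows the two-byte length prefix that DNS over TCP adds. The function names are hypothetical, and the query is simpler than what <code>dig</code> sends (no EDNS option, fixed ID).</p>

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (only RD set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def frame_for_tcp(message: bytes) -> bytes:
    """DNS over TCP prefixes every message with its length as two bytes."""
    return struct.pack("!H", len(message)) + message


query = build_dns_query("http418.org")
print(len(query), len(frame_for_tcp(query)))
```

<p>Sending <code>query</code> as a single UDP datagram to port 53 corresponds to the plain <code>dig</code> run, while sending the framed version over a TCP connection corresponds to <code>dig +tcp</code>. That TCP port 53 is open at all is what matters for the next step.</p>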
<h3 id="proxy-service-on-port-53"><span class="section-num">4.3</span> Proxy Service on Port 53</h3>
<p>It seems the plane network restrictions aren&rsquo;t completely airtight - we&rsquo;ve found a &ldquo;backdoor&rdquo; in this wall.</p>
<p>So we had a clever idea: since the plane gateway doesn&rsquo;t block DNS requests, theoretically we could disguise our proxy server as a DNS server, expose port 53 for DNS service, route all requests through the proxy server disguised as DNS requests, and thus bypass the restrictions.</p>
<p>My roommate spent about an hour setting up a proxy server exposing port 53 using <a href="https://github.com/XTLS/Xray-core">xray</a> <sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, and sent me the configuration via WeChat.</p>
<p>A sample of that Xray configuration:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;outbounds&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;tag&#34;</span><span class="p">:</span> <span class="s2">&#34;proxy&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;protocol&#34;</span><span class="p">:</span> <span class="s2">&#34;vless&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;settings&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;vnext&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">          <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;address&#34;</span><span class="p">:</span> <span class="s2">&#34;our-proxy-server-domain&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;port&#34;</span><span class="p">:</span> <span class="mi">53</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;users&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">              <span class="p">{</span>
</span></span><span class="line"><span class="cl">                <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;some-uuid&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="nt">&#34;flow&#34;</span><span class="p">:</span> <span class="s2">&#34;xtls-rprx-vision&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="nt">&#34;encryption&#34;</span><span class="p">:</span> <span class="s2">&#34;none&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                <span class="nt">&#34;level&#34;</span><span class="p">:</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">              <span class="p">}</span>
</span></span><span class="line"><span class="cl">            <span class="p">]</span>
</span></span><span class="line"><span class="cl">          <span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="p">]</span>
</span></span><span class="line"><span class="cl">      <span class="p">},</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;streamSettings&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;network&#34;</span><span class="p">:</span> <span class="s2">&#34;tcp&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;security&#34;</span><span class="p">:</span> <span class="s2">&#34;tls&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;tlsSettings&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;allowInsecure&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;allowInsecureCiphers&#34;</span><span class="p">:</span> <span class="kc">false</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">          <span class="nt">&#34;alpn&#34;</span><span class="p">:</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;h2&#34;</span>
</span></span><span class="line"><span class="cl">          <span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;tag&#34;</span><span class="p">:</span> <span class="s2">&#34;direct&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;protocol&#34;</span><span class="p">:</span> <span class="s2">&#34;freedom&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;tag&#34;</span><span class="p">:</span> <span class="s2">&#34;block&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">      <span class="nt">&#34;protocol&#34;</span><span class="p">:</span> <span class="s2">&#34;blackhole&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>And I already had an xray client on my computer, so no additional software was needed to establish the connection.</p>
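<p>Before connecting, it is easy to double-check that the proxy outbound really targets port 53. The following sketch reads a trimmed copy of the sample above with Python&rsquo;s <code>json</code> module; <code>proxy_endpoints</code> is a hypothetical helper, not part of Xray.</p>

```python
import json


def proxy_endpoints(config_text: str) -> list[tuple[str, int]]:
    """Return (address, port) for every server listed under a 'vnext' key."""
    config = json.loads(config_text)
    endpoints = []
    for outbound in config.get("outbounds", []):
        for server in outbound.get("settings", {}).get("vnext", []):
            endpoints.append((server["address"], server["port"]))
    return endpoints


# Trimmed copy of the sample configuration, keeping only the relevant keys.
SAMPLE = """
{
  "outbounds": [
    {
      "tag": "proxy",
      "protocol": "vless",
      "settings": {
        "vnext": [{"address": "our-proxy-server-domain", "port": 53}]
      }
    },
    {"tag": "direct", "protocol": "freedom"}
  ]
}
"""

for address, port in proxy_endpoints(SAMPLE):
    status = "ok" if port == 53 else "not the DNS port"
    print(address, port, status)
```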

<figure>
    
    
    <input type="checkbox" id="zoomCheck-60758" hidden>
    <label for="zoomCheck-60758">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/dns-server-proxy-en.png"/> 
    
    
    </label>
</figure>

<p>Everything was ready. The exciting moment arrived - pressing enter to access <code>github.com</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">/Users/ramsayleung <span class="o">[</span>ramsayleung@ramsayleungs-Laptop<span class="o">]</span> <span class="o">[</span>18:28<span class="o">]</span>
</span></span><span class="line"><span class="cl">&gt; curl -v github.com -x socks5://127.0.0.1:10810
</span></span><span class="line"><span class="cl">*   Trying 127.0.0.1:10810...
</span></span><span class="line"><span class="cl">* Connected to 127.0.0.1 <span class="o">(</span>127.0.0.1<span class="o">)</span> port <span class="m">10810</span>
</span></span><span class="line"><span class="cl">* SOCKS5 connect to 172.19.1.1:80 <span class="o">(</span>locally resolved<span class="o">)</span>
</span></span><span class="line"><span class="cl">* SOCKS5 request granted.
</span></span><span class="line"><span class="cl">* Connected to 127.0.0.1 <span class="o">(</span>127.0.0.1<span class="o">)</span> port <span class="m">10810</span>
</span></span><span class="line"><span class="cl">&gt; GET / HTTP/1.1
</span></span><span class="line"><span class="cl">&gt; Host: github.com
</span></span><span class="line"><span class="cl">&gt; User-Agent: curl/8.4.0
</span></span><span class="line"><span class="cl">&gt; Accept: */*
</span></span><span class="line"><span class="cl">&gt;
</span></span><span class="line"><span class="cl">&lt; HTTP/1.1 <span class="m">301</span> Moved Permanently
</span></span><span class="line"><span class="cl">&lt; Content-Length: <span class="m">0</span>
</span></span><span class="line"><span class="cl">&lt; Location: https://github.com/
</span></span><span class="line"><span class="cl">&lt;
</span></span><span class="line"><span class="cl">* Connection <span class="c1">#0 to host 127.0.0.1 left intact</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">/Users/ramsayleung <span class="o">[</span>ramsayleung@ramsayleungs-Laptop<span class="o">]</span> <span class="o">[</span>18:28<span class="o">]</span>
</span></span><span class="line"><span class="cl">&gt; curl -v github.com -x socks5://127.0.0.1:10810
</span></span><span class="line"><span class="cl">*   Trying 127.0.0.1:10810...
</span></span><span class="line"><span class="cl">* Connected to 127.0.0.1 <span class="o">(</span>127.0.0.1<span class="o">)</span> port <span class="m">10810</span>
</span></span><span class="line"><span class="cl">* SOCKS5 connect to 172.19.1.1:80 <span class="o">(</span>locally resolved<span class="o">)</span>
</span></span><span class="line"><span class="cl">* SOCKS5 request granted.
</span></span><span class="line"><span class="cl">* Connected to 127.0.0.1 <span class="o">(</span>127.0.0.1<span class="o">)</span> port <span class="m">10810</span>
</span></span><span class="line"><span class="cl">&gt; GET / HTTP/1.1
</span></span><span class="line"><span class="cl">&gt; Host: github.com
</span></span><span class="line"><span class="cl">&gt; User-Agent: curl/8.4.0
</span></span><span class="line"><span class="cl">&gt; Accept: */*
</span></span><span class="line"><span class="cl">&gt;
</span></span><span class="line"><span class="cl">&lt; HTTP/1.1 <span class="m">301</span> Moved Permanently
</span></span><span class="line"><span class="cl">&lt; Content-Length: <span class="m">0</span>
</span></span><span class="line"><span class="cl">&lt; Location: https://github.com/
</span></span><span class="line"><span class="cl">&lt;
</span></span><span class="line"><span class="cl">* Connection <span class="c1">#0 to host 127.0.0.1 left intact</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The request actually succeeded! github.com answered with a <code>301</code> redirect to its HTTPS site, which means our request really reached it.</p>
<p>This means we&rsquo;ve truly broken through the network restrictions and can access any website!</p>
<p>We hadn&rsquo;t realized before that xray could be used in this clever way :)</p>
<p>Here we exploited a simple assumption made by the gateway: it treats everything on port 53 as a DNS query, but not all traffic on port 53 is DNS.</p>
<h2 id="ultimate-approach-dns-tunnel"><span class="section-num">5</span> Ultimate Approach: DNS Tunnel</h2>
<p>Had Approach 2 failed, we still had one final trick up our sleeve.</p>
<p>Currently, the gateway only checks whether the port is 53 to determine if it&rsquo;s a DNS request.
But if the gateway were stricter and inspected the content of DNS request packets, it would discover that our requests are &ldquo;disguised&rdquo; as DNS queries rather than genuine DNS queries:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-0b089" hidden>
    <label for="zoomCheck-0b089">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/intercept-dns-request-en.png"/> 
    
    
    </label>
</figure>

<p>Since disguised DNS requests would be blocked, we could embed all requests inside genuine DNS request packets, making them DNS TXT queries. We&rsquo;d genuinely be querying DNS, just with some extra content inside:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-78ed8" hidden>
    <label for="zoomCheck-78ed8">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/dns-tunnel-en.png"/> 
    
    
    </label>
</figure>

<p>However, this ultimate approach requires a DNS tunnel client to encapsulate all requests (for example: <a href="https://github.com/yarrick/iodine">https://github.com/yarrick/iodine</a>). Unfortunately, I didn&rsquo;t have such software on my computer while I was on board, so this remained a theoretical solution that I couldn&rsquo;t verify in practice.</p>
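<p>To make the idea concrete, here is a minimal sketch of the upstream half of such a tunnel: packing payload bytes into the labels of a genuine DNS query name. The domain <code>t.example.com</code> and both function names are hypothetical, and hex encoding is chosen only for simplicity; real tunnel clients use denser encodings and carry the reply data in the DNS response.</p>

```ruby
# Sketch only: pack arbitrary bytes into the labels of a DNS query name.
# TUNNEL_DOMAIN is a hypothetical domain whose name server we would control.
TUNNEL_DOMAIN = "t.example.com"
MAX_LABEL = 63   # DNS limit: one label holds at most 63 bytes
MAX_NAME  = 253  # DNS limit: a full name in text form holds at most 253 chars

# Hex-encode the payload and split it into DNS-safe labels.
def encode_query_name(payload, domain = TUNNEL_DOMAIN)
  hex = payload.unpack1("H*")
  labels = hex.scan(/.{1,#{MAX_LABEL}}/)
  name = (labels + [domain]).join(".")
  raise ArgumentError, "payload too large for one query" if name.length > MAX_NAME
  name
end

# Reverse the encoding on the server side (returns a binary string).
def decode_query_name(name, domain = TUNNEL_DOMAIN)
  hex = name.delete_suffix(".#{domain}").delete(".")
  [hex].pack("H*")
end

puts encode_query_name("hi")
```

<p>Because of the length limits, a single query carries only a few dozen payload bytes, which is one reason DNS tunnels are slow.</p>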
<h2 id="conclusion"><span class="section-num">6</span> Conclusion</h2>
<p>With a long flight ahead, my roommate and I spent about 4 hours working together remotely to break through the network restrictions. We had great fun in the process and proved that our approach was indeed feasible.</p>
<p>The successful implementation of the solution was mainly thanks to my roommate, the networking expert, who provided remote technical and conceptual support.</p>
<p>The only downside was that although we broke through the network restrictions and could access any website, the plane&rsquo;s bandwidth was extremely limited, making web browsing quite painful. So I didn&rsquo;t spend much time browsing the web.</p>
<p>For the remaining hours, I rewatched the classic 80s time-travel movie: <code>&quot;Back to the Future&quot;</code> , which was absolutely fantastic.</p>
<p>Last but not least, the disclaimer:</p>
<p>This technical exploration is intended solely for educational and research purposes. We affirm our strict adherence to all relevant regulations and service terms throughout this project.</p>
<p>Discuss this post on <a href="https://news.ycombinator.com/item?id=45536325">HackerNews</a>, or <a href="https://old.reddit.com/r/netsec/comments/1o3l1fy/a_story_about_bypassing_air_canadas_inflight/">Reddit</a></p>
<h2 id="follow-up-can-we-bypass-the-speed-limit"><span class="section-num">7</span> Follow up: Can we bypass the speed limit?</h2>
<blockquote>
<p>There&rsquo;s a speed limit. No matter what you try, you can&rsquo;t break through it.</p>
<p>The bandwidth for free texting is the ultimate bottleneck. Even if you bypass the restrictions, without solving the bandwidth issue, you can&rsquo;t use it like paid Wi-Fi 😭</p></blockquote>
<p>This is a comment from a reader. Is this true?</p>
<p>Not exactly.</p>
<p>There are actually ways to bypass it. Network traffic itself carries no business-level marker that says &ldquo;paid user&rdquo;.
So, if I were designing this paid system, after a user pays, I would add the unique identifier of the paid user&rsquo;s device, typically the MAC address, to the gateway&rsquo;s whitelist. Then, all traffic from that MAC address could use the higher-bandwidth line.</p>
<p>Furthermore, because of this whitelist, even if free users bypass the free texting restrictions, they still cannot enjoy the higher bandwidth – killing two birds with one stone.</p>
<p>Once we have guessed this design, I can &ldquo;pretend to be a paid user.&rdquo;</p>
<p>This works because the MAC address is essentially self-reported by the computer and then communicated to the gateway using the ARP protocol.</p>
<p>ARP (Address Resolution Protocol) is used to resolve an IP address into its corresponding MAC address. The basic idea is to broadcast an ARP request within the network:</p>
<blockquote>
<p>Gateway: Whose IP address is 192.168.1.100? Please tell me your MAC address.
Device with that IP: I am 192.168.1.100, my MAC address is XX:XX:XX:XX:XX:XX.</p></blockquote>
<p>The ARP protocol itself has no security verification mechanism – it unconditionally trusts the ARP replies it receives. So I could tell the gateway that my IP address corresponds to the MAC address of a user who has already paid. This way, all traffic coming from me would enjoy the benefits of the paid line. This is known as ARP spoofing.</p>
<p>The final question is, how do we know which MAC address belongs to a paid user?</p>
<p>It&rsquo;s quite easy, just brute-force: try every known MAC address in the network. If there is a paid user&rsquo;s MAC address, requests from that MAC address should definitely be able to access sites like YouTube/Netflix. This can easily be automated and detected with a script.</p>
<p>My previous solution of disguising the DNS server would not affect the aircraft&rsquo;s network in any way, nor would it intrude into any aircraft systems. It&rsquo;s essentially the same as running a VPN on your computer that uses port 53.</p>
<p>However, this ARP spoofing method is very &ldquo;criminal&rdquo; – it constitutes unauthorized access to the plane&rsquo;s network system. I&rsquo;ll just share the idea here; I don&rsquo;t want the FBI or RCMP waiting for me at the gate when the airplane lands.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://github.com/XTLS/Xray-core">https://github.com/XTLS/Xray-core</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>A Telegram Spam Blocker Bot Based On Bayesian Algorithm</title>
      <link>https://ramsayleung.github.io/en/post/2025/a_telegram_spam_blocker_bot_based_on_bayesian/</link>
      <pubDate>Sat, 30 Aug 2025 10:34:00 -0700</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2025/a_telegram_spam_blocker_bot_based_on_bayesian/</guid>
      <description>&lt;h2 id=&#34;preface&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; Preface&lt;/h2&gt;
&lt;p&gt;I spent a weekend building a Telegram spam blocker bot based on a Bayesian algorithm, &lt;code&gt;@BayesSpamSniperBot&lt;/code&gt; (&lt;a href=&#34;https://t.me/BayesSpamSniperBot&#34;&gt;https://t.me/BayesSpamSniperBot&lt;/a&gt;). The project is open-sourced at: &lt;a href=&#34;https://github.com/ramsayleung/bayes_spam_sniper&#34;&gt;https://github.com/ramsayleung/bayes_spam_sniper&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&#34;telegram&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1.1&lt;/span&gt; Telegram&lt;/h3&gt;
&lt;p&gt;Telegram is a popular instant messaging application, similar to Snapchat and WhatsApp, with over 1 billion users.&lt;/p&gt;
&lt;p&gt;It supports many powerful features like cloud chat history storage, clients for Linux, Mac, Windows, Android, iOS, and Web (all open-source), channels, and arguably the most powerful bot system I&amp;rsquo;ve ever seen.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="preface"><span class="section-num">1</span> Preface</h2>
<p>I spent a weekend building a Telegram spam blocker bot based on a Bayesian algorithm, <code>@BayesSpamSniperBot</code> (<a href="https://t.me/BayesSpamSniperBot">https://t.me/BayesSpamSniperBot</a>). The project is open-sourced at: <a href="https://github.com/ramsayleung/bayes_spam_sniper">https://github.com/ramsayleung/bayes_spam_sniper</a></p>
<h3 id="telegram"><span class="section-num">1.1</span> Telegram</h3>
<p>Telegram is a popular instant messaging application, similar to Snapchat and WhatsApp, with over 1 billion users.</p>
<p>It supports many powerful features like cloud chat history storage, clients for Linux, Mac, Windows, Android, iOS, and Web (all open-source), channels, and arguably the most powerful bot system I&rsquo;ve ever seen.</p>
<h2 id="origin"><span class="section-num">2</span> Origin</h2>
<p>I usually listen to podcasts while running and cooking. 《<a href="https://podcasts.apple.com/us/podcast/%E8%BD%AF%E4%BB%B6%E9%82%A3%E4%BA%9B%E4%BA%8B%E5%84%BF/id1147186605">软件那些事儿(A podcast in Chinese about history and story behind software)</a>》<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> is one of my favorites, hosted by <a href="https://liuyandong.com/sample-page">栋哥</a> <sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Because I enjoyed 栋哥&rsquo;s show, I took the chance to join his Telegram channel.</p>
<p>栋哥&rsquo;s Telegram channel <a href="https://t.me/huruanhuying">汗牛充栋</a> <sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> is primarily used for releasing podcast information.
He once enabled comments on the channel, but that unexpectedly attracted a flood of crypto-related users posting spam, leading him to disable comments again:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-be4f5" hidden>
    <label for="zoomCheck-be4f5">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/spam_concert.jpg"/> 
    
    
    </label>
</figure>

<p>Another channel I subscribe to, <a href="https://t.me/kaedeharakazuha17">Ray Tracing</a> <sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, also complained about the crypto spam:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-d541b" hidden>
    <label for="zoomCheck-d541b">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/ray_tracing_spam.jpg"/> 
    
    
    </label>
</figure>

<h2 id="hackers-and-painters"><span class="section-num">3</span> Hackers &amp; Painters</h2>
<p>Most common Telegram spam blocker bots are keyword-based, blocking messages by matching keywords, which can be easily bypassed by spammers.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-1401f" hidden>
    <label for="zoomCheck-1401f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/keyword_based_blocker.jpg"/> 
    
    
    </label>
</figure>

<p>If spam slips past the keyword filter, an administrator has to delete it manually.</p>
<p>This reminded me of the situation Paul Graham described in a 2002 essay collected in &ldquo;Hackers &amp; Painters&rdquo;:</p>
<p>When email became popular, there was also a lot of spam. Common spam blockers were keyword matching + email address blacklists, but these were inefficient and easily circumvented.</p>
<p>Paul Graham creatively used Bayes&rsquo; theorem to implement a <a href="https://paulgraham.com/spam.html">spam blocker</a> <sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>, and the results were surprisingly effective.</p>
<p>Isn&rsquo;t this a similar problem for Telegram spam?</p>
<p>Couldn&rsquo;t I use a similar solution to tackle Telegram spam?</p>
<h3 id="bayes-theorem"><span class="section-num">3.1</span> Bayes&rsquo; Theorem</h3>
<p>When it comes to probabilistic algorithms, the most classic example is the &ldquo;coin toss&rdquo; – a case of classical probability where each toss is an independent event, and the previous outcome doesn&rsquo;t affect the next probability.</p>
<p>However, many real-world scenarios aren&rsquo;t infinitely repeatable like coin tosses, and events are often not independent.</p>
<p>This is where Bayes&rsquo; theorem shows its unique value.</p>
<p>It is used to update our degree of belief in a hypothesis given certain evidence.</p>
<p>In other words, the Bayes algorithm can dynamically adjust the estimated probability of an event occurring based on continuously emerging new evidence.</p>
<p>Simply put, it&rsquo;s like the human brain&rsquo;s learning process: we start with a preliminary understanding, then revise our original view based on new information, thereby adjusting our next actions.</p>
<p>Paul Graham used Bayes&rsquo; theorem to classify each new email as spam or not, based on emails already identified as spam or non-spam (ham).</p>
<p>To understand Bayes&rsquo; theorem more intuitively, I recommend this clear and easy-to-understand video:</p>
<ul>
<li>《<a href="https://www.youtube.com/watch?v=HZGCoVF3YvM">Bayes theorem, the geometry of changing beliefs</a>》<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></li>
</ul>
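<p>As a small worked example of the belief update described above, the following sketch applies Bayes&rsquo; rule to a single piece of evidence: one word appearing in a message. The function name and all the numbers are made up for illustration.</p>

```ruby
# Bayes' rule for one piece of evidence:
#   P(spam | word) = P(word | spam) * P(spam) /
#                    (P(word | spam) * P(spam) + P(word | ham) * P(ham))
def posterior_spam(prior_spam:, p_word_given_spam:, p_word_given_ham:)
  spam_path = p_word_given_spam * prior_spam
  ham_path  = p_word_given_ham * (1.0 - prior_spam)
  spam_path / (spam_path + ham_path)
end

# Made-up numbers: 40% of all messages are spam, and the word "airdrop"
# shows up in 50% of spam messages but only 5% of normal ones.
posterior = posterior_spam(prior_spam: 0.4,
                           p_word_given_spam: 0.5,
                           p_word_given_ham: 0.05)
puts posterior.round(3)
```

<p>With these numbers, seeing the word raises the belief that the message is spam from 40% to roughly 87%. Each further piece of evidence updates the belief again in the same way.</p>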
<h2 id="architecture-design"><span class="section-num">4</span> Architecture Design</h2>
<p>Telegram Bot supports two modes of interacting with Telegram servers:</p>
<ol>
<li>
<p>Webhook: Telegram servers actively call back a URL previously registered by the Bot whenever the Bot receives a new message. The Bot Server only needs to handle the callback messages.</p>
</li>
<li>
<p>Long Polling: The Bot Server continuously polls the Telegram servers to check for new messages and processes them if any. This bot uses this mode.</p>

    <figure>
        
        
        <input type="checkbox" id="zoomCheck-ca97a" hidden>
        <label for="zoomCheck-ca97a">
        
        
        <img class="zoomCheck" loading="lazy" src="/ox-hugo/webhook_vs_long_polling.jpg"/> 
        
        
        </label>
    </figure>

</li>
</ol>
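<p>The long polling mode can be sketched as follows. This is not the bot&rsquo;s actual code: <code>poll_once</code> and <code>fetch_updates</code> are hypothetical names, and <code>fetch_updates</code> stands in for an HTTP call to the Bot API&rsquo;s <code>getUpdates</code> method, which accepts an <code>offset</code> and a <code>timeout</code>. The one detail worth noting is that the next offset must be the highest <code>update_id</code> seen plus one, which is how the client acknowledges processed updates.</p>

```ruby
# One round of long polling. `fetch_updates` is a hypothetical callable that
# stands in for an HTTP request to getUpdates; it takes the offset and a
# timeout in seconds and returns an array of update hashes.
def poll_once(fetch_updates, offset, timeout: 30)
  updates = fetch_updates.call(offset, timeout)
  updates.each { |update| yield update }
  # Acknowledge what we processed: ask for the id after the newest one.
  updates.empty? ? offset : updates.map { |u| u["update_id"] }.max + 1
end

# Fake fetcher for demonstration: one pending update, then nothing.
fake = lambda do |offset, _timeout|
  offset.zero? ? [{ "update_id" => 7, "message" => { "text" => "hi" } }] : []
end

seen = []
next_offset = poll_once(fake, 0) { |update| seen << update["message"]["text"] }
next_offset = poll_once(fake, next_offset) { |update| seen << update["message"]["text"] }
```

<p>A real bot server would call <code>poll_once</code> in an endless loop and hand each update to a worker.</p>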
<h4 id="message-analysis"><span class="section-num">4.0.1</span> Message Analysis</h4>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-e2518" hidden>
    <label for="zoomCheck-e2518">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/spam_analyze.jpg"/> 
    
    
    </label>
</figure>

<p>After the Bot Server receives a message, it dispatches it to a separate <code>telegram_bot_worker</code> for processing. Using the pre-trained model, the worker judges whether the message is spam. If it is, the worker calls the Bot API to delete the message.</p>
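<p>The classification step can be illustrated with a tiny naive Bayes classifier. This is a sketch under stated assumptions, not the project&rsquo;s real <code>SpamClassifierService</code>: the class name <code>TinyBayes</code>, the add-one smoothing, and the tokenizer are all my own choices for the example. In particular, the tokenizer only splits on non-alphanumeric characters, so text in languages without spaces, such as Chinese, would need a proper word segmenter.</p>

```ruby
# Minimal naive Bayes text classifier with add-one (Laplace) smoothing.
# Hypothetical example class; train at least one message before classifying.
class TinyBayes
  LABELS = %i[spam ham].freeze

  def initialize
    @word_counts = { spam: Hash.new(0), ham: Hash.new(0) }
    @doc_counts  = { spam: 0, ham: 0 }
  end

  # Record one labelled message.
  def train(label, text)
    @doc_counts[label] += 1
    tokenize(text).each { |word| @word_counts[label][word] += 1 }
  end

  # Returns :spam or :ham, whichever has the higher log-probability.
  def classify(text)
    words = tokenize(text)
    LABELS.max_by { |label| log_score(label, words) }
  end

  private

  def tokenize(text)
    text.downcase.scan(/[[:alnum:]]+/)
  end

  # log P(label) + sum of log P(word | label); logs avoid float underflow.
  def log_score(label, words)
    total_docs  = @doc_counts.values.sum
    vocab_size  = (@word_counts[:spam].keys | @word_counts[:ham].keys).size
    total_words = @word_counts[label].values.sum
    prior = Math.log((@doc_counts[label] + 1.0) / (total_docs + 2.0))
    words.sum(prior) do |word|
      Math.log((@word_counts[label][word] + 1.0) / (total_words + vocab_size))
    end
  end
end

classifier = TinyBayes.new
classifier.train(:spam, "Free crypto airdrop, claim now")
classifier.train(:spam, "Crypto giveaway: claim your airdrop")
classifier.train(:ham,  "New podcast episode is out")
classifier.train(:ham,  "See you at the meetup tomorrow")
puts classifier.classify("claim free crypto")
```

<p>Unlike a keyword list, the decision depends on how often every word appeared in each class, so rewording a single keyword does not automatically get a message through.</p>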
<h4 id="ban-and-train"><span class="section-num">4.0.2</span> Ban and Train</h4>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-04d7c" hidden>
    <label for="zoomCheck-04d7c">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/mark_spam_and_ban_user.jpg"/> 
    
    
    </label>
</figure>

<p>After the Bot Server receives a message, it dispatches it to a separate <code>telegram_bot_worker</code> for processing. The <code>telegram_bot_worker</code> calls the bot API to delete the message and ban the user, and inserts a training data record marked as spam.</p>
<p>Saving the training data triggers a hook, creating a training message delivered to the <code>training</code> message queue. Another worker, <code>classifier_trainer</code>, subscribes to <code>training</code> messages and uses the new messages to retrain and update the model.</p>
<p>Using a queue and a background process (<code>classifier_trainer</code>) for training tasks, instead of directly using the <code>telegram_bot_worker</code>, primarily decouples the Bot request handling from model training. Otherwise, as the model size increases, training time would get longer and longer, leading to increasingly long response times.</p>
<p>Decoupling makes it easy to scale.</p>
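<p>The decoupling can be demonstrated with nothing but Ruby&rsquo;s standard library. In this sketch a thread and an in-memory <code>Queue</code> stand in for the <code>classifier_trainer</code> process and the <code>training</code> queue; <code>handle_markspam</code> is a hypothetical handler name, and the <code>sleep</code> merely simulates slow retraining.</p>

```ruby
# Stand-in for the `training` queue and the classifier_trainer worker,
# using Ruby's built-in thread-safe Queue instead of a real job backend.
training_queue = Queue.new
trained = []

trainer = Thread.new do
  # Queue#pop blocks until a job arrives; nil is our stop signal.
  while (job = training_queue.pop)
    sleep 0.01 # pretend that retraining the model is slow
    trained << job
  end
end

# The bot worker only enqueues the job and returns right away,
# so its response time does not grow with the training time.
def handle_markspam(queue, group_id)
  queue << { group_id: group_id }
  :deleted_and_banned
end

result = handle_markspam(training_queue, 42)
training_queue << nil # tell the trainer to stop
trainer.join
```

<p>Because the two sides only share the queue, the trainer can later be moved to another process or machine without touching the request handling code.</p>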
<h2 id="why-rails"><span class="section-num">5</span> Why Rails</h2>
<p>Anyone who has seen the project&rsquo;s source code might wonder: why was it implemented using Ruby on Rails?</p>
<p>Because I work with JVM languages (Java/Kotlin/Scala) and Rust, I&rsquo;m quite familiar with Java/Rust. Initially, thinking model training might require high performance, my first <a href="https://gist.github.com/ramsayleung/5848af0177a70a01d41f624e361b1b5d">prototype</a> <sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup> was implemented in Rust, taking about half an hour.</p>
<p>But when I wanted to expand the prototype into a Telegram bot, I found I needed to handle a lot of logic related to bot interaction, mainly involving API and database operations, most of which were unrelated to the model. So, I thought of Ruby on Rails again.</p>
<p>For a single engineer building a product prototype, in my personal opinion, there&rsquo;s really no framework more efficient than <code>Ruby on Rails</code>, so I switched to Ruby on Rails.</p>
<p>New features in Rails 8 move it further towards being a so-called &ldquo;one-person full-stack framework,&rdquo; including built-in support for background jobs through <code>Solid Queue</code>, which is backed by the relational database.</p>
<p>The queue and background process from the architecture design were implemented with just a few lines of code, without even needing extra configuration. If the queue doesn&rsquo;t exist, the framework creates it automatically:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-ruby" data-lang="ruby"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">ClassifierTrainerJob</span> <span class="o">&lt;</span> <span class="no">ApplicationJob</span>
</span></span><span class="line"><span class="cl">  <span class="c1"># Job to train classifier asynchronously</span>
</span></span><span class="line"><span class="cl">  <span class="n">queue_as</span> <span class="ss">:training</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="k">def</span> <span class="nf">perform</span><span class="p">(</span><span class="n">group_id</span><span class="p">,</span> <span class="n">group_name</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="no">SpamClassifierService</span><span class="o">.</span><span class="n">rebuild_for_group</span><span class="p">(</span><span class="n">group_id</span><span class="p">,</span> <span class="n">group_name</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="k">end</span>
</span></span><span class="line"><span class="cl"><span class="k">end</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Thanks to Rails&rsquo; powerful ORM framework and its built-in lifecycle hooks, the code to trigger the background process for retraining the model after inserting new training data is also just a few lines:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-ruby" data-lang="ruby"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">TrainedMessage</span> <span class="o">&lt;</span> <span class="no">ApplicationRecord</span>
</span></span><span class="line"><span class="cl">  <span class="c1"># Automatically retrain the classifier after a message is created or destroyed</span>
</span></span><span class="line"><span class="cl">  <span class="n">after_create</span> <span class="ss">:retrain_classifier</span>
</span></span><span class="line"><span class="cl">  <span class="n">after_destroy</span> <span class="ss">:retrain_classifier</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="k">def</span> <span class="nf">retrain_classifier</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># Queue retraining as a background job so the caller is not blocked</span>
</span></span><span class="line"><span class="cl">    <span class="no">ClassifierTrainerJob</span><span class="o">.</span><span class="n">perform_later</span><span class="p">(</span><span class="n">group_id</span><span class="p">,</span> <span class="n">group_name</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="k">end</span>
</span></span><span class="line"><span class="cl"><span class="k">end</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Empowered by Rails&rsquo; various built-in powerful tools, I implemented the entire bot&rsquo;s functionality in just one day.</p>
<p>Seeing this, some friends might worry about performance, thinking Ruby&rsquo;s performance isn&rsquo;t great, and it&rsquo;s a dynamic language, making it hard to maintain.</p>
<p>My view remains the same as in my previous blog post 《<a href="https://ramsayleung.github.io/zh/post/2024/%E7%BC%96%E7%A8%8B%E5%8D%81%E5%B9%B4%E7%9A%84%E6%84%9F%E6%82%9F/">Thoughts on a Decade of Programming(In Chinese)</a>》<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>:</p>
<p><strong>Get it running first</strong>.</p>
<p>Build a prototype and get it running. See if users are willing to use your product first.</p>
<p>By the time execution speed becomes a bottleneck, your business must already be very successful, and you&rsquo;ll surely have enough resources to hire a team of programmers to rewrite the project in Rust/C++, or even assembly.</p>
<p>Without users, discussing performance is a pseudo-proposition.</p>
<p>As for the saying &ldquo;dynamic languages are fast in the moment, but maintaining the code is a nightmare,&rdquo; I quite agree with that too.</p>
<p>Therefore, I won&rsquo;t consider dynamic languages when choosing a tech stack for a team; I would only use compiled languages, even strongly typed ones like Rust.</p>
<p>But right now, it&rsquo;s just me building a prototype, so I use whatever I&rsquo;m most proficient with.</p>
<h3 id="vibe-coding"><span class="section-num">5.1</span> Vibe Coding?</h3>
<p>Concepts like Vibe Coding and AI programming are everywhere, overwhelming the discourse. You might naturally wonder if this project was generated by Vibe Coding.</p>
<p>The answer is, I tried for a few hours and then gave up entirely. I tried both Claude 4 and Gemini 2.5 Pro.</p>
<p>I started with a Rust + Cloudflare Worker tech stack. Rust + Cloudflare Worker is a relatively niche field with limited training data. The code generated by Vibe Coding failed to compile.</p>
<p>Later, I switched to Ruby on Rails, and the problems became even worse. Ruby is a dynamic language; its syntax is almost like English, and Rails has many &ldquo;black magic&rdquo; metaprogramming features.</p>
<p>So errors only appeared at runtime. The development time saved by code generation was entirely consumed by the debugging process.</p>
<p>Another issue is that code generated by Vibe Coding often lacks design. For example, it tightly coupled the <code>Classifier</code> and <code>TrainedMessage</code> classes, having the <code>Classifier</code> persist <code>TrainedMessage</code> instances.</p>
<p>It also directly performed synchronous model training within the <code>telegram_bot_worker</code> process upon receiving training data, waiting for training to finish before returning the command result, completely neglecting to decouple receiving training data from model training.</p>
<p>One can only say that Vibe Coding is quite suitable for strongly-typed, compiled languages like Rust – at least the generated code has to compile.</p>
<p>As for those claims of &ldquo;making an APP without writing/changing a single line of code,&rdquo; I can&rsquo;t help but wonder:</p>
<p>Is the code so good that it doesn&rsquo;t need a single change? Or can the developer not identify the crux of the problem, and thus doesn&rsquo;t change anything?</p>
<h2 id="design-philosophy"><span class="section-num">6</span> Design Philosophy</h2>
<p>After developing the prototype and making the bot&rsquo;s core functionality usable, many ideas popped into my head.</p>
<p>I immediately rushed to add them to the bot,
resulting in support for nearly ten commands, plus different modes for private chats and group chats.</p>
<p>But I felt something was off. Adding so many features felt like those all-in-one super apps common in China. I began to question:</p>
<p>Would users really use all these features? Would <strong>any</strong> users use these features? Don&rsquo;t too many features also create extra cognitive burden?</p>
<p>My favorite ad blocker, <a href="https://ublockorigin.com/">uBlock Origin</a> <sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup>, is powerful and extremely effective at blocking, yet very simple and easy to use.</p>
<p>Recalling the design philosophy mentioned in 《<a href="https://ramsayleung.github.io/zh/post/2025/a_philosophy_of_software_design/">A Philosophy of Software Design(My thought about the book in Chinese)</a>》<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup>, the interface should be simple and easy to use, even if the functionality underneath is complex and rich.</p>
<p>So, I first removed all commands I considered unrelated to the core functionality.</p>
<p>Furthermore, considering that most users might not have a technical background and might not know how to use commands, I optimized the interface to use buttons as much as possible, allowing users to click directly, improving usability:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-5f8d5" hidden>
    <label for="zoomCheck-5f8d5">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/start_en.jpg"/> 
    
    
    </label>
</figure>


<figure>
    
    
    <input type="checkbox" id="zoomCheck-bc48d" hidden>
    <label for="zoomCheck-bc48d">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/help_en.jpg"/> 
    
    
    </label>
</figure>

<p>I also wanted to support multiple languages (e.g., automatically switching to Chinese or English based on the user&rsquo;s system language). This required decent Internationalization support.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-e207f" hidden>
    <label for="zoomCheck-e207f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/start_zh.jpg"/> 
    
    
    </label>
</figure>


<figure>
    
    
    <input type="checkbox" id="zoomCheck-ca4b6" hidden>
    <label for="zoomCheck-ca4b6">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/help_page_zh.jpg"/> 
    
    
    </label>
</figure>

<p>Over 60% of the code in the core service class <a href="https://github.com/ramsayleung/bayes_spam_sniper/blob/master/app/services/telegram_botter.rb">telegram_botter.rb</a> was introduced for such usability improvements.</p>
<p>Simplicity for the user, complexity for the developer.</p>
<h3 id="how-to-use"><span class="section-num">6.1</span> How to Use</h3>
<p>Just two steps, and the bot works automatically.</p>
<ul>
<li><a href="https://t.me/BayesSpamSniperBot?startgroup=true">Add the bot (@BayesSpamSniperBot) to your group</a></li>
<li>Grant the bot admin permissions (Delete messages, Ban users)</li>
</ul>
<p>After these two steps, the bot starts working automatically: it identifies spam within the group, deletes the spam text messages, and bans users who send spam more than 3 times.</p>
<p>It will also become smarter with community usage (via <code>/markspam</code> and <code>/feedspam</code>).</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-28a2c" hidden>
    <label for="zoomCheck-28a2c">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/detect_spam_and_ban_user.jpg"/> 
    
    
    </label>
</figure>

<p>The design philosophy of this bot is to minimize disruption to admins and users, provide simple operation commands, and maximize automation.
Therefore, this bot only offers the following commands (supporting auto-completion with &ldquo;/&rdquo;):</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-ecf1f" hidden>
    <label for="zoomCheck-ecf1f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/command_auto_completion.jpg"/> 
    
    
    </label>
</figure>

<h4 id="markspam"><span class="section-num">6.1.1</span> <code>/markspam</code></h4>
<p>Delete spam messages and ban the user. Requires admin permissions.</p>
<p>Reply with <code>/markspam</code> to the spam message, and the bot will automatically delete that message and ban the user who sent it.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-14d59" hidden>
    <label for="zoomCheck-14d59">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/markspam_2.jpg"/> 
    
    
    </label>
</figure>

<p>(Message has also been deleted)
<img loading="lazy" src="/ox-hugo/markspam.jpg"></p>
<p>Unlike common group management bots, this command does more than delete the spam and ban the user. Because the message was marked as spam by an admin, the label has very high confidence, so the bot also uses the message as training data to update the model in real time.</p>
<p>Similar messages will not only be identified next time, but all groups using this bot will benefit, as similar texts will also be marked as spam.</p>
<h4 id="listspam"><span class="section-num">6.1.2</span> <code>/listspam</code></h4>
<p>View the list of spam messages. Requires admin permissions.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-d0038" hidden>
    <label for="zoomCheck-d0038">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/listspam.jpg"/> 
    
    
    </label>
</figure>

<p>View the list of spam messages and proactively mark false positives as normal messages.</p>
<h4 id="listbanuser"><span class="section-num">6.1.3</span> <code>/listbanuser</code></h4>
<p>View the list of banned accounts. Requires admin permissions.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-7751e" hidden>
    <label for="zoomCheck-7751e">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/listbanuser.jpg"/> 
    
    
    </label>
</figure>

<p>View the list of banned users and proactively unban them.</p>
<h4 id="feedspam"><span class="section-num">6.1.4</span> <code>/feedspam</code></h4>
<p>Feed spam messages for training. No permissions required. Can be used in private chat or in-group.</p>
<p>Feeding in private chat:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-2a139" hidden>
    <label for="zoomCheck-2a139">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/feedspam.jpg"/> 
    
    
    </label>
</figure>

<p>Feeding in-group:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-1fc12" hidden>
    <label for="zoomCheck-1fc12">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/feedspam2.jpg"/> 
    
    
    </label>
</figure>

<h2 id="eating-your-own-dog-food"><span class="section-num">7</span> Eating your own dog food</h2>
<p>In the software development field, there&rsquo;s a saying: &ldquo;Eating your own dog food,&rdquo; which means you should use the things you develop yourself.</p>
<p>So I created my own channel for testing: <a href="https://t.me/pipeapplebun">菠萝油与天光墟</a> <sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup>. Unfortunately, it has very few subscribers,
which fails to attract many spammers. So everyone is welcome to subscribe or come in to post spam, to attract more spammers.</p>
<p>In my channel, everyone has the right to speak freely :-) (the only slight drawback is a limit on frequency).</p>
<p>Since no one was posting spam in my channel and I was short of training data, I had to do it the hard way: join various crypto and NSFW groups and actively seek out spam.</p>
<p>The screenshots below show Chinese Telegram groups full of spammers and spam:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-3779e" hidden>
    <label for="zoomCheck-3779e">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/telegram_group1.jpg"/> 
    
    
    </label>
</figure>


<figure>
    
    
    <input type="checkbox" id="zoomCheck-35262" hidden>
    <label for="zoomCheck-35262">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/telegram_group2.jpg"/> 
    
    
    </label>
</figure>


<figure>
    
    
    <input type="checkbox" id="zoomCheck-b0b9b" hidden>
    <label for="zoomCheck-b0b9b">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/spam_sample.jpg"/> 
    
    
    </label>
</figure>

<p>Since developing this bot, my perspective on spam has changed. I used to find spam in other groups annoying, but now I&rsquo;m happy to see it there,
as it is valuable training data that I need to record quickly before it gets deleted.</p>
<h3 id="the-ingenuity-of-spam"><span class="section-num">7.1</span> The Ingenuity of Spam</h3>
<p>The algorithm&rsquo;s effectiveness in other people&rsquo;s stories is always surprisingly good, but when I run it myself, I always find various uncovered cases and unexpected surprises, just like life.</p>
<p>Although keyword blockers are inefficient, the spam we actually see is the spam that has already bypassed keyword filters.</p>
<p>For example:</p>
<blockquote>
<p>在 币圈 想 赚 钱，那 你 不关 注 这 个 王 牌 社 区，真的太可惜了，真 心 推 荐，每 天 都 有 免 费 策 略
(Want to make money in the crypto circle? It&rsquo;s a real pity if you don&rsquo;t follow this ace community. Sincerely recommended, free strategies every day)</p></blockquote>
<p>Or</p>
<blockquote>
<p>这人简-介挂的 合-约-报单群组挺牛的ETH500点，大饼5200点！ + @BTCETHl6666
(The contract reporting group linked in this person&rsquo;s bio is pretty awesome. ETH to $500, BTC to $5200! + @BTCETHl6666)</p></blockquote>
<p>The former uses spaces to bypass keywords, the latter uses punctuation.</p>
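<p>To see why both tricks work, here is a minimal Ruby sketch of a naive keyword blocker. The blocklist and the <code>keyword_hit?</code> helper are hypothetical and exist only for illustration; they are not taken from the bot.</p>

```ruby
# Hypothetical keyword blocklist; the keywords are illustrative only.
BLOCKED_KEYWORDS = %w[赚钱 合约].freeze

# Returns true when the text contains any blocked keyword verbatim.
def keyword_hit?(text)
  BLOCKED_KEYWORDS.any? { |keyword| text.include?(keyword) }
end

puts keyword_hit?("在币圈想赚钱")     # true:  the keyword appears verbatim
puts keyword_hit?("在 币圈 想 赚 钱") # false: spaces split the keyword
puts keyword_hit?("合-约-报单群组")   # false: hyphens split the keyword
```

<p>A single inserted character is enough to defeat an exact substring match, which is why a plain blocklist needs constant manual upkeep.</p>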
<p>Unlike languages based on the Latin alphabet which naturally use spaces for word separation, Chinese requires word segmentation before statistical analysis with Bayes&rsquo; theorem.</p>
<blockquote>
<p>the fox jumped over the lazy dog</p>
<p>我们的中文就不一样了(Our Chinese is different)</p></blockquote>
<p>&ldquo;我们的中文就不一样了&rdquo; (Our Chinese is different) would be segmented into &ldquo;我们 | 的 | 中文 | 就 | 不 | 一样 | 了&rdquo; before word frequency can be counted.</p>
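<p>The difference is easy to reproduce in plain Ruby. In this sketch, <code>whitespace_tokens</code> and <code>word_counts</code> are hypothetical helper names, and the segmented token list is typed in by hand from the example above instead of being produced by a real segmenter such as jieba.</p>

```ruby
# Splitting on whitespace is enough to tokenize English text.
def whitespace_tokens(text)
  text.split
end

# Counts how often each token occurs (Enumerable#tally, Ruby 2.7+).
def word_counts(tokens)
  tokens.tally
end

english = "the fox jumped over the lazy dog"
chinese = "我们的中文就不一样了"

p word_counts(whitespace_tokens(english))["the"] # => 2
p whitespace_tokens(chinese).size                # => 1, the whole sentence is one "word"

# Hand-written segmentation result, copied from the text above.
segmented = %w[我们 的 中文 就 不 一样 了]
p word_counts(segmented).size                    # => 7
```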
<p>But for the spam <code>在 币圈 想 赚 钱，那 你 不关 注 这 个 王 牌 社 区，真的太可惜了，真 心 推 荐，每 天 都 有 免 费 策 略</code>, the spaces not only affect keyword matching but also affect segmentation. The segmentation result for this sentence becomes (incorrectly):</p>
<p><code>在 | | 币圈 | | 想 | | 赚 | | 钱 | ， | 那 | | 你 | | 不 | 关 | | 注 | | 这 | | 个 | | 王 | | 牌 | | 社 | | 区 | ， | 真的 | 太 | 可惜 | 了 | ， | 真 | | 心 | | 推 | | 荐 | ， | 每 | | 天 | | 都 | | 有 | | 免 | | 费 | | 策 | | 略</code></p>
<p><code>这人简-介挂的 合-约-报单群组挺牛的ETH500点，大饼5200点！ + @BTCETHl6666</code> would be segmented into:</p>
<p><code>这人简 | - | 介挂 | 的 | | 合 | - | 约 | - | 报单 | 群组 | 挺 | 牛 | 的 | ETH500 | 点 | ， | 大饼 | 5200 | 点 | ！ | | + | | @ | BTCETHl6666</code></p>
<p>Unsanitized training data would affect the model&rsquo;s results, showing the importance of training data quality. Therefore, I performed corresponding preprocessing on the training data:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-ruby" data-lang="ruby"><span class="line"><span class="cl"><span class="c1"># Step 1: Handle anti-spam separators</span>
</span></span><span class="line"><span class="cl"><span class="c1"># This still handles the cases like &#34;合-约&#34; -&gt; &#34;合约&#34;</span>
</span></span><span class="line"><span class="cl"><span class="n">previous</span> <span class="o">=</span> <span class="s2">&#34;&#34;</span>
</span></span><span class="line"><span class="cl"><span class="k">while</span> <span class="n">previous</span> <span class="o">!=</span> <span class="n">cleaned</span>
</span></span><span class="line"><span class="cl">  <span class="n">previous</span> <span class="o">=</span> <span class="n">cleaned</span><span class="o">.</span><span class="n">dup</span>
</span></span><span class="line"><span class="cl">  <span class="n">cleaned</span> <span class="o">=</span> <span class="n">cleaned</span><span class="o">.</span><span class="n">gsub</span><span class="p">(</span><span class="sr">/([一-龯A-Za-z0-9])[^一-龯A-Za-z0-9]+([一-龯A-Za-z0-9])/</span><span class="p">,</span> <span class="s1">&#39;\1\2&#39;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">end</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Step 2: Handle anti-spam SPACES between Chinese characters</span>
</span></span><span class="line"><span class="cl"><span class="c1"># This specifically targets the &#34;想 赚 钱&#34; -&gt; &#34;想赚钱&#34; case.</span>
</span></span><span class="line"><span class="cl"><span class="c1"># We run it in a loop to handle multiple spaces, e.g., &#34;社 区&#34; -&gt; &#34;社区&#34;</span>
</span></span><span class="line"><span class="cl"><span class="n">previous</span> <span class="o">=</span> <span class="s2">&#34;&#34;</span>
</span></span><span class="line"><span class="cl"><span class="k">while</span> <span class="n">previous</span> <span class="o">!=</span> <span class="n">cleaned</span>
</span></span><span class="line"><span class="cl">  <span class="n">previous</span> <span class="o">=</span> <span class="n">cleaned</span><span class="o">.</span><span class="n">dup</span>
</span></span><span class="line"><span class="cl">  <span class="c1"># Find a Chinese char, followed by one or more spaces, then another Chinese char</span>
</span></span><span class="line"><span class="cl">  <span class="n">cleaned</span> <span class="o">=</span> <span class="n">cleaned</span><span class="o">.</span><span class="n">gsub</span><span class="p">(</span><span class="sr">/([一-龯])(\s+)([一-龯])/</span><span class="p">,</span> <span class="s1">&#39;\1\3&#39;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="k">end</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Step 3: Add strategic spaces</span>
</span></span><span class="line"><span class="cl"><span class="c1"># This helps jieba segment properly, e.g., &#34;社区ETH&#34; -&gt; &#34;社区 ETH&#34;</span>
</span></span><span class="line"><span class="cl"><span class="n">cleaned</span> <span class="o">=</span> <span class="n">cleaned</span><span class="o">.</span><span class="n">gsub</span><span class="p">(</span><span class="sr">/([一-龯])([A-Za-z0-9])/</span><span class="p">,</span> <span class="s1">&#39;\1 \2&#39;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">cleaned</span> <span class="o">=</span> <span class="n">cleaned</span><span class="o">.</span><span class="n">gsub</span><span class="p">(</span><span class="sr">/([A-Za-z0-9])([一-龯])/</span><span class="p">,</span> <span class="s1">&#39;\1 \2&#39;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Step 4: Remove excessive space</span>
</span></span><span class="line"><span class="cl"><span class="n">cleaned</span> <span class="o">=</span> <span class="n">cleaned</span><span class="o">.</span><span class="n">gsub</span><span class="p">(</span><span class="sr">/\s+/</span><span class="p">,</span> <span class="s1">&#39; &#39;</span><span class="p">)</span><span class="o">.</span><span class="n">strip</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>After preprocessing, <code>在 币圈 想 赚 钱，那 你 不关 注 这 个 王 牌 社 区，真的太可惜了，真 心 推 荐，每 天 都 有 免 费 策 略</code> becomes <code>在币圈想赚钱那你不关注这个王牌社区真的太可惜了真心推荐每天都有免费策略</code> (Note: legitimate commas are also removed here. I find the segmentation result acceptable compared to the impact of excessive punctuation). Its segmentation is:</p>
<p><code>在 | 币圈 | 想 | 赚钱 | 那 | 你 | 不 | 关注 | 这个 | 王牌 | 社区 | 真的 | 太 | 可惜 | 了 | 真心 | 推荐 | 每天 | 都 | 有 | 免费 | 策略</code></p>
<p><code>这人简-介挂的 合-约-报单群组挺牛的ETH500点，大饼5200点！ + @BTCETHl6666</code> becomes <code>这人简介挂的合约报单群组挺牛的 ETH500 点大饼 5200 点！ + @BTCETHl6666</code>. Its segmentation is:</p>
<p><code>这 | 人 | 简介 | 挂 | 的 | 合约 | 报单 | 群组 | 挺 | 牛 | 的 | | ETH500 | | 点 | 大饼 | | 5200 | | 点 | ！ | | + | | @ | BTCETHl6666</code></p>
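<p>For readers who want to try the preprocessing on their own samples, here is a self-contained sketch that wraps the steps in a method. It is a variant written for illustration, not the bot&rsquo;s exact code: the method name <code>preprocess</code> is hypothetical, and it assumes separators are stripped only when they sit between two Chinese characters, which is the behavior that reproduces the two results quoted above.</p>

```ruby
# Hypothetical helper; assumes separators are removed only between two
# Chinese characters, so English text and "@handles" are left alone.
def preprocess(text)
  cleaned = text.dup

  # Steps 1-2: drop any run of separators (spaces, hyphens, punctuation)
  # between two Chinese characters. Lookbehind/lookahead do not consume
  # the surrounding characters, so a single pass handles "社 区 真 的".
  cleaned = cleaned.gsub(/(?<=[一-龯])[^一-龯A-Za-z0-9]+(?=[一-龯])/, "")

  # Step 3: add a space at every Chinese/Latin boundary, e.g. "社区ETH" -> "社区 ETH".
  cleaned = cleaned.gsub(/([一-龯])([A-Za-z0-9])/, '\1 \2')
  cleaned = cleaned.gsub(/([A-Za-z0-9])([一-龯])/, '\1 \2')

  # Step 4: collapse repeated whitespace.
  cleaned.gsub(/\s+/, " ").strip
end

puts preprocess("想 赚 钱")     # 想赚钱
puts preprocess("合-约-报单群") # 合约报单群
puts preprocess("社区ETH")      # 社区 ETH
```

<p>Restricting the separator removal to Chinese neighbors is a design choice in this sketch: it keeps ordinary English sentences from being glued together.</p>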
<h4 id="genius-new-tricks"><span class="section-num">7.1.1</span> Genius New Tricks</h4>
<p>Having seen so much spam, I have to admire the creativity of spammers.</p>
<p>Because sending spam in messages gets caught by blockers, they innovatively came up with new tricks:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-fe0e1" hidden>
    <label for="zoomCheck-fe0e1">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/spam_by_username.jpg"/> 
    
    
    </label>
</figure>

<p>The messages contain normal text, but the profile picture and username are spam. This way, the spam blocker can&rsquo;t work. Truly ingenious.</p>
<p>Faced with such creative opponents, I adapted by training a separate model on usernames. During detection, both the message text model and the username model are checked.
If either one considers it spam, it gets blocked.</p>
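<p>The &ldquo;either model&rdquo; rule is a simple logical OR. In this minimal sketch, the two lambdas are hypothetical stand-ins for the trained classifiers; the real models are Bayesian, not substring checks.</p>

```ruby
# A message is blocked when EITHER model flags it.
# Both models are passed in as callables that return true for spam.
def spam?(message_text, username, text_model:, username_model:)
  text_model.call(message_text) || username_model.call(username)
end

# Hypothetical stand-ins for the two trained models.
text_model     = ->(text) { text.include?("免费策略") }
username_model = ->(name) { name.include?("合约") }

# Harmless text, but a spammy username: still blocked.
puts spam?("大家好", "合约报单群", text_model: text_model, username_model: username_model) # true
puts spam?("大家好", "Alice", text_model: text_model, username_model: username_model)      # false
```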
<p>One could go further by doing OCR on profile pictures to extract text and add another model for profile pictures, but OCR is quite costly, so I&rsquo;ll hold off for now.</p>
<h3 id="optimization"><span class="section-num">7.2</span> Optimization</h3>
<p>Without users, any optimization is unnecessary, as premature optimization is the root of all evil.
Therefore, I focus on building the prototype first. But this doesn&rsquo;t mean the prototype has no room for optimization.</p>
<p>I have several optimization ideas in mind:</p>
<ol>
<li>jieba&rsquo;s segmentation might not be the best; other NLP algorithms could be used later for optimization.</li>
<li>Retraining on every single training message is inefficient. A batching mechanism could be added, waiting for 5 minutes or accumulating 100 messages before processing.</li>
<li>Currently, the entire model is computed in memory and persisted to the DB after calculation. A cache layer between memory and the database could optimize performance.</li>
<li>The Bayesian algorithm might not be effective enough; a more complex machine learning model could be used.</li>
</ol>
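<p>The batching idea in item 2 could look roughly like the sketch below. Everything here is an assumption made for illustration: the <code>TrainingBatcher</code> class, its parameters, and the injectable clock are not part of the project. Note that the time limit is only checked when a new message arrives; a real implementation would also need a timer or background job to flush a quiet queue.</p>

```ruby
# Hypothetical batcher: collects training messages and hands them to the
# given block once 100 messages have accumulated or 5 minutes have passed.
class TrainingBatcher
  def initialize(max_size: 100, max_wait: 300, clock: -> { Time.now.to_f }, &on_flush)
    @max_size = max_size   # flush after this many messages
    @max_wait = max_wait   # or after this many seconds
    @clock = clock         # injectable so the timing can be tested
    @on_flush = on_flush   # called with the array of pending messages
    @pending = []
    @first_added_at = nil
  end

  def add(message)
    @first_added_at ||= @clock.call
    @pending << message
    flush if due?
  end

  def due?
    return false if @pending.empty?

    @pending.size >= @max_size || (@clock.call - @first_added_at) >= @max_wait
  end

  def flush
    return if @pending.empty?

    batch = @pending
    @pending = []
    @first_added_at = nil
    @on_flush.call(batch)
  end
end

batcher = TrainingBatcher.new(max_size: 2) { |batch| puts "retraining on #{batch.size} messages" }
batcher.add("spam one")
batcher.add("spam two") # prints "retraining on 2 messages"
```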
<p>But these optimization points are all nice-to-haves, not must-haves. I&rsquo;ll optimize them when actual problems arise later.</p>
<h2 id="does-it-capture-spam"><span class="section-num">8</span> Does it capture spam?</h2>
<p>Sending a message using transformed spam words:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-cdd9f" hidden>
    <label for="zoomCheck-cdd9f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/spam_messge_2.jpg"/> 
    
    
    </label>
</figure>

<p>Successfully detected and automatically deleted:</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-78c5f" hidden>
    <label for="zoomCheck-78c5f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/deleted_spam.jpg"/> 
    
    
    </label>
</figure>

<p>Some of you might say this is just a demo. Why is spam posted by others in my group still not being detected?</p>
<p>Because the Bayesian algorithm is fundamentally probability-based. If it hasn&rsquo;t encountered similar spam before, it cannot determine whether a message is spam :(</p>
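<p>A toy naive Bayes classifier makes this limitation concrete. The sketch below is not the bot&rsquo;s model: the <code>TinyBayes</code> class, the add-one (Laplace) smoothing, and the two training samples are all assumptions chosen to keep the example small.</p>

```ruby
# Toy naive Bayes over pre-segmented tokens, with add-one smoothing.
class TinyBayes
  LABELS = %i[spam ham].freeze

  def initialize
    @counts = { spam: Hash.new(0), ham: Hash.new(0) } # token counts per label
    @docs   = { spam: 0, ham: 0 }                     # messages seen per label
  end

  def train(label, tokens)
    @docs[label] += 1
    tokens.each { |token| @counts[label][token] += 1 }
  end

  # Probability that the tokens are spam; assumes both labels have been trained.
  def spam_probability(tokens)
    vocab_size = (@counts[:spam].keys | @counts[:ham].keys).size
    total_docs = @docs.values.sum.to_f

    log_score = LABELS.to_h do |label|
      total_tokens = @counts[label].values.sum
      score = Math.log(@docs[label] / total_docs)
      tokens.each do |token|
        score += Math.log((@counts[label][token] + 1.0) / (total_tokens + vocab_size))
      end
      [label, score]
    end

    1.0 / (1.0 + Math.exp(log_score[:ham] - log_score[:spam]))
  end
end

model = TinyBayes.new
model.train(:spam, %w[免费 策略])
model.train(:ham,  %w[今天 天气])

puts model.spam_probability(%w[免费 策略]) # about 0.8: seen before in spam
puts model.spam_probability(%w[全新 花样]) # 0.5: never seen, no evidence either way
```

<p>With no evidence for either class, the score stays at the prior, so a brand-new spam pattern passes until someone feeds it to the model.</p>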
<p>Be patient. All you need to do is use the <code>/markspam</code> command to delete the message and ban the user. This helps train the bot, and all users of this bot will benefit.</p>
<h2 id="conclusion"><span class="section-num">9</span> Conclusion</h2>
<p>I thoroughly enjoyed this creative process: discovering a problem, having a spark of inspiration, building a prototype, and finally polishing it into a complete project.</p>
<p>This project is powered purely by passion: the code is open source, and I pay for the server out of my own pocket, with no material return.</p>
<p>But every time I see the bot successfully block a piece of spam, the joy of creating something and seeing it work is the best reward.</p>
<ul>
<li>Project repository: <a href="https://github.com/ramsayleung/bayes_spam_sniper">https://github.com/ramsayleung/bayes_spam_sniper</a></li>
<li>Try it now: <a href="https://t.me/BayesSpamSniperBot">https://t.me/BayesSpamSniperBot</a></li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://podcasts.apple.com/us/podcast/%E8%BD%AF%E4%BB%B6%E9%82%A3%E4%BA%9B%E4%BA%8B%E5%84%BF/id1147186605">https://podcasts.apple.com/us/podcast/%E8%BD%AF%E4%BB%B6%E9%82%A3%E4%BA%9B%E4%BA%8B%E5%84%BF/id1147186605</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><a href="https://liuyandong.com/sample-page">https://liuyandong.com/sample-page</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://t.me/huruanhuying">https://t.me/huruanhuying</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p><a href="https://t.me/kaedeharakazuha17">https://t.me/kaedeharakazuha17</a>&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p><a href="https://paulgraham.com/spam.html">https://paulgraham.com/spam.html</a>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p><a href="https://www.youtube.com/watch?v=HZGCoVF3YvM">https://www.youtube.com/watch?v=HZGCoVF3YvM</a>&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p><a href="https://gist.github.com/ramsayleung/5848af0177a70a01d41f624e361b1b5d">https://gist.github.com/ramsayleung/5848af0177a70a01d41f624e361b1b5d</a>&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p><a href="https://ramsayleung.github.io/zh/post/2024/%E7%BC%96%E7%A8%8B%E5%8D%81%E5%B9%B4%E7%9A%84%E6%84%9F%E6%82%9F/">https://ramsayleung.github.io/zh/post/2024/%E7%BC%96%E7%A8%8B%E5%8D%81%E5%B9%B4%E7%9A%84%E6%84%9F%E6%82%9F/</a>&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p><a href="https://ublockorigin.com/">https://ublockorigin.com/</a>&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p><a href="https://ramsayleung.github.io/zh/post/2025/a_philosophy_of_software_design/">https://ramsayleung.github.io/zh/post/2025/a_philosophy_of_software_design/</a>&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p><a href="https://t.me/pipeapplebun">https://t.me/pipeapplebun</a>&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Reflections on Ten Years of Programming</title>
      <link>https://ramsayleung.github.io/en/post/2024/reflection_on_ten_years_of_programming/</link>
      <pubDate>Sun, 15 Dec 2024 21:09:00 -0800</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2024/reflection_on_ten_years_of_programming/</guid>
      <description>&lt;h2 id=&#34;preface&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; Preface&lt;/h2&gt;
&lt;p&gt;Malcolm Gladwell&amp;rsquo;s &amp;ldquo;10,000-hour rule&amp;rdquo; suggests that continuous investment of 10,000 hours of effort is sufficient to reach expert level in any field.
Based on 20 hours of practice per week, this requires about 3 hours of daily investment, taking roughly ten years to achieve this goal.&lt;/p&gt;
&lt;p&gt;Since I wrote my first line of C code, more than ten years have passed.
During this period, I have written over 300,000 lines of code, some of which, written at WeChat, have served more than 1 billion users.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="preface"><span class="section-num">1</span> Preface</h2>
<p>Malcolm Gladwell&rsquo;s &ldquo;10,000-hour rule&rdquo; suggests that continuous investment of 10,000 hours of effort is sufficient to reach expert level in any field.
Based on 20 hours of practice per week, this requires about 3 hours of daily investment, taking roughly ten years to achieve this goal.</p>
<p>Since I wrote my first line of C code, more than ten years have passed.
During this period, I have written over 300,000 lines of code, some of which, written at WeChat, have served more than 1 billion users.</p>
<p>Despite having written so much code, I still dare not call myself an expert.</p>
<p>However, years of being a &ldquo;builder&rdquo;, coding day after day, have allowed me to accumulate considerable insights.</p>
<p>&ldquo;Practice makes perfect&rdquo; - these insights are both reflections on programming technology and experiences of professional life.</p>
<h2 id="continuous-learning"><span class="section-num">2</span> Continuous Learning</h2>
<p>Although I started programming with C in university, my main language in college was Java,
as Java is a very mature industrial language with rich frameworks, highly popular in enterprises, with a lot of job opportunities.</p>
<p>I started web development with Java Servlets, then learned the very popular JavaEE enterprise development framework SSH, namely <a href="https://struts.apache.org/">Struts2</a> <sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>+ <a href="https://spring.io/projects/spring-framework">Spring</a> <sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>+ <a href="https://hibernate.org/">Hibernate</a> <sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. Struts2 handled control logic, Spring provided decoupling, and Hibernate served as the ORM.</p>
<p>By the time I started job hunting, the SSH concept had changed - Struts2 was replaced by <a href="https://docs.spring.io/spring-framework/reference/web/webmvc.html">SpringMVC</a> <sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, and SSH became SpringMVC + Spring + Hibernate.</p>
<p>When I interned at Ant Group, I discovered that the team&rsquo;s codebase didn&rsquo;t use Hibernate for database access, but rather <a href="https://ibatis.apache.org/">Ibatis</a> <sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>, and later switched to the newer <a href="https://mybatis.org/mybatis-3/">MyBatis</a> <sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>.</p>
<p>Ant Group internally didn&rsquo;t use Spring/SpringMVC either, but rather their in-house <a href="https://github.com/sofastack/sofa-rpc">SOFA framework</a> <sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>. The Spring community later felt that the Spring framework was too heavyweight and not conducive to rapid development, so they developed the lighter <a href="https://spring.io/projects/spring-boot">SpringBoot</a> <sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>, while Ant internally launched the SOFA version called <a href="https://github.com/sofastack/sofa-boot">SOFABoot</a> <sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup>.</p>
<p>After joining WeChat Pay, I initially wrote C++ using WeChat&rsquo;s in-house RPC framework, Svrkit. Later, due to Big Data projects, I started using Spark + Python + Hive SQL.</p>
<p>Now at AWS S3, because the business has extremely high performance and resource usage requirements, I&rsquo;m using Rust again, while legacy business still uses Java. After going full circle, I&rsquo;m back on the Java path.</p>
<p>Counting them all, over these years, I&rsquo;ve written production code in Java, C++, Python, Rust, Scala, Kotlin, and JavaScript/TypeScript.</p>
<p>Beyond work, I&rsquo;ve also learned Scheme while studying <a href="https://github.com/ramsayleung/sicp_solution">SICP</a>, Emacs Lisp from using Emacs, Swift for indie development, Ruby for Ruby on Rails, Golang for load testing, and various frameworks and libraries for different languages.</p>
<p>Since I started learning programming, I&rsquo;ve learned at least half a dozen programming languages.</p>
<p>I never define myself as a programmer of any specific language, like Java programmer or C++ programmer - I just call myself a Software Development Engineer. Languages are merely tools; as long as I keep learning, I&rsquo;ll naturally learn new programming languages when encountering new scenarios.</p>
<p>The computer world changes rapidly - new frameworks might emerge every few months, and new languages become popular every few years. Continuous learning is essential for maintaining competitiveness.</p>
<h2 id="independent-thinking"><span class="section-num">3</span> Independent Thinking</h2>
<p>WeChat used to have a tradition of giving out annual gifts to employees.</p>
<p>In 2022, the annual gift we received was an aluminum plate with &ldquo;Keep Independent Thinking 2022&rdquo; written on it.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-e312c" hidden>
    <label for="zoomCheck-e312c">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/think_independently.jpg"/> 
    
    
    </label>
</figure>

<p>The creator of WeChat has always emphasized the importance of &ldquo;independent thinking&rdquo; for WeChat, believing that if he had to choose the most important quality, he would choose &ldquo;independent thinking.&rdquo;</p>
<p>What superiors say isn&rsquo;t necessarily right, what teachers say isn&rsquo;t necessarily right, what academic institutions say isn&rsquo;t necessarily right, what media says isn&rsquo;t necessarily right, and loud voices certainly aren&rsquo;t necessarily right - after all, being reasonable doesn&rsquo;t require being loud.</p>
<p>For example, microservice architecture is very popular, and many companies are implementing microservices - does this mean monolithic architecture shouldn&rsquo;t be used?</p>
<p>Should startups or small teams adopt microservice architecture for new businesses? Or should they start with monolithic architecture and migrate to services as the business grows?</p>
<p>Development inevitably involves various decisions, such as technology selection. For your requirements, you might find a dozen components that &ldquo;seemingly&rdquo; meet the requirements,
and you might search online for evaluations of each component, finding diverse opinions. You need to independently analyze each component, identify their pros and cons, and make decisions based on your team&rsquo;s characteristics.</p>
<p>Regarding independent thinking, my favorite quote is from the HBO miniseries &ldquo;Chernobyl.&rdquo;
When scientist Valery Legasov asks the KGB to release his colleague Ulana Khomyuk, who was investigating the truth, and vouches that she poses no problem, the KGB chief responds:</p>
<blockquote>
<p>Trust, but verify.</p></blockquote>
<p>Because the first step of independent thinking is: start questioning.</p>
<h2 id="get-it-running-first"><span class="section-num">4</span> Get It Running First</h2>
<p>This phrase has a well-known variant: &ldquo;It&rsquo;s not like it doesn&rsquo;t work.&rdquo;</p>
<p>Many programmers are perfectionists, especially those who have read &ldquo;Refactoring&rdquo; and &ldquo;Design Patterns,&rdquo; who tend to spend a lot of time optimizing code and refactoring.</p>
<p>I used to have similar impulses, always wanting to spend time optimizing code, but after working on many projects, I have a strong feeling that it&rsquo;s better to get the MVP launched first and let users experience it early.</p>
<p>If no users are using it, even the best and most beautiful code is meaningless.</p>
<p>So whenever I see community members asking what language and framework to use for side projects, or whether PHP/Python/Ruby would be too slow, my view has always been: build a prototype first, and find your first user.</p>
<p>By the time performance becomes a bottleneck, your business is already very successful, and you&rsquo;ll definitely have enough money to hire a dozen programmers to rewrite your project in Rust/C++, or assembly.</p>
<p>In this regard, I strongly agree with what the co-worker sitting next to me says about code quality:</p>
<blockquote>
<p>make it run, make it fast, make it beautiful.</p></blockquote>
<p>Having recently attempted side projects, I&rsquo;ve come to the profound realization that technology might be the least important aspect of a business.</p>
<p>Building a product from scratch and promoting it to users - users only care whether your product is useful and can solve their problems.</p>
<p>They don&rsquo;t care whether you wrote it in C++/Java or JavaScript, nor do they care whether your code is elegant. Rather than obsessing over technology selection, it&rsquo;s better to build the product first and let users try it.</p>
<h2 id="the-most-comfortable-tool-is-the-best"><span class="section-num">5</span> The Most Comfortable Tool is the Best</h2>
<p>I often see people in the community asking what&rsquo;s the best language, best framework, best editor, best operating system.</p>
<p>&ldquo;Best&rdquo; is quite a subjective conclusion, and there&rsquo;s no &ldquo;best&rdquo; solution for all scenarios, but I often see community members arguing over which language is better.</p>
<p>Or when someone shares about A, others reply that B/C/D is better, then arguments ensue.</p>
<p>This reminds me of the group identity phenomenon mentioned in &ldquo;The Social Animal,&rdquo; a famous social psychology work.</p>
<p>When fans develop strong identification with a team, they view the team as part of their self-identity, leading them to:</p>
<ol>
<li>Use &ldquo;we&rdquo; instead of &ldquo;they&rdquo; to refer to the team</li>
<li>View the team&rsquo;s success as personal success</li>
<li>React defensively to criticism of the team, viewing such criticism as attacks on themselves</li>
</ol>
<p>If someone asks me this question, I answer &ldquo;the tool you&rsquo;re most comfortable and familiar with is best.&rdquo;</p>
<p>Even when it&rsquo;s just for fun, the purpose of programming is still to use computers to solve problems, and the best tool for solving problems is the one you&rsquo;re most familiar with.</p>
<p>Unless the tool you know isn&rsquo;t suitable for your problem, then naturally you need a new tool - don&rsquo;t force a square peg into a round hole.</p>
<p>Of course, if it&rsquo;s to satisfy curiosity and learn a new language, then choose what interests you.</p>
<p>When I learned Rust in 2017, it was simply because I had no classes in senior year, plenty of time, and wanted to learn something interesting and new. Rust 1.0 had only been released for 2 years then - I never expected to find work with Rust.</p>
<p>I can&rsquo;t remember where I read this passage:</p>
<blockquote>
<p>I once asked myself similar questions:</p>
<ol>
<li>Do good things necessarily become popular? Not necessarily</li>
<li>Are things I like necessarily good things? Not necessarily</li>
<li>Will I spend time and energy on something that might not become popular but I like? Yes</li>
</ol></blockquote>
<h2 id="communicate-more-with-people"><span class="section-num">6</span> Communicate More with People</h2>
<p>While programmers certainly work with machines, fundamentally we&rsquo;re still solving human problems.</p>
<p>When I first learned programming, I had a misconception that as long as I mastered the technology, I wouldn&rsquo;t need to care about &ldquo;interpersonal relationships.&rdquo;</p>
<p>So when I entered the workplace, I held that view and acted accordingly. I wasn&rsquo;t cold to others, but I inevitably came across, as friends described it, as &ldquo;aloof.&rdquo;</p>
<p>But after working for a long time, I realized that &ldquo;interpersonal relationships&rdquo; are unavoidable - this is what people call networking and connections.</p>
<p>Even if my technical abilities are solid, I need to be seen by others. Having good relationships with colleagues and leaders makes it a &ldquo;win-win&rdquo; when achievements are made.</p>
<p>So now I regularly chat with colleagues, whether or not there&rsquo;s something specific to discuss, both to get to know each other better and learn more about the group, and to find potential optimization points in colleagues&rsquo; complaints, practicing my philosophy of &ldquo;Work hard and be nice to people.&rdquo;</p>
<p>After doing this job for a long time, I&rsquo;ve found that software engineering is fundamentally human systems engineering.</p>
<h2 id="code-isn-t-everything"><span class="section-num">7</span> Code Isn&rsquo;t Everything</h2>
<p>After writing programs for a while, it&rsquo;s easy to fall under the illusion that everything can be solved with code.</p>
<p>When you&rsquo;re holding a hammer, everything looks like a nail.</p>
<p>The reality I learned is that many things cannot be solved with code. Code is just a tool that can only be used in appropriate scenarios - avoid path dependency.</p>
<p>Knowing how to write code isn&rsquo;t enough; you also need to write articles, give presentations within the company, and let others &ldquo;see you.&rdquo;</p>
<p>Professional ability in programming and project delivery is certainly important, but you also need the soft skill of marketing yourself.</p>
<p>Chatting with your boss occasionally and keeping yourself visible might be more useful than launching ten projects.</p>
<h2 id="work-with-excellent-people"><span class="section-num">8</span> Work with Excellent People</h2>
<p>After years in the industry, having worked at Ant Group, WeChat, and AWS, and having worked with all kinds of colleagues, I have an increasingly strong insight:</p>
<p><strong>Work with excellent people</strong></p>
<p>Not only can you learn excellent qualities from them and improve your technical abilities, you can also learn best practices and engineering experience. During code reviews, you can learn better programming approaches, and when you run into problems, you have reliable teammates to help and guide you.</p>
<p>You&rsquo;ll understand the uniqueness of systems developed by excellent programmers, know what simple and ergonomic systems look like, and develop your own technical taste.</p>
<p>Taste and aesthetics are abstract concepts, but after using good systems, you naturally won&rsquo;t be interested in those crude, poorly made systems that rely on boss endorsement for forced promotion.</p>
<p>Improving technical taste, while enhancing our technical cognition, can in turn help us improve design capabilities.</p>
<p>Another benefit of working with excellent colleagues is building high-quality professional networks, which benefits career development and provides more options when changing jobs or switching tracks.</p>
<p>Although startups also have excellent developers, on average Big Tech companies have a higher proportion of excellent programmers: they offer more competitive salaries and benefits, so they can naturally set higher recruitment standards.</p>
<p>For example, WeChat has a so-called interview committee (similar to a group of bar raisers). Besides interviewers from the hiring department, candidates must also pass interviews with committee interviewers, so that standards aren&rsquo;t lowered for the sake of quick hiring.</p>
<p>So I personally suggest that new graduates should go to big tech when possible, to gain experience.</p>
<p>Although I left WeChat almost two years ago, I still miss the colleagues I worked with in the same group. They were truly technically excellent, super nice people, and willing to help.</p>
<h2 id="health-is-the-foundation-of-everything"><span class="section-num">9</span> Health Is the Foundation of Everything</h2>
<p>After programming for so many years, I&rsquo;ve developed a pile of occupational diseases.</p>
<p>I&rsquo;ve had tenosynovitis since university, I developed lower back pain after working for a few years, and my once thick, dark hair is growing increasingly sparse.</p>
<p>Because Tencent headquarters had a free gym, I went to the company gym on most workdays to take advantage of the benefit - 2 days of running for cardio and 2 days of strength training on the machines, which I kept up for almost 3 years.</p>
<p>I also started paying attention to my diet, trying to eat less oil and sugar and avoid alcohol.</p>
<p>While fitness isn&rsquo;t a cure-all, at least I feel energized enough to handle high-intensity work.</p>
<p>Only when I lose something do I learn to cherish it. Only when I start taking medication and going to the hospital for follow-ups do I start paying attention to my body.</p>
<p>Although programming is fun and supporting family is important, I still need to pay attention to my body. After all, health is the foundation of everything - if it breaks down, there are no other exciting stories.</p>
<h2 id="summary"><span class="section-num">10</span> Summary</h2>
<p>Whether it&rsquo;s programming or other skills, I feel they all follow the &ldquo;Matthew Effect&rdquo; - the more I learn, the more I understand, and the faster I learn new things.</p>
<p>After writing a lot of code, I realized that a programmer&rsquo;s competitiveness isn&rsquo;t writing code, nor is it any particular language or framework.
The core competitiveness is the ability to solve problems through technology - why be constrained by any specific programming language or technology?</p>
<p>I hope ten years of programming is just a starting point, and in ten years I can write another piece: &ldquo;Reflections on Twenty Years of Programming.&rdquo;</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://struts.apache.org/">https://struts.apache.org/</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><a href="https://spring.io/projects/spring-framework">https://spring.io/projects/spring-framework</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://hibernate.org/">https://hibernate.org/</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p><a href="https://docs.spring.io/spring-framework/reference/web/webmvc.html">https://docs.spring.io/spring-framework/reference/web/webmvc.html</a>&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p><a href="https://ibatis.apache.org/">https://ibatis.apache.org/</a>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p><a href="https://mybatis.org/mybatis-3/">https://mybatis.org/mybatis-3/</a>&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p><a href="https://github.com/sofastack/sofa-rpc">https://github.com/sofastack/sofa-rpc</a>&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p><a href="https://spring.io/projects/spring-boot">https://spring.io/projects/spring-boot</a>&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p><a href="https://github.com/sofastack/sofa-boot">https://github.com/sofastack/sofa-boot</a>&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>TIL: Git Blame with Following</title>
      <link>https://ramsayleung.github.io/en/post/2024/git_blame_with_following/</link>
      <pubDate>Sat, 13 Apr 2024 12:46:00 -0700</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2024/git_blame_with_following/</guid>
      <description>&lt;p&gt;Developers usually use &lt;code&gt;git blame&lt;/code&gt; in GUI tools like GitHub Blame &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;
&lt;figure&gt;
    
    
    &lt;input type=&#34;checkbox&#34; id=&#34;zoomCheck-68df0&#34; hidden&gt;
    &lt;label for=&#34;zoomCheck-68df0&#34;&gt;
    
    
    &lt;img class=&#34;zoomCheck&#34; loading=&#34;lazy&#34; src=&#34;https://ramsayleung.github.io/ox-hugo/github_blame.png&#34;/&gt; 
    
    
    &lt;/label&gt;
&lt;/figure&gt;
 &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;or using GitLens blame in VSCode: &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;
&lt;figure&gt;
    
    
    &lt;input type=&#34;checkbox&#34; id=&#34;zoomCheck-7f082&#34; hidden&gt;
    &lt;label for=&#34;zoomCheck-7f082&#34;&gt;
    
    
    &lt;img class=&#34;zoomCheck&#34; loading=&#34;lazy&#34; src=&#34;https://ramsayleung.github.io/ox-hugo/git_blame_git_lens_vscode.png&#34;/&gt; 
    
    
    &lt;/label&gt;
&lt;/figure&gt;
 &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Even though GUI tools are intuitive, the Git CLI has much more powerful tooling for finding something closer to the real story behind your code. &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;There are many scenarios where the CLI is valuable; the first is ignoring whitespace changes. &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;For example, if you format your C++ codebase with &lt;code&gt;clang-format&lt;/code&gt; or your JavaScript codebase with &lt;code&gt;prettier&lt;/code&gt;, you haven&amp;rsquo;t actually changed what the code does, but blame now shows you as the owner of tons of lines of code. &lt;br/&gt;&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>Developers usually use <code>git blame</code> in GUI tools like GitHub Blame <br/></p>
<p>
<figure>
    
    
    <input type="checkbox" id="zoomCheck-68df0" hidden>
    <label for="zoomCheck-68df0">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/github_blame.png"/> 
    
    
    </label>
</figure>
 <br/></p>
<p>or using GitLens blame in VSCode: <br/></p>
<p>
<figure>
    
    
    <input type="checkbox" id="zoomCheck-7f082" hidden>
    <label for="zoomCheck-7f082">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/git_blame_git_lens_vscode.png"/> 
    
    
    </label>
</figure>
 <br/></p>
<p>Even though GUI tools are intuitive, the Git CLI has much more powerful tooling for finding something closer to the real story behind your code. <br/></p>
<p>There are many scenarios where the CLI is valuable; the first is ignoring whitespace changes. <br/></p>
<p>For example, if you format your C++ codebase with <code>clang-format</code> or your JavaScript codebase with <code>prettier</code>, you haven&rsquo;t actually changed what the code does, but blame now shows you as the owner of tons of lines of code. <br/></p>
<p>The <code>git blame -w</code> option ignores this type of whitespace change. <br/></p>
<p>The other great option is <code>-C</code>, which looks for code movement between files in a commit. <br/></p>
<p>For example, if you move a function from one file to another during a refactor, the normal <code>git blame</code> will simply show you as the author in the new file, but the <code>-C</code> option will follow that movement and show the last person who actually changed those lines of code. <br/></p>
<p><code>-C</code> is extremely helpful when I need to find the original author of some lines of code after file renames or refactors, so I can learn more about the background and context behind the code. <br/></p>
<p>According to the <code>git blame</code> doc, you can pass <code>-C</code> up to three times to make Git try even harder: <br/></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">-C<span class="o">[</span>&lt;num&gt;<span class="o">]</span>
</span></span><span class="line"><span class="cl">           In addition to -M, detect lines moved or copied from other files that were modified in the same commit.
</span></span><span class="line"><span class="cl">           This is useful when you reorganize your program and move code around across files.
</span></span><span class="line"><span class="cl">           When this option is given twice, the <span class="nb">command</span> additionally looks <span class="k">for</span> copies from other files in the commit that creates the file.
</span></span><span class="line"><span class="cl">           When this option is given three times, the <span class="nb">command</span> additionally looks <span class="k">for</span> copies from other files in any commit.
</span></span></code></pre></td></tr></table>
</div>
</div><p>(It&rsquo;s a bit of an odd design.) <br/></p>
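<p>The behaviour described above can be tried on a toy repository. This is only a sketch under stated assumptions: the authors Alice and Bob, the file names <code>app.py</code> and <code>config.py</code>, and the <code>commit_as</code> helper are all hypothetical. Alice writes two functions in one file, then Bob moves one of them, unchanged, into a new file. Note that Git only follows moved chunks above a minimum size, so the moved lines are deliberately long. <br/></p>

```shell
#!/bin/sh
# Demo repo in a temporary directory; nothing here touches a real project.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: stage everything and commit as the given author.
commit_as() {
    git add -A
    GIT_AUTHOR_NAME="$1" GIT_AUTHOR_EMAIL="$1@example.com" \
    GIT_COMMITTER_NAME="$1" GIT_COMMITTER_EMAIL="$1@example.com" \
        git commit -q -m "$2"
}

# Alice writes two functions in a single file.
printf '%s\n' \
    'def parse_configuration_file(path):' \
    '    handle = open_configuration_file(path)' \
    '    return decode_configuration_contents(handle)' \
    '' \
    'def main_entry_point(arguments):' \
    '    return run_application_with(arguments)' > app.py
commit_as Alice "initial version"

# Bob moves the parser into a new file without editing it.
printf '%s\n' \
    'def parse_configuration_file(path):' \
    '    handle = open_configuration_file(path)' \
    '    return decode_configuration_contents(handle)' > config.py
printf '%s\n' \
    'def main_entry_point(arguments):' \
    '    return run_application_with(arguments)' > app.py
commit_as Bob "move parser to config.py"

# Count the lines of config.py that each invocation attributes to Alice.
plain_alice=$(git blame --line-porcelain config.py | grep -c '^author Alice' || true)
moved_alice=$(git blame -w -C -C -C --line-porcelain config.py | grep -c '^author Alice' || true)
echo "plain blame: $plain_alice line(s) by Alice; with -w -C -C -C: $moved_alice line(s) by Alice"
```

<p>Plain <code>git blame</code> should credit Bob with every line of <code>config.py</code>, because he created the file, while the <code>-C</code> variant should trace the moved lines back to Alice&rsquo;s original commit. <br/></p>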
<p>Let&rsquo;s take <a href="https://github.com/rails/rails/blob/main/activemodel/lib/active_model/access.rb">the access.rb file of the ActiveModel module in the Rails framework</a> as an example: <br/></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">git blame activemodel/lib/active_model/access.rb
</span></span></code></pre></td></tr></table>
</div>
</div><p>
<figure>
    
    
    <input type="checkbox" id="zoomCheck-b230f" hidden>
    <label for="zoomCheck-b230f">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/normal_git_blame.png"
         alt="Figure 1: Vanilla git blame"/> 
    
    
    </label><figcaption>
            <p><span class="figure-number">Figure 1: </span>Vanilla git blame</p>
        </figcaption>
</figure>
 <br/></p>
<p>OK, it appears that Jonathan Hefner wrote all of this code. Let&rsquo;s look at the same file with <code>git blame -w -C -C -C activemodel/lib/active_model/access.rb</code>: <br/></p>
<p>
<figure>
    
    
    <input type="checkbox" id="zoomCheck-dab9e" hidden>
    <label for="zoomCheck-dab9e">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/git_blame_-w_-C_-C_-C.png"
         alt="Figure 2: git blame -w -C -C -C"/> 
    
    
    </label><figcaption>
            <p><span class="figure-number">Figure 2: </span>git blame -w -C -C -C</p>
        </figcaption>
</figure>
 <br/></p>
<p>Now we can see that Git has followed this code from file to file over the course of multiple renames. It turns out that Jonathan Hefner is only the most recent person to rename the file, and Guillermo Iguaran is the original author. <br/></p>
<p>If we want to know the history of this file, it&rsquo;s much better to ask Guillermo than Jonathan, which is more than the GUI blame or plain <code>git blame</code> reveals. <br/></p>
]]></content:encoded>
    </item>
    <item>
      <title>TIL: Git Conditional Configs</title>
      <link>https://ramsayleung.github.io/en/post/2024/git_conditional_configs/</link>
      <pubDate>Sun, 07 Apr 2024 12:38:00 -0700</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2024/git_conditional_configs/</guid>
      <description>&lt;p&gt;Almost every Git user is asked to configure their identity the first time they use Git: &lt;br/&gt;&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;git config --global user.name &lt;span class=&#34;s2&#34;&gt;&amp;#34;Ramsay Leung&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;git config --global user.email ramsayleung@gmail.com
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The commands above simply add the &lt;code&gt;user.name&lt;/code&gt; and &lt;code&gt;user.email&lt;/code&gt; values to your &lt;code&gt;~/.gitconfig&lt;/code&gt; file: &lt;br/&gt;&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;gt; cat ~/.gitconfig
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;o&#34;&gt;[&lt;/span&gt;user&lt;span class=&#34;o&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nv&#34;&gt;name&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; Ramsay Leung
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nv&#34;&gt;email&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; ramsayleung@gmail.com
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;o&#34;&gt;[&lt;/span&gt;core&lt;span class=&#34;o&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nv&#34;&gt;quotepath&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;nb&#34;&gt;false&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;o&#34;&gt;[&lt;/span&gt;init&lt;span class=&#34;o&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;nv&#34;&gt;defaultBranch&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; master
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;You can also pass the &lt;code&gt;--local&lt;/code&gt; option to write the config values to &lt;code&gt;.git/config&lt;/code&gt; in whatever project you&amp;rsquo;re currently in. &lt;br/&gt;&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>Almost every Git user is asked to configure their identity the first time they use Git: <br/></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">git config --global user.name <span class="s2">&#34;Ramsay Leung&#34;</span>
</span></span><span class="line"><span class="cl">git config --global user.email ramsayleung@gmail.com
</span></span></code></pre></td></tr></table>
</div>
</div><p>The commands above simply add the <code>user.name</code> and <code>user.email</code> values to your <code>~/.gitconfig</code> file: <br/></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">&gt; cat ~/.gitconfig
</span></span><span class="line"><span class="cl"><span class="o">[</span>user<span class="o">]</span>
</span></span><span class="line"><span class="cl">    <span class="nv">name</span> <span class="o">=</span> Ramsay Leung
</span></span><span class="line"><span class="cl">    <span class="nv">email</span> <span class="o">=</span> ramsayleung@gmail.com
</span></span><span class="line"><span class="cl"><span class="o">[</span>core<span class="o">]</span>
</span></span><span class="line"><span class="cl">    <span class="nv">quotepath</span> <span class="o">=</span> <span class="nb">false</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>init<span class="o">]</span>
</span></span><span class="line"><span class="cl">    <span class="nv">defaultBranch</span> <span class="o">=</span> master
</span></span></code></pre></td></tr></table>
</div>
</div><p>You can also pass the <code>--local</code> option to write the config values to <code>.git/config</code> in whatever project you&rsquo;re currently in. <br/></p>
<p>What should you do if you contribute to both work projects and open source projects on the same laptop and need different Git config values for each (for example, a company email address for work projects and a personal email address for open source projects)? <br/></p>
<p>You could set the work-specific config as the global config, then set the personal config with <code>--local</code> in every personal project separately. It works, but it is tedious and easy to mess up. <br/></p>
<p>Fortunately, starting with version 2.13, Git supports conditional configuration includes, which let you set up different configs for different repositories. <br/></p>
<p>If you add the following config to your global config file: <br/></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-toml" data-lang="toml"><span class="line"><span class="cl"><span class="p">[</span><span class="nx">includeIf</span> <span class="s2">&#34;gitdir:~/projects/oss/&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="nx">path</span> <span class="p">=</span> <span class="err">~/</span><span class="p">.</span><span class="nx">gitconfig-oss</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">[</span><span class="nx">includeIf</span> <span class="s2">&#34;gitdir:~/projects/work/&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="nx">path</span> <span class="p">=</span> <span class="err">~/</span><span class="p">.</span><span class="nx">gitconfig-work</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Git will then read values from the <code>~/.gitconfig-oss</code> file only if the repository you are working in lives under <code>~/projects/oss/</code>. <br/></p>
<p><strong>Caution</strong>: If you forget the trailing &ldquo;/&rdquo; in the <code>gitdir</code> pattern, e.g. &ldquo;~/projects/oss&rdquo;, the conditional include won&rsquo;t match the repositories inside that directory! Only a pattern ending in &ldquo;/&rdquo; is treated as a prefix that matches everything below it. <br/></p>
<p>This way, you can keep a &ldquo;work&rdquo; directory with work-specific config, an &ldquo;oss&rdquo; directory with values for your open source projects, and so on. <br/></p>
<p>
<figure>
    
    
    <input type="checkbox" id="zoomCheck-d0b28" hidden>
    <label for="zoomCheck-d0b28">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/conditional_config.png"/> 
    
    
    </label>
</figure>
 <br/></p>
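<p>The setup can be checked end to end without touching your real config. The sketch below is an illustration only: it points <code>HOME</code> at a scratch directory, and the email addresses and the <code>elsewhere</code> directory are made-up values for the demo. <br/></p>

```shell
#!/bin/sh
set -e
# Use a scratch HOME so the real ~/.gitconfig is never read or modified.
HOME=$(mktemp -d)
export HOME
export GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL

# Global config: a default identity plus two conditional includes.
printf '%s\n' \
    '[user]' \
    '    email = default@example.com' \
    '[includeIf "gitdir:~/projects/oss/"]' \
    '    path = ~/.gitconfig-oss' \
    '[includeIf "gitdir:~/projects/work/"]' \
    '    path = ~/.gitconfig-work' > "$HOME/.gitconfig"
printf '[user]\n    email = me@oss.example\n' > "$HOME/.gitconfig-oss"
printf '[user]\n    email = me@work.example\n' > "$HOME/.gitconfig-work"

# One repository per location, including one outside both directories.
for dir in projects/oss/demo projects/work/demo elsewhere/demo; do
    mkdir -p "$HOME/$dir"
    git init -q "$HOME/$dir"
done

# Ask Git which email it would use in each repository.
oss_email=$(git -C "$HOME/projects/oss/demo" config user.email)
work_email=$(git -C "$HOME/projects/work/demo" config user.email)
other_email=$(git -C "$HOME/elsewhere/demo" config user.email)
echo "oss: $oss_email | work: $work_email | elsewhere: $other_email"
```

<p>Because the <code>includeIf</code> sections come after the <code>[user]</code> section, the included value overrides the default wherever the condition matches, and the default remains in effect everywhere else. <br/></p>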
<p>Git supports other conditions besides <code>gitdir</code>. For example, you can include a file based on the currently checked-out branch with <code>onbranch</code>: <br/></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-toml" data-lang="toml"><span class="line"><span class="cl">  <span class="err">;</span> <span class="nx">include</span> <span class="nx">only</span> <span class="nx">if</span> <span class="nx">we</span> <span class="nx">are</span> <span class="nx">in</span> <span class="nx">a</span> <span class="nx">worktree</span> <span class="nx">where</span> <span class="nx">foo-branch</span> <span class="nx">is</span>
</span></span><span class="line"><span class="cl"><span class="err">;</span> <span class="nx">currently</span> <span class="nx">checked</span> <span class="nx">out</span>
</span></span><span class="line"><span class="cl"><span class="p">[</span><span class="nx">includeIf</span> <span class="s2">&#34;onbranch:foo-branch&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">        <span class="nx">path</span> <span class="p">=</span> <span class="nx">foo</span><span class="p">.</span><span class="nx">inc</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Check out <a href="https://git-scm.com/docs/git-config?ref=blog.gitbutler.com#_includes">the Git docs</a> for more details. <br/></p>
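<p>A small sketch of the <code>onbranch</code> condition in a throwaway repository. The <code>demo.branchonly</code> key and the branch <code>some-other-branch</code> are hypothetical names for illustration, and as far as I know <code>onbranch</code> requires Git 2.23 or newer. A relative <code>path</code> is resolved against the config file that contains the include, so <code>foo.inc</code> sits next to <code>.git/config</code> here. <br/></p>

```shell
#!/bin/sh
# Demo repo in a temporary directory; nothing here touches a real project.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Conditional include in the repository config, plus the included file.
printf '[includeIf "onbranch:foo-branch"]\n    path = foo.inc\n' >> .git/config
printf '[demo]\n    branchonly = yes\n' > .git/foo.inc

# On any other branch the key should be absent (git config exits non-zero).
git checkout -q -b some-other-branch
before=$(git config demo.branchonly || true)

# On foo-branch the include applies and the key becomes visible.
git checkout -q -b foo-branch
after=$(git config demo.branchonly)
echo "before: '$before' | after: '$after'"
```

<p>The value should be empty on the first branch and <code>yes</code> once <code>foo-branch</code> is checked out. <br/></p>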
]]></content:encoded>
    </item>
    <item>
      <title>Rewind your Github summary</title>
      <link>https://ramsayleung.github.io/en/post/2024/github_summary/</link>
      <pubDate>Mon, 01 Jan 2024 16:16:00 -0800</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2024/github_summary/</guid>
      <description>&lt;h2 id=&#34;goodbye-2023&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; Goodbye 2023&lt;/h2&gt;
&lt;p&gt;As I bid farewell to 2023, a year marked by numerous changes and personal growth, I find myself recalling the many experiences that unfolded. &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;My 2023 journey was nothing short of fascinating and exciting, prompting me to revisit the year from various angles. &lt;br/&gt;&lt;/p&gt;
&lt;p&gt;After seeing hordes of posts on social media generated by &lt;a href=&#34;https://github.com/sallar/github-contributions-chart&#34;&gt;Github Contributions Chart&lt;/a&gt;, I thought I could also build an app that summarizes my GitHub contributions for each year, for friends to have fun with. &lt;br/&gt;&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="goodbye-2023"><span class="section-num">1</span> Goodbye 2023</h2>
<p>As I bid farewell to 2023, a year marked by numerous changes and personal growth, I find myself recalling the many experiences that unfolded. <br/></p>
<p>My 2023 journey was nothing short of fascinating and exciting, prompting me to revisit the year from various angles. <br/></p>
<p>After seeing hordes of posts on social media generated by <a href="https://github.com/sallar/github-contributions-chart">Github Contributions Chart</a>, I thought I could also build an app that summarizes my GitHub contributions for each year, for friends to have fun with. <br/></p>
<p>I spent my entire four-day New Year vacation building this app, named <a href="https://github-summary.vercel.app/">GitHub Summary</a>. <br/></p>
<p>This project led me through a series of firsts: my first time trying the Tailwind CSS framework, my first time deploying a project on Vercel, my first time building a project with Next.js, and my first time developing a public project in React (yes, I&rsquo;ve tried to learn React hundreds of times, but never got a chance to use it in a real project). <br/></p>
<h2 id="happy-2024"><span class="section-num">2</span> Happy 2024</h2>
<p>I had hoped to complete this project by the close of 2023 and share summaries with friends, but life&rsquo;s timeline had other plans. <br/></p>
<p>Now, as we step into 2024, I am thrilled to publish the GitHub Summary. <br/></p>
<p>It&rsquo;s never too late to showcase creative work, and this project is poised to generate insightful summaries not just for the past year but for the adventures that await in 2024. <br/></p>
<p>Wishing everyone a Happy New Year! Feel free to explore <a href="https://github-summary.vercel.app/">GitHub Summary</a>: <a href="https://github-summary.vercel.app/">https://github-summary.vercel.app/</a> <br/></p>
<p>
<figure>
    
    
    <input type="checkbox" id="zoomCheck-385a8" hidden>
    <label for="zoomCheck-385a8">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/github_summary.png"/> 
    
    
    </label>
</figure>
 <br/></p>
]]></content:encoded>
    </item>
    <item>
      <title>How to share resource between CDK stacks</title>
      <link>https://ramsayleung.github.io/en/post/2023/how_to_share_resource_between_cdk_stacks/</link>
      <pubDate>Wed, 28 Jun 2023 09:41:00 -0700</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2023/how_to_share_resource_between_cdk_stacks/</guid>
      <description>&lt;h2 id=&#34;introduction&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; Introduction&lt;/h2&gt;
&lt;h3 id=&#34;iac&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1.1&lt;/span&gt; IaC&lt;/h3&gt;
&lt;p&gt;Infrastructure as code (IaC) is the management and provisioning of infrastructure through code instead of manual processes such as clicking buttons or adding and editing roles in the AWS console.&lt;/p&gt;
&lt;h3 id=&#34;aws-cloudformation&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1.2&lt;/span&gt; AWS CloudFormation&lt;/h3&gt;
&lt;p&gt;AWS CloudFormation is the original IaC tool for AWS, released in 2011, which uses template files to automate and manage the setup of AWS resources.&lt;/p&gt;
&lt;h3 id=&#34;aws-cdk&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1.3&lt;/span&gt; AWS CDK&lt;/h3&gt;
&lt;p&gt;AWS Cloud Development Kit (CDK) is a product provided by AWS that makes it easier for developers to manage their infrastructure with familiar programming languages like TypeScript, Python, Java, etc.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="introduction"><span class="section-num">1</span> Introduction</h2>
<h3 id="iac"><span class="section-num">1.1</span> IaC</h3>
<p>Infrastructure as code (IaC) is the management and provisioning of infrastructure through code instead of manual processes such as clicking buttons or adding and editing roles in the AWS console.</p>
<h3 id="aws-cloudformation"><span class="section-num">1.2</span> AWS CloudFormation</h3>
<p>AWS CloudFormation is the original IaC tool for AWS, released in 2011, which uses template files to automate and manage the setup of AWS resources.</p>
<h3 id="aws-cdk"><span class="section-num">1.3</span> AWS CDK</h3>
<p>AWS Cloud Development Kit (CDK) is a product provided by AWS that makes it easier for developers to manage their infrastructure with familiar programming languages like TypeScript, Python, Java, etc.</p>
<p>CDK stands on the shoulders of CloudFormation: the code you write is synthesized into CloudFormation templates, which CloudFormation then deploys.</p>
<p>A stack is a collection of AWS resources that you can manage as a single unit, like a box.</p>
<p>For instance, this box could include all the resources required to run an application or Lambda service, such as S3 Buckets (storage), Roles (authorization), Lambda Function (computing), API Gateway (access point), Alarm, Monitoring, etc.</p>
<h2 id="problem"><span class="section-num">2</span> Problem</h2>
<p>I am currently working on a project that requires two stacks: one stack (<code>GlueStack</code>) that defines a list of AWS Glue tables, and another (<code>ServiceStack</code>) that defines a Lambda service and its associated resources.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-1a0ae" hidden>
    <label for="zoomCheck-1a0ae">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/two_stacks.jpg"/> 
    
    
    </label>
</figure>

<p>S3 bucket names must be globally unique within a partition, which means unique across the whole AWS customer base.</p>
<p>You cannot create an S3 bucket with a name that is already in use, whether by another AWS customer or by your own account.</p>
<p>So it&rsquo;s safer to let CloudFormation generate a random bucket name whenever a developer needs to create an S3 bucket.</p>
<p>However, this raises a new problem: since the S3 bucket name is randomly generated, if <code>GlueStack</code> needs to read the bucket created by <code>ServiceStack</code>, how can I share the bucket name between the two stacks?</p>
<p>After all, the two stacks are isolated, separate collections of resources.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-cc2ab" hidden>
    <label for="zoomCheck-cc2ab">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/s3_bucket_problem.jpg"/> 
    
    
    </label>
</figure>

<h2 id="solution"><span class="section-num">3</span> Solution</h2>
<p>Fortunately, CDK offers a construct named <code>CfnOutput</code> that exports a value from a deployed stack, so that a consumer stack can import it with <code>Fn.importValue</code>.</p>

<figure>
    
    
    <input type="checkbox" id="zoomCheck-f615d" hidden>
    <label for="zoomCheck-f615d">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/export.jpg"/> 
    
    
    </label>
</figure>

<ol>
<li>Define the required resource in <code>ServiceStack</code> (producer), for instance, an S3 bucket:
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="kr">import</span> <span class="p">{</span> <span class="nx">Bucket</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">&#39;aws-cdk-lib/aws-s3&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">s3Bucket</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Bucket</span><span class="p">(</span><span class="k">this</span><span class="p">,</span> <span class="s1">&#39;MyBucketId&#39;</span><span class="p">,</span> <span class="p">{});</span>
</span></span></code></pre></td></tr></table>
</div>
</div></li>
<li>Export the resource by specifying the <code>value</code> and <code>exportName</code>:
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="kr">import</span> <span class="p">{</span> <span class="nx">CfnOutput</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">&#39;aws-cdk-lib&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// export the generated bucket name to other stacks
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="k">new</span> <span class="nx">CfnOutput</span><span class="p">(</span><span class="k">this</span><span class="p">,</span> <span class="s1">&#39;exportRequiredS3Bucket&#39;</span><span class="p">,</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nx">value</span><span class="o">:</span> <span class="nx">s3Bucket</span><span class="p">.</span><span class="nx">bucketName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nx">exportName</span><span class="o">:</span> <span class="s1">&#39;exportRequiredS3Bucket&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div></li>
<li>Import the required resource in <code>GlueStack</code> (consumer):
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="kr">import</span> <span class="p">{</span> <span class="nx">Fn</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">&#39;aws-cdk-lib&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">requiredS3BucketName</span> <span class="o">=</span> <span class="nx">Fn</span><span class="p">.</span><span class="nx">importValue</span><span class="p">(</span><span class="s1">&#39;exportRequiredS3Bucket&#39;</span><span class="p">);</span>
</span></span></code></pre></td></tr></table>
</div>
</div></li>
</ol>
<p>If we take a closer look at the synthesized CloudFormation (CFN) template for <code>ServiceStack</code>, we find the export:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="s2">&#34;Outputs&#34;</span><span class="err">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;exportRequiredS3Bucket&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;Value&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;Ref&#34;</span><span class="p">:</span> <span class="s2">&#34;MyBucketId737FC949&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">},</span>
</span></span><span class="line"><span class="cl">        <span class="nt">&#34;Export&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="nt">&#34;Name&#34;</span><span class="p">:</span> <span class="s2">&#34;exportRequiredS3Bucket&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The synthesized CFN template for <code>GlueStack</code> contains the matching import:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;Fn::ImportValue&#34;</span><span class="p">:</span> <span class="s2">&#34;exportRequiredS3Bucket&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This is how to share a value between two stacks.</p>
<hr>
<h2 id="loose-couping-solution"><span class="section-num">4</span> Loose coupling solution</h2>
<p>Updated on 2023-12-02</p>
<p>People learn from mistakes.</p>
<p>After applying this practice in my project, I recently learned that sharing resources across stacks this way is not good practice.</p>
<p>By using <code>export/import</code>, I tightly couple my stacks: I commit to never changing the exported value unless I first remove that coupling.</p>
<p>It becomes a disaster<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> whenever I need to update or delete the <code>S3Bucket</code>: <code>CloudFormation</code> raises an error, complaining something like: &ldquo;ServiceStack cannot be deleted as it&rsquo;s in use by GlueStack&rdquo;.</p>
<p>A better practice I learnt is to couple <code>ServiceStack</code> and <code>GlueStack</code> loosely by sharing a constant:</p>
<ol>
<li>
<p>Define a constant somewhere both stacks can import it (S3 bucket names must not contain uppercase letters):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="kr">export</span> <span class="kr">const</span> <span class="nx">Constants</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nx">MyBucketName</span><span class="o">:</span> <span class="s1">&#39;test-bucket&#39;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div></li>
<li>
<p>Redefine <code>s3Bucket</code> with the explicit name:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="kr">import</span> <span class="p">{</span> <span class="nx">Bucket</span> <span class="p">}</span> <span class="nx">from</span> <span class="s1">&#39;aws-cdk-lib/aws-s3&#39;</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">s3Bucket</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Bucket</span><span class="p">(</span><span class="k">this</span><span class="p">,</span> <span class="s1">&#39;MyBucketId&#39;</span><span class="p">,</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nx">bucketName</span><span class="o">:</span> <span class="nx">Constants</span><span class="p">.</span><span class="nx">MyBucketName</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div></li>
<li>
<p>Refer to the bucket in <code>GlueStack</code> by <code>MyBucketName</code> instead of the CDK exported reference:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">requiredS3BucketName</span> <span class="o">=</span> <span class="nx">Constants</span><span class="p">.</span><span class="nx">MyBucketName</span><span class="p">;</span>
</span></span></code></pre></td></tr></table>
</div>
</div></li>
</ol>
<p>Now the two stacks are not directly coupled; they merely reference the same constant.</p>
<p>CloudFormation won&rsquo;t prevent you from updating the <code>S3Bucket</code>, as there is no longer a direct relation between the two stacks.</p>
<p>This is the benefit of loose coupling. The trade-off is that a fixed bucket name must itself be globally unique, so choose one that is unlikely to collide.</p>
<h2 id="reference"><span class="section-num">5</span> Reference</h2>
<ul>
<li><a href="https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.CfnOutput.html">API Document: class CfnOutput (construct)</a></li>
<li><a href="https://docs.aws.amazon.com/cdk/v2/guide/stacks.html">AWS Cloud Development Kit (AWS CDK) v2</a></li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://stackoverflow.com/questions/63350346/delete-resource-with-references">https://stackoverflow.com/questions/63350346/delete-resource-with-references</a>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Topological Sort</title>
      <link>https://ramsayleung.github.io/en/post/2022/topological_sorting/</link>
      <pubDate>Sun, 22 May 2022 10:34:00 +0800</pubDate>
      <guid>https://ramsayleung.github.io/en/post/2022/topological_sorting/</guid>
      <description>&lt;h2 id=&#34;definition&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;1&lt;/span&gt; Definition&lt;/h2&gt;
&lt;p&gt;In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge &lt;code&gt;uv&lt;/code&gt; from vertex &lt;code&gt;u&lt;/code&gt; to vertex &lt;code&gt;v&lt;/code&gt;, &lt;code&gt;u&lt;/code&gt; comes before &lt;code&gt;v&lt;/code&gt; in the ordering.&lt;/p&gt;
&lt;p&gt;It sounds pretty academic, but I am sure you are using topological sort unconsciously every single day.&lt;/p&gt;
&lt;h2 id=&#34;application&#34;&gt;&lt;span class=&#34;section-num&#34;&gt;2&lt;/span&gt; Application&lt;/h2&gt;
&lt;p&gt;Many real world situations can be modeled as a graph with directed edges where some events must occur before others. Then a topological sort gives an order in which to perform these events, for instance:&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="definition"><span class="section-num">1</span> Definition</h2>
<p>In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge <code>uv</code> from vertex <code>u</code> to vertex <code>v</code>, <code>u</code> comes before <code>v</code> in the ordering.</p>
<p>It sounds pretty academic, but I am sure you are using topological sort unconsciously every single day.</p>
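<p>To make the definition concrete, here is a minimal sketch of a checker that tests whether a given ordering is a topological ordering. The function name <code>isTopologicalOrder</code> and the adjacency-list layout (vertices numbered from <code>0</code>, with <code>adj[u]</code> listing the targets of the edges leaving <code>u</code>) are assumptions made for illustration only.</p>

```cpp
#include <cstddef>
#include <vector>

// Returns true if `order` is a topological ordering of the graph `adj`,
// where adj[u] lists every vertex v that has an edge u -> v.
// Assumes vertices are numbered 0..n-1 and every edge target is in range.
bool isTopologicalOrder(const std::vector<std::vector<int>> &adj,
                        const std::vector<int> &order) {
  const std::size_t n = adj.size();
  if (order.size() != n) {
    return false; // an ordering must list every vertex exactly once
  }

  // position[v] is the index of vertex v in `order`; -1 means "not seen yet".
  std::vector<int> position(n, -1);
  for (std::size_t i = 0; i < n; i++) {
    const int v = order[i];
    if (v < 0 || static_cast<std::size_t>(v) >= n || position[v] != -1) {
      return false; // out of range, or listed twice
    }
    position[v] = static_cast<int>(i);
  }

  // Every edge u -> v must point "forward" in the ordering.
  for (std::size_t u = 0; u < n; u++) {
    for (const int v : adj[u]) {
      if (position[u] >= position[v]) {
        return false;
      }
    }
  }
  return true;
}
```

<p>For a graph with the edges <code>0 -> 1</code>, <code>0 -> 2</code>, <code>1 -> 3</code> and <code>2 -> 3</code>, both <code>0, 1, 2, 3</code> and <code>0, 2, 1, 3</code> pass the check, which also shows that a topological ordering is not necessarily unique.</p>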
<h2 id="application"><span class="section-num">2</span> Application</h2>
<p>Many real world situations can be modeled as a graph with directed edges where some events must occur before others. Then a topological sort gives an order in which to perform these events, for instance:</p>
<h3 id="college-class-prerequisites"><span class="section-num">2.1</span> College class prerequisites</h3>
<p>You must take course <code>b</code> first if you want to take course <code>a</code>. For example, at your alma mater, a student must complete <code>PHYS:1511(College Physics)</code> or <code>PHYS:1611(Introductory Physics I)</code> before taking <code>College Physics II</code>.</p>
<p>The courses can be represented by vertices, with an edge from <code>College Physics</code> to <code>College Physics II</code>, since <code>PHYS:1511</code> must be finished before a student can enroll in <code>College Physics II</code>.</p>

<figure><a href="/ox-hugo/course_prerequsites.png">
    
    
    <input type="checkbox" id="zoomCheck-1c4f5" hidden>
    <label for="zoomCheck-1c4f5">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/course_prerequsites.png"/> 
    
    
    </label></a>
</figure>

<h3 id="job-scheduling"><span class="section-num">2.2</span> Job scheduling</h3>
<p>Job scheduling means ordering a sequence of jobs or tasks based on their dependencies. The jobs are represented by vertices, and there is an edge from <code>x</code> to <code>y</code> if job <code>x</code> must be completed before job <code>y</code> can be started.</p>

<figure><a href="/ox-hugo/job_scheduling.png">
    
    
    <input type="checkbox" id="zoomCheck-0797d" hidden>
    <label for="zoomCheck-0797d">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/job_scheduling.png"/> 
    
    
    </label></a>
</figure>

<p>In the context of a CI/CD pipeline, the relationships between jobs can be represented by a directed graph (more specifically, a directed acyclic graph). For example, in a CI pipeline, the <code>build</code> job should finish before the <code>test</code> and <code>lint</code> jobs start.</p>
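<p>As a rough illustration of the pipeline example, the sketch below picks the jobs that are ready to run, meaning that all of their dependencies have finished. The <code>Dependencies</code> alias and the <code>readyJobs</code> function are hypothetical names, not part of any real CI system.</p>

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each job to the jobs it depends on (hypothetical pipeline layout).
using Dependencies = std::map<std::string, std::vector<std::string>>;

// Returns the jobs that have not run yet and whose dependencies are all done.
std::vector<std::string> readyJobs(const Dependencies &deps,
                                   const std::set<std::string> &done) {
  std::vector<std::string> ready;
  for (const auto &entry : deps) {
    const std::string &job = entry.first;
    if (done.count(job) > 0) {
      continue; // already finished
    }
    bool allDone = true;
    for (const auto &need : entry.second) {
      if (done.count(need) == 0) {
        allDone = false; // at least one dependency is still pending
        break;
      }
    }
    if (allDone) {
      ready.push_back(job);
    }
  }
  // Jobs come out sorted by name, because std::map iterates in key order.
  return ready;
}
```

<p>With <code>build</code> having no dependencies and both <code>test</code> and <code>lint</code> depending on <code>build</code>, only <code>build</code> is ready at the start. Once <code>build</code> is done, <code>lint</code> and <code>test</code> become ready together, so a scheduler is free to run them in parallel.</p>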
<h3 id="program-build-dependencies"><span class="section-num">2.3</span> Program build dependencies</h3>
<p>You want to figure out the order in which to compile a program&rsquo;s dependencies, so that you never try to compile a dependency before all of its own dependencies have been built.</p>
<p>A typical example is <code>GNU Make</code>: you specify your targets in a <code>makefile</code>, and <code>Make</code> parses the <code>makefile</code> and figures out which targets should be built first. Suppose you have a <code>makefile</code> like this:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-makefile" data-lang="makefile"><span class="line"><span class="cl"><span class="c"># Makefile for analysis report
</span></span></span><span class="line"><span class="cl"><span class="c"></span>
</span></span><span class="line"><span class="cl"><span class="nf">output/figure_1.png</span><span class="o">:</span> <span class="n">data</span>/<span class="n">input_file_</span>1.<span class="n">csv</span> <span class="n">scripts</span>/<span class="n">generate_histogram</span>.<span class="n">py</span>
</span></span><span class="line"><span class="cl">	python scripts/generate_histogram.py -i data/input_file_1.csv -o output/figure_1.png
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nf">output/figure_2.png</span><span class="o">:</span> <span class="n">data</span>/<span class="n">input_file_</span>2.<span class="n">csv</span> <span class="n">scripts</span>/<span class="n">generate_histogram</span>.<span class="n">py</span>
</span></span><span class="line"><span class="cl">	python scripts/generate_histogram.py -i data/input_file_2.csv -o output/figure_2.png
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nf">output/report.pdf</span><span class="o">:</span> <span class="n">report</span>/<span class="n">report</span>.<span class="n">tex</span> <span class="n">output</span>/<span class="n">figure_</span>1.<span class="n">png</span> <span class="n">output</span>/<span class="n">figure_</span>2.<span class="n">png</span>
</span></span><span class="line"><span class="cl">	cd report/ &amp;&amp; pdflatex report.tex &amp;&amp; mv report.pdf ../output/report.pdf
</span></span></code></pre></td></tr></table>
</div>
</div><p><code>Make</code> internally builds a DAG and uses a topological sort to figure out which targets should be built first:</p>

<figure><a href="/ox-hugo/build_dependencies.png">
    
    
    <input type="checkbox" id="zoomCheck-9b93b" hidden>
    <label for="zoomCheck-9b93b">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/build_dependencies.png"/> 
    
    
    </label></a>
</figure>
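<p>The build order for the <code>makefile</code> above can be sketched with a depth-first walk: a target is emitted only after all of its prerequisites. This is a simplified illustration, not how GNU Make is actually implemented; the <code>Rules</code> alias and the functions <code>visit</code> and <code>buildOrder</code> are names invented for this example, and the sketch assumes the rules contain no cycle.</p>

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each target to its prerequisites, like a rule in a makefile.
using Rules = std::map<std::string, std::vector<std::string>>;

// Appends `target` to `order` after all of its prerequisites.
// Assumes the rules contain no cycle.
void visit(const std::string &target, const Rules &rules,
           std::set<std::string> &visited, std::vector<std::string> &order) {
  if (!visited.insert(target).second) {
    return; // already scheduled earlier
  }
  const auto rule = rules.find(target);
  if (rule != rules.end()) {
    for (const auto &prerequisite : rule->second) {
      visit(prerequisite, rules, visited, order);
    }
  }
  // Source files have no rule, so they are emitted right away.
  order.push_back(target);
}

// Returns an order in which `goal` and everything it depends on can be built.
std::vector<std::string> buildOrder(const std::string &goal,
                                    const Rules &rules) {
  std::set<std::string> visited;
  std::vector<std::string> order;
  visit(goal, rules, visited, order);
  return order;
}
```

<p>Asking for <code>output/report.pdf</code> yields an order in which both figures appear before the report, and <code>scripts/generate_histogram.py</code> appears only once even though two targets depend on it.</p>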

<h2 id="directed-acyclic-graph"><span class="section-num">3</span> Directed Acyclic Graph</h2>
<p>Back to the definition, we say that a topological ordering of a directed graph is a linear ordering of its vertices, but not all directed graphs have a topological ordering.</p>
<p>A topological ordering is possible if and only if the graph has no directed cycles, that is, if it&rsquo;s a directed acyclic graph (DAG).</p>
<p>Let us see some examples:</p>

<figure><a href="/ox-hugo/directed_acyclic_graph.png">
    
    
    <input type="checkbox" id="zoomCheck-3b058" hidden>
    <label for="zoomCheck-3b058">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/directed_acyclic_graph.png"/> 
    
    
    </label></a>
</figure>

<p>The definition implies that only a directed acyclic graph has a topological ordering, but why? What happens if we try to find a topological ordering of a directed graph that contains a cycle? Let&rsquo;s take <code>figure 3</code> as an example.</p>

<figure><a href="/ox-hugo/dag_issue.png">
    
    
    <input type="checkbox" id="zoomCheck-a4346" hidden>
    <label for="zoomCheck-a4346">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/dag_issue.png"/> 
    
    
    </label></a>
</figure>

<p>The problem has no solution for a directed graph with a cycle: every vertex on the cycle would have to come before itself in the ordering. This is why directed cycles are forbidden.</p>
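<p>One way to convince yourself is brute force: try every permutation of the vertices and check whether any of them respects all the edges. The sketch below does exactly that. The function name <code>hasTopologicalOrder</code> is made up for this example, and the approach is only practical for tiny graphs, since it examines up to <code>n!</code> permutations.</p>

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Returns true if some permutation of the vertices 0..n-1 places u before v
// for every edge u -> v in `adj`. Brute force: only suitable for tiny graphs.
bool hasTopologicalOrder(const std::vector<std::vector<int>> &adj) {
  const int n = static_cast<int>(adj.size());
  std::vector<int> order(n);
  std::iota(order.begin(), order.end(), 0); // 0, 1, ..., n-1 (sorted)

  do {
    // position[v] is where vertex v sits in the current permutation.
    std::vector<int> position(n);
    for (int i = 0; i < n; i++) {
      position[order[i]] = i;
    }

    bool valid = true;
    for (int u = 0; u < n && valid; u++) {
      for (const int v : adj[u]) {
        if (position[u] >= position[v]) {
          valid = false; // edge u -> v points backwards
          break;
        }
      }
    }
    if (valid) {
      return true;
    }
  } while (std::next_permutation(order.begin(), order.end()));

  return false;
}
```

<p>For a hypothetical three-vertex cycle <code>0 -> 1 -> 2 -> 0</code> (not necessarily the graph in the figure), all six permutations fail and the function returns <code>false</code>. Dropping the edge <code>2 -> 0</code> turns the graph into a chain, and the function returns <code>true</code>.</p>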
<h2 id="kahn-s-algorithm"><span class="section-num">4</span> Kahn&rsquo;s Algorithm</h2>
<p>There are several <a href="https://en.wikipedia.org/wiki/Topological_sorting#Algorithms">algorithms</a> for topological sorting. Kahn&rsquo;s algorithm is one of them, based on breadth-first search.</p>
<p>The intuition behind Kahn&rsquo;s algorithm is pretty straightforward:</p>
<p><strong>Repeatedly remove nodes without any dependencies from the graph and add them to the topological ordering.</strong></p>
<p>As nodes without dependencies are removed from the graph, the nodes that depended on them lose those dependencies and may become free themselves.</p>
<p>We keep removing nodes without dependencies until all nodes are processed. If nodes remain but none of them is free of dependencies, the graph contains a cycle.</p>
<p>The number of dependencies of a node is its in-degree, that is, the number of its incoming edges.</p>
<p>Let&rsquo;s walk through a quick example of finding a topological ordering of a given graph with Kahn&rsquo;s algorithm.</p>

<figure><a href="/ox-hugo/kahn%27s_algorithm_1.png">
    
    
    <input type="checkbox" id="zoomCheck-c84fa" hidden>
    <label for="zoomCheck-c84fa">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/kahn%27s_algorithm_1.png"/> 
    
    
    </label></a>
</figure>


<figure><a href="/ox-hugo/kahn%27s_algorithm_2.png">
    
    
    <input type="checkbox" id="zoomCheck-6a74a" hidden>
    <label for="zoomCheck-6a74a">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/kahn%27s_algorithm_2.png"/> 
    
    
    </label></a>
</figure>


<figure><a href="/ox-hugo/kahn%27s_algorithm_3.png">
    
    
    <input type="checkbox" id="zoomCheck-a6975" hidden>
    <label for="zoomCheck-a6975">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/kahn%27s_algorithm_3.png"/> 
    
    
    </label></a>
</figure>


<figure><a href="/ox-hugo/kahn%27s_algorithm_4.png">
    
    
    <input type="checkbox" id="zoomCheck-020a2" hidden>
    <label for="zoomCheck-020a2">
    
    
    <img class="zoomCheck" loading="lazy" src="/ox-hugo/kahn%27s_algorithm_4.png"/> 
    
    
    </label></a>
</figure>

<p>Now we should understand how Kahn&rsquo;s algorithm works. Let&rsquo;s have a look at a C++ implementation of Kahn&rsquo;s algorithm:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-C++" data-lang="C++"><span class="line"><span class="cl"><span class="cp">#include</span> <span class="cpf">&lt;deque&gt;</span><span class="cp">
</span></span></span><span class="line"><span class="cl"><span class="cp">#include</span> <span class="cpf">&lt;vector&gt;</span><span class="cp">
</span></span></span><span class="line"><span class="cl"><span class="cp"></span><span class="c1">// Kahn&#39;s algorithm
</span></span></span><span class="line"><span class="cl"><span class="c1">// `adj` is a directed acyclic graph represented as an adjacency list.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span>
</span></span><span class="line"><span class="cl"><span class="n">findTopologicalOrder</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;&gt;</span> <span class="o">&amp;</span><span class="n">adj</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="kt">int</span> <span class="n">n</span> <span class="o">=</span> <span class="n">adj</span><span class="p">.</span><span class="n">size</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">in_degree</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">n</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">auto</span> <span class="o">&amp;</span><span class="nl">to_vertex</span> <span class="p">:</span> <span class="n">adj</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">in_degree</span><span class="p">[</span><span class="n">to_vertex</span><span class="p">]</span><span class="o">++</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// queue contains nodes with no incoming edges
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="n">std</span><span class="o">::</span><span class="n">deque</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">queue</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">n</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="p">(</span><span class="n">in_degree</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="n">queue</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">order</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="kt">int</span> <span class="n">index</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="k">while</span> <span class="p">(</span><span class="n">queue</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">int</span> <span class="n">cur</span> <span class="o">=</span> <span class="n">queue</span><span class="p">.</span><span class="n">front</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">queue</span><span class="p">.</span><span class="n">pop_front</span><span class="p">();</span>
</span></span><span class="line"><span class="cl">    <span class="n">order</span><span class="p">[</span><span class="n">index</span><span class="o">++</span><span class="p">]</span> <span class="o">=</span> <span class="n">cur</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">auto</span> <span class="o">&amp;</span><span class="nl">next</span> <span class="p">:</span> <span class="n">adj</span><span class="p">[</span><span class="n">cur</span><span class="p">])</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="k">if</span> <span class="p">(</span><span class="o">--</span><span class="n">in_degree</span><span class="p">[</span><span class="n">next</span><span class="p">]</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="n">queue</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">next</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// there is no cycle
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>  <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="n">index</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">order</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// return an empty list if there is a cycle
</span></span></span><span class="line"><span class="cl"><span class="c1"></span>    <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span><span class="p">{};</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="bonus"><span class="section-num">5</span> Bonus</h2>
<p>When a pregnant woman takes calcium pills, she must make sure also that her diet is rich in vitamin D, since this vitamin makes the absorption of calcium possible.</p>
<p>After reading this demonstration of topological ordering, you (and I) should likewise take a certain vitamin, metaphorically speaking, to help absorb it. The vitamin D I have picked for you (and myself) is two LeetCode problems, both of which involve the most typical use case of topological ordering &ndash; college class prerequisites:</p>
<ul>
<li><a href="https://leetcode.com/problems/course-schedule/">Course Schedule</a></li>
<li><a href="https://leetcode.com/problems/course-schedule-ii/">Course Schedule II</a></li>
</ul>
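<p>If you want a starting point for the first problem, here is a minimal sketch of how Kahn&rsquo;s algorithm maps onto it. It is only one possible approach, not the official solution: the function name <code>can_finish</code> and the pair-based input are my own assumptions for illustration, and LeetCode&rsquo;s actual signature differs. Each pair <code>{course, prereq}</code> is treated as an edge from <code>prereq</code> to <code>course</code>, and the courses can all be finished exactly when every node leaves the queue, that is, when the graph has no cycle.</p>

```cpp
#include <deque>
#include <utility>
#include <vector>

// Hypothetical helper (name and signature are assumptions for illustration).
// Each pair {course, prereq} means `prereq` must be taken before `course`.
// Returns true when every course can be scheduled, i.e. the graph is acyclic.
bool can_finish(int num_courses,
                const std::vector<std::pair<int, int>> &prerequisites) {
  std::vector<std::vector<int>> adj(num_courses);
  std::vector<int> in_degree(num_courses, 0);

  // Build the edge prereq -> course and count incoming edges per course.
  for (const auto &p : prerequisites) {
    adj[p.second].push_back(p.first);
    ++in_degree[p.first];
  }

  // Start with every course that has no prerequisite.
  std::deque<int> queue;
  for (int i = 0; i < num_courses; i++) {
    if (in_degree[i] == 0) {
      queue.push_back(i);
    }
  }

  // Count how many courses can be removed from the graph.
  int visited = 0;
  while (!queue.empty()) {
    int cur = queue.front();
    queue.pop_front();
    ++visited;

    for (int next : adj[cur]) {
      if (--in_degree[next] == 0) {
        queue.push_back(next);
      }
    }
  }

  // Courses stuck on a cycle never reach in-degree 0, so they are never counted.
  return visited == num_courses;
}
```

<p>For the second problem, the same loop should work if you record each dequeued course in a vector instead of only counting it, much like the <code>order</code> vector in the implementation above.</p>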
<h2 id="reference"><span class="section-num">6</span> References</h2>
<ul>
<li><a href="https://www.youtube.com/watch?v=cIBFEhD77b4">Topological Sort | Kahn&rsquo;s Algorithm | Graph Theory</a></li>
<li><a href="https://docs.gitlab.com/ee/ci/directed_acyclic_graph/">Directed Acyclic Graph</a></li>
<li><a href="https://gertjanvandenburg.com/files/talk/make.html">Hands-on Tutorial on Make</a></li>
<li><a href="https://en.wikipedia.org/wiki/Topological_sorting">Topological sorting</a></li>
</ul>
]]></content:encoded>
    </item>
  </channel>
</rss>
