<?xml version="1.0" encoding="utf-8"?><rss version="2.0" xml:lang="en" xmlns:atom="http://www.w3.org/2005/Atom"><channel><link href="https://flowerinthenight.com/atom.xml" rel="self" type="application/atom+xml"/><link href="https://flowerinthenight.com/" rel="alternate" type="text/html"/><link href="https://flowerinthenight.com/paige-search.json" rel="alternate" type="application/json"/><link href="https://flowerinthenight.com/rss.xml" rel="alternate" type="application/rss+xml"/><copyright>© Flowerinthenight, 2016-2025. All rights reserved.</copyright><description>Recent content</description><language>en</language><lastBuildDate>2025-12-02 00:00:00 +0900 JST</lastBuildDate><link>https://flowerinthenight.com/</link><managingEditor>root@flowerinthenight.com (Flowerinthenight)</managingEditor><title/><webMaster>root@flowerinthenight.com (Flowerinthenight)</webMaster><item><description><![CDATA[<br>
<p>This is a guide for my (future) self.</p>
<p>To set up <a href="https://filen.io/">Filen</a> as a systemd service, follow the guide below. As an example, we will use <code>user1</code> as the username.</p>
<p><strong>[1]</strong> Create a unit file under <code>/usr/lib/systemd/system/</code> for our service. Take note of the binary location and the mount point.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">cat &gt;/usr/lib/systemd/system/filen-mount.service <span class="s">&lt;&lt;EOL
</span></span></span><span class="line"><span class="cl"><span class="s">[Unit]
</span></span></span><span class="line"><span class="cl"><span class="s">Description=Filen CLI Mount Service
</span></span></span><span class="line"><span class="cl"><span class="s">After=network-online.target
</span></span></span><span class="line"><span class="cl"><span class="s">Wants=network-online.target
</span></span></span><span class="line"><span class="cl"><span class="s">
</span></span></span><span class="line"><span class="cl"><span class="s">[Service]
</span></span></span><span class="line"><span class="cl"><span class="s">Type=simple
</span></span></span><span class="line"><span class="cl"><span class="s">ExecStart=/home/user1/.local/bin/filen mount /home/user1/filen/ --quiet
</span></span></span><span class="line"><span class="cl"><span class="s">Restart=on-failure
</span></span></span><span class="line"><span class="cl"><span class="s">RestartSec=5
</span></span></span><span class="line"><span class="cl"><span class="s">User=user1
</span></span></span><span class="line"><span class="cl"><span class="s">WorkingDirectory=/home/user1/
</span></span></span><span class="line"><span class="cl"><span class="s">
</span></span></span><span class="line"><span class="cl"><span class="s">[Install]
</span></span></span><span class="line"><span class="cl"><span class="s">WantedBy=multi-user.target
</span></span></span><span class="line"><span class="cl"><span class="s">EOL</span>
</span></span></code></pre></div></blockquote>
<p><strong>[2]</strong> Set the service to start on boot.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">systemctl daemon-reload
</span></span><span class="line"><span class="cl">systemctl <span class="nb">enable</span> filen-mount
</span></span><span class="line"><span class="cl">systemctl start filen-mount
</span></span></code></pre></div></blockquote>
<p>(You might need <code>sudo</code> for these commands.)</p>
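<p>To confirm everything worked, here is a quick sanity check (a sketch; <code>mountpoint</code> comes from util-linux, and the path matches the unit file above):</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"># Service should be active, and the directory should be a live mount:
systemctl status filen-mount --no-pager
mountpoint /home/user1/filen/

# Logs, in case something went wrong:
journalctl -u filen-mount -n 50 --no-pager
</code></pre></div></blockquote>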
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-12-02:/blog/2025-12-02-filen-systemd/</guid><link>https://flowerinthenight.com/blog/2025-12-02-filen-systemd/</link><pubDate>Tue, 02 Dec 2025 00:00:00 JST</pubDate><title>Mount Filen as a systemd service</title></item><item><description><![CDATA[<br>
<p>Introducing another side-project of mine: <a href="https://github.com/flowerinthenight/luna">Luna</a>.</p>
<p><strong>Luna</strong> is part of my ongoing familiarization with Rust. In previous blog entries, I&rsquo;ve mentioned that I <a href="/blog/2025-03-18-on-rust/">decided</a> to choose Rust as the additional non-GC systems programming language to complement our use of Go at <a href="https://alphaus.cloud/">work</a>. Since then, there have been several attempts to write some of our critical systems in Rust, but due to insufficient know-how, none have materialized yet.</p>
<p>Recently, however, one of our core data processing engines, <a href="/blog/2024-07-24-spillover-store/">Sapphire</a>, has been nearing its limits in terms of scale. Being mainly a data ingestion and processing engine, one of its bottlenecks is how fast it can query data from BigQuery. Now, BigQuery is actually quite performant as long as you stay on top of its scaling peculiarities. It is a bit pricey, though. So instead of upgrading BigQuery&rsquo;s compute capacity across the board, we started exploring the idea of caching some of the heavier query results; sort of a Redis for columnar SQL data.</p>
<p>That&rsquo;s where Luna comes in. Strictly speaking, it is <a href="https://duckdb.org/">DuckDB</a>&rsquo;s in-memory SQL capabilities that are being leveraged here; Luna is just the host process for an embedded, in-memory DuckDB instance. But that makes Luna a full-fledged SQL database server. We just have to add the must-haves and nice-to-haves expected of cache servers, such as an auth mechanism, distributed capabilities, scale, and some degree of high availability. And I think Rust is a good choice here: precise control of memory, with strong security defaults.</p>
<p>I&rsquo;ve decided to make it open source as well. I think this is good software to open-source; certainly not the first of its kind (there are already several out there), but I think the difference will be in governance and roadmap.</p>
<p>So, if this is right up your alley, check it out on <a href="https://github.com/flowerinthenight/luna">GitHub</a>. And contribute, if you can!</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-10-01:/blog/2025-10-01-luna-sql/</guid><link>https://flowerinthenight.com/blog/2025-10-01-luna-sql/</link><pubDate>Wed, 01 Oct 2025 00:00:00 JST</pubDate><title>Introducing Luna, an in-memory SQL cache</title></item><item><description><![CDATA[<br>
<p>When building, and especially testing, <a href="https://vortex.nightblue.io/">Vortex</a>, we use several cloud VMs with different kernel versions (usually LTS ones), plus a dedicated GKE test cluster. The trouble with eBPF, though, is that many of its features depend on the kernel version running on the target system. And maintaining multiple cloud VMs with different kernel versions is a tad expensive, and really not that straightforward.</p>
<p>And so, <a href="https://www.qemu.org/">QEMU</a> to the rescue. When it comes to emulators, QEMU doesn&rsquo;t really need an introduction. This blog is a guide (mostly to myself) on how to build a Debian-based image with a specific kernel version to be used as an eBPF testbed.</p>
<p>This guide assumes an Ubuntu-based system; a cloud VM with KVM enabled also works. First, let&rsquo;s install the dependencies.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ sudo apt update
</span></span><span class="line"><span class="cl">$ sudo apt install git make gcc flex bison libncurses-dev libelf-dev libssl-dev <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  debootstrap dwarves qemu-system -y
</span></span></code></pre></div></blockquote>
<p>Next, we clone the stable version of the Linux kernel. I&rsquo;ll be using <code>workdir</code> as the working directory.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ mkdir -p workdir/
</span></span><span class="line"><span class="cl">$ <span class="nb">cd</span> workdir/
</span></span><span class="line"><span class="cl">$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Checkout desired version (tag):</span>
</span></span><span class="line"><span class="cl">$ <span class="nb">cd</span> linux-stable/
</span></span><span class="line"><span class="cl">$ git checkout -b v6.6.102 v6.6.102
</span></span></code></pre></div></blockquote>
<p>After some trial and error, the following config seems to work for eBPF testing.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ make defconfig
</span></span><span class="line"><span class="cl">$ make kvm_guest.config
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Required for Debian Stretch and later:</span>
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_CONFIGFS_FS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_SECURITYFS y
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># BPF-related configs:</span>
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_BPF y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_BPF_SYSCALL y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_MODULES y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_BPF_EVENTS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_PERF_EVENTS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_HAVE_PERF_EVENTS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_FUNCTION_TRACER y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_DEBUG_INFO y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_DEBUG_INFO_BTF y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_NET_CLS_BPF y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_NET_ACT_BPF y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_NET_SCH_INGRESS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_BPF_JIT y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_HAVE_BPF_JIT y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_CGROUP_BPF y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_KPROBES y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_HAVE_KPROBES y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_KPROBE_EVENTS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_KPROBES_ON_FTRACE y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_UPROBES y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_UPROBE_EVENTS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_ARCH_SUPPORTS_UPROBES y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_MMU y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_TRACEPOINTS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_HAVE_SYSCALL_TRACEPOINTS y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_FTRACE y
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_FTRACE_SYSCALLS y
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">$ ./scripts/config --set-val CONFIG_CMDLINE_BOOL y
</span></span><span class="line"><span class="cl">$ <span class="nb">echo</span> <span class="s1">&#39;CONFIG_CMDLINE=&#34;net.ifnames=0&#34;&#39;</span> &gt;&gt; .config
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">$ make olddefconfig
</span></span></code></pre></div></blockquote>
<p>After setting up the build config, we can now build the kernel.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ make -j<span class="k">$(</span>nproc<span class="k">)</span>
</span></span></code></pre></div></blockquote>
<p><code>arch/x86/boot/bzImage</code> is now our newly-built kernel. Next, let&rsquo;s build a <a href="https://www.debian.org/releases/bullseye/">Debian-based (bullseye)</a> image for QEMU to boot with our custom kernel.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ <span class="nb">cd</span> ../
</span></span><span class="line"><span class="cl">$ mkdir -p debian-bullseye/
</span></span><span class="line"><span class="cl">$ <span class="nb">cd</span> debian-bullseye/
</span></span><span class="line"><span class="cl">$ wget https://raw.githubusercontent.com/google/syzkaller/master/tools/create-image.sh
</span></span><span class="line"><span class="cl">$ chmod +x create-image.sh
</span></span><span class="line"><span class="cl">$ ./create-image.sh --feature full
</span></span></code></pre></div></blockquote>
<p><code>bullseye.img</code> is now our new Linux image. Let&rsquo;s boot it using QEMU, mapping local port 10021 to the VM&rsquo;s SSH port (22).</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ <span class="nb">cd</span> ../
</span></span><span class="line"><span class="cl">$ qemu-system-x86_64 <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -m 2G <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -smp <span class="m">2</span> <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -kernel linux-stable/arch/x86/boot/bzImage <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -append <span class="s2">&#34;console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -drive <span class="nv">file</span><span class="o">=</span>debian-bullseye/bullseye.img,format<span class="o">=</span>raw <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -net user,host<span class="o">=</span>10.0.2.10,hostfwd<span class="o">=</span>tcp:127.0.0.1:10021-:22 <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -net nic,model<span class="o">=</span>e1000 <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -enable-kvm <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -nographic <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -pidfile vm.pid <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  2&gt;<span class="p">&amp;</span><span class="m">1</span> <span class="p">|</span> tee vm.log
</span></span></code></pre></div></blockquote>
<p>Using a separate terminal, we can now use <code>ssh</code> and <code>scp</code> to access the VM and copy files to it. Let&rsquo;s copy the <code>vortex-agent</code> binary to the home directory.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ scp -i debian-bullseye/bullseye.id_rsa -P <span class="m">10021</span> <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -o <span class="s2">&#34;StrictHostKeyChecking no&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  <span class="nv">$VORTEX_AGENT_ROOT</span>/bin/vortex-agent <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  root@localhost:~/
</span></span></code></pre></div></blockquote>
<p>Finally, we can <code>ssh</code> to the VM.</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ ssh -i debian-bullseye/bullseye.id_rsa -p <span class="m">10021</span> <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  -o <span class="s2">&#34;StrictHostKeyChecking no&#34;</span> root@localhost
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Then run the binary:</span>
</span></span><span class="line"><span class="cl">$ ./vortex-agent run --logtostderr
</span></span></code></pre></div></blockquote>
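<p>Once inside the VM, it may be worth confirming that the guest is actually running our custom kernel and that the eBPF prerequisites are in place (a quick sanity check; <code>/sys/kernel/btf/vmlinux</code> exists when <code>CONFIG_DEBUG_INFO_BTF=y</code>):</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"># Should report our custom kernel version (6.6.102 in this guide):
uname -r

# BTF blob consumed by CO-RE eBPF programs:
ls -lh /sys/kernel/btf/vmlinux

# Tracefs should be mounted for kprobe/uprobe events:
mount | grep tracefs
</code></pre></div></blockquote>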
<p>To shut down the VM, we can either run:</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ poweroff
</span></span></code></pre></div></blockquote>
<p>from within the VM, or kill the process from outside:</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ <span class="nb">kill</span> <span class="o">[</span>-9<span class="o">]</span> <span class="k">$(</span>cat vm.pid<span class="k">)</span>
</span></span></code></pre></div></blockquote>
<br>
<p>Related blogs:</p>
<ol>
<li><a href="/blog/2025-08-21-on-building-vortex/">On building Vortex</a></li>
<li>This blog</li>
</ol>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-09-03:/blog/2025-09-03-ebpf-qemu-test/</guid><link>https://flowerinthenight.com/blog/2025-09-03-ebpf-qemu-test/</link><pubDate>Wed, 03 Sep 2025 00:00:00 JST</pubDate><title>Using QEMU for eBPF testing</title></item><item><description><![CDATA[<br>
<p>A quick update on what I&rsquo;ve been building for the last couple of weeks.</p>
<p>I’ve put together a small team at <a href="https://alphaus.cloud/">Alphaus</a> to explore <a href="https://ebpf.io/">eBPF</a> with a specific goal: building a tool for AI security and monitoring. The kernel-level visibility you get with eBPF is perfect for this, but the real work is in making it useful.</p>
<p>Our immediate focus has been on getting visibility into encrypted network traffic. We needed a way to inspect TLS traffic without messing with certificates or acting as a man-in-the-middle. The obvious path was using eBPF uprobes to hook user-space functions directly.</p>
<p>We&rsquo;re calling the tool <a href="https://vortex.nightblue.io/">Vortex</a> (thought it sounded cool). It looks something like this:</p>
<br>
<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><a href="https://vortex.nightblue.io/">
<div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20250821-vortex-design-01.png"   style="height: auto; width: 90%"   >
</div>
</a>
</div>


<figcaption class="figure-caption mt-2 text-center" >Vortex design</figcaption>

</figure>
</div>
</div>

<br>
<p>And I&rsquo;m happy to report we&rsquo;ve had our first major successes. We can now reliably see plaintext traffic for apps built on Python (and other OpenSSL-based apps) by hooking into the <code>SSL_read[_ex]</code> and <code>SSL_write[_ex]</code> functions in the underlying OpenSSL library that Python&rsquo;s <code>ssl</code> module (and many other tools) use. This gives us the unencrypted buffer just before it&rsquo;s handed to the kernel for sending, or just after it&rsquo;s been received and decrypted.</p>
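<p>As an illustration of the general idea (not Vortex&rsquo;s actual code), a minimal <a href="https://github.com/bpftrace/bpftrace">bpftrace</a> one-liner can attach a uprobe to <code>SSL_write</code> and report plaintext sizes before encryption; the libssl path here is an assumption and varies by distro and version:</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh">sudo bpftrace -e '
uprobe:/usr/lib/x86_64-linux-gnu/libssl.so.3:SSL_write {
  // SSL_write(ssl, buf, num): arg2 is the plaintext length.
  printf("pid %d (%s) wrote %d plaintext bytes\n", pid, comm, arg2);
}'
</code></pre></div></blockquote>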
<p>The same fundamental approach also works for NodeJS applications, although that part is still in progress. NodeJS ships its own TLS stack, but fortunately for us it uses the same function names, so we can attach to those as well and pull the plaintext data out.</p>
<p>This means we can now monitor the raw traffic of a huge range of services without any code instrumentation or network configuration changes.</p>
<p>Why are we doing this? The goal is to feed this traffic data into our monitoring agent. From there, we can do interesting work like detecting anomalous traffic patterns to (and from) AI models, identifying potential data exfiltration over encrypted channels, getting a clear map of what your AI services are actually talking to, etc.</p>
<p>It’s not trivial, of course. Dealing with function offsets, different library versions, and the nuances between runtimes is a challenge, but the core mechanism is solid.</p>
<p>More to share soon as we build out the next pieces.</p>
<blockquote>
<p><strong>Sidenote:</strong> This project actually got me excited about programming once again, as it involves tinkering with the low-level guts of the Linux kernel, which I love.</p></blockquote>
<p>The source code is <a href="https://github.com/flowerinthenight/vortex-agent/">here</a>, if you&rsquo;re interested.</p>
<br>
<p>Related blogs:</p>
<ol>
<li>This blog</li>
<li><a href="/blog/2025-09-03-ebpf-qemu-test/">Using QEMU for eBPF testing</a></li>
</ol>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-08-21:/blog/2025-08-21-on-building-vortex/</guid><link>https://flowerinthenight.com/blog/2025-08-21-on-building-vortex/</link><pubDate>Thu, 21 Aug 2025 00:00:00 JST</pubDate><title>On building Vortex</title></item><item><description><![CDATA[<br>
<p>Apologies for being silent for a while. Still alive here; still doing CTO work for Alphaus. At the same time, since about the start of the year, I had started another startup here in Tokyo with two other cofounders. Long story short: it didn&rsquo;t really pan out, at least for me. The startup is still moving forward, but with me no longer in it. I &ldquo;resigned&rdquo;, just recently actually, due to founder conflict. There was a lot going for it: it got accepted into the Antler Japan accelerator program (Batch 4) this year, and it received infra backing from Google Cloud worth about $100K. Anyway, it is what it is, and I have to move on.</p>
<p>To some extent, I&rsquo;m actually quite relieved that it didn&rsquo;t work out for me, as I was on the brink of burning out. I spent most of my non-work time (late nights, weekends, holidays) on that startup, to the point that it affected my health and personal relationships (not surprising, really). Continuing down that path would have been disastrous.</p>
<p>Now, I&rsquo;m working on another startup (again) in my spare time, and at my own pace. I&rsquo;m still with Alphaus during normal work hours, but I&rsquo;ve been building this new one slowly and steadily. It&rsquo;s still bootstrapped at the moment; I don&rsquo;t think it&rsquo;s a good idea, nor good timing, to entertain any form of outside investment at this point given its current state. My CEO knows about it, and he&rsquo;s very supportive.</p>
<p>I might talk about it at some point in the future but in the meantime, I&rsquo;ll continue building it in stealth mode.</p>
<p>Just so you know.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-06-15:/blog/2025-06-15-some-updates/</guid><link>https://flowerinthenight.com/blog/2025-06-15-some-updates/</link><pubDate>Sun, 15 Jun 2025 00:00:00 JST</pubDate><title>What’s been going on?</title></item><item><description><![CDATA[<p>This is a common phrase I come across in Rust forums and IRC rooms. And I got a taste of what it is after I finished porting <a href="https://github.com/flowerinthenight/hedge/">hedge</a> to Rust (called <a href="https://github.com/flowerinthenight/hedge-rs/">hedge-rs</a>, obviously). Coming from C/C++/Go/Zig, working with Rust&rsquo;s borrow checker takes some getting used to. Just to get the thing working, at least initially, I had to resort to bits of circumvention that feel like &ldquo;dirty&rdquo; tricks to me at the moment. I&rsquo;m sure I&rsquo;ll learn the more idiomatic ways of doing these in the (near) future as I write more Rust code. Here are some of those observations.</p>
<p>For context, <a href="https://github.com/flowerinthenight/hedge-rs/">hedge-rs</a> is a non-async, multithreaded code that runs on multiple nodes communicating with each other using an internal protocol on top of TCP.</p>
<h4 id="lots-of-cloneing">Lots of <code>.clone()</code>ing</h4>
<p>In Rust, accessing a variable from another thread isn&rsquo;t as straightforward. This doesn&rsquo;t compile.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">args</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">env</span>::<span class="n">args</span><span class="p">().</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;args1: </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">args</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The new thread is &ldquo;borrowing&rdquo; <code>args</code> from the calling function, which it could outlive. When <code>args</code> goes out of scope, the thread could be accessing a value that is no longer there.</p>
<p>We could &ldquo;move&rdquo; <code>args</code> into the new thread by doing this.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">args</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">env</span>::<span class="n">args</span><span class="p">().</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Move args to the new thread.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;args1: </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">args</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This now compiles (<code>args</code> has been moved to the new thread), but then I can no longer do this.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">args</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">env</span>::<span class="n">args</span><span class="p">().</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Move args to the new thread.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;args1: </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">args</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// args was now moved to the other thread.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;args2: </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">args</span><span class="p">[</span><span class="mi">2</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Even if we argue that this code is correct because we only read the values and never modify them, it won&rsquo;t compile: <code>args</code> is moved into the closure, so the later use of <code>args</code> in <code>main</code> is an error. We can clone (or copy) the data instead. The following compiles.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">args</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">env</span>::<span class="n">args</span><span class="p">().</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Create a clone of args; move to new thread.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">args_clone</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">args</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;args1: </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">args_clone</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// args is still available to us here.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;args2: </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">args</span><span class="p">[</span><span class="mi">2</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>There are several of these in <a href="https://github.com/flowerinthenight/hedge-rs/">hedge-rs</a>. I don&rsquo;t see this as a big problem for small, stack-allocated values, but the extra allocation and copy might matter for big-ish structs or arrays/vectors.</p>
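<p>For reference, here is a complete, runnable version of the clone approach, with the <code>use</code> statements the snippets above elide, a <code>join</code> so the spawned thread finishes before <code>main</code> returns, and the whole vector printed so it doesn&rsquo;t panic when no arguments are passed:</p>

```rust
use std::env;
use std::thread;

fn main() {
    let args: Vec<String> = env::args().collect();

    // Clone before the move so the original stays usable.
    let args_clone = args.clone();
    let handle = thread::spawn(move || {
        // Only args_clone was moved into this closure.
        println!("in thread: {:?}", args_clone);
    });

    // args is still valid here because the clone, not the
    // original, was moved.
    println!("in main: {:?}", args);

    // Without this join, main may return before the spawned
    // thread gets to print anything.
    handle.join().unwrap();
}
```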
<h4 id="arcmutext-for-mutable-objects"><code>Arc&lt;Mutex&lt;T&gt;&gt;</code> for mutable objects</h4>
<p>To pass mutable objects around across threads, Rust provides <code>Arc&lt;Mutex&lt;T&gt;&gt;</code>: a mutex inside an &ldquo;Atomically Reference Counted&rdquo; pointer. This lets us mutate a shared value from multiple threads in a thread-safe manner. (The examples below assume <code>use std::sync::{Arc, Mutex};</code> and <code>use std::thread;</code>.)</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Debug)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">struct</span> <span class="nc">X</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">val</span>: <span class="kt">i32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">impl</span><span class="w"> </span><span class="n">X</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">inc</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">val</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">val</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">Mutex</span>::<span class="n">new</span><span class="p">(</span><span class="n">X</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">val</span>: <span class="mi">0</span><span class="w"> </span><span class="p">}));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x_clone</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">x_clone</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">().</span><span class="n">inc</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// mutex unlocks when out of scope
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">().</span><span class="n">inc</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// mutex unlocks when out of scope
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In this code, <code>x_clone</code> looks like our previous example, but it isn&rsquo;t the same thing. For <code>Arc</code>s, cloning copies the pointer and increments the reference count; it does not clone the inner object. <code>x_clone</code> and <code>x</code> still point to the same <code>X</code> object, and <code>Arc</code> only cleans up the inner object once the reference count drops to zero.</p>
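<p>To make the shared-ownership point concrete, here is a small runnable sketch (not from the post) where a mutation done through the clone is visible through the original handle:</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));

    let counter_clone = Arc::clone(&counter);
    let handle = thread::spawn(move || {
        // Lock, mutate, then implicitly unlock when the guard drops.
        *counter_clone.lock().unwrap() += 1;
    });
    handle.join().unwrap();

    // Both handles point at the same inner value, so the
    // thread's increment is visible here.
    assert_eq!(*counter.lock().unwrap(), 1);
    println!("counter = {}", counter.lock().unwrap());
}
```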
<p>This is actually alright for me, albeit with the slightly odd ergonomics of <code>.clone()</code>, and of <code>.lock()</code> with its implicit unlock when the guard goes out of scope. What is slightly unexpected is that the same applies to atomics: I still have to wrap an atomic in an <code>Arc</code> to share it across threads. For example (assuming <code>use std::sync::atomic::{AtomicU32, Ordering};</code>),</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">AtomicU32</span>::<span class="n">new</span><span class="p">(</span><span class="mi">0</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">a_clone</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">a_clone</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="n">Ordering</span>::<span class="n">Relaxed</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">a_clone</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">Ordering</span>::<span class="n">Acquire</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="w"> </span><span class="n">Ordering</span>::<span class="n">Relaxed</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">Ordering</span>::<span class="n">Acquire</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I feel like there shouldn&rsquo;t be a need for <code>Arc</code> here since the value is, you know, already atomic. To be fair, the <code>Arc</code> provides shared ownership rather than synchronization: <code>thread::spawn</code> requires a <code>&#39;static</code> closure, so it cannot simply borrow a local atomic.</p>
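<p>Scoped threads do remove the need for the <code>Arc</code> entirely: <code>std::thread::scope</code> (stable since Rust 1.63) guarantees that spawned threads finish before the scope returns, so they can borrow a local atomic directly. A runnable sketch:</p>

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

fn main() {
    let a = AtomicU32::new(0);

    // Scoped threads are joined before scope() returns, so the
    // closure may borrow `a`; no Arc, no move of ownership.
    thread::scope(|s| {
        s.spawn(|| {
            a.store(1, Ordering::Relaxed);
        });
    });

    // The scoped thread has finished by now.
    a.fetch_add(1, Ordering::Relaxed);
    assert_eq!(a.load(Ordering::Relaxed), 2);
    println!("{}", a.load(Ordering::Relaxed));
}
```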
<h4 id="using-vect-for-late-initialization-of-objects">Using <code>Vec&lt;T&gt;</code> for late initialization of objects</h4>
<p>This one definitely feels like a cheat to me. Say you have a struct field that you want to initialize later, like this <code>Lock</code> inside <code>Op</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Op</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">lock</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="n">Mutex</span><span class="o">&lt;</span><span class="n">Lock</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tx_worker</span>: <span class="nc">Sender</span><span class="o">&lt;</span><span class="n">WorkerCtrl</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Instantiating <code>Op</code> without setting up both <code>lock</code> and <code>tx_worker</code> up front is not that straightforward; Rust has no null, so something like this won&rsquo;t do:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">o</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Op</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">lock</span>: <span class="o">???</span><span class="p">,</span><span class="w"> </span><span class="n">tx_worker</span>: <span class="o">???</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c1">// Initialize o.lock here.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="c1">// Initialize o.tx_worker here.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><p>What did I do? Use vectors instead.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Op</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">lock</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Arc</span><span class="o">&lt;</span><span class="n">Mutex</span><span class="o">&lt;</span><span class="n">Lock</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tx_worker</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Sender</span><span class="o">&lt;</span><span class="n">WorkerCtrl</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// So I can do this:
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">o</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Op</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">lock</span>: <span class="nc">vec</span><span class="o">!</span><span class="p">[],</span><span class="w"> </span><span class="n">tx_worker</span>: <span class="nc">vec</span><span class="o">!</span><span class="p">[]</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Late initialization of lock.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="n">o</span><span class="p">.</span><span class="n">lock</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">Mutex</span>::<span class="n">new</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">LockBuilder</span>::<span class="n">new</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">db</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">db</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">table</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">table</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">name</span><span class="p">(</span><span class="n">lock_name</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">id</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">id</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">lease_ms</span><span class="p">(</span><span class="n">lease_ms</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">leader_tx</span><span class="p">(</span><span class="nb">Some</span><span class="p">(</span><span class="n">tx_ldr</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">build</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">))];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// And late initialization of o.tx_worker.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">tx</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span><span class="p">)</span>: <span class="p">(</span><span class="n">Sender</span><span class="o">&lt;</span><span class="n">WorkerCtrl</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="n">WorkerCtrl</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">o</span><span class="p">.</span><span class="n">tx_worker</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="n">tx</span><span class="p">.</span><span class="n">clone</span><span class="p">()];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// I can then use lock like so:
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">o</span><span class="p">.</span><span class="n">lock</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// use o.lock[0]
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Same with tx_worker.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">o</span><span class="p">.</span><span class="n">tx_worker</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// use o.tx_worker[0]
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>It doesn&rsquo;t feel right, but it works.</p>
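<p>For what it&rsquo;s worth, the more conventional tool for late initialization in Rust is <code>Option&lt;T&gt;</code>. A minimal sketch, using a simplified stand-in for the <code>Lock</code> type above (the real one is built via <code>LockBuilder</code> and carries db/table/lease state):</p>

```rust
use std::sync::{Arc, Mutex};

// Simplified, hypothetical stand-in for the Lock type above.
struct Lock {
    name: String,
}

struct Op {
    lock: Option<Arc<Mutex<Lock>>>,
}

fn main() {
    // Instantiate with no lock yet; no Vec needed.
    let mut o = Op { lock: None };

    // Late initialization.
    o.lock = Some(Arc::new(Mutex::new(Lock {
        name: String::from("my-lock"),
    })));

    // Analogous to the len() > 0 check.
    if let Some(lock) = &o.lock {
        println!("lock name: {}", lock.lock().unwrap().name);
    }
}
```

<p>Whether this reads better than the <code>Vec</code> trick is a matter of taste, but <code>Option</code> makes the &ldquo;maybe not initialized yet&rdquo; state explicit in the type.</p>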
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-03-28:/blog/2025-03-28-more-on-rust/</guid><link>https://flowerinthenight.com/blog/2025-03-28-more-on-rust/</link><pubDate>Fri, 28 Mar 2025 00:00:00 JST</pubDate><title>Fighting Rust’s borrow checker</title></item><item><description><![CDATA[<p>Last year, I <a href="/blog/2024-08-20-thoughts-on-newer-system-languages/">explored</a> the possibility of adding a complementary systems programming language to our stack. And I mentioned that I was leaning more into <a href="https://ziglang.org/">Zig</a> as it resonates with me and my biases. Well, I spent a bit of my time last week porting <a href="https://github.com/flowerinthenight/spindle">spindle</a> to <a href="https://www.rust-lang.org/">Rust</a> (called <a href="https://github.com/flowerinthenight/spindle-rs">spindle-rs</a>). This blog is me sharing some of my initial impressions.</p>
<p>The first thing is, I quite like it, which, for me, is a very important criterion. Mind you, as a piece of software, <code>spindle-rs</code> is not really that complicated, so I&rsquo;m barely scratching the surface of what writing serious systems software in Rust is going to be like. Still, here are some impressions from that experience.</p>
<p>Tooling is pretty good. Toolchain installation is handled by <a href="https://github.com/rust-lang/rustup">rustup</a>, which allows easy switching between stable, nightly, and beta. <a href="https://github.com/rust-lang/rust-analyzer">rust-analyzer</a>, Rust&rsquo;s LSP server, works out of the box and behaves the way you&rsquo;d expect most of the time, albeit a tad slowly at times. <a href="https://github.com/rust-lang/rustfmt">rustfmt</a> is good, with decent defaults. And <a href="https://github.com/rust-lang/cargo">cargo</a>, Rust&rsquo;s package manager, is excellent. Coming from C and C++, and having experienced CMake, Meson, Ninja, and vcpkg, I find cargo, with its tight integration with Rust, a breath of fresh air, especially in its ease of use. As far as tooling is concerned, the overall development experience is top notch.</p>
<p>As a language, well, it takes some getting used to. The biggest friction for me comes from &ldquo;fighting the borrow checker&rdquo;, as Rustaceans put it. No surprise there, as it&rsquo;s what really makes Rust, Rust. My mental programming model, which translates pretty well between C, C++, Go, and Zig, doesn&rsquo;t map that well onto Rust&rsquo;s semantics. Not yet, at least. This is not a criticism, of course; programming in Rust just means doing it &ldquo;the Rust way&rdquo;, and that only comes with experience.</p>
<p>Syntax-wise, nothing really stood out to me; it feels similar to Zig. If I don&rsquo;t put much thought into the control flow, though, it&rsquo;s easy to end up with deeply nested code due to how pattern matching combines with <code>Result</code> and <code>Option</code>. Something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">run</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// First, see if already locked (could be us or somebody else).
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">tx</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span><span class="p">)</span>: <span class="p">(</span><span class="n">Sender</span><span class="o">&lt;</span><span class="n">DiffToken</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="n">DiffToken</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Err</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tx_main</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">ProtoCtrl</span>::<span class="n">CheckLock</span><span class="p">(</span><span class="n">tx</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">continue</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">locked</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">false</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">match</span><span class="w"> </span><span class="n">rx</span><span class="p">.</span><span class="n">recv</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">Ok</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="na">&#39;single</span>: <span class="nc">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="c1">// We are leader now.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                        </span><span class="k">if</span><span class="w"> </span><span class="n">token</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">Ordering</span>::<span class="n">Acquire</span><span class="p">)</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">token</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u64</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="k">break</span><span class="w"> </span><span class="nl">&#39;single</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="c1">// We&#39;re not leader now.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                        </span><span class="k">if</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">diff</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="k">break</span><span class="w"> </span><span class="nl">&#39;single</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="k">break</span><span class="w"> </span><span class="nl">&#39;single</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">Err</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="k">continue</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="n">locked</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">continue</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="n">initial</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// Attempt first ever lock. The return commit timestamp will be our fencing
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                </span><span class="c1">// token. Only one node should be able to do this successfully.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                </span><span class="n">initial</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">false</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">tx</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span><span class="p">)</span>: <span class="p">(</span><span class="n">Sender</span><span class="o">&lt;</span><span class="kt">i128</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="kt">i128</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tx_main</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">ProtoCtrl</span>::<span class="n">InitialLock</span><span class="p">(</span><span class="n">tx</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">t</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rx</span><span class="p">.</span><span class="n">recv</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="k">if</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="o">-</span><span class="mi">1</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="n">token</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="n">t</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u64</span><span class="p">,</span><span class="w"> </span><span class="n">Ordering</span>::<span class="n">Relaxed</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// For the succeeding lock attempts.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">tx</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span><span class="p">)</span>: <span class="p">(</span><span class="n">Sender</span><span class="o">&lt;</span><span class="n">Record</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="n">Record</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tx_main</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">ProtoCtrl</span>::<span class="n">CurrentToken</span><span class="p">(</span><span class="n">tx</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rx</span><span class="p">.</span><span class="n">recv</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="k">if</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">token</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="k">continue</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="c1">// Attempt to grab the next lock. Multiple nodes could potentially
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                        </span><span class="c1">// do this successfully.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">update</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">false</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">token_up</span>: <span class="kt">i128</span> <span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="fm">write!</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="s">&#34;{}_{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">lock_name</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">token</span><span class="p">).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">tx</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span><span class="p">)</span>: <span class="p">(</span><span class="n">Sender</span><span class="o">&lt;</span><span class="kt">i128</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="kt">i128</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tx_main</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">ProtoCtrl</span>::<span class="n">NextLockInsert</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">tx</span><span class="w"> </span><span class="p">})</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">t</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rx</span><span class="p">.</span><span class="n">recv</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span><span class="k">if</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                    </span><span class="n">update</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">true</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                    </span><span class="n">token_up</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="k">if</span><span class="w"> </span><span class="n">update</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="c1">// We got the lock. Attempt to update the current token
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                            </span><span class="c1">// to this commit timestamp.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                            </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">tx</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span><span class="p">)</span>: <span class="p">(</span><span class="n">Sender</span><span class="o">&lt;</span><span class="kt">i128</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="kt">i128</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tx_main</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">ProtoCtrl</span>::<span class="n">NextLockUpdate</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">token</span>: <span class="nc">token_up</span><span class="p">,</span><span class="w"> </span><span class="n">tx</span><span class="w"> </span><span class="p">})</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">t</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rx</span><span class="p">.</span><span class="n">recv</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                    </span><span class="k">if</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                        </span><span class="c1">// Doesn&#39;t mean we&#39;re leader, yet.
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">                                        </span><span class="n">token</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="n">token_up</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">u64</span><span class="p">,</span><span class="w"> </span><span class="n">Ordering</span>::<span class="n">Relaxed</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In Zig, it&rsquo;s easier to tame very deep nesting thanks to its support for labeled blocks; most blocks can be labeled, and you can do early returns with <code>break :label;</code>. Rust also supports labels but, as far as I&rsquo;m aware, only on loops and on blocks used as expressions (e.g. in <code>let</code> bindings).</p>
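<p>Since Rust 1.65, a plain block can also be labeled and exited early with <code>break 'label value</code>, which flattens nesting in much the same way. A minimal sketch (the flag bits and names here are made up for illustration, not taken from <code>spindle-rs</code>):</p>

```rust
fn classify(flags: u64) -> &'static str {
    // Labeled block (Rust 1.65+): `break 'done value` exits the block
    // early with a value, avoiding nested if/else pyramids.
    'done: {
        if flags & 0x1 == 0 {
            break 'done "disabled"; // early return from the block
        }
        if flags & 0x2 != 0 {
            break 'done "priority";
        }
        "normal"
    }
}

fn main() {
    println!("{}", classify(0x0)); // disabled
    println!("{}", classify(0x3)); // priority
}
```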
<p>For example, in Zig, you can do something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-zig" data-lang="zig"><span class="line"><span class="cl"><span class="n">label</span><span class="o">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">const</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">msg</span><span class="p">.</span><span class="n">proto2</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="mh">0x8000000000000000</span><span class="p">)</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">63</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">on</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w"> </span><span class="k">break</span><span class="w"> </span><span class="o">:</span><span class="n">label</span><span class="p">;</span><span class="w"> </span><span class="c1">// early return
</span></span></span><span class="line"><span class="cl"><span class="c1"></span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">const</span><span class="w"> </span><span class="n">lmin</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">((</span><span class="n">msg</span><span class="p">.</span><span class="n">proto2</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="mh">0x7FFFFFFF80000000</span><span class="p">)</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">31</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">1000</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">const</span><span class="w"> </span><span class="n">lmax</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">((</span><span class="n">msg</span><span class="p">.</span><span class="n">proto2</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="mh">0x700000007FFFFFFF</span><span class="p">))</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">1000</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">@atomicStore</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kt">u64</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&amp;</span><span class="n">self</span><span class="p">.</span><span class="n">elex_tm_min</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">lmin</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span><span class="p">.</span><span class="n">builtin</span><span class="p">.</span><span class="n">AtomicOrder</span><span class="p">.</span><span class="n">seq_cst</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And then, there&rsquo;s async Rust. The only experience I have with async programming (with <code>async</code>/<code>await</code> and its viral nature) was with C#, and quite frankly, I didn&rsquo;t really warm up to the idea. I understand the need for a fast, runtime-assisted, concurrent programming model, but coming from Go, async Rust feels like a totally different language. At least in Go, code for concurrent programming using goroutines and channels still looks and feels like normal Go code. Mixing concurrent with synchronous Go code doesn&rsquo;t feel &ldquo;weird&rdquo;. Not so with async Rust. When writing <code>spindle-rs</code>, I had to use the <a href="https://crates.io/crates/google-cloud-spanner">google-cloud-spanner</a> crate, which is async only. Since I didn&rsquo;t want to write <code>spindle-rs</code> in async, I had to &ldquo;bridge&rdquo; the async crate with my sync code. Mixing async with sync Rust code feels &ldquo;weird&rdquo;. At the moment, my understanding of how <a href="https://tokio.rs/">Tokio</a>&rsquo;s runtime works (Tokio seems to be &ldquo;the runtime&rdquo; to use when doing async Rust) is not deep enough for me to know whether it&rsquo;s even a good idea to mix the two. That said, this is, again, not a criticism of async Rust. I can even see myself using it when doing embedded programming.</p>
<p>The only real complaint I have right now is the slow compile times. I know this is actively being worked on, so I won&rsquo;t hold it against Rust too much for now. For all the criticism hurled at Go, it got compilation times right. Even Zig compiles much faster despite being the younger language. But I&rsquo;m sure Rust will get better in time.</p>
<p>So, going back to my choices: personally, I enjoy writing Zig more, so I will continue using it for personal projects, but I can&rsquo;t, in good conscience, push it to the company; not until it reaches 1.0 with a relatively mature ecosystem of libraries. Then again, we&rsquo;ll probably have established our Rust usage by that time anyway, so it wouldn&rsquo;t make sense to &ldquo;replace&rdquo; it once Zig becomes viable. So there you go. Rust it is. And I believe it&rsquo;s objectively the better choice.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-03-18:/blog/2025-03-18-on-rust/</guid><link>https://flowerinthenight.com/blog/2025-03-18-on-rust/</link><pubDate>Tue, 18 Mar 2025 00:00:00 JST</pubDate><title>Initial impressions on Rust</title></item><item><description><![CDATA[<p>Let me start with a disclaimer. These thoughts come from a startup perspective where funds and personnel are a bit scarce. There are a lot of generalizations and assumptions made here as well; take them with a pinch of salt.</p>
<p>In just a short span of time, the speed of improvement to LLMs, especially the mainstream ones, is nothing short of astonishing. The massive ones are getting smarter, faster, and more accurate. And even better, the open ones, such as Llama, DeepSeek, Gemma, Qwen, etc., are also catching up, which is a good thing, as I&rsquo;m more interested in them. And for enterprises looking into integrating LLMs into their internal workflows, or even products, there are so many options available now that it&rsquo;s quite confusing where to even start. I hope this blog will shed some light on some of that confusion.</p>
<p>One of the first things to consider is whether to go AIaaS (AI as a service, or external) or deploy internally. While I don&rsquo;t really have a problem with using Gemini, ChatGPT, or Sonnet for explorations, I think when internal data or knowledge-bases are involved, hybrid deployment is the way to go. Hybrid in this sense means hosting LLMs internally for more critical information (for obvious reasons), and using external LLMs for the rest. Ultimately, the decision of what to use, whether external, hybrid, or internal, will depend on your company&rsquo;s data privacy policies, governance, and regulations. I won&rsquo;t be digging further into going external as it&rsquo;s more about tracking and controlling what information is being included in the prompts than about deployment, so we treat it as we do any other API-based vendor integration. For internal deployment, on the other hand, we generally have two options: a) fine-tuning a closed LLM, and b) using open models.</p>
<p>Fine-tuning a closed LLM really depends on whether that feature is provided by the vendor. For example, OpenAI&rsquo;s ChatGPT and Google&rsquo;s Gemini can be fine-tuned with your own data, and you can use the fine-tuned version as your new LLM; the vendor will still host that LLM for you, and you use it as you would any other LLM: through its API. Now, you might argue that it&rsquo;s technically not an internal deployment, and you might be right. I only categorized it as such since in doing so, there&rsquo;s almost always a provision from the vendor effectively promising not to use your fine-tuning data as training data for their next LLM versions, so there&rsquo;s an assumption of privacy there; whether you trust them or not is up to you.</p>
<p>The second option, using open models, is really what you want. The idea is that, for enterprise use, you don&rsquo;t really need, say, ChatGPT&rsquo;s vast, generic knowledge of the world. You want an LLM that understands your hiring and onboarding policies instead of knowing the historical weather data of the Atlantic for the past decade; you want its inferencing (or &ldquo;thinking&rdquo;) capabilities applied to your internal data instead. In this route, you can either fine-tune an open model with your data, all hosted internally, or use an open model as is, and use <a href="https://aws.amazon.com/what-is/retrieval-augmented-generation/">RAG</a> to augment, and <a href="https://techcommunity.microsoft.com/blog/azuredevcommunityblog/why-and-how-to-ground-a-large-language-models-using-your-data-rag/4152064">ground</a> it with your internal data, also hosted internally.</p>
<p>So, RAG? Or fine-tuning? I think doing both is the way to go. The general rule of thumb (advice I got from the Gemma Japan team) is fine-tune for static (or near-static) data, and RAG for more dynamic, always-changing data.</p>
<p>Considerations for fine-tuning are expertise and costs. To fine-tune an LLM, you need quality training/fine-tuning data. And since enterprise data are usually all over the place, and often not centralized, data preparation is actually one of the biggest hurdles. You probably need a team of data engineers, data scientists, and infra/ops personnel to pull this off. And the costs will involve the upfront costs of the actual fine-tuning (you need compute, both CPU and GPU/TPU), and the ongoing inferencing (or serving) costs, which will involve compute (CPU/GPU/TPU), storage, and networking.</p>
<p>Considerations for RAG are also expertise and costs. Setting up RAG-based workflows involves some important components: LLM routers, embeddings generators, and vector databases. You will most definitely be using multiple LLMs for multiple purposes: one for text summarization or generation, one for reasoning, another for research or coding, and so on. Each LLM will be deployed separately; it could be on a big VM, or a cluster of VMs. And since these clusters need GPUs, you&rsquo;d probably want a serverless, scale-to-zero environment, as GPUs will bear most of the costs in this layer. You might be able to get away with traditional auto-scaling clusters (with additional checks in place to scale down during idle time) but environments like Kubernetes with, say, <a href="https://keda.sh/">Keda</a> for scale-to-zero, or <a href="https://fly.io/">Fly</a>&rsquo;s GPUs (which I believe can scale to zero), or GCP&rsquo;s <a href="https://cloud.google.com/blog/products/application-development/run-your-ai-inference-applications-on-cloud-run-with-nvidia-gpus">Cloud Run</a>, etc., will be easier to manage. So with multiple LLM deployments, you&rsquo;d also want a router that will route input requests, or prompts, to the appropriate LLM target. You can do this traditionally, utilizing an LLM to facilitate the routing, or you could also do <a href="https://developer.nvidia.com/blog/applying-mixture-of-experts-in-llm-architectures/">MoE (Mixture of Experts)</a> deployments, where the routing is done within the LLM itself. One issue, however, is that there aren&rsquo;t a lot of open MoE models yet; there are IBM&rsquo;s Granite, Mistral-MoE, and Qwen-MoE models (I&rsquo;m monitoring this space closely as well; I think there will be more improvements here in the near future).</p>
<p>Embeddings generators and vector databases are specific to RAG. To &ldquo;feed&rdquo; your enterprise data to an LLM (as opposed to fine-tuning), you need to generate &ldquo;embeddings&rdquo; for them first. Embeddings are vector representations (with semantic context) of your data. These embeddings will then be stored in a vector database for later use. The more data you have, or at least the more data you want an LLM to have access to, the more embeddings you will generate. How many you end up with will depend on the size of the context windows of the LLMs you choose. An example would be: 1 page of a document is 1 embedding; or, if an LLM&rsquo;s context window is smaller (looking at you, Gemma2), you could &ldquo;chunk&rdquo; your data into smaller pieces of defined length, where 1 chunk will be 1 embedding, and so on. Options for generators are plentiful; you can use the mainstream providers&rsquo; vector embedding APIs, such as OpenAI&rsquo;s Vector Embeddings API, GCP&rsquo;s Embeddings API, AWS&rsquo; Titan Text Embeddings, etc., or use open models, such as Word2Vec, LexVec, BERT, Chroma, etc., although you still need to host them internally. Choices for vector databases are also the same; you can use vendor-provided ones, or self-host open source ones. Note that these are still databases, so when self-hosting, you still need to tackle the headaches of deploying databases, even vendor-provided ones. Once these are in place, you do RAG by converting the input query, or prompt, to its embedding equivalent, doing a semantic/similarity/distance search in your vector database, mapping the resulting embeddings to the real data you have, and then using those data as context to (or part of) your final prompt to the target LLM. So, still an ops-heavy deployment, as you can see, notwithstanding the costs, both upfront and operational, that you will incur in deploying these components.</p>
<p>I will write something about cost estimations or simulations regarding these deployments, and some ideas on actual deployments as well, but that will be on a separate blog.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-02-25:/blog/2025-02-25-considerations-llm-integration/</guid><link>https://flowerinthenight.com/blog/2025-02-25-considerations-llm-integration/</link><pubDate>Tue, 25 Feb 2025 00:00:00 JST</pubDate><title>Considerations on deploying LLM-based workflows</title></item><item><description><![CDATA[<p>The ability to produce statically-linked binaries by default in Go is one of the many good reasons why I appreciate and use it. Being able to copy or move only a single file around across all sorts of locations without worrying too much about missing libraries or runtimes has definitely saved me a lot of annoyances many times over. However, working on <a href="https://github.com/flowerinthenight/hedge-cb">hedge-cb</a>, and therefore, CGO, for the past few days, reminded me once again how much of a good thing static binaries are. So I tried checking out whether I could still do static linking with CGO.</p>
<p>As I mentioned, even though Go produces static binaries by default, introducing CGO into the mix makes your binary dynamically-linked instead, as it will now require a C runtime, usually <code>glibc</code>, to be present on the target environment. That in itself is not really such a bad thing as <code>glibc</code> is almost always present in most Linux distributions used in production. The annoyance, however, starts when you introduce a dependency through CGO, in which case, you will need to make sure that that dependency&rsquo;s <code>lib***.so</code> is also present in your target environment, alongside the C runtime.</p>
<p>For a binary that depends on <a href="https://github.com/flowerinthenight/hedge-cb">hedge-cb</a>, and therefore, <a href="https://github.com/aws/clock-bound">ClockBound</a>, deploying it to, say, EC2 (it&rsquo;s AWS-native after all), requires the following to be present in the VM:</p>
<ul>
<li>the <a href="https://github.com/aws/clock-bound/tree/main/clock-bound-d">ClockBound daemon</a> running - our source for &ldquo;true time&rdquo;;</li>
<li><code>libclockbound.so</code> copied to <code>/usr/lib/</code> or <code>/usr/lib64/</code> - output when compiling <a href="https://github.com/aws/clock-bound/tree/main/clock-bound-ffi">ClockBound FFI</a>;</li>
<li>a C runtime - Amazon Linux comes with <code>glibc</code> preinstalled.</li>
</ul>
<p>There&rsquo;s no getting around the ClockBound daemon as it&rsquo;s functionally required. But I can do away with <code>libclockbound.so</code> and <code>glibc</code> altogether if the binary is statically built. I will share what I did in this blog, although this is not the only way to do it. Now, there&rsquo;s a fair bit of discussion on the interwebs about the pros and cons of static binaries when it comes to the C runtime; I&rsquo;m not going to rehash them here. For example purposes only, I went with <a href="https://musl.libc.org/">musl</a>, a lightweight alternative to <code>glibc</code>, and the <a href="https://ziglang.org/">Zig</a> toolchain. Zig here is probably optional as <code>musl</code> comes with <code>musl-gcc</code>, which can serve as my CGO compiler, but I was struggling to make it work. Zig (<code>zig cc</code> to be specific), on the other hand, was a breeze.</p>
<p>First, let&rsquo;s build ClockBound&rsquo;s FFI. A Rust environment is required here.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ <span class="nb">cd</span> /tmp/ <span class="o">&amp;&amp;</span> git clone https://github.com/aws/clock-bound
</span></span><span class="line"><span class="cl">$ <span class="nb">cd</span> clock-bound/clock-bound-ffi/
</span></span><span class="line"><span class="cl">$ rustup target add x86_64-unknown-linux-musl
</span></span><span class="line"><span class="cl">$ cargo build --release --target<span class="o">=</span>x86_64-unknown-linux-musl
</span></span></code></pre></div><p>The build outputs of note here are <code>libclockbound.so</code>, which we don&rsquo;t need, and <code>libclockbound.a</code>, which we do. Now, let&rsquo;s build <code>musl</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ <span class="nb">cd</span> /tmp/ <span class="o">&amp;&amp;</span> wget https://musl.libc.org/releases/musl-1.2.5.tar.gz
</span></span><span class="line"><span class="cl">$ tar xvzf musl-1.2.5.tar.gz <span class="o">&amp;&amp;</span> <span class="nb">cd</span> musl-1.2.5/ <span class="o">&amp;&amp;</span> ./configure <span class="o">&amp;&amp;</span> make <span class="o">&amp;&amp;</span> sudo make install
</span></span></code></pre></div><p>Next, copy the FFI archive library to <code>musl</code>&rsquo;s install location.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ sudo cp -v /tmp/clock-bound/target/x86_64-unknown-linux-musl/release/libclockbound.a /usr/local/musl/lib/
</span></span></code></pre></div><p>Finally, build the binary using <code>zig cc</code> as our CGO compiler. Here, I&rsquo;m building <code>hedge-cb</code>&rsquo;s sample code.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ <span class="nb">cd</span> /tmp/ <span class="o">&amp;&amp;</span> git clone https://github.com/flowerinthenight/hedge-cb <span class="o">&amp;&amp;</span> <span class="nb">cd</span> hedge-cb/example/
</span></span><span class="line"><span class="cl">$ cp /tmp/clock-bound/clock-bound-ffi/include/clockbound.h .
</span></span><span class="line"><span class="cl">$ <span class="nv">CC</span><span class="o">=</span><span class="s2">&#34;zig cc -target x86_64-linux-musl -I. -L/usr/local/musl/lib -lunwind&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>  <span class="nv">GOOS</span><span class="o">=</span>linux <span class="nv">GOARCH</span><span class="o">=</span>amd64 go build -v --ldflags <span class="s1">&#39;-linkmode=external -extldflags=-static&#39;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Output should be a static binary.</span>
</span></span><span class="line"><span class="cl">$ ldd ./example
</span></span><span class="line"><span class="cl">      not a dynamic executable
</span></span></code></pre></div><p>So, yes, we can build static binaries even with CGO; but now, I&rsquo;m stuck with <code>musl</code>. I&rsquo;m sure <code>musl</code> is a fine piece of software, but I&rsquo;m not really that familiar with it. I&rsquo;m aware we use it at work since many of our containers in production use Alpine Linux as base. But would I exchange <code>glibc</code> for <code>musl</code> just for static binaries? I&rsquo;m not sure yet. Probably.</p>
<p>As always, tradeoffs.</p>
<br>
<p>Related blogs:</p>
<ol>
<li><a href="/blog/2025-01-22-clockbound-client-go/">AWS ClockBound client for Go</a></li>
<li><a href="/blog/2025-01-27-clockbound-client-go-update/">AWS ClockBound client for Go (update)</a></li>
<li><a href="/blog/2025-02-02-aws-dist-locking/">Distributed locking on AWS (ClockBound)</a></li>
<li><a href="/blog/2025-02-07-aws-cluster-membership/">Cluster membership management on AWS</a></li>
<li>This blog</li>
</ol>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-02-15:/blog/2025-02-15-cgo-static-linked-bin-musl/</guid><link>https://flowerinthenight.com/blog/2025-02-15-cgo-static-linked-bin-musl/</link><pubDate>Sat, 15 Feb 2025 00:00:00 JST</pubDate><title>Static-linked CGO binaries using musl and Zig</title></item><item><description><![CDATA[<p>Continuing from my previous <a href="/blog/2025-02-02-aws-dist-locking/">post</a>, I have now finished porting <a href="https://github.com/flowerinthenight/hedge">hedge</a> to AWS. It&rsquo;s a trimmed-down version for now; only the features directly related to cluster membership are ported. I decided to make a separate repo, called <a href="https://github.com/flowerinthenight/hedge-cb">hedge-cb</a> (in keeping with the <code>-cb</code> theme), instead of updating hedge directly. And it&rsquo;s mainly due to CGO. I didn&rsquo;t really fancy the idea of introducing CGO to hedge as it could break a lot of the CI builds at work. I had to extract the shared protobuf definitions to a separate <a href="https://github.com/flowerinthenight/hedge-proto">repo</a>, however, which is a breaking change to hedge. But at least it&rsquo;s only a version change (v2) as opposed to adding CGO.</p>
<p>What remains now in the short term is figuring out how to make this thing work on EKS. Since the ClockBound daemon is a requirement, EKS pods need to be able to read its shared memory segment backing file which is <code>/var/run/clockbound/shm</code>. I have to mount that file (or maybe the whole directory) to the pod. I&rsquo;m not sure how it will behave though; it&rsquo;s not just some regular file.</p>
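<p>My current thinking for the EKS side (untested; all names below are placeholders) is a plain <code>hostPath</code> mount so the pod can read the daemon&rsquo;s backing file:</p>

```yaml
# Sketch only: mount the ClockBound shared-memory directory (read-only)
# into a pod. Image and names are placeholders; the ClockBound daemon is
# assumed to be running on the node and writing /var/run/clockbound/shm.
apiVersion: v1
kind: Pod
metadata:
  name: hedge-cb-example
spec:
  containers:
    - name: app
      image: example/app:latest   # placeholder image
      volumeMounts:
        - name: clockbound-shm
          mountPath: /var/run/clockbound
          readOnly: true
  volumes:
    - name: clockbound-shm
      hostPath:
        path: /var/run/clockbound
        type: Directory
```

Whether reading the segment through a <code>hostPath</code> mount behaves the same as on the host is exactly the open question above.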
<p>Anyway, as far as testing goes, I&rsquo;ve tried it with both single- and multi-zone AutoScaling Groups and it works as expected. Time will tell how stable it&rsquo;s going to be once deployed to some production environment with larger cluster sizes.</p>
<p>Feature-wise, my initial intention was to get the cluster membership management part working first. That includes the ability to track member nodes dynamically, member-listing, election of a leader node, the ability for any node to send both streaming and one-shot message(s) to the leader node, and broadcast-to-all mechanisms, both one-shot and streaming. And these are all working now. At the moment, I don&rsquo;t have any plans to support the remaining features like distributed semaphores, and storage (both K/V and spill-over). I can use other libraries for those should the need arise.</p>
<p>Finally, I&rsquo;m thinking of porting hedge to both Rust and Zig as well. Looking forward to that.</p>
<br>
<p>Related blogs:</p>
<ol>
<li><a href="/blog/2025-01-22-clockbound-client-go/">AWS ClockBound client for Go</a></li>
<li><a href="/blog/2025-01-27-clockbound-client-go-update/">AWS ClockBound client for Go (update)</a></li>
<li><a href="/blog/2025-02-02-aws-dist-locking/">Distributed locking on AWS (ClockBound)</a></li>
<li>This blog</li>
<li><a href="/blog/2025-02-15-cgo-static-linked-bin-musl/">Static-linked CGO binaries using musl and Zig</a></li>
</ol>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-02-07:/blog/2025-02-07-aws-cluster-membership/</guid><link>https://flowerinthenight.com/blog/2025-02-07-aws-cluster-membership/</link><pubDate>Fri, 07 Feb 2025 00:00:00 JST</pubDate><title>Cluster membership management on AWS</title></item><item><description><![CDATA[<p>After some testing time, I now have a working port of <a href="https://github.com/flowerinthenight/spindle">spindle</a> in AWS. There were some slight changes from the original library to account for some of the differences between Cloud Spanner and PostgreSQL, but not really by much. It&rsquo;s called <a href="https://github.com/flowerinthenight/spindle-cb">spindle-cb</a>, if you&rsquo;re interested. It&rsquo;s still half the battle though; I still have to port <a href="https://github.com/flowerinthenight/hedge">hedge</a> as well before I could really use it for some of the planned projects in my pipeline.</p>
<p>As far as lock services are concerned, spindle-cb (and spindle for that matter) is what you call a coarse-grained lock (as opposed to fine-grained). Most of my use cases of distributed locks are really around application-level orchestrations across multiple nodes. That means using a single lock (or two) within the duration of the application, instead of multiple locks for multiple objects/data, although that&rsquo;s not a limitation of the library by any means. With that said, I don&rsquo;t usually use spindle[-cb] directly, but through a cluster-aware wrapper, like, say, hedge.</p>
<p>At this point, I can&rsquo;t really vouch for spindle-cb&rsquo;s reliability, at least not yet. But since it&rsquo;s really just spindle with a different &ldquo;true time&rdquo; source, I have high hopes. And having used spindle in production for many years without any issues, I&rsquo;m optimistic.</p>
<br>
<p>Related blogs:</p>
<ol>
<li><a href="/blog/2025-01-22-clockbound-client-go/">AWS ClockBound client for Go</a></li>
<li><a href="/blog/2025-01-27-clockbound-client-go-update/">AWS ClockBound client for Go (update)</a></li>
<li>This blog post</li>
<li><a href="/blog/2025-02-07-aws-cluster-membership/">Cluster membership management on AWS</a></li>
<li><a href="/blog/2025-02-15-cgo-static-linked-bin-musl/">Static-linked CGO binaries using musl and Zig</a></li>
</ol>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-02-02:/blog/2025-02-02-aws-dist-locking/</guid><link>https://flowerinthenight.com/blog/2025-02-02-aws-dist-locking/</link><pubDate>Sun, 02 Feb 2025 00:00:00 JST</pubDate><title>Distributed locking on AWS (ClockBound)</title></item><item><description><![CDATA[<p>A week ago I published a short <a href="/blog/2025-01-22-clockbound-client-go/">blog</a> about <a href="https://github.com/flowerinthenight/clockbound-client-go">clockbound-client-go</a>. After some testing, turns out there&rsquo;s an issue in reading the actual time from ClockBound&rsquo;s shared memory segment; it seems to provide only the elapsed time since boot. However, using the Rust client and the FFI bindings produce the correct results. Either there is a problem in the code that reads the shared memory segment, or the SHM contents are wrong. Or, the contents are actually correct, but the expectation is for the implementing client to figure out the bounded time from the available values.</p>
<p>Either way, this is unusable for me at the moment. Instead of spending time debugging this, I wrote another client, called <a href="https://github.com/flowerinthenight/clockbound-ffi-go">clockbound-ffi-go</a>, utilizing the provided <a href="https://github.com/aws/clock-bound/tree/main/clock-bound-ffi">FFI</a> (meaning, requiring CGO). So far, it works well, as expected. I would have preferred not to use CGO but I need a working version as soon as possible. I&rsquo;ll come back to this when I have the time.</p>
<br>
<p>Related blogs:</p>
<ol>
<li><a href="/blog/2025-01-22-clockbound-client-go/">AWS ClockBound client for Go</a></li>
<li>This blog post</li>
<li><a href="/blog/2025-02-02-aws-dist-locking/">Distributed locking on AWS (ClockBound)</a></li>
<li><a href="/blog/2025-02-07-aws-cluster-membership/">Cluster membership management on AWS</a></li>
<li><a href="/blog/2025-02-15-cgo-static-linked-bin-musl/">Static-linked CGO binaries using musl and Zig</a></li>
</ol>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-01-27:/blog/2025-01-27-clockbound-client-go-update/</guid><link>https://flowerinthenight.com/blog/2025-01-27-clockbound-client-go-update/</link><pubDate>Mon, 27 Jan 2025 00:00:00 JST</pubDate><title>AWS ClockBound client for Go (update)</title></item><item><description><![CDATA[<p>I&rsquo;ve written a Go client for <a href="https://github.com/aws/clock-bound">AWS ClockBound</a> called <a href="https://github.com/flowerinthenight/clockbound-client-go">clockbound-client-go</a>. It uses the newer, shared memory segment protocol instead of the older, socket-based protocol. This is a prerequisite library needed to port <a href="https://github.com/flowerinthenight/spindle">spindle</a> (and maybe even <a href="https://github.com/flowerinthenight/hedge">hedge</a>) to AWS (for an upcoming project).</p>
<p>As great a tech as Google&rsquo;s <a href="https://cloud.google.com/spanner/docs/true-time-external-consistency">TrueTime</a> is, there is no available API for it. It is only through Spanner that spindle achieves its locking mechanisms with TrueTime. Surprisingly, it&rsquo;s also the cheapest way to do distributed locking that I&rsquo;ve tried so far, compared to the likes of Redis, Zookeeper, etcd, Consul, etc. Yes, I could do VPC peering between GCP and AWS but it is quite costly at the moment.</p>
<p>I&rsquo;ve heard of AWS&rsquo; <a href="https://aws.amazon.com/blogs/compute/its-about-time-microsecond-accurate-clocks-on-amazon-ec2-instances/">TimeSync Service</a> before but only in passing. Now that they&rsquo;ve also released <a href="https://aws.amazon.com/blogs/database/introducing-amazon-aurora-dsql/">DSQL</a>, and having seen the papers and blogs about it, turns out that TimeSync is their equivalent to TrueTime, and that there is an API for it!</p>
<br>
<p>Related blogs:</p>
<ol>
<li>This blog</li>
<li><a href="/blog/2025-01-27-clockbound-client-go-update/">AWS ClockBound client for Go (update)</a></li>
<li><a href="/blog/2025-02-02-aws-dist-locking/">Distributed locking on AWS (ClockBound)</a></li>
<li><a href="/blog/2025-02-07-aws-cluster-membership/">Cluster membership management on AWS</a></li>
<li><a href="/blog/2025-02-15-cgo-static-linked-bin-musl/">Static-linked CGO binaries using musl and Zig</a></li>
</ol>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2025-01-22:/blog/2025-01-22-clockbound-client-go/</guid><link>https://flowerinthenight.com/blog/2025-01-22-clockbound-client-go/</link><pubDate>Wed, 22 Jan 2025 00:00:00 JST</pubDate><title>AWS ClockBound client for Go</title></item><item><description><![CDATA[
<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20241228-know-your-target.png"    >
</div>
</div>


</figure>
</div>
</div>

<br>
<p>I remember ages ago, a friend asked me what my tips would be to transition from a typical programmer to a &ldquo;systems programmer&rdquo;. And I distinctly remember answering him along the lines of &ldquo;Y&rsquo;know what, I don&rsquo;t really know, maybe just get more experience?&rdquo; And now, being an engineering leader to some extent, I get this question once in a while, but even now I still don&rsquo;t have a good default answer to give. Writing this blog is, in part, me trying to remedy that.</p>
<p>Now, a bit of context, as this topic, I think, is a little vague, and you can tackle it from different perspectives. First, what is even a &ldquo;typical programmer&rdquo;? And what is a &ldquo;systems programmer&rdquo; anyway? I will try to narrow it down to my friend&rsquo;s perspective. He&rsquo;s a web developer, one you would consider &ldquo;full-stack&rdquo;. He likes his programming languages, frameworks, design patterns, and we talk a lot about these things, which are all very interesting. We can pin the &ldquo;typical programmer&rdquo; definition to that. Does it make sense if we categorize junior, associate, mid-level, and senior programmers into this group as well? Maybe. But I also have peers who have been seniors for many, many years, and still enjoy what they do (e.g. maintaining legacy systems), and have no intention or interest in going beyond that; it can be argued that they are also experts in their specific domains.</p>
<p>I also don&rsquo;t think my friend meant those developers who specifically do close-to-hardware systems, like embedded, drivers, OS, robotics, automotive, etc.; I think he refers to the IC path, such as [Senior] Staff+, [Senior] Principal Engineers, etc.</p>
<p>We can also phrase it controversially, like, &ldquo;How do I transition from a [insert language here|backend|frontend] programmer to an [expert generalist|staff+|principal] level programmer?&rdquo;. There&rsquo;s a lot of nuances to unpack in that line but let&rsquo;s just try to leave it like that and interpret it as you would so we can move on. I will also limit my discussions to the technical side of it; the social/political/soft-skills side of it is another topic I intend to explore in a separate blog post.</p>
<p>Aside from checking the literature regarding the topic, I also tried asking colleagues/peers what they think of this question. In no particular order, some interesting replies I got were:</p>
<ul>
<li>Read a lot of papers - I suppose it makes sense to some extent?</li>
<li>Get a master&rsquo;s or PhD - I&rsquo;ll leave this one for you to interpret. There&rsquo;s a spectrum here but I really admire PhD people with really deep levels of expertise in specific sciences as well as software in general.</li>
<li>Job hop (specifically between startups and big tech) - You could also do that, I suppose?</li>
<li>Contribute to OSS - not sure if this works; depends on the OSS in question I suppose?</li>
<li>Join an early-stage startup - if you&rsquo;re [un]lucky.</li>
</ul>
<p>Now, to the topic: my one-line answer would be the title of this blog: <strong>know your target</strong>. Let me explain.</p>
<p><strong>Know your target</strong></p>
<p>When I say &ldquo;target&rdquo;, I don&rsquo;t mean customers or users of your software, but your production target platform: mobile, browsers (if doing web development), architectures such as Intel&rsquo;s x86 (x64), RISC architectures such as ARM, MIPS, and PowerPC, the GPU, cloud and its internals, the network your software is running on, storage systems, data centers, mainframes, HPC systems, IoT, etc. With that said though, I don&rsquo;t really expect us to &ldquo;know&rdquo; all of these targets; we can go a step higher. Most of these targets are abstracted by the OS/software running on top of them: e.g. Linux, Unix, Windows, OSX, Android, IOS, browsers, the JVM, hypervisors, etc. Knowing these target OS/software is much more important than the language you use to program them. Actually, the language being used will be of lesser consequence to the system artifact itself compared to the target platform it will be running on. Therefore, I can further specify &ldquo;know your target&rdquo; to &ldquo;know your target OS&rdquo;.</p>
<p>You might say, &ldquo;Of course! Isn&rsquo;t that obvious/expected?&rdquo; You might be surprised how many programmers have no idea, or don&rsquo;t care, about the target OS their software will be running on. The facilities provided by the language (and the libraries/frameworks around the language ecosystem) are more than enough for them. So when you start talking about processes, threads, concurrency and synchronization at the OS level, or things like IPCs, memory models, I/O, the OS&rsquo; network stack, filesystems, virtualization, user/kernel mode, protection rings, context switching, etc., you&rsquo;ll get the impression that for them, these things are what solution architects and tech leads are there for.</p>
<p>Take Linux for example. It&rsquo;s only one of the &ldquo;targets&rdquo; in our list but Linux is a complex beast in and of itself. A programmer can probably reasonably pick a language up in weeks/months (maybe years for C++, Rust) but he/she might not really &ldquo;know&rdquo; Linux even until retirement. There are so many subsystems in Linux that learning all of them is probably an impossibility. One upside though is that your knowledge of Linux will more or less translate to other OSes without the need to start from scratch.</p>
<p>Understanding your target allows you to leverage what it can and can&rsquo;t do. Its capabilities will have a big influence on how you design your system. Of course, the language matters to some extent but you will be better off approaching it in terms of its capabilities and limitations within the confines of your target.</p>
<p>Your knowledge of targets will also help you better understand how complex systems are being designed: even distributed systems, or systems you&rsquo;re probably using now.</p>
<p>And lastly, debugging. Not language syntax debugging, but those pesky, difficult-to-reproduce, SLA-violating production bugs. I don&rsquo;t think I need to expound about this more.</p>
<p>So, interested in becoming a systems programmer? First, know your target. Your language of choice is just one of your tools to program your target.</p>
<p><strong>Closing</strong></p>
<p>I acknowledge that there are many opinions out there that don&rsquo;t necessarily align with this advice. For instance, a quick check of publicly available literature yields topics about architectural patterns, algorithmic thinking, FIRST and SOLID principles, systems thinking, formal verification, thinking in terms of tradeoffs (which I agree with, by the way), thinking in terms of costs, etc. These are all well and good, but I think that being deliberate about target thinking takes you a long way toward becoming a systems programmer.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-12-28:/blog/2024-12-28-know-your-target/</guid><link>https://flowerinthenight.com/blog/2024-12-28-know-your-target/</link><pubDate>Sat, 28 Dec 2024 00:00:00 JST</pubDate><title>Know your target</title></item><item><description><![CDATA[
<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20241018-source-code-value.png"    >
</div>
</div>


</figure>
</div>
</div>

<br>
<p>I had a think about this topic when I saw a recent Ars Technica <a href="https://arstechnica.com/gadgets/2024/10/winamp-really-whips-open-source-coders-into-frenzy-with-its-source-release/">news article</a> about Winamp releasing their source code on GitHub and then deleting it after some backlash from the open source community. We also hear about instances of leaked source code, such as that of <a href="https://www.securityweek.com/new-york-times-responds-to-source-code-leak/">The New York Times</a>, <a href="https://thehackernews.com/2022/12/hackers-breach-oktas-github.html">Okta</a>, <a href="https://www.securityweek.com/intel-confirms-uefi-source-code-leak-security-experts-raise-concerns/">Intel&rsquo;s UEFI source</a>, <a href="https://www.securityweek.com/leaked-github-token-exposed-mercedes-source-code/">Mercedes Benz</a>, and <a href="https://www.bleepingcomputer.com/news/microsoft/lapsus-hackers-leak-37gb-of-microsofts-alleged-source-code/">Microsoft</a>, among others. Until now, I hadn&rsquo;t really put too much thought into this. I mean, from a business perspective, especially for software companies who earn money through code, how valuable do we think our source code really is?</p>
<p>Before I dig deeper though, I think the value lies on a spectrum, and there are valid arguments on both sides. Forget the threat actors&rsquo; goal of obtaining sensitive information such as keys, secrets, and credit card numbers from the source code, although one could argue that this information is also part of the source code itself (and you could also argue it&rsquo;s technically not), making it extremely valuable; I&rsquo;m generally referring to source code as a programming artifact, the text that contains the logic, and secondarily, the documentation that supports it. From this perspective, one expression of value would be whether it matters if the source code is public or not. Think of the Windows and Linux operating systems. One is closed source but the latter is open. Is Linux&rsquo;s source code more &ldquo;valuable&rdquo; because it&rsquo;s publicly available? Or is it the other way around?</p>
<p>With that said, I want to dig a little deeper into an argument that leans toward source code, as presented above, not really being that valuable: <a href="https://en.wikipedia.org/wiki/Peter_Naur">Peter Naur</a>&rsquo;s <a href="https://pages.cs.wisc.edu/~remzi/Naur.pdf">&ldquo;Programming as Theory Building&rdquo;</a>. Although the &ldquo;Backus-Naur Form (BNF)&rdquo; author didn&rsquo;t explicitly say so, his arguments are quite fascinating, and maybe we could learn a thing or two from them.</p>
<p>Peter suggests that programming should be regarded as an activity by which the programmer forms an insight, or theory, of how certain affairs of the world will be handled by, or supported by, code; a kind of mapping of these affairs, both characteristics and details, into program text and any additional documentation. This is in contrast to the more common notion that programming is the production of programmed solutions, including design and implementation, and certain other texts, such as specifications and documentation. It emphasizes that the knowledge possessed by the programmer by virtue of having the theory essentially transcends that which is recorded in the documented products. This transcendence shows up in at least three different areas:</p>
<ul>
<li>The programmer having the theory of the program can explain how the solution relates to the affairs of the world that it helps to handle;</li>
<li>The programmer having the theory of the program can explain why each part of the program is what it is, or is able to support the actual program text with a justification of some sort;</li>
<li>The programmer having the theory of the program is able to respond constructively to any demand for a modification of the program so as to support the affairs of the world in a new manner.</li>
</ul>
<p>For the first point, he posits that during theory building, a large part of the world won&rsquo;t have a direct mapping to the code. That invisible contextual information can only be made relevant by someone who understands the program theory and its relation to the world. What is interesting is that he further argues that the code and its documentation are insufficient as carriers of the most essential part of any program, its theory. It is something that cannot conceivably be expressed in full; it is inextricably bound to human beings.</p>
<p>The second point talks about the technical decisions and tradeoffs made as to why a specific implementation was chosen. Other implementations would also have been correct, but the programmer with the theory can justify the decisions made at the time, decisions that come from intuition and experience. Here, major decisions such as architecture can be documented, but not everything. It is implied that, as the first point argues, it is impossible to document all these tradeoffs, and doing so probably wouldn&rsquo;t have much value anyway.</p>
<p>The final point, I think, is the most interesting as it involves program modification, which, I think we can all agree, is expected in software. And software modification is closely related to cost. This is the reason why onboarding new programmers takes time; new programmers need to understand the original theory. It is insufficient to only be familiar with the program text and other documentation. What is beneficial, or even required, is the opportunity to work in close contact with the programmer who holds the original theory. This is similar to the education problems of other activities, such as writing or learning a musical instrument. The student doing related activities under suitable supervision and guidance will be better off than the student who only has manuals and written instructions without teacher supervision.</p>
<p>One more point he emphasized is the idea of program life, death, and revival. The initial building of the program is the building of its theory. The program remains alive while the original programmer (or team) remains in charge and retains control of its modifications. Program death means the programmer (or team) who understands the theory is dissolved, or is not in charge anymore. Dead software can continue to provide value during its operation; the actual state of death only becomes visible when demands for modification cannot be answered or addressed anymore. Finally, revival simply means the rebuilding of its theory (not necessarily the original) by new programmers. This is why it is often faster to write new code than to modify existing code: writing new code means rebuilding the whole theory, but now it will be based on the intuition and perception of the world of the new programmer, without being beholden to the original theory.</p>
<p>Philosophy aside, what can we learn from these arguments? Here are some proposals:</p>
<ul>
<li>It is a good idea to retain the original programmers for as long as possible. (Perhaps easier said than done due to a lot of factors.)</li>
<li>Onboarding is slow and expensive. Prioritize giving new members access to the original programmers during the transition.</li>
<li>In general, keeping programmers longer is more cost effective than high turnover.</li>
<li>Documentation, while unable to capture the whole program theory, is still helpful in providing context, however minimal, in theory knowledge transfer.</li>
</ul>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-10-18:/blog/2024-10-18-source-code-value/</guid><link>https://flowerinthenight.com/blog/2024-10-18-source-code-value/</link><pubDate>Fri, 18 Oct 2024 00:00:00 JST</pubDate><title>How valuable is your source code?</title></item><item><description><![CDATA[
<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20241001-tech-debt.png"    >
</div>
</div>


<figcaption class="figure-caption mt-2 text-center" >An LLM creating technical debt with the spelling.</figcaption>

</figure>
</div>
</div>

<br>
<p>In this post, I want to talk a little bit about technical debt. In a recent conversation I had with one of our engineers, we sort of inadvertently enumerated, off the top of our heads, some of the more pressing technical debts in our current infrastructure, which, now that I think about it, add up to quite a lot. But what is &ldquo;<strong>technical debt</strong>&rdquo;, really?</p>
<p><a href="https://www.techopedia.com/definition/27913/technical-debt">Techopedia</a> defines it as &ldquo;a programming concept that reflects the extra work required when developers choose an easy short-term solution over the best long-term approach. Like financial debt, it incurs interest in the form of increased maintenance costs and complexity over time.&rdquo; I think I can get behind this definition. And while there are <a href="https://www.investopedia.com/articles/pf/12/good-debt-bad-debt.asp">good debts and bad debts</a> in finance, I think technical debt is mostly associated with a feeling of frustration, or resignation. I don&rsquo;t think I&rsquo;ve ever come across a situation where technical debt is talked about as something akin to &ldquo;good debt&rdquo;. It&rsquo;s always &ldquo;bad code&rdquo;, &ldquo;bad design&rdquo;, &ldquo;bad choice of framework&rdquo;, etc. But I think this association is a little precarious. Developers might assume that the previous developers were just rubbish at their jobs, without considering the constraints and tradeoffs made at the time. Product people might think that developers use tech debt as an excuse to extend the delivery schedule by a month just so they can play around with the stuff they care about more than the feature they&rsquo;re being asked to build. Engineering leaders might assume that their developers are spending too much time debating patterns and the &ldquo;correct&rdquo; way of doing things instead of agreeing on tradeoffs and carrying on.</p>
<p>Another more interesting trap with that association is that if we just write good code, then there shouldn&rsquo;t be any tech debt. But in reality, even a well-engineered solution will eventually become tech debt over time, especially when the original implementers leave the company, bringing that undocumented, contextual knowledge with them. And things like changing technologies, abandoned open source solutions, software license changes, leadership changes, budget changes, even bit rot, can render an almost-perfect implementation a tech debt in no time.</p>
<p>One common thing I&rsquo;ve observed teams do to mitigate tech debt is to conduct tech debt-only sprints, or tech debt hackathons. While well-intentioned and practical, it sometimes encourages a culture of postponing good design upfront just to honor the initial, usually very tight, schedule, instead of negotiating and meeting in the middle. And while I&rsquo;ve experienced some success with this approach, refactors and maintenance work are rarely rewarded. They&rsquo;re invisible, thankless, boring, and soul-crushing, especially if the code is not your own. Now, I&rsquo;m speaking in general terms here; I know a bunch of people who prefer maintenance work to the endless pumping out of new features. And more power to them. But I think this is not uncommon. Companies rarely reward tech debt maintenance, especially when there are a lot of initiatives and new features that drive the business forward.</p>
<p>So how to deal with it then? Unfortunately, I don&rsquo;t really have a defined answer to this. At <a href="https://www.alphaus.cloud/">Alphaus</a>, I advise our engineers to include tech debt (if there&rsquo;s a need to deal with it) in their estimations, and think of it as just that: part of the solution. I don&rsquo;t encourage separate, isolated efforts just to tackle tech debt, although there are times when we just admit defeat, deal with it, and then move on. I like to think of it as how we deal with financial debt, really; we pay in chunks monthly until it&rsquo;s fully paid. It&rsquo;s just part of the monthly budget. And occasionally, you get some extra cash and so you pay a little bit more.</p>
<p>On a more systematic level though, tech debt is really part and parcel of software development, at least in the current state of things. There are schedules to be met, and customer requests to be fulfilled. Whatever method(s) one employs to formalize tech debt management, which I think really depends on the company size, some points need to be considered:</p>
<ul>
<li>Metrics for visibility and understanding of systems;</li>
<li>Ownership at the higher level;</li>
<li>Alignment (and agreement) on the value to the business if a specific tech debt is addressed;</li>
<li>Understanding of cost and associated risks when not addressed; and</li>
<li>Metrics to measure success (or completion).</li>
</ul>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-10-01:/blog/2024-10-01-tech-debt/</guid><link>https://flowerinthenight.com/blog/2024-10-01-tech-debt/</link><pubDate>Tue, 01 Oct 2024 00:00:00 JST</pubDate><title>On technical debt</title></item><item><description><![CDATA[<p>In my <a href="/blog/2024-08-20-thoughts-on-newer-system-languages/">previous post</a> about the new systems programming languages, I mentioned I was considering <a href="https://ziglang.org/">Zig</a> as a potential complementary systems language to our main one, which is Go(lang). Well, for the past month or so, in my spare time, I tried writing something more substantial in it to understand the language better. For some time now, I&rsquo;ve been &ldquo;itching&rdquo; to write something similar to Hashicorp&rsquo;s <a href="https://github.com/hashicorp/memberlist">memberlist</a> library, but in a lower-level language for performance, a smaller footprint, and minimal network load. Now, I&rsquo;ve <a href="/blog/2023-04-28-hedge-memberlist/">used</a> <code>memberlist</code> before, and it is a superb piece of code, but I wanted something that supports a consistent leader across the whole fleet. It&rsquo;s a requirement for a system I plan on building in the near future (more on this in a later post). My top choices were C, Rust, and Zig, and as I said, I took a liking to Zig due to its promised simplicity, so I wrote it in Zig. The project is called <code>zgroup</code>, and you can check it out on <a href="https://github.com/flowerinthenight/zgroup">GitHub</a> if you&rsquo;re interested. It&rsquo;s still similar to <code>memberlist</code> but with the added capability of electing a leader across the whole group. It uses both the <a href="https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf">SWIM Protocol</a>, which <code>memberlist</code> uses, and <a href="https://raft.github.io/raft.pdf">Raft</a>&rsquo;s leader election algorithm.</p>
<p>Anyway, so what do I think of Zig? I think it has a lot of promise. It still feels incomplete, which it is, especially the standard library, though that is of course expected of pre-v1.0 software. But coming from C and Go, I think Zig is the sweet spot for me between C++ and Rust. I don&rsquo;t think I will be introducing it to <a href="https://www.alphaus.cloud/">Alphaus</a> anytime soon though; maybe when it&rsquo;s tagged v1.0+, which I think is still years away.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-09-30:/blog/2024-09-30-on-zig/</guid><link>https://flowerinthenight.com/blog/2024-09-30-on-zig/</link><pubDate>Mon, 30 Sep 2024 00:00:00 JST</pubDate><title>Thoughts on Zig</title></item><item><description><![CDATA[<p><strong>TL;DR</strong>: To my fellow system builders, don&rsquo;t dismiss it but understand how it works. As the saying goes, &ldquo;a tool is only as good as the hands that wield it&rdquo;. As a software craftsman, your tools are important. And equally so, are your skills in using them effectively.</p>
<p>~~</p>
<p>There&rsquo;s no escaping GenAI nowadays, is there? I&rsquo;m sure you&rsquo;ve seen the full spectrum of its effects by now; from total naysayers to skeptics, to cautious optimists, to proponents and fanatics, to full-blown doom-bringers. In the cloud space, the big three cloud providers are &ldquo;all in&rdquo; on AI, as you can see in their headlines. Furthermore, there are thousands of AI-powered startups cropping up left, right, and center, with massive venture capital and valuations. If you&rsquo;re not in the thick of these things, it gives the feeling that if you don&rsquo;t invest in, or apply, AI in your business or products, it&rsquo;s just a matter of time before you&rsquo;re left behind and eventually relegated to the pages of failure in history. It&rsquo;s a scary thought. The <a href="https://en.wikipedia.org/wiki/Fear,_uncertainty,_and_doubt">FUD</a> is real.</p>
<p>According to Gartner&rsquo;s Hype Cycle for Artificial Intelligence, 2024, AI is now past the &ldquo;Peak of Inflated Expectations&rdquo;, although the hype about it still continues. So somewhere around here:</p>
<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><a href="https://www.seldon.io/the-significance-of-ai-engineering-in-the-gartner-hype-cycle-for-artificial-intelligence-2024-report">
<div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20240825-ai-hype.png"   style="height: auto; width: 90%"   >
</div>
</a>
</div>


</figure>
</div>
</div>

<p>That means the &ldquo;Trough of Disillusionment&rdquo;, or in other words, the time of it not being &ldquo;cool&rdquo; anymore, is just around the corner (could still be years though, who knows). That&rsquo;s the phase where you&rsquo;re going to get the &ldquo;You&rsquo;re still doing AI? The world has moved on to Quantum Computing now! Catch up, or you&rsquo;re gonna be left behind!&rdquo; sort of feedback.</p>
<p>Remember Web3? <a href="https://news.crunchbase.com/venture/global-funding-data-analysis-ai-eoy-2023/#Web3%20and%20consumer%20tumble">According to Crunchbase</a>, the uncontrolled hype of 2021 petered out dramatically, with funding declining from $28 billion to $7.6 billion (a 73% decline) in 2023. $7.6 billion is no slouch by any means, but you&rsquo;d better have a credible business model to survive that. With AI, you can already see signs of the brakes being applied, with <a href="https://www.goldmansachs.com/insights/top-of-mind/gen-ai-too-much-spend-too-little-benefit">reports</a> saying it might not perform well enough to justify the cost.</p>
<p>So what does that mean for us, system builders? I&rsquo;m not an economist, nor can I predict the future, but I can form my opinion around what&rsquo;s going on. Mind you, this is just my opinion; I&rsquo;m not even representing the opinion of my employer, <a href="https://www.alphaus.cloud/">Alphaus</a>. With that disclaimer out of the way, I think I&rsquo;m somewhere in the middle; cautiously optimistic about AI and its uses. AI has been compared to the internet revolution of the &rsquo;90s, and rightly so, as a lot of the intricacies of AI still feel like &ldquo;magic&rdquo; to most people. And I think I can understand this sentiment as I&rsquo;m in the generation that saw the rise of the internet and social media during their prime years, and a lot of it felt like magic as well. Regardless of the dot-com crash of the early 2000s, the internet, just like the smartphone, is now here to stay. We don&rsquo;t really think about them that much anymore; they&rsquo;re just, you know, part of our daily lives. And I think AI will be the same. It will find its place, whether niche or mainstream, in our daily lives in some form or another. Whether that&rsquo;s in my lifetime, I don&rsquo;t know.</p>
<p>We use AI at Alphaus. We use forecasting models to help our customers <a href="https://www.alphaus.cloud/en/ripple">forecast budgets and their future cloud spending</a> based on their historical data. We use LLMs as part of our customer support. Our engineers use Copilot to aid them in their day-to-day coding tasks. I use both ChatGPT and Gemini for notetaking, summarization, and translation. And I know our CEO, <a href="https://www.linkedin.com/in/hajimehirose/">Hajime Hirose</a>, uses a lot of LLMs as well. On top of that, I read a lot of the papers behind AI. It&rsquo;s something I enjoy doing. As the CTO of a startup, it is part of my job to assess technology and leverage it to further our business. And that includes AI, among other things.</p>
<p>If you&rsquo;re a software engineer who is currently employed, or looking for a job, or looking for something interesting to learn/build, or a technologist who is thinking of starting a company, I think it&rsquo;s wise to consider AI. Whether we like it or not, we are part of this corporate, capitalistic world, and AI is the current game. And we have to play the game to survive.</p>
<p>Do I think AI will replace developers? To some extent, yes. And not just developers, but other professions as well. But I think it will also create new jobs and positions at the same time, just like the internet and other transformative tech revolutions before it. I would imagine the industrial revolution replaced a lot of professions but also created completely new ones. Think assembly lines, steel-making, steam-powered automation and mass production, etc., which didn&rsquo;t really exist before. Anyway, think of it this way: software engineering is about solving problems, not just coding. Coding is probably the easiest part of our job. LLMs might be able to replace that bit but, without <a href="https://en.wikipedia.org/wiki/Artificial_general_intelligence">sentience or consciousness</a>, the social/human part of our job is still on us.</p>
<p>Do I think GenAI will cause human extinction? Or replace us humans? Well, it&rsquo;s like asking a hunter-gatherer during the Stone Age if he thinks a jet engine would render his bow and arrow useless. He&rsquo;ll probably just reply, &ldquo;You mad? What are you on about?&rdquo;.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-08-25:/blog/2024-08-25-on-genai/</guid><link>https://flowerinthenight.com/blog/2024-08-25-on-genai/</link><pubDate>Sun, 25 Aug 2024 00:00:00 JST</pubDate><title>On Generative AI (GenAI)</title></item><item><description><![CDATA[<p>I&rsquo;m talking about the new crop of systems programming languages that advertise themselves as better replacements for C and/or C++: <a href="https://www.rust-lang.org/">Rust</a>, <a href="https://ziglang.org/">Zig</a>, <a href="https://dlang.org/">D</a>, <a href="https://odin-lang.org/">Odin</a>, <a href="https://nim-lang.org/">Nim</a>, etc. I&rsquo;m using the word &ldquo;new&rdquo; loosely here as Rust and Zig, for example, are almost a decade old now. The topic of systems programming has been on my radar (again) recently at work due to our attempts at improving the performance of some of the more critical parts of our stack. At <a href="https://www.alphaus.cloud/">Alphaus</a>, we use Go as our main programming language and as much as I like Go, there are still areas in our infrastructure that could be served better with non-GC languages.</p>
<p>This post will not be an X vs Y review, or at least, not intentionally. Programming languages, I think, are one of those things that developers tend to get attached to, alongside editors and OSes (or distros in the Linux world). &ldquo;Here be dragons,&rdquo; as they say. But I want to share my thoughts from the perspective of someone who can introduce a new programming language to teams, instead of someone who wants to make a case and convince somebody who can do so.</p>
<p>Before I joined, Alphaus was a PHP shop. Around early 2018, after about half a year, I introduced Go as our main programming language but only for new services; no rewrites. I didn&rsquo;t know Go at that time; nobody in the company did. But why Go? I could have chosen Java, or C#. Seven years on, I consider that a good decision. Did I know it would be? Of course not. Java probably would have been fine. But thinking about it now, I would attribute it mainly to Go&rsquo;s simplicity.</p>
<br>
<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><a href="https://go.dev/">
<div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20240820-golang.png"   style="height: auto; width: 90%"   >
</div>
</a>
</div>


<figcaption class="figure-caption mt-2 text-center" >From Go&rsquo;s homepage.</figcaption>

</figure>
</div>
</div>

<br>
<p>Before Alphaus, my background was embedded, low-level systems. Over the years of working with several programming languages, I&rsquo;ve come to appreciate the simplicity of C. Even though, experience-wise, I&rsquo;ve probably written more C++ than C, there&rsquo;s something in C&rsquo;s simplicity that I yearn for when working with more complex languages. And I think that translated to my choice of Go, really. It was the C equivalent in the high-level language world for me (apart from the fact that it&rsquo;s created by C people). So you can say there&rsquo;s definitely a bias there on my part. And I think it&rsquo;s true. We always say (in software engineering circles) &ldquo;use the right tool for the job&rdquo;, but the &ldquo;right&rdquo; part is the tricky bit. In my case, &ldquo;right&rdquo; was ultimately based on my own biases, my own preferences. Although in hindsight, it wasn&rsquo;t purely selfish. There was a big push for Go when going cloud native at that time as well. And Docker and Kubernetes were also written in Go, which I think also had an influence, as we based our stack on these tools too. And our product was in infrastructure (we pivoted later), so there&rsquo;s definitely an alignment there.</p>
<p>Go&rsquo;s simplicity, in my mind, also meant easy recruitment. I said nobody in the company knew Go at that time, but it wasn&rsquo;t really a big deal to me as I was quite sure the engineering team would pick it up and be productive with it in no time. The same goes for recruitment; I can hire non-Go engineers knowing that they can learn Go in about a week and then start contributing. With that said though, simplicity translating to quick productivity is not really the deciding factor for me. It only applies when you don&rsquo;t have people who know the language. When you have a team that is already proficient with, say, C++, they will be productive with it.</p>
<p>You see, I could go on enumerating the pros of Go by mentioning its merits as a language, but what I&rsquo;m really trying to say here is that the &ldquo;objective&rdquo; (or &ldquo;right&rdquo;) reasons why I should choose Go only had a slight influence on me. Even without all of these, I&rsquo;m sure I&rsquo;d still have ended up with Go, mainly because of my bias toward C. I&rsquo;m sure there are a lot of engineering leaders out there who are more objective or data-driven than me but again, programming languages are one of those things we get emotional about. And I&rsquo;m sure I&rsquo;m not alone in thinking this way.</p>
<p>Now back to the present. At the moment, I&rsquo;m mulling over introducing a systems language to the company. Not a replacement for Go, but a complement to Go. Of all the choices listed above, and based on the thought process I just laid out, it&rsquo;s obvious that my choice would be Zig. It&rsquo;s advertised as &ldquo;the better C&rdquo;, while Rust is &ldquo;the better C++&rdquo;. But I&rsquo;m not quite sure yet.</p>
<p>You could argue that it&rsquo;s impractical, maybe borderline irresponsible, of me to not choose Rust. It should be the &ldquo;right&rdquo; choice. It&rsquo;s memory-safe, more mature, already included in the Linux kernel, proven in production by some of the big names in tech, etc. And more importantly, some of the people I follow (and look up to) are also advocating for it.</p>
<br>
<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><a href="https://www.rust-lang.org/">
<div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20240820-rust.png"   style="height: auto; width: 90%"   >
</div>
</a>
</div>


<figcaption class="figure-caption mt-2 text-center" >From Rust&rsquo;s homepage.</figcaption>

</figure>
</div>
</div>

<br>
<p>I know Rust a bit. I&rsquo;ve dabbled with it outside work, although I&rsquo;m not really proficient since I haven&rsquo;t written any production code in it. It is a complex language. Not C++-level complex, but still very complex. I believe it is &ldquo;the better C++&rdquo; and if I were still working in C++ now, I&rsquo;d choose it as well. But I&rsquo;m not. And my bias is stronger now than ever before due to C, and now, Go.</p>
<p>I&rsquo;m actually looking at Zig now. It resonates with me as it still feels &ldquo;C&rdquo;-ish, but a lot better. I like having explicit control over memory allocations. It&rsquo;s probably not as safe as Rust, but it provides mechanisms to make it much safer than C. And it&rsquo;s still a simple language overall. Having said that, no decisions yet. It&rsquo;s only been 2-3 months. And I&rsquo;m not in a hurry. I think the only real blocker at the moment is that it&rsquo;s not tagged 1.x yet.</p>
<br>




























    




<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><a href="https://ziglang.org/">
<div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20240820-zig.png"   style="height: auto; width: 90%"   >
</div>
</a>
</div>


<figcaption class="figure-caption mt-2 text-center" >From Zig&rsquo;s homepage.</figcaption>

</figure>
</div>
</div>

<br>
<p>This post has been all over the place. In the end, you might not care what my choice will be or what justifications I have for it. But here&rsquo;s another way of looking at it: if you ever find yourself trying to convince your leader(s) to switch to your preferred programming language, basing your arguments purely on objective merits might not hold much water. You&rsquo;re probably fighting a losing battle from the start.</p>
<p>Or it could still work, you know. Who knows.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-08-20:/blog/2024-08-20-thoughts-on-newer-system-languages/</guid><link>https://flowerinthenight.com/blog/2024-08-20-thoughts-on-newer-system-languages/</link><pubDate>Tue, 20 Aug 2024 00:00:00 JST</pubDate><title>Thoughts on the newer systems programming languages</title></item><item><description><![CDATA[<p>One of <a href="https://www.alphaus.cloud/">Alphaus</a>&rsquo; data processing pipelines ingests around 10TB of client financial data per day. The processing engine is running on <a href="https://cloud.google.com/kubernetes-engine">GKE</a> with around 80-100 (depending on what week of the month) pods sharing the total workload. Each pod has around 10GB of memory and 30GB of attached storage. The consistency of this load allowed us to purchase enough <a href="https://cloud.google.com/docs/cuds">Committed Use Discounts (CUDs)</a> for the underlying VMs to save on compute costs.</p>
<p>These pod resource limits are enough about 80% of the time. However, since late last year, some of the accounts have datasets that are way, way beyond these limits, causing persistent <a href="https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#exceed-a-container-s-memory-limit"><strong>OOMKilled</strong></a> events.</p>
<p>Our first stop-gap solution was to increase the memory limit. The trouble was, even 20GB+ of memory wasn&rsquo;t enough for some of the input datasets. On top of this, GKE&rsquo;s cluster autoscaler also started provisioning VM sizes we don&rsquo;t have CUDs for. Suffice it to say, it increased our monthly cloud spend by about <strong>20%</strong> while delaying the overall processing time due to pods crashing (and restarting).</p>
<p>We tried other solutions. One was using local files, which required increasing the size of the attached storage. While cost-effective, the performance drop was significant, mainly because most of the datasets that were well within the memory limit were now moved to disk as well. We also tried offloading to the database we currently use, which turned out to be worse in terms of both performance and cost. We also tried our cache layer (named <a href="https://github.com/alphauslabs/jupiter">Jupiter</a>), which was very performant but prohibitively expensive.</p>
<p>Enter <strong>Spill-over Store (SoS)</strong>, our current solution. Inspired by <a href="https://ignite.apache.org/">Apache Ignite</a>&rsquo;s design, the idea is to stitch together the already available memory and storage across the running pods, providing an ad-hoc, on-demand storage for really big datasets.</p>




























    




<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/sos.png"   style="height: auto; width: 90%"   >
</div>
</div>


<figcaption class="figure-caption mt-2 text-center" >Illustration of hedge&rsquo;s Spill-over Store.</figcaption>

</figure>
</div>
</div>

<p>From the image above, the pod assigned to load a huge dataset will exhaust its local memory first, then &ldquo;spill over&rdquo; to its local disk, then to another pod&rsquo;s memory (using gRPC streaming), then to that pod&rsquo;s disk, and so on. Thus, our example 100GB dataset will utilize around 4 pods in total within the cluster.</p>
<p>This solution allowed us to revert to our original pod resource limits. Both disk and network performance are acceptable (we don&rsquo;t use GCP&rsquo;s Tier 1 networking) and still within our SLA, as the solution only applies to about 20% of the ingestion pipeline. The majority still uses local in-memory processing.</p>
<p>As Alphaus grows (and therefore ingests more and more data) and serves more clients, maybe we will eventually end up using <strong>Apache Ignite</strong> or some other off-the-shelf distributed solution, but as of now, <strong>SoS</strong> works. With that said, if you have a cost-effective (and better) product/solution in mind, please feel free to contact us. We&rsquo;d love to talk.</p>
<p>Finally, you can find <strong>SoS</strong>&rsquo;s implementation <a href="https://github.com/flowerinthenight/hedge/blob/main/sos.go">here</a> (if you&rsquo;re interested).</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-07-24:/blog/2024-07-24-spillover-store/</guid><link>https://flowerinthenight.com/blog/2024-07-24-spillover-store/</link><pubDate>Wed, 24 Jul 2024 00:00:00 JST</pubDate><title>How Alphaus saves on costs by ‘stitching storage’</title></item><item><description><![CDATA[<p>&ldquo;Back-of-the-envelope calculations&rdquo;, &ldquo;napkin math&rdquo;, &ldquo;latency numbers every programmer should know&rdquo; - yes, those numbers that usually come up in system design interviews. This came into my periphery again while looking at RDMA latency checks and benchmarks with P4d instances in AWS (using SoftRoCE). As an old-timer with (most likely) outdated ideas about system design latency numbers, and although I&rsquo;m quite familiar with Jeff Dean&rsquo;s <a href="https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf">&ldquo;Numbers Everyone Should Know&rdquo;</a> approximations, I noticed that, in a jiffy, I&rsquo;m still (unconsciously) subscribed to the idea that disk access is most definitely faster than network. Somewhere along the lines of L* cache &gt; memory &gt; disk &gt; network. Come to think of it, I&rsquo;m not really sure why. It&rsquo;s probably because, pre-cloud, I didn&rsquo;t have access to high-tier network bandwidth, so my experience with crappy networks has been etched in my mind for the longest time. This usually shows when I do quick, back-of-my-head latency calculations for a potential system design. Of course, in the end, benchmarking has the last say on these numbers, but having a rough idea of the performance numbers pre-implementation is always helpful.</p>




























    




<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20240628-numbers-everyone-should-know.png"   style="height: auto; width: 70%"   >
</div>
</div>


<figcaption class="figure-caption mt-2 text-center" >From Jeff Dean&rsquo;s slides.</figcaption>

</figure>
</div>
</div>

<p>The idea of 50Gbps, 100Gbps, or even 200Gbps network speeds, and now P4d&rsquo;s advertised 400Gbps (yes, 400!), somehow didn&rsquo;t shake my internal insistence that SSDs, especially NVMe drives, even the M.2 ones, should be faster. Now, there might be enterprise/military/space/etc.-grade SSDs that I&rsquo;m not aware of, or don&rsquo;t have access to, with stupendous read speeds, but I&rsquo;ve never heard of one, so let&rsquo;s leave those for now. Some of the faster M.2 SSDs can reach ~15GB/s read speeds, which is quite impressive, but nowhere near P4d&rsquo;s numbers. I still can&rsquo;t wrap my head around it. It&rsquo;s the sort of figure you find listed on supercomputers with their custom InfiniBand interconnects.</p>
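<p>Putting those two numbers side by side makes the gap concrete. The napkin calculation below uses the advertised peaks only (400Gbps NIC vs. a ~15GB/s M.2 NVMe read); sustained real-world throughput will of course be lower on both sides.</p>

```go
package main

import "fmt"

// Napkin math: how long to move a 100 GB dataset over a 400 Gbps
// NIC versus reading it from a ~15 GB/s M.2 NVMe SSD. Advertised
// peak rates only, ignoring protocol overhead and queueing.
func transferSeconds(bytes, bytesPerSec float64) float64 {
	return bytes / bytesPerSec
}

func main() {
	const gb = 1e9
	dataset := 100 * gb
	nic := 400e9 / 8 // 400 Gbps -> 50 GB/s
	ssd := 15 * gb   // ~15 GB/s sequential read

	fmt.Printf("network: %.1fs\n", transferSeconds(dataset, nic)) // 2.0s
	fmt.Printf("ssd:     %.1fs\n", transferSeconds(dataset, ssd)) // 6.7s
}
```

<p>On paper, the NIC wins by more than 3x, which is exactly the inversion of the old L* cache &gt; memory &gt; disk &gt; network intuition.</p>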
<p>Anyway, after scouring the rabbit hole that is Reddit for discussions, debates (and fights) about the correctness of these numbers, I came across sirupsen&rsquo;s <a href="https://github.com/sirupsen/napkin-math">napkin-math</a> repo, which I think is a fascinating piece of information. If you&rsquo;re a system designer, you really should check it out.</p>




























    




<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20240628-numbers-net-ssd.png"   style="height: auto; width: 80%"   >
</div>
</div>


<figcaption class="figure-caption mt-2 text-center" >From napkin-math repo.</figcaption>

</figure>
</div>
</div>

<p>Again, it looks like the network can indeed be faster than disk access. I really need to recalibrate my internal latency numbers table to accommodate these more modern hardware capabilities. The good thing is, at least, this sort of information is now readily available anytime. With that said, I still believe system designers should be able to roughly determine a system&rsquo;s overall performance behavior pre-implementation using napkin-math latency calculations.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-06-28:/blog/2024-06-28-revisiting-latency-numbers/</guid><link>https://flowerinthenight.com/blog/2024-06-28-revisiting-latency-numbers/</link><pubDate>Fri, 28 Jun 2024 00:00:00 JST</pubDate><title>Revisiting latency numbers</title></item><item><description><![CDATA[<p>It was simply because the Jekyll theme I was using before, a modified version of <a href="https://github.com/gchauras/much-worse-jekyll-theme">much-worse-jekyll-theme</a>, doesn&rsquo;t build anymore. Actually, no, that&rsquo;s not quite right: I updated some of its <code>npm</code> dependencies, which caused it to stop building. It uses quite an old version of Ruby, as do most of its dependencies, so I&rsquo;ve been receiving a lot of vulnerability warning emails from GitHub. Instead of potentially spending a lot of time just fixing the build, I figured I&rsquo;d be better off moving to a ready-made, more modern theme. I knew about <a href="https://gohugo.io/">Hugo</a>, and a quick search of its free themes led me to Will Faught&rsquo;s <a href="https://github.com/willfaught/paige">Paige</a> theme, which I quite like for its very simple look.</p>
<p>The migration was actually quite quick. I only copied my previous <code>_posts/</code> folder into the new <code>content/blog/</code> folder and added some of the needed front matter which was simply done using good ol&rsquo; <code>grep</code> + <code>awk</code>.</p>
<p>While I really liked the look and feel of my old site, this new one is also quite good. And Netlify&rsquo;s support for Hugo is quite solid as well so less build headaches for me. Hopefully, for many years to come.</p>
<p>Thank you OSS and free services.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-06-27:/blog/2024-06-27-new-look/</guid><link>https://flowerinthenight.com/blog/2024-06-27-new-look/</link><pubDate>Thu, 27 Jun 2024 00:00:00 JST</pubDate><title>New site look</title></item><item><description><![CDATA[<p>I shared a simple piece of code for getting a process&rsquo; memory usage in Linux. It&rsquo;s called <a href="https://github.com/flowerinthenight/memx"><code>memx</code></a>. It&rsquo;s Linux-specific only as it reads the <a href="https://en.wikipedia.org/wiki/Proportional_set_size">proportional set size (PSS)</a> data from either <code>/proc/{pid}/smaps_rollup</code> (if present) or <code>/proc/{pid}/smaps</code> file. I&rsquo;ve used this piece of code many times at work. We use memory-mapped files extensively in some of our services and this is how we get more accurate results. Very useful in debugging <code>OOMKilled</code> events in k8s.</p>
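<p>The gist of the technique is just summing the <code>Pss:</code> entries from the smaps-format files. The minimal sketch below parses that format (function name and sample are mine, not <code>memx</code>&rsquo;s actual API); with <code>smaps_rollup</code> there is a single pre-summed entry, while plain <code>smaps</code> has one per mapping, so summing handles both.</p>

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sumPssKB sums all "Pss:" entries from /proc/{pid}/smaps-style
// content and returns the total in kB. Lines look like:
//   Pss:          1024 kB
func sumPssKB(smaps string) (int64, error) {
	var total int64
	sc := bufio.NewScanner(strings.NewReader(smaps))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "Pss:" {
			kb, err := strconv.ParseInt(fields[1], 10, 64)
			if err != nil {
				return 0, err
			}
			total += kb
		}
	}
	return total, sc.Err()
}

func main() {
	// In real use, read os.ReadFile("/proc/self/smaps_rollup")
	// (falling back to smaps) instead of this sample.
	sample := "Rss:  2048 kB\nPss:  1024 kB\nPss:  512 kB\n"
	kb, _ := sumPssKB(sample)
	fmt.Println(kb, "kB") // 1536 kB
}
```

<p>PSS divides each shared page&rsquo;s cost among the processes mapping it, which is why it gives saner numbers than RSS for memory-mapped files.</p>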
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-05-30:/blog/2024-05-30-memx-memory-usage-linux/</guid><link>https://flowerinthenight.com/blog/2024-05-30-memx-memory-usage-linux/</link><pubDate>Thu, 30 May 2024 00:00:00 JST</pubDate><title>memx - Get process’ memory usage (Linux)</title></item><item><description><![CDATA[<p>I posted a <a href="https://labs.alphaus.cloud/blog/2024/05/13/the-divs-model/">blog</a> introducing the <strong>DIVS</strong> model, the process we use at <a href="https://www.alphaus.cloud/">Alphaus</a>, the startup I work for. Check it out <a href="https://labs.alphaus.cloud/blog/2024/05/13/the-divs-model/">here</a>.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-05-13:/blog/2024-05-13-the-divs-model/</guid><link>https://flowerinthenight.com/blog/2024-05-13-the-divs-model/</link><pubDate>Mon, 13 May 2024 00:00:00 JST</pubDate><title>The DIVS model</title></item><item><description><![CDATA[<p>I recently uploaded a tool to GitHub that wraps the <code>kubectl get events -w</code> command for watching <code>OOMKilled</code> events in Kubernetes. It&rsquo;s called <a href="https://github.com/flowerinthenight/oomkill-watch"><code>oomkill-watch</code></a>. You can check out the code <a href="https://github.com/flowerinthenight/oomkill-watch">here</a>. You might find this useful.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2024-05-03:/blog/2024-05-03-oomkill-watch/</guid><link>https://flowerinthenight.com/blog/2024-05-03-oomkill-watch/</link><pubDate>Fri, 03 May 2024 00:00:00 JST</pubDate><title>oomkill-watch - A tool to watch OOMKilled events in k8s</title></item><item><description><![CDATA[



























    




<div class="paige-figure ">
<div class="align-items-center d-flex  h-100 justify-content-center ">
<figure class=" mb-0" >
<div class="d-flex justify-content-center text-center"><div class="paige-image">
<img   class="img-fluid "  crossorigin="anonymous"    referrerpolicy="no-referrer"  src="https://flowerinthenight.com/assets/20230809-ship-orgchart.png"    >
</div>
</div>


</figure>
</div>
</div>

<br>
<p>You&rsquo;ve probably heard the warning &ldquo;Don&rsquo;t ship the org chart&rdquo;, common in product development circles. I always thought of it as synonymous with <a href="https://en.wikipedia.org/wiki/Conway%27s_law">Conway&rsquo;s Law</a>, which states that organizations design systems that mirror their own communication structure. In my experience, I&rsquo;ve come to believe this is true. Whether you like it or not, it is an eventuality. At some point I thought that to be an effective solutions architect, you have to be an &ldquo;org architect&rdquo;. Or that in order to achieve a certain system architecture, you&rsquo;re better off rearranging your org structure than trying to influence cross-functional architects or senior engineering leadership to agree with you. It may work for a time, especially at the start, but eventually, as engineers come and go, the org-mirrored design will out. But I think there&rsquo;s more nuance to this than what&rsquo;s obvious.</p>
<p>There are several types of org structures, but I will cover only what&rsquo;s relevant to my situation, which is startups. This might be anecdotal, but among peers, the most common types fall under two broad classifications: bigger teams with fewer boundaries, and smaller, often siloed teams. Teams with fewer boundaries tend to move slower because more people need to communicate for alignment, but they produce a more coherent product. Smaller teams, on the other hand, move faster, but their outputs are more difficult to integrate, resulting in an inconsistent product experience. Personally, I don&rsquo;t have a strong opinion on which is better. I think there are phases in a startup where one works better than the other. But this requires you to understand what you want to build in the first place, and then build your org chart around that.</p>
<p>There&rsquo;s also the case of building your org structure around product lines (if you have multiple products) as opposed to functional teams covering multiple products. Product-based org structures might promote autonomy and quicker response to industry changes but can easily lead to system duplication (wasted resources). A functional structure, however, encourages specialization and focus but inhibits communication across boundaries.</p>
<p>You may argue, &ldquo;Does it really matter? Customers don&rsquo;t really care how your org is structured as long as your product is usable and has great UX.&rdquo; I would actually agree that this is a good baseline for structuring your org: several vertical teams per product, plus an overlapping horizontal team that ensures UX coherence, usability, and branding. I think this is what the <a href="https://www.hbs.edu/ris/Publication%20Files/16-124_7ae90679-0ce6-4d72-9e9d-828872c7af49.pdf">Mirroring Hypothesis</a> study (which confirms Conway&rsquo;s Law) refers to as partial mirroring, in which technological knowledge is invested, shared, and acquired beyond operational boundaries. &ldquo;API-first&rdquo; companies come to mind, as building contractual relationships (APIs) that support technical interdependency across boundaries seems to make partial mirroring work. And more often than not, your org structure will change multiple times during your product&rsquo;s lifetime, resulting in subsequent changes to your system&rsquo;s design that mirror the new org, and so on and so forth, so you might as well embrace this dynamism instead of fighting it.</p>
<p>Finally, the study also highlights collaboration patterns in the open source sphere that do not support the mirroring hypothesis (and thus, Conway&rsquo;s Law). My experience with OSS collaboration is mostly outside work so I&rsquo;m interested to know if there is a case for adopting OSS-style structure within the organization. I think that&rsquo;s a good topic for another day.</p>
<p>To conclude, instead of going with the warning &ldquo;don&rsquo;t ship the org chart&rdquo;, since you will ship your org chart anyway, be aware of it, understand it, and make sure it works in your favor.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2023-08-09:/blog/2023-08-09-cto-diaries-on-not-shipping-your-org-chart/</guid><link>https://flowerinthenight.com/blog/2023-08-09-cto-diaries-on-not-shipping-your-org-chart/</link><pubDate>Wed, 09 Aug 2023 00:00:00 JST</pubDate><title>CTO Diaries #4: On not shipping your org chart</title></item><item><description><![CDATA[<p>We just recently announced the public beta of our new product, <a href="https://lp.alphaus.cloud/">OCTO</a>. If you&rsquo;re interested, you can join our waiting list at <a href="https://lp.alphaus.cloud/waitlist">https://lp.alphaus.cloud/waitlist</a>.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2023-05-29:/blog/2023-05-29-announcing-octo/</guid><link>https://flowerinthenight.com/blog/2023-05-29-announcing-octo/</link><pubDate>Mon, 29 May 2023 00:00:00 JST</pubDate><title>Announcing our new product, OCTO</title></item><item><description><![CDATA[<p>In a distributed system, where multiple processes communicate with each other over a network, failures are inevitable. Network partitions, hardware failures, and software bugs can all cause a request to fail. Retries with backoff are a critical technique to help mitigate these failures.</p>
<p>Retries refer to the act of retrying a failed request. When a request fails, the client can retry the request, hoping that it will succeed the next time around. However, simply retrying the request immediately after a failure can be problematic. If the failure was caused by a temporary network issue, for example, retrying immediately will likely result in another failure. This is where backoff comes in.</p>
<p>Backoff refers to the practice of waiting a certain amount of time before retrying a failed request. The idea is to wait long enough for any temporary issues to resolve themselves before retrying. The amount of time to wait is typically increased with each retry, hence the term &ldquo;backoff.&rdquo; The idea is that if the request fails multiple times, the client will eventually back off enough to give the system a chance to recover.</p>
<p>There are several benefits to using retries with backoff in a distributed system. First, it can help reduce the impact of temporary failures. By waiting before retrying a request, the client can avoid bombarding the target with requests, which can exacerbate the problem. Second, it can help improve overall system availability. By retrying failed requests, the client can work around transient issues that might otherwise cause the entire system to fail.</p>
<p>There are several strategies for implementing retries with backoff. One common approach is exponential backoff, where the client waits an increasing amount of time between each retry. Another approach is jittered backoff, where the client adds a random amount of time to the wait period to avoid the so-called &ldquo;thundering herd&rdquo; problem, where multiple clients all retry at the same time.</p>
<p>Example 1:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kn">import</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="o">...</span>
</span></span><span class="line"><span class="cl">    <span class="nx">backoffv4</span> <span class="s">&#34;github.com/cenkalti/backoff/v4&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">func</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">var</span> <span class="nx">n</span> <span class="kt">int</span>
</span></span><span class="line"><span class="cl">    <span class="nx">operation</span> <span class="o">:=</span> <span class="kd">func</span><span class="p">()</span> <span class="kt">error</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">n</span><span class="o">++</span>
</span></span><span class="line"><span class="cl">        <span class="nx">log</span><span class="p">.</span><span class="nf">Printf</span><span class="p">(</span><span class="s">&#34;n=%v\n&#34;</span><span class="p">,</span> <span class="nx">n</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nx">n</span> <span class="o">&gt;=</span> <span class="mi">10</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Errorf</span><span class="p">(</span><span class="s">&#34;backoff&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="nx">err</span> <span class="o">:=</span> <span class="nx">backoffv4</span><span class="p">.</span><span class="nf">Retry</span><span class="p">(</span><span class="nx">operation</span><span class="p">,</span> <span class="nx">backoffv4</span><span class="p">.</span><span class="nf">NewExponentialBackOff</span><span class="p">())</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="nx">err</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">log</span><span class="p">.</span><span class="nf">Println</span><span class="p">(</span><span class="s">&#34;final backoff failed&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>Example 2 (my preference):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kn">import</span> <span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="o">...</span>
</span></span><span class="line"><span class="cl">    <span class="nx">gaxv2</span> <span class="s">&#34;github.com/googleapis/gax-go/v2&#34;</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">func</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nx">bo</span> <span class="o">:=</span> <span class="nx">gaxv2</span><span class="p">.</span><span class="nx">Backoff</span><span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">Initial</span><span class="p">:</span> <span class="nx">time</span><span class="p">.</span><span class="nx">Second</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="nx">Max</span><span class="p">:</span>     <span class="nx">time</span><span class="p">.</span><span class="nx">Minute</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kd">var</span> <span class="nx">n</span> <span class="kt">int</span>
</span></span><span class="line"><span class="cl">    <span class="nx">operation</span> <span class="o">:=</span> <span class="kd">func</span><span class="p">()</span> <span class="kt">error</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">n</span><span class="o">++</span>
</span></span><span class="line"><span class="cl">        <span class="nx">log</span><span class="p">.</span><span class="nf">Printf</span><span class="p">(</span><span class="s">&#34;cnt=%v\n&#34;</span><span class="p">,</span> <span class="nx">n</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nx">n</span> <span class="o">&gt;=</span> <span class="mi">10</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="kc">nil</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">return</span> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Errorf</span><span class="p">(</span><span class="s">&#34;backoff&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">err</span> <span class="o">:=</span> <span class="nf">operation</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nx">err</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="nx">time</span><span class="p">.</span><span class="nf">Sleep</span><span class="p">(</span><span class="nx">bo</span><span class="p">.</span><span class="nf">Pause</span><span class="p">())</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="k">break</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>In conclusion, retries with backoff are an important technique for improving the robustness and availability of distributed systems. By waiting before retrying failed requests, the client can help reduce the impact of temporary failures and improve overall system availability. There are several strategies for implementing retries with backoff, and choosing the right approach will depend on the specific requirements of the system.</p>
<p>Additional reading: <a href="https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/">https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/</a></p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2023-05-11:/blog/2023-05-11-retries-backoff/</guid><link>https://flowerinthenight.com/blog/2023-05-11-retries-backoff/</link><pubDate>Thu, 11 May 2023 00:00:00 JST</pubDate><title>Retries with backoff in distributed systems</title></item><item><description><![CDATA[<p>Hey there,</p>
<p>I just posted a blog about gRPC <a href="https://labs.alphaus.cloud/blog/2023/05/03/blue-api/">here</a>. If gRPC and <code>grpc-gateway</code> are right up your alley, you might find it interesting.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2023-05-03:/blog/2023-05-03-alphaus-blue-api/</guid><link>https://flowerinthenight.com/blog/2023-05-03-alphaus-blue-api/</link><pubDate>Wed, 03 May 2023 00:00:00 JST</pubDate><title>Alphaus Blue API</title></item><item><description><![CDATA[<p>I recently came across the <a href="https://github.com/hashicorp/memberlist">hashicorp/memberlist</a> library while browsing GitHub and I thought it would be a good replacement for <a href="https://github.com/flowerinthenight/hedge">hedge</a>&rsquo;s internal member tracking logic. It seems to be widely used (thus more battle-tested) as well. I was quite excited as I always thought that hedge&rsquo;s equivalent logic is too barebones and untested outside of our use cases. It works just fine for its current intended purpose but I&rsquo;ve been hesitating to build on top of it until I can really say that it&rsquo;s stable enough. With memberlist, it might just be what I needed.</p>
<p>After about a month of testing, it didn&rsquo;t turn out well in the end. It is stable enough for deployments whose workloads aren&rsquo;t spiky (no frequent scaling up/down), or if I set min = max in the <a href="https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/">HorizontalPodAutoscaler</a>. In those cases, memberlist tracks members consistently. Even better, it works across multiple deployments in the same namespace, which I thought was brilliant. For example, if I have a deployment <code>app1</code> set to 10 pods in the <code>default</code> namespace using memberlist&rsquo;s default port numbers, and then deploy another set, say <code>app2</code>, within the same namespace using the same ports, app1&rsquo;s memberlist tracks its 10 member pods just fine while app2&rsquo;s memberlist stays separate. But applied to my use case, with a minimum of 2 pods, a maximum of 150, and frequent scaling up/down depending on load, it can&rsquo;t seem to keep up. The potential for <a href="https://en.wikipedia.org/wiki/Byzantine_fault">Byzantine faults</a> is just too high: at a 50-pod scale, memberlist can end up with two groups of m pods and n pods where m+n=50. Very rarely, it can even split into 3 groups.</p>
<p>I am a little frustrated. I really wanted it to work; I even attempted to update memberlist to incorporate hedge&rsquo;s logic, but that was too much to take on with my current schedule. So for now, it&rsquo;s back to the old one. The current logic is fairly rudimentary: all members in the cluster/group send a liveness heartbeat to the leader, and the leader broadcasts the final list of members to all via hedge&rsquo;s broadcast mechanism. CPU usage between the two is fairly similar, depending on the sync timeout.</p>
<p>I&rsquo;ve been trying to improve hedge&rsquo;s member tracking system as I want to build a distributed in-memory cache within hedge itself. Most of the available ones are <a href="https://raft.github.io/">Raft</a>-based, and I still haven&rsquo;t figured out how to make Raft work in the same deployment configuration.</p>
<br>
]]></description><guid isPermaLink="false">tag:flowerinthenight.com,2023-04-28:/blog/2023-04-28-hedge-memberlist/</guid><link>https://flowerinthenight.com/blog/2023-04-28-hedge-memberlist/</link><pubDate>Fri, 28 Apr 2023 00:00:00 JST</pubDate><title>Attempt to replace hedge’s member tracking with hashicorp/memberlist</title></item></channel></rss>