Arch-Engineer
How To Turn A Frustrating Day Into A Win
Software engineering consists of a lot of skills, from deeply understanding design and computer science concepts, to knowing the quirks of your favorite tool, to the microscopic habits of getting back into focus after a trip to the bathroom. No curriculum can train it all. The training must come from within. And so, years ago, in The 7 Mistakes That Cause Fragile Code, I wrote down an idea for learning the concepts no-one else can teach you.
Every time a code change is hard, reflect on why.
Well, a few days ago I had quite the opportunity to put this into practice. Because I wrote....
....a web crawler.
(If hearing that didn't make you groan, and you've actually written a web crawler before, then please teach me your secrets.)
I had been working on the final report for a consulting project. I wanted a PDF that screamed grandeur and majesty. And I had already heard about the right tool for the job: Remarq.
Problem was, I had to paste my source into their editor every time I wanted to make a change. And wait. They had no API, and the only automation available seemed incompatible with using git. I got mad after one too many clicks, had an hour before my next call, and channeled my annoyance into building a new automation. Though I had sworn off web development in 2009, accepting its unavoidable importance only in 2020, I had recently built a couple of similar scripts for more complicated websites, and I felt confident about this one.
That hour passed, and I was still stuck figuring out how to programmatically give input into that textarea-which-is-not-a-textarea. I had little luck when I resumed later: I tried sending individual keypresses, custom POST requests, and calling the page's own JavaScript. Failure, failure, failure. I grabbed Mirdin's frontend expert Nils on a call, borrowing what was left of his brain after a 10-hour workday, but it was of little help. Finally, as the four-hour mark drew near, I realized I had made a silly error in trying to call the page's own JavaScript, and got that approach seemingly working.
How had this simple job eaten up my whole afternoon? Time to pull out: The Five Why's
The Five Why's, a teaching of the Toyota Production System (aka Lean), is that, when something goes wrong, you ask "why?" five times, eventually reaching the deeper cause that can be fixed. In the words of Toyota manager John Shook, as quoted in The Toyota Way:
The point is not to literally ask why 5 times; it means pursue understanding of causality. As for how to understand reality, use the simplest means possible, starting with the five whys. When you need more than simply asking why a few times, pull out your more sophisticated tools. [...] confirm what's really going on with any situation and understand the causes of it [...]
And so, I started writing down my recollections of each tortured turn in that afternoon's struggle, asking the two reflection questions for each: what slowed me down, and what could I have done to avoid it.
And I was shocked by just how much I learned.
Also by how long it took. I actually got most of the benefit by thinking about it before I even sat down. Writing down answers to the reflection questions was valuable, but writing down the play-by-play was not, except as a form of therapy. But I kept the detail going because I had learned so much that I already knew I wanted to write this newsletter about it. Also, I had received an upsetting E-mail right after finishing the web crawler, and really needed that therapy. You can read the whole reflection here.
In writing this reflection, I gained the understanding whose lack had slowed me down earlier that day. There's an episode where I had my confidence in my understanding of TypeScript's import system shaken and was reduced to trial-and-error. When I went back to it that night, I discovered that imports worked exactly like I thought they did; I just hadn't cleared my mental cache after switching library versions. With this restored confidence, the next time unexpected import behavior happens, I'll know the problem is in the library, and not in my head.
I learned which things I should accept require more exposure. The mistake I made trying to run on-page JavaScript, not noticing that a function which could take either a string or a unit => string would process the two in opposite manners, is one of many things I must train my mental alert system to detect when I work in a language whose type system is a titanium wall full of bullet holes.
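I no longer have the exact API in front of me, so here is a hedged sketch of the shape of that trap; the function name and both behaviors are made up for illustration, not taken from the real library:

```typescript
// Hypothetical sketch (not the real API I was calling): one parameter that
// accepts either a string or a thunk returning a string, but treats the two
// in opposite ways. The type checker happily accepts both, so nothing warns you.
function setEditorValue(value: string | (() => string)): string {
  if (typeof value === "function") {
    return value(); // thunk: its result is inserted verbatim
  }
  return JSON.stringify(value); // plain string: quoted and escaped first
}

setEditorValue("# Title");       // '"# Title"' (quoted -- not what I wanted)
setEditorValue(() => "# Title"); // '# Title'
```

A union type like this is exactly the kind of place where a language with a stronger type system would force the two cases apart at the call site; here, only exposure trains you to see it.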
But the biggest lessons were learning how I could increase iteration speed. And how, to do so, I just needed to apply some tools that I already knew about.
I was partway through writing that I'd like some kind of REPL that lets me try running commands on a page without restarting the script when I realized I was describing a debugger. I had been so conditioned into assuming debuggers don't work that I no longer look for them when working in a new tech stack. But this case — a single-process program (albeit one controlling a browser) in the mature Node ecosystem — is one where I should fully expect to find working debuggers.
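A minimal sketch of what that looks like in practice, assuming a Puppeteer-style page object (the Page type below is a stand-in for whatever browser-automation library is driving the page, not any particular library's API):

```typescript
// Node's built-in inspector turns any script into the "REPL on a live page"
// I was wishing for. Launch with:
//   node --inspect-brk crawler.js
// then attach from chrome://inspect in Chrome.
type Page = { evaluate: (fn: () => unknown) => Promise<unknown> };

async function pokeAtEditor(page: Page): Promise<void> {
  debugger; // with an inspector attached, execution pauses here, and the
            // console lets you try page.evaluate calls without restarting
  await page.evaluate(() => "probe");
}
```

Without an inspector attached, the debugger statement is a no-op, so the same script runs unattended once the exploration is done.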
For a long phase, I had given up on trying to programmatically edit this not-a-textarea, and instead wanted to directly send a POST request mimicking the action of the page. But just looking at the request headers in Chrome and translating to the Axios request library didn't do the trick.
After enough HTTP 422 "Unprocessable Entity" responses, and after discovering that the website's normal form submission sent a malformatted request with two different values for the same field, I stopped assuming that the server would treat any two packets the same unless they were byte-for-byte identical. So I needed to configure the Axios library to send a packet as similar as possible to what the browser was sending. But I couldn't get Axios to give me a low-level description of what it was actually sending. And on the browser side, I had stopped trusting the Chrome debugger's report as well, after discovering that many of its views omitted fields. I was again reduced to trial and error.
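One piece of that puzzle, at least, has a standard-library answer: a plain object body has no way to express a duplicated field, but URLSearchParams does, and Axios accepts a URLSearchParams as a request body. A sketch, with placeholder field names rather than the site's real ones:

```typescript
// { "doc[body]": "..." } can only carry one value per key, but URLSearchParams
// (standard in Node and browsers) can repeat a field, reproducing the
// browser's duplicated-field submission exactly.
const body = new URLSearchParams();
body.append("doc[body]", "old value"); // placeholder field name
body.append("doc[body]", "new value"); // same field, second value

const encoded = body.toString();
// "doc%5Bbody%5D=old+value&doc%5Bbody%5D=new+value"
```

Passing this object as the request body also sets the urlencoded content type for you, which is one less header to mimic by hand.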
During the reflection, I realized that I already had all the skills needed both to inspect exactly which packets were being sent and to try many variations in less time. And the barriers that had previously kept me from using those skills would be trivial to overcome.
To inspect packets, one should use Wireshark. Why didn't I? Because I knew that using Wireshark meant drowning in information. That just meant I should practice using its filters, which I've now done.
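For the record, these are the kinds of display filters I practiced with; the hostname is a placeholder, and for HTTPS traffic they only work after pointing Wireshark's TLS settings at a key log file (browsers can write one via the SSLKEYLOGFILE environment variable):

```
# Only outgoing form submissions, not the whole firehose:
http.request.method == "POST"

# Narrow to one host (placeholder name):
http.host contains "example.com"
```

Two filters like these cut the capture from thousands of packets to a handful, which is the difference between drowning and reading.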
To see how much I could deviate from the page's packets, I knew about a Chrome extension called Tamper for intercepting and modifying requests live. But I didn't have it set up and had never used it before. Now I've done both.
Beyond these lessons, I made another discovery: I hadn't actually gotten the web crawler working! Due to a crazy quirk of that page, it had entered a state where it appeared to submit correctly, but would actually send the server into an infinite loop rendering the PDF. With fresh eyes, I saw the flaw in my old solution: for some unfathomable reason, the page had a hidden duplicate of the main form. A few minutes later, I pressed Enter to run my script, and had a beautiful PDF.
This level of reflection, complete with installing every tool I could or should have used, is not something to do every day. Nor is watching a recording of yourself coding, another strategy I've seen recommended. But for every micro-struggle, there is a micro-learning that can be had when we step away from the task at hand. For when you're focused on doing, you're not focused on improving. After all, the practice is not the performance.