Most days in machine learning work are the same: coding, waiting for results, interpreting them, returning to coding. Plus, some intermediate presentations of one's progress to management*. But things mostly being the same does not mean that there's nothing to learn. Quite the contrary! Two to three years ago, I started a daily habit of writing down lessons learned from my ML work. To this day, each month still leaves me with a handful of small lessons. Here are three from this past month.
Connecting with humans (no ML involved)
As the Christmas holiday season approaches, the year-end gatherings start. Often, these gatherings consist of informal chats. Not much "work" gets done, which is natural, as these are commonly after-work events. Usually, I skip such events. This Christmas season, however, I didn't. I joined a few after-work get-togethers over the past weeks and just talked: nothing urgent, nothing profound. The socializing was good, and I had a lot of fun.
It reminded me that our work projects don't run only on code and compute. They also run on the fuel of working together with others over long stretches of time. Small moments (a joke, a quick story, a shared complaint about flaky GPUs) can refuel the engine and make collaboration smoother when things get tense later.
Just think about it from another perspective: your colleagues have to live with you for years to come, and you with them. If those years feel like something to "bear", that's not good. If they feel like something you do "together", that's definitely good.
So, when your company's or research institute's get-together invites roll into your inbox: join.
Copilot didn’t necessarily make me faster
This past month, I've been setting up a new project and adapting a number of algorithms to a new problem.
One day, while mindlessly wasting time on the web, I came across an MIT study** suggesting that (heavy) AI assistance, especially before doing the work yourself, can significantly lower recall, reduce engagement, and weaken identification with the outcome. Granted, the study used essay writing as the test task, but coding an algorithm is a similarly creative one.
So I tried something simple: I completely disabled Copilot in VS Code.
After some weeks, my (subjective, self-assessed, and thus heavily biased) result was: no noticeable difference for my core tasks.
For writing training loops, data loaders, the general training anatomy: I know them well. In these cases, AI suggestions didn't add speed; they sometimes even added friction. Just think of the time spent correcting AI output that is almost, but not quite, correct.
That finding contrasts somewhat with how I felt a month or two ago, when I had the impression that Copilot was making me more efficient.
Thinking about the differences between the two situations, I realized that the effect seems domain-dependent. When I'm in a new area (say, load scheduling), assistance helps me get into the field more quickly. In my home domains, the gains are marginal, and they may come with hidden downsides that take years to notice.
My current take on AI assistants (which I've only used for coding, through Copilot): they are good for ramping up in unfamiliar territory. For the core work that earns the majority of your salary, they're optional at best.
Thus, for the future, I can recommend that others:
- Write the first pass yourself; use AI only for polish (naming, small refactors, tests).
- Honestly check AI's proclaimed benefits: five days with AI off, five days with it on. For both periods, track tasks completed, bugs found, time to finish, and how well you can remember and explain the code a day later (a minimal tracking sketch follows this list).
- Toggle at your fingertips: bind a hotkey to enable/disable suggestions. If you’re reaching for it every minute, you’re probably using it too extensively.
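To make that self-check concrete, here is a minimal sketch of how one could log such an experiment. The file name, the field names, and the 1-to-5 recall score are my own hypothetical choices, not part of any established protocol:

```python
import csv
import datetime
from pathlib import Path

# Hypothetical log file and schema; adjust to taste.
LOG_PATH = Path("ai_ab_log.csv")
FIELDS = ["date", "ai_enabled", "tasks_completed", "bugs_found",
          "minutes_to_finish", "recall_score_1to5"]

def log_day(ai_enabled: bool, tasks_completed: int, bugs_found: int,
            minutes_to_finish: int, recall_score_1to5: int) -> None:
    """Append one day's self-assessment to the CSV log."""
    new_file = not LOG_PATH.exists()
    with LOG_PATH.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({
            "date": datetime.date.today().isoformat(),
            "ai_enabled": ai_enabled,
            "tasks_completed": tasks_completed,
            "bugs_found": bugs_found,
            "minutes_to_finish": minutes_to_finish,
            "recall_score_1to5": recall_score_1to5,
        })

# Example entry: a day with Copilot off.
log_day(ai_enabled=False, tasks_completed=3, bugs_found=1,
        minutes_to_finish=310, recall_score_1to5=4)
```

After ten days, a quick look at the CSV (or a pivot by `ai_enabled`) tells you whether the assistant actually moves any number you care about.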
Carefully calibrated pragmatism
As ML folks, we can overthink details. One example is which learning rate to use for training. Or whether to use a fixed learning rate versus decaying it at fixed steps. Or whether to use a cosine annealing strategy.
You see, even for the simple case of the learning rate, one quickly comes up with many options. Which should we choose? I went in circles on a version of this recently.
In these moments, it helped me to zoom out: what does the end user care about? Mostly latency, accuracy, stability, and, often primarily, cost. They don't care which LR schedule you chose, unless it affects those four. That suggests a boring but useful approach: pick the simplest viable option and stick to it.
A few defaults cover most cases. Baseline optimizer. Vanilla LR with one decay milestone. A plain early-stopping rule. If metrics are bad, escalate to fancier choices. If they’re good, move on. But don’t throw everything at the problem all at once.
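As an illustration, here is a minimal PyTorch sketch of those defaults. The toy model, the random data, the milestone epoch, and the patience value are hypothetical placeholders for your actual setup:

```python
import torch
from torch import nn

# Toy stand-ins so the sketch runs end to end; replace with your real
# model and data loaders.
torch.manual_seed(0)
model = nn.Linear(10, 1)
x_train, y_train = torch.randn(256, 10), torch.randn(256, 1)
x_val, y_val = torch.randn(64, 10), torch.randn(64, 1)
loss_fn = nn.MSELoss()

# Boring defaults: baseline optimizer, one LR decay milestone,
# plain patience-based early stopping.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30], gamma=0.1)  # single LR drop at epoch 30

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()
    scheduler.step()

    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(x_val), y_val).item()

    if val_loss < best_val - 1e-4:   # small tolerance against noise
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:   # plain early-stopping rule
            print(f"Stopping at epoch {epoch}, best val loss {best_val:.4f}")
            break
```

Only if the resulting metrics disappoint would I swap in something fancier, like cosine annealing or a tuned optimizer.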
* It seems that even at DeepMind, probably the most successful pure-research lab (at least formerly), researchers have management to satisfy.
** The study is available on arXiv at: https://arxiv.org/abs/2506.08872