
How to Become a Machine Learning Engineer (Step-by-Step)


Did you know that machine learning engineers are currently the highest-paid tech professionals in the UK?

According to Levels.fyi, the average salary is almost £100k — higher than software engineers, AI engineers, and data scientists.

But it’s not just about the paycheque.

As a machine learning engineer, you get to tackle fascinating problems, experiment with cutting-edge tools, and have a real, positive effect on the world.

I can tell you from first-hand experience: it’s one of the most exciting and fulfilling jobs.

So in this article, I’ll give you a clear and simple learning roadmap to becoming a machine learning engineer, along with the best resources.

Let’s get into it!

Maths and Statistics

I have said it time and time again, but maths and statistics are by far the most important things you should learn if you want a career in machine learning or in data as a whole.

Technologies come and go (think blockchain and AI), but maths has remained a fundamental staple for centuries.

Fortunately, you don’t need to be a maths genius to work in machine learning; I can wholeheartedly confirm this from first-hand experience.

The level of maths required is equivalent to most of what you are taught in your final years of high school and the first year or two of an undergraduate STEM degree.

In general, there are three areas of maths you need to study:

  • Linear Algebra — To learn about matrices, vectors, eigenvalues and eigenvectors. These appear everywhere, from principal component analysis (PCA) to the tensors in TensorFlow; even a dataframe is essentially a matrix!
  • Calculus — To learn about differentiation, which is how algorithms like gradient descent and backpropagation work under the hood. These are used to train a large share of machine learning models, neural networks in particular.
  • Statistics — To understand probability, distributions, Bayesian statistics, the central limit theorem and maximum likelihood estimation. Statistics is the most valuable one out of the three, and I would focus most of my attention here.
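
To make the calculus point concrete, here is a minimal sketch of gradient descent fitting a straight line by minimising the mean squared error. The data, the learning rate and the function name `fit_line` are made up for illustration; real libraries do this with far more care.

```python
# Fit y = w * x + b by gradient descent on mean squared error.
# The data and hyperparameters below are made up for illustration.

def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Return (w, b) minimising the mean squared error of w * x + b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line on every data point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2) w.r.t. w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # close to 2.0 and 1.0
```

The two `grad_` lines are just the derivatives you would work out by hand in a first calculus course, which is why that level of maths is enough to follow what the training loop is doing.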

Resources:

I have a full article that explains the maths topics you need in more depth, with a more thorough breakdown.

How to Learn the Math Needed for Machine Learning

Python

Python is the lingua franca when it comes to machine learning; forget about R, learn Python. (Sorry to my R lovers out there!)

A common theme I have observed among my coaching clients and beginners I have spoken to is that they are trying to find the “best” course to learn Python.

I will say it again: the “best” course does not exist, so stop looking for it; the search is simply a form of procrastination. Any popular introductory Python course will work, as they all teach the same fundamentals.

Anyway, the main things you want to learn are:

  • Native data structures (dicts, tuples, lists)
  • For and while loops
  • If-else conditional statements
  • Functions and classes
  • Common libraries
  • Design patterns
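
As a rough illustration of how several of these basics fit together, here is a small sketch that uses a class, functions, a loop, a conditional, a tuple and a dict. The `Experiment` class and the scores are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """A hypothetical record of one model run (names are illustrative)."""
    name: str
    scores: list  # validation scores, one per fold


def mean(values):
    """Average of a non-empty list of numbers."""
    return sum(values) / len(values)


def best_experiment(experiments):
    """Return (name, mean score) for the experiment with the highest mean."""
    best = None  # will hold a (name, score) tuple
    for exp in experiments:
        score = mean(exp.scores)
        if best is None or score > best[1]:
            best = (exp.name, score)
    return best


runs = [
    Experiment("baseline", [0.70, 0.72, 0.71]),
    Experiment("more_features", [0.75, 0.77, 0.76]),
]
# A dict comprehension mapping each run name to its mean score.
summary = {exp.name: round(mean(exp.scores), 2) for exp in runs}
print(summary)
print(best_experiment(runs))
```

If you can read and write code like this comfortably, you have enough Python to move on to the machine learning packages.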

You also want to learn popular machine learning packages, such as:

  • NumPy — Numerical computing for arrays.
  • Pandas — Data manipulation and analysis.
  • Matplotlib — Data visualisation and plotting.
  • scikit-learn — Implementing fundamental ML algorithms.
  • SciPy — General scientific computing package.

SQL

As a machine learning engineer, you will be spending a reasonable amount of time working in SQL when trying to create datasets or do some feature engineering.

I probably spend around 30-40% of my time working in SQL as a machine learning engineer. That’s a lot, so you need to be well versed in it, probably more than you think.

The things to learn are:

  • SELECT * FROM, AS
  • ALTER, INSERT, CREATE, UPDATE, DELETE
  • GROUP BY, ORDER BY
  • WHERE, AND, OR, BETWEEN, IN, HAVING
  • AVG, COUNT, MIN, MAX, SUM
  • FULL JOIN, LEFT JOIN, RIGHT JOIN, INNER JOIN, UNION
  • CASE, IFF
  • DATEADD, DATEDIFF, DATEPART
  • PARTITION BY, QUALIFY, ROW_NUMBER()
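
To show how a few of these keywords combine when building features, here is a small sketch that runs SQL through Python's built-in `sqlite3` module. The tables, columns and values are invented for the example, and it assumes a SQLite version with window function support (3.25 or later). SQLite has no QUALIFY, so the window function result is filtered in a subquery instead.

```python
import sqlite3

# In-memory database with two made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL,
        ordered_on TEXT
    );
    INSERT INTO customers VALUES (1, 'UK'), (2, 'UK'), (3, 'FR');
    INSERT INTO orders VALUES
        (1, 1, 20.0, '2024-01-05'),
        (2, 1, 35.0, '2024-02-10'),
        (3, 2, 50.0, '2024-01-20'),
        (4, 3, 15.0, '2024-03-01');
""")

# Aggregate features per country: JOIN, GROUP BY, COUNT, AVG, CASE.
per_country = conn.execute("""
    SELECT
        c.country,
        COUNT(*) AS n_orders,
        AVG(o.amount) AS avg_amount,
        SUM(CASE WHEN o.amount >= 30 THEN 1 ELSE 0 END) AS n_large_orders
    FROM orders AS o
    INNER JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.country
    ORDER BY c.country
""").fetchall()

# Latest order per customer: ROW_NUMBER() with PARTITION BY.
# Dialects with QUALIFY can filter on rn directly; SQLite needs a subquery.
latest_orders = conn.execute("""
    SELECT customer_id, amount
    FROM (
        SELECT
            customer_id,
            amount,
            ROW_NUMBER() OVER (
                PARTITION BY customer_id ORDER BY ordered_on DESC
            ) AS rn
        FROM orders
    )
    WHERE rn = 1
    ORDER BY customer_id
""").fetchall()

print(per_country)
print(latest_orders)
```

Function names such as IFF, DATEADD and QUALIFY vary between databases, so check the documentation for whichever warehouse your company uses.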

Resources:

There are many free resources for SQL, so I don’t recommend spending money on a course unless you really want to. You can always use ChatGPT as well!

Machine Learning

To everyone’s surprise, we need to learn machine learning to be a machine learning engineer!

This is arguably the most fun part of the roadmap and what most people get into this field for. I get it, because it was the reason I decided to work in machine learning!

I would be lying if I said learning these algorithms was always fun. It does require a bit of mental effort and time to fully grasp all the concepts, but eventually, things will click, and it will be well worth it.

The key algorithms and concepts you need are:

  • Linear, logistic and polynomial regression.
  • Generalised linear models and generalised additive models.
  • Decision trees, random forests and gradient-boosted trees.
  • Support vector machines.
  • K-means clustering and K-nearest neighbours.
  • Feature engineering, particularly how to deal with categorical features.
  • Evaluation metrics for different types of problems.
  • Regularisation, bias vs variance tradeoff and cross-validation.
  • Gradient descent and backpropagation.
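
To give a feel for how simple some of these algorithms are at their core, here is a minimal sketch of a K-nearest neighbours classifier together with an accuracy metric, written in plain Python. The toy points and labels are made up, and it assumes Python 3.8 or later for `math.dist`. In practice you would use scikit-learn rather than writing this yourself.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Toy data: two made-up, well-separated groups of 2D points.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]
test_points = [(1.2, 1.4), (6.8, 6.2)]
test_labels = ["a", "b"]

predictions = [knn_predict(train_points, train_labels, p) for p in test_points]
print(predictions, accuracy(predictions, test_labels))
```

Notice that the model is evaluated on points it was not trained on. That habit is the starting point for cross-validation and for reasoning about the bias vs variance tradeoff.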

Resources:

  • Machine Learning Specialisation by Andrew Ng — This is the first ML course I took, and I think it is probably the best one out there. Andrew is honestly the best teacher, and in my opinion everyone should take this course.
  • The Hundred-Page ML Book — Concise with practical insights into building ML models and the core theory behind them. Lovely nighttime reading.
  • Hands-On ML with Scikit-Learn, Keras, and TensorFlow — If I had to recommend only one book for learning machine learning, this would be it! This book is the GOAT and covers virtually every topic you would need as an entry/mid-level machine learning engineer.

Deep Learning

Being honest, the fundamental machine learning algorithms will cover the majority of models you will build in your career.

I still use regular regression models most of the time!

Deep learning is beneficial in scenarios such as natural language processing and computer vision, but its use in my daily work is minimal beyond these areas.

However, take this with a pinch of salt, given that I specialise in time series forecasting and optimisation problems, which are notoriously tricky for deep learning to perform well in.

With all that said, deep learning is an area all machine learning engineers should be somewhat aware of as it is a core part of the field.

The areas you want to study are:

  • Neural Networks — The algorithm that put machine learning on the map. I am sure many of you have heard of this algorithm.
  • Convolutional Neural Networks — These are used for computer vision and image detection. The key difference is that they use the convolution operation to “pre-select” information before passing it into a regular neural network.
  • Recurrent Neural Networks — A little bit obsolete now, but they were the original deep learning architecture for sequence data like time series and natural language. The most popular application you may have heard of is sequence-to-sequence modelling.
  • Transformers — The current state-of-the-art architecture behind all the AI hype and growth. It comes from the famous paper “Attention Is All You Need”, which I highly recommend you read!
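
The core operation inside a transformer is scaled dot-product attention, softmax(QKᵀ / √d_k)V. Here is a minimal plain-Python sketch of that one formula, with made-up inputs. It leaves out everything else a real transformer has (learned projections, multiple heads, masking), so treat it as an illustration of the idea rather than a usable layer.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Each argument is a list of vectors (lists of floats).
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)  # non-negative, sums to 1
        # Output is the weighted average of the value vectors.
        dim = len(values[0])
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(dim)
        ])
    return outputs


# Made-up inputs: 2 queries, 3 key/value pairs, dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
out = attention(Q, K, V)
print(out)
```

Each output row is a weighted mix of the value vectors, with more weight on the values whose keys look most like the query. That is the whole "attention" idea in a few lines.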

Software Engineering

Given the title is machine learning “engineer”, you need to know software engineering best practices as this is important when deploying your models to production.

When I was trying to become a machine learning engineer, I really underestimated the engineering part. I would even argue now that it’s more important than the theoretical machine learning knowledge.

Theory is just theory; where you really earn your money is by helping the company and business make decisions with your algorithms. For that, you need to know software engineering.

The areas you need to know are:

  • Data Structures and Algorithms — For passing interviews and helping you write better code. Learn the basics and make sure to practise.
    • Arrays
    • Linked lists
    • Queues
    • Sorting
    • Binary search
    • Trees
    • Hashing
    • Graphs
  • System Design — For passing interviews and understanding how to deploy machine learning algorithms at scale. Once again, learn the basics.
    • Networking
    • APIs
    • Caching
    • Proxies
    • Storage
  • Production Code — Writing well-tested and efficient code through things like typing, linting, testing and using principles such as DRY, KISS and YAGNI. This is probably the most crucial part to learn, as it is the most applicable to the job.
  • APIs — The majority of software operates using APIs, and many machine learning models are served as API endpoints. Understand how they work and what the different types are.
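
As a small taste of both the data structures and the production code points, here is a sketch of binary search written with type hints, a docstring and a test. The test function is in the style a runner such as pytest would pick up; the names are my own choices for this example.

```python
from typing import Optional, Sequence


def binary_search(items: Sequence[int], target: int) -> Optional[int]:
    """Return an index of `target` in the sorted sequence `items`, or None if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1  # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return None


def test_binary_search() -> None:
    """Small test covering a hit, an edge element, a miss and an empty input."""
    data = [1, 3, 5, 7, 9, 11]
    assert binary_search(data, 7) == 3
    assert binary_search(data, 1) == 0
    assert binary_search(data, 4) is None
    assert binary_search([], 4) is None


test_binary_search()
```

The algorithm itself is interview material, but the habits around it (types, a clear docstring, tests for the edge cases) are what you will use every day on the job.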

Resources:

  • NeetCode.io — Great introductory, intermediate and advanced data structure and algorithms courses, as well as system design courses. I 100% recommend this platform to anyone learning software engineering fundamentals.
  • LeetCode & HackerRank — Platforms to practise for interviews. I am sure many of you have heard of “grinding LeetCode”; you don’t need to do that for machine learning engineer jobs as much as for software engineering jobs. However, you should know the basics. I recommend working through the NeetCode 150.
  • Software Engineering for Data Scientists — Like it says on the tin, a book specifically designed for data scientists to learn software engineering. Great alternative to learn all the software engineering skills if you don’t like courses.

MLOps

A model in a Jupyter Notebook has literally no business value.

It’s much better to have something in production making subpar decisions that still benefit the business than a flashy neural network with unreal accuracy sitting in a notebook doing nothing.

Therefore, if you want to be a sound machine learning engineer, you need to be able to deploy your models so that they actually benefit the company financially.

To do this, you need to learn the following:

  • Cloud — Learn cloud technologies like AWS, GCP or Azure. All the machine learning models I have worked on have been deployed on the cloud, and this is only going to increase in the future. AWS is the most popular, so I recommend that’s the one you learn.
  • Containerisation — Learn Docker and Kubernetes; this is necessary for running your models in the cloud.
  • Version Control — Learn Git and GitHub; there is no way around it. This is how almost all software is built.
  • Shell/Terminal — You will be working in your terminal a lot, so knowing basic Bash/Zsh is essential.
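
Whatever cloud or container setup you end up with, the thing you deploy is usually a small service that takes a request and returns a prediction. Here is a minimal sketch of that idea using only Python's standard library. The "model" is a hypothetical linear function with made-up weights, and there is no error handling; a real deployment would normally use a proper web framework and a production server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "trained model": a linear model with made-up coefficients.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Linear prediction: bias + sum(weight * feature)."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


def handle_request(body: bytes) -> bytes:
    """Turn a JSON request like {"features": [2.0, 4.0]} into a JSON response."""
    payload = json.loads(body)
    return json.dumps({"prediction": predict(payload["features"])}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body, run the model, and reply with JSON.
        length = int(self.headers.get("Content-Length", 0))
        response = handle_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8000) -> None:
    """Start the server; blocks until interrupted. Not called automatically."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()


# Exercise the request logic directly, without starting a server.
print(handle_request(b'{"features": [2.0, 4.0]}'))
```

Calling `serve()` would start the endpoint. Inside a container, that call would typically be the command the container runs, and Git is how the code gets from your laptop to wherever the container is built.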

Resources:

  • Practical MLOps — This is probably the only book you need to understand how to deploy your machine learning model and all the associated topics. I use it more as a reference text, but it teaches almost everything you need to know.
  • Designing Machine Learning Systems — Another great book and resource to vary your information sources. This is by Chip Huyen, who is probably the leading expert on AI/ML production systems.
  • freeCodeCamp — A variety of resources covering literally every software engineering and MLOps topic.

Studying everything here will give you all the knowledge required to be a machine learning engineer; however, that is not enough on its own to land you a job.

You need to demonstrate your skills by building a solid portfolio with the right projects.

If you want to know exactly how to do that, check out the article below. I will see you there!

STOP Building Useless ML Projects – What Actually Works

Another Thing!

I offer 1:1 coaching calls where we can chat about whatever you need — whether it’s projects, career advice, or just figuring out your next step. I’m here to help you move forward!

1:1 Mentoring Call with Egor Howell
Career guidance, job advice, project help, resume review
topmate.io
