The Machine Learning Alchemist – Medium

The Machine Learning Alchemist

Decoding the Encoder

Since Google introduced the Transformer model with its self-attention mechanism, it has been pivotal in the advancements of Generative AI…

Mar 29, 2024

Part 3: Optimizing Performance with the ZeRO Optimizer

In the initial parts of this series, I discussed how to handle datasets too large for a single GPU. This involved distributing the datasets…

Nov 24, 2023

Part 2 — Scaling with the Distributed Data Parallel (DDP) Algorithm

In the first part of this series, I explored the Data Parallel (DP) algorithm, highlighting its efficiency in scenarios where all the GPUs…

Nov 15, 2023

Part 1: A Brief Guide to the Data Parallel Algorithm

In my exploration of machine learning, I quickly realized that advanced work demands far more computational power than a single GPU card…

Nov 13, 2023

The Machine Learning Alchemist

Two decades in tech, Masters from Georgia Tech, Microsoft Alum. Here to demystify machine learning. Join me, and we'll discover the secrets of ML alchemy.
