Michael Berkin · Towards Data Science · 1.5 Years of Spark Knowledge in 8 Tips · My learnings from Databricks customer engagements · 8 min read · Dec 24, 2023
Michael Berkin · Towards Data Science · HyperOpt Demystified · How to automate model tuning with HyperOpt · 13 min read · Nov 8, 2022
Michael Berkin · Dev Genius · How to Automate Your Data Infrastructure with Code · What is Terraform and why should you use it · 10 min read · Sep 8, 2022
Michael Berkin · Towards Data Science · Demystifying the Parquet File Format · The default file format for any data science workflow · 8 min read · Aug 16, 2022
Michael Berkin · Towards Data Science · PySpark Data Skew in 5 Minutes · Exactly what you need, and no more · 5 min read · May 10, 2022
Michael Berkin · Towards Data Science · SQL to PySpark · A quick guide for moving from SQL to PySpark. · 4 min read · May 6, 2022
Michael Berkin · Towards Data Science · How does linear regression really work? · The math and intuition behind ordinary least squares (OLS) · 12 min read · Feb 23, 2022
Michael Berkin · Towards Data Science · 5 Advanced Tips on Python Objects · Python is an object oriented programming language but can behave strangely. If you come from other OOP languages, this post may benefit you · 5 min read · Feb 9, 2022
Michael Berkin · Towards Data Science · Don’t Use a T-Test for A/B Testing · How to use multiple linear regression to determine ATE and statistical significance · 7 min read · Feb 2, 2022
Michael Berkin · Towards Data Science · 5 Advanced Tips on Python Decorators · Do you want to write concise, readable, and efficient code? Well, python decorators may help you on your journey. · 5 min read · Jan 24, 2022