Michael Berkin | Towards Data Science | 1.5 Years of Spark Knowledge in 8 Tips | My learnings from Databricks customer engagements | Dec 24, 2023
Michael Berkin | Towards Data Science | HyperOpt Demystified | How to automate model tuning with HyperOpt | Nov 8, 2022
Michael Berkin | Dev Genius | How to Automate Your Data Infrastructure with Code | What is Terraform and why should you use it | Sep 8, 2022
Michael Berkin | Towards Data Science | Demystifying the Parquet File Format | The default file format for any data science workflow | Aug 16, 2022
Michael Berkin | Towards Data Science | PySpark Data Skew in 5 Minutes | Exactly what you need, and no more | May 10, 2022
Michael Berkin | Towards Data Science | SQL to PySpark | A quick guide for moving from SQL to PySpark. | May 6, 2022
Michael Berkin | Towards Data Science | How does linear regression really work? | The math and intuition behind ordinary least squares (OLS) | Feb 23, 2022
Michael Berkin | Towards Data Science | 5 Advanced Tips on Python Objects | Python is an object oriented programming language but can behave strangely. If you come from other OOP languages, this post may benefit you | Feb 9, 2022
Michael Berkin | Towards Data Science | Don’t Use a T-Test for A/B Testing | How to use multiple linear regression to determine ATE and statistical significance | Feb 2, 2022
Michael Berkin | Towards Data Science | 5 Advanced Tips on Python Decorators | Do you want to write concise, readable, and efficient code? Well, python decorators may help you on your journey. | Jan 24, 2022