STL Algorithm Explained: STL Part II

This is the second part of a three-part series on STL decomposition and focuses on a sketch of the algorithm. It is not a rigorous treatment, but it is hopefully thorough enough to provide a mathematical understanding of how the various hyperparameters affect the decomposition. This post is a bit heavy on the mathematics. For an introduction to STL, please see Part I.

Selection Bias in Application Data

Selection bias is a known issue in data science, but its depth is not fully appreciated. The bias means a data set is far from an infallible, neutral source from which models can learn. Blindly running analysis on biased data risks infecting the model with the bias in the initial data, thereby perpetuating errors and discrimination. In this post, I hope to explain why selection bias is so challenging, particularly in application data such as applications for a financial loan.

Seasonal Decomposition Intro

I’ve spent a lot of time working with time series data. On nearly any project, one of the first issues that crops up is the seasonality of the data. It is easy enough to dump the data into a decomposition function and get good results, for example with seasonal_decompose in Python’s statsmodels, or with decompose or stl in base R.
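
As a rough illustration of what such a decomposition function does, here is a minimal sketch of a classical additive decomposition using only the Python standard library. It is not the code behind statsmodels or R; the function name classical_decompose and the synthetic monthly series are assumptions made up for this example, and the sketch assumes the series covers at least two full periods.

```python
import math


def classical_decompose(series, period):
    """Additive split: series = trend + seasonal + residual (hypothetical helper)."""
    n = len(series)
    half = period // 2

    # Trend: centered moving average. For an even period the two end points
    # of the window get half weight so the window stays centered.
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half : i + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[i] = total / period

    # Seasonal: average the detrended values at each position in the cycle,
    # then shift so the seasonal component sums to zero over one period.
    sums = [0.0] * period
    counts = [0] * period
    for i in range(n):
        if trend[i] is not None:
            sums[i % period] += series[i] - trend[i]
            counts[i % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    offset = sum(means) / period
    seasonal = [means[i % period] - offset for i in range(n)]

    # Residual: whatever is left; undefined where the trend is undefined.
    resid = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, resid


# Synthetic data (an assumption for illustration): four years of monthly
# values with a linear trend plus a 12-month sine wave.
series = [0.5 * i + 10 * math.sin(2 * math.pi * i / 12) for i in range(48)]
trend, seasonal, resid = classical_decompose(series, period=12)
```

On this noise-free series the trend comes back as the straight line and the seasonal component as the sine wave, with residuals near zero. The first and last half period of the trend are left undefined because the centered window does not fit there.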

Welcome!

Welcome to my personal blog. I am a data scientist with a background in physics. To study physics is to seek beautifully simple models that describe complex phenomena. I love taking the same approach when playing with data to develop new insights.
