Dangus's Blog


N-Grams - NLP in Python

Beyond single words, text insights with n-grams

Natural Language Processing with Python - N-Grams. When analyzing a text, it is crucial to identify the words that are relevant. We can assume that a word is more relevant if it appears mor...

Data Leakage in Data Science

The model cheats by seeing unwanted information

Data Leakage in Data Science. What is data leakage? Data leakage in data science occurs when information from the test (or validation) set inadvertently “leaks” into the training process. It esse...

Gitds-flow - Daily Cheatsheet Work Flow

Cheat sheet to master gitds-flow

Gitds-flow Explanation. Workflow: detailed description of the gitds-flow workflow. (Init): Repository with two principal branches: ⚛️main (production) and 🧪dev (development). New developme...

Gitds-flow - Data Science Git Work Flow

Introduction to gitds-flow

Gitds-flow (data science Git workflow). This is a Git workflow proposed by me, Daniel Guitron (aka @danguitron), as a minimalist way to work on your data science projects. ...

Decision Trees: Your Data’s Choose

Why You Should Too

🌳 Decision Trees: Your Data’s “Choose Your Own Adventure” Book (Spoiler: The Dragon is Overfitting) “Why Decision Trees Are Like Dating Apps” Imagine swiping left/right based on: 🐶 Pet pre...