
Feature Selection with Hierarchical Clustering for Interpretable Models

Create a short list of features using this statistical method (Python Tutorial)

Conor O'Sullivan
TDS Archive
11 min read · Apr 1, 2024

In industry, a dataset can contain hundreds or even thousands of potential model features. Dimensionality reduction methods like PCA can shrink that number, but they leave you with transformed features that are hard to explain. Thankfully, feature clustering can help you create a short list of the original features and build an interpretable model.

We will:

  • Apply hierarchical clustering using Python
  • Explain the theory behind this method
  • Discuss its benefits over other clustering methods for feature selection

We end by building some intuition for how the method works using correlation heatmaps. You can also find the project on GitHub.

You may also enjoy this video on the topic. And, if you want to learn more, check out my course — XAI with Python. You can get free access if you sign up to my newsletter.
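To make the idea of feature clustering concrete, here is a minimal sketch of one common way to do it with SciPy. It is an illustration under stated assumptions, not necessarily the exact approach used in the tutorial: the distance between two features is taken to be 1 - |correlation|, clusters are merged with average linkage, the tree is cut at a distance of 0.5, and one representative per cluster is picked by its correlation with the target. The helper names (`cluster_features`, `pick_representatives`), the threshold, and the synthetic data are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_features(X, names, threshold=0.5):
    """Group features whose correlation distance falls below `threshold`.

    Assumption: distance between two features is 1 - |Pearson correlation|.
    """
    corr = np.corrcoef(X, rowvar=False)            # feature-by-feature correlations
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)  # guard against tiny negatives
    np.fill_diagonal(dist, 0.0)
    dist = (dist + dist.T) / 2.0                   # enforce exact symmetry
    # linkage expects a condensed (upper-triangle) distance vector
    Z = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(Z, t=threshold, criterion="distance")
    clusters = {}
    for name, label in zip(names, labels):
        clusters.setdefault(int(label), []).append(name)
    return clusters


def pick_representatives(X, y, names, clusters):
    """Keep, per cluster, the feature most correlated with the target (an assumption)."""
    index = {name: i for i, name in enumerate(names)}

    def strength(name):
        return abs(np.corrcoef(X[:, index[name]], y)[0, 1])

    return [max(members, key=strength) for members in clusters.values()]


# Synthetic data: two latent signals, each observed twice with a little noise
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = rng.normal(size=n)
names = ["a1", "a2", "b1", "b2"]
X = np.column_stack([
    a + 0.1 * rng.normal(size=n),
    a + 0.1 * rng.normal(size=n),
    b + 0.1 * rng.normal(size=n),
    b + 0.1 * rng.normal(size=n),
])
y = a + b + 0.1 * rng.normal(size=n)

clusters = cluster_features(X, names)
short_list = pick_representatives(X, y, names, clusters)
print(clusters)
print(short_list)
```

Because each selected feature is one of the original columns, the resulting short list stays interpretable, unlike the components PCA produces. In practice the threshold and linkage method are choices you would tune for your own dataset.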


Published in TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Written by Conor O'Sullivan

PhD Student | Writer | Houseplant Addict | Follow me for articles on IML, XAI, Algorithm Fairness and Remote Sensing
