I wanted to give auto-sklearn a try, but I don’t like installing many packages on my machine. Conda keeps the environment from turning into a mess, although it does not deal with the issue of running unknown code on my computer. Hence, I decided to try it on Kaggle.

Table of Contents

  1. What can go wrong?

I followed the installation instructions and did not expect any issues. After all, the instructions consist of only two steps:

!curl https://raw.githubusercontent.com/automl/auto-sklearn/master/requirements.txt | xargs -n 1 -L 1 pip install
!pip install auto-sklearn

What can go wrong?

Swig. Swig can go wrong. I was installing the packages, and at some point, I saw this error message. Pip was installing the pyrfr package but failed because of a problem with swig.

building 'pyrfr._regression' extension
    swigging pyrfr/regression.i to pyrfr/regression_wrap.cpp
    swig -python -c++ -modern -features nondynamic -I./include -o pyrfr/regression_wrap.cpp pyrfr/regression.i
    unable to execute 'swig': No such file or directory
    error: command 'swig' failed with exit status 1

It turns out that Kaggle has some old swig version installed on their machines, and the build cannot execute it. We can quickly fix that by replacing it with swig 3.0 and linking it under the name swig:

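# Remove the preinstalled swig package; -y answers the confirmation prompt
!apt-get remove -y swig
# Install swig 3.0, which provides the /usr/bin/swig3.0 binary
!apt-get install -y swig3.0
# Expose it under the name "swig", which the pyrfr build calls
!ln -s /usr/bin/swig3.0 /usr/bin/swig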

After that change, I ran the installation commands again, and everything worked fine.
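To avoid waiting for pip to fail in the middle of the pyrfr build, it may help to check from a notebook cell whether a swig executable is visible on the PATH before starting the installation. The snippet below is only a minimal sketch: the helper name swig_status is my own (hypothetical), and it assumes that the build looks up the executable by the plain name swig, as the error message above suggests.

```python
import shutil
import subprocess


def swig_status(executable="swig"):
    """Return (path, version_text) for the executable, or (None, None) if it is not on PATH.

    `swig_status` is a hypothetical helper written for this post, not part of any library.
    """
    path = shutil.which(executable)
    if path is None:
        return None, None
    # `swig -version` prints the version banner; check=False so a non-zero exit does not raise.
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    return path, (result.stdout or result.stderr).strip()


path, version = swig_status()
if path is None:
    print("swig is not on PATH, so the pyrfr build will most likely fail")
else:
    print(f"swig found at {path}")
    print(version)
```

If the check reports that swig is missing, run the three apt-get/ln commands first and only then start the pip installation.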
