Hi all,
I am interested in using TabPFN in an arbitrary classification setting, i.e. more than 500 features, more than 10,000 rows/training samples, and more than 10 classes. I do not expect TabPFN to outperform trained methods on these problems, but I hope to be able to trade off training time against performance (e.g. AutoGluon may give me the best performance, but takes quite long to fit).
As mentioned above, there are three major hurdles to applying TabPFN to arbitrary classification settings, namely:
Too many features
Too many classes
Too many samples
I am sure that someone has already thought about or even tried this, so I would really appreciate any thoughts, experience or relevant resources anyone could share!
As a starting point, you can find my first (naive) ideas on how I would try to solve the hurdles below:
Too many features:
Apply a relatively cheap dimensionality reduction technique like PCA. Any recommendations on what techniques I should try out besides PCA?
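For what it's worth, here is a minimal sketch of the PCA idea as a preprocessing step. Since TabPFN may not be installed, `LogisticRegression` is used as a stand-in for `TabPFNClassifier`; in practice you would swap in the real model. The 500-feature limit and the choice of 100 components are illustrative, not prescribed:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression  # stand-in for TabPFNClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2000))  # 2000 features, well above TabPFN's ~500 limit
y = rng.integers(0, 3, size=1000)

# Project down to 100 components before fitting the downstream classifier.
pipe = make_pipeline(PCA(n_components=100), LogisticRegression(max_iter=200))
pipe.fit(X, y)
print(pipe.named_steps["pca"].transform(X).shape)  # reduced feature matrix
```

The same pipeline pattern works for any other reducer (random projections, feature agglomeration, etc.), which makes it easy to compare alternatives to PCA.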
Too many classes:
In some preliminary research, I found the many_classes_classifier, which I would use as a starting point.
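In case it helps frame the discussion: one simple way to stay under a base model's class limit is to reduce the many-class problem to several smaller subproblems, e.g. one-vs-rest. This is only a sketch of that general idea (not the actual many_classes_classifier implementation), again with `LogisticRegression` standing in for `TabPFNClassifier`:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # stand-in for TabPFNClassifier
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = rng.integers(0, 25, size=500)  # 25 classes, above TabPFN's 10-class limit

# Each binary subproblem trivially fits within the base model's class limit.
clf = OneVsRestClassifier(LogisticRegression(max_iter=200))
clf.fit(X, y)
print(len(clf.estimators_))  # one binary model per class
```

The obvious downside is fitting one model per class; grouping classes into chunks of up to 10 would need fewer base models at the cost of a more involved decoding step.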
Too many samples:
Naively, I could just randomly subsample the data to 10,000 rows and fit TabPFN on this smaller train set. This completely ignores the samples that didn't get picked for the smaller train set. To improve performance, I could bootstrap multiple of these smaller train sets independently and then use some form of ensembling (e.g. ensemble selection) to combine the predictions. Any better ideas?
Any help or pointers to the right resources are deeply appreciated!
Update: I did find this paper, which has some interesting ideas.