
# Naive Bayes should generate a prediction given missing features (scikit-learn)

2017-03-20 23:29:08

Since Naive Bayes uses probabilities to make a prediction and treats features as conditionally independent of each other given the class, it makes sense that the model could still make a prediction when some features are missing from the test data.

I know that it is common practice to impute missing data, but why do this when Naive Bayes should be able to make a prediction even though some features are missing?

Can this be implemented in scikit-learn? I tried a test set with fewer features and got a ValueError because the shapes are not aligned.

So theoretically this is possible, but is it possible in scikit-learn?

• Your question is sensible. In the classical Naive Bayes classifier (as in sklearn), the posterior probability is calculated by combining the conditional probabilities of all the features in the dataset: a product of per-feature terms, or in practice a sum of their logarithms. Even though the features are treated as conditionally independent, all the features are always used in this setup to learn the classification probability. Once the model has been learned, you still need all those features to calculate the posterior for a new observation. Conditional independence is just an assumption made so that the statistics and math stay tractable.
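
For reference, the posterior described above can be written as follows, where $y$ is the class and $x_1, \dots, x_n$ are the features (the notation is chosen here for illustration, not taken from the scikit-learn documentation):

$$
P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)
$$

In log space the product becomes a sum, $\log P(y) + \sum_{i=1}^{n} \log P(x_i \mid y)$. The fitted model expects a term for every one of the $n$ features, which is why a test set with fewer columns fails with a shape error.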

But by slightly modifying the way in which the posterior is calculated, you can use a Bayesian approach to make predictions even when certain features are absent. Making predictions this way is still ongoing work. You may want to have a look at this paper in which a Bayesian approach is applied to astronomy to do classification with
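
One way to sketch the modification mentioned above, for the Gaussian case: under the naive assumption each feature's likelihood integrates to 1 over its possible values, so marginalizing out a missing feature amounts to dropping its term from the sum of log-likelihoods. The helper below is a minimal illustration, not part of scikit-learn. `predict_with_missing` is a hypothetical name, missing values are assumed to be encoded as NaN, and the model is assumed to be a fitted `GaussianNB`. As far as I know, `GaussianNB.predict` itself rejects NaN input, hence the separate function.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB


def predict_with_missing(model, X):
    """Predict classes for rows of X, ignoring features that are NaN.

    Hypothetical helper (not a scikit-learn API). Assumes `model` is a
    fitted GaussianNB and that missing values are encoded as NaN.
    """
    X = np.asarray(X, dtype=float)
    means = model.theta_                    # shape (n_classes, n_features)
    # Per-class variances: `var_` in recent scikit-learn, `sigma_` in older releases.
    variances = model.var_ if hasattr(model, "var_") else model.sigma_
    log_prior = np.log(model.class_prior_)  # shape (n_classes,)

    observed = ~np.isnan(X)                 # shape (n_samples, n_features)
    X_filled = np.where(observed, X, 0.0)   # placeholder values, masked out below

    # Gaussian log-density of every feature under every class,
    # shape (n_samples, n_classes, n_features).
    diff = X_filled[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (np.log(2.0 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])

    # Drop the terms of missing features, then sum over the remaining ones.
    log_pdf = np.where(observed[:, None, :], log_pdf, 0.0)
    joint_log_likelihood = log_prior[None, :] + log_pdf.sum(axis=2)
    return model.classes_[np.argmax(joint_log_likelihood, axis=1)]


# Toy data (made up for illustration): two well-separated classes, three features.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, size=(50, 3)),
                     rng.normal(5.0, 1.0, size=(50, 3))])
y_train = np.array([0] * 50 + [1] * 50)
model = GaussianNB().fit(X_train, y_train)

# Each test row has one missing feature.
X_test = np.array([[0.1, np.nan, -0.3],
                   [np.nan, 5.2, 4.8]])
preds = predict_with_missing(model, X_test)
print(preds)
```

On rows without any NaN the helper should agree with `model.predict`, since it computes the same joint log-likelihood. A row in which every feature is missing falls back to the class prior. Whether ignoring a feature is preferable to imputing it depends on why the value is missing, so treat this as a starting point rather than a drop-in replacement for imputation.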

2017-03-20 23:57:20