python - Is it possible to combine multiple partially fit estimators in sklearn?
I have a lot of data and want to parallelize estimator fitting by splitting the data and fitting multiple estimators, running either in multiple threads or on multiple machines.
Some estimators provide a partial_fit API for out-of-core learning (e.g. PassiveAggressiveClassifier).
Is it possible to have multiple estimators fit partially, and then combine the individual fits into a single estimator?
Not using the standard API. You can average the coef_ and intercept_ of the fitted estimators, and that will produce a meaningful estimator. Do you want to parallelize on a single machine or over a network? There might be more efficient options for you, but they require a little more work. There are parallel implementations of SGD, but these pay off only for huge data sets. How large is your data (number of samples, number of features, sparsity)?
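The averaging of coef_ and intercept_ suggested above could look roughly like the sketch below. It is only an illustration under several assumptions: SGDClassifier stands in for whichever linear estimator with partial_fit you use, the data is synthetic (make_classification), the helper average_linear_estimators is a hypothetical name and not part of scikit-learn, and the plain loop stands in for workers that would really run in separate threads, processes, or machines.

```python
import copy

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier


def average_linear_estimators(estimators):
    """Hypothetical helper (not a scikit-learn API): copy the first fitted
    estimator and overwrite its coef_ and intercept_ with the element-wise
    mean over all given estimators."""
    combined = copy.deepcopy(estimators[0])
    combined.coef_ = np.mean([est.coef_ for est in estimators], axis=0)
    combined.intercept_ = np.mean([est.intercept_ for est in estimators], axis=0)
    return combined


# Synthetic data, only for illustration.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Every worker must be told the full label set, because a single
# partition might not contain every class.
classes = np.unique(y)

# Fit one estimator per partition. This loop is sequential; in practice
# each iteration would run in its own thread, process, or machine.
estimators = []
for X_part, y_part in zip(np.array_split(X, 4), np.array_split(y, 4)):
    est = SGDClassifier(random_state=0)
    est.partial_fit(X_part, y_part, classes=classes)
    estimators.append(est)

# Combine the individual fits into one estimator and evaluate it.
combined = average_linear_estimators(estimators)
accuracy = combined.score(X, y)
print(f"accuracy of the averaged estimator: {accuracy:.3f}")
```

Averaging like this only makes sense for linear models whose parameters live in the same feature space, and each worker needs the same classes argument so that the shapes of coef_ and intercept_ match across estimators.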