Here I report only the parallel-coordinate plot of the resampling results across the models. Each line corresponds to a common cross-validation holdout.
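To make concrete what one line in such a plot encodes, here is a minimal sketch in plain Python. It is not the code behind the plot: the toy data, the helpers `make_folds` and `resample_table`, and the two toy models ("mean" and "line") are hypothetical and purely illustrative. The point is only that every model is scored on the same holdouts, so each row of the resulting table is one line of the plot.

```python
import math
import random


def make_folds(n, k, seed=0):
    """Split indices 0..n-1 into k disjoint holdouts shared by all models."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Toy baseline: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Toy model: least-squares fit of a straight line."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def rmse(model, xs, ys):
    """Root mean squared error of a fitted model on (xs, ys)."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys))


def resample_table(xs, ys, fitters, folds):
    """Return one row per holdout: {model name: RMSE on that holdout}."""
    table = []
    for hold in folds:
        hold_set = set(hold)
        train = [i for i in range(len(xs)) if i not in hold_set]
        row = {}
        for name, fit in fitters.items():
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            row[name] = rmse(model, [xs[i] for i in hold], [ys[i] for i in hold])
        table.append(row)
    return table


# Toy data (an assumption for illustration): a noisy straight line.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.3) for x in xs]

folds = make_folds(len(xs), k=5)
table = resample_table(xs, ys, {"mean": fit_mean, "line": fit_line}, folds)

# Each printed row corresponds to one line of a parallel-coordinate plot.
for i, row in enumerate(table, 1):
    print(f"fold {i}: " + "  ".join(f"{m}={v:.3f}" for m, v in row.items()))
```

Because the holdouts are common to all models, lines that rarely cross indicate that one model beats another consistently rather than only on average.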
Is this a zero-sum game? As with bias and variance, there seems to be a clear trade-off between accuracy and scalability. Continuing the metaphor: just as in a machine learning problem I need to check that there is no additional noise beyond bias, variance and irreducible error, here I need to check that the loss of scalability of the top-performing models is intrinsic to the problem and not due to the implementation.
Is it possible to improve the RMSE of the linear regressors (middle performers in this contest) with an Octave-based model? Similarly, is it possible to build a nu-SVR-based model that improves on the caret SVM's RMSE while fitting the training set in less than a minute?
… stay tuned …