Saturday, October 11, 2014

Comparing Octave based SVMs vs caret SVMs (accuracy + fitting time)

In this post, regression models from the caret R package are compared on the solubility data, which can be obtained from the AppliedPredictiveModeling R package, where
  • models taking more than 15 minutes to fit on the training set have been discarded;
  • the accuracy measure is RMSE (Root Mean Squared Error).
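For reference, RMSE is just the square root of the mean squared residual; a minimal sketch in Python (toy numbers, not the solubility data):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: sqrt of the mean squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# toy example: residuals are (0, 0, 2), so RMSE = sqrt(4/3)
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # → 1.1547...
```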
From this comparison, the top performing models are Support Vector Machines, with and without Box–Cox transformations. Linear Regression, Partial Least Squares and Elastic Net, with and without Box–Cox transformations, are middle performers. Bagged Trees, Conditional Inference Trees and CART showed modest results.
The SVM with Box–Cox transformations achieves 0.60797 RMSE on the test set, while without Box–Cox transformations it achieves 0.61259.

Let's start the Octave session with Regularized Polynomial Regression, where we get performance quite similar to caret's Elastic Net: 0.71 RMSE on the test set with polynomial degree 10 and lambda = 0.003. The validation curve shows the model is underfitting.
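As a rough scikit-learn analogue of that Octave model, the same idea (polynomial features of degree 10 plus an L2 penalty in the role of lambda) can be sketched as follows; the data below is synthetic, not the solubility set:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.randn(200)  # noisy 1-D target

# degree-10 polynomial expansion + ridge penalty (alpha ~ lambda)
model = make_pipeline(
    PolynomialFeatures(degree=10, include_bias=False),
    StandardScaler(),
    Ridge(alpha=0.003),
)
model.fit(X, y)
pred = model.predict(X)
rmse = np.sqrt(np.mean((y - pred) ** 2))
print(round(rmse, 3))
```

Note that sklearn's `alpha` and Octave's `lambda` scale the penalty slightly differently, so the values are not directly interchangeable.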

Let's focus on SVMs (from the libsvm package).
epsilon-SVR achieves 0.59466 RMSE on the test set with C = 13, gamma = 0.001536 and epsilon = 0.
Time to fit on the training set: 9 secs.
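Since scikit-learn's `SVR` is backed by the same libsvm, an equivalent epsilon-SVR call can be sketched like this, reusing the hyper-parameters above (C = 13, gamma = 0.001536, epsilon = 0); the data here is a synthetic stand-in for the solubility set:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(1)
X = rng.randn(300, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.randn(300)

# libsvm-backed epsilon-SVR with the hyper-parameters from the post
svr = SVR(kernel="rbf", C=13, gamma=0.001536, epsilon=0.0)
svr.fit(X, y)
pred = svr.predict(X)
print(np.sqrt(np.mean((y - pred) ** 2)))  # training-set RMSE
```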


nu-SVR achieves 0.594129 RMSE on the test set with C = 13, gamma = 0.001466 and nu = 0.85.
Time to fit on the training set: 8 secs.
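The nu-SVR variant differs only in replacing epsilon with the nu parameter, which bounds the fraction of support vectors; a sketch with sklearn's libsvm-backed `NuSVR` and the hyper-parameters above, again on synthetic stand-in data:

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.RandomState(1)
X = rng.randn(300, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.randn(300)

# nu controls the support-vector fraction instead of an epsilon tube
nsvr = NuSVR(kernel="rbf", C=13, gamma=0.001466, nu=0.85)
nsvr.fit(X, y)
pred_nu = nsvr.predict(X)
print(np.sqrt(np.mean((y - pred_nu) ** 2)))  # training-set RMSE
```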


So, Octave based SVMs have accuracy similar to caret SVMs on this data set (0.59 vs 0.60 RMSE; perhaps a bit better), but they are much faster to train (9 secs vs 424 secs). In my experience, the same considerations hold for memory consumption, but I'm not going to prove it here.

Let's go back to our on-line learning applications. Consider a shipping service website where a user comes and specifies origin and destination; you offer to ship their package for some asking price, and users sometimes choose to use your shipping service (y = 1) and sometimes not (y = 0). The features x capture properties of the user, of the origin/destination, and the asking price. We want to learn p(y = 1 | x) to optimize the price.
Clearly, for an application like the one above, Octave seems a much more performant and scalable choice than R. For instance, our application architecture could consist of
  • presentation tier: Bootstrap JS + JSP
  • application tier: Octave (Machine Learning) + Java (backoffice, monitoring tools, etc.)
  • data tier: MongoDB or MySQL

This is a hybrid choice, good for all seasons. It's the aggregation of 2 "pure architectures":
  • Bootstrap + Octave + MongoDB
  • JSP + Java + Octave + MySQL
For both of them, the question is: is there any (open source?) interface for JavaScript-to-Octave, Java-to-Octave, MySQL-to-Octave or MongoDB-to-Octave? Are they stable enough for production? What about the community behind them?