# Intel® Extension for Scikit-learn NuSVR for Medical Charges dataset

In [1]:
from time import time
from sklearn.datasets import fetch_openml
x, y = fetch_openml(name='medical_charges_nominal', return_X_y=True)
cat_columns = x.select_dtypes(['category']).columns
# Replace each categorical column with its integer category codes.
x[cat_columns] = x[cat_columns].apply(lambda col: col.cat.codes)

In [2]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.3, random_state=42)
x_train.shape, x_test.shape

((48919, 11), (114146, 11))
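The two shapes follow from `train_size=0.3`. As a rough check, the sketch below reproduces the row counts with plain arithmetic. It assumes the training set receives `floor(train_size * n_samples)` rows and the test set receives the remainder; the total row count is taken from the shapes printed above.

```python
import math

# Total number of rows, taken from the two shapes printed above.
n_samples = 48919 + 114146

# Assumption: with a float train_size, the training set gets
# floor(train_size * n_samples) rows and the test set gets the rest.
train_size = 0.3
n_train = math.floor(train_size * n_samples)
n_test = n_samples - n_train

print(n_train, n_test)
```

Only 30% of the data is used for training here, which keeps the NuSVR fit time manageable while leaving a large test set for scoring.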

Intel Extension for Scikit-learn (previously known as daal4py) contains drop-in replacement functionality for the stock scikit-learn package. You can take advantage of the performance optimizations of Intel Extension for Scikit-learn by adding just two lines of code before the usual scikit-learn imports:

In [3]:
from sklearnex import patch_sklearn
patch_sklearn()

Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)


Intel(R) Extension for Scikit-learn patching affects the performance of specific scikit-learn functionality. Refer to the [list of supported algorithms and parameters](https://intel.github.io/scikit-learn-intelex/algorithms.html) for details. When unsupported parameters are used, the package falls back to the original scikit-learn implementation. If the patching does not cover your scenarios, [submit an issue on GitHub](https://github.com/intel/scikit-learn-intelex/issues).
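The fallback idea can be pictured with a toy dispatcher. This is only an illustrative sketch and not the extension's actual implementation: `ASSUMED_SUPPORTED_KERNELS`, `stock_fit`, `accelerated_fit`, and `fit_with_fallback` are hypothetical names, and the set of supported kernels is made up for the example.

```python
# Hypothetical names throughout: this is NOT how sklearnex is implemented.
# The supported set below is invented for illustration only.
ASSUMED_SUPPORTED_KERNELS = {"linear", "rbf", "poly"}


def stock_fit(params):
    """Stand-in for the original scikit-learn code path."""
    return "stock"


def accelerated_fit(params):
    """Stand-in for the optimized code path."""
    return "accelerated"


def fit_with_fallback(params):
    """Use the optimized path when the parameters are supported, else fall back."""
    if params.get("kernel") in ASSUMED_SUPPORTED_KERNELS:
        return accelerated_fit(params)
    return stock_fit(params)


print(fit_with_fallback({"kernel": "poly"}))
print(fit_with_fallback({"kernel": "some_custom_kernel"}))
```

The point of the design is that calling code does not change: the same call either runs on the optimized path or silently uses the original one.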

In [4]:
params = {
    'nu': 0.4,
    'C': y_train.mean(),
    'degree': 2,
    'kernel': 'poly',
}

Train the NuSVR algorithm with Intel(R) Extension for Scikit-learn on the Medical Charges dataset:

In [5]:
start = time()
from sklearn.svm import NuSVR
nusvr = NuSVR(**params).fit(x_train, y_train)
f"Intel(R) extension for Scikit-learn time: {(time() - start):.2f} s"

'Intel(R) extension for Scikit-learn time: 22.30 s'
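The cells in this notebook measure elapsed time with `time()` around the training call. A reusable alternative might look like the sketch below, which wraps the measurement in a context manager. The helper name `timed` and the `timings` dictionary are assumptions made for this example, and `perf_counter` is swapped in for `time()` because it is a monotonic clock intended for measuring intervals.

```python
from contextlib import contextmanager
from time import perf_counter


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = perf_counter()
    try:
        yield
    finally:
        results[label] = perf_counter() - start


timings = {}
with timed("demo", timings):
    # Stand-in workload; in the notebook this would be the NuSVR fit call.
    sum(i * i for i in range(100_000))

print(f"demo time: {timings['demo']:.2f} s")
```

Collecting both runs in one dictionary makes it easy to compare the patched and unpatched timings afterwards.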

Predict on the test set and compute the R2 score of the NuSVR model trained with Intel(R) Extension for Scikit-learn:

In [6]:
score = nusvr.score(x_test, y_test)
print('R2 score: {:.4f}'.format(score))

R2 score: 0.8636
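For a regressor, `score` returns the coefficient of determination, R2 = 1 - SS_res / SS_tot. The sketch below computes it by hand on a few toy values (not taken from the Medical Charges dataset) to show what the number means; `r2_score_manual` is a name chosen for this example.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]
r2 = r2_score_manual(y_true, y_pred)
print(f"R2 score: {r2:.4f}")
```

A value of 1.0 means perfect predictions, and 0.0 means the model does no better than always predicting the mean of the targets.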


To cancel the optimizations, call *unpatch_sklearn* and re-import the NuSVR class:

In [7]:
from sklearnex import unpatch_sklearn
unpatch_sklearn()

Train the NuSVR algorithm with the original scikit-learn library on the Medical Charges dataset:

In [8]:
start = time()
from sklearn.svm import NuSVR
nusvr = NuSVR(**params).fit(x_train, y_train)
f"Original Scikit-learn time: {(time() - start):.2f} s"

'Original Scikit-learn time: 367.67 s'

Predict on the test set and compute the R2 score of the NuSVR model trained with the original scikit-learn:

In [9]:
score = nusvr.score(x_test, y_test)
print('R2 score: {:.4f}'.format(score))

R2 score: 0.8636


With scikit-learn-intelex patching you can:

- Use your scikit-learn code for training and prediction with minimal changes (a couple of lines of code);
- Train scikit-learn models and run predictions faster;
- Get the same quality (the R2 score is 0.8636 in both runs);
- Get a speedup of more than **16** times (22.30 s versus 367.67 s for training in this example).
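The speedup figure can be checked from the two training times reported in the cells above. The variable names in this short sketch are chosen for the example; the timings are the ones printed by the notebook and will differ on other hardware.

```python
# Training times reported by the two training cells above, in seconds.
patched_seconds = 22.30
stock_seconds = 367.67

# Speedup is the ratio of the original time to the optimized time.
speedup = stock_seconds / patched_seconds
print(f"Speedup: {speedup:.1f}x")
```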