# Intel® Extension for Scikit-learn KNN for MNIST dataset

In [1]:
from time import time
from sklearn import metrics
from sklearn.model_selection import train_test_split

In [2]:
from sklearn.datasets import fetch_openml
x, y = fetch_openml(name='mnist_784', return_X_y=True)

In [3]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=72)

Intel Extension for Scikit-learn (previously known as daal4py) contains drop-in replacement functionality for the stock scikit-learn package. You can take advantage of the performance optimizations of Intel Extension for Scikit-learn by adding just two lines of code before the usual scikit-learn imports:

In [4]:
from sklearnex import patch_sklearn
patch_sklearn()

Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)


Patching with Intel(R) Extension for Scikit-learn affects the performance of specific scikit-learn functionality. Refer to the [list of supported algorithms and parameters](https://intel.github.io/scikit-learn-intelex/algorithms.html) for details. When unsupported parameters are used, the package falls back to the original scikit-learn implementation. If patching does not cover your scenario, [submit an issue on GitHub](https://github.com/intel/scikit-learn-intelex/issues).
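If the notebook has to run on machines where the extension may be missing, the patching step can be guarded. The following is a minimal sketch, not part of the original notebook: `try_patch_sklearn` is a hypothetical helper, and the logger name `sklearnex` is an assumption based on the project's verbose-mode documentation, so check it against your installed version.

```python
import importlib.util
import logging


def try_patch_sklearn(verbose=False):
    """Patch scikit-learn if sklearnex is installed; return True when patched."""
    if importlib.util.find_spec("sklearnex") is None:
        # Extension not installed: keep using stock scikit-learn.
        return False
    if verbose:
        # Assumption: sklearnex reports accelerated vs. fallback calls
        # through a logger named "sklearnex".
        logging.basicConfig()
        logging.getLogger("sklearnex").setLevel(logging.INFO)
    from sklearnex import patch_sklearn
    patch_sklearn()
    return True


accelerated = try_patch_sklearn()
print("accelerated" if accelerated else "stock scikit-learn")
```

As in the cell above, the call should come before importing the estimators you want accelerated.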

In [5]:
params = {
    'n_neighbors': 40,
    'weights': 'distance',
    'n_jobs': -1
}

Train the KNN algorithm and predict with Intel(R) Extension for Scikit-learn on the MNIST dataset

In [6]:
start = time()
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(**params).fit(x_train, y_train)
predicted = knn.predict(x_test)
f"Intel(R) extension for Scikit-learn time: {(time() - start):.2f} s"

'Intel(R) extension for Scikit-learn time: 1.09 s'
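The timed region above also covers the import statement and lumps fitting and prediction together. A minimal sketch of measuring the two phases separately, assuming that breakdown is of interest: `timed` is a hypothetical helper, and the two workloads are stand-ins for `knn.fit(x_train, y_train)` and `knn.predict(x_test)`.

```python
from contextlib import contextmanager
from time import perf_counter


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = perf_counter()
    try:
        yield
    finally:
        results[label] = perf_counter() - start


timings = {}

with timed("fit", timings):
    # Stand-in for: knn = KNeighborsClassifier(**params).fit(x_train, y_train)
    squares = [i * i for i in range(100_000)]

with timed("predict", timings):
    # Stand-in for: predicted = knn.predict(x_test)
    total = sum(squares)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

`perf_counter` is used instead of `time` because it is a monotonic clock intended for measuring short intervals.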

In [7]:
report = metrics.classification_report(y_test, predicted)
print(f"Classification report for KNN:\n{report}\n")

Classification report for KNN:
              precision    recall  f1-score   support

           0       0.97      0.99      0.98      1365
           1       0.93      0.99      0.96      1637
           2       0.99      0.94      0.96      1401
           3       0.96      0.95      0.96      1455
           4       0.98      0.96      0.97      1380
           5       0.95      0.95      0.95      1219
           6       0.96      0.99      0.97      1317
           7       0.94      0.95      0.95      1420
           8       0.99      0.90      0.94      1379
           9       0.92      0.94      0.93      1427

    accuracy                           0.96     14000
   macro avg       0.96      0.96      0.96     14000
weighted avg       0.96      0.96      0.96     14000




*The first column of the classification report above is the class labels.*  
  
To undo the optimizations, call *unpatch_sklearn* and re-import the KNeighborsClassifier class.

In [8]:
from sklearnex import unpatch_sklearn
unpatch_sklearn()

Train the KNN algorithm and predict with the original scikit-learn library on the MNIST dataset

In [9]:
start = time()
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(**params).fit(x_train, y_train)
predicted = knn.predict(x_test)
f"Original Scikit-learn time: {(time() - start):.2f} s"

'Original Scikit-learn time: 24.23 s'

In [10]:
report = metrics.classification_report(y_test, predicted)
print(f"Classification report for KNN:\n{report}\n")

Classification report for KNN:
              precision    recall  f1-score   support

           0       0.97      0.99      0.98      1365
           1       0.93      0.99      0.96      1637
           2       0.99      0.94      0.96      1401
           3       0.96      0.95      0.96      1455
           4       0.98      0.96      0.97      1380
           5       0.95      0.95      0.95      1219
           6       0.96      0.99      0.97      1317
           7       0.94      0.95      0.95      1420
           8       0.99      0.90      0.94      1379
           9       0.92      0.94      0.93      1427

    accuracy                           0.96     14000
   macro avg       0.96      0.96      0.96     14000
weighted avg       0.96      0.96      0.96     14000
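The two reports match to two decimal places, but rounded metrics can hide small differences between runs. A sketch of a stricter check, assuming the predictions from both runs were kept under separate names (the notebook overwrites `predicted`): `agreement` is a hypothetical helper, and the label lists below are made-up stand-ins, not results from this notebook.

```python
def agreement(pred_a, pred_b):
    """Return the fraction of positions where two prediction sequences match."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction sequences must have the same length")
    if len(pred_a) == 0:
        raise ValueError("prediction sequences must not be empty")
    matches = sum(a == b for a, b in zip(pred_a, pred_b))
    return matches / len(pred_a)


# Made-up stand-ins for the patched and unpatched predictions.
predicted_patched = ["7", "2", "1", "0"]
predicted_stock = ["7", "2", "1", "9"]

print(f"Agreement: {agreement(predicted_patched, predicted_stock):.2f}")
```

A value of 1.0 would mean both runs produced identical labels for every test sample.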




With scikit-learn-intelex patching you can:

- Use your scikit-learn code for training and prediction with minimal changes (a couple of lines of code);
- Train and predict with scikit-learn models faster;
- Get the same quality (the two classification reports above are identical);
- Get a speedup of about **22** times (24.23 s versus 1.09 s in this run).
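
As a quick check, the speedup follows directly from the two wall-clock times printed in the timing cells above. The variable names are illustrative, and the values are copied from those outputs.

```python
# Wall-clock times reported by the two timing cells in this notebook.
patched_seconds = 1.09   # Intel(R) Extension for Scikit-learn
stock_seconds = 24.23    # original scikit-learn

speedup = stock_seconds / patched_seconds
print(f"Speedup: {speedup:.1f}x")
```

Timings vary with hardware and from run to run, so the ratio should be read as indicative rather than exact.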