SUPERVISED LEARNING: Multilayer Perceptron Implementation

Prerequisites
• Python 3 (3.5+); tutorial built using 3.6.3
• Installed packages:
  – scikit-learn
  – numpy
  – matplotlib

• GitHub example code:
  – https://github.com/acun1994/scikit-learntutorial/blob/master/mlp.py

Dataset loading
• First we load the Iris flower dataset:

    from sklearn import datasets

    iris = datasets.load_iris()
    X = iris.data
    y = iris.target

• This snippet loads the Iris dataset from scikit-learn's built-in collection and splits it into the data (X) and the labels (y).
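As a quick sanity check (not part of the original slides, but using only standard attributes of the loaded dataset object), you can confirm what was loaded; Iris has 150 samples, 4 features, and 3 classes:

    # Iris: 150 samples x 4 features, labels 0-2 for the three species
    print(X.shape)            # (150, 4)
    print(y.shape)            # (150,)
    print(iris.target_names)  # ['setosa' 'versicolor' 'virginica']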

Best practice
• This part is not strictly necessary, but is a best practice.

    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    # Dataset splitting
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

    # Dataset scaling
    scaler = StandardScaler()
    X_scaledTrain = scaler.fit_transform(X_train)
    X_scaledTest = scaler.transform(X_test)
    X_scaledAll = scaler.transform(X)

• MLP is extremely sensitive to feature scaling, so use StandardScaler to standardize the features.
• We fit the scaler on the TRAIN dataset only. We then transform the TEST dataset using the mean and standard deviation learned from the TRAIN dataset.
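To verify the scaling (a check added here, not in the original slides): after fit_transform, each training feature has zero mean and unit standard deviation, while the test split is only approximately standardized because it reuses the training statistics:

    import numpy as np

    # Training features: standardized exactly (mean ~0, std ~1 per column)
    print(np.round(X_scaledTrain.mean(axis=0), 2))  # ~[0. 0. 0. 0.]
    print(np.round(X_scaledTrain.std(axis=0), 2))   # ~[1. 1. 1. 1.]
    # Test features: only close to mean 0 / std 1, since the scaler
    # was fitted on the training split
    print(np.round(X_scaledTest.mean(axis=0), 2))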

Training
• We then train the MLP model:

    from sklearn.neural_network import MLPClassifier

    clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                        hidden_layer_sizes=(5, 3)).fit(X_scaledTrain, y_train)
You will need to adjust hidden_layer_sizes depending on the problem. Each value in the tuple is the number of neurons in one hidden layer, so (5, 3) builds two hidden layers with 5 and 3 neurons respectively; the output layer is created automatically by scikit-learn, with one neuron per class (3 for Iris). There are other MLPClassifier parameters you can tweak to improve performance. Refer to https://scikit-learn.org/stable/modules/neural_networks_supervised.html
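One way to confirm how the layers are wired (an inspection snippet not in the original slides; coefs_ and classes_ are standard fitted attributes of MLPClassifier):

    # Each entry of clf.coefs_ is the weight matrix between consecutive layers:
    # input (4 features) -> hidden (5) -> hidden (3) -> output (3 classes)
    print([w.shape for w in clf.coefs_])  # [(4, 5), (5, 3), (3, 3)]
    print(clf.classes_)                   # [0 1 2]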

Testing
• Now we need to test our model:

    import numpy as np

    predict_y = []
    correct = 0
    for i in range(len(X_test)):
        # Reshape the single sample into a 1 x n_features array for predict()
        predict_me = np.array(X_scaledTest[i].astype(float))
        predict_me = predict_me.reshape(-1, len(predict_me))
        prediction = clf.predict(predict_me)
        predict_y.append(prediction[0])
        if prediction[0] == y_test[i]:
            correct += 1
    print("Success rate : ", correct / len(X_test))

• You may have to retrain the model several times to get good accuracy, since the network's weights are randomly initialised and each run can converge to a different solution.
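The loop above scores one sample at a time. scikit-learn can also predict and score the whole test split in one call; an equivalent, more concise version (using the standard predict and score methods):

    # Predict all test samples at once; score() returns mean accuracy
    predict_y = clf.predict(X_scaledTest)
    print("Success rate : ", clf.score(X_scaledTest, y_test))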

Visualisation
• The source code includes functions to visualise the clusters:

    import matplotlib.pyplot as plt

    # Visualization
    visualise(np.concatenate((X_scaledTrain, X_scaledTest), axis=0),
              np.concatenate((y_train, predict_y), axis=0),
              "Train + Test", 1)
    visualise(X_scaledAll, y, "True", 2)
    plt.show()

• As you can see, the true clusters and the predicted clusters are rather similar, which means our model was trained correctly.
• Pay attention to the grouping of the data points. You can drag the scatter plots around to view them from other perspectives.
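The visualise helper itself lives in the linked mlp.py and is not shown on the slides. A minimal sketch of what such a function could look like, assuming a rotatable 3D scatter over the first three scaled features (the name and signature mirror the calls above, but the body is an illustration, not the repository's actual code):

    import matplotlib.pyplot as plt
    from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection

    def visualise(X, y, title, fig_num):
        # Hypothetical implementation: plot the first three (scaled) features
        # as a rotatable 3D scatter, coloured by class label.
        fig = plt.figure(fig_num)
        ax = fig.add_subplot(111, projection='3d')
        ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=y)
        ax.set_title(title)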
