
Logistic Regression


Logistic Regression, also called the Logistic Model or Logit Model, is a classification algorithm that predicts a categorical dependent variable from a set of independent variables. It is one of the simplest classification algorithms available for this task.
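As a rough illustration of how the model turns a linear score into a class probability, the sketch below applies the logistic (sigmoid) function to a weighted sum of features for the two-class case. The weights, bias, and sample values are made up for illustration and do not come from any fitted model; multi-class problems such as Iris generalize this idea.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample (binary case)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)

# Hypothetical weights and bias, chosen only for illustration.
weights = [0.8, -0.4]
bias = -0.5
sample = [2.0, 1.0]

p = predict_proba(sample, weights, bias)
label = 1 if p >= 0.5 else 0  # threshold the probability at 0.5
print(p, label)
```

Training a logistic regression model amounts to choosing the weights and bias so that these probabilities match the observed labels as closely as possible.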

For a deeper understanding of Logistic Regression, use the following resources:


In this practice session, we will code Logistic Regression by building a simple classifier on the popular Iris dataset, following the steps below. The dataset link is given in Step 1.

Step 1. Data Preprocessing 

  • Importing the libraries.
  • Importing the dataset (link: https://archive.ics.uci.edu/ml/datasets/iris).
  • Dealing with the categorical variable.
  • Classifying dependent and independent variables.
  • Splitting the data into a training set and test set.
  • Feature scaling.
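The preprocessing steps above can be sketched as follows. This is a minimal sketch that assumes scikit-learn's bundled copy of the Iris data (`load_iris`) in place of a downloaded CSV file, so the column names differ from those in the file; the `Species` column name and the 25% test split are illustrative choices.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Assumption: use scikit-learn's bundled Iris data instead of a CSV file.
iris = load_iris()
dataset = pd.DataFrame(iris.data, columns=iris.feature_names)
dataset["Species"] = iris.target_names[iris.target]  # string labels

# Encode the categorical target as integers.
encoder = LabelEncoder()
dataset["Species"] = encoder.fit_transform(dataset["Species"])

# Separate the independent variables (X) from the dependent variable (y).
X = dataset[iris.feature_names].values
y = dataset["Species"].values

# Hold out 25% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the scaler on the training set only, then apply it to both sets,
# so that no information from the test set leaks into training.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```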

Step 2. Logistic Regression 

  • Creating a logistic classifier.
  • Feeding the training data to the classifier.
  • Predicting the species for the test set.
  • Using the confusion matrix to find the accuracy.
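The modelling steps above can be sketched as follows. This is a sketch under stated assumptions rather than the reference solution: it loads Iris through scikit-learn's `load_iris`, repeats a compact version of the preprocessing so that it runs on its own, and uses default regularization with an illustrative `max_iter=200`.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Compact preprocessing (assumption: bundled Iris data, 25% test split).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Create the classifier and feed it the training data.
classifier = LogisticRegression(max_iter=200, random_state=0)
classifier.fit(X_train, y_train)

# Predict the species for the test set.
y_pred = classifier.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predictions,
# so the diagonal counts the correctly classified samples.
cm = confusion_matrix(y_test, y_pred)
accuracy = cm.diagonal().sum() / cm.sum()

print(cm)
print("Accuracy =", accuracy)
```

Dividing the diagonal sum by the total count gives the same value as `classifier.score(X_test, y_test)`; the confusion matrix additionally shows which species are mistaken for one another.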


Click on Start/Continue Hackathon to go to the Practice page.

Hackathon Reviews


3 ratings
  • 5 stars: 2
  • 4 stars: 1
  • 3 stars: 0
  • 2 stars: 0
  • 1 star: 0
  1. Good course


    Simple and easy to understand

  2. Finished


    #Importing libraries
    import pandas as pd
    import numpy as np
    import warnings

    # Dataset Link https://archive.ics.uci.edu/ml/datasets/iris

    #Importing dataset
    dataset = pd.read_csv('Iris.csv')

    print("\n-------------------------\nDataset\n", dataset.head())

    #Dealing with categorical variable
    print("\n-------------------------\nLabel Encoding The Categorical Variable - Species\n-------------------------")

    from sklearn.preprocessing import LabelEncoder
    labelencoder = LabelEncoder()
    dataset['Species'] = labelencoder.fit_transform(dataset['Species'])

    print("\n-------------------------\nDataset after Label Encoding Species:\n-------------------------\n", dataset.head())

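    # Separating dependent and independent variables:
    # SepalLengthCm, SepalWidthCm, PetalLengthCm and PetalWidthCm are independent, whereas Species is dependent.
    # Selecting the features by name keeps any extra column (such as an Id column, if the file has one) out of X.
    X = dataset[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']].values
    y = dataset['Species'].values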

    #Splitting into training set and test set
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    print("\n-------------------------\nScaling or Normalizing the features \n-------------------------")

    #Feature scaling
    from sklearn.preprocessing import StandardScaler
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    print("\n-------------------------\nDataset after Scaling:\n-------------------------\n")

    print("\nX_train :\n", X_train)
    print("\nX_test :\n", X_test)

    ######### Logistic Regression ################

    #Create a Logistic classifier

    from sklearn.linear_model import LogisticRegression
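    # penalty='l1' needs a solver that supports it; 'saga' does, whereas the default 'lbfgs' does not
    classifier = LogisticRegression(random_state=0, C=10.617591834830002, penalty='l1', solver='saga', max_iter=1000, n_jobs=-1)

    #Feed the training data to the classifier
    classifier.fit(X_train, y_train)

    #Predicting the species for test set
    y_pred = classifier.predict(X_test)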

    #Using confusion matrix to find the accuracy
    from sklearn.metrics import confusion_matrix
    cm = confusion_matrix(y_test,y_pred)

    accuracy = cm.diagonal().sum()/cm.sum()

    print("Accuracy of Predictions = ", accuracy)

  3. Logistic Regression


    The usage of logistic regression and standardizing the data using StandardScaler().
    Insight on the confusion matrix, and how to find the accuracy using it.


© Analytics India Magazine