AlexNet: The Architecture that Revolutionized Computer Vision

Propelling Artificial Intelligence Forward with AlexNet

Introduction

AlexNet is the deep Convolutional Neural Network (CNN) that revolutionized image classification: it won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 by a large margin.

AlexNet is a deep network of eight learned layers: five convolutional layers followed by three fully connected layers. With about 60 million parameters, it was one of the largest neural networks of its time. It was designed for image classification on high-resolution images, and it was optimized to take advantage of GPUs (the original model was trained on two GTX 580 GPUs).

What makes it better than LeNet?

  • AlexNet is deeper than LeNet: it has 5 convolutional layers, whereas LeNet has only 2 convolutional layers.

  • AlexNet uses the ReLU activation function, an upgrade over the sigmoid and tanh activations used in earlier networks (see the short sketch after this list).

  • LeNet is limited to grayscale images, whereas AlexNet works on color images, i.e., 3 RGB channels.
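
ReLU is simply f(x) = max(0, x). Unlike sigmoid and tanh, it does not saturate for positive inputs and is cheap to compute, which is a big part of why AlexNet trained so much faster. A minimal comparison (NumPy is used here just for illustration):

import numpy as np

def relu(x):
    # Pass positive values through, zero out negatives
    return np.maximum(0, x)

def sigmoid(x):
    # Squashes everything into (0, 1); saturates for large |x|
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # all values strictly between 0 and 1
print(np.tanh(x))  # all values strictly between -1 and 1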

Architecture

Let's dive in.

Note: if a layer uses padding or a stride, I mention it explicitly; if I don't mention it, the default applies (stride 1, no padding).

Input:

  • The input size is 227x227x3: 227 pixels in height and width, with 3 color channels (RGB). (The original paper states 224x224, but 227x227 is the size that makes the layer arithmetic below work out.)

Layer 1 (Convolutional Layer 1):

  • Input: 227x227x3

  • Number of Filters: 96

  • Filter Size: 11x11

  • Stride: 4

  • Activation Function: ReLU

  • Output Size (Feature map size): 55x55x96
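
Every spatial size in this walkthrough follows from one formula: output = floor((W - F + 2P) / S) + 1, where W is the input height/width, F the filter size, P the padding, and S the stride. A tiny helper for checking the arithmetic (plain Python; the name conv_out is mine, not part of any library):

def conv_out(w, f, s=1, p=0):
    # floor((W - F + 2P) / S) + 1
    return (w - f + 2 * p) // s + 1

print(conv_out(227, 11, s=4))  # 55 -> Layer 1 output is 55x55, with 96 channels (one per filter)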

Layer 2 (Max Pooling Layer 1):

  • Input size: 55x55x96

  • Pool size: 3x3

  • Stride: 2

  • Output Size: 27x27x96
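
Max pooling obeys the same size formula with no padding: conv_out(55, 3, s=2) = (55 - 3)/2 + 1 = 27. Pooling never changes the channel count, so all 96 feature maps are kept.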

Layer 3 (Convolutional Layer 2):

  • Input: 27x27x96

  • Number of Filters: 256

  • Filter Size: 5x5

  • Padding: 2 (i.e., 'same' padding)

  • Activation Function: ReLU

  • Output Size: 27x27x256
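
Checking with the helper above: conv_out(27, 5, s=1, p=2) = (27 - 5 + 4)/1 + 1 = 27, which is why padding 2 ('same' padding in Keras) preserves the 27x27 size.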

Layer 4 (Max Pooling Layer 2):

  • Input size: 27x27x256

  • Pool size: 3x3

  • Stride: 2

  • Output Size: 13x13x256

Layer 5 (Convolutional Layer 3):

  • Input: 13x13x256

  • Number of Filters: 384

  • Filter Size: 3x3

  • Padding: 1

  • Activation Function: ReLU

  • Output Size (Feature map size): 13x13x384
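
Again: conv_out(13, 3, s=1, p=1) = (13 - 3 + 2)/1 + 1 = 13, so padding 1 is exactly 'same' padding for a 3x3 filter. The same check covers Layers 6 and 7, which keep the 13x13 size for the same reason.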

Layer 6 (Convolutional Layer 4):

  • Input: 13x13x384

  • Number of Filters: 384

  • Filter Size: 3x3

  • Padding: 1

  • Activation Function: ReLU

  • Output Size (Feature map size): 13x13x384

Layer 7 (Convolutional Layer 5):

  • Input: 13x13x384

  • Number of Filters: 256

  • Filter Size: 3x3

  • Padding: 1

  • Activation Function: ReLU

  • Output Size (Feature map size): 13x13x256

Layer 8 (Max Pooling Layer 3):

  • Input size: 13x13x256

  • Pool size: 3x3

  • Stride: 2

  • Output Size: 6x6x256

Layer 9 (Flatten):

  • Input size: 6x6x256

  • Output size: 9216
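
Nothing is learned here: the last pooling stage left a 6x6x256 volume ((13 - 3)/2 + 1 = 6), and flattening simply unrolls it into a vector of length 6 x 6 x 256 = 9216.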

Layer 10 (Fully Connected Layer 1):

  • Input size: 9216

  • Nodes: 4096

  • Activation function: ReLU

  • Dropout: 0.5 (applied after this layer, as in the original paper and the code below)

  • Output size: 4096

Layer 11 (Fully Connected Layer 2):

  • Input size: 4096

  • Nodes: 4096

  • Activation function: ReLU

  • Dropout: 0.5 (applied after this layer, as in the original paper and the code below)

  • Output size: 4096

Layer 12 (Fully Connected Layer 3) [Output Layer]:

  • Input size: 4096

  • Nodes: 1000

  • Activation function: Softmax

  • Output size: 1000
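
A quick sanity check on the 60-million-parameter claim from the introduction: Fully Connected Layer 1 alone holds 9216 x 4096 weights + 4096 biases, about 37.8 million parameters, and the three fully connected layers together account for roughly 58.6 million of the 62.4 million parameters that model.summary() reports for the code below. Almost all of AlexNet's capacity sits in its fully connected layers; the paper's 60 million figure reflects its slightly different two-GPU, grouped-convolution layout.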

Code

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Define the AlexNet model
def alexnet():
    model = Sequential()

    # Layer 1: 96 filters of 11x11, stride 4 -> 55x55x96
    model.add(Conv2D(96, (11, 11), strides=(4, 4), activation='relu', input_shape=(227, 227, 3)))
    # Layer 2: 3x3 max pooling, stride 2 -> 27x27x96
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))

    # Layer 3: 256 filters of 5x5, 'same' padding (padding 2) -> 27x27x256
    model.add(Conv2D(256, (5, 5), padding='same', activation='relu'))
    # Layer 4: 3x3 max pooling, stride 2 -> 13x13x256
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))

    # Layers 5 and 6: 384 filters of 3x3, 'same' padding (padding 1) -> 13x13x384
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))

    # Layer 7: 256 filters of 3x3, 'same' padding -> 13x13x256
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
    # Layer 8: 3x3 max pooling, stride 2 -> 6x6x256
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))

    # Layer 9: flatten the 6x6x256 volume -> 9216
    model.add(Flatten())
    # Layer 10: fully connected, 4096 units, with dropout 0.5
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))

    # Layer 11: fully connected, 4096 units, with dropout 0.5
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))

    # Layer 12: 1000-way softmax output (one probability per ImageNet class)
    model.add(Dense(1000, activation='softmax'))

    return model

# Create an instance of the AlexNet model
model = alexnet()

# Print the model summary
model.summary()
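
As a quick usage sketch, here is how you might compile the model and run a forward pass. The optimizer settings mirror the original paper's SGD with momentum 0.9 and learning rate 0.01; a real training run would also need the ImageNet data pipeline, which is omitted here:

import numpy as np
from tensorflow.keras.optimizers import SGD

model.compile(
    optimizer=SGD(learning_rate=0.01, momentum=0.9),  # SGD with momentum, as in the original paper
    loss='categorical_crossentropy',                  # matches the 1000-way softmax output
    metrics=['accuracy'],
)

# Sanity-check the shapes with a random dummy image
dummy = np.random.rand(1, 227, 227, 3).astype('float32')
probs = model.predict(dummy)
print(probs.shape)  # (1, 1000) -> one probability per ImageNet class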

Let’s Connect!
If you enjoyed this article, a few claps 👏 would mean the world and motivate me to create more!

Feel free to reach out and explore more of my work.

Looking forward to connecting with you! 🚀
