AlexNet: The Architecture that Revolutionized Computer Vision
Propelling Artificial Intelligence Forward with AlexNet
Introduction
AlexNet is the deep convolutional neural network (CNN) that revolutionized image classification: it won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 by a wide margin.
AlexNet has multiple deep layers and roughly 60 million parameters, making it one of the largest neural networks of its time. It was designed for image classification on high-resolution images, and it was optimized to take advantage of GPUs.
What makes it better than LeNet?
AlexNet is deeper than LeNet, containing 5 convolutional layers, whereas LeNet has only 2 convolutional layers.
AlexNet uses a better activation function, ReLU, which avoids the vanishing gradients that slow training with sigmoid or tanh.
LeNet is limited to grayscale images, whereas AlexNet works on color images, i.e., RGB channels.
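To see why the activation function matters, here is a minimal NumPy sketch (my own illustration, not from the original paper): sigmoid saturates for large inputs, so its gradient vanishes in deep networks, while ReLU keeps a constant gradient for positive inputs and is cheap to compute.

import numpy as np

def relu(x):
    # ReLU passes positive values through unchanged, so its gradient there is 1
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid squashes everything into (0, 1); its gradient is at most 0.25
    return 1 / (1 + np.exp(-x))

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(relu(x))     # [0. 0. 0. 1. 5.]
print(sigmoid(x))  # values saturate near 0 and 1, where gradients vanish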
Architecture
Let's dive in.
Note: if a layer uses padding or a non-default stride, I mention it explicitly; if I don't mention it, assume the defaults (stride 1, no padding).
Input:
- The input size is 227x227x3: 227 pixels in height and width, with 3 channels (RGB). (The original paper lists 224x224, but 227x227 is the size that makes the layer arithmetic below work out.)
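In practice, an arbitrary image has to be resized to this shape before it can be fed to the network. Here is a minimal preprocessing sketch with TensorFlow (my own illustration; the file name is a placeholder):

import tensorflow as tf

# Load an image file (path is a placeholder) and resize it to 227x227x3
raw = tf.io.read_file('example.jpg')
img = tf.image.decode_jpeg(raw, channels=3)
img = tf.image.resize(img, (227, 227)) / 255.0  # scale pixel values to [0, 1]
print(img.shape)  # (227, 227, 3)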
Layer 1 (Convolutional Layer 1):
Input: 227x227x3
Number of Filters: 96
Filter Size: 11x11
Strides: 4
Activation Function: ReLU
Output Size (Feature map size): 55x55x96
Layer 2 (Max Pooling Layer 1):
Input size: 55x55x96
Pool size: 3x3
Stride: 2
Output Size: 27x27x96
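Both output sizes so far follow the standard convolution arithmetic: output = (W - F + 2P) / S + 1, where W is the input size, F the filter (or pool) size, P the padding, and S the stride. A small helper, added here purely for illustration, verifies Layers 1 and 2:

def output_size(w, f, p, s):
    # Standard formula: floor((W - F + 2P) / S) + 1
    return (w - f + 2 * p) // s + 1

# Layer 1: 227x227 input, 11x11 filter, no padding, stride 4
print(output_size(227, 11, 0, 4))  # 55

# Layer 2: 55x55 input, 3x3 pool, no padding, stride 2
print(output_size(55, 3, 0, 2))    # 27

The same formula accounts for every feature-map size in the layers that follow.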
Layer 3 (Convolutional Layer 2):
Input: 27x27x96
Number of Filters: 256
Padding: 2 ('same')
Filter Size: 5x5
Activation Function: ReLU
Output Size: 27x27x256
Layer 4 (Max Pooling Layer 2):
Input size: 27x27x256
Pool size: 3x3
Stride: 2
Output Size: 13x13x256
Layer 5 (Convolutional Layer 3):
Input: 13x13x256
Number of Filters: 384
Filter Size: 3x3
Padding: 1
Activation Function: ReLU
Output Size (Feature map size): 13x13x384
Layer 6 (Convolutional Layer 4):
Input: 13x13x384
Number of Filters: 384
Filter Size: 3x3
Padding: 1
Activation Function: ReLU
Output Size (Feature map size): 13x13x384
Layer 7 (Convolutional Layer 5):
Input: 13x13x384
Number of Filters: 256
Filter Size: 3x3
Padding: 1
Activation Function: ReLU
Output Size (Feature map size): 13x13x256
Layer 8 (Max Pooling Layer 3):
Input size: 13x13x256
Pool size: 3x3
Stride: 2
Output Size: 6x6x256
Layer 9 (Flatten):
Input size: 6x6x256
Output size: 9216
Layer 10 (Fully Connected Layer 1):
Input size: 9216
Nodes: 4096
Activation function: ReLU
Output size: 4096
Layer 11 (Fully Connected Layer 2):
Input size: 4096
Nodes: 4096
Activation function: ReLU
Output size: 4096
Layer 12 (Fully Connected Layer 3) [Output Layer]:
Input size: 4096
Nodes: 1000
Activation function: Softmax
Output size: 1000
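As a sanity check on the 60-million-parameter figure, we can total the weights and biases layer by layer (conv layer: F*F*C_in*C_out + C_out; dense layer: N_in*N_out + N_out). This calculation is my own addition; the single-stream variant described here comes to about 62 million, slightly above the paper's number because the original network split several layers across two GPUs, which reduces the parameter count.

def conv_params(f, c_in, c_out):
    # f*f*c_in weights per filter, plus one bias per filter
    return f * f * c_in * c_out + c_out

def dense_params(n_in, n_out):
    # one weight per input-output pair, plus one bias per output
    return n_in * n_out + n_out

total = (conv_params(11, 3, 96)       # Conv1
         + conv_params(5, 96, 256)    # Conv2
         + conv_params(3, 256, 384)   # Conv3
         + conv_params(3, 384, 384)   # Conv4
         + conv_params(3, 384, 256)   # Conv5
         + dense_params(9216, 4096)   # FC1
         + dense_params(4096, 4096)   # FC2
         + dense_params(4096, 1000))  # FC3
print(f"{total:,}")  # 62,378,344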
Code
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Define the AlexNet model
def alexnet():
    model = Sequential()
    # Conv1: 96 filters of 11x11, stride 4 -> 55x55x96
    model.add(Conv2D(96, (11, 11), strides=(4, 4), activation='relu', input_shape=(227, 227, 3)))
    # MaxPool1: 3x3 window, stride 2 -> 27x27x96
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    # Conv2: 256 filters of 5x5, 'same' padding -> 27x27x256
    model.add(Conv2D(256, (5, 5), padding='same', activation='relu'))
    # MaxPool2: 3x3 window, stride 2 -> 13x13x256
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    # Conv3-Conv5: three 3x3 convolutions with 'same' padding
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))  # -> 13x13x384
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))  # -> 13x13x384
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))  # -> 13x13x256
    # MaxPool3: 3x3 window, stride 2 -> 6x6x256
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    # Flatten: 6*6*256 = 9216
    model.add(Flatten())
    # Two fully connected layers of 4096 units, each followed by dropout
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    # Output layer: 1000 classes with softmax
    model.add(Dense(1000, activation='softmax'))
    return model

# Create an instance of the AlexNet model
model = alexnet()

# Print the model summary
model.summary()
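As a quick smoke test (my own addition, assuming TensorFlow 2.x), we can push a random image through the model and confirm that the output is a 1000-way probability distribution:

import numpy as np

# A single random 227x227 RGB "image", batched
dummy = np.random.rand(1, 227, 227, 3).astype('float32')
probs = model.predict(dummy)
print(probs.shape)  # (1, 1000)
print(probs.sum())  # ~1.0, since softmax produces a probability distribution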
Let's Connect!
If you enjoyed this article, a few claps 👏 would mean the world and motivate me to create more!
Feel free to reach out and explore more of my work:
LinkedIn: Pranay Rishith
GitHub: pranayrishith16
Hashnode Blog: Beyond Blog
Medium: Pranay Blog
Looking forward to connecting with you!