In this article, we will know what is face recognition and how is different from face detection. We will go briefly over the theory of face recognition and then jump on to the coding section. At the end of this article, you will be able to make a face recognition program for recognizing faces in images as well as on a live webcam feed.
What is Face Detection?
In computer vision, one essential problem we are trying to figure out is to automatically detect objects in an image without human intervention. Face detection can be thought of as such a problem where we detect human faces in an image. There may be slight differences in the faces of humans but overall, it is safe to say that there are certain features that are associated with all the human faces. There are various face detection algorithms but Viola-Jones Algorithm is one of the oldest methods that is also used today and we will use the same later in the article. You can go through the Viola-Jones Algorithm after completing this article as I’ll link it at the end of this article.
Face detection is usually the first step towards many face-related technologies, such as face recognition or verification. However, face detection can have very useful applications. The most successful application of face detection would probably be photo taking. When you take a photo of your friends, the face detection algorithm built into your digital camera detects where the faces are and adjusts the focus accordingly.
For a tutorial on Real-Time Face detection
What is Face Recognition?
Now that we are successful in making such algorithms that can detect faces, can we also recognise whose faces are they?
Face recognition is a method of identifying or verifying the identity of an individual using their face. There are various algorithms that can do face recognition but their accuracy might vary. Here I am going to describe how we do face recognition using deep learning.
So now let us understand how we recognise faces using deep learning. We make use of face embedding in which each face is converted into a vector and this technique is called deep metric learning. Let me further divide this process into three simple steps for easy understanding:
Face Detection: The very first task we perform is detecting faces in the image or video stream. Now that we know the exact location/coordinates of face, we extract this face for further processing ahead.
Feature Extraction: Now that we have cropped the face out of the image, we extract features from it. Here we are going to use face embeddings to extract the features out of the face. A neural network takes an image of the person’s face as input and outputs a vector which represents the most important features of a face. In machine learning, this vector is called embedding and thus we call this vector as face embedding. Now how does this help in recognizing faces of different persons?
While training the neural network, the network learns to output similar vectors for faces that look similar. For example, if I have multiple images of faces within different timespan, of course, some of the features of my face might change but not up to much extent. So in this case the vectors associated with the faces are similar or in short, they are very close in the vector space. Take a look at the below diagram for a rough idea:
Now after training the network, the network learns to output vectors that are closer to each other(similar) for faces of the same person(looking similar). The above vectors now transform into:
We are not going to train such a network here as it takes a significant amount of data and computation power to train such networks. We will use a pre-trained network trained by Davis King on a dataset of ~3 million images. The network outputs a vector of 128 numbers which represent the most important features of a face.
Now that we know how this network works, let us see how we use this network on our own data. We pass all the images in our data to this pre-trained network to get the respective embeddings and save these embeddings in a file for the next step.
Comparing faces: Now that we have face embeddings for every face in our data saved in a file, the next step is to recognise a new t image that is not in our data. So the first step is to compute the face embedding for the image using the same network we used above and then compare this embedding with the rest of the embeddings we have. We recognise the face if the generated embedding is closer or similar to any other embedding as shown below:
So we passed two images, one of the images is of Vladimir Putin and other of George W. Bush. In our example above, we did not save the embeddings for Putin but we saved the embeddings of Bush. Thus when we compared the two new embeddings with the existing ones, the vector for Bush is closer to the other face embeddings of Bush whereas the face embeddings of Putin are not closer to any other embedding and thus the program cannot recognise him.
What is OpenCV
In the field of Artificial Intelligence, Computer Vision is one of the most interesting and Challenging tasks. Computer Vision acts like a bridge between Computer Software and visualizations around us. It allows computer software to understand and learn about the visualizations in the surroundings. For Example: Based on the color, shape and size determining the fruit. This task can be very easy for the human brain however in the Computer Vision pipeline, first we gather the data, then we perform the data processing activities and then we train and teach the model to understand how to distinguish between the fruits based on size, shape and color of fruit.
Currently, various packages are present to perform machine learning, deep learning and computer vision tasks. By far, computer vision is the best module for such complex activities. OpenCV is an open-source library. It is supported by various programming languages such as R, Python. It runs on most of the platforms such as Windows, Linux and MacOS.
To know more about how face recognition works on opencv, check out the free course on face recognition in opencv.
Advantages of OpenCV:
- OpenCV is an open-source library and is free of cost.
- As compared to other libraries, it is fast since it is written in C/C++.
- It works better on System with lesser RAM
- To supports most of the Operating Systems such as Windows, Linux and MacOS.
Installation:
Here we will be focusing on installing OpenCV for python only. We can install OpenCV using pip or conda(for anaconda environment).
- Using pip:
Using pip, the installation process of openCV can be done by using the following command in the command prompt.
pip install opencv-python
- Anaconda:
If you are using anaconda environment, either you can execute the above code in anaconda prompt or you can execute the following code in anaconda prompt.
conda install -c conda-forge opencv
Face Recognition using Python
In this section, we shall implement face recognition using OpenCV and Python. First, let us see the libraries we will need and how to install them:
- OpenCV
- dlib
- Face_recognition
OpenCV is an image and video processing library and is used for image and video analysis, like facial detection, license plate reading, photo editing, advanced robotic vision, optical character recognition, and a whole lot more.
The dlib library, maintained by Davis King, contains our implementation of “deep metric learning” which is used to construct our face embeddings used for the actual recognition process.
The face_recognition library, created by Adam Geitgey, wraps around dlib’s facial recognition functionality, and this library is super easy to work with and we will be using this in our code. Remember to install dlib library first before you install face_recognition.
To install OpenCV, type in command prompt
pip install opencv-python
I have tried various ways to install dlib on Windows but the easiest of all of them is via Anaconda. First, install Anaconda (here is a guide to install it) and then use this command in your command prompt:
conda install -c conda-forge dlib
Next to install face_recognition, type in command prompt
pip install face_recognition
Now that we have all the dependencies installed, let us start coding. We will have to create three files, one will take our dataset and extract face embedding for each face using dlib. Next, we will save these embedding in a file.
In the next file we will compare the faces with the existing the recognise faces in images and next we will do the same but recognise faces in live webcam feed
Extracting features from Face
First, you need to get a dataset or even create one of you own. Just make sure to arrange all images in folders with each folder containing images of just one person.
Next, save the dataset in a folder the same as you are going to make the file. Now here is the code:
from imutils import paths
import face_recognition
import pickle
import cv2
import os
#get paths of each file in folder named Images
#Images here contains my data(folders of various persons)
imagePaths = list(paths.list_images('Images'))
knownEncodings = []
knownNames = []
# loop over the image paths
for (i, imagePath) in enumerate(imagePaths):
# extract the person name from the image path
name = imagePath.split(os.path.sep)[-2]
# load the input image and convert it from BGR (OpenCV ordering)
# to dlib ordering (RGB)
image = cv2.imread(imagePath)
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
#Use Face_recognition to locate faces
boxes = face_recognition.face_locations(rgb,model='hog')
# compute the facial embedding for the face
encodings = face_recognition.face_encodings(rgb, boxes)
# loop over the encodings
for encoding in encodings:
knownEncodings.append(encoding)
knownNames.append(name)
#save emcodings along with their names in dictionary data
data = {"encodings": knownEncodings, "names": knownNames}
#use pickle to save data into a file for later use
f = open("face_enc", "wb")
f.write(pickle.dumps(data))
f.close()
Now that we have stored the embedding in a file named “face_enc”, we can use them to recognise faces in images or live video stream.
Face Recognition in Live webcam Feed
Here is the script to recognise faces on a live webcam feed:
import face_recognition
import imutils
import pickle
import time
import cv2
import os
#find path of xml file containing haarcascade file
cascPathface = os.path.dirname(
cv2.__file__) + "/data/haarcascade_frontalface_alt2.xml"
# load the harcaascade in the cascade classifier
faceCascade = cv2.CascadeClassifier(cascPathface)
# load the known faces and embeddings saved in last file
data = pickle.loads(open('face_enc', "rb").read())
print("Streaming started")
video_capture = cv2.VideoCapture(0)
# loop over frames from the video file stream
while True:
# grab the frame from the threaded video stream
ret, frame = video_capture.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = faceCascade.detectMultiScale(gray,
scaleFactor=1.1,
minNeighbors=5,
minSize=(60, 60),
flags=cv2.CASCADE_SCALE_IMAGE)
# convert the input frame from BGR to RGB
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
# the facial embeddings for face in input
encodings = face_recognition.face_encodings(rgb)
names = []
# loop over the facial embeddings incase
# we have multiple embeddings for multiple fcaes
for encoding in encodings:
#Compare encodings with encodings in data["encodings"]
#Matches contain array with boolean values and True for the embeddings it matches closely
#and False for rest
matches = face_recognition.compare_faces(data["encodings"],
encoding)
#set name =inknown if no encoding matches
name = "Unknown"
# check to see if we have found a match
if True in matches:
#Find positions at which we get True and store them
matchedIdxs = [i for (i, b) in enumerate(matches) if b]
counts = {}
# loop over the matched indexes and maintain a count for
# each recognized face face
for i in matchedIdxs:
#Check the names at respective indexes we stored in matchedIdxs
name = data["names"][i]
#increase count for the name we got
counts[name] = counts.get(name, 0) + 1
#set name which has highest count
name = max(counts, key=counts.get)
# update the list of names
names.append(name)
# loop over the recognized faces
for ((x, y, w, h), name) in zip(faces, names):
# rescale the face coordinates
# draw the predicted face name on the image
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.putText(frame, name, (x, y), cv2.FONT_HERSHEY_SIMPLEX,
0.75, (0, 255, 0), 2)
cv2.imshow("Frame", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
video_capture.release()
cv2.destroyAllWindows()
Although in the example above we have used haar cascade to detect faces, you can also use face_recognition.face_locations to detect a face as we did in the previous script
Face Recognition in Images
The script for detecting and recognising faces in images is almost similar to what you saw above. Try it yourself and if you can’t take a look at the code below:
import face_recognition
import imutils
import pickle
import time
import cv2
import os
#find path of xml file containing haarcascade file
cascPathface = os.path.dirname(
cv2.__file__) + "/data/haarcascade_frontalface_alt2.xml"
# load the harcaascade in the cascade classifier
faceCascade = cv2.CascadeClassifier(cascPathface)
# load the known faces and embeddings saved in last file
data = pickle.loads(open('face_enc', "rb").read())
#Find path to the image you want to detect face and pass it here
image = cv2.imread(Path-to-img)
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
#convert image to Greyscale for haarcascade
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = faceCascade.detectMultiScale(gray,
scaleFactor=1.1,
minNeighbors=5,
minSize=(60, 60),
flags=cv2.CASCADE_SCALE_IMAGE)
# the facial embeddings for face in input
encodings = face_recognition.face_encodings(rgb)
names = []
# loop over the facial embeddings incase
# we have multiple embeddings for multiple fcaes
for encoding in encodings:
#Compare encodings with encodings in data["encodings"]
#Matches contain array with boolean values and True for the embeddings it matches closely
#and False for rest
matches = face_recognition.compare_faces(data["encodings"],
encoding)
#set name =inknown if no encoding matches
name = "Unknown"
# check to see if we have found a match
if True in matches:
#Find positions at which we get True and store them
matchedIdxs = [i for (i, b) in enumerate(matches) if b]
counts = {}
# loop over the matched indexes and maintain a count for
# each recognized face face
for i in matchedIdxs:
#Check the names at respective indexes we stored in matchedIdxs
name = data["names"][i]
#increase count for the name we got
counts[name] = counts.get(name, 0) + 1
#set name which has highest count
name = max(counts, key=counts.get)
# update the list of names
names.append(name)
# loop over the recognized faces
for ((x, y, w, h), name) in zip(faces, names):
# rescale the face coordinates
# draw the predicted face name on the image
cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.putText(image, name, (x, y), cv2.FONT_HERSHEY_SIMPLEX,
0.75, (0, 255, 0), 2)
cv2.imshow("Frame", image)
cv2.waitKey(0)
Output:
This brings us to the end of this article where we learned about face recognition.
You can also upskill with Great Learning’s PGP Artificial Intelligence and Machine Learning Course. The course offers mentorship from industry leaders, and you will also have the opportunity to work on real-time industry-relevant projects.
Great work! Can we extend this to detect multiple faces in an image?
This code works with multiple face inputs. I have tried it.
Thanks for this resource! I have a few doubts….
1. How do I write/append the names of the detected faces into a excel or csv file?
I tried pandas and openpyxl. But I am unable to get the names known names on the excel/csv file.
I am planning to make a smart attendance system for my class.
2. If a face is new/unknown. The code must allow us to add that face to the database. The code must ask – Who is
this?
It must crop out the face inside the rectangle and store it in the database under a new folder with the new name we
provide.
Kindly help.