Motion Detection


Python Libraries


NumPy

NumPy is a library for the Python programming language that (among other things) provides support for large, multi-dimensional arrays.
Why is that important?
Using NumPy, we can express images as multi-dimensional arrays.
Representing images as NumPy arrays is not only computationally and resource efficient, but many other image processing and machine learning libraries use NumPy array representations as well.
Furthermore, by using NumPy’s built-in high-level mathematical functions, we can quickly perform numerical analysis on an image.
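As a minimal sketch of this idea, the array below is a synthetic 4x4 "grayscale image" (real images would be loaded from disk, but the principle is the same): each element is one pixel intensity, and NumPy's built-in functions give instant numerical summaries.

```python
import numpy as np

# a synthetic 4x4 "grayscale image": one 8-bit intensity per pixel
image = np.array([[ 10,  20,  30,  40],
                  [ 50,  60,  70,  80],
                  [ 90, 100, 110, 120],
                  [130, 140, 150, 160]], dtype=np.uint8)

print(image.shape)   # (4, 4) -> height x width
print(image.mean())  # average pixel intensity: 85.0
print(image.max())   # brightest pixel: 160
```

A real grayscale image is just a (much larger) 2D array like this one; a color image adds a third dimension for the channels.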

SciPy

Going hand-in-hand with NumPy, we also have SciPy.
SciPy adds further support for scientific and technical computing.
One of my favorite sub-packages of SciPy is the spatial package, which includes a wide range of distance functions and a kd-tree implementation.
Why are distance functions important?
When we “describe” an image, we perform feature extraction.
Normally after feature extraction an image is represented by a vector (a list) of numbers.
In order to compare two images, we rely on distance functions, such as the Euclidean distance.
To compare two arbitrary images, we simply compute the distance between their feature vectors.
In the case of the Euclidean distance, the smaller the distance the more “similar” the two images are.
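As a sketch of that comparison, assume three hypothetical 4-dimensional feature vectors have already been extracted from three images; SciPy's spatial package computes the Euclidean distance between them directly:

```python
from scipy.spatial import distance

# hypothetical feature vectors describing three images
featuresA = [0.2, 0.5, 0.1, 0.9]
featuresB = [0.3, 0.4, 0.2, 0.8]
featuresC = [0.9, 0.1, 0.8, 0.2]

# smaller Euclidean distance => more "similar" images
dAB = distance.euclidean(featuresA, featuresB)
dAC = distance.euclidean(featuresA, featuresC)
print(dAB < dAC)  # True: image A is more similar to B than to C
```

Real feature vectors are usually much longer (hundreds of dimensions), but the comparison works identically.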

matplotlib

Simply put, matplotlib is a plotting library.
If you’ve ever used MATLAB before, you’ll probably feel very comfortable in the matplotlib environment.
When analyzing images, we'll make use of matplotlib often: whether plotting the overall accuracy of search systems or simply viewing an image itself, matplotlib is a great tool to have in your toolbox.

PIL and Pillow

These two packages are good at what they do: simple image manipulations, such as resizing, rotation, etc.
If you need to do some quick and dirty image manipulations definitely check out PIL and Pillow, but if you’re serious about learning about image processing, computer vision, and image search engines, I would highly recommend that you spend your time playing with OpenCV and SimpleCV instead.

OpenCV

If NumPy’s main goal is large, efficient, multi-dimensional array representations, then, by far, the main goal of OpenCV is real-time image processing.
This library has been around since 1999, but it wasn't until the 2.0 release in 2009 that we saw the incredible NumPy support.
The library itself is written in C/C++, but Python bindings are provided when running the installer.
OpenCV is hands down my favorite computer vision library, but it does have a learning curve.
Be prepared to spend a fair amount of time learning the intricacies of the library and browsing the docs (which have gotten substantially better now that NumPy support has been added).
If you are still testing the computer vision waters, you might want to check out the SimpleCV library mentioned below, which has a substantially smaller learning curve.

SimpleCV

The goal of SimpleCV is to get you involved in image processing and computer vision as soon as possible.
And they do a great job at it.
The learning curve is substantially smaller than that of OpenCV, and as their tagline says, “it’s computer vision made easy”.
That all said, because the learning curve is smaller, you don’t have access to as many of the raw, powerful techniques supplied by OpenCV.
If you’re just testing the waters, definitely try this library out.

mahotas

Mahotas, just like OpenCV and SimpleCV, relies on NumPy arrays.
Much of the functionality implemented in Mahotas can be found in OpenCV and/or SimpleCV, but in some cases, the Mahotas interface is just easier to use, especially when it comes to their features package.

scikit-learn

Alright, you got me, Scikit-learn isn’t an image processing or computer vision library — it’s a machine learning library.
That said, you can't have advanced computer vision techniques without some sort of machine learning, whether it be clustering, vector quantization, classification models, etc.
Scikit-learn also includes a handful of image feature extraction functions.
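As a sketch of the clustering side, the example below groups synthetic "feature vectors" (two well-separated blobs of 8-dimensional points standing in for features extracted from real images) using scikit-learn's MiniBatchKMeans:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# synthetic "feature vectors": two well-separated blobs of 8-D points
rng = np.random.RandomState(42)
features = np.vstack([rng.normal(0.0, 0.05, size=(50, 8)),
                      rng.normal(1.0, 0.05, size=(50, 8))])

# cluster the vectors into two visual "groups"
kmeans = MiniBatchKMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(features)

# points from the same blob should land in the same cluster
print(len(set(labels[:50])), len(set(labels[50:])))
```

With real image features, the cluster centers can serve as a visual vocabulary for bag-of-visual-words models.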

scikit-image

Scikit-image is fantastic, but you have to know what you are doing to effectively use this library -- and I don’t mean this in a “there is a steep learning curve” type of way.
The learning curve is actually quite low, especially if you check out their gallery.
The algorithms included in scikit-image (I would argue) follow closer to the state-of-the-art in computer vision.
New algorithms straight from academic papers can be found in scikit-image, but in order to use these algorithms effectively, you need to have developed some rigor and understanding in the computer vision field.
If you already have some experience in computer vision and image processing, definitely check out scikit-image; otherwise, I would continue working with OpenCV and SimpleCV to start.

ilastik

I’ll be honest.
I’ve never used ilastik.
But through my experiences at computer vision conferences, I've met a fair amount of people who do, so I felt compelled to put it in this list.
Ilastik is mainly for image segmentation and classification and is especially geared towards the scientific community.

pprocess

Extracting features from images is inherently a parallelizable task.
You can reduce the amount of time it takes to extract features from an entire dataset by using a multithreading/multitasking library.
My favorite is pprocess, due to its simplicity for the tasks I need it for, but you can use your favorite.

h5py

The h5py library is the de facto standard in Python for storing large numerical datasets.
The best part?
It provides support for NumPy arrays.
So, if you have a large dataset represented as a NumPy array, and it won’t fit into memory, or if you want efficient, persistent storage of NumPy arrays, then h5py is the way to go.
One of my favorite techniques is to store my extracted features in an h5py dataset and then apply scikit-learn's MiniBatchKMeans to cluster the features.
The entire dataset never has to be loaded off disk at once, and the memory footprint is extremely small, even for thousands of feature vectors.
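A minimal sketch of that workflow, using random numbers in place of real extracted features (the file name and dataset key are arbitrary choices for illustration):

```python
import numpy as np
import h5py

# hypothetical extracted features: 1000 vectors of 128 dimensions each
features = np.random.rand(1000, 128).astype("float32")

# persist them to disk in an HDF5 dataset
with h5py.File("features.h5", "w") as db:
    db.create_dataset("features", data=features)

# later: read back just a slice -- h5py loads only the requested rows,
# so the full dataset never has to fit into memory
with h5py.File("features.h5", "r") as db:
    batch = db["features"][0:100]

print(batch.shape)  # (100, 128)
```

Feeding such slices to MiniBatchKMeans's partial_fit lets you cluster datasets far larger than RAM.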

Basic motion detection and tracking with Python and OpenCV


Source Code



# USAGE
# python motion_detector.py
# python motion_detector.py --video videos/example_01.mp4

# import the necessary packages
import argparse
import datetime
import imutils
import time
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", help="path to the video file")
ap.add_argument("-a", "--min-area", type=int, default=500, help="minimum area size")
args = vars(ap.parse_args())

# if the video argument is None, then we are reading from webcam
if args.get("video", None) is None:
 camera = cv2.VideoCapture(0)
 time.sleep(0.25)

# otherwise, we are reading from a video file
else:
 camera = cv2.VideoCapture(args["video"])

# initialize the first frame in the video stream
firstFrame = None

# loop over the frames of the video
while True:
 # grab the current frame and initialize the occupied/unoccupied
 # text
 (grabbed, frame) = camera.read()
 text = "Unoccupied"

 # if the frame could not be grabbed, then we have reached the end
 # of the video
 if not grabbed:
  break

 # resize the frame, convert it to grayscale, and blur it
 frame = imutils.resize(frame, width=500)
 gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
 gray = cv2.GaussianBlur(gray, (21, 21), 0)

 # if the first frame is None, initialize it
 if firstFrame is None:
  firstFrame = gray
  continue

 # compute the absolute difference between the current frame and
 # first frame
 frameDelta = cv2.absdiff(firstFrame, gray)
 thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]

 # dilate the thresholded image to fill in holes, then find contours
 # on thresholded image
 thresh = cv2.dilate(thresh, None, iterations=2)
 cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
  cv2.CHAIN_APPROX_SIMPLE)
 # grab_contours handles the differing return signatures across OpenCV versions
 cnts = imutils.grab_contours(cnts)

 # loop over the contours
 for c in cnts:
  # if the contour is too small, ignore it
  if cv2.contourArea(c) < args["min_area"]:
   continue

  # compute the bounding box for the contour, draw it on the frame,
  # and update the text
  (x, y, w, h) = cv2.boundingRect(c)
  cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
  text = "Occupied"

 # draw the text and timestamp on the frame
 cv2.putText(frame, "Room Status: {}".format(text), (10, 20),
  cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
 cv2.putText(frame, datetime.datetime.now().strftime("%A %d %B %Y %I:%M:%S%p"),
  (10, frame.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.35, (0, 0, 255), 1)

 # show the frame and record if the user presses a key
 cv2.imshow("Security Feed", frame)
 cv2.imshow("Thresh", thresh)
 cv2.imshow("Frame Delta", frameDelta)
 key = cv2.waitKey(1) & 0xFF

 # if the `q` key is pressed, break from the loop
 if key == ord("q"):
  break

# cleanup the camera and close any open windows
camera.release()
cv2.destroyAllWindows()
