Some people truly believe that their laptop's built-in webcam can spy on them, and they are wary of it. Some are so afraid of being watched that they even tape over their device's watchful eye. Actually, they do it in vain. We'll show you how to master the built-in laptop webcam and use it for peaceful purposes, and for some not-so-peaceful ones too.
Implementation: first annoying troubles
I was surprised and upset to learn that the great and mighty .NET Framework offers no easy way to work with a web camera. In .NET 4 the situation improved somewhat (Silverlight projects gained some relevant classes), but I did not have time to test it, because I started writing the code examples for this article before the official release of VS2010 and .NET 4.
Almost desperate, I buried myself in Google. All I found were links to MSDN and to DirectDraw technology. I even tried to knock together a simple application, but with no DirectDraw experience I just opened a can of worms. I did write an application, but I was never able to find and fix all the bugs in it.
Getting even more desperate, I started browsing our Western friends' web resources. After studying a few dozen links, I dug up a lot of goodies, including various sample applications and short articles (Americans don't like to write a lot). I even managed to find a working DirectDraw-based sample, but I was horrified when I saw the source code. It was pretty hard to understand, so I decided not to bother with it and to look for an easier way. I had hardly put the first DirectDraw sample aside when another one caught my eye. Its author had written a whole library for handling web cams and other video capture devices on top of VFW (Video for Windows).
This project (I mean the library) was stripped to the bone, which was a great pity. All the library could do was display the webcam picture. It could neither capture individual frames nor record video, and it had no other useful features.
Nevertheless, my gut firmly told me that this project was what I was looking for. As soon as I glanced through its code, I saw the names of some familiar Windows messages and no less familiar WinAPI functions. I once had to write a Delphi application that worked with a webcam, and that was when I met these functions for the first time.
Ready!
A single PC or laptop can have several webcams connected at the same time. You don't have to look far for an example: at work I often have to set up simple videoconferences. They usually involve two people, each filmed by a separate camera. Both cameras are connected to my PC, and when filming starts I use special software to pick the camera I need at that moment. Since we have decided to take the web cam under our control, we will have to figure out how to get the list of video capture devices installed in the system and how to choose the one to work with.
The Windows API provides the capGetDriverDescription() function to solve this simple problem. It takes five parameters:
- wDriverIndex – capture driver index. The index value ranges from 0 to 9;
- lpszName – pointer to a buffer that receives the name of the corresponding driver;
- cbName – size of the lpszName buffer (in bytes);
- lpszVer – pointer to a buffer that receives the description of that driver;
- cbVer – size of the lpszVer buffer (in bytes).
This function returns TRUE on success. Now that we have the function description, let's see how to declare it in C#. This can be done as follows:
[DllImport("avicap32.dll")]
protected static extern bool capGetDriverDescriptionA (short wDriverIndex, [MarshalAs(UnmanagedType.VBByRefStr)] ref String lpszName, int cbName, [MarshalAs(UnmanagedType.VBByRefStr)] ref String lpszVer, int cbVer);
Please note that the function declaration must be preceded by the name of the DLL that implements it. In our case it is avicap32.dll.
So, the function is imported, and now you can write a class that uses it. I won't show the whole class, only the key method:
public static Device[] GetAllCapturesDevices()
{
    // Collects every capture driver that answers the query
    ArrayList devices = new ArrayList();
    // 100-character buffers that the native function fills in
    String dName = "".PadRight(100);
    String dVersion = "".PadRight(100);
    // Valid driver indexes are 0..9
    for (short i = 0; i < 10; i++)
    {
        if (capGetDriverDescriptionA(i, ref dName, 100, ref dVersion, 100))
        {
            Device d = new Device(i);
            d.Name = dName.Trim();
            d.Version = dVersion.Trim();
            devices.Add(d);
        }
    }
    return (Device[])devices.ToArray(typeof(Device));
}
The source code looks like child's play. The most interesting part is the loop that calls the capGetDriverDescription function mentioned above. MSDN tells us that the index (the first parameter of capGetDriverDescription()) can range from 0 to 9, so we deliberately loop over exactly that range. The method returns an array of Device objects (I defined this class myself; see the accompanying source code).
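As an alternative to the VBByRefStr marshalling used above, the same enumeration can be sketched with StringBuilder buffers, which is the more common P/Invoke idiom for output strings. This is only a sketch under assumptions: the CaptureDriverInfo and CaptureDrivers classes and the DriverProbe delegate are hypothetical names that are not part of my library, and the index is declared as uint to match the current Windows headers. The probe delegate exists only so that the loop can be exercised without a camera attached.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical value type describing one capture driver.
public sealed class CaptureDriverInfo
{
    public int Index { get; private set; }
    public string Name { get; private set; }
    public string Version { get; private set; }

    public CaptureDriverInfo(int index, string name, string version)
    {
        Index = index;
        Name = name;
        Version = version;
    }
}

public static class CaptureDrivers
{
    public const int MaxDrivers = 10;   // valid indexes are 0..9
    private const int BufferSize = 100; // same buffer size as in the article

    [DllImport("avicap32.dll", CharSet = CharSet.Ansi)]
    private static extern bool capGetDriverDescriptionA(
        uint wDriverIndex, StringBuilder lpszName, int cbName,
        StringBuilder lpszVer, int cbVer);

    // Fills 'name' and 'version' for the given index; returns false if there is no such driver.
    public delegate bool DriverProbe(uint index, StringBuilder name, StringBuilder version);

    // Walks all possible indexes and keeps the drivers the probe reports.
    public static List<CaptureDriverInfo> Enumerate(DriverProbe probe)
    {
        var found = new List<CaptureDriverInfo>();
        for (uint i = 0; i < MaxDrivers; i++)
        {
            var name = new StringBuilder(BufferSize);
            var version = new StringBuilder(BufferSize);
            if (probe(i, name, version))
            {
                found.Add(new CaptureDriverInfo(
                    (int)i, name.ToString().Trim(), version.ToString().Trim()));
            }
        }
        return found;
    }

    // Windows only: asks avicap32.dll about the drivers that are really installed.
    public static List<CaptureDriverInfo> EnumerateInstalled()
    {
        return Enumerate((i, n, v) => capGetDriverDescriptionA(i, n, BufferSize, v, BufferSize));
    }
}
```

On a real machine you would call CaptureDrivers.EnumerateInstalled() and bind the result to the device list in the UI.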
Once we have the device list, we should take care of displaying the camera's video stream. The capCreateCaptureWindow() function helps here: it creates a capture window for exactly that purpose.
Jumping a little ahead, I'll say that all further work with the camera boils down to sending messages to the capture window. Yes, indeed, we'll have to use the SendMessage() function, which is painfully familiar to every Windows programmer.
Now let's take a closer look at the capCreateCaptureWindow() function. It takes eight arguments:
- lpszWindowName – null-terminated string that contains the name of the capture window;
- dwStyle – window style;
- x – X coordinate;
- y – Y coordinate;
- nWidth – window width;
- nHeight – window height;
- hWnd – parent window handle;
- nID – window ID.
The function returns the handle of the created window, or NULL on error. Since it is also a WinAPI function, it has to be imported too. I won't show the import code, because it is almost identical to the one I wrote for capGetDriverDescription(). Let's look at the camera initialization instead:
deviceHandle = capCreateCaptureWindowA(ref deviceIndex, WS_VISIBLE | WS_CHILD,
    0, 0, windowWidth, windowHeight, handle, 0);
if (SendMessage(deviceHandle, WM_CAP_DRIVER_CONNECT, this.index, 0) > 0)
{
    SendMessage(deviceHandle, WM_CAP_SET_SCALE, -1, 0);
    SendMessage(deviceHandle, WM_CAP_SET_PREVIEWRATE, 0x42, 0);
    SendMessage(deviceHandle, WM_CAP_SET_PREVIEW, -1, 0);
    SetWindowPos(deviceHandle, 1, 0, 0, windowWidth, windowHeight, 6);
}
This code tries to send the WM_CAP_DRIVER_CONNECT message immediately after the window is created. A non-zero result means that the connection succeeded.
Now let's imagine that the gods are on our side today and send several messages right away: WM_CAP_SET_SCALE, WM_CAP_SET_PREVIEWRATE and WM_CAP_SET_PREVIEW. Alas, it is the same story as with the functions: C# knows nothing about these constants, so you have to define them yourself. The list of all the necessary constants, with comments, follows.
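For readers who want to see the imports that the initialization code relies on, here is a minimal sketch of how they might be declared. The P/Invoke signatures follow the Win32 prototypes, but the CaptureNative class, its Open and Close helpers and the "preview" window name are hypothetical and are not taken from my library. The Close helper also shows one plausible way to handle a "Stop" action: disconnect the driver, then destroy the capture window.

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch only: helper names are illustrative, not part of the article's library.
public static class CaptureNative
{
    public const int WS_CHILD = 0x40000000;
    public const int WS_VISIBLE = 0x10000000;
    public const int ChildVisibleStyle = WS_CHILD | WS_VISIBLE;

    public const int WM_CAP_DRIVER_CONNECT = 0x40a;
    public const int WM_CAP_DRIVER_DISCONNECT = 0x40b;

    [DllImport("avicap32.dll", CharSet = CharSet.Ansi)]
    public static extern IntPtr capCreateCaptureWindowA(
        string lpszWindowName, int dwStyle, int x, int y,
        int nWidth, int nHeight, IntPtr hWndParent, int nID);

    [DllImport("user32.dll")]
    public static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll")]
    public static extern bool DestroyWindow(IntPtr hWnd);

    // Creates a capture window inside 'parent' and connects it to the driver
    // with the given index. Returns IntPtr.Zero if either step fails.
    public static IntPtr Open(IntPtr parent, int driverIndex, int width, int height)
    {
        IntPtr window = capCreateCaptureWindowA(
            "preview", ChildVisibleStyle, 0, 0, width, height, parent, 0);
        if (window == IntPtr.Zero)
        {
            return IntPtr.Zero;
        }
        IntPtr connected = SendMessage(
            window, WM_CAP_DRIVER_CONNECT, (IntPtr)driverIndex, IntPtr.Zero);
        if (connected == IntPtr.Zero)
        {
            DestroyWindow(window);
            return IntPtr.Zero;
        }
        return window;
    }

    // Disconnects the driver and destroys the capture window; does nothing for a null handle.
    public static void Close(IntPtr window)
    {
        if (window == IntPtr.Zero)
        {
            return;
        }
        SendMessage(window, WM_CAP_DRIVER_DISCONNECT, IntPtr.Zero, IntPtr.Zero);
        DestroyWindow(window);
    }
}
```

Using IntPtr for handles and message parameters keeps the declarations valid in both 32-bit and 64-bit processes.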
// Base value for capture window messages (same as WM_USER)
private const int WM_CAP = 0x400;
// Connect the capture window to a video capture driver
private const int WM_CAP_DRIVER_CONNECT = 0x40a;
// Disconnect the capture window from the video capture driver
private const int WM_CAP_DRIVER_DISCONNECT = 0x40b;
// Copy the current frame to the clipboard
private const int WM_CAP_EDIT_COPY = 0x41e;
// Preview mode on/off
private const int WM_CAP_SET_PREVIEW = 0x432;
// Overlay mode on/off
private const int WM_CAP_SET_OVERLAY = 0x433;
// Preview rate
private const int WM_CAP_SET_PREVIEWRATE = 0x434;
// Scaling of the preview image on/off
private const int WM_CAP_SET_SCALE = 0x435;
// Window styles: child window, initially visible
private const int WS_CHILD = 0x40000000;
private const int WS_VISIBLE = 0x10000000;
// Set the preview frame callback function
private const int WM_CAP_SET_CALLBACK_FRAME = 0x405;
// Grab a single frame from the video capture driver
private const int WM_CAP_GRAB_FRAME = 0x43c;
// Save the current frame to a file
private const int WM_CAP_SAVEDIB = 0x419;
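The hexadecimal values above are not magic numbers. In the Video for Windows header each message is defined as an offset from a common base equal to WM_USER (0x400). The sketch below writes the same constants that way, which makes typos easier to spot; the VfwMessages class name is hypothetical.

```csharp
// Sketch: the same message codes expressed as offsets from the base value.
public static class VfwMessages
{
    public const int WM_CAP = 0x400; // base, equal to WM_USER

    public const int WM_CAP_SET_CALLBACK_FRAME = WM_CAP + 5;
    public const int WM_CAP_DRIVER_CONNECT = WM_CAP + 10;
    public const int WM_CAP_DRIVER_DISCONNECT = WM_CAP + 11;
    public const int WM_CAP_SAVEDIB = WM_CAP + 25;
    public const int WM_CAP_EDIT_COPY = WM_CAP + 30;
    public const int WM_CAP_SET_PREVIEW = WM_CAP + 50;
    public const int WM_CAP_SET_OVERLAY = WM_CAP + 51;
    public const int WM_CAP_SET_PREVIEWRATE = WM_CAP + 52;
    public const int WM_CAP_SET_SCALE = WM_CAP + 53;
    public const int WM_CAP_GRAB_FRAME = WM_CAP + 60;
}
```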
I will omit the rest of the class description, since I have already covered its basic structure. Everything else is easy to pick up from my well-commented source code. The only thing I don't want to leave behind the scenes is an example of how the library is used.
In total, I have implemented several methods in this library: GetAllDevices (already discussed), GetDevice (gets a video capture device driver by its index), ShowWindow (displays the webcam video stream), GetFrame (captures an individual frame to an image file) and GetCapture (captures the video stream).
To demonstrate that the library works, I made a small application. It uses one ComboBox component (which holds the list of available video capture devices) and a few buttons: "Refresh", "Start", "Stop" and "Screenshot". Ah, yes, there is also an Image component that displays the camera's video stream.
We'll start with the "Refresh" button. It gets a fresh list of all installed video capture devices. The event handler source code:
Device[] devices = DeviceManager.GetAllDevices();
foreach (Device d in devices)
{
    cmbDevices.Items.Add(d);
}
Looks simple, doesn't it? We can simply enjoy object-oriented programming, because the library takes on all the dirty work. The code that displays the camera's video stream is even simpler:
Device selectedDevice = DeviceManager.GetDevice(cmbDevices.SelectedIndex);
selectedDevice.ShowWindow(this.picCapture);
Again, a piece of cake. Now let's take a look at the "Screenshot" source code:
Device selectedDevice = DeviceManager.GetDevice(cmbDevices.SelectedIndex);
selectedDevice.FrameGrabber();
I won't dwell on the FrameGrabber() method. In my source code, calling it saves the current frame straight to the root of the system drive. Of course, that is not how it should be done, so don't forget to make the necessary changes before using the application in the field.
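One possible way to make that change is sketched below: the caller chooses the target directory, and the file name is built from a timestamp. Everything here is an assumption for illustration. The FrameSaver class and its methods are hypothetical and are not the FrameGrabber() implementation from my library; the sketch only reuses the WM_CAP_GRAB_FRAME and WM_CAP_SAVEDIB messages listed earlier and assumes a capture window that is already connected to a driver.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Runtime.InteropServices;

// Sketch only: saves the current frame to a caller-chosen directory.
public static class FrameSaver
{
    private const int WM_CAP = 0x400;
    private const int WM_CAP_SAVEDIB = WM_CAP + 25;    // ANSI variant: lParam is a file name
    private const int WM_CAP_GRAB_FRAME = WM_CAP + 60;

    [DllImport("user32.dll", EntryPoint = "SendMessageA", CharSet = CharSet.Ansi)]
    private static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, string lParam);

    // Builds a name such as frame_20100401_120000.bmp inside 'directory'.
    public static string BuildSnapshotPath(string directory, DateTime takenAt)
    {
        string stamp = takenAt.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture);
        return Path.Combine(directory, "frame_" + stamp + ".bmp");
    }

    // Grabs one frame and writes it as a bitmap; returns the path, or null on failure.
    public static string SaveCurrentFrame(IntPtr captureWindow, string directory)
    {
        Directory.CreateDirectory(directory);
        string path = BuildSnapshotPath(directory, DateTime.Now);

        // Note: WM_CAP_GRAB_FRAME may switch preview off; re-enable it afterwards if needed.
        SendMessage(captureWindow, WM_CAP_GRAB_FRAME, IntPtr.Zero, null);
        IntPtr saved = SendMessage(captureWindow, WM_CAP_SAVEDIB, IntPtr.Zero, path);
        return saved != IntPtr.Zero ? path : null;
    }
}
```

A "Screenshot" handler could then call FrameSaver.SaveCurrentFrame with the capture window handle and, for example, a folder chosen by the user.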
Steady!
Now it's time to talk about how to create a simple but reliable video surveillance system. Typically, such systems are based on two algorithms: comparing two consecutive frames and simple background modeling. Implementing them from scratch takes a fair amount of code, so at the last moment I decided to take an easier route: the powerful but still little-known AForge.NET framework.
AForge.NET is primarily intended for developers and researchers. It can make a developer's life much easier in the following areas: neural networks, image processing (filtering, image editing, per-pixel filtering, resizing and rotation), genetic algorithms, robotics, interaction with video devices and so on. AForge.NET comes with a good manual that describes everything about the product. Take the time to read it thoroughly. I would especially like to mention the quality of the source code: digging through it is a real pleasure.
Now back to our immediate problem. Frankly, with this framework it is as easy as two times two. "Then why did you rack my brain with those WinAPI functions?" you will ask with displeasure. Just so that you are not limited in anything. Projects differ: in some cases it is more convenient to use .NET, and in others it is easier to get by with good old WinAPI.
Let's return to our problem again. To implement the motion detector, we'll use the MotionDetector class of the framework. The class works well with Bitmap objects and quickly calculates the percentage of difference between two images. Source code example:
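To give a feel for what the first of these algorithms involves, here is a deliberately simplified sketch of frame differencing on two grayscale frames stored as byte arrays. It is an illustration only and is not how AForge.NET implements it; the FrameDifference class and its MotionLevel method are hypothetical.

```csharp
using System;

// Sketch of two-frame differencing on 8-bit grayscale frames of equal size.
public static class FrameDifference
{
    // Returns the fraction (0..1) of pixels whose brightness changed
    // by more than 'threshold' between the two frames.
    public static float MotionLevel(byte[] previous, byte[] current, int threshold)
    {
        if (previous == null) throw new ArgumentNullException("previous");
        if (current == null) throw new ArgumentNullException("current");
        if (previous.Length != current.Length)
        {
            throw new ArgumentException("Frames must have the same size.");
        }
        if (current.Length == 0)
        {
            return 0f;
        }

        int changed = 0;
        for (int i = 0; i < current.Length; i++)
        {
            // A pixel counts as "moved" when its brightness difference exceeds the threshold
            if (Math.Abs(current[i] - previous[i]) > threshold)
            {
                changed++;
            }
        }
        return (float)changed / current.Length;
    }
}
```

The threshold filters out sensor noise; a real detector would also have to convert frames to grayscale and usually applies extra filtering before counting.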
// MotionDetector and its algorithms live in the AForge.Vision.Motion namespace
MotionDetector detector = new MotionDetector(
    new TwoFramesDifferenceDetector( ),
    new MotionAreaHighlighting( ) );
// Next frame processing
if ( detector != null )
{
    // ProcessFrame returns the amount of motion found in the frame
    float motionLevel = detector.ProcessFrame( image );
    if ( motionLevel > motionAlarmLevel )
    {
        // Arm the alarm indicator counter
        flash = (int) ( 2 * ( 1000 / alarmTimer.Interval ) );
    }
    // Show the object count only when the blob counting algorithm is in use
    if ( detector.MotionProcessingAlgorithm is BlobCountingObjectsProcessing )
    {
        BlobCountingObjectsProcessing countingDetector = (BlobCountingObjectsProcessing) detector.MotionProcessingAlgorithm;
        objectsCountLabel.Text = "Objects: " + countingDetector.ObjectsCount.ToString( );
    }
    else
    {
        objectsCountLabel.Text = "";
    }
}
The code above (apart from the initialization of the MotionDetector class) runs for every new frame received from the web cam. Once a frame arrives, a straightforward comparison follows (the ProcessFrame method). If the motionLevel value is greater than motionAlarmLevel (0.015f), some motion has been detected and it is time to sound the alarm! One of the screenshots clearly demonstrates the motion detector at work.
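The flash counter in the snippet deserves a word. My reading, and it is only an interpretation, is that it holds the alarm indication for about two seconds' worth of timer ticks after motion is seen. The sketch below isolates that idea in a small class so it can be tried without a camera; the MotionAlarm class and its members are hypothetical names and do not come from AForge.NET.

```csharp
using System;

// Sketch: keeps an alarm active for a fixed number of timer ticks after motion.
public sealed class MotionAlarm
{
    private readonly float alarmLevel;
    private readonly int holdTicks;
    private int remainingTicks;

    public MotionAlarm(float alarmLevel, int timerIntervalMs, int holdSeconds)
    {
        if (timerIntervalMs <= 0)
        {
            throw new ArgumentOutOfRangeException("timerIntervalMs");
        }
        this.alarmLevel = alarmLevel;
        // Same arithmetic as in the snippet: seconds * (ticks per second)
        this.holdTicks = holdSeconds * (1000 / timerIntervalMs);
    }

    // Call for every processed frame with the level returned by the detector.
    public void OnFrame(float motionLevel)
    {
        if (motionLevel > alarmLevel)
        {
            remainingTicks = holdTicks;
        }
    }

    // Call from the UI timer; returns true while the alarm should be shown.
    public bool OnTimerTick()
    {
        if (remainingTicks == 0)
        {
            return false;
        }
        remainingTicks--;
        return true;
    }
}
```

With a 500 ms timer and a two-second hold, one frame above the threshold keeps the alarm on for four ticks.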
Go!
Any web cam can easily be adapted for facial recognition and for building an advanced system logon. If after reading all this material you think that it is difficult, you are completely wrong! In late March an example (and later a link to an article) appeared on http://codeplex.com (Microsoft's open source project hosting) that demonstrates a web cam face detection application. The example is built on the new capabilities of .NET and Silverlight. It cannot realistically be reviewed within a single magazine article, because the author of the source code tried to do everything as elegantly as possible. There you will find image processing algorithms (blur filter, noise reduction, pixel-by-pixel comparison, stretching and so on), a demonstration of the new Silverlight features and much more. In other words, it earns the "must use" label without a doubt! See the links to the project and the article below.
Finish
All the sample applications reviewed in this article will serve you as a good starting point. Based on them, it is easy to create a professional webcam tool and earn a few hundred bucks a quarter by selling it, or to create some nasty and creepy spy Trojan.
Recall the story about backing up Skype conversations. It said that the time of keyloggers had already passed and that audio and video data is now the hot commodity. If you consider that nowadays a webcam is a mandatory attribute of any laptop, it is easy to imagine how many interesting videos you could shoot by slipping this kind of "useful program" to your victim... But, anyway, I told you nothing about that, did I? :) Good luck with your programming! And remember, if you have any questions, feel free to ask me.
WWW
http://blogs.msdn.com/ – "Silverlight 4 real-time Face Detection", Russian version.
http://facelight.codeplex.com/ – the "Facelight" project is hosted here. It allows real-time face recognition. If you are going to write serious software for person identification or system logon, you simply have to check out this project.
http://www.aforgenet.com/framework/ – AForge.NET is an excellent and easy-to-use framework for video and image handling.