Artificial Intelligence Being Trained To Combat DeepFakes
A team of researchers from the University at Albany has developed a method of combating Deepfake videos, using machine learning techniques to search videos for the digital “fingerprints” left behind when a video has been altered.
One of the biggest concerns in the tech world over the past couple of years has been the rise of Deepfakes. Deepfakes are fake videos constructed by artificial intelligence algorithms run through deep neural networks, and the results are shockingly good – sometimes difficult to tell apart from a real, genuine video.
AI researchers, ethicists, and political scientists worry that Deepfake technology will eventually be used to influence political elections, spreading misinformation in a form far more convincing than a fake news story. To provide some defense against the manipulation and misinformation that Deepfakes can cause, researchers from the University at Albany have created tools to assist in the detection of fake videos.
The Subtle Tells Of A Fake Video
Deepfake programs are capable of merging different images together into a video, compositing images of one person onto another, for instance. These programs can be used to make powerful and influential people, like politicians, appear to say things they never actually said. Deepfake programs operate by analyzing thousands of images of a person from different angles, saying different things, and making different facial expressions, and learning the features that define that person.
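As a rough illustration of this kind of system, the sketch below shows the shared-encoder, two-decoder autoencoder design popularized by early face-swap tools. The layer sizes, PyTorch framing, and variable names are illustrative assumptions, not the specific programs discussed here.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a 64x64 RGB face crop to a compact latent code."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Reconstructs a face crop from the shared latent code."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 8 -> 16
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 16 -> 32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 32 -> 64
        )

    def forward(self, z):
        return self.net(self.fc(z).view(-1, 128, 8, 8))

# One encoder is shared; each identity gets its own decoder.
encoder = Encoder()
decoder_a, decoder_b = Decoder(), Decoder()

# Training (not shown) reconstructs person A through decoder_a and person B
# through decoder_b. At swap time, a frame of person A is encoded and then
# decoded with decoder_b, producing B's likeness with A's pose and expression.
face_a = torch.rand(1, 3, 64, 64)      # stand-in for a real face crop
swapped = decoder_b(encoder(face_a))   # B's face, A's expression
```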
However, Deepfakes aren’t exactly flawless yet. The videos have certain characteristics that another algorithmic system can analyze to determine whether the video is fake. One of these characteristics is that people in Deepfake videos don’t blink as often as real people do. The deep neural networks that generate the fakes don’t learn to reproduce blinking the way humans actually blink: a network is limited by the data it is trained on, humans spend far more time with their eyes open than closed, and photos of people mid-blink are usually discarded, which biases the training data toward open eyes.
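To make the blink idea concrete, here is a minimal sketch of how a detector might estimate a subject’s blink rate from per-frame eye landmarks, assuming those landmarks come from some off-the-shelf facial landmark detector. The eye-aspect-ratio formula, the landmark ordering, and the thresholds are illustrative assumptions, not the team’s actual implementation.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) array of landmark coordinates around one eye, ordered
    corner, top, top, corner, bottom, bottom (illustrative layout).
    The ratio collapses toward zero when the eyelid closes."""
    vertical = (np.linalg.norm(eye[1] - eye[5]) +
                np.linalg.norm(eye[2] - eye[4]))
    horizontal = 2.0 * np.linalg.norm(eye[0] - eye[3])
    return vertical / horizontal

def blinks_per_minute(eye_landmarks_per_frame, fps, closed_thresh=0.2):
    """Count open-to-closed transitions and normalize by clip duration."""
    blinks, eye_open = 0, True
    for eye in eye_landmarks_per_frame:
        ear = eye_aspect_ratio(np.asarray(eye, dtype=float))
        if eye_open and ear < closed_thresh:
            blinks += 1
            eye_open = False
        elif ear >= closed_thresh:
            eye_open = True
    minutes = len(eye_landmarks_per_frame) / (fps * 60.0)
    return blinks / minutes if minutes > 0 else 0.0

# People typically blink roughly 15-20 times per minute; a clip whose subject
# blinks far less often than that is worth a closer look (threshold assumed).
def looks_suspicious(eye_landmarks_per_frame, fps, min_rate=5.0):
    return blinks_per_minute(eye_landmarks_per_frame, fps) < min_rate
```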
Siwei Lyu, a professor at the University at Albany and leader of the team that created the fake-detection tools, says that a neural network might miss not only blinking but also other subtle yet important physiological signals, such as a normal breathing rate. Lyu says that while the team’s research specifically targets video created with Deepfake programs, any software trained on images of humans could miss subtle human cues, because still images don’t capture the entire physical human experience.
A Constant Competition
Lyu and his team weren’t surprised by this development. Lyu explains that Deepfakes will keep getting more sophisticated: once you determine which “tells” give a video away as a fake, removing those tells is just a technological problem. Compensating for a lack of blinking, for example, can be done simply by training the neural network on more images with eyes closed, or by training on sections of video. In other words, there will likely be a constant battle between fake makers and fake detectors, with fake makers developing more sophisticated faking techniques and fake detectors working to develop more sophisticated detection methods. Given this reality, the goal of Lyu’s team is simply to make convincing fakes harder and more time-consuming to produce, in the hope that this will deter fake makers.
Lyu’s research is funded by Media Forensics (MediFor), a DARPA initiative. DARPA and the wider intelligence and military community are deeply concerned about the advancement of Deepfake technology. MediFor was created in 2016 in response to the rapidly increasing quality of video fakes. The project’s goal is to create an automated media analysis system that can examine a video or photo and give the piece of media an “integrity score”.
MediFor’s analysis system arrives at an integrity score by examining three different levels of “tells”, signs that an image or video has been altered. At the first level, it searches for “digital fingerprints”: artifacts that show evidence of manipulation, such as image noise characteristic of a particular camera, compression artifacts, or irregular pixel intensities. The second level is a physical analysis, checking, for instance, whether reflections on a surface are where they should be given the light in the surrounding scene. The final level is a “semantic” analysis, comparing the suspected fake to images known to be real or to other data points that could falsify it. For example, if a video is claimed to be from a particular time and place, does the weather in the video match the weather reported on that day?
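As a purely hypothetical sketch of how the three levels could be folded into a single number, the snippet below blends one score per level with illustrative weights. The field names, weights, and scoring convention are assumptions for illustration, not MediFor’s actual scheme.

```python
from dataclasses import dataclass

@dataclass
class MediaAnalysis:
    """Scores in [0, 1]; higher means fewer signs of tampering."""
    digital: float   # level 1: noise patterns, compression artifacts, pixel stats
    physical: float  # level 2: lighting, shadows, reflections
    semantic: float  # level 3: consistency with external facts (weather, location)

def integrity_score(a: MediaAnalysis, weights=(0.5, 0.3, 0.2)) -> float:
    """Weighted blend of the three levels (weights are illustrative only)."""
    w_digital, w_physical, w_semantic = weights
    return (w_digital * a.digital +
            w_physical * a.physical +
            w_semantic * a.semantic)

clip = MediaAnalysis(digital=0.35, physical=0.8, semantic=0.9)
print(f"integrity score: {integrity_score(clip):.2f}")  # low digital score drags it down
```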
Racing To Create Defenses Against Misinformation
Unfortunately, as videos and images become easier to fake, there could come a time when not just individual images or videos are faked but entire events: a whole set of images or videos could be created that fabricates an event from multiple angles, adding to its perceived authenticity. Researchers are concerned about a future in which video is either trusted too much or not trusted at all, with valuable, genuine evidence thrown out along with the fakes. Motivated by these worries, other organizations are working on their own methods of detecting video fakes.
At Los Alamos National Laboratory, researchers are taking a different approach. “Basically we start with the idea that all of these AI generators of images have a limited set of things they can generate. So even if an image looks really complex to you or me just looking at it, there’s some pretty repeatable structure,” says Moore, a researcher at the lab.
Another method used at Los Alamos involves sparse coding algorithms. These algorithms examine many real and fake images, learning which elements and features real images have in common and which features fake images share. In essence, they build a “dictionary of visual elements” and cross-reference a suspect image against that dictionary to decide whether it is fake.
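A compact sketch of that dictionary idea, using scikit-learn’s DictionaryLearning: learn one dictionary from patches of real images and another from patches of fakes, then label a suspect image by which dictionary reconstructs its patches with less error. The patch handling, parameters, and decision rule here are illustrative assumptions, not Los Alamos’s actual system.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.preprocessing import normalize

def learn_dictionary(patches, n_atoms=64):
    """patches: (n_samples, n_features) array of flattened image patches."""
    model = DictionaryLearning(n_components=n_atoms,
                               transform_algorithm="omp",
                               transform_n_nonzero_coefs=5,
                               max_iter=200,
                               random_state=0)
    model.fit(normalize(patches))
    return model

def reconstruction_error(model, patches):
    """Mean squared error when patches are expressed in the dictionary."""
    x = normalize(patches)
    codes = model.transform(x)          # sparse coefficients per patch
    recon = codes @ model.components_   # rebuild patches from dictionary atoms
    return float(np.mean((x - recon) ** 2))

def classify(patches, real_dict, fake_dict):
    """Label an image by whichever 'visual dictionary' fits its patches better."""
    err_real = reconstruction_error(real_dict, patches)
    err_fake = reconstruction_error(fake_dict, patches)
    return "fake" if err_fake < err_real else "real"

# Usage (training patch sets are assumed to exist):
# real_dict = learn_dictionary(real_training_patches)
# fake_dict = learn_dictionary(fake_training_patches)
# print(classify(suspect_patches, real_dict, fake_dict))
```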
Researchers at Los Alamos and elsewhere are working hard on fake-detection methods precisely because humans are predisposed to believe their senses, to believe what is right in front of their eyes. (“People will believe whatever they’re inclined to believe,” says Moore.) It takes quite a bit of evidence to dissuade someone from believing something they’ve seen with their own eyes, and even more if what they’ve seen reinforces an existing belief. Ultimately, the solution may have to come not only from tools that detect fake images and videos but also from education about the manipulative power of Deepfakes and a broader cultural shift toward skepticism.