How AI Can Create And Detect Fake News

Post written by

Indre Deksnyte

SVP Marketing of CUJO AI, with 13 years of experience marketing for tech companies, a PhD in Economics and more than 48 articles in economics magazines.


False news has been growing steadily around us, primarily as clickbait, and often goes viral. These are articles and stories created solely to mislead and misinform people into believing narratives that otherwise hold no merit. According to research published in Science magazine, the propagation of such media can be attributed to the fact that humans spread lies faster than they spread the truth.

The primary sources of information used to be journalists and authentic media outlets that had to verify their sources and the information they received; sadly, this isn’t always the case anymore. With advancements made in technology, the rumor and propaganda mills have been handed over to advanced AI algorithms that are designed to create believable content—which usually isn’t true.

The development of this technology is a great leap from Siri, optical character recognition or spam filters, but teaching AI to recognize vast amounts of data and manipulate it is a dangerous proposition.

This leads to the question: How can fake news be detected?

The good news is that algorithms designed to distinguish between human- and AI-generated content have been developed. On the other hand, these algorithms also come with the ability to create fake news themselves.

Falsehood Detection

While artificial intelligence seems like relatively new technology, it has been helping us sort through content for a while now.

The statistical foundation of spam filters—Bayes' theorem, which underpins the classic naive Bayes classifier—was developed in the 1700s. Today, machine learning built on such ideas handles a number of tasks, such as categorizing our emails to determine which correspondence is useful and which is just unsolicited mass distribution.
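To make the spam-filter idea concrete, here is a minimal sketch of a naive Bayes classifier using only the Python standard library. The training messages and the equal-priors assumption are illustrative toys, not a real corpus or a production filter:

```python
import math
from collections import Counter

# Toy training data (hypothetical examples, not a real corpus).
spam = ["win free prize now", "free money win big"]
ham = ["meeting notes attached", "lunch at noon tomorrow"]

def train(docs):
    """Count word frequencies across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.split())
    return counts

spam_counts, ham_counts = train(spam), train(ham)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(message, counts):
    """Log P(message | class) with add-one (Laplace) smoothing."""
    total = sum(counts.values())
    return sum(
        math.log((counts[w] + 1) / (total + len(vocab)))
        for w in message.split()
    )

def is_spam(message):
    # Equal class priors assumed for simplicity.
    return log_likelihood(message, spam_counts) > log_likelihood(message, ham_counts)

print(is_spam("win a free prize"))        # True on this toy data
print(is_spam("notes from the meeting"))  # False on this toy data
```

The same feedback dynamic the article describes applies here: a spammer who learns which words push the likelihood toward "spam" can simply avoid them.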

This work has also spurred the use of neural networks as discriminators, models that detect discrepancies in articles to determine authenticity.

Some use comparative analysis between similar posts to check if the information and facts contained are true and match up to reliable sources. Others look for differences between the title and content—thereby identifying clickbait articles.
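The title-versus-content check can be sketched with bag-of-words cosine similarity. The threshold below is a hypothetical illustration, not a value from any deployed system, and real detectors use far richer features:

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def looks_like_clickbait(title: str, body: str, threshold: float = 0.1) -> bool:
    """Flag articles whose body barely overlaps with the headline."""
    return cosine_similarity(title, body) < threshold

# A headline with no lexical overlap with the body is flagged;
# one that matches the body is not.
print(looks_like_clickbait("you will never believe this trick",
                           "quarterly earnings rose amid strong demand"))  # True
print(looks_like_clickbait("water found on mars",
                           "researchers found water ice on mars today"))   # False
```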

A newer entrant is Grover, which reports 92% accuracy in detecting machine-generated content.

Why We Haven't Reached 100% Accuracy

The biggest drawback of these systems is that they assume that fake text will always have telltale markers.

How many times has a spam email sneaked its way into your inbox? While the numbers may not be high, there’s no denying that it happens—primarily because the creators of spam keep evolving. They figure out the triggers that flag their content and learn to avoid them.

The same is true with false news: As the technology to catch and eliminate it gets better, so do the algorithms that create it.

This has created an endless back-and-forth between those who are working to curb the spread of false news and those who generate misleading content.

How Neural Network Algorithms Learn To Differentiate

Since the main purpose of creating such algorithms is to differentiate between real and fake information, developers first need to teach the system what these are.

Most of these AI systems, including Grover, are developed by feeding them existing articles from various fake news datasets. These are huge labeled collections of articles—some authentic, some fabricated—that help the AI learn the patterns of human writing.

Some of these datasets include:

• RealNews: This dataset was used to train Grover and contains news articles drawn from over 5,000 publications, requiring about 120 GB of space.

• Kaggle: This fake news dataset, hosted on the Kaggle platform, takes up around 57 MB of disk space and contains roughly 13,000 rows and 20 columns of data.

• George McIntire: Named after the data visualization analyst, this set of fake news data requires 31 MB of disk space.

Once this process is over, AI can build complex models that are able to identify how certain words are used and how different concepts are linked together.
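One simple way to see "how different concepts are linked together" is to count word co-occurrences across documents. This toy sketch illustrates the idea only; it is not the actual training procedure of Grover or any specific system:

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs):
    """Count how often each pair of distinct words appears in the same document."""
    pairs = Counter()
    for doc in docs:
        # Sorting gives each pair a canonical order, so (a, b) and (b, a) merge.
        words = sorted(set(doc.lower().split()))
        pairs.update(combinations(words, 2))
    return pairs

corpus = [
    "fake news spreads fast",
    "fake news detection is hard",
]
links = cooccurrence(corpus)
print(links[("fake", "news")])  # 2: the pair appears in both documents
```

Real systems learn far denser representations (word embeddings, transformer attention), but the underlying signal—which words tend to appear together—is the same.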

Fake News Creation By AI

An important method for training these discriminator programs is the "adversarial" approach: generating malicious or misinforming content that tries to slip past detection programs, then learning from what gets caught.

Grover and other AI systems improve by generating articles and then using their own detection programs to evaluate how believable those articles are. If the generated content isn’t convincing enough, the generator produces new text and, through this feedback, learns what reads as real and what doesn’t.
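The generator–detector feedback loop can be sketched schematically. The stub "models" below are hypothetical placeholders that only mimic the dynamic—each time the detector flags the output, the generator improves—and bear no resemblance to Grover's actual architecture:

```python
import random

def generate_article(skill: float) -> float:
    """Stub generator: returns a 'believability' score that rises with skill."""
    return min(1.0, skill + random.uniform(0.0, 0.2))

def detector_flags(believability: float, threshold: float = 0.8) -> bool:
    """Stub detector: flags articles below a believability threshold as fake."""
    return believability < threshold

skill = 0.1
for step in range(100):
    article = generate_article(skill)
    if detector_flags(article):
        # Detection feedback: the generator adjusts and tries again.
        skill = min(1.0, skill + 0.05)

print(skill)  # the generator's skill climbs until the detector stops flagging it
```

This is the double-edged sword in miniature: the very signal that lets the detector reject an article is what teaches the generator to write a better one.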

This ability to "generate" fake articles is what works as a double-edged sword.

Another advancement is the creation of "deepfakes," which are doctored or artificially generated videos and photos that can superimpose the physique and face of one person on another to make it seem like they carried out a certain action.

Deepfakes can have drastic consequences when misused. From propaganda that incites hatred and violence to fake speeches and doctored videos that malign public figures, deepfakes can sow confusion and result in a severe loss of public trust and reputation.

A prime example of such misuse was the 2019 release of a doctored video of Facebook CEO Mark Zuckerberg.

Although it is difficult to identify whether such media is authentic, the technology to fight this form of AI manipulation is still in the works.

Can The Loop Be Broken?

Fake news is shared without being vetted and verified, and such proliferation is what creates the need for more. In fact, a Pew Research Center survey found that 10% of respondents admitted to sharing a news story online that they knew was fake, while 49% had shared news that they later found to be false.

What we can do for now is create awareness to combat this propagation of fake news. In other words, we must stop sharing such media to take away its credibility.

Detecting fake news is a complex process that starts with awareness and education. You must verify the source. Quality information is typically fact-checked or peer-reviewed. You should rely on the insights that come from reputable channels or are sourced from trusted research companies.

Now, more people than ever rely on the internet as their main source of information. However, with this medium being easily polluted with a plethora of false information, everything we learn from online sources has to be carefully questioned and evaluated.

Forbes Communications Council is an invitation-only community for executives in successful public relations, media strategy, creative and advertising agencies.