7COM1039-0109-2021
REPORT
2023
Introduction
Fake news consists of untrue claims that can be disproved through independent verification. When a country is repeatedly misled about a particular figure, or the cost of a particular service is grossly exaggerated, the result can be unrest, as it was during the Arab Spring. Groups such as the House of Representatives and the Slap shot project have emerged to tackle challenges like holding authors accountable. However, their applicability is limited because they rely on manual human detection, which is neither accountable nor feasible in a world where millions of articles are posted or removed every second. A dependable automatic index score or rating system for the credibility of individual outlets and the overall news landscape might be the solution.
Background of the research
Rapid change is occurring throughout the globe. Digital technology offers many benefits, but it is not without flaws, and several obstacles stand in the way of further progress in the digital age. The spread of false information is one such problem: false information can go viral online very quickly. Rumours about a person or company are typically spread in order to damage their public standing; they may target a particular political group or simply serve as another form of propaganda. Erroneous information can be spread through many online forums, including popular networking platforms such as Facebook and Twitter. Machine learning, a subfield of computational intelligence, makes it possible to build systems that are both versatile and quick to learn. There are many kinds of ML algorithms to choose from, including supervised, unsupervised, and reinforcement learning (Paka et al., 2021). Data sets known as training sets are used to teach these systems, and once trained, the resulting models can be applied in a variety of settings. Machine learning is used across many different fields, most often to make predictions or to uncover patterns that were previously hidden.
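As a rough illustration of the supervised setting described above, a minimal sketch in Python is shown below. It assumes scikit-learn and uses a toy labelled training set; none of these details come from the report itself.

```python
# Minimal sketch of supervised learning: a labelled training set teaches the model,
# which then predicts labels for unseen text. Toy data and scikit-learn are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

train_texts = ["shocking miracle cure revealed", "parliament passes annual budget bill"]
train_labels = [1, 0]  # 1 = fake, 0 = real (the labelling convention used later in the report)

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)   # encode text as numerical features

model = MultinomialNB()
model.fit(X_train, train_labels)                  # train the classifier on the training set

print(model.predict(vectorizer.transform(["miracle cure for sale"])))
```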
Aim of the Project
The primary objective of the research is to identify textual features that can be used to distinguish fraudulent news reports from legitimate ones.
Research Questions
How exactly can machine learning assist in the identification of fake news?
How can AI be used to identify false information on social media, and what steps can be taken to
stop its spread?
What kind of effects does false news have on the country?
How are machine learning classifiers trained to recognise fake news?
Objectives
To understand the concept of fake news detection.
To understand the detrimental impact that fake news has on both the general public and political parties.
To understand the primary obstacles that must be overcome in order to detect fake news using machine learning.
Description of the Idea
People can easily get news thanks to the proliferation of online channels. The problem, though, is that these sites also give online criminals the opportunity to spread fake news. Such news can affect an individual or society as a whole, and people often trust it even when there is no supporting evidence. Detecting fake news is not a simple task: if false information is not identified quickly, it can spread rapidly and be accepted unquestioningly. Individuals, institutions, and political parties are all potential targets. According to Dou et al. (2021), fake news influenced people's opinions and voting decisions in the 2016 US election.
Researchers from a variety of academic institutions are attempting to detect fake news, and machine learning is assisting in this endeavour. Researchers apply several different algorithms to spot bogus news. Wani et al. (2021) found that identifying false news is challenging and used machine learning to detect it. According to Pérez-Rosas et al. (2017), the prevalence of fake news is growing, so it is essential to be able to recognise it. Machine learning algorithms can be trained to do exactly that; once trained, they spot fake news automatically.
Theory
Social Media and Fake News
The term "social media" is used to describe a wide range of online resources, including
discussion forums, social networking sites, microblogging platforms, social bookmarking sites,
and wikis. An alternative school of thought holds that misleading news can also arise from inadvertent causes, such as educational shock or accidental behaviour, as happened in the Nepal earthquake case. In 2020, the health of the global
population was threatened by the widespread dissemination of false health news. Early in 2020, the World Health Organisation (WHO) issued a warning that the COVID-19 outbreak had sparked a major 'infodemic': a surge of both legitimate and false news, including a substantial quantity of misinformation (Wani et al., 2021).
Most people you encounter on websites such as Facebook are real; nevertheless, some may be
malicious trolls whose only goal is to spread disinformation. According to the FBI, social bots,
spammers, and cybernetic users make up the three most prevalent types of fake news
contributors. The low barrier to entry on social media platforms does little to deter the creation of malicious user accounts. Accounts managed by a computer programme rather than a human are known as "social bots" (Reis et al., 2019): software that can post to social networking sites and interact with other users without human intervention. How a social bot is built largely determines whether it is harmful; a bot designed solely to cause harm is far more likely to contribute significantly to the spread of misleading information on social media. According to research,
"social bots tilted the 2016 United States campaign discussions on an immense scale," with
approximately 19 million bot identities tweeting favourable reviews of either Clinton or Trump
in the week leading up to election day.
AI Detection in Fake News
Facebook has implemented a set of policies and features designed to reduce the spread of false information. Content that independent fact-checkers have rated false is labelled with warnings and additional context, and its circulation is reduced; inaccurate information that could cause immediate harm is removed altogether. To scale these efforts, however, a reliable way is needed to detect potentially misleading content in new posts and alert impartial fact-checkers, and then to identify new variants of that content automatically so that human fact-checkers can focus on reviewing genuinely new material. More than 26 million pieces of content viewed by people in the United States on Facebook between March 1 and Election Day were flagged after being debunked by the third-party fact-checker PolitiFact. In addition to flagging items for possible review, AI algorithms can automatically identify fresh instances of previously debunked claims. These methods are not perfect, but they are a step in the right direction (Shu et al., 2019).
A second method for identifying false news involves checking the reliability of an article's claims, regardless of whether the text was written by a human or a computer. Services such as Snopes, PolitiFact, and FactCheck use human editors to verify claims by conducting investigations and contacting reliable sources directly. These organisations are progressively adopting artificial intelligence to help process enormous data sets.
Facts are written in one way, while lies are written in another. Researchers are using an archive of April Fools' Day hoaxes published over a span of 14 years to teach AI systems to discern fact from fiction.
One alternative method relies on assigning a rating to each news source, which is then used to validate the data: for a claim to be accepted as true by the general public, it must first be supported by other sources that have earned high ratings. The Trust Project, for instance, analyses a news organisation's ethical standards, sources and methodology, corrections procedures, and other aspects to establish its credibility.
Methodology
The method of classification used in this analysis is discussed here. In light of the
aforementioned paradigm, we build a method for detecting fake news stories. In this method,
supervised machine learning is used to classify the dataset. Starting with gathering datasets, the
next step in this classification challenge is features selection, followed by developing and
evaluating the dataset, and finally classifier execution. The approach relies heavily on the
aforementioned algorithms, which are only one part of several different designs of experiments
applied to a dataset. Examples of classifiers include random forests, support vector machines,
naive Bayes, majority voting, and more. Each algorithm undergoes its own set of tests in
isolation. To maximise accuracy and precision, it may be used in tandem with other algorithms
(Reis et al., 2019).
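A minimal sketch of this experimental design is given below: each classifier is evaluated on its own and then combined through majority voting. It assumes scikit-learn, and the synthetic feature matrix produced by make_classification merely stands in for the real, unspecified features.

```python
# Sketch: evaluate each classifier in isolation, then combine them by majority voting.
# scikit-learn and the synthetic data are assumptions; they are not from the report.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifiers = {
    "random_forest": RandomForestClassifier(random_state=42),
    "svm": SVC(random_state=42),
    "naive_bayes": GaussianNB(),
}

# Each algorithm undergoes its own set of tests in isolation.
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    preds = clf.predict(X_test)
    print(name, accuracy_score(y_test, preds), precision_score(y_test, preds))

# Majority (hard) voting combines the individual models to improve accuracy and precision.
voting = VotingClassifier(estimators=list(classifiers.items()), voting="hard")
voting.fit(X_train, y_train)
print("majority_vote", accuracy_score(y_test, voting.predict(X_test)))
```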
These data sets were obtained from Kaggle, which provides both a fake news dataset and a real news dataset. The real news dataset contains 21,417 stories, while the fake news dataset contains 23,481. In the label column, 1 indicates fake news and 0 indicates real news. The two data sets were combined using an existing data-handling routine.
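A sketch of that merging step is shown below. The file names Fake.csv and True.csv are assumptions (common in Kaggle fake/real news datasets but not given in the report); the labelling convention matches the one stated above.

```python
# Sketch of building the combined dataset: 1 marks fake news, 0 marks real news.
# The file names Fake.csv and True.csv are assumptions, not taken from the report.
import pandas as pd

fake = pd.read_csv("Fake.csv")   # ~23,481 fake stories
true = pd.read_csv("True.csv")   # ~21,417 real stories

fake["label"] = 1
true["label"] = 0

# Concatenate and shuffle so fake and real stories are interleaved.
data = pd.concat([fake, true], ignore_index=True)
data = data.sample(frac=1, random_state=42).reset_index(drop=True)

print(data["label"].value_counts())
```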
During the model's creation, the following method was used to detect political disinformation. The first step is to collect data sets of political news; some quick preprocessing then cuts down the noise. The next phase involves POS tagging and feature selection with the help of NLTK. After splitting the dataset, a classification method is developed for the proposed classifier using machine learning techniques such as Naive Bayes and Random Forests. In essence, NLTK is used to preprocess the dataset, and the resulting representation is used to apply the algorithms to the training subset of the data (Fig). With Naive Bayes and Random Forests built into the way the system responds to messages, the resulting model is used to inform future decisions. Once all of the tests have been run and the results verified, the model's accuracy is double-checked to confirm that it is acceptable, and the approach is then applied to data it has never seen before. Because the dataset is constructed so that half of the data is fabricated and the other half is genuine, the model's baseline accuracy is 50%. Data are randomly drawn from both datasets and merged to build the full dataset; 80% is used for training and the remaining 20% is held out as a test set once the model is complete. To use a classifier on text data, extraneous information must first be removed through POS (Part of Speech) tagging and tokenisation using NLTK (Natural Language Toolkit). The resulting tokens are then encoded as numerical values so that the machine learning algorithms can take them as input (Pérez-Rosas et al., 2017).
Fig: Fake Detector Model (Source: Pérez-Rosas et al., 2017)
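The pipeline just described might look roughly like the sketch below. It assumes NLTK and scikit-learn, reuses the combined data frame from the earlier merging sketch, and invents the column name "text" and the POS filter for illustration; the exact features used in the report are not specified.

```python
# Sketch of the described pipeline: NLTK preprocessing (tokenisation, POS filtering),
# numerical encoding, an 80/20 train/test split, and Naive Bayes / Random Forest models.
# The "text" column and the choice of POS tags kept are assumptions.
import nltk
from nltk import pos_tag, word_tokenize
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

def preprocess(text):
    # Tokenise, POS-tag, and keep only nouns, verbs and adjectives to reduce noise.
    tagged = pos_tag(word_tokenize(text.lower()))
    return " ".join(word for word, tag in tagged if tag[0] in ("N", "V", "J"))

texts = data["text"].astype(str).map(preprocess)   # `data` comes from the merging sketch above

# Encode the cleaned text as numerical TF-IDF features the classifiers can consume.
vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(texts)
y = data["label"]

# 80% of the data is used for training; the remaining 20% is held out as the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

for model in (MultinomialNB(), RandomForestClassifier(random_state=42)):
    model.fit(X_train, y_train)
    print(type(model).__name__, accuracy_score(y_test, model.predict(X_test)))
```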
Plan to Conduct the Research
The whole research endeavour can be planned and executed within seven weeks of the study beginning. Only secondary information and data from reliable sources will be used to write the report. The journals, magazines, and newspapers used for this research have all been reviewed and found to be reliable. A Gantt chart will be used to illustrate the weekly planning and execution of the various tasks.
References
Dou, Y., Shu, K., Xia, C., Yu, P.S., Sun, L., 2021. User preference-aware fake news detection, in
Proceedings of the 44th International ACM SIGIR Conference on Research and Development in
Information Retrieval. pp. 2051–2055.
Paka, W.S., Bansal, R., Kaushik, A., Sengupta, S., Chakraborty, T., 2021. Cross-SEAN: A cross-
stitch semi-supervised neural attention model for COVID-19 fake news detection. Appl. Soft
Comput. 107, 107393.
Pérez-Rosas, V., Kleinberg, B., Lefevre, A., Mihalcea, R., 2017. Automatic detection of fake news. arXiv preprint arXiv:1708.07104.
Reis, J.C., Correia, A., Murai, F., Veloso, A., Benevenuto, F., 2019. Supervised learning for fake
news detection. IEEE Intell. Syst. 34, 76–81.
Shu, K., Zhou, X., Wang, S., Zafarani, R., Liu, H., 2019. The role of user profiles for fake news
detection, in Proceedings of the 2019 IEEE/ACM International Conference on Advances in
Social Networks Analysis and Mining. pp. 436–439.
Wani, A., Joshi, I., Khandve, S., Wagh, V., Joshi, R., 2021. Evaluating deep learning approaches for COVID-19 fake news detection, in International Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation. Springer, pp. 153–163.
Zhou, X., Zafarani, R., 2020. A survey of fake news: Fundamental theories, detection methods, and opportunities. ACM Comput. Surv. 53, 1–40.