Assignment 4 - Petya Gavrilova
Information Technologies in Organizations
Winter Semester 21/22
Assignment 4
Data analysis
Petya Gavrilova
Fac. № 201218034
Email: petya.gavrilova@fdiba.tu-sofia.bg
MonkeyLearn is a machine learning tool for text analysis. It enables users to extract actionable data from
unstructured text. It can recognize the topic or sentiment expressed in texts such as tweets, chats,
reviews, and articles. One of MonkeyLearn's standout features is the ability to train a reliable
machine learning model on the user’s own data in real time.
MonkeyLearn's models are divided into two groups:
Classification: models that take text and return labels or categories.
Extraction: models that extract specific information from a text.
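The difference between the two groups can be illustrated with a minimal sketch. The functions below are hypothetical toy examples written for this report and are not MonkeyLearn's models or API: the classifier returns one label for the whole text, while the extractor returns pieces of information found inside the text.

```python
import re


def classify_text(text: str) -> str:
    """Classification: take a text and return a single label."""
    # Toy rule, assumed only for illustration: a trailing "?" marks a question.
    return "question" if text.strip().endswith("?") else "statement"


def extract_emails(text: str) -> list:
    """Extraction: pull specific information (here, email addresses) out of a text."""
    return re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)


if __name__ == "__main__":
    message = "Can you send the invoice to petya@example.com?"
    print(classify_text(message))
    print(extract_emails(message))
```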
The tool has a wide variety of functionalities, which are described below.
I. Sentiment Analysis
Sentiment analysis (or opinion mining) uses NLP to determine whether data is positive, negative or
neutral. Sentiment analysis is often performed on textual data to help businesses monitor brand and
product sentiment in customer feedback, and understand customer needs.
Since customers express their thoughts and feelings more openly than ever before, sentiment analysis is
becoming an essential tool to monitor and understand that sentiment. Automatically analyzing
customer feedback, such as opinions in survey responses and social media conversations, allows brands
to learn what makes customers happy or frustrated, so that they can tailor products and services to
meet their customers’ needs.
Sentiment analysis is extremely important because it allows businesses to understand the sentiment of
their customers towards their brand. By automatically sorting the sentiment behind social media
conversations, reviews, and more, businesses can make better and more informed decisions.
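As a rough illustration of the idea, the sketch below labels a text as positive, negative or neutral by counting words from two small word lists. The word lists and the function name are assumptions made for this example; a real tool such as MonkeyLearn uses trained machine learning models rather than a fixed list.

```python
import re

# Hypothetical toy lexicons, chosen only for this illustration.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy", "friendly"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "terrible", "slow", "rude"}


def classify_sentiment(text: str) -> str:
    """Label a text as positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    reviews = [
        "Great room and friendly staff",
        "Terrible service, slow check-in",
        "The hotel is near the station",
    ]
    for review in reviews:
        print(review, "->", classify_sentiment(review))
```

A word-counting approach like this cannot handle negation or sarcasm, which is one reason why trained models are preferred in practice.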
II. Net Promoter Score
Net promoter score, or NPS, is a customer experience metric designed to gauge customer loyalty by
asking customers how likely they are to recommend a company or a product.
NPS’s impact on a company's ability to remain competitive can also be explained with game theory: NPS is
a zero-sum game because each customer can occupy only one group. Thus, turning a detractor into a
promoter produces a double swing per customer: the detractor group loses one member and the promoter
group gains one. Situations where this is possible should be given as much attention as possible.
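The double swing can be checked with a short calculation. The sketch below assumes the standard NPS convention, which is not spelled out in this report: answers are given on a 0-10 scale, 9-10 counts as a promoter, 0-6 as a detractor, and NPS is the percentage of promoters minus the percentage of detractors. The survey answers are made-up sample data.

```python
def net_promoter_score(scores):
    """Return the NPS (from -100 to 100) for a list of 0-10 survey answers."""
    if not scores:
        raise ValueError("at least one score is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    # Passives (7-8) count only in the total number of respondents.
    return 100 * (promoters - detractors) / len(scores)


if __name__ == "__main__":
    before = [10, 9, 8, 7, 6, 5, 3, 9, 10, 2]  # 4 promoters, 4 detractors
    after = before.copy()
    after[-1] = 9  # one detractor (answer 2) becomes a promoter
    print(net_promoter_score(before))  # 0.0
    print(net_promoter_score(after))   # 20.0
```

With ten respondents, each one represents 10 percentage points, yet converting a single detractor into a promoter moves the score by 20 points.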
III. Voice of Customer
Voice of the customer (VoC) is used to describe how customers talk about a brand, products, and
services. VoC gauges customer expectations across the entire customer journey, allowing businesses to
become more customer-centric, improve products and/or services, and increase customer retention.
Voice of customer helps businesses improve their products or services and inform product
development, so they can refine their offerings into something customers truly feel good about
spending money on.
IV. Data Cleaning
Data cleaning (also known as data cleansing or data scrubbing) is the process of correcting or removing
corrupt, incorrect, or unnecessary data from a data set (or group of datasets) before data analysis. This
way, you will analyze only relevant data, and your results will be more accurate.
Successful data cleaning measures will ensure that your analysis results are accurate and consistent. We
often hear about the power of data and the need for data-driven decision-making in business. But that
only really works when you use clean data from the outset.
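A minimal sketch of such a cleaning step for text data is shown below. The function name and the sample responses are assumptions for illustration; which records count as "unnecessary" depends on the data set.

```python
def clean_responses(raw_responses):
    """Trim whitespace, drop empty or missing entries, and remove duplicates."""
    seen = set()
    cleaned = []
    for response in raw_responses:
        if response is None:
            continue  # missing value
        text = " ".join(response.split())  # trim and collapse repeated whitespace
        if not text:
            continue  # nothing left after trimming
        key = text.lower()  # treat case variants as duplicates
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    raw = ["  Great   service ", "", None, "great service", "Slow delivery"]
    print(clean_responses(raw))  # ['Great service', 'Slow delivery']
```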
V. Customer Experience
Customer experience is how customers perceive your brand as a whole based on the entirety of their
interaction with your business. Understanding and developing a strategy to improve CX is key to
customer retention and brand promotion for any business.
According to a recent Zendesk study, approximately 75% of customers are willing to spend more with
companies that deliver a good customer experience.
Surveys also indicate that customer experience is the number one brand differentiator,
outpacing product quality and all other factors.
Satisfied customers are likely to become loyal customers who spread the word about your brand.
Dissatisfied customers are likely to do the opposite.
VI. Data Visualization
Data visualization is the presentation of data or information in a visual format. Visual data is easier for
the human brain to process, allowing it to single out trends and interpret patterns. Visualization serves as a
bridge between humans and the increasingly complex data at our disposal: it shapes
data so that it is displayed in the ways our brains can best process.
Charts, plots and graphs might not seem to be the most interesting topic right off the bat. Many might
recall early school lessons on pie charts, Venn diagrams, or x and y axes, and quickly become bored.
But for business ventures, proper data visualization can be the difference between success and disaster.
VII. Unstructured Data
Unstructured data is data that has not been organized in a predefined manner. Unstructured
data is typically textual, like open-ended survey responses and social media conversations, but can also
be non-textual, like images, video, and audio.
Unstructured information is growing quickly due to increased use of digital applications and services.
Some estimates say that 80-90% of company data is unstructured, and its volume continues to grow
rapidly every year.
While structured data is important, unstructured data is even more valuable to businesses if analyzed
correctly. It can provide a wealth of insights that statistics and numbers just can’t explain.
VIII. Machine Learning
Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to “self-learn”
from training data and improve over time, without being explicitly programmed. Machine learning
algorithms are able to detect patterns in data and learn from them, in order to make their own
predictions. In short, machine learning algorithms and models learn through experience.
In traditional programming, a computer engineer writes a series of directions that instruct a computer
how to transform input data into a desired output. Instructions are mostly based on an IF-THEN
structure: when certain conditions are met, the program executes a specific action.
Machine learning, on the other hand, is an automated process that enables machines to solve problems
with little or no human input, and take actions based on past observations.
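The contrast can be sketched with a small example. Both functions below decide whether a support ticket is urgent based on how long the customer has been waiting. The 24-hour cutoff, the training data and the function names are hypothetical; the learned version uses the midpoint between the two class averages, which is one very simple learning method (a one-dimensional nearest-centroid classifier) and only a stand-in for real machine learning algorithms.

```python
from statistics import mean


def rule_based_is_urgent(wait_hours: float) -> bool:
    """Traditional programming: the IF-THEN cutoff is hard-coded by a person."""
    return wait_hours > 24  # hypothetical hand-picked threshold


def learn_threshold(examples):
    """Learn a cutoff from labelled (wait_hours, is_urgent) pairs.

    The cutoff is the midpoint between the average wait of urgent tickets
    and the average wait of non-urgent tickets.
    """
    urgent = [hours for hours, label in examples if label]
    not_urgent = [hours for hours, label in examples if not label]
    return (mean(urgent) + mean(not_urgent)) / 2


def learned_is_urgent(wait_hours: float, threshold: float) -> bool:
    """Apply the learned cutoff (assumes urgent tickets have longer waits)."""
    return wait_hours > threshold


if __name__ == "__main__":
    training_data = [(2, False), (4, False), (6, False),
                     (30, True), (40, True), (50, True)]
    threshold = learn_threshold(training_data)
    print(threshold)                         # 22.0
    print(rule_based_is_urgent(23))          # False
    print(learned_is_urgent(23, threshold))  # True
```

If new labelled tickets arrive, the learned threshold changes automatically, while the hard-coded rule has to be edited by a programmer.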
IX. Natural Language Processing
Natural Language Processing (NLP) is a field of Artificial Intelligence (AI) that makes human language
intelligible to machines. NLP combines the power of linguistics and computer science to study the rules
and structure of language, and create intelligent systems (run on machine learning and NLP algorithms)
capable of understanding, analyzing, and extracting meaning from text and speech.
Linguistics is used to understand the structure and meaning of human language by analyzing different aspects
like syntax, semantics, pragmatics, and morphology. Then, computer science transforms this linguistic
knowledge into rule-based or machine learning algorithms that can solve specific problems and perform
desired tasks.
X. Word cloud
A word cloud (also known as a tag cloud or text cloud) is a visual representation of a text, in which the
words appear bigger the more often they are mentioned.
Word clouds are great for visualizing unstructured text data and getting insights on trends and patterns.
For example, a free online word cloud generator can be used to analyze hotel reviews: the words that
guests mention most often appear largest.
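The sizing rule behind a word cloud can be sketched in a few lines. The stop-word list, the font size range and the sample review below are assumptions for illustration; the sketch only computes the sizes and does not draw the image.

```python
import re
from collections import Counter

# Hypothetical short stop-word list; real generators use much longer ones.
STOP_WORDS = {"the", "a", "and", "was", "is", "were", "but", "very", "to", "of"}


def word_cloud_sizes(text, min_size=10, max_size=40):
    """Map each non-stop word to a font size that grows linearly with its count."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


if __name__ == "__main__":
    reviews = "The room was clean and the staff were friendly. Clean room, clean pool."
    for word, size in sorted(word_cloud_sizes(reviews).items(), key=lambda kv: -kv[1]):
        print(word, size)
```

In this sample, "clean" is mentioned three times and gets the largest size, "room" comes second, and words mentioned once get the same smaller size.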
XI. Data Analysis
Data analysis is the process of cleaning, analyzing, interpreting, and visualizing data using various
techniques and business intelligence tools. Data analysis tools help you discover relevant insights that
lead to smarter and more effective decision-making. Data analysis focuses on turning raw data into
useful statistics, information, and explanations.
From preparing for worst-case scenarios to improving services and products, all types of data analysis
can help businesses make better decisions and create data-driven strategies.
By gaining first-hand insight into what’s wrong and why, leaders can define more effective strategies to
improve processes, prevent problems, detect growth opportunities, and decide where to focus
investments.
XII. Topic Analysis
Topic analysis (also called topic detection, topic modeling, or topic extraction) is a machine learning
technique that organizes and understands large collections of data, by assigning “tags” or categories
according to each individual topic or theme.
Topic analysis uses natural language processing (NLP) to break down human language so that you can
find patterns and unlock semantic structures within texts to extract insights and help make data-driven
decisions.
XIII. Customer Feedback
Customer feedback is the information and opinions your customers leave about your product, service,
or brand.
Often, it’s in the form of survey responses, but you can also find customer feedback in social media
conversations, online reviews, chats, customer support tickets, and more.
By gathering and analyzing customer feedback, you can find out which aspects of your business are
working well and which may require improvement.
XIV. Keyword Extractor
Keyword extraction (also known as keyword detection or keyword analysis) is a text analysis technique
that automatically extracts the most used and most important words and expressions from a text. It
helps summarize the content of texts and recognize the main topics discussed.
Keyword extraction uses machine learning, a branch of artificial intelligence (AI), together with natural language processing
(NLP) to break down human language so that it can be understood and analyzed by machines. It’s used
to find keywords in all manner of text: regular documents and business reports, social media
comments, online forums and reviews, news reports, and more.
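One common statistical way to find such keywords is TF-IDF, which scores a word highly when it is frequent in one text but rare in the other texts of a collection. The sketch below is a minimal version written for this report; the sample reviews and function names are assumptions, and it is not a description of how MonkeyLearn's extractor works internally.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(documents, doc_index, top_n=3):
    """Return the top_n terms of documents[doc_index] ranked by TF-IDF."""
    tokenized = [tokenize(doc) for doc in documents]
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    term_counts = Counter(tokenized[doc_index])
    total = sum(term_counts.values())
    scores = {
        term: (count / total) * math.log(len(documents) / doc_freq[term])
        for term, count in term_counts.items()
    }
    # Sort by score (highest first); ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


if __name__ == "__main__":
    reviews = [
        "the battery life of the phone is great",
        "the screen of the phone is bright",
        "the delivery was late and the box was damaged",
    ]
    print(extract_keywords(reviews, 0))  # ['battery', 'great', 'life']
```

Words such as "the" appear in every review, so their score is zero and they are never selected as keywords.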
XV. Text analysis
Text analysis (TA) is a machine learning technique used to automatically extract valuable insights from
unstructured text data. Companies use text analysis tools to quickly digest online data and documents,
and transform them into actionable insights.
You can use text analysis to extract specific information, like keywords, names, or company information
from thousands of emails, or categorize survey responses by sentiment and topic.
XVI. Text classification
Text classification is a machine learning technique that assigns a set of predefined categories to open-ended
text. Text classifiers can be used to organize, structure, and categorize pretty much any kind of
text, from documents, medical studies, and files to content from all over the web.
For example, news articles can be organized by topics; support tickets can be organized by urgency; chat
conversations can be organized by language; brand mentions can be organized by sentiment; and so on.
It’s estimated that around 80% of all information is unstructured, with text being one of the most
common types of unstructured data. Because of the messy nature of text, analyzing, understanding,
organizing, and sorting through text data is hard and time-consuming, so most companies fail to use it to
its full potential.
This is where text classification with machine learning comes in. Using text classifiers, companies can
automatically structure all manner of relevant text, such as emails, legal documents, social media,
chatbots, surveys, and more, in a fast and cost-effective way. This allows companies to save time
analyzing text data, automate business processes, and make data-driven business decisions.
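To show how a classifier can learn categories from examples, the sketch below implements a small multinomial Naive Bayes classifier with add-one smoothing. Naive Bayes is chosen here only because it is short and well known; the class name, the support ticket examples and the two categories are assumptions for illustration and do not describe MonkeyLearn's models.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split a text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # number of texts per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(doc_count / total_docs)  # log prior
            for token in tokenize(text):
                if token in self.vocabulary:  # ignore words never seen in training
                    score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    tickets = [
        "I was charged twice on my invoice",
        "Refund the payment to my card",
        "The app crashes when I log in",
        "Cannot log in after the update",
    ]
    categories = ["billing", "billing", "technical", "technical"]
    model = NaiveBayesTextClassifier().fit(tickets, categories)
    print(model.predict("app crashes after update"))  # technical
    print(model.predict("refund for my invoice"))     # billing
```

Four training texts are far too few for a reliable model; the example only shows the mechanics of learning categories from labelled text.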