Assignment 4 - Petya Gavrilova

Information Technologies in Organizations

Winter Semester 21/22

Assignment 4

Data analysis

Petya Gavrilova

Fac. № 201218034

Email: petya.gavrilova@fdiba.tu-sofia.bg

MonkeyLearn is a text analysis machine learning tool. It enables users to extract actionable data from

unstructured text. Users can detect the topic or sentiment expressed in texts such as tweets, chats,

reviews, and articles. One of MonkeyLearn's standout features is the ability to train a highly reliable

machine learning model on the user’s own data in real time.

MonkeyLearn's models are divided into two groups:

Classification: models that take text and return labels or categories.

Extraction: models that extract specific information from a text.

The tool has a wide variety of functionalities, which are described below.

I. Sentiment Analysis

Sentiment analysis (or opinion mining) uses NLP to determine whether data is positive, negative or

neutral. Sentiment analysis is often performed on textual data to help businesses monitor brand and

product sentiment in customer feedback, and understand customer needs.

Since customers express their thoughts and feelings more openly than ever before, sentiment analysis is

becoming an essential tool to monitor and understand that sentiment. Automatically analyzing

customer feedback, such as opinions in survey responses and social media conversations, allows brands

to learn what makes customers happy or frustrated, so that they can tailor products and services to

meet their customers’ needs.

Sentiment analysis is extremely important because it allows businesses to understand the sentiment of

their customers towards their brand. By automatically sorting the sentiment behind social media

conversations, reviews, and more, businesses can make better and more informed decisions.
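
The idea of sorting texts into positive, negative, or neutral can be sketched in a few lines of Python. This is only a minimal illustration, assuming a tiny hand-picked word list; the names POSITIVE_WORDS, NEGATIVE_WORDS, and classify_sentiment are hypothetical, and a real tool such as MonkeyLearn relies on trained machine learning models rather than fixed word lists.

```python
import re

# Hypothetical, hand-picked word lists used only for illustration.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy", "friendly"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "terrible", "slow", "rude"}


def classify_sentiment(text: str) -> str:
    """Label a text as positive, negative, or neutral by counting word-list hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "Great hotel and friendly staff",
    "The service was slow and the food was terrible",
    "We arrived on Monday",
]
labels = [classify_sentiment(review) for review in reviews]
print(labels)
```

A word-list approach like this misses negation and sarcasm, which is one reason trained models are usually preferred in practice.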

II. Net Promoter Score

Net promoter score, or NPS, is a customer experience metric designed to gauge customer loyalty by

asking them how likely they are to recommend a company or a product.

NPS’s strong impact on a company’s ability to remain competitive can also be explained with game theory: NPS is

a zero-sum game because each customer can occupy only one group. Turning a detractor into a

promoter therefore produces a double swing per customer, since the detractor share falls while the promoter

share rises. Situations where this is possible should be given as much attention as possible.
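
The double swing can be checked with a small calculation. The sketch below assumes the usual NPS convention, which is not spelled out in the text above: customers answer on a 0 to 10 scale, scores of 9 or 10 count as promoters, scores of 0 to 6 count as detractors, and the score is the percentage of promoters minus the percentage of detractors. The function name and the sample ratings are hypothetical.

```python
def net_promoter_score(ratings):
    """Return NPS: percentage of promoters minus percentage of detractors."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for rating in ratings if rating >= 9)
    detractors = sum(1 for rating in ratings if rating <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Ten made-up survey answers: 4 promoters, 3 passives, 3 detractors.
before = [10, 9, 8, 7, 6, 3, 9, 10, 5, 8]

# The customer who answered 3 (a detractor) is won over and now answers 9.
after = list(before)
after[5] = 9

nps_before = net_promoter_score(before)
nps_after = net_promoter_score(after)
swing = nps_after - nps_before
print(nps_before, nps_after, swing)
```

With ten respondents, one customer is worth 10 percentage points, yet converting a single detractor moves the score by 20 points, because the promoter share rises and the detractor share falls at once.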

III. Voice of Customer

Voice of the customer (VoC) is used to describe how customers talk about a brand, products, and

services. VoC gauges customer expectations across the entire customer journey, allowing businesses to

become more customer-centric, improve products and/or services, and increase customer retention.

Voice of customer helps businesses improve their products or services and inform product

development, so they can refine their offerings into something customers truly feel good about

spending money on.

IV. Data Cleaning

Data cleaning (also known as data cleansing or data scrubbing) is the process of correcting or removing

corrupt, incorrect, or unnecessary data from a data set (or group of datasets) before data analysis. This

way, you will analyze only relevant data, and your results will be more accurate.

Successful data cleaning measures will ensure that your analysis results are accurate and consistent. We

often hear about the power of data and the need for data-driven decision-making in business. But that

only really works when you use clean data from the outset.
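
As a rough illustration of what removing unnecessary data can mean for text, the sketch below cleans a list of survey answers. It assumes that empty answers, placeholder answers, and duplicates are the unwanted entries; the function name clean_responses and the PLACEHOLDERS set are hypothetical choices for this example, not part of any specific tool.

```python
# Hypothetical placeholder answers that carry no information.
PLACEHOLDERS = {"n/a", "none", "-"}


def clean_responses(raw_responses):
    """Drop missing, placeholder, and duplicate answers and normalize whitespace."""
    seen = set()
    cleaned = []
    for response in raw_responses:
        if response is None:
            continue  # missing value
        text = " ".join(response.split())  # trim ends, collapse repeated spaces
        key = text.lower()  # compare answers case-insensitively
        if not text or key in PLACEHOLDERS or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


raw = ["  Great service! ", "great   service!", "", None, "Room was cold", "N/A"]
cleaned = clean_responses(raw)
print(cleaned)
```

Only the first spelling of a duplicated answer is kept, so the order of the input decides which variant survives.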

V. Customer Experience

Customer experience is how customers perceive your brand as a whole based on the entirety of their

interaction with your business. Understanding and developing a strategy to improve CX is key to

customer retention and brand promotion for any business.

According to a recent Zendesk study, approximately 75% of customers are willing to spend more with

companies that deliver a good customer experience.

Surveys also indicate that customer experience is the number one brand differentiator,

outpacing product quality and all other factors.

Satisfied customers are likely to become loyal customers who spread the word about your brand.

Dissatisfied customers are likely to do the opposite.

VI. Data Visualization

Data visualization is the presentation of data or information in a visual format. Visual data is easier for

the human brain to process, allowing it to single out trends and interpret patterns. It serves as the

integral bridge between humans and the increasingly complex data at our disposal: it shapes

data so that it is displayed in the ways our brains can best process.

Charts, plots and graphs might not seem to be the most interesting topic right off the bat. Many might

recall early schooling lessons on pie graphs, venn diagrams, or x and y axes, and quickly become bored.

But for business ventures, proper data visualization can be the difference between success and disaster.

VII. Unstructured Data

Unstructured data are datasets that have not been structured in a predefined manner. Unstructured

data is typically textual, like open-ended survey responses and social media conversations, but can also

be non-textual, like images, video, and audio.

Unstructured information is growing quickly due to increased use of digital applications and services.

Some estimates say that 80-90% of company data is unstructured, and that its volume continues to grow

rapidly every year.

While structured data is important, unstructured data is even more valuable to businesses if analyzed

correctly. It can provide a wealth of insights that statistics and numbers just can’t explain.

VIII. Machine Learning

Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to “self-learn”

from training data and improve over time, without being explicitly programmed. Machine learning

algorithms are able to detect patterns in data and learn from them, in order to make their own

predictions. In short, machine learning algorithms and models learn through experience.

In traditional programming, a computer engineer writes a series of directions that instruct a computer

how to transform input data into a desired output. Instructions are mostly based on an IF-THEN

structure: when certain conditions are met, the program executes a specific action.

Machine learning, on the other hand, is an automated process that enables machines to solve problems

with little or no human input, and take actions based on past observations.
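
The contrast between the two approaches can be sketched as follows. The first function is a hand-written IF-THEN rule; the second derives its rule from labeled past observations. This is a deliberately simple sketch: it learns a single numeric threshold (sometimes called a decision stump), and the function names and sample data (hours a customer waited for a reply, and whether they ended up dissatisfied) are invented for illustration.

```python
def rule_based_is_urgent(message: str) -> bool:
    """Traditional programming: the programmer hard-codes the IF-THEN rule."""
    if "urgent" in message.lower():
        return True
    return False


def learn_threshold(examples):
    """Machine learning in miniature: choose the threshold with the fewest training errors.

    `examples` is a list of (value, label) pairs; the learned rule predicts
    True whenever value >= threshold.
    """
    candidates = sorted({value for value, _ in examples})
    best_threshold, best_errors = None, len(examples) + 1
    for threshold in candidates:
        errors = sum((value >= threshold) != label for value, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Invented observations: (hours waited for a reply, customer was dissatisfied).
observations = [(1, False), (2, False), (3, False), (6, True), (8, True), (12, True)]
threshold = learn_threshold(observations)
print(threshold)


def predict_dissatisfied(hours_waited):
    """Apply the learned rule to a new, unseen case."""
    return hours_waited >= threshold
```

Nobody wrote the number 6 into the program; it was derived from the data, and it would change if the observations changed.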

IX. Natural Language Processing

Natural Language Processing (NLP) is a field of Artificial Intelligence (AI) that makes human language

intelligible to machines. NLP combines the power of linguistics and computer science to study the rules

and structure of language, and create intelligent systems (run on machine learning and NLP algorithms)

capable of understanding, analyzing, and extracting meaning from text and speech.

Linguistics is used to understand the structure and meaning of human language by analyzing different aspects

like syntax, semantics, pragmatics, and morphology. Then, computer science transforms this linguistic

knowledge into rule-based or machine learning algorithms that can solve specific problems and perform

desired tasks.

X. Word cloud

A word cloud (also known as a tag cloud or text cloud) is a visual representation of a text, in which the

words appear bigger the more often they are mentioned.

Word clouds are great for visualizing unstructured text data and getting insights on trends and patterns.

For example, take a look at this word cloud we created using a free online word cloud generator to

analyze hotel reviews.
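
The sizing rule behind a word cloud, where more frequent words appear bigger, can be sketched as below. This is an assumption-laden illustration: the function name word_cloud_sizes, the font size range, and the sample review text are hypothetical, and the sketch only computes sizes without drawing anything.

```python
import re
from collections import Counter


def word_cloud_sizes(text, min_size=10, max_size=40):
    """Map each word to a font size that grows linearly with its frequency."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    sizes = {}
    for word, count in counts.items():
        if highest == lowest:
            sizes[word] = max_size  # every word is equally frequent
        else:
            scale = (count - lowest) / (highest - lowest)
            sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


reviews = "clean room clean bed friendly staff clean room"
sizes = word_cloud_sizes(reviews)
print(sizes)
```

Here "clean" appears three times and gets the largest size, "room" appears twice and lands in the middle, and the remaining words get the smallest size.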

XI. Data Analysis

Data analysis is the process of cleaning, analyzing, interpreting, and visualizing data using various

techniques and business intelligence tools. Data analysis tools help you discover relevant insights that

lead to smarter and more effective decision-making. Data analysis focuses on turning raw data into

useful statistics, information, and explanations.

From preparing for worst-case scenarios to improving services and products, all types of data analysis

can help businesses make better decisions and create data-driven strategies.

By gaining first-hand insight into what’s wrong and why, leaders can define more effective strategies to

improve processes, prevent problems, detect growth opportunities, and decide where to focus

investments.

XII. Topic Analysis

Topic analysis (also called topic detection, topic modeling, or topic extraction) is a machine learning

technique that organizes and understands large collections of data, by assigning “tags” or categories

according to each individual topic or theme.

Topic analysis uses natural language processing (NLP) to break down human language so that you can

find patterns and unlock semantic structures within texts to extract insights and help make data-driven

decisions.

XIII. Customer Feedback

Customer feedback is the information and opinions your customers leave about your product, service,

or brand.

Often, it’s in the form of survey responses, but you can also find customer feedback in social media

conversations, online reviews, chats, customer support tickets, and more.

By gathering and analyzing customer feedback, you can find out which aspects of your business are

working well and which may require improvement.

XIV. Keyword Extractor

Keyword extraction (also known as keyword detection or keyword analysis) is a text analysis technique

that automatically extracts the most used and most important words and expressions from a text. It

helps summarize the content of texts and recognize the main topics discussed.

Keyword extraction uses machine learning, a branch of artificial intelligence (AI), together with natural language processing

(NLP) to break down human language so that it can be understood and analyzed by machines. It’s used

to find keywords from all manner of text: regular documents and business reports, social media

comments, online forums and reviews, news reports, and more.
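
One simple way to approximate "most used and most important" is TF-IDF, a statistical weighting that favors words frequent in one text but rare in the others. The sketch below uses it as a stand-in; it is not claimed to be the method any particular keyword extractor uses, and the function names and the three sample reviews are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(documents, doc_index, k=3):
    """Return the k words of one document with the highest TF-IDF score."""
    tokenized = [tokenize(document) for document in documents]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each word once per document
    tokens = tokenized[doc_index]
    counts = Counter(tokens)
    scores = {
        word: (count / len(tokens)) * math.log(len(documents) / doc_freq[word])
        for word, count in counts.items()
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda word: (-scores[word], word))[:k]


documents = [
    "the pool was closed and the pool bar was closed",
    "the room was clean and the staff was friendly",
    "the breakfast was late and the coffee was cold",
]
keywords = top_keywords(documents, doc_index=0, k=3)
print(keywords)
```

Words such as "the" and "was" occur in every review, so their weight drops to zero and they never surface as keywords, without needing a separate stop-word list.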

XV. Text analysis

Text analysis (TA) is a machine learning technique used to automatically extract valuable insights from

unstructured text data. Companies use text analysis tools to quickly digest online data and documents,

and transform them into actionable insights.

You can use text analysis to extract specific information, like keywords, names, or company information

from thousands of emails, or categorize survey responses by sentiment and topic.

XVI. Text classification

Text classification is a machine learning technique that assigns a set of predefined categories to open-ended

text. Text classifiers can be used to organize, structure, and categorize almost any kind of

text, from documents, medical studies, and files to content from all over the web.

For example, news articles can be organized by topics; support tickets can be organized by urgency; chat

conversations can be organized by language; brand mentions can be organized by sentiment; and so on.

It’s estimated that around 80% of all information is unstructured, with text being one of the most

common types of unstructured data. Because of the messy nature of text, analyzing, understanding,

organizing, and sorting through text data is hard and time-consuming, so most companies fail to use it to

its full potential.

This is where text classification with machine learning comes in. Using text classifiers, companies can

automatically structure all manner of relevant text, from emails, legal documents, social media,

chatbots, surveys, and more in a fast and cost-effective way. This allows companies to save time

analyzing text data, automate business processes, and make data-driven business decisions.
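
To make the idea of assigning predefined categories concrete, the sketch below trains a small multinomial Naive Bayes classifier, one common and simple algorithm for text classification. It is only an illustration: the class name, the two categories ("billing" and "technical"), and the four training tickets are invented, and nothing here is claimed to reflect how MonkeyLearn builds its models.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            # Log prior of the category plus log likelihood of each word.
            score = math.log(self.label_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented support tickets with predefined categories.
train_texts = [
    "my invoice is wrong",
    "I was charged twice this month",
    "the app crashes on login",
    "cannot reset my password",
]
train_labels = ["billing", "billing", "technical", "technical"]

classifier = NaiveBayesTextClassifier().fit(train_texts, train_labels)
first = classifier.predict("I was charged the wrong amount")
second = classifier.predict("The app crashes when I reset it")
print(first, second)
```

With only four training examples the model is a toy; the point is that the categories are learned from labeled text rather than written as rules by hand.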
