Sentiment Analysis Using Transformers – Part I

This article was published as a part of the Data Science Blogathon.

Introduction

This article studies binary sentiment analysis on the IMDb movie reviews dataset. The dataset has 25000 positive and negative reviews in the training set and 25000 positive and negative reviews in the test set.
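As a quick, optional illustration (not part of the original analysis), the same dataset can be pulled down with the Hugging Face datasets package, assuming it is installed:

from datasets import load_dataset

# Download the IMDb reviews dataset (25,000 train and 25,000 test reviews)
imdb = load_dataset("imdb")

print(imdb)                               # shows the train/test splits and their sizes
print(imdb["train"][0]["text"][:200])     # peek at the first review
print(imdb["train"][0]["label"])          # 0 = negative, 1 = positive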

The image below shows the number of unique reviews and unique sentiment values in the dataset. The movie reviews are classified as having either a positive sentiment or a negative sentiment.

The image below takes a peek at four reviews and their target sentiments. As can be seen, the keywords of the first three reviews – hooked, wonderful, unassuming, wonderful – lend them a positive connotation. The keywords for the fourth review are harder to find. However, the absence of strongly positive keywords in the first sentence, combined with the keyword – zombie – indicates a negative sentiment. Based on this preliminary assessment of the dataset, we proceed to the background and the theory behind text sentiment analysis.

Natural Language Processing (NLP)

Natural language processing is the study of languages using computational-statistical and/or mathematical-logical reasoning to extract meaningful responses from written – machine printed, typewritten, or handwritten –  text.

As shown in the image below, natural language processing or computational linguistics can be broadly classified into the following five categories:

1. Machine translation

2. Information retrieval

3. Sentiment analysis

4. Information Extraction

5. Question Answering

Machine translation: This task involves translating text from one language to another. Speech processing tools can be used to convert spoken language to written text before translating it. Encoder-decoder models are prevalent in this area of study. An example of a machine translation system is Google Translate.

Information retrieval: Systems implementing web search engines typically employ this field of study. Keywords corresponding to web pages are indexed. The top few relevant documents are returned using these keywords based on keyword matching with the query string. An example of a web search engine is the Microsoft Bing search engine.

Sentiment analysis: This field of study involves analyzing the emotions expressed by the author of a piece of text. The emotions can be positive or negative, or more specific feelings such as sadness, anger, or happiness. This field of Natural Language Processing is the topic of our study in this article.

Information extraction: This field is employed for extracting relevant and essential information from typed, machine-printed, or handwritten text. This field of study is common in the Intelligent document processing (IDP) industry. Typical examples include carrying out OCR on documents like ID cards and invoices and using NLP to extract relevant information like names and amounts from these documents.

Question answering: This field is concerned with answering questions based on a given paragraph or text. An ML model is trained on textual data containing answers to questions. A typical example of this kind of model is the BERT question-answering model. BERT is a transformer-based model where an encoder encodes the textual data into an n-dimensional vector space while a decoder decodes the encoded data.

A use case of this field of study can be finding answers to questions like “What is the LIBOR rate?” from a given contract document. We start by extracting the relevant top-N (top-ranked) candidate paragraphs for the given question from the provided contract document using keyword-based queries. This can be achieved by open-source tools such as Elasticsearch or Solr. These paragraphs can then be passed to the Bert question-answering model to find answers specific to the question.
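As a rough sketch of the second step (assuming the Hugging Face transformers package is installed; the contract snippet below is an invented placeholder), the retrieved paragraph can be passed to a pretrained question-answering pipeline:

from transformers import pipeline

# Load a pretrained extractive question-answering model (BERT-style)
qa = pipeline("question-answering")

# Hypothetical candidate paragraph returned by the keyword-based retrieval step
context = ("The interest payable under this agreement is calculated using the "
           "LIBOR rate, which is fixed at 2.5 percent per annum for the first year.")

result = qa(question="What is the LIBOR rate?", context=context)
print(result["answer"], result["score"])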

Sentiment Analysis

Sentiment analysis is the study of identifying the emotions attached to the given text. These emotions give additional information about the attitude of the writer of the text towards the object of the text. These emotions can be of various types – positive, negative, neutral, angry, happy, sad, etc. The intensity of these emotions is determined by a polarity score, which is beyond the scope of this study.

As seen in the cover image (at the top), sentiment analysis is generally carried out using:

1. Knowledge-based systems

2. Statistical systems

3. Hybrid approaches

4. Classification

Knowledge-based systems: These systems typically employ grammar, syntax, and word-meaning rules as a knowledge dictionary to construct hand-crafted features for the dataset. These features can then be used to build an NLP classifier for categorizing the text into target sentiments.

Statistical systems: These systems employ statistical measures to construct features while training. Statistical features include uni-gram, bi-gram, and tri-gram probability statistics and word embedding in a multi-dimensional vector space associated with a given target sentiment. These statistical features can then be used to build a probabilistic model to predict the sentiment for the given text.
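As a minimal sketch of such statistical features (assuming scikit-learn is available; the two example sentences are invented), uni-gram and bi-gram counts can be extracted with a vectorizer:

from sklearn.feature_extraction.text import CountVectorizer

docs = ["the movie was wonderful", "the movie was not wonderful"]

# Count uni-gram and bi-gram occurrences for each document
vectorizer = CountVectorizer(ngram_range=(1, 2))
features = vectorizer.fit_transform(docs)

print(vectorizer.get_feature_names_out())  # the uni-gram/bi-gram vocabulary
print(features.toarray())                  # per-document n-gram counts

These raw counts, or probabilities derived from them, become the inputs to the probabilistic model.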

Hybrid approaches: These approaches combine hand-crafted features with statistical features, which can then be used to train a Machine learning model as an NLP classifier.

Classification: This step usually involves a machine learning classifier trained on the given text data and used to predict the test data.

Sentiment Analysis Workflow

The typical workflow for the sentiment analysis task is depicted in the figure below. The various steps are:

Text input: This step involves ingesting text for the sentiment analysis application. The text can be obtained by Optical Character Recognition techniques when applicable, e.g., when PDF and image files are uploaded.

Tokenization: Tokenization involves splitting the text into individual words or tokens.

Stop word filtering: This phase removes frequently occurring words in the English language as they do not provide distinguishing features to the text being analyzed.
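A minimal sketch of the tokenization and stop word filtering steps, assuming NLTK is installed and its tokenizer and stop-word resources have been downloaded (the sample sentence is invented):

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("punkt")      # tokenizer model
nltk.download("stopwords")  # English stop word list

text = "This movie was one of the most wonderful films I have ever seen"

tokens = word_tokenize(text.lower())                   # split the text into tokens
stop_words = set(stopwords.words("english"))
filtered = [t for t in tokens if t not in stop_words]  # drop common stop words

print(filtered)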

Negation handling: This phase involves finding negations and reversing the polarity of the words in their vicinity. For example, “I am not happy today” can be misconstrued as a positive sentiment-bearing sentence. So we detect the negation token “not” and reverse the polarity of the words (typically 3 to 6) in its neighborhood.
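One very rough way to handle negation (a toy sketch, not a production rule set) is to mark the few tokens that follow a negation word so that downstream features treat them differently:

NEGATIONS = {"not", "no", "never", "n't"}

def mark_negations(tokens, window=3):
    """Append a _NEG suffix to up to `window` tokens after a negation word."""
    marked, remaining = [], 0
    for token in tokens:
        if token in NEGATIONS:
            remaining = window
            marked.append(token)
        elif remaining > 0:
            marked.append(token + "_NEG")
            remaining -= 1
        else:
            marked.append(token)
    return marked

print(mark_negations(["i", "am", "not", "happy", "today"]))
# ['i', 'am', 'not', 'happy_NEG', 'today_NEG']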

Stemming: Stemming involves finding the root of the tokens in the given text. This is achieved by removing the last few characters of the word, e.g., “representation” and “represented” will be converted to “represent.” Stemming can also produce tokens that are not meaningful words, e.g., “analysis” may be reduced to “analysi.”
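A short sketch using NLTK's Porter stemmer (assuming NLTK is installed); note that some of the stems it produces are not dictionary words:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["representation", "represented", "analysis", "analyzing"]:
    print(word, "->", stemmer.stem(word))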

Classification: We convert the tokens obtained from the preprocessing steps to feature vectors. These feature vectors are then used to build a machine learning classifier to classify the text as either positive or negative.
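A compact sketch of this step (assuming scikit-learn; the tiny training set below is invented purely for illustration – in practice the preprocessed IMDb reviews would be used):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = ["a wonderful and gripping film", "hooked from the first scene",
           "a dull, lifeless zombie movie", "boring and painfully slow"]
labels = [1, 1, 0, 0]   # 1 = positive, 0 = negative

# TF-IDF feature vectors followed by a logistic regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(reviews, labels)

print(model.predict(["what a wonderful movie"]))   # likely [1]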

Sentiment class: The output obtained from the classification step is input in downstream tasks.

Challenges in Sentiment Analysis

As seen in the figure below, the challenges in sentiment analysis are:

Tone: The text can contain an underlying emotion of the author, e.g., anger, sadness, etc. This expression will not be explicit in general and usually involves reading the subtext. This is called the text’s tone and requires thorough training on sufficient data of a good quality to model the behavior in sentiment analysis software.

Polarity: Sometimes, the polarity (positive, neutral, or negative sentiment) of a sentence is context-dependent. E.g., “I like ice-creams” is positive for most people, while the same sentence is negative for older people. Identifying polarity and, thus, the sentiment becomes challenging in such context-dependent scenarios.

Emojis: Emojis in a text convey the emotions the author feels towards a topic and the emotions the author wants to convey to the reader. Purely text-based methods can fail to analyze these sentiments.

Idioms: Authors use idioms to convey a meaning different from the meanings of the individual words.  Such text elements need context-dependent analysis and are challenging to handle in sentiment analysis software.

Negations: As discussed earlier, negations change the polarity of the words in their vicinity. Provisions should be made for negation handling in Sentiment-analysis software.

Comparatives: Comparatives are used to compare two or more objects, and the sentiment analysis model needs to know the root words and their comparative forms and usage.

Bias: Bias may be introduced in the system while annotating data for training purposes and making assumptions in formulating the ML model for sentiment analysis.

Multilingual data: Multilingual data is common even in predominantly English text, and such data should be handled carefully in sentiment analysis software. An example is the French phrase “joie de vivre” appearing in English-language sentences.

Conclusion

This brings us to the end of the article. We started by introducing the IMDb movie reviews dataset for sentiment analysis. The dataset contains 25000 positive and negative reviews in the train set and 25000 positive and negative reviews in the test set.

We introduced Natural language processing and studied the five prominent use cases in NLP viz  Machine translation, Information retrieval, Sentiment analysis, Information Extraction, and Question Answering.

Next, we studied the theory behind sentiment analysis and the approaches used in solving the sentiment analysis problem. The commonly used approaches are knowledge-based, statistical, and hybrid. These approaches involve a classification step.

We concluded the article by studying the sentiment analysis workflow and the challenges faced in building sentiment analysis software.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Customer Segmentation Using Rfm Analysis

This article was published as a part of the Data Science Blogathon

Before starting, let’s see what is RFM and why is it important.

Introduction: What is RFM?

RFM is a method used to analyze customer value. RFM stands for Recency, Frequency, and Monetary.

Recency: How recently did the customer visit our website, or how recently did they make a purchase?

Frequency: How often do they visit, or how often do they purchase?

Monetary: How much revenue do we get from their visits, or how much do they spend when they purchase?

For example, if we see the sales data in the last 12 months, the RFM will look something like below

Why is it needed?

RFM Analysis is a marketing framework used to understand and analyze customer behaviour based on the three factors above: Recency, Frequency, and Monetary.

RFM Analysis helps businesses segment their customer base into homogeneous groups so that they can engage each group with different targeted marketing strategies.

RFM on Adventure works database:

Now, let’s start the real part. For this, I chose the Adventure Works database, which is publicly available.

Adventure Works Cycles is a multinational manufacturing company. The company manufactures and sells metal and composite bicycles to North American, European, and Asian commercial markets.

The database contains many details. But, I am just concentrating on the Sales details to get RFM values and segment the customers based on RFM values.

5NF Star Schema:

We have to identify the dimension tables and fact tables from the database based on our requirements.

I have prepared a 5NF star schema (Fact, Customer, Product, Date, Location) from the imported database.

Join the tables :

From the above tables, we can write an SQL query to Join all the tables and get the necessary data.

SELECT pc.[EnglishProductCategoryName]
      ,Coalesce(p.[ModelName], p.[EnglishProductName])
      ,CASE
          WHEN Month(GetDate()) < Month(c.[BirthDate])
               THEN DateDiff(yy, c.[BirthDate], GetDate()) - 1
          WHEN Month(GetDate()) = Month(c.[BirthDate])
               AND Day(GetDate()) < Day(c.[BirthDate])
               THEN DateDiff(yy, c.[BirthDate], GetDate()) - 1
          ELSE DateDiff(yy, c.[BirthDate], GetDate())
       END
      ,CASE WHEN c.[YearlyIncome] < 40000 THEN 'Low' ELSE 'Moderate' END
      ,d.[CalendarYear]
      ,f.[OrderDate]
      ,f.[SalesOrderNumber]
      ,f.SalesOrderLineNumber
      ,f.OrderQuantity
      ,f.ExtendedAmount
FROM [dbo].[FactInternetSales] f
    ,[dbo].[DimDate] d
    ,[dbo].[DimProduct] p
    ,[dbo].[DimProductSubcategory] psc
    ,[dbo].[DimProductCategory] pc
    ,[dbo].[DimCustomer] c
    ,[dbo].[DimGeography] g
    ,[dbo].[DimSalesTerritory] s
WHERE f.[OrderDateKey] = d.[DateKey]
  AND f.[ProductKey] = p.[ProductKey]
  AND p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
  AND psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
  AND f.[CustomerKey] = c.[CustomerKey]
  AND c.[GeographyKey] = g.[GeographyKey]
  AND g.[SalesTerritoryKey] = s.[SalesTerritoryKey]
ORDER BY c.CustomerKey

Export the result to an Excel sheet or CSV file. Bingo. Now you have the data to do the RFM analysis in Python.

That’s all about SQL. 🙂

Calculating R, F, and M values in Python:

From the sales data we have, we calculate the RFM values in Python, analyze the customer behaviour, and segment the customers based on those values.

I will be doing the analysis in the Jupyter notebook.

Read the data

import pandas as pd

aw_df = pd.read_excel('Adventure_Works_DB_2013.xlsx')
aw_df.head()

It should look something like below.

CustomerKey EnglishProductCategoryName Model Country Region Age IncomeGroup CalendarYear OrderDate OrderNumber LineNumber Quantity Amount

11000 Bikes Mountain-200 Australia Pacific 49 High 2013 18-01-2013 SO51522 1 1 2319.99

11000 Accessories Fender Set – Mountain Australia Pacific 49 High 2013 18-01-2013 SO51522 2 1 21.98

11000 Bikes Touring-1000 Australia Pacific 49 High 2013 03-05-2013 SO57418 1 1 2384.07

11000 Accessories Touring Tire Australia Pacific 49 High 2013 03-05-2013 SO57418 2 1 28.99

11000 Accessories Touring Tire Tube Australia Pacific 49 High 2013 03-05-2013 SO57418 3 1 4.99

Check for Null Values or missing values:

aw_df.isnull().sum()

Exploratory Data Analysis:

Once you are good with the data, we can start the Exploratory Data Analysis (EDA).

Now, let's check how much revenue each product category generates and how many units of each category are sold.

We will check this using bar plots.

import matplotlib.pyplot as plt

fig, axarr = plt.subplots(1, 2, figsize=(15, 6))

product_df = aw_df[['EnglishProductCategoryName', 'Amount']]
product_df1 = aw_df[['EnglishProductCategoryName', 'Quantity']]

# Total sales amount and total quantity sold per product category
product_df.groupby("EnglishProductCategoryName").sum().plot(kind="bar", ax=axarr[0])
product_df1.groupby("EnglishProductCategoryName").sum().plot(kind="bar", ax=axarr[1])

We can see that Bikes account for most of the revenue even though Accessories are sold in higher quantities. This is likely because a bike costs far more than an accessory.

Similarly, we can check which region has a higher customer base.

import seaborn as sns

fig, axarr = plt.subplots(1, 2, figsize=(15, 6))

Customer_Country = aw_df1.groupby('Country')['CustomerKey'].nunique().sort_values(ascending=False).reset_index().head(11)
sns.barplot(data=Customer_Country, x='Country', y='CustomerKey', palette='Blues', orient=True, ax=axarr[0])

Customer_Region = aw_df1.groupby('Region')['CustomerKey'].nunique().sort_values(ascending=False).reset_index().head(11)
sns.barplot(data=Customer_Region, x='Region', y='CustomerKey', palette='Blues', orient=True, ax=axarr[1])

Calculate R, F, and M values:

Recency

The reference date we have is 2013-12-31.

df_recency = aw_df1
df_recency = df_recency.groupby(by='CustomerKey', as_index=False)['OrderDate'].max()
df_recency.columns = ['CustomerKey', 'max_date']

The difference between the reference date and the maximum order date for each customer (i.e., their most recent visit) is the Recency.

reference_date = pd.Timestamp('2013-12-31')   # the reference date stated above

df_recency['Recency'] = df_recency['max_date'].apply(lambda row: (reference_date - row).days)
df_recency.drop('max_date', inplace=True, axis=1)
df_recency[['CustomerKey', 'Recency']].head()

We get the Recency values now.

CustomerKey Recency

0 11000 212

1 11001 319

2 11002 281

3 11003 205

Recency plot

plt.figure(figsize=(8, 5))
sns.distplot(df_recency.Recency, bins=8, kde=False, rug=True)

We can see that most customers ordered within the last two months, while some have not ordered for more than a year. This lets us identify and target these customer groups differently, but it is too early to decide based on the Recency value alone.

Frequency:

We can get the Frequency of the customer by summing up the number of orders.

df_frequency = aw_df1
df_frequency = df_frequency.groupby(by='CustomerKey', as_index=False)['OrderNumber'].nunique()
df_frequency.columns = ['CustomerKey', 'Frequency']
df_frequency.head()

They should look something like below

CustomerKey Frequency

11000 5

11001 6

11002 2

11003 4

11004 3

Frequency plot

plt.figure(figsize=(8, 5))
sns.distplot(df_frequency.Frequency, bins=8, kde=False, rug=True)

We can see that most customers order two times, followed by customers who order three times, while very few customers order more than five times.

Now, it’s time for our last value which is Monetary.

Monetary can be calculated as the sum of the Amount of all orders by each customer.

df_monetary = aw_df1
df_monetary = df_monetary.groupby(by='CustomerKey', as_index=False)['Amount'].sum()
df_monetary.columns = ['CustomerKey', 'Monetary']
df_monetary.head()

Customer Key Monetary

0 11000 4849

1 11001 2419.93

2 11002 2419.06

3 11003 4739.3

4 11004 4796.02

Monetary Plot

plt.figure(figsize=(8, 5))
sns.distplot(df_monetary.Monetary, kde=False, rug=True)

We can clearly see that most customer spends are less than $200. This might be because they are buying more accessories, which is expected since we buy bikes once or twice a year but buy accessories more often.

We cannot come to any conclusion by taking the Recency, Frequency, or Monetary values independently; we have to consider all three factors together.

Let’s merge the Recency, Frequency, and Monetary values and create a new dataframe

# r_f: the Recency dataframe merged with the per-customer frequency (order-line counts) computed above
r_f_m = r_f.merge(df_monetary, on='CustomerKey')

CustomerKey Recency LineNumber Monetary

0 11000 212 5 4849

1 11001 319 6 2419.93

2 11002 281 2 2419.06

3 11003 205 4 4739.3

4 11004 214 3 4796.02

Scatter Plot:

When we have more than two variables, we choose a scatter plot to analyze.

Recency Vs frequency

plt.scatter(r_f_m.groupby('CustomerKey')['Recency'].sum(),
            aw_df1.groupby('CustomerKey')['Quantity'].sum(),
            color='red', marker='*', alpha=0.3)
plt.title('Scatter Plot for Recency and Frequency')
plt.xlabel('Recency')
plt.ylabel('Frequency')

We can see that customers whose Recency is less than a month have a high Frequency, i.e., customers buy more when their Recency is low.

Frequency Vs Monetary

market_data = aw_df.groupby('CustomerKey')[['Quantity', 'Amount']].sum()
plt.scatter(market_data['Amount'], market_data['Quantity'],
            color='red', marker='*', alpha=0.3)
plt.title('Scatter Plot for Monetary and Frequency')
plt.xlabel('Monetary')
plt.ylabel('Frequency')

We can see that customers who buy frequently spend less overall. This might be because accessories, which are bought frequently, are less costly.

Recency Vs Frequency Vs Monetary

Monetary = aw_df1.groupby('CustomerKey')['Amount'].sum()
plt.scatter(r_f_m.groupby('CustomerKey')['Recency'].sum(),
            aw_df1.groupby('CustomerKey')['Quantity'].sum(),
            marker='*', alpha=0.3, c=Monetary)
plt.title('Scatter Plot for Recency and Frequency')
plt.xlabel('Recency')
plt.ylabel('Frequency')

Now, in the above plot, the color specifies Monetary. From the above plot, we can say the customers whose Recency is less have high Frequency but less Monetary.

This might vary from case to case and company to company. That is why we need to take all the 3 factors into consideration to identify customer behavior.

How do we Segment:

We can bucket the customers based on the above three factors (RFM). For example, put all customers whose Recency is less than 60 days in one bucket, and customers whose Recency is between 60 and 120 days in another bucket. We apply the same idea to Frequency and Monetary.

Depending on the company’s objectives, customers can be segmented in several ways so that marketing campaigns remain financially feasible.

The ideal customers for e-commerce companies are generally the most recent ones compared to the date of study(our reference date) who are very frequent and who spend enough.

Based on the RFM values, I have assigned each customer a score between 1 and 3 for each factor (bucketing them), where 3 is the best score and 1 is the worst.

For example, a customer who bought most recently and most often, and spent the most, has an RFM score of 3-3-3.

To achieve this, we can write simple Python code as below.

Bucketing Recency:

def R_Score(x):
    if x['Recency'] <= 60:
        recency = 3
    elif x['Recency'] > 60 and x['Recency'] <= 120:
        recency = 2
    else:
        recency = 1
    return recency

r_f_m['R'] = r_f_m.apply(R_Score, axis=1)

Bucketing Frequency

def F_Score(x):
    if x['LineNumber'] <= 3:
        frequency = 3
    elif x['LineNumber'] > 3 and x['LineNumber'] <= 6:
        frequency = 2
    else:
        frequency = 1
    return frequency

r_f_m['F'] = r_f_m.apply(F_Score, axis=1)

Bucketing Monetary

M_Score = pd.qcut(r_f_m['Monetary'],q=3,labels=range(1,4))

r_f_m = r_f_m.assign(M = M_Score.values)

Once we bucket all of them, our dataframe looks like below

CustomerKey Recency LineNumber Monetary R F M

0 11000 212 5 4849 1 2 3

1 11001 319 6 2419.93 1 2 3

2 11002 281 2 2419.06 1 3 3

3 11003 205 4 4739.3 1 2 3

4 11004 214 3 4796.02 1 3 3

R-F-M Score

Now, let’s find the R-F-M Score for each customer by combining each factor.

def RFM_Score(x):
    return str(x['R']) + str(x['F']) + str(x['M'])

r_f_m['RFM_Score'] = r_f_m.apply(RFM_Score, axis=1)

CustomerKey Recency LineNumber Monetary R F M RFM_Score

0 11000 212 5 4849 1 2 3 123

1 11001 319 6 2419.93 1 2 3 123

2 11002 281 2 2419.06 1 3 3 133

3 11003 205 4 4739.3 1 2 3 123

4 11004 214 3 4796.02 1 3 3 133

Now, we have to identify some key segments.

If a customer’s R-F-M score is 3-3-3, their Recency is good, their Frequency is high, and their Monetary value is high, so they are among the best customers (big spenders).

Similarly, if the score is 2-3-3, the Frequency and Monetary values are good but the Recency is lower: this customer hasn’t purchased for some time, but buys frequently and spends a lot.

We can define something like the below for all the different segments.

Now, we just have to do this in Python. Don’t worry, we can do it pretty easily as below.

segment = [0] * len(r_f_m)

best = list(r_f_m.loc[r_f_m['RFM_Score'] == '333'].index)
lost_cheap = list(r_f_m.loc[r_f_m['RFM_Score'] == '111'].index)
lost = list(r_f_m.loc[r_f_m['RFM_Score'] == '133'].index)
lost_almost = list(r_f_m.loc[r_f_m['RFM_Score'] == '233'].index)

for i in range(0, len(r_f_m)):
    if r_f_m['RFM_Score'][i] == '111':
        segment[i] = 'Lost Cheap Customers'
    elif r_f_m['RFM_Score'][i] == '133':
        segment[i] = 'Lost Customers'
    elif r_f_m['RFM_Score'][i] == '233':
        segment[i] = 'Almost Lost Customers'
    elif r_f_m['RFM_Score'][i] == '333':
        segment[i] = 'Best Customers'
    else:
        segment[i] = 'Others'

r_f_m['segment'] = segment

CustomerKey Recency LineNumber Monetary R F M RFM_Score segment

0 11000 212 5 4849 1 2 3 123 Spenders

1 11001 319 6 2419.93 1 2 3 123 Spenders

2 11002 281 2 2419.06 1 3 3 133 Customers

3 11003 205 4 4739.3 1 2 3 123 Spenders

4 11004 214 3 4796.02 1 3 3 133 Customers

5 11005 213 4 4746.34 1 2 3 123 Spenders

Now, let’s plot a bar plot to identify the customer base of each segment.

Recommendations:

Based on the above R-F-M score, we can give some Recommendations.

Best Customers: We can reward them for their multiple purchases. They can be early adopters of new products and good candidates for “Refer a friend” offers. They are also likely to be the most loyal customers with a regular ordering habit.

Lost Cheap Customers: Send them personalized emails/messages/notifications to encourage them to order.

Big Spenders: Notify them about discounts to keep them spending more on your products.

Loyal Customers: Create loyalty cards so that they gain points with each purchase, which can later be converted into discounts.

This is how we can target customers based on customer segmentation, which helps in marketing campaigns: it saves marketing costs, retains customers, and encourages them to spend more, thereby increasing revenue.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Related

Regression Analysis And The Best Fitting Line Using C++

Introduction

Regression Analysis is the most basic form of predictive analysis.

In Statistics, linear regression is the approach of modeling the relationship between a scalar value and one or more explanatory variables.

In Machine learning, Linear Regression is a supervised algorithm. Such an algorithm predicts a target value based on independent variables.

More About Linear Regression and Regression Analysis

In Linear Regression / Analysis the target is a real or continuous value like salary, BMI, etc. It is generally used to predict the relationship between a dependent variable and a set of independent variables. These models generally fit a linear equation; however, there are other types of regression as well, including higher-order polynomial regression.

Before fitting a linear model on the data, it is necessary to check if the data points have linear relationships between them. This is evident from their scatterplots. The goal of the algorithm/model is to find the best-fitting line.

In this article, we are going to explore Linear Regression Analysis and its implementation using C++.

The linear regression equation is of the form Y = c + mX, where Y is the target variable and X is the independent or explanatory variable. m is the slope of the regression line and c is the intercept. Since this is a 2-dimensional regression task, the model tries to find the line of best fit during training. It is not necessary that all the points lie exactly on that line: some data points may lie on the line and some are scattered around it. The vertical distance between the line and a data point is its residual, which can be negative or positive depending on whether the point lies below or above the line. Residuals are a measure of how well the line fits the data, and the algorithm works to minimize the total residual error.

The residual for each observation is the difference between the observed value of y (the dependent variable) and the predicted value of y.

$$\text{Residual} = \text{actual } y \text{ value} - \text{predicted } y \text{ value}$$

$$r_i = y_i - y_i'$$
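Minimizing the sum of squared residuals over all $n$ data points yields the familiar closed-form least-squares estimates for the slope (called $b$ in the program below) and the intercept (called $a$ below):

$$b = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^{2} - \left(\sum_{i=1}^{n} x_i\right)^{2}}, \qquad a = \frac{\sum_{i=1}^{n} y_i - b\sum_{i=1}^{n} x_i}{n}$$

These are exactly the quantities the C++ program accumulates with sum_x, sum_y, sum_xy, and sum_x2.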

The most common metric for evaluating linear regression model performance is called root mean squared error, or RMSE. The basic idea is to measure how bad/erroneous the model’s predictions are when compared to actual observed values.

So, a high RMSE is “bad” and a low RMSE is “good”

RMSE error is given as

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - y_i')^2}{n}}$$
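As a quick numerical illustration of the formula (a Python/NumPy sketch, separate from the C++ program that follows, using the example data and fitted line from its output):

import numpy as np

# Data points and fitted coefficients taken from the C++ example output below
x = np.array([2.0, 5.0, 2.0, 8.0, 2.0])
y = np.array([5.0, 7.0, 6.0, 9.0, 7.0])
a, b = 4.97917, 0.479167          # intercept and slope reported by the program

y_pred = a + b * x
rmse = np.sqrt(np.mean((y - y_pred) ** 2))
print(rmse)                        # approximately 0.66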

Implementation using C++

#include <iostream>
#define N 100   // maximum number of data points (bound assumed; not given in the original listing)

using namespace std;

int main()
{
    int n, i;
    float x[N], y[N], sum_x = 0, sum_x2 = 0, sum_y = 0, sum_xy = 0, a, b;

    /* Input */
    cout << "Please enter the number of data points..";
    cin >> n;
    cout << "Enter data:" << endl;
    for (i = 1; i <= n; i++)
    {
        cout << "x[" << i << "] = ";
        cin >> x[i];
        cout << "y[" << i << "] = ";
        cin >> y[i];
    }

    /* Calculating Required Sum */
    for (i = 1; i <= n; i++)
    {
        sum_x = sum_x + x[i];
        sum_x2 = sum_x2 + x[i] * x[i];
        sum_y = sum_y + y[i];
        sum_xy = sum_xy + x[i] * y[i];
    }

    /* Calculating a and b */
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x * sum_x);
    a = (sum_y - b * sum_x) / n;

    /* Displaying value of a and b */
    cout << "Calculated value of a is " << a << " and b is " << b << endl;
    cout << "Equation of best fit line is: y = " << a << " + " << b << "x";
    return (0);
}

Output

Please enter the number of data points..5
Enter data:
x[1] = 2
y[1] = 5
x[2] = 5
y[2] = 7
x[3] = 2
y[3] = 6
x[4] = 8
y[4] = 9
x[5] = 2
y[5] = 7
Calculated value of a is 4.97917 and b is 0.479167
Equation of best fit line is: y = 4.97917 + 0.479167x

Conclusion

Regression Analysis is a very simple yet powerful technique for predictive analysis both in Machine Learning and Statistics. The idea lies in its simplicity and underlying linear relationships between independent and target variables.

How Do I Know Which Chrome Tab Is Using The Most Memory?

Do you want to know which Chrome tab is using the most memory on your Windows computer? Chrome is quite notorious for taking up a lot of system resources. In fact, if you’ve several tabs open in Chrome and you take a look in the Task Manager, it may show you 100% RAM, CPU, or Disk usage by Chrome.

To deal with this issue, Google is constantly rolling out new features. It has recently introduced 2 such experimental features, known as the Energy Saver mode and the Memory Saver mode, to optimize the browser for using less battery and RAM on laptops, tablets, and other battery-operated devices.

However, despite these features, it sometimes becomes important to manually close tabs to free up system resources. But the question remains – how would you know which tab is using the most memory, so that you may close those tabs and reclaim system resources? Well, this post is going to guide you on how to identify such tabs.

How do I know which Chrome tab is using the most memory?

When you install a lot of extensions in Chrome or have misconfigured browser settings, the resource usage generally goes up. Identifying the tabs that are actually draining the memory can help you close them and improve system performance. Windows Task Manager is a valuable tool for seeing the percentage of system resources occupied by different Chrome processes. However, it only shows multiple entries of the chrome.exe process and doesn’t tell which process belongs to which tab in Google Chrome.

In this post, we will tell you how to identify tabs with high memory usage using the following 2 methods:

Using Chrome’s built-in Task Manager

Using Chrome’s System Diagnostic data

Let us see these in detail.

1] Use Chrome Task Manager to identify tabs with high resources usage

To open Chrome’s built-in Task Manager, press Shift+Esc or go to the three-dot menu > More tools > Task manager. The Task Manager window that opens up will show the following information by default:

Task: This column lists all the processes that are currently running on your Chrome browser, including processes associated with GPU-accelerated content, open tabs (or webpages), extensions, and other system-level processes such as Network and Audio Service.

Memory footprint: This column tells how much RAM is used by each of the processes.

CPU: This column shows what share of CPU resources (in percent) is being consumed by individual processes.

Network: This column shows how much data is used/transferred by each process. For example, for tabs with audio or video streaming, it will show the data download rate.

Process ID: This column lists the Unique ID given to each process by your system.

Also Read: How to stop multiple Chrome processes from running in Task Manager.

2] Use Chrome System Diagnostic data to identify tabs with high resources usage

Open a new tab, type chrome://system in the address bar, press Enter, and expand the mem_usage entry. This will list all the open tabs in Chrome with their webpage titles and memory usage. The data is already sorted from highest to lowest, so you can see which tab is using the most memory. The data also shows memory usage by Chrome extensions and other system-related processes.

Once you find the problematic tab(s), you can manually close them to free up system resources. Apart from this, you can use these tips to reduce Chrome’s resource usage.

Hope you find this useful!

Read Next: How to reduce Chrome high memory usage & make it use less RAM.

CA World: Storage A Major Focus, Part 1

The idea that IT is in a recession looked pretty ludicrous at Computer Associates’ (CA) annual CA World event last month, held at the Mandalay Bay Convention Center in Las Vegas. While other trade shows have scaled back over the past two years, CA World appeared to be bigger and more lavish than ever. 14,000 attendees, close to a thousand members of the press from all over the world, top-notch speakers (industry analysts, technology experts, plus a keynote by Henry Kissinger), as well as representation from most of the world’s biggest IT firms (Oracle, HP, Cisco, EDS, Sun, EMC, Intel, Microsoft and others) marked this as a significant event in the IT calendar.

The show’s thematic concept was on-demand computing, otherwise called adaptive computing, N1, and other similar initiatives launched by a variety of vendors. Under that umbrella, CA unveiled products dealing with storage provisioning, web services, business process automation, integrated enterprise management, and security vulnerabilities.

“While IBM, Sun, HP and CA may appear to have different approaches to on-demand computing, in essence, we are all talking about the same benefits,” said CA CEO Sanjay Kumar. “On demand means the convergence of networks, storage resources, servers and web services.”

Meta Group analyst Corey Ferengul clarified the concept in an informative briefing about future data center and storage trends. “To prevent the data center from consuming the entire IT budget, increased manageability and utilization through standardization and automation are essential,” said Ferengul. “The primary benefits of on-demand computing are aligning the IT infrastructure with business processes, gaining efficiency with better utilization and productivity, and providing an infrastructure that adapts to changing demands.”

Perhaps the biggest stunner of the show, however, was the sheer strength of the Linux camp within the CA fold. The biggest session of the three-day event featured Linux founder Linus Torvalds, as well as a host of other Linux luminaries. The CA World Linux Day, in fact, drew more attendees than the traditional sessions on network management, storage and security.

“Software is now following the same pattern as hardware,” said Torvalds. “Just as hardware has become a commodity item, the value proposition is moving from the OS to the integration and higher level applications. As a result, open source will eventually take over the entire field.”

On Demand is In Demand

Explaining the evolution and development of on-demand computing, CA CTO Yogesh Gupta focused on management. The first major step towards proper IT management, he said, was the emergence of uniform data networks like IP (Internet Protocol). That is now being followed by a similar standardization in the storage arena.

“Storage networking today is at the same stage as data networks were 10 years ago – it’s a mess,” said Gupta. “Over the next two years, we will see storage rapidly catch up through various standardization and automation initiatives.”

The transformation of storage architectures into a more manageable framework is also being paralleled by increasing momentum toward server virtualization. This is seen as part of a growing trend in the infrastructure management world towards the dynamic allocation of resources according to the peaks and valleys of the day’s (or month’s or year’s) business and application needs. Server virtualization plays a vital role in this, allowing IT to pool resources by separating the physical server resources from the software that runs on them.

“Virtualization creates a single system illusion,” said Ferengul. “Basically, it involves the abstraction of the guts of the infrastructure.”

He explained that to virtualize a set of diverse, concrete resources is to access them through a uniform interface that enables them to behave as one virtual resource from the user perspective. Such technologies as blade servers, virtual machines and grid computing are all facets of this trend.

Triangle Patterns – Technical Analysis

Technical analysis tools for recognizing emerging bullish or bearish market patterns

Written by

Tim Vipond

Published February 2, 2023

Updated July 7, 2023

Triangle Patterns

Triangle patterns are a commonly-used technical analysis tool. It is important for every trader to recognize patterns as they form in the market. Patterns are vital in a trader’s quest to spot trends and predict future outcomes so that they can trade more successfully and profitably. Triangle patterns are important because they help indicate the continuation of a bullish or bearish market. They can also assist a trader in spotting a market reversal.

There are three types of triangle patterns: ascending, descending, and symmetrical. The picture below depicts all three. As you read the breakdown for each pattern, you can use this picture as a point of reference, a helpful visualization tool you can use to get a mental picture of what each pattern might look like. And here is the short version of triangle patterns:

Ascending triangles are a bullish formation that anticipates an upside breakout.

Descending triangles are a bearish formation that anticipates a downside breakout.

Symmetrical triangles, where price action grows increasingly narrow, may be followed by a breakout to either side—up or down.

Ascending Triangle Patterns 

Ascending triangle patterns are bullish, meaning that they indicate that a security’s price is likely to climb higher as the pattern completes itself. This pattern is created with two trendlines. The first trendline is flat along the top of the triangle and acts as a resistance point which—after price successfully breaks above it—signals the resumption or beginning of an uptrend. The second trendline—the bottom line of the triangle that shows price support—is a line of ascension formed by a series of higher lows. It is this configuration formed by higher lows that forms the triangle and gives it a bullish characterization. The basic interpretation is that the pattern reveals that each time sellers attempt to push prices lower, they are increasingly less successful.

Eventually, price breaks through the upside resistance and continues in an uptrend. In many cases, the price is already in an overall uptrend and the ascending triangle pattern is viewed as a consolidation and continuation pattern. In the event that an ascending triangle pattern forms during an overall downtrend in the market, it is typically seen as a possible indication of an impending market reversal to the upside.

Indications and Using the Ascending Triangle Pattern 

Because the ascending triangle is a bullish pattern, it’s important to pay close attention to the supporting ascension line because it indicates that bears are gradually exiting the market. Bulls (or buyers) are then capable of pushing security prices past the resistance level indicated by the flat top line of the triangle.

As a trader, it’s wise to be cautious about making trade entries before prices break above the resistance line because the pattern may fail to fully form or be violated by a move to the downside. There is less risk involved by waiting for the confirming breakout. Buyers can then reasonably place stop-loss orders below the low of the triangle pattern.

Using Descending Triangle Patterns 

Based on its name, it should come as no surprise that a descending triangle pattern is the exact opposite of the pattern we’ve just discussed. This triangle pattern offers traders a bearish signal, indicating that the price will continue to lower as the pattern completes itself. Again, two trendlines form the pattern, but in this case, the supporting bottom line is flat, while the top resistance line slopes downward.

Just as an ascending triangle is often a continuation pattern that forms in an overall uptrend, likewise a descending triangle is a common continuation pattern that forms in a downtrend. If it appears during a long-term uptrend, it is usually taken as a signal of a possible market reversal and trend change. This pattern develops when a security’s price falls but then bounces off the supporting line and rises. However, each attempt to push prices higher is less successful than the one before, and eventually, sellers take control of the market and push prices below the supporting bottom line of the triangle. This action confirms the descending triangle pattern’s indication that prices are headed lower. Traders can sell short at the time of the downside breakout, with a stop-loss order placed a bit above the highest price reached during the formation of the triangle.

Using Symmetrical Triangle Patterns 

Traders and market analysts commonly view symmetrical triangles as consolidation patterns which may forecast either the continuation of the existing trend or a trend reversal. This triangle pattern is formed as gradually ascending support lines and descending resistance lines meet up as a security’s trading range becomes increasingly smaller. Typically, a security’s price will bounce back and forth between the two trendlines, moving toward the apex of the triangle, eventually breaking out in one direction or the other and forming a sustained trend.

If a symmetrical triangle follows a bullish trend, watch carefully for a breakout below the ascending support line, which would indicate a market reversal to a downtrend. Conversely, a symmetrical triangle following a sustained bearish trend should be monitored for an upside breakout indication of a bullish market reversal.

Regardless of whether a symmetrical triangle breakout goes in the direction of continuing the existing trend or in the direction of a trend reversal, the momentum that is generated when price breaks out of the triangle is usually sufficient to propel the market price a significant distance. Thus, the breakout from a symmetrical triangle is usually considered a strong signal of future trend direction which traders can follow with some confidence. Again, the triangle formation offers easy identification of reasonable stop-loss order levels—below the low of the triangle when buying, or above the triangle high if selling short.

The Bottom Line 

In the end, as with any technical indicator, successfully using triangle patterns really comes down to patience and due diligence. While these three triangle patterns tend toward certain signals and indications, it’s important to stay vigilant and remember that the market is not known for being predictable and can change directions quickly. This is why judicious traders eyeing what looks like a triangle pattern shaping up will wait for the breakout confirmation by price action before adopting a new position in the market.

Additional Resources
