
E-commerce Daily Revenue Prediction with Python and Machine Learning

In this article, we will examine the relationship between the morning sales performance of an e-commerce site and the end-of-day sales performance, and work on a model that will predict the end-of-day sales performance.

Regression Models

What is regression?

Regression analysis expresses the relationship between one dependent variable and one or more independent variables as a mathematical equation. It quantifies how the variables are related and turns that relationship into an equation that can be used for prediction. Regression models fall into two groups: linear models and nonlinear models.

Linear Regression Models

What is linearity?

Linearity describes the direction and proportionality of the relationship between two numerical variables. For example, height tends to increase as age increases, so there is a linear relationship between the age and height variables. Conversely, one variable may decrease as the other increases. As long as the change between the variables is proportional, the relationship is still linear, whether the variables move in the same direction or in opposite directions. Regression models also include nonlinear models, but this article uses a linear model.
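As a small illustration of the point above, the sketch below computes the Pearson correlation coefficient for three made-up relationships. The arrays and the helper name `pearson` are assumptions for demonstration only and are not part of the article's dataset.

```python
import numpy as np

# Made-up inputs for illustration only.
x = np.linspace(-5, 5, 101)

increasing = 2 * x + 1    # y rises proportionally as x rises
decreasing = -3 * x + 5   # y falls proportionally as x rises
curved = x ** 2           # y depends on x, but not along a straight line


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


r_increasing = pearson(x, increasing)
r_decreasing = pearson(x, decreasing)
r_curved = pearson(x, curved)

print(r_increasing, r_decreasing, r_curved)
```

Both straight-line relationships are perfectly linear and differ only in the sign of the coefficient, while the curved one has a correlation close to zero even though the variables are clearly related.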

Simple Linear Regression

A simple linear regression model is used to examine and explain the relationship between one dependent variable and one independent variable.

Formula:
$\tilde{y} = \alpha + \beta x + \epsilon$

$\tilde{y}$ = dependent variable
$\alpha$ = intercept (constant)
$\beta$ = slope (multiplier coefficient)
$x$ = independent variable
$\epsilon$ = error term

The $\beta$ coefficient is the key term in the equation: it gives the change in $\tilde{y}$ associated with a one-unit change in $x$.
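To make the roles of $\alpha$ and $\beta$ concrete, here is a minimal sketch of the closed-form least-squares estimates for a single predictor, $\beta = \mathrm{cov}(x, y) / \mathrm{var}(x)$ and $\alpha = \bar{y} - \beta \bar{x}$. The data are synthetic and the variable names are assumptions. The article itself fits the model with statsmodels rather than by hand.

```python
import numpy as np

# Synthetic data for illustration only (not the article's dataset).
rng = np.random.default_rng(0)
x = rng.uniform(10_000, 100_000, size=200)            # hypothetical predictor
y = 5_000 + 8.0 * x + rng.normal(0, 20_000, size=200)  # hypothetical response

# Closed-form least-squares estimates for one predictor.
beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()

# Cross-check against NumPy's degree-1 polynomial fit.
slope_check, intercept_check = np.polyfit(x, y, deg=1)

# A one-unit increase in x changes the fitted value by exactly beta.
x0 = 50_000
one_unit_effect = (alpha + beta * (x0 + 1)) - (alpha + beta * x0)

print(alpha, beta, one_unit_effect)
```

The last line shows the interpretation stated above: moving $x$ by one unit shifts the fitted value by $\beta$, regardless of the starting point.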

Import Libraries

import numpy as np
import pandas as pd

import seaborn as sns
import statsmodels.formula.api as smf

np.random.seed(777)
sns.set(rc={'figure.figsize':(11.7,8.27)})
%config InlineBackend.figure_format = 'retina'

import warnings
warnings.filterwarnings('ignore')

Load Data

In this model, we test the hypothesis that sales between midnight and 9 am affect sales between 9 am and midnight. To do this, we create the appropriate segments in Google Analytics and download two separate columns: the revenue up to 9 am and the revenue up to midnight.

df = pd.read_excel('Morning.xlsx')
df = df.drop(columns=['Day Index'])

Summary of Dataframe

df.head()

Distributions and Descriptive Statistics

df.describe().T

Kernel Density Estimation: Revenue of Morning

# histplot replaces distplot, which is deprecated in recent seaborn releases
sns.histplot(df['morning_revenue'], bins = 20, kde = True, stat = 'density',
             color = 'darkblue', edgecolor = 'black',
             line_kws = {'linewidth': 4})

Kernel Density Estimation: Revenue of Day

# histplot replaces distplot, which is deprecated in recent seaborn releases
sns.histplot(df['day_revenue'], bins = 20, kde = True, stat = 'density',
             color = 'darkblue', edgecolor = 'black',
             line_kws = {'linewidth': 4})

Correlation Exploration on Scatter Chart

sns.scatterplot(data = df, x = 'morning_revenue', y = 'day_revenue')

Linear Regression Model and Hypothesis Testing

linear_model = smf.ols('day_revenue ~ morning_revenue', data = df).fit()
linear_model.summary()

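==============================================================================
Dep. Variable:            day_revenue   R-squared:                       0.762
Model:                            OLS   Adj. R-squared:                  0.759
Method:                 Least Squares   F-statistic:                     278.1
Date:                Tue, 06 Apr 2021   Prob (F-statistic):           7.84e-29
Time:                        08:57:08   Log-Likelihood:                -1177.8
No. Observations:                  89   AIC:                             2360.
Df Residuals:                      87   BIC:                             2365.
Df Model:                           1
Covariance Type:            nonrobust
===================================================================================
                      coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------------------------------------------------------
Intercept        1.284e+04    3.5e+04      0.367      0.715   -5.67e+04    8.24e+04
morning_revenue     8.4715      0.508     16.676      0.000       7.462       9.481
==============================================================================
Omnibus:                       11.027   Durbin-Watson:                   1.177
Prob(Omnibus):                  0.004   Jarque-Bera (JB):               13.963
Skew:                           0.600   Prob(JB):                     0.000929
Kurtosis:                       4.526   Cond. No.                     1.66e+05
==============================================================================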
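Several figures in the summary above are linked by simple arithmetic, which can help when reading it. The sketch below recomputes a few of them from the reported values. The critical value `t_crit` is a hard-coded, rounded assumption (the two-sided 95% t value for 87 degrees of freedom), so the results are approximate.

```python
import math

# Values copied from the summary table above.
n_obs = 89
n_predictors = 1
r_squared = 0.762
slope, slope_se = 8.4715, 0.508

# t-statistic of the slope: coefficient divided by its standard error.
t_slope = slope / slope_se

# With a single predictor, the F-statistic equals the squared t-statistic.
f_stat = t_slope ** 2

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Approximate 95% confidence interval for the slope.
t_crit = 1.988  # assumption: rounded t critical value for 87 degrees of freedom
ci_low = slope - t_crit * slope_se
ci_high = slope + t_crit * slope_se

# In simple linear regression, R-squared is the squared Pearson correlation;
# the sign is positive here because the slope is positive.
pearson_r = math.sqrt(r_squared)

print(t_slope, f_stat, adj_r_squared, ci_low, ci_high, pearson_r)
```

The recomputed values agree with the t, F-statistic, Adj. R-squared, and [0.025, 0.975] entries of the table up to rounding.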

Parameters of Model

linear_model.params['Intercept'] # Intercept
>> 12836.998104752289
linear_model.params['morning_revenue'] # Slope
>> 8.471494381108705

Model

$\text{Theoretical Equation} = \text{Intercept} + \text{Slope} \times (\text{Morning Revenue})$

$\text{Model} = 12836.998105 + 8.471494 \times (\text{Morning Revenue})$

Equation and Prediction

test_morning_revenue = 50000
predict_daily_revenue = 12836.998105 + (8.471494 * test_morning_revenue)
'Prediction: ' + str(round(predict_daily_revenue)) + ' TL'
>> 'Prediction: 436412 TL'
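For repeated use, the same equation can be wrapped in a small function. This is only a sketch: the function name `predict_day_revenue` is hypothetical, and the default coefficients are the fitted values reported above, so they would need to be refreshed whenever the model is refitted.

```python
def predict_day_revenue(morning_revenue, intercept=12836.998105, slope=8.471494):
    """Return the fitted end-of-day revenue for a given morning revenue.

    The defaults are the coefficients reported in this article; pass new
    values after refitting the model on fresh data.
    """
    return intercept + slope * morning_revenue


# Same test value as in the article.
prediction = predict_day_revenue(50000)
print('Prediction: ' + str(round(prediction)) + ' TL')
```

The result is a point estimate only. The summary reports a wide confidence interval for the intercept, so predictions for very small morning revenues are likely to be less reliable.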

Conclusion Notes

  • This method is just one example. You can construct many hypotheses in the same way and build forecast models for end-of-day, month-end, or year-end transactions from the values observed in earlier calendar periods.
  • By running the forecast model in the morning, digital marketers can anticipate that day's revenue and plan their marketing and budget effort accordingly.
