Do you have a desire to predict the future?
Do you want to increase the success of your business by making good decisions?
If you answered yes to one or both of those questions, then hold on to your knickers, because I’m about to introduce one of the most badass, forward-thinking, and useful topics in business: regression analysis.
I know what you’re thinking: the term “regression analysis” sounds like a foreign word. This is where I say don’t worry about vocabulary, just focus on the process.
Regression analysis deals with statistics and big data, so you will inevitably come across intimidating vocabulary. Words are a small thing, though, so don’t let them scare you.
I will do my best to keep everything simple. With that being said, let me introduce this magical process.
What Is Regression Analysis?
Regression analysis, otherwise known simply as “regression,” is a statistical process that estimates the relationship between two or more variables and allows you to make predictions about the future. A regression has two types of variables: independent and dependent.
A dependent variable is something that is determined by other variables. This is the value you are trying to predict. A great example is a company’s annual sales revenue, which is influenced by the number of employees, product color, time of year, etc.
Just think of anything that is influenced by another factor: it depends on something else to get its value.
An independent variable doesn’t rely on the other variables in the model to get its value. These are the variables you believe have an influence on what you are trying to predict. Using the same example, the number of employees, product color, and time of year are independent variables that influence sales revenue.
That wasn’t too bad, was it? Understanding and interpreting variables is the key to success in regression analysis. There are just two more things to talk about: intercepts and coefficients.
The intercept is the predicted value of the dependent variable when every independent variable has a value of zero. Basically, it is the baseline value of what you are trying to predict before any independent variable adds to it.
This means that if the number of employees was 0, the color was NOT red (compared to another color such as blue), and the month was NOT July (compared to another month such as August), then the predicted sales revenue would equal the intercept, $10,000 for example.
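A coefficient is the number that an independent variable is multiplied by. It tells you how much the dependent variable changes when that independent variable changes and everything else stays the same. A negative coefficient means the dependent variable decreases instead.
For every 1-unit increase in an independent variable, the dependent variable will increase by the value of the coefficient.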
Remember that sentence; it is the easiest way to interpret coefficients. Say it. Say it again. Now repeat it one more time.
You now understand the basic concept and terms of regression. Let’s go over an easy example to truly grasp the concept. Trust me, regression is easier than it seems.
Example: Movie Revenue
Let’s say you are planning to make a movie and you are trying to determine how much revenue the movie will make. To keep things simple, we will only use one independent variable, the movie budget.
There is no perfect variable to select, but we have a pretty good idea that spending more money on a movie improves its quality, which leads to more sales.
Ultimately we are trying to predict movie revenue (dependent variable) based on the movie budget (independent variable).
After making a claim we need actual data to see how the two variables work with each other. With this example, we can scour the internet to find movie revenues and budgets for various movies and put them inside a database.
For basic models with two variables, it’s best to use Microsoft Excel because most people have it and it can run regressions with few variables very easily.
For more advanced regression models, it’s best to use more powerful programs such as Minitab or RStudio. You can organize the data just like in the photo below.
After inputting the data into a regression model, Excel gives us something that looks like this: Y = 150 + 25x
An easy way to read this equation is: Revenue = 150 + 25 * (budget)
Because the units are in thousands of dollars, this means that if budget = $0 the predicted revenue would be $150,000. If we increase the budget by 1 unit ($1,000), revenue will increase by 25 units ($25,000), bringing it to $175,000.
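If it helps to see the arithmetic run, here is a minimal Python sketch of the same equation. The function name predict_revenue is just a label I made up for this illustration; the numbers 150 and 25 come straight from the equation above.

```python
# Fitted model from the example: Revenue = 150 + 25 * (budget)
# Both revenue and budget are measured in thousands of dollars.
INTERCEPT = 150
COEFFICIENT = 25


def predict_revenue(budget):
    """Return predicted revenue (in $ thousands) for a budget (in $ thousands)."""
    return INTERCEPT + COEFFICIENT * budget


for budget in (0, 1, 2):
    print(f"budget = {budget} -> predicted revenue = {predict_revenue(budget)}")
```

Each extra unit of budget adds exactly one coefficient's worth of revenue, which is the whole idea behind reading a coefficient.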
Let’s Use The Terms Above
Remember the terms I introduced above? Let’s put them into the equation so you understand where they fit and how to interpret this regression model.
Dependent variable = intercept + coefficient * independent variable
Dependent variable: revenue (what you are trying to predict)
Intercept: 150. If the budget is 0, the predicted revenue is 150 units ($150,000).
Coefficient: 25. For every 1-unit increase in the budget ($1,000), the revenue will increase by 25 units ($25,000).
Independent variable: budget (determined by you)
What Does This All Mean?
It means we can predict the future with confidence very easily!
Listen, I’m not a math guy, not one bit. But the beauty of regression is that we don’t need to have a deep understanding of math to comprehend what’s happening.
The computer calculates everything for us! In this example, you can say “If we budget $x for this movie, we can expect $y in revenue.”
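For the curious, here is a rough sketch of the kind of calculation the computer does for a one-variable regression, using the standard least-squares formulas. This is not necessarily how Excel implements it internally, and the data is made up: I picked points that sit exactly on the line from the movie example so you can see the fit recover the intercept and coefficient. The name fit_line is my own.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one independent variable.

    Returns (intercept, coefficient) for y = intercept + coefficient * x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Coefficient: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    coefficient = sxy / sxx
    # Intercept: makes the line pass through the point of averages.
    intercept = mean_y - coefficient * mean_x
    return intercept, coefficient


# Made-up data in $ thousands, NOT real movie figures.
budgets = [10, 20, 30, 40, 50]
revenues = [400, 650, 900, 1150, 1400]

intercept, coefficient = fit_line(budgets, revenues)
print(f"Revenue = {intercept:.0f} + {coefficient:.0f} * (budget)")
```

Real data never lines up this neatly, so a real fit gives the line that comes closest to the points rather than one that passes through all of them.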
Regression is all about selecting the right variables and interpreting the data. Do you see how powerful this can be?
Imagine adding more variables: budget, genre, month of release, GDP, etc. With more variables, it’s possible to predict the revenue for a very specific scenario. “If the budget is $x, the genre is horror, the month is July, and GDP is $40,000, the revenue will be $y.” I hope you are beginning to see how easy it is to predict the future.
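Here is a small sketch of what a model with several variables could look like in Python. Every coefficient below is invented purely for illustration, so don't read anything into the numbers. Yes/no variables such as "is the genre horror?" are entered as 1 for yes and 0 for no, the same way the color and month comparisons worked in the intercept example earlier.

```python
# All numbers here are hypothetical; none come from real data.
# Units: revenue, budget, and GDP are in thousands of dollars.
INTERCEPT = 150
COEFFICIENTS = {
    "budget": 25,      # per 1 unit of budget
    "is_horror": 300,  # 1 if the genre is horror, otherwise 0
    "is_july": 200,    # 1 if released in July, otherwise 0
    "gdp": 2,          # per 1 unit of GDP
}


def predict_movie_revenue(movie):
    """Intercept plus each coefficient times its independent variable."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * movie[name] for name in COEFFICIENTS
    )


# A horror movie released in July with a budget of 100 and GDP of 40.
movie = {"budget": 100, "is_horror": 1, "is_july": 1, "gdp": 40}
print(predict_movie_revenue(movie))
```

The structure is the same as the one-variable equation: start at the intercept, then add one "coefficient times variable" piece for each independent variable.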
Big Data Is The Future
Companies are already using data to their advantage.
Everything you do is most likely recorded in some sort of database, and companies are using this data with regressions to make smart decisions. Whether you are starting your own business or working for a company, understanding and using regression will undoubtedly help you in the future.
If you don’t use it, you will get crushed by the competition, because they will have a deep understanding of the industry from looking at the data.
Test Your Understanding:
Use the following regression: sales revenue = 200 + 50 * (advertisements)
- What type of variable is sales revenue?
- What type of variable are advertisements?
- If there are no advertisements how much will we get in sales revenue?
- If we increase advertisements by 1 unit how much will sales revenue increase?
- If we create 5 advertisements what will total sales revenue be?
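Once you have written down your own answers, you can check the last three questions with a short Python sketch like this one. The function name sales_revenue is my own; the 200 and 50 come from the regression above.

```python
def sales_revenue(advertisements):
    """Regression from the exercise: sales revenue = 200 + 50 * (advertisements)."""
    return 200 + 50 * advertisements


no_ads = sales_revenue(0)                       # revenue with no advertisements
per_ad = sales_revenue(1) - sales_revenue(0)    # change from 1 more advertisement
five_ads = sales_revenue(5)                     # revenue with 5 advertisements

print("No advertisements:", no_ads)
print("Increase per advertisement:", per_ad)
print("Five advertisements:", five_ads)
```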