5 simple skills you need to become more data-driven instantly

With marketing moving towards being more of a data trade than a creative profession, we all need to become more data-driven if we want to stay on top. Since not everyone loves math and analysis as much as I do, I’ve listed five simple skills I think you should learn to get started.

(And no, you don’t need to be a math geek to become more data-driven. If you know some basic math, you’ll have more than enough of a foundation to get started.)


1. Learn how to convert files from .csv to a format readable by humans, like .xlsx

“There’s something wrong with the file.” This is by far the most common comment I get from clients and colleagues who start to work more with data.

However, there’s nothing wrong with CSV-files. The format is used to store lots and lots of data without ending up with huge files. A CSV-file is a text file where you have one observation or object per row, and list all the related values in a specific order. You separate the data points by a known symbol, most often a semicolon or a comma.
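To make the format concrete, here is a minimal Python sketch that reads a small semicolon-separated text with the standard `csv` module. The column names and values are made up for illustration, and the text is kept in a string rather than a file so the example runs on its own.

```python
import csv
import io

# Made-up example data: one observation per row, values separated by semicolons.
raw_text = "post;likes;shares\nMonday update;120;14\nFriday offer;85;9\n"

# csv.reader splits each row on the delimiter you tell it to use.
reader = csv.reader(io.StringIO(raw_text), delimiter=";")
rows = list(reader)

# The first row holds the column names, the rest are the observations.
header, observations = rows[0], rows[1:]
print(header)
print(observations)
```

Note that every value comes back as text, which is also why a CSV-file looks "wrong" in a spreadsheet until you tell the software where one value ends and the next begins.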

When you want to work with data stored as a CSV-file in spreadsheet software like Excel, you’ll have to convert it from a text format into a spreadsheet format. Here are two simple ways to do that:

Option 1: When you start with a clean Excel document

  1. Open a new Excel document and navigate to the Data tab
  2. Click “From Text” close to the top left corner
  3. Choose the CSV file you wish to open from your computer and click “Import”
  4. In the window that opens up, choose “Delimited” and click “Next”
  5. Check one of the boxes next to the different delimiter suggestions – most CSV-files use either a semicolon or a comma to separate the values
  6. Click “Finish”
  7. Tada!

Option 2: When you’ve already opened your file

  1. Open your CSV file in Excel and select column A in your document
  2. Navigate to the Data tab
  3. Click on “Text to Columns” somewhat in the middle
  4. In the window that opens up, choose “Delimited” and click “Next”
  5. Check one of the boxes next to the different delimiter suggestions – most CSV-files use either a semicolon or a comma to separate the values
  6. Click “Next” and then “Finish”
  7. Tada!
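If you are unsure which delimiter box to check in step 5, a script can make an educated guess for you. The sketch below uses `csv.Sniffer` from Python's standard library on a made-up sample. The sniffer is a heuristic, so treat its answer as a suggestion rather than a guarantee.

```python
import csv

# Made-up sample: the first few lines of a CSV-file, as plain text.
sample = "post;likes;shares\nMonday update;120;14\nFriday offer;85;9\n"

# Ask the sniffer to choose between the two most common delimiters.
dialect = csv.Sniffer().sniff(sample, delimiters=";,")
detected_delimiter = dialect.delimiter
print(detected_delimiter)
```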

2. Learn the difference between mean, median and mode

When you have a dataset with multiple numerical values, you sometimes need a single number to represent the dataset. A simple method is to summarise all your data points into a “typical” data point that represents the “centre” of the dataset.

However, when you calculate the “centre” of a numerical dataset, you have three different measures to choose from: mean, median, and mode. They each summarise your dataset with a single number, but they are not the same.

Mean 

The mean is the “average” number in a dataset. You calculate it by adding all your data points together and dividing this sum by the number of data points.

Example: The mean of 37, 10, and 67 is (37+10+67)/3 = 114/3 = 38

Median

The median is the middle number in a dataset. To get the median you order all data points from smallest to largest and pick out the number in the middle. If you have an even number of data points and there are two middle numbers, you take the mean (see above) of those two numbers.

Example: The median of 37, 10, and 67 is 37 because when you organise the numbers from smallest to largest (10, 37, 67), the number 37 is in the middle.

Mode

In any dataset, the mode is the most frequent value – the value that occurs most often among all values.

Example: The mode of [2, 4, 3, 3, 3, 1, 1, 2, 2, 2, 4, 1] is “2” because it occurs four times, and all the other numbers occur fewer times than this.
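If you prefer to let a computer do the arithmetic, the three examples above can be reproduced with Python's built-in `statistics` module. This is only a small sketch. The list with four values is an extra, made-up example to show what happens to the median when there are two middle numbers.

```python
import statistics

values = [37, 10, 67]

mean_value = statistics.mean(values)      # (37 + 10 + 67) / 3
median_value = statistics.median(values)  # middle of 10, 37, 67

# With an even number of data points, the median is the mean
# of the two middle values: (37 + 67) / 2.
even_median = statistics.median([10, 37, 67, 80])

# The mode is the value that occurs most often.
mode_value = statistics.mode([2, 4, 3, 3, 3, 1, 1, 2, 2, 2, 4, 1])

print(mean_value, median_value, even_median, mode_value)
```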

Why should you care? – Mean, median and mode in different datasets

In a normal distribution, the mean, mode and median are equal. However, if a dataset is right- or left-skewed, they differ from each other. Comparing these three metrics is therefore an excellent way to learn about the distribution of the data points in your dataset, and that is often an essential part of analysing a dataset.

Three different data distributions and how the mean, mode and median are impacted by the distribution
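To see the effect of skew in numbers, here is a minimal sketch with a made-up, right-skewed dataset: most posts get a handful of likes, and one post gets far more than the rest.

```python
import statistics

# Made-up, right-skewed data: most values are small, one is very large.
likes_per_post = [1, 2, 2, 2, 3, 3, 4, 5, 40]

mean_likes = statistics.mean(likes_per_post)
median_likes = statistics.median(likes_per_post)
mode_likes = statistics.mode(likes_per_post)

# The single large value pulls the mean upwards,
# while the median and the mode stay close to the typical post.
print(mean_likes, median_likes, mode_likes)
```

In this example the mean ends up above the median, which in turn is above the mode. That ordering is a common sign of a right-skewed dataset.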

3. Learn which axis is the x-axis and which is the y-axis in a two-dimensional (Cartesian) coordinate system

The X-axis and Y-axis in a Cartesian coordinate system

The X-axis and the Y-axis when looking at only the first quadrant of a Cartesian coordinate system
The X-axis is also called the horizontal axis, and the Y-axis is called the vertical axis

Even if you don’t work with coordinate systems, any data-driven marketer should know which axis is which when someone says “on the x-axis, you can see the time of day” and “on the y-axis, you can see the number of likes”.

Most of the time in marketing analytics the correct terminology to use is “the horizontal axis” and “the vertical axis” since the graphs displayed are not coordinate systems. However, even if this makes more sense to you, some people will use x and y instead, and you’ll have to know what they mean.


4. Learn the difference between correlation and causation

Being data-driven will always include working with data analysis. Analytical skill is something you pick up over time, but there are two key concepts that you need to keep apart from the start: correlation and causation. These two concepts are important because if you don’t get them right, you won’t get the rest of your analysis right. Additionally, it’s very easy to tell the data-driven people from the non-data-driven people based on these two concepts.

Correlation = the strength and direction of a relationship between two or more variables.
Causation = one event causes the other, i.e. there is a causal relationship between the two events.

One classic causation vs correlation example is that smoking correlates with alcoholism, but it doesn’t cause alcoholism. Smoking does, however, cause an increased risk of developing lung cancer.
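Correlation is something you can calculate. One common measure is the Pearson correlation coefficient, a number between -1 and 1 whose sign gives the direction and whose size gives the strength of a linear relationship. The sketch below computes it by hand on made-up numbers. The function name `pearson_r` and the variable names are my own choices for illustration.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equally long lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables move together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

# Made-up data for illustration only.
variable_a = [1, 2, 3, 4, 5]
rises_with_a = [2, 4, 6, 8, 10]
falls_with_a = [10, 8, 6, 4, 2]

r_positive = pearson_r(variable_a, rises_with_a)  # close to 1
r_negative = pearson_r(variable_a, falls_with_a)  # close to -1
print(r_positive, r_negative)
```

A coefficient close to 1 or -1 only tells you that the two variables move together. It says nothing about whether one of them causes the other.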


5. Know how to distinguish the dependent variable from the independent variable

This skill is a little bit more advanced than the others, but if you want to become more data-driven and understand what an analyst or data scientist is saying about your A/B-tests, it’s a good thing to remember which is which.

When you test a hypothesis with an experiment, the two main variables are the independent and the dependent variable. An independent variable is a variable that you change or control in your experiment to test whether the change affects the dependent variable.

Independent variable = the variable that you manipulate (e.g. the time you post on social media)
Dependent variable = the variable that you hope to impact with the manipulation (e.g. the number of impressions for your social media post)
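As a small illustration, here is how the two variables could show up in the data from a simple test of posting times. All numbers are made up. The posting time is the independent variable because you choose it, and the impressions are the dependent variable because you measure them afterwards.

```python
import statistics

# Made-up experiment results.
# Keys: the independent variable (the posting time you chose).
# Values: the dependent variable (impressions measured for each post).
impressions_by_post_time = {
    "08:00": [950, 1010, 990],
    "18:00": [1400, 1350, 1450],
}

# Summarise the dependent variable for each value of the independent variable.
mean_impressions = {
    post_time: statistics.mean(impressions)
    for post_time, impressions in impressions_by_post_time.items()
}
print(mean_impressions)
```

A difference between the two means is only a starting point. An analyst would typically also check whether a difference of that size could have appeared by chance.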


