Correlation is a statistical measure of the strength and direction of the relationship between two variables; with more than two variables, it is computed for each pair. It shows whether, and how strongly, variables move together. The result is a correlation coefficient, usually denoted r, which ranges from -1 to 1.
- Positive correlation: When one variable increases, the other tends to increase as well. For example, higher temperatures may be correlated with higher ice cream sales.
- Negative correlation: When one variable increases, the other tends to decrease. For instance, as the price of a product increases, demand for it typically decreases.
- Zero correlation: No apparent linear relationship exists between the variables.
Key Concepts:
- Pearson Correlation: Measures the linear relationship between two continuous variables.
- Spearman’s Rank Correlation: Measures the strength and direction of the monotonic relationship between two variables using their ranks rather than their raw values (non-parametric).
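The two coefficients above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`pearson_r`, `average_ranks`, `spearman_rho`) and the temperature and sales figures are made up for the example. Spearman's coefficient is computed here as the Pearson coefficient of the ranks, with tied values sharing their average rank.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    ssx = sum(a * a for a in dx)
    ssy = sum(b * b for b in dy)
    if ssx == 0 or ssy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(ssx * ssy)

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up illustrative data: sales rise with temperature, but not linearly.
temperature = [18, 21, 24, 27, 30, 33]
sales = [20, 24, 31, 45, 70, 120]

print("Pearson r:   ", round(pearson_r(temperature, sales), 3))
print("Spearman rho:", round(spearman_rho(temperature, sales), 3))
```

Because the made-up sales figures always rise with temperature but not along a straight line, the rank-based coefficient reaches 1 while the Pearson coefficient stays below it.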
| Correlation Coefficient (r) | Strength of Relationship |
|---|---|
| 0.9 to 1.0 (or -0.9 to -1.0) | Very strong |
| 0.7 to 0.9 (or -0.7 to -0.9) | Strong |
| 0.5 to 0.7 (or -0.5 to -0.7) | Moderate |
| 0.3 to 0.5 (or -0.3 to -0.5) | Weak |
| 0.0 to 0.3 (or 0.0 to -0.3) | Very weak/no correlation |
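The table can be turned into a small lookup. The sketch below is one possible reading of it: the function name `correlation_strength` is hypothetical, and because the table's ranges share their endpoints (0.9 appears in two rows, for example), the code assumes a boundary value belongs to the stronger category.

```python
def correlation_strength(r):
    """Map a correlation coefficient to the strength label from the table.

    Assumption: a value exactly on a boundary (0.3, 0.5, 0.7, 0.9) is
    assigned to the stronger of the two adjacent categories.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)  # strength ignores the sign; the sign gives direction
    if magnitude >= 0.9:
        return "Very strong"
    if magnitude >= 0.7:
        return "Strong"
    if magnitude >= 0.5:
        return "Moderate"
    if magnitude >= 0.3:
        return "Weak"
    return "Very weak/no correlation"

for r in (0.95, -0.75, 0.5, -0.31, 0.1):
    direction = "positive" if r > 0 else "negative"
    print(f"r = {r:+.2f}: {correlation_strength(r)} ({direction})")
```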
Correlation does not imply causation. Two variables might be correlated because they are both influenced by a third factor.
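A small constructed example may make the third-factor point concrete. In this sketch, all names and numbers are invented for illustration: `z` plays the role of the shared third factor, and `x` and `y` are each built from `z` plus their own unrelated disturbance. Neither is computed from the other, yet they end up strongly correlated.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ssx = sum((a - mean_x) ** 2 for a in x)
    ssy = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)

# Hypothetical third factor that drives both variables.
z = list(range(20))

# Deterministic stand-ins for noise, constructed to be uncorrelated with
# each other; each cycles through the values -2..2.
noise_x = [(2 * i) % 5 - 2 for i in range(20)]
noise_y = [(3 * i) % 5 - 2 for i in range(20)]

# x and y each depend on z only; y is never computed from x or vice versa.
x = [zi + nx for zi, nx in zip(z, noise_x)]
y = [zi + ny for zi, ny in zip(z, noise_y)]

r_xy = pearson_r(x, y)
r_noise = pearson_r(noise_x, noise_y)

print("correlation of x and y:            ", round(r_xy, 3))
print("correlation of their own components:", round(r_noise, 3))
```

The correlation between `x` and `y` comes entirely from `z`; the parts of `x` and `y` that do not come from `z` are uncorrelated.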
Forecasting refers to predicting future values or trends based on historical data. The goal is to anticipate future events or behaviors using statistical models. Forecasting is widely used in fields like economics, business, and weather prediction.
Types of Forecasting Methods:
- Qualitative Forecasting: Based on expert judgment rather than data. It’s used when there’s little historical data available (e.g., new products).
  - Delphi method: Gathering and reconciling expert opinions.
  - Market research: Surveys and focus groups to gather opinions.
- Quantitative Forecasting: Uses historical data to predict future trends.
  - Time Series Models: Focus on patterns within the data itself (e.g., trends, seasonality).
    - Moving Averages: Smooth out fluctuations by averaging data over a set number of periods.
    - Exponential Smoothing: Weights recent observations more heavily than older ones.
    - ARIMA (Auto-Regressive Integrated Moving Average): Combines autoregression (the relationship between an observation and some lagged observations), differencing (to make the data stationary), and moving averages.
  - Causal Models: Assume that the variable to be forecast has a cause-effect relationship with one or more independent variables.
    - Linear Regression: Forecasts by fitting a linear relationship between a dependent variable and an independent variable.
    - Multiple Regression: Uses several independent variables to make predictions.
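Three of the quantitative methods listed above are simple enough to sketch in plain Python. The function names and sample numbers below are invented for illustration, and the sketch makes simplifying assumptions: the moving average and exponential smoothing versions produce only a one-step-ahead forecast, the smoothing is the simple (single) form initialised at the first observation, and the regression uses a single independent variable. ARIMA is not shown, since in practice it is fitted with a statistics library.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window

def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing; returns the one-step-ahead forecast.

    alpha close to 1 reacts quickly to recent values; alpha close to 0
    smooths heavily. The level starts at the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def fit_line(x, y):
    """Least-squares line y = slope * x + intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up monthly sales and temperatures, for illustration only.
sales = [102, 108, 105, 112, 118, 121]
temperature = [14, 16, 15, 18, 20, 21]

print("moving average (3):", moving_average_forecast(sales, 3))
print("exp. smoothing:    ", exponential_smoothing_forecast(sales, 0.5))

# Causal model: predict sales from an assumed next-month temperature.
slope, intercept = fit_line(temperature, sales)
print("regression at 23 degrees:", slope * 23 + intercept)
```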
Steps in Forecasting:
- Define the problem: Understand what needs to be forecasted.
- Gather data: Collect historical data or expert opinions.
- Choose a model: Select the most appropriate forecasting method (qualitative or quantitative).
- Model the data: Use statistical techniques or expert input to forecast future values.
- Evaluate the model: Measure its accuracy (e.g., using Mean Absolute Error or Root Mean Square Error).
- Make the forecast: Generate the prediction and provide the results.
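The evaluation step can be sketched as follows. This is a minimal illustration under stated assumptions: the data are made up, the last two observations are held out as a test set, and two deliberately simple candidate models are compared (a naive forecast that repeats the last value, and a drift forecast that extends the average historical change). The helper names are hypothetical.

```python
from math import sqrt

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; penalises large misses."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Gather data (made-up values), then hold out the last two periods.
history = [112, 118, 121, 127, 130, 136, 141, 145]
train, test = history[:6], history[6:]

# Candidate 1: naive forecast, repeating the last observed value.
naive = [train[-1]] * len(test)

# Candidate 2: drift forecast, extending the average change per period.
step = (train[-1] - train[0]) / (len(train) - 1)
drift = [train[-1] + step * (h + 1) for h in range(len(test))]

# Evaluate both candidates on the held-out periods.
for name, forecast in (("naive", naive), ("drift", drift)):
    mae = mean_absolute_error(test, forecast)
    rmse = root_mean_squared_error(test, forecast)
    print(f"{name}: MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

Holding out the most recent periods, rather than a random sample, keeps the evaluation honest for time-ordered data: the model is only scored on values that come after everything it was fitted on.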
Correlation vs Forecasting
- Correlation helps to understand relationships between variables, while forecasting focuses on predicting future values or outcomes based on patterns in data.
- You might use correlation in the early stages of forecasting to identify which variables are related to the future outcomes you’re trying to predict. For instance, if sales are correlated with temperature, you might incorporate temperature into your forecasting model.
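As a sketch of that early screening stage, the code below ranks candidate predictors by the absolute value of their correlation with the target. Everything here is illustrative: the variable names and numbers are invented, and a high correlation is only a hint that a variable may be worth trying in a model, not evidence that it causes the outcome.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ssx = sum((a - mean_x) ** 2 for a in x)
    ssy = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)

def rank_predictors(target, candidates):
    """Return (name, r) pairs sorted by |r|, strongest first."""
    scored = [(name, pearson_r(values, target)) for name, values in candidates.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Made-up data: sales to be forecast and three candidate predictors.
sales = [10, 12, 15, 19, 24, 30]
candidates = {
    "temperature": [15, 17, 20, 24, 28, 33],
    "price": [5, 5, 4, 4, 3, 3],
    "unrelated_index": [2, 7, 1, 8, 2, 8],
}

ranked = rank_predictors(sales, candidates)
for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```

Sorting by the absolute value keeps strongly negative predictors, such as price in this example, near the top of the list.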
Model Functions
Violence
$y = -2\times10^{-7}x^{6} + 1\times10^{-5}x^{5} - 0.0003x^{4} + 0.0032x^{3} - 0.0205x^{2} + 0.0637x$
$R^2 = 0.1641$
Salary
$y = -2\times10^{-6}x^{6} + 0.0001x^{5} - 0.0039x^{4} + 0.0482x^{3} - 0.3006x^{2} + 0.8507x$
$R^2 = 0.3126$
Heatwaves
$y = -7\times10^{-6}x^{6} + 0.0005x^{5} - 0.0104x^{4} + 0.1126x^{3} - 0.6196x^{2} + 1.8307x$
$R^2 = 0.7038$
Wildfires
$y = -3\times10^{-5}x^{6} + 0.0019x^{5} - 0.0414x^{4} + 0.4573x^{3} - 2.6815x^{2} + 8.1998x$
$R^2 = 0.7939$
Floods
$y = -3\times10^{-5}x^{6} + 0.0017x^{5} - 0.0372x^{4} + 0.388x^{3} - 2.0323x^{2} + 5.9953x$
$R^2 = 0.704$
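The fitted functions above can be evaluated with a few lines of Python. This is only a sketch under assumptions: the source does not say what x represents or over which range the curves were fitted, so the sample value x = 1 is arbitrary, and the names `MODELS` and `evaluate_polynomial` are invented here. The coefficients are copied as printed, which means they are rounded; for a sixth-degree polynomial that rounding can produce large errors at larger x, so the results should be read as rough illustrations rather than as the original fit.

```python
# Coefficients as printed, highest power first (x^6 ... x^1), followed by
# a constant term of 0 because no intercept appears in the equations.
MODELS = {
    "Violence":  ([-2e-7, 1e-5, -0.0003, 0.0032, -0.0205, 0.0637, 0.0], 0.1641),
    "Salary":    ([-2e-6, 0.0001, -0.0039, 0.0482, -0.3006, 0.8507, 0.0], 0.3126),
    "Heatwaves": ([-7e-6, 0.0005, -0.0104, 0.1126, -0.6196, 1.8307, 0.0], 0.7038),
    "Wildfires": ([-3e-5, 0.0019, -0.0414, 0.4573, -2.6815, 8.1998, 0.0], 0.7939),
    "Floods":    ([-3e-5, 0.0017, -0.0372, 0.388, -2.0323, 5.9953, 0.0], 0.704),
}

def evaluate_polynomial(coefficients, x):
    """Evaluate a polynomial with Horner's method.

    `coefficients` are ordered from the highest power down to the constant.
    """
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result

x = 1  # arbitrary sample input; the meaning of x is not given in the source
for name, (coefficients, r_squared) in MODELS.items():
    y = evaluate_polynomial(coefficients, x)
    print(f"{name}: y({x}) = {y:.4f}  (R^2 = {r_squared})")
```

The R² values indicate how much of the variation in each series the curve accounts for, so the Violence and Salary fits (0.1641 and 0.3126) describe their data far less well than the Heatwaves, Wildfires, and Floods fits (about 0.70 to 0.79).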
