In logistic regression, the dependent variable is categorical (possibly ordinal) and we would like to estimate the probability of an outcome. Instead of modeling the probability directly, we can model its log-odds. The log-odds is a convenient quantity because it ranges over \( (-\infty, \infty) \), whereas the probability itself is confined to \( [0, 1] \). The function that maps a probability to its log-odds is called the **logit** function.

For a probability \( p \), the odds is defined as \( \frac{p}{1-p} \). The log odds \( \log \left( \frac{p}{1-p} \right) \) approaches \( -\infty \) as \( p \rightarrow 0 \) and approaches \( \infty \) as \( p \rightarrow 1 \).

Let \(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots \)

Now we can formulate a model where $$ \log \left( \frac{p}{1-p} \right) = y $$ or, equivalently, $$ p = \frac{e^{y}}{1 + e^{y}} $$

This is the logistic regression model. We can think of it as applying a linear regression to the logit of the probability, or as applying a logistic transformation to \( y \) to obtain a bounded probability value. The logit and the logistic function are inverses of each other, and the logit function is called a "link function" in the generalized linear model (GLM) framework.
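The relationship between odds, logit, and the logistic function can be checked numerically. The following is a minimal sketch using only the standard library; the function names `logit` and `logistic` are chosen here for illustration.

```python
import math

def logit(p):
    """Log-odds of a probability p in the open interval (0, 1)."""
    return math.log(p / (1.0 - p))

def logistic(y):
    """Inverse of logit; 1 / (1 + e^-y) is the same as e^y / (1 + e^y)."""
    return 1.0 / (1.0 + math.exp(-y))

# The logit grows without bound in either direction as p nears 0 or 1,
# and applying the logistic function recovers the original probability.
for p in (0.01, 0.5, 0.99):
    y = logit(p)
    print(f"p={p:.2f}  odds={p / (1 - p):.4f}  logit={y:+.4f}  back={logistic(y):.2f}")
```

Probabilities of exactly 0 or 1 are excluded because their log-odds is not finite.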

# Marginal effects and odds ratio #

The odds ratio captures the multiplicative change in the odds upon a change in an independent variable; by contrast, the marginal effect captures the additive change in the probability.

Because the logit function is the logarithm of the odds, the odds is \( e^{y} \). This means that if we take the ratio of the odds at \( x'_i = x_i + 1 \) to the odds at \( x_i \), holding the other variables fixed, we obtain \( \frac{e^{\beta_0 + \dots + \beta_i (x_i+1) + \dots}}{e^{\beta_0 + \dots + \beta_i x_i + \dots}} = e^{\beta_i} \). In other words, exponentiating a coefficient of the model gives the odds ratio for a unit change in the corresponding variable. Moreover, this odds ratio is constant regardless of the values of the independent variables.
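To see that the odds ratio does not depend on where it is evaluated, here is a small sketch with hypothetical coefficients (`beta0`, `beta1`, `beta2` are made up for illustration and do not come from any fitted model).

```python
import math

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1, beta2 = -1.0, 0.7, -0.3

def odds(x1, x2):
    """Odds e^y for the linear predictor y = beta0 + beta1*x1 + beta2*x2."""
    return math.exp(beta0 + beta1 * x1 + beta2 * x2)

def odds_ratio_x1(x1, x2):
    """Odds ratio for a unit increase in x1, holding x2 fixed."""
    return odds(x1 + 1, x2) / odds(x1, x2)

# The ratio equals exp(beta1) at every point, up to floating-point error.
for x1, x2 in [(0.0, 0.0), (2.0, -1.0), (-3.0, 5.0)]:
    print(x1, x2, odds_ratio_x1(x1, x2), math.exp(beta1))
```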

By contrast, the marginal effect is how much the probability (the dependent variable) changes when we change an independent variable. Because it concerns the probability rather than the odds, it describes an additive change and, unlike the odds ratio, varies with the values of the independent variables. For a binary variable, the marginal effect is the change in predicted probability when the variable goes from 0 to 1, that is, \( \hat{p}(x_i = 1) - \hat{p}(x_i = 0) \).

For continuous variables, it is the instantaneous rate of change ("dy/dx").
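Both kinds of marginal effect can be sketched by hand. For the logistic model, differentiating \( p = \frac{e^{y}}{1 + e^{y}} \) with respect to \( x_i \) gives \( \beta_i \, p (1 - p) \). The coefficients and names below are hypothetical, used only to show that the effects change with the point of evaluation.

```python
import math

# Hypothetical coefficients: x1 is continuous, d is a 0/1 dummy.
beta0, beta_x1, beta_d = -0.5, 0.8, 1.2

def prob(x1, d):
    """Predicted probability from the logistic model."""
    y = beta0 + beta_x1 * x1 + beta_d * d
    return 1.0 / (1.0 + math.exp(-y))

def marginal_effect_dummy(x1):
    """Discrete change in probability when d goes from 0 to 1."""
    return prob(x1, 1) - prob(x1, 0)

def marginal_effect_x1(x1, d):
    """Instantaneous rate of change dp/dx1 = beta_x1 * p * (1 - p)."""
    p = prob(x1, d)
    return beta_x1 * p * (1.0 - p)

# Unlike the odds ratio, both effects differ from one x1 value to the next.
for x1 in (-2.0, 0.0, 2.0):
    print(x1, marginal_effect_dummy(x1), marginal_effect_x1(x1, 0))
```

Averaging such pointwise effects over the observed data is one way to arrive at a single summary number, which is the idea behind an average marginal effect.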

# Tutorials #

# Tools #

## Statsmodels #

- Logistic Regression in Python Using Rodeo
- Machine Learning for Hackers Chapter 2, Part 2: Logistic regression with statsmodels

### Common usage patterns #

Using an R-like formula. It takes care of dummy variables for categorical predictors, and you can apply transformations (e.g. log) on the fly.

```
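#!python
import numpy as np  # np.log in the formula is looked up in the calling namespace
import statsmodels.formula.api as smf

# x4*x5 expands to x4 + x5 + x4:x5 (main effects plus interaction)
result = smf.logit('DV ~ x1 + x2 + np.log(x3) + x4*x5', data=df).fit()
result.summary()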
```

Calculating the odds ratio with 95% CI.

```
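#!python
import numpy as np

conf = result.conf_int()  # 95% CI for the coefficients by default
conf['odds_ratio'] = result.params
conf.columns = ['2.5%', '97.5%', 'odds_ratio']
np.exp(conf)  # exponentiate to move from the log-odds scale to odds ratios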
```

F-test with human-readable restriction formula

```
#!python
result.f_test('x1 = x2 = 0')
result.f_test('x1 = x2')
```

Get average marginal effects (use the `at` parameter for other marginal effects).

```
#!python
margins = result.get_margeff() # marginal effects
margins.summary()
margins.summary_frame() # get a data frame
```

### Selecting a reference (pivot) dummy #

By default, the formula interface uses the first level in alphabetical order as the reference category. A simple way to choose a different reference is to rename the level we want so that it sorts first. For instance, if we have a `gender` column with `m` and `f` values but we want to have `m` as the pivot, then simply replace `m` with `a`. The formula interface can also set the reference explicitly through patsy, as in `C(gender, Treatment(reference='m'))`.

```
#!python
# Assign back rather than using inplace=True on a column, which may not update df
df['gender'] = df['gender'].replace('m', 'a')
```
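With reasonably recent versions of statsmodels and patsy, the reference level can also be chosen inside the formula without renaming any values. The sketch below assumes patsy's `Treatment(reference=...)` contrast is available; the data frame is synthetic and the column names `gender`, `x1`, and `DV` are made up for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data, generated purely for illustration.
rng = np.random.default_rng(0)
n_obs = 500
df = pd.DataFrame({
    "gender": rng.choice(["f", "m"], size=n_obs),
    "x1": rng.normal(size=n_obs),
})
linpred = -0.2 + 0.5 * df["x1"] + 0.8 * (df["gender"] == "f")
p = (1.0 / (1.0 + np.exp(-linpred))).to_numpy()
df["DV"] = rng.binomial(1, p)

# Treatment(reference='m') makes 'm' the baseline, so the dummy is for 'f'.
formula = "DV ~ C(gender, Treatment(reference='m')) + x1"
result = smf.logit(formula, data=df).fit(disp=0)
print(result.params)
```

The coefficient label ending in `[T.f]` indicates that `f` is being compared against the `m` baseline.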

We can use a similar trick for the multinomial logit.
