Test a model before putting it into production and verify that the model we have assumed is correctly specified with the right assumptions. In this article I present a method to test its model: the test of Hosmer-Lemeshow.

To perform the Hosmer-Lemeshow test, you’ll need a dataset.

*Download* : credits_linear_regression_score.csv

This dataset contain relevant information about the scored for people who wants a credits.

First, we need to load the dataset from the CSV file to a new Python dataframe with the Pandas library.

## Loading and arrange the dataset

```
# coding: utf-8
import pandas as pd
import numpy as np
df = pd.read_csv("/Users/tbazzucchi/Desktop/credits_linear_regression_score.csv", sep=";")
print(df)
ID Age Rev.Tete Proprietaire Prof.Indep Nb.Prob Accept.Credit Target score
0 1 30 3 1 0 7 no 0 0,00000852482
1 2 35 1,5 0 0 4 no 0 0,004416058
2 3 46 3,4 0 0 3 no 0 0,019462335
3 4 40 3,71 0 0 3 no 0 0,033739399
4 5 27 4,9 1 0 3 no 0 0,111007108
5 6 55 2,64 1 0 1 yes 1 0,134469966
print "number of observations:",len(df) #number of observations (rows)
print "number of variables:",len(df.columns) #number of variables (columns)
```

The values of the score columns is written in french decimal mark. So let’s change it by using the str.replace function.

`df['score'] = df['score'].str.replace(",",".").astype(float)`

Now, let’s arrange the dataset ranking by score.

`df = df.sort_values('score')`

The credits dataset is now arrange. We need to subdivide the dataset in G groups on the basis of the deciles (it means that we have 10 subgroups).

## What is deciles?

Deciles are the nine values of a variable, which **divides its distribution into ten parts with equal frequencies**. For this, it is very important to arrange the set of data in ascending or descending order, which means the values have to be sorted out before dividing. Deciles are all the nine values of the variable that can divide an ordered data into ten equal parts.

Let’s use the Pandas librairie to subdivide automatically our dataframe by using the qcut package. We create a new column in our dataframe to analyze the Categorical of each deciles.

```
df['score_decile'] = pd.qcut(df['score'], 10)
0 [8.52e-06, 0.287]
1 [8.52e-06, 0.287]
2 [8.52e-06, 0.287]
3 [8.52e-06, 0.287]
4 [8.52e-06, 0.287]
5 [8.52e-06, 0.287]
6 [8.52e-06, 0.287]
7 [8.52e-06, 0.287]
8 [8.52e-06, 0.287]
9 [8.52e-06, 0.287]
10 (0.287, 0.625]
11 (0.287, 0.625]
Name: score_decile, dtype: category
Categories (10, object): [[8.52e-06, 0.287] < (0.287, 0.625] < (0.625, 0.734] < (0.734, 0.787] < ... <
(0.849, 0.877] < (0.877, 0.892] < (0.892, 0.91] < (0.91, 0.985]]
print df['score_decile']
```

- First decile = 0.287
- Second decile = 0.625

In each group, we count the number of positives and negatives. For the first group, we have 2 positives so 10 (subgroups) – 2 (positives) = 8 negatives.

```
obsevents_pos = df['Target'].groupby(df.score_decile).sum()
print(obsevents_pos)
score_decile
[8.52e-06, 0.287] 2
(0.287, 0.625] 4
(0.625, 0.734] 6
(0.734, 0.787] 7
(0.787, 0.815] 7
(0.815, 0.849] 10
(0.849, 0.877] 10
(0.877, 0.892] 8
(0.892, 0.91] 10
(0.91, 0.985] 9
Name: Target, dtype: int64
```

`obsevents_neg = df['score'].groupby(df.score_decile).count() - obsevents_pos`

Then we calculate the hopes expected by summing the scores in the group. For example, for the first group, we have : 0.000009 + 0.004416 + 0.019462 + … + 0.282818 = 1.198394

```
expevents_pos = df['score'].groupby(df.score_decile).sum()
print(expevents)
score_decile
[8.52e-06, 0.287] 1.198394
(0.287, 0.625] 5.094462
(0.625, 0.734] 6.788609
(0.734, 0.787] 7.588580
(0.787, 0.815] 8.042183
(0.815, 0.849] 8.337283
(0.849, 0.877] 8.672026
(0.877, 0.892] 8.856413
(0.892, 0.91] 9.035653
(0.91, 0.985] 9.386394
Name: score, dtype: float64
```

```
expevents_neg = df['score'].groupby(df.score_decile).count() - expevents_pos
print(expevents_neg)
score_decile
[8.52e-06, 0.287] 8.801606
(0.287, 0.625] 4.905538
(0.625, 0.734] 3.211391
(0.734, 0.787] 2.411420
(0.787, 0.815] 1.957817
(0.815, 0.849] 1.662717
(0.849, 0.877] 1.327974
(0.877, 0.892] 1.143587
(0.892, 0.91] 0.964347
(0.91, 0.985] 0.613606
Name: score, dtype: float64
```

## Hosmer-Lemeshow test

Lastly, we can calculate the Hosmer-Lemeshow test statistic by using the previous results with following this formula:

```
hl = (((obsevents_pos - expevents_pos)**2/expevents_pos) + ((obsevents_neg - expevents_neg)**2/expevents_neg)).sum()
print(hl)
```**7.8292234401678**

Create a Hosmer-Lemeshow test in Python is quite complicated because yet, there is no Python library which can handle a complicated formula such as Hosmer-Lemeshow formula. A better way, is using R because R libraries have functions to perform it quickly.

The next step should be compared if the p-values are changing significantly or not to conclude on the evidence of poor fit.