Dataset and reproducible data analysis
We studied the particulate.csv dataset, which contains 110 observations of 15 variables:
- idStudy, a counter ranging from 1 to 110
- idIstat, an integer code adopted from the Italian National Institute of Statistics (ISTAT)
- Province, the name of the Italian city/province
- itCode, a two-character province identifier (primary key)
- Exceedancees, the integer number (absolute frequency) of PM10 concentrations above the daily limit value
- StationsNum, the integer number of air quality monitoring stations within each province
- Cases, the integer number of infections at the seventeenth day
- Population, the total residents of the province
- Density, the population divided by the surface area, times 1000 and rounded
- Long, the longitude of the city center
- Lat, the latitude of the city center
- Where, a factor with two levels (north, south) according to the lines joining the towns of Ormea, Fraconalto and San Marino
- Commuters, the integer number of people commuting to work according to the Italian National Institute of Statistics
- CommutersDensity, the Commuters number divided by the Population, times 100 and rounded
- ExcedRatio25, the relative frequency of PM2.5 concentrations above the daily limit value
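The relation between the derived column CommutersDensity and its source columns, as described in the list above, can be illustrated with a minimal R sketch. The three rows are made-up toy values and are not taken from particulate.csv; the data frame name toy is hypothetical, and rounding to the nearest integer is an assumption, since the number of retained digits is not stated.

```r
# Toy rows with made-up values (NOT from particulate.csv), using the
# column names given in the variable list.
toy <- data.frame(
  itCode     = c("AA", "BB", "CC"),
  Population = c(100000, 250000, 40000),
  Commuters  = c(42000, 99000, 15500),
  stringsAsFactors = FALSE
)

# CommutersDensity: Commuters divided by Population, times 100, rounded.
# Rounding to zero decimal places is an assumption of this sketch.
toy$CommutersDensity <- round(toy$Commuters / toy$Population * 100)

# itCode is described as a primary key, so no value should repeat.
stopifnot(anyDuplicated(toy$itCode) == 0)

print(toy)
```

The same two checks (recomputing CommutersDensity and testing itCode for duplicates) could be run on the real data frame once the file has been read, as a quick sanity check before modelling.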
The data analysis can be reproduced in the R language with its party package,
using the following code.