Here are some datasets that can be used for the first homework, on extremes:

  • Daily rainfall accumulation, in South-West England [csv]
  • Business Interruption claims, in France [xls]
  • Fire insurance claims, in France [csv]
  • Hurricane losses, in the United States [csv]
  • Tornado losses, in the United States [csv]
  • Hail losses, in the United States [csv]
  • Flooding losses, in the United States [csv]
  • Sea level, in Venice, Italy [csv]
  • River level of the Seine, in Pommeuse, France [txt]
  • Daily precipitation, in some city, France, in units of 0.1 mm [txt]
  • Daily (maximum) temperature, in some city, France, in units of 0.1°C [txt]
  • Large medical claims [zip, zip, zip]

Note that to import some datasets, it might be necessary to skip some lines at the top of the file. The standard code to import a dataset is the following,

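# skip the 20 lines of text before the column names, then read 43460 rows
base=read.table(
"http://freakonometrics.blog.free.fr/public/data/RR_SOUID104968.txt",
skip=20,nrows=43460,header=TRUE,sep=",")
# dates are stored as YYYYMMDD
date=as.Date(as.character(base$DATE),"%Y%m%d")
rain=base$RR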
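If the number of lines to skip is not known in advance, one possible way to find it is to look at the first lines of the file and locate the line that holds the column names. The following is only a sketch: the small temporary file and its content are made up for illustration (they only imitate a file with a text preamble), and searching for the string "DATE" assumes that this name appears in the header line and nowhere before it.

```r
# Hypothetical miniature file imitating a data file with a text preamble
# (made up for illustration, not one of the datasets listed above).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Some description of the source",
  "More documentation about the columns",
  "SOUID,DATE,RR,Q_RR",
  "104968,19500101,12,0",
  "104968,19500102,0,0",
  "104968,19500103,35,0"
), tmp)

# Read the first lines as plain text to see where the column names are.
head_lines <- readLines(tmp, n = 10)

# Assumption: the header is the first line containing "DATE".
header_row <- grep("DATE", head_lines)[1]

# Skip everything before the header line, then import as usual.
base <- read.table(tmp, skip = header_row - 1, header = TRUE, sep = ",")
date <- as.Date(as.character(base$DATE), "%Y%m%d")
rain <- base$RR
```

On a real file, printing `head_lines` first is probably the safest way to check that the header was correctly identified.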

The last three datasets are from the Medical Large Claims Experience Study (http://www.soa.org/). These datasets are extremely large (more than 200 MB), and once the data are extracted, they can be imported into R using

base=read.table("/Users/claim97fr2.txt",header=TRUE,sep=",")
# keep the TOTCVCHG column and drop missing values
X=base$TOTCVCHG
X=X[!is.na(X)]

Note that information about the variables in these datasets can be found on the website.

If some students want to work on some other specific data, please first send me a copy of the series.

For this first homework, I expect students to write a short report (5 pages maximum) containing an estimate of the 99.9% quantile (per claim, or per day for daily time series), and of the amount associated with a 10-year return period.
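To fix ideas, here is a minimal sketch of one possible way to get those two quantities, using a peaks-over-threshold approach with a generalized Pareto distribution fitted by maximum likelihood in base R. It is not necessarily the method expected in the report. Everything here is an assumption made for illustration: the data are simulated (not one of the datasets above), the threshold is arbitrarily set at the empirical 95% quantile, observations are treated as independent, and the 10-year return level is taken as the quantile of level 1 - 1/(10 × number of observations per year).

```r
# Simulated daily series (NOT one of the datasets above): 40 years of
# generalized Pareto draws (shape 0.2, scale 10), by inverse transform.
set.seed(1)
n_per_year <- 365            # assumed number of observations per year
x <- 10 * (runif(40 * n_per_year)^(-0.2) - 1) / 0.2

# Threshold: the empirical 95% quantile (an arbitrary choice).
u <- quantile(x, 0.95, names = FALSE)
y <- x[x > u] - u            # excesses over the threshold
zeta <- mean(x > u)          # empirical probability of exceeding u

# Negative log-likelihood of the generalized Pareto distribution,
# parametrized by (log(sigma), xi) so that sigma stays positive.
gpd_nll <- function(par, y) {
  sigma <- exp(par[1]); xi <- par[2]
  if (abs(xi) < 1e-8) return(length(y) * log(sigma) + sum(y) / sigma)
  z <- 1 + xi * y / sigma
  if (any(z <= 0)) return(1e10)   # outside the support: large penalty
  length(y) * log(sigma) + (1 / xi + 1) * sum(log(z))
}
fit <- optim(c(log(mean(y)), 0.1), gpd_nll, y = y)
sigma <- exp(fit$par[1]); xi <- fit$par[2]

# Quantile of level p, valid for p > 1 - zeta.
# The formula assumes that the fitted xi is not exactly 0.
gpd_quantile <- function(p) u + sigma / xi * (((1 - p) / zeta)^(-xi) - 1)

q999 <- gpd_quantile(0.999)                       # 99.9% quantile (per day)
rl10 <- gpd_quantile(1 - 1 / (10 * n_per_year))   # 10-year return level
```

With claims data, `n_per_year` would rather be the average number of claims per year. With daily series, the independence assumption is questionable (extremes tend to cluster), and the choice of the threshold deserves a discussion in the report.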

I will post some R functions on the blog soon.