Freakonometrics

Tag - Montréal

Thursday, May 24 2012

Nightlife in Montreal

additional information on Twitter, #casseroles, #ggi, #manifsencours or #loi78 among many others.

Tuesday, May 3 2011

Playing with robots

My son would be extremely proud if I told him I can spend hours building robots. Well, my robots are not as fancy as Dr Tenma's, but they usually do what I ask them to do. For instance, it is extremely simple to build a robot with R to extract data from websites. I have mentioned it here (on tennis matches), but it failed there (on the NY Marathon). To illustrate the use of robots, assume that one wants to build one's own dataset to study airline ticket prices. First, we have to choose a departure city (e.g. Paris) and an arrival city (e.g. Montreal). Then, one wants to look at all possible dates from April 1st (I ran it last month) until the end of December (so we create vectors with all departure dates, namely one for the day, one for the month, and one for the year; the vectors below actually extend to the end of February 2012). Then, we choose a return date (say 3 days after).
DEP="Paris"
ARR="Montreal"
# departure dates: April 2011 to February 2012 (day, month, year), repeated 3 times
DATE1D=rep(c(1:30,1:31,1:30,1:31,1:31,1:30,1:31,1:30,
1:31,1:31,1:29),3)
DATE1M=rep(c(rep(4,30),rep(5,31),rep(6,30),rep(7,31),
rep(8,31),rep(9,30),rep(10,31),rep(11,30),rep(12,31),
rep(1,31),rep(2,29)),3)
# 2011 covers April to December only; January and February belong to 2012
DATE1Y=rep(c(rep(2011,30+31+30+31+31+30+31+
30+31),rep(2012,31+29)),3)
# return dates: k days after departure, repeated 3 times to match DATE1D
k=3
DATE3D=rep(c((1+k):30,1:31,1:30,1:31,1:31,1:30,1:31,
1:30,1:31,1:31,1:29,1:k),3)
DATE3M=rep(c(rep(4,30-k),rep(5,31),rep(6,30),rep(7,31),rep(8,31),
rep(9,30),rep(10,31),rep(11,30),rep(12,31),rep(1,31),rep(2,29),
rep(3,k)),3)
DATE3Y=rep(c(rep(2011,30+31+30+31+31+30+31+30+31-k),
rep(2012,31+29+k)),3)
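Typing month lengths by hand is error-prone, so here is a minimal sketch of the same idea using R's Date arithmetic. The helper name make_dates is hypothetical (it is not part of the original robot), and the start and end dates are simply those implied by the vectors above.

```r
# Sketch: build the departure and return date components from Date objects.
# make_dates is a hypothetical helper; start, end and k mirror the values above.
make_dates <- function(start, end, k) {
  dep <- seq(as.Date(start), as.Date(end), by = "day")  # one departure per day
  ret <- dep + k                                        # return k days later
  part <- function(d, fmt) as.integer(format(d, fmt))   # extract day, month or year
  data.frame(
    DATE1D = part(dep, "%d"), DATE1M = part(dep, "%m"), DATE1Y = part(dep, "%Y"),
    DATE3D = part(ret, "%d"), DATE3M = part(ret, "%m"), DATE3Y = part(ret, "%Y")
  )
}

dates <- make_dates("2011-04-01", "2012-02-29", k = 3)
head(dates, 3)
```

With this approach all six columns have the same length by construction, and month ends (including the leap day) are handled by the Date class rather than by counting days.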

It is also possible (for a nice robot) to skip all the dates that are already in the past,

skip=max(as.numeric(Sys.Date()-as.Date("2011-04-01")),1)

Then, we need a website where requests can be written nicely (with cities and dates appearing explicitly). Here, I cannot mention the website that I used, since it is stated on the website that it is strictly forbidden to run automatic requests... Anyway, consider a loop that creates a URL (actually, I picked the dates at random, since I had been told that those websites have memory: if you ask too many times for the same thing during a short period of time, prices go up),

URL=paste("http://www.♦♦♦♦/dest.dll?qscr=fx&flag=q&city1=",
DEP,"&citd1=",ARR,"&",
"date1=",DATE1D[s],"/",DATE1M[s],"/",DATE1Y[s],
"&date2=",DATE3D[s],"/",DATE3M[s],"/",DATE3Y[s],
"&cADULT=1",sep="")

then, we just have to scan the webpage, looking for ticket prices (just looking for some specific names)

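page=as.character(scan(URL,what="character"))
# positions of the price markers in the scanned page
I=which(page%in%c("Price0","Price1","Price2"))
if(length(I)>0){
# drop the first character of the token following each marker
PRIX=substr(page[I+1],2,nchar(page[I+1]))
# a price above 1000 is split in two tokens: paste the thousands digit with the rest
if(PRIX[1]%in%c("1","2")){PRIX=paste(PRIX,page[I+2],sep="")}
}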
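The extraction step can be tried offline on a hand-made token vector, without sending any request. The sketch below is only an illustration: the token layout (a marker, then a price with a leading currency symbol, possibly split in two when it exceeds 1000) is my assumption about what the page looks like, and extract_prices is a hypothetical helper. Unlike the snippet above, it tests each price separately instead of looking only at the first one.

```r
# Toy token vector standing in for scan(URL, what = "character");
# the markers and the leading currency symbol are assumptions about the page.
page <- c("Price0", "$823", "x", "Price1", "$1", "045", "Price2", "$2", "310")

extract_prices <- function(page, markers = c("Price0", "Price1", "Price2")) {
  I <- which(page %in% markers)
  if (length(I) == 0) return(numeric(0))
  first <- substr(page[I + 1], 2, nchar(page[I + 1]))  # drop the leading symbol
  nxt <- page[I + 2]                                   # token after the price
  # a lone "1" or "2" is read as the thousands digit of a price split by a space
  split <- first %in% c("1", "2") & !is.na(nxt)
  as.numeric(ifelse(split, paste0(first, nxt), first))
}

prices <- extract_prices(page)
prices
```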

Here, we have to be a bit cautious if prices exceed 1000. Then, it is possible to start a statistical study. For instance, if we compare two destinations (from Paris), e.g. Montréal and New York, we obtain the following patterns (with high prices during holidays),

It is also possible to run the code twice (here it was run last month, and a couple of days ago), for the same destination (from Paris to Montréal),


Of course, it would be great if I could run that code, say, every week, to build up a nice dataset and to study the dynamics of prices...

The problem is that it is forbidden to do this. In fact, the website mentions that if we want to extract data (for an academic purpose), it is possible to ask for an extraction. But if we do say that we study specific prices, the data might be biased. So the good idea would be to use several servers, to make several requests, randomly, and to collect the results (changing dates and destinations). But here, my computing skills - unfortunately - reach a limit....

Friday, March 18 2011

911, day after day

After two posts (here and there) on the intraday cycles of 911 calls, one may wonder how crimes evolve over the week.

For all calls made to 911, we get the following distribution

i.e. a peak on Friday and Saturday evenings, and a trough on Sunday. If we look at calls for burglaries, we get the following distribution

with peaks in the morning, on Friday afternoons, and on weekends. We can also follow disturbances of the peace,

which do occur around midnight, but mostly on weekends. This contrasts quite a bit with hold-ups,

Evidently, there are some fairly clear patterns. The next step will be to look at the seasons or, better, at the impact of the weather...

to be continued...

Tuesday, January 25 2011

You find it cold in Montréal? Trust me, it is even worse than what you can imagine...

As people say in Montréal, "aujourd'hui, il fait frette". And I was surprised recently when some people told me that we would reach -35°C on Sunday evening... I checked around, and I found -25°C on all weather forecast websites, but -35°C nowhere. I asked some friends, and they told me that those people were not really looking at the air temperature (as we observe it on the thermometer), but at the wind chill, also called "felt air temperature on exposed skin due to the wind" (température ressentie).
And indeed, such a quantity does exist, and can be found on the climate.weatheroffice.gc.ca website. There is also a physical background for that quantity. Hence, the windchill is http://freakonometrics.blog.free.fr/public/maths/windchill2.png defined as
http://freakonometrics.blog.free.fr/public/maths/windchill1.png
where http://freakonometrics.blog.free.fr/public/maths/windchill3.png is the air temperature (in °C), and http://freakonometrics.blog.free.fr/public/maths/windchill4.png the wind speed (in km/h). Please don't ask me how to interpret this power 0.16 (I already find it difficult to explain a square root in an econometric equation). If we look at the past few days, we observe the following,
where the points on top are temperatures, while below we have felt temperatures. So, basically, winters are even colder than what you might think...
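To play with the numbers, here is a small sketch of the wind chill index in R. I am assuming that the formula shown as an image above is the standard Environment Canada one (the power 0.16 and the units, °C and km/h, suggest so); the coefficients below come from that standard formula, not from the original image, and the function name windchill is mine.

```r
# Wind chill index (°C) from air temperature (°C) and wind speed (km/h).
# Coefficients: standard Environment Canada formula, assumed to match the post.
windchill <- function(temp, wind) {
  13.12 + 0.6215 * temp - 11.37 * wind^0.16 + 0.3965 * temp * wind^0.16
}

# felt temperature at -25°C for three wind speeds
wc <- windchill(-25, c(10, 20, 30))
round(wc, 1)
```

Under that assumption, an air temperature of -25°C with a wind of 10 to 20 km/h is already enough to bring the felt temperature down to about -35°C.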
And the story is not over yet. The same thing holds for summer: if you take humidity into account, summers are even hotter than what you think... There is the humidex, http://freakonometrics.blog.free.fr/public/maths/humidex2.png, defined here as
http://freakonometrics.blog.free.fr/public/maths/humidex.png
where http://freakonometrics.blog.free.fr/public/maths/humidex3.png denotes a dewpoint (see here for more details).
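The same kind of sketch works for the humidex. Again, this assumes that the image above shows the usual Canadian humidex formula based on the dewpoint; the constants are those of that usual formula, not read from the image, and the function name humidex is mine.

```r
# Humidex from air temperature (°C) and dewpoint (°C).
# Constants: usual Canadian humidex formula, assumed to match the post.
humidex <- function(temp, dewpoint) {
  # vapour pressure (hPa) derived from the dewpoint
  e <- 6.11 * exp(5417.7530 * (1 / 273.16 - 1 / (273.15 + dewpoint)))
  temp + 5 / 9 * (e - 10)
}

# felt temperature at 30°C for three dewpoints
hx <- humidex(30, c(15, 20, 25))
round(hx, 1)
```

At a fixed air temperature, the index increases with the dewpoint, which is the sense in which humid summers feel hotter than the thermometer says.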
That index appeared in the 70's, with a work of Masterson and Richardson entitled "A method of quantifying human discomfort due to excessive heat and humidity" (published in 1979). At that time, in Canada, on average, 22 people died per year because of excessive heat and humidity. For those interested in the origin of that index, you can have a look here.
Recently, @Annmaria (here) told me that one might expect variance to increase, i.e. maxima should be increasing faster than minima. I just wonder if this intuition can be related to the fact that more and more people (including some media) now talk about felt temperatures rather than measured temperatures. And if we compare past temperatures to the felt temperatures we have today, it looks like the difference between extremes is increasing....

Tuesday, September 28 2010

Oh look, here comes the rain. Ah! What rotten weather. Where is summer? Summer, where is it?

In Les cafards (Cockroaches), by Jo Nesbø, Harry Hole (who comes from Oslo, for those who have not devoured the series) is told at some point that in Bangkok, everyday conversations are rarely about the weather, for the good reason that the climate in Bangkok is fairly predictable (on the other hand, people talk about road traffic every day). I had mentioned here the preconception we had regarding the fact that the weather in Rennes is not all that changeable. And I have realized, since we moved to Montréal, that the weather can really change from one day to the next (and apparently, I have not seen anything yet).
First of all, for those who might doubt it, the temperature in Rennes and in Montréal is not really the same thing,
for Montréal, whereas for Rennes we get (I specify it, but I think one would have guessed without)
In other words, the dispersion is almost twice as large in Montréal. If we also look at the variations (min/max) within the day, we get the following curves for Montréal,

with, in red, the average value (over 15 years) of the maximum observed during the day, and in blue, the average value of the minimum. In Rennes, we find again that the dispersion is relatively small,

In fact, things are even worse when we look at quantiles, i.e. the worst minimum observed in 10% of cases (the coldest) and the worst maximum observed in 10% of cases (the hottest);

in Montréal, whereas in Rennes the values are much more tightly packed,

But these worst cases do not occur on the same day, so we cannot say much from them. To go a bit further, we can look at the dispersion within the day (i.e. the difference between the minimum and the maximum). It is fairly stable in Montréal (always about 10 degrees)

whereas in Rennes, the difference is 5 degrees in winter, against 10 degrees in summer,

But above all, if we look at the variation from one day to the next, in 90% of cases the daily mean temperature changes by +/- 8 degrees,

with quite a few somewhat extreme cases, with variations of 10 or 15 degrees from one day to the next in the mean temperature, whereas in Rennes we rather have +/- 5 degrees (with very few extreme events, since I truncated the y-axis to see the centre better)
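For those who want to reproduce this kind of summary on their own series, here is a minimal sketch of the three quantities discussed above: the within-day range, the 10% extreme quantiles, and the day-to-day variation. The data below are simulated, purely for illustration (they are not the Montréal or Rennes observations), and the daily mean is taken as the midpoint of the minimum and the maximum, which is an assumption.

```r
# Simulated daily min/max temperatures, for illustration only (not real data)
set.seed(1)
n <- 365
tmin <- rnorm(n, mean = 2, sd = 8)
tmax <- tmin + runif(n, 5, 15)

daily_range <- tmax - tmin                 # dispersion within the day
tmean <- (tmin + tmax) / 2                 # daily mean (assumed midpoint)
change <- diff(tmean)                      # variation from one day to the next

q_cold <- quantile(tmin, 0.10)             # the coldest 10% of minima lie below
q_hot  <- quantile(tmax, 0.90)             # the hottest 10% of maxima lie above
band   <- quantile(change, c(0.05, 0.95))  # central 90% of day-to-day changes
band
```

On a real series, one would compute these by calendar day over several years rather than over a single pooled year, as in the graphs above.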

In short, I am not about to stop talking about the weather any time soon!