No, this time it is not a post for Valentine's day... It is simply that
a few days ago, on
Gaïa Universitas, Rachel mentioned a study on academic journals in
astronomy. The study, written by Krzysztof Zbigniew Stanek, is available online.
It asks: « Naively, one would
expect longer papers to have larger impact (i.e., to be cited more) –
how long a paper should be to maximize its impact? Is it better to write
several shorter papers or one longer paper? ». Actually, I did not expect that, and I was truly surprised to see that the number of citations was an
increasing function of the number of pages. So I had a look at several
academic journals in economics and mathematics. The first journal I looked at is the
Journal of Finance. Here, the relationship between the number of pages and the number of
citations is strong: a 40-page paper is, on average, cited twice as often
as a 20-page paper. Actually, the shape of the regression curve is rather close to the one obtained in Krzysztof's paper.

The colors reflect the year of publication: some articles were published in
2010, and some in 1995. Obviously, the number of citations is a
function of the year of publication (a longer post will be published
soon on the dynamics of the citation process). I considered only
articles published before 2005 to fit a regression model (here a spline
regression, to see whether the function is linear). And indeed, a longer
paper is more likely to be cited.
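To make the idea of a spline regression concrete, here is a minimal sketch in Python. It is not the code used for the graphs above. The data are simulated, and the knots (20 and 40 pages), the sample size and the function names are all assumptions chosen for illustration. The fit is a piecewise-linear spline, restricted to articles published before 2005. If the hinge coefficients are close to zero, a straight line is enough.

```python
import numpy as np


def linear_spline_basis(x, knots):
    """Design matrix: intercept, x, and one hinge term max(x - k, 0) per knot."""
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(columns)


def fit_linear_spline(pages, citations, knots):
    """Least-squares fit of a piecewise-linear spline of citations on pages."""
    X = linear_spline_basis(pages, knots)
    y = np.asarray(citations, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict(coef, pages, knots):
    """Fitted number of citations for the given page counts."""
    return linear_spline_basis(pages, knots) @ coef


# Simulated data (hypothetical, NOT the Journal of Finance data).
rng = np.random.default_rng(0)
years = rng.integers(1995, 2011, size=500)   # publication years 1995..2010
pages = rng.integers(5, 61, size=500)        # paper lengths 5..60 pages
citations = 2.0 * pages + rng.normal(0.0, 10.0, size=500)

# Keep only articles published before 2005, as in the text.
old = years < 2005
knots = [20, 40]                             # assumed knot locations
coef = fit_linear_spline(pages[old], citations[old], knots)

# Fitted citations for a 20-page and a 40-page paper.
fitted = predict(coef, [20, 40], knots)
print(coef)
print(fitted)
```

The last two entries of `coef` are the changes of slope at each knot. Large values would suggest that the relationship is not linear.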
Then, I looked at the
Journal of Multivariate Analysis,
which is a mathematical journal, where (theoretical) econometricians can
publish.

Here the pattern is rather different: if we do not take into
account short notes, the number of citations is independent of the size
of the paper.
The more I look at those graphs, the more disturbing I find them...
- on the one hand, I strongly believe that the size of the paper has
nothing to do with the quality (or the importance) of the paper: so in
some sense, I was expecting a flat regression, like the one we see above, with the Journal of Multivariate Analysis. It might come from the fact that in statistics, for instance, proofs on empirical processes are extremely long and technical (and so are the papers), while proofs on stochastic orderings are extremely short (and the papers rather short). But both can appear in the same journal...
- on the other hand, researchers are more and more evaluated, and a
common tool is to look at citation indexes (the so-called « publish or perish » paradigm). The more citations, the
better the researcher, something like that... So sometimes, when we have a
quite long paper, the question that naturally arises is: why not
split the paper in two? With a flat regression, it is, indeed,
optimal to split the paper (if possible) in two, since two papers with
20 pages each might yield two times more citations than one. But with a curve like the one we see with the Journal of Finance, such a split brings no gain, since a 40-page paper already gets about as many citations as two 20-page papers...
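The arithmetic behind the split can be written down in a few lines. This is only a sketch with made-up citation curves: a flat one (10 citations whatever the length) and a proportional one (0.5 citation per page, so that a 40-page paper gets twice as many citations as a 20-page one). Both curves and the function names are hypothetical, and the sketch assumes that each half is cited like any other paper of its length.

```python
def split_gain(citation_curve, pages):
    """Extra citations expected from two half-length papers versus one paper."""
    whole = citation_curve(pages)
    halves = 2 * citation_curve(pages / 2)
    return halves - whole


def flat_curve(pages):
    """Hypothetical flat regression: citations do not depend on length."""
    return 10.0


def proportional_curve(pages):
    """Hypothetical proportional curve: twice the pages, twice the citations."""
    return 0.5 * pages


print(split_gain(flat_curve, 40))          # 10.0: splitting doubles the citations
print(split_gain(proportional_curve, 40))  # 0.0: splitting changes nothing
```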
Anyway, I really liked the conclusion of Krzysztof's paper: «
This
paper will not be submitted to any journal, but please feel free to
cite it as often as possible, or better yet cite my regular astronomical
papers ».