Programme on publishing
Vesa Muhonen

Joined: 06 Oct 2004
Posts: 5
Affiliation: Helsinki Institute of Physics

Posted: June 17 2005


the easiest way to achieve long-term storage is to produce paper copies - properly printed (text from an office laser would fade over a few years). However, this is cheaper than continually renewing a digital archive (transfer to new disks every ~5 years or so)

It might be the easiest way, but I'd like to have a completely digital archive. I doubt that limited hardcopy storage would even be the cheapest way. The amount of data in arXiv is tiny by modern standards. (arXiv reports having ~324 000 articles, which can be estimated to take ~350 GB of storage space, an amount that fits easily on a desktop computer.)
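As a rough check of the figures above, here is a minimal Python sketch. It takes the ~324 000 articles and ~350 GB quoted in the post and works out the average size per article that those two numbers imply. Decimal units (1 GB = 1000 MB) are an assumption, and the function name is only illustrative.

```python
def implied_mean_size_mb(total_gb: float, n_articles: int) -> float:
    """Average size per article in MB, assuming 1 GB = 1000 MB."""
    return total_gb * 1000.0 / n_articles


ARTICLES = 324_000  # article count quoted in the post
TOTAL_GB = 350.0    # storage estimate quoted in the post

mean_mb = implied_mean_size_mb(TOTAL_GB, ARTICLES)
print(f"implied mean size: {mean_mb:.2f} MB per article")  # about 1.08 MB
```

So the estimate corresponds to roughly 1 MB per article on average.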

This problem is familiar to archivists of all kinds, such as national archives, so people are thinking about it. I don't see it as an urgent problem, but it is something we can't completely overlook if we really want to build a long-lasting digital archive. As Antony suggested, plain text (ASCII or UTF?) + LaTeX is probably the safest bet. PS/PDF are common, but they're not open standards and that might lead to problems in the future.
Boud Roukema

Joined: 24 Feb 2005
Posts: 82
Affiliation: Torun Centre for Astronomy, University of Nicolaus Copernicus

Posted: September 07 2005

I just want to link these three threads together; hopefully the software won't complain about self-links :P

Sarah Bridle

Joined: 24 Sep 2004
Posts: 149
Affiliation: University College London (UCL)

Posted: July 04 2006

Reading Andrew Jaffe's blog, I became aware of Nature's very interesting ongoing debate about publishing.

In particular I quite liked the refereeing format described in
in which each paper spends a certain amount of time in a "reviewing phase", in which anybody can publicly post constructive comments on the paper.
Then after this period has expired, the paper gets sent to a referee who gives the paper "pass" or "fail".
The idea is that the referee will naturally already have sent their comments on the paper during the "reviewing phase", and debatable issues can be resolved.

I think this would be great, if in practice people had the time to participate in the "reviewing phase".

This partially addresses some of the concerns and suggestions by Anonymous in
where public referee reports were suggested.
Boud Roukema

Joined: 24 Feb 2005
Posts: 82
Affiliation: Torun Centre for Astronomy, University of Nicolaus Copernicus

Posted: July 11 2006

Antony Lewis wrote:
Quality-stamping vs traditional journal

This is precisely why arXiv only accepts pure PS/PDF as a last resort. Almost all submissions are in LaTeX (and this could be made a requirement for physics Open Journals). This basically represents the source code for the PS/PDF, and could be converted into different new forms over the coming decades as required. I don't see why LaTeX cannot form a safe information storage medium, given it is simple, human-readable and computer processable. Referee reports could be plain text or LaTeX.

The size of the arXiv database must be trivial compared to any large scientific data repository.

I don't know how stringent arXiv is with submissions, but AFAIR (from the last time I read the submission info) it requests that the full gzipped tarball be no more than about 300 kB, though 500 kB is tolerated. I think I remember someone who recently tried to submit a bigger package and got rejected by the robot.

Claude Bertout and Peter Schneider wrote in

the various astronomy journals send about 8000 refereeing requests for new papers every year.

So if we suppose that all astronomers (including non-cosmologists) started using arXiv regularly, conservatively assume that 100% of articles are accepted, and assume that global warming and the tail of the Hubbert Peak stop the total number of articles per year from increasing dramatically (due to minor practical annoyances such as "coastal" cities being flooded/hurricaned and massive evacuations around the world), then 30 years' worth of archives would be roughly:

30 * 8000 * 0.5 MB = 120 GB

So a 30 euro 160 GB hard disk bought in 2006 should be enough to store 30 years of all astronomy articles. Just about any astronomer in a middling-rich to rich country could store his or her own personal backup of the entire archive.
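The estimate above, written out as a short Python sketch. The inputs (8000 articles per year, 0.5 MB per article, 30 years, a 160 GB disk) are the figures from this post. Decimal units (1 GB = 1000 MB) and the function name are assumptions made for illustration.

```python
def archive_size_gb(years: int, articles_per_year: int, mb_per_article: float) -> float:
    """Total archive size in GB, assuming 1 GB = 1000 MB."""
    return years * articles_per_year * mb_per_article / 1000.0


DISK_GB = 160  # disk capacity mentioned in the post

size_gb = archive_size_gb(years=30, articles_per_year=8000, mb_per_article=0.5)
print(f"{size_gb:.0f} GB needed, fits on a {DISK_GB} GB disk: {size_gb <= DISK_GB}")
```

Changing `mb_per_article` or `articles_per_year` shows how much headroom the estimate leaves before a single disk is no longer enough.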

Probably most people on cosmocoffee are reasonably confident about arXiv's mirroring and backup systems, but if we needed to persuade nervous librarians or directors-not-quite-living-in-the-GNU/linux-internet-era, I'm sure that a dozen people on cosmocoffee could each start doing backups, if the sort of system being discussed here were to be proposed more seriously. I've personally met a physicist who was involved in setting up arXiv at a very early stage, and that plus my experience with posting there is enough to give me confidence in the system, just as I'm also reasonably confident that the chance of all the copies of A&A and MNRAS and ApJ in libraries around the world being destroyed is 10^-N where N >> 1.

On another point: in Poland we have the same problem that Anze Slosar mentioned about "European" countries (though I think it's mostly Central/East European countries that have been victims of the shock treatment neoliberal economics experiment; the neocons in France are trying to introduce the same idea about "primes" (premiums) for publishing articles, but they're meeting with stiff resistance). My institute is not quite that bad, except that here the principle is that only publishing articles is relevant; citation rates don't count for much.

Anyway, in general the idea is a good one, though I guess it would be good to do some sort of a survey and ask: does anyone still read the printed versions of journals (apart from what we print ourselves - and would in the old days have had to laboriously photocopy)? Probably (certainly) cosmocoffee readers are a biased sample: the sort of people who don't like reading articles on the screen or printing them for themselves are not likely to be the most enthusiastic participants in

On the other hand, we could just set something up in parallel to the existing system, find some bigwigs willing to lend their prestige to the project, and then it will take on its own momentum.

If a reasonable number of other cosmocoffee people are willing to publish in the system, then I would be willing to publish at least some small fraction of my articles in it to start off with.
