In recent discussions on Twitter and in the blogosphere, I've chatted with Mike Taylor, Ross Mounce, and others about self-archival as one of many mechanisms to bring about open access. Mike's recent blog post at SV-POW! summarizes much of the discussion to date, and I thank him for helping me to crystallize my thoughts on the topic.
For those who are not familiar with the term, self-archival refers to placing a freely downloadable copy of a publication (or other work) on one's personal (or departmental, or whatever) web page. In this post, I want to discuss the pros and cons of such an approach.
- On the plus side, the PDF is freely available to anyone who wants to see it. No paywalls. No hassle.
- Once picked up by search engines, your posting may be the first one web users find - even above the "official" journal page!
- Users who come to your website for the PDF might discover closely related work while they browse. This can be a big plus for getting the word out about your research program.
- On the downside, a personal archive is probably not a permanent archive. Barring special arrangements, your personal or institutional web page is not likely to last long beyond your lifetime. Free hosting services such as WordPress may not be around in 20 years (remember Geocities?), so it may be worthwhile to pay for hosting. And make sure your descendants pay for hosting, or that your departmental web administrator doesn't delete your page 15 years after you retire. I have little faith that the PDFs I post on my own web page will be around 200 years from now, at least at that website. That sure would stink for that researcher in 2212, who wants to read all about ceratopsian sinuses.
- Author-hosted archives are not independent. There is nothing to prevent someone from removing embarrassing details or adding fraudulent information to their publications, and little that a casual reader can do to detect such fraud. The great majority of academic authors are honest - it's that tiny minority we have to watch out for. An independent archive, hosted by an institution, library, or publisher, provides a firewall protecting the literature from the authors.
- As article-level metrics gain prominence, author-hosted PDFs may skew some statistics. For instance, let's say I publish a paper in PLoS ONE, and also post a copy of the PDF to my site. Because PLoS ONE records and posts view and download statistics for its own site, any downloads or views from my site are not recorded there. Thus, the statistics are spread across several venues. This is not a major issue in my opinion, but some people may care.
- Under the terms of publication, a publisher may not allow you to post a PDF of your paper. Or, they may only allow you to post a pre-review copy. Or a post-review, unformatted copy. Things get complicated quickly, especially for those concerned about following the letter of the law.
If you are an active researcher, you should be posting whatever PDFs of your own work you (legally) can. If you don't, you're missing out on innumerable opportunities to publicize your work and interact with colleagues. However, personal archiving is not enough to ensure permanence. For the long term, a bigger solution is needed. Institutional archives, journal archives, society archives, whatever. The ultimate answer may take some time to sort itself out.