Thursday, April 17, 2008

Update on PAST in Wine

A previous post detailed instructions for running the statistical program PAST within the Linux program Wine, which required turning on the virtual desktop option. Something happened between that post and the last few updates of both PAST (currently at 1.80) and Wine (now at 0.9.59), because now PAST runs just fine without any special options! This means that you can now turn the virtual desktop option back off - this is nice, because it gives you a lot more freedom in terms of resizing, minimizing, and moving the PAST window.

Wow, I think that was possibly the geekiest and most jargon-laden post I've ever written.

Data and the Open Source Paleontologist

Paleontological research generates data, and lots of it--photographs, measurements, CT scans, character matrices, etc. Data are the cornerstone of most good papers, yet for reasons of space and journal style, they often never make it into print (or are relegated to that zone of "online supplementary information"). This is a Catch-22, because anyone hoping to evaluate, reproduce, or build upon your work needs these data.

With the growth of digital media and the internet, things are beginning to change. It's now much easier for paleontologists (and other scientists) to make available primary data - if they choose to do so. This post surveys a few on-line data repositories that are out there, and looks to the future. I'm going to focus on those that are most relevant to paleo types (sorry, no GenBank).

MorphoBank is an on-line data editor and repository for cladistic data matrices. Registration is required to start your own file, after which you can upload images and data matrices. The image upload is particularly nice, because it allows you to link a character coding in a taxon to an image of a particular specimen. This means that someone trying to figure out your character states actually has a prayer of understanding what is meant by "mastoid process elongate (0) or fungiform (1)." The only real downside is that, at present, there doesn't seem to be support for uploading large CT datasets.

Dryad is a relatively new site, intended to archive the basic data underlying publications in evolutionary biology. A number of partner journals have signed on (e.g., Evolution and Systematic Biology), but unfortunately no paleo journals are there yet. One set of paleo data is available on Dryad, related to the Xenoposeidon type specimen. Kudos to Mike Taylor and Darren Naish for that! Data that could be archived here include photos, data matrices, measurements, and other media. Because the site is in such an early stage, the amount of available data and the search functions are currently relatively limited.

This is another relatively recent website for which I have high hopes. The site focuses on finite element modeling in vertebrate biology, with background information and material properties databases. Of even greater interest is an area where published FE models can be downloaded for others to try out. It would be really, really nice if more researchers went ahead and put their models out there!

DigiMorph is one of the earliest data archives out there, focusing on CT scan data. Interested users can download movies of 3D reconstructions or slice sequences, download surface models (usually STL format), or read more about the scans. Unfortunately, for most specimens there is no way to download the actual data - so if you want to analyze some part of a specimen, you're out of luck (unless you contact the DigiMorph folks directly and have them mail you a DVD). I had high hopes for the UTCT Data Archive, which did post TIFF and JPG stacks of images. But, this effort seems to have lost its wind, and very few datasets have actually been posted. Regardless, DigiMorph has done an admirable job of getting at least the basic CT data out there for a number of publications.

This is another new website, appearing in just the last few months. The basic goal is to make available 3D reconstructions generated from serial section data (whether CT or "old-fashioned" thin sections), in an environment where you can rotate and examine the specimens. Because it's in early stages, content is mostly limited to frog specimens (but how cool they are!). All files are in OBJ format, for which a Windows and Mac viewer is provided (I had no problem getting it to run in Wine, once I turned off virtual desktop). For objects with multiple parts (for instance, a frog head with bone and brain segmented separately), you can change colors on certain pieces or make them transparent. It's a nifty little toy for viewing morphology in 3D. The only downside is that the software features are pretty limited (turn part on or off, change color, change transparency), and you can't take measurements of any sort. Also, the raw data from which the reconstructions were generated aren't available. But, it's another great way to get 3D morphological information out there!
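Since the viewer can't take measurements, here is a rough workaround sketch in Python, assuming you have an OBJ file saved locally. OBJ is a plain-text format in which each vertex is a line beginning with "v" followed by x, y, and z coordinates, so the overall dimensions of a model can be read straight from the file. The function names obj_bounding_box and obj_extents are my own inventions for illustration, not part of any site's software.

```python
def obj_bounding_box(path):
    """Return (min_xyz, max_xyz) over all vertex lines in an OBJ file."""
    mins = [float("inf")] * 3
    maxs = [float("-inf")] * 3
    found = False
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # Vertex lines look like "v x y z"; "vt" and "vn" lines are skipped.
            if len(fields) >= 4 and fields[0] == "v":
                found = True
                for i in range(3):
                    value = float(fields[i + 1])
                    mins[i] = min(mins[i], value)
                    maxs[i] = max(maxs[i], value)
    if not found:
        raise ValueError("no vertex lines found in " + str(path))
    return tuple(mins), tuple(maxs)


def obj_extents(path):
    """Return the model's length along the x, y, and z axes."""
    lo, hi = obj_bounding_box(path)
    return tuple(h - l for l, h in zip(lo, hi))
```

The results are in whatever units the model was built in, because the OBJ format itself does not record units. That means these numbers are only useful as real measurements if the scale of the reconstruction is known from elsewhere.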

Paleobiology Database
This database brings together faunal, floral, and stratigraphic data from a variety of published and some unpublished sources. It's a fantastic resource for looking at patterns of distribution, extinction, and diversification. Detailed morphological data (beyond body size or tooth measurements) and images are pretty much absent, because they are beyond the primary scope of the database.

Coming Next. . .
As you can see, a number of resources (and these are just a few highlights) are available already. But, a casual user will notice that paleontological data are pretty scarce on many sites, and data capable of further analysis are even more scarce. In the next post, I'll examine the reasons behind this, and what we can do about it.

Tuesday, April 15, 2008

Data and the Open Source Paleontologist - Teaser

First, an assignment for all readers: Fill out a survey sponsored by NESCent (National Evolutionary Synthesis Center) and the UNC Chapel Hill Metadata Research Center, who are aiming to establish a repository for data underlying published work in evolutionary biology and related fields. The deadline is April 16, so act fast!

This is your chance to offer input on how you use and distribute original research data. Every researcher should have a stake in this. The survey took me about 10 minutes to fill out, and the questions were pretty basic (how often do you get requests to share data? What format are your original data in? What happens to your data after you publish?).

Why is this important? I'm going to cover this in more detail in a future post (hopefully in the next day or two). In brief, transparency is at the heart of reproducibility and testability, which are the basis of science. With more journals pushing research data to "supplementary information," and more scientists working in the digital realm, we have opportunities and problems to confront.

More to come. . .

Tuesday, April 8, 2008

I Win!

In a previous post, I lamented the fact that BioOne forced all of its article titles into all-capitals. This was a serious annoyance for anyone exporting references into their favorite bibliographic manager, because you had to retype the title for inclusion in bibliographies. But, I just went to download the reference for the latest paper on Triceratops cranial ornamentation and found a pleasant surprise. The titles are now in upper- and lower-case! Sure, you still have to do a little editing mid-title, but at least it's not a case of complete retyping. Unfortunately, it seems like the old references are still in all-caps. Ah well.

Old Format:

New Format:
Ontogeny of Cranial Epi-Ossifications in Triceratops

I like to think that BioOne changed this due to all of the angry readers of this blog who wrote in, or that some editor had a change of heart after reading my post. Yeah, I'll go on thinking that. . .

Saturday, April 5, 2008

Organizing Museum Notes and Photographs

Most paleontologists do the sort of research that requires an occasional museum visit. On these visits, we take notes and photographs, and spend time studying this or that specimen. The immediate result is a whole bunch of digital or hard-copy information. In this post, I'm not only sharing my own experience and practices, but also am looking for readers to share their own approaches.

Pre-2003 (before I got a laptop), I put all of my notes in bound notebooks. These were handy because I could put measurements, sketches, and brief descriptions in a single physical location. The plus side is that it was a physical copy that wasn't going to be targeted by laptop thieves or corrupted in that hard drive crash at the hotel room. The down side is that it was a physical copy that couldn't be backed up without a trip to the local photocopy center. Plus, anytime I wanted to look something up from a previous museum trip, I had to run to the bookshelf in my office. Not so handy when on a museum trip elsewhere. My notes also tended to be rather brief and scanty in some cases (what a pain to write out a full description in long-hand!). Finally, all of my photographs were taken on film. The result of this was literally hundreds of photographs, still sitting in a set of drawers in my office (organized by museum, but not much else).

Since joining the age of the laptop and digital camera, my work process at a museum has changed drastically. When sitting in collections, I type all of my notes into a word processing file. All of my photos are taken digitally, too. At the end of the day (or the end of the trip, more frequently), I download all of my images from the camera and sort them into folders by taxon and then by specimen. I can't begin to say how useful this photo organization system has been! My notes (and photos, if they're small enough) are backed up onto a USB jump drive that always stays in my pocket. On my hard drive, notes are organized by museum and by date.
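For anyone who would rather automate the sorting step, here is a minimal sketch in Python. It is not my actual workflow, just one way it could be done. It assumes (and this is purely an assumption for the example) that each photo has already been renamed to the pattern Taxon_Specimen_description.jpg, for instance Triceratops_SPEC-001_lateral.jpg. The function name sort_photos and the folder names are hypothetical.

```python
import shutil
from pathlib import Path


def sort_photos(source_dir, dest_dir):
    """Move photos into dest_dir/<taxon>/<specimen>/ based on the file name.

    Assumes names like Taxon_Specimen_description.jpg (a made-up convention).
    Files that do not follow the pattern are left where they are.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    moved = []
    for photo in sorted(source.iterdir()):
        if not photo.is_file():
            continue
        parts = photo.stem.split("_")
        if len(parts) < 2:
            continue  # name does not follow the assumed pattern
        taxon, specimen = parts[0], parts[1]
        target_dir = dest / taxon / specimen
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / photo.name
        shutil.move(str(photo), str(target))
        moved.append(target)
    return moved
```

Calling sort_photos("camera_download", "photos") would then file each matching image under photos/Taxon/Specimen/. Because the script moves files rather than copying them, it is worth trying it on a copy of the download folder first.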

Another incredibly useful practice has been the use of digital video. If I have a specimen with particularly interesting morphology, or something that doesn't photograph well, or something that would benefit from a walk-around, I use the video function on my digital camera in order to shoot a minute or two of footage. I probably don't do this as often as I ought - it makes a huge difference when I'm back in the office and trying to interpret a series of photos or notes! I must thank the guys at the Utah Museum of Natural History for cluing me in to this practice.

I have found several benefits to this digital approach. First, because my notes are digital, I can take them with me anywhere - for writing papers at home, or comparing with other specimens in a museum. I've also found that my notes are much more thorough - it's easier for me to type a page than to write out a page in long-hand. Finally, a well-organized photo collection is invaluable. On a visit to the ROM, it's very nice to pull up a few views of that specimen at the AMNH (and it's even nicer to be able to find the photos in a matter of five seconds, instead of five minutes).

Unfortunately, I've also discovered some limitations. First, the digital approach makes it rather difficult to sketch with any efficacy. Sure, you can hook a drawing tablet up to your computer - this just didn't work for me, though. Another alternative might be to make the sketches on paper, and then scan them into the notes. A second major limitation concerns working on exhibit mounts. There's never a good workspace (or even space, in some cases) on which to set up a laptop. Thus, I usually fall back on the "traditional" methods in this case.

So, what do the rest of you researchers do? What works? What doesn't work?