Wednesday, January 13, 2010

Gold Standard Records

Starting on a new project always brings a certain level of excitement. Often what is a "new" project for a trainee is actually a continuation or a spin-off of a predecessor's work. If you're lucky, the predecessor is still around to give you all the highlights, provide protocols, show you the results and where to find the reagents, etc. If you're not so lucky, then your excitement can soon turn to frustration as you wade through stacks of notebooks and data files, trying to figure out what exactly your predecessor did and where s/he stored key reagents. Sometimes you learn that the "representative" results presented to the PI were actually the best results, a handful out of a virtual mountain of data.

This is just one situation that illustrates the importance of fastidious data management in research laboratories, an issue that might be one of the biggest weaknesses of academic research labs.

Back to basics
Let's start with the lab book. This is where most of us (and I include myself here) need to go back to the first day of gen chem. Anything that goes into the book should be legible and coherent. And we should be writing down everything--well, at least everything pertinent to the experiment (your successors don't really need to know what you had for breakfast or how hungover you are). This includes:
  • why you're doing the experiment (a.k.a. the objective)
  • the experimental setup and procedure including pesky things like recording concentrations of reagents, volumes for injections, the solvent or buffer used for dilutions, instrument and settings used... You catch my drift.
  • raw data (or reference to its location)
  • locations of data files, including physical location, directory, folder, file names
  • how data was processed
  • final results (i.e. the pretty graph or table)
  • conclusions and/or notes for future experiments
Writing all this can become extraordinarily tedious, especially when we're doing similar experiments on a weekly or even daily basis. In some cases, it is sufficient to reference a page in the lab book where the protocol was first described, making note of alterations. Alternatively, write up the standard protocol in a word processing document, make appropriate changes for a given experiment, and print and paste it into the lab book. If something changes during the course of an experiment, make a note of it. It doesn't really matter what approach we use, so long as we are being thorough. There should be sufficient detail for someone to repeat the experiment without ever talking to us.

We should also be writing in the book as we work, whenever possible. Too often, we place faith in our memory or our complex system of notes on post-its, paper towels, and gloves. We become slack in maintaining our books, updating them every few days, or maybe even once a week... or less. Then as we're updating our books, we realize we're a little fuzzy on the details... or that we mistakenly tossed that glove in the trash because we thought it was rubbish... so we end up guessing or trying to back-calculate how much of X we added. Not good.

Finally, don't forget to index it! Those wonderfully detailed, coherent notes won't do anyone much good if no one can find them. Chances are, you don't need me to tell you how much of a PITA it is to dig through years of data and notebooks with no idea where you should be looking.

Data in the digital age
The thing about gen chem, at least when I took it, is that it was beautifully simplistic. I think there was maybe one lab in the entire year that used a probe connected to a computer. The same goes for every chemistry and most biology lab courses that I took as an undergrad. It was simple enough to put everything in a notebook then. As we advance to higher level research, though, the game changes. There's proteomics, FACS, real-time intravital imaging, and a myriad of other techniques that generate massive amounts of data. While working on this post, I was collecting about 5 GB of data... for one replicate in one group of one experiment. Raw data from such experiments do not lend themselves to hard copy production. They only exist in the digital world. So we must be as fastidious in organizing and maintaining digital records as we are in maintaining our lab books.
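
For anyone comfortable with a bit of scripting, here is one way the "locations of data files" entry could be generated rather than typed by hand. This is only a minimal sketch in Python, not a description of what any lab mentioned here actually does. The function names (`sha256_of`, `build_manifest`) and the manifest file name are made up for illustration. It walks an experiment folder and writes a small CSV listing every file, its size, and a SHA-256 checksum, which could then be printed and pasted into the lab book.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    multi-gigabyte raw data files never have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir, manifest_path):
    """List every file under data_dir (hypothetical experiment folder)
    with its size and checksum, and write the list to a CSV file."""
    data_dir = Path(data_dir)
    manifest_path = Path(manifest_path)
    rows = []
    for path in sorted(data_dir.rglob("*")):
        # Skip directories and the manifest itself if it lives in data_dir.
        if path.is_file() and path.resolve() != manifest_path.resolve():
            rows.append({
                "file": path.relative_to(data_dir).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "bytes", "sha256"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The checksum is there so that a successor can confirm the file they found is the file the notebook refers to, and not a later, reprocessed copy with the same name.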

Backup plan
I think we have lived in the digital age long enough to realize that sometimes computers die and, despite IT's best efforts, cannot be resuscitated. This is why we should be backing up all of our data files on a regular basis. Both Bear's and Guru's labs keep external hard drives around for this purpose. Some labs may have access to network storage through their institutes. Generally space is fairly limited, but this is fine if you're not generating gigabytes of data on a daily basis.
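
A regular backup to an external drive can be as simple as a short script. The following is a rough sketch in Python under some stated assumptions: the function name `back_up` and both folder arguments are hypothetical, and it is not the procedure used in either lab mentioned above. It copies any new or changed file from a data folder to a backup folder, then compares the copy byte for byte against the original before moving on.

```python
import filecmp
import shutil
from pathlib import Path


def back_up(source_dir, backup_dir):
    """Copy new or changed files from source_dir into backup_dir and
    verify each copy. Returns the relative paths that were copied.
    Both directory arguments are placeholders for illustration."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        dst = backup_dir / src.relative_to(source_dir)
        # Skip files whose backup copy is already byte-for-byte identical.
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also keeps the modification time
        # Verify the copy before trusting it.
        if not filecmp.cmp(src, dst, shallow=False):
            raise IOError(f"backup of {src} does not match the original")
        copied.append(src.relative_to(source_dir).as_posix())
    return copied
```

Note that this sketch overwrites a backup copy when the original changes and never deletes anything from the backup folder. For raw data that should never change, an overwrite is itself a warning sign worth noting in the lab book.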

When it comes to backups, though, one thing we don't think about so much is our physical lab books and data. However, there is the possibility of fire or flood in the lab destroying our research records. Or they might just sort of wander off. I have yet to see a lab that uses duplicator notebooks or that photocopies or scans notebook pages, but it's probably not a bad idea. Lab books, after all, are the primary record of everything that's been done in the lab.

Safeguard
A peculiarity of data management is that many PIs don't talk about it. In my graduate and postdoc labs, on my first day, someone showed me where the new notebooks were kept. That was it. When I left my graduate lab, I just told the lab manager where my lab books were stored. It seems PIs assume that scientists--whether students or postdocs or research associates--know how to fill out a lab book and keep data organized. Perhaps PIs anticipate that the lab manager or other colleagues will provide direction as necessary. Of course, because this is a day-to-day task, it is not feasible or reasonable for a PI to constantly check lab books. And some people won't make long-term change without constant reminders.

So what's a PI to do? How is s/he to monitor and maintain the integrity of data and records without randomly inspecting lab books? Does anyone actually do the "understood and witnessed by" thing outside of industry?

Guru is a fan of seeing all data--the good, the bad, the ugly, the inconclusive... He periodically meets with individuals to discuss projects and experiments. As a trainee, it's a necessity to bring your notebook to these meetings because Guru might ask you about results from days, weeks, or months ago. In so doing, Guru sees our notebooks. This could offer a solution. Yet I have encountered some of the same problems locating information from previous trainees.

Some might argue (rightfully) that PIs have better things to do and shouldn't bother. Trivial as it may seem, proper data management is crucial to an efficient and productive laboratory. Researchers must be vigilant in keeping good records, but PIs should ensure that records are clear and consistent.

Comments (10)

Lab minion · 794 weeks ago

In my lab, this is one of those things (one of many things...) that isn't talked about explicitly at all.

I'm struggling a little bit with the requirements of record keeping during "normal" benchwork, high throughput and/or data heavy experiments, and computational work/data analysis. I do some of each at different times, and the experiments often interweave. I'm not the most organized person in the world, so I have to make extra effort to keep a good standard of record keeping (and also some of my struggles might sound awfully dumb.) But so far, it's not going so great.

Benchwork is easy enough to record in a lab book in something resembling a standard fashion. But when I'm at the computer writing software, I gravitate towards electronic documentation (who wants to write out by hand a sequence of commands and results when I can just copy and paste into a text file?). But how to integrate these? Where does source control fit into the picture (another thing I haven't set up yet...) And where does that leave things like general objectives/strategies, todo lists, and notes from meetings, papers or seminars?

I'm thinking that I might have to start printing out my notes after a coding session and taping them into my lab book. But that seems like overkill. If anyone has any advice or even just statements along the lines of "well, this is what I do, and it works for me..." I would be intensely grateful.
2 replies · active 793 weeks ago
It is a unique challenge when you've got multiple projects going and/or you're using methods that involve long, drawn-out processes. My approach has been to set up multiple notebooks. I tend to divide based on general theme/method. In grad school, I had one for cloning and protein purification (just generating reagents for use), one for chemical synthesis, and several for "real" experiments (i.e. the ones that would produce data for papers). In my current lab, I'm using a similar system, dividing in vitro work, in vivo studies, cloning, and general protocols (i.e. buffer recipes and generic procedures). This has worked reasonably well for me. A colleague in my graduate lab, who used similar methods on many different projects for our lab and collaborators, divvied up his books by project theme.

I know nothing about programming, but I'm getting into work where I generate several GB of data per experiment. After all the appropriate processing, I'll end up with a set of movies. The approach I'm taking is to jot down pertinent experimental details and filenames with a description specifying what was done in that file. Later I'll record how the data were processed and the corresponding files and probably include a few screen shots.

Generally, I restrict information in my lab books to design and execution of experimental work. I include some strategic planning, i.e. why and how I want to do X. For instance, when working out a retrosynthetic pathway, I sketched it out in my notebook and included references for reactions that I didn't typically use. I also jot down notes for how the next experiment could be improved or what needs to be changed. I keep a separate (non-lab) notebook for seminar notes and another for meeting notes, planning, and to-do lists; most of this is not pertinent to the lab work.
The following comment was left on the "Responsibilities: Open Forum" post. I've copied it here because, even though I know nothing about this, it might be pertinent to Lab Minion's situation:

Tsu Dho Nimh said...
When you are collecting and tracking reams of computerized data, consider using the same systems that are used to maintain and modify software source code or web sites.

http://en.wikipedia.org/wiki/Revision_control

These systems can track changes and create branches off the main line, and handle reversion control on text documents. They can also check files into a controlled directory, for your data collection runs.

Some applications intended for web development have version control built into them. Look at Wikipedia articles' history. Automatic backups to an off-site facility, or daily burning to DVD, can be set up easily.

Good grief, join the computer age.
"There should be sufficient detail for someone to repeat the experiment without ever talking to us."

AHAHAHAHAHAHAHAH!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2 replies · active 794 weeks ago
Ok, how about this:

"There should be sufficient detail for someone to repeat the experiment without hating our guts and wishing death upon us."
I know... if only we could live in such an idyllic world.
Sadly enough, I used to use a duplicator notebook but we stopped ordering them. I type out a written protocol and end up typing up the data too and make a final report for experiments that goes into a project binder, besides writing every darn detail in my notebook. So at least if my notebook makes off I have the report in the project binder, on my puter, and a digital backup. That is as fastidious as I am going to get; any more and I would be spending more time on records management than experiments.
1 reply · active 794 weeks ago
I like the final report idea. I might try this in the future.

As for the backup of lab books, I feel this is an issue that the PI should work out. As you point out, if we spend any more time managing and backing up records, we start running out of time for experiments. I think this would ideally be under the purview of an admin or lab manager--although they might argue otherwise :P
Great post! I started to comment, then realized I have way too much to say, so I wrote some posts on my blog. http://samanthascientist.blogspot.com/ if you're interested.
1 reply · active 794 weeks ago
Nice! I'll be looking at this in more detail later.

SSci asks in one post if there is a lack of consensus for data management and organization. But I think the real issue (in academia) is a failure of PIs to establish practices and consequences. The most fastidiously maintained records I've seen among colleagues and predecessors are those of people who spent a few to several years in industry prior to working at a research university. If you're working in industry, there are established guidelines for good laboratory practice (GLP). GLP, esp. data management, is do-or-die for companies because of project and personnel turnover and legal requirements for patent filing. I don't know whether the lack of GLP guidelines and repercussions for not following them in academic labs is due to "out of sight, out of mind" or the "I have better things to do with my time" mindset.
