August 2004 Newsletter

The SEED Developers Meeting in Chicago in October

The next SEED Developers Meeting will be in Chicago during the week of October 24-28. The meeting will be split between Argonne National Lab (Oct 24-25) and the University of Chicago (October 26-28).

The initial period at Argonne will focus on preparing a distribution version of the GenDB/SEED merged environment. This should produce a set of DVDs with both GenDB and the SEED in the distribution. The merged environment is built on technology developed at the University of Bielefeld in Germany and the combination should offer capabilities that are attractive to many groups. In particular, it should support straightforward editing of genes (adding CDSs, deleting them, and changing start locations), as well as comparative capabilities. This part of the meeting will be quite technical. Anyone planning on attending should contact Ross Overbeek (Ross@TheFIG.info) or Mike Kubal (mkubal@mcs.anl.gov) by September 15th. Entrance to Argonne is restricted, and we need to apply for guest passes for visitors well in advance (especially for foreign nationals).

The second part of the meeting will be held at the University of Chicago and will focus on the Project to Annotate 1000 Genomes (i.e., subsystems annotation). This will include both a tutorial for people who wish to understand how to annotate using the SEED and an exchange of views and status reports by people already quite active in the effort. For those wishing to attend this part, please contact either Mike or Ross, and feel free to do so right up to the meeting.

The Subsystems Effort: How it is Shaping Up

The subsystems effort is beginning to take shape. It now involves a number of people, some real experts and some relatively inexperienced. Some people work on the SEED server at the University of Chicago (http://TheSEED.uchicago.edu/FIG), but more work on their own personal systems. Periodically, they deposit versions of their analysis on a server from which others can download them as desired (this is called "publishing" the subsystem). Any user of the SEED can download any or all of these deposited subsystems.

A Wiki is used to coordinate who is working on what (http://www-unix.mcs.anl.gov/SEEDWiki/moin.cgi/SubsystemBulletinBoard). A Forum (http://subsys.info) has been set up as a framework for posting notes of general interest, predictions, and general comments. The forum is intended to become a vehicle for opening up the effort to an unlimited set of potential users.

We have found that it takes some effort to get people acquainted with the Wiki and the Forum, and using the software to develop a subsystem does require a tutorial. There are basic inadequacies in the tools at all levels, and this can prove to be quite frustrating.

However, these last three weeks (vacation time seems to be coming to an end) have seen an amazing amount of progress, and things are now beginning to take off. A number of subsystems that cover thousands of assignments have been published in these last few weeks, and we know of many more in progress. We are very enthusiastic at this point.

The iBook: a Minor Technical Note

Sveta Gerdes recently joined FIG and has begun working on encoding subsystems. Due to some lovely technical work by Bob Olson and Ed Frank, the entire SEED can now be run on a laptop without an external drive. It does consume 20-30 gigabytes of the internal disk. Sveta now uses an iBook (basic cost of about $1499 + $129 for an additional 512 MB of internal memory) that supports a full version of the SEED on the internal disk.

Leading to a Proposed Annotation Environment

This leads us to briefly discuss the basic working environment that we are proposing for annotation teams:

  1. A central SEED server is optional, but is often useful for two reasons: it can be used to support people that have only really inexpensive laptops or desktop computers, and it can be used as a framework for offering access to the world at large. Most annotation efforts will include one, and it can be run on anything from a $1500 Linux box to a $4000 Mac G5.
  2. Some members of the annotation team will have moderate laptops ($1600-3000 machines) running local copies of the SEED. Others will use personal computers to work over the network on the central SEED server.
  3. Function assignments and annotations are exchanged between individual SEED versions of members of the team (and the central server, if it exists) each evening.

The currently available peer-to-peer tools in the SEED are not adequate to effectively support this style of synchronization. We consider it a high priority to get them to this point in the next month or so.

This leads us to a central tenet of the proposed environment: assignments and annotations are the output of an annotation effort, and there must be an external representation that defines exactly what is meant by their content. In the presence of such a definition, individual users can use whatever annotation software they wish, as long as the results can be integrated and synchronized with those of the entire team.

Once environments are established that contain multiple annotation systems, all synchronizing their results on a daily basis, systems will evolve to take over niches (sorry, perhaps the level of enthusiasm is clouding our judgement here...).
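
To make the idea of an external representation plus nightly synchronization a little more concrete, here is a minimal sketch in Python. It is only an illustration and not the SEED's actual format or peer-to-peer code: the record fields (gene_id, function, annotator, timestamp), the tab-separated layout, and the "most recent assignment wins" rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Assignment:
    """One function assignment (hypothetical record layout, not the SEED's)."""
    gene_id: str     # identifier of the gene being annotated (assumed field)
    function: str    # the assigned function, as free text
    annotator: str   # who made the assignment
    timestamp: int   # seconds since the epoch when it was made


def to_line(a: Assignment) -> str:
    """Serialize an assignment as one tab-separated line (assumed format)."""
    return "\t".join([a.gene_id, a.function, a.annotator, str(a.timestamp)])


def from_line(line: str) -> Assignment:
    """Parse a line produced by to_line back into an Assignment."""
    gene_id, function, annotator, timestamp = line.rstrip("\n").split("\t")
    return Assignment(gene_id, function, annotator, int(timestamp))


def merge_assignments(
    local: Dict[str, Assignment], incoming: Iterable[Assignment]
) -> Dict[str, Assignment]:
    """Merge incoming assignments into a copy of the local set.

    Assumed policy: for each gene, keep the assignment with the latest
    timestamp. A real team might prefer a different conflict rule.
    """
    merged = dict(local)
    for a in incoming:
        current = merged.get(a.gene_id)
        if current is None or a.timestamp > current.timestamp:
            merged[a.gene_id] = a
    return merged


if __name__ == "__main__":
    local = {"gene1": Assignment("gene1", "hypothetical protein", "alice", 100)}
    incoming = [
        Assignment("gene1", "DNA polymerase", "bob", 200),
        Assignment("gene2", "ribosomal protein", "bob", 50),
    ]
    for a in merge_assignments(local, incoming).values():
        print(to_line(a))
```

Because every tool only has to read and write the shared line format, the annotation software each person runs locally can differ, which is the point of defining the representation outside any one system.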

Tutorials

Ross is planning on teaching a 3-day course in annotations and subsystems technology at UCSD on Oct 4-6.

We plan on having a subsystems tutorial and general discussion of SEED technology at MIT in Oct (probably Oct 18-19).

The cancelled tutorial at Los Alamos will be rescheduled for Santa Fe in November.

If you wish to know about any of the details, please check the FIG Forum or contact Veronika (Veronika@TheFIG.info).

The Funding Situation

During FIG's first year, the funding situation might reasonably be characterized as somewhat precarious. We survived due to a few relatively small contracts, and some donations from people and institutions. We are deeply grateful to those who helped support our goals during that period. The plan was always to establish a basic open source platform and then to use it to secure portions of grants that would build upon those capabilities. Numerous grants were submitted in which FIG was included.

Finally, in August FIG signed a subcontract finalizing its participation in a 5-year project led by PI Rick Stevens and co-directed by Ross to build a National Microbial Pathogen Database (http://www.niaid.nih.gov/dmid/genomes/brc/awards.htm). This is an $18M grant made to the University of Chicago. We intend to participate in a number of grants; in some we will play a very minor role, helping to develop independent teams based on SEED technology, and in others we will play a more active role. The SEED technology will prove useful in a number of the emerging application areas ranging from pathogen databases to analysis of environmental samples, and we encourage groups to consider basing long-range developments on it.
