CHLC REPORT

Volume 1, Number 1, May 1993

INTRODUCTION

Jeffrey C. Murray, M.D.
Principal Investigator

This newsletter represents the first report from the Cooperative Human Linkage Center (CHLC) established by the NCHGR in Fall, 1992. We have included short descriptions of each of the involved projects, which are located at The University of Iowa, Fox Chase Cancer Center, Marshfield Medical Research Foundation and Harvard Medical School. In addition to short project descriptions, we have included the first round of genetic maps developed by the center.

The long-range goal of the center is to develop high heterozygosity genetic maps that are greatly enriched for the presence of easy-to-use PCR-formatted microsatellite markers, with a particular emphasis on tri- and tetranucleotide repeats that are easy to genotype. The grant will synthesize published genotypic data developed on the CEPH families by outside investigators, as well as genotypic information generated from marker development in CHLC core laboratories. The center is also open to assisting outside investigators who would like to incorporate their own genotypic information into these maps.

The maps presented here are a preliminary synthesis of publicly available genotypic information existing in the CEPH database and are seeded with the first sets of markers developed through our own efforts. We provide information for online access to a CHLC database of these markers and maps, which will be revised collectively at approximately six-month intervals. In addition, information and access to markers will be provided, both as an online service and through direct reagent access facilitated through primer availability at Research Genetics.

We will continue to work with others to bring genetic maps to a high degree of resolution and to facilitate disease gene mapping using a variety of strategies that benefit from the availability of highly polymorphic markers. Such strategies include not only linkage analysis, but also studies of non-traditional inheritance such as imprinting, locus expansion, and loss of heterozygosity. In addition, the markers developed in this center will provide STSs for physical mapping efforts currently underway. All markers developed will be assigned chromosomal localizations, and although only those markers with heterozygosities above 0.7 will initially be genotyped and entered into the linkage maps, all markers with chromosomal assignments will be made available for efforts by other laboratories for genetic or physical mapping.

We welcome comments and suggestions pertaining to the newsletter and our plans; these can be communicated directly by e-mail, phone or fax to any of the relevant co-investigators or contacts listed below.

Jeffrey C. Murray, M.D.
Associate Professor of Pediatrics
The University of Iowa
Iowa City, IA 52242
TEL: (319) 356-3508
FAX: (319) 335-6970
e-mail: jeff-murray@umaxc.weeg.uiowa.edu

Geoffrey M. Duyk, M.D., Ph.D.
Department of Genetics, EQRF Room 447
Harvard Medical School, 200 Longwood Ave.
Boston, MA 02115
TEL: (617) 432-6072
FAX: (617) 432-7663
e-mail: duyk@rascal.med.harvard.edu

Val C. Sheffield, M.D., Ph.D.
Assistant Professor of Pediatrics
The University of Iowa
Iowa City, IA 52242
TEL: (319) 356-2674
FAX: (319) 356-3347
e-mail: sheffield@vaxa.weeg.uiowa.edu

James L. Weber, Ph.D.
Senior Scientist, Human Genetics
Marshfield Medical Research Foundation
Marshfield, WI 54449
TEL: (715) 387-9179
FAX: (715) 389-3808
e-mail: weberj@dgabby.mfldclin.edu

Kenneth H. Buetow, Ph.D.
Fox Chase Cancer Center
7701 Burholme Avenue
Philadelphia, PA 19111
TEL: (215) 728-3152
FAX: (215) 728-3574
e-mail: kh_buetow@fccc.edu

Robert F. Weir, Ph.D.
Professor of Pediatrics
The University of Iowa
Iowa City, IA 52242
TEL: (319) 335-6705
FAX: (319) 335-8318

Nancy Newkirk
CHLC Administration
The University of Iowa
TEL: (319) 335-6899
FAX: (319) 335-6970


PROJECT 1

Geoffrey M. Duyk, M.D., Ph.D.

Our marker selection approach has been to develop technology which enables us to rapidly accumulate small insert clones from all classes of tri- and tetranucleotide STRs. The basic strategy, termed marker selection, requires the construction of high complexity, small insert libraries essentially free of chimeras or clones without inserts.

This choice reflects the prior existence of large efforts to develop dinucleotide repeat markers, the general perception that these classes of markers result in more readable amplification products and the possibility that the availability of STRPs from multiple repeat classes will permit hybridization-based multiplex genotyping. In addition, with the increasing recognition that trinucleotide repeat expansion may be an important mechanism underlying human genetic disease, the availability of a large number of trinucleotide STRPs may provide an important resource for disease gene identification.

Other activities of Project 1 include studies devoted to increasing genotyping throughput as well as the development of efficient methods for recovery of STRs from large insert clones. Such methods will be essential for gap filling. As the project matures, the availability of a large set of STRPs will permit the investigation of the basis for repeat variability and explosion, help establish a set of cDNAs maintaining STRP sequences, and allow further exploration of the role of repeat expansion in mutation. Investigators interested in additional information, detailed protocols, vectors or bacterial strains should contact:

Geoffrey M. Duyk, M.D., Ph.D.
Department of Genetics, EQRF Room 447
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
TEL: (617) 432-6072
FAX: (617) 432-7663
e-mail: Duyk@rascal.med.harvard.edu

PROJECT 2

Val C. Sheffield, M.D., Ph.D.

Project 2 of the Cooperative Human Linkage Center has as its primary goal the development of a minimum of 2,000 new, highly polymorphic (>0.70 heterozygosity) short tandem repeat polymorphisms (STRPs) with an emphasis on developing tri- and tetranucleotide repeat markers. The strategy for marker development consists of sequencing marker-selected clones obtained from Dr. Duyk's laboratory (Project 1), selecting PCR primers flanking the repeat and testing the PCR product for polymorphic information content. All markers are assigned to a specific chromosome using monochromosomal somatic cell hybrids, and all highly polymorphic markers are sent to Dr. Jeffrey Murray's (Project 3) and Dr. James Weber's (Project 4) laboratories for high resolution genetic mapping.
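As an illustration of the 0.70 heterozygosity criterion mentioned above, the following minimal sketch computes expected heterozygosity and polymorphic information content (PIC) from allele frequencies using the standard textbook formulas. The function names and the example frequencies are assumptions made for illustration; they are not taken from CHLC software or data.

```python
from itertools import combinations
from typing import Sequence


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs: Sequence[float]) -> float:
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


def passes_threshold(freqs: Sequence[float], threshold: float = 0.70) -> bool:
    """True if the marker's expected heterozygosity exceeds the threshold."""
    return expected_heterozygosity(freqs) > threshold


# Hypothetical marker with four equally frequent alleles.
example = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(example))          # 0.75
print(polymorphic_information_content(example))  # 0.703125
print(passes_threshold(example))                 # True
```

In practice the allele frequencies would be estimated from typed individuals, so the computed values are themselves estimates.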

In the past few months, Project 2 has developed over 300 tetranucleotide markers. These markers are highly polymorphic, assayable using a standardized PCR condition, and have readily interpretable alleles. In addition to the goal of developing new STRPs, SSCP and DGGE are being used to identify polymorphisms in the 3' untranslated region of cDNA sequence. The identification of polymorphisms in cDNA sequence allows placement of cDNAs on the genetic map.

Another goal of Project 2 is to develop a set of approximately 200-300 uniformly distributed STRPs which can be used for primary disease linkage studies. To this end, a primary linkage set of approximately 200 markers, all assayable using a single PCR condition, was developed. These markers, most of which are dinucleotide repeats, have proven extremely useful for disease linkage studies. For example, in collaboration with others, Project 2 has used the primary linkage set of markers to identify five hereditary eye disease loci. In order to improve the efficiency of primary linkage studies, the dinucleotide repeat markers are gradually being replaced with tetranucleotide repeat markers.

An underlying theme of the CHLC is the distribution of its resources to the user community. To this end, the CHLC will distribute STRP primers through Research Genetics and other interested companies. In addition, arrangements can be made for investigators working on disease families to bring their family resources to the University of Iowa to perform linkage studies on a collaborative basis.

Val C. Sheffield, M.D., Ph.D.
Assistant Professor of Pediatrics
The University of Iowa
Iowa City, IA 52242
TEL: (319) 356-2674
FAX: (319) 356-3347
e-mail: sheffield@vaxa.weeg.uiowa.edu

PROJECT 3

Jeffrey C. Murray, M.D.

The primary goal of Project 3 is to generate genotypes for the STRPs developed through Projects 1 and 2. These genotypes are then fed to Project 4 for incorporation into the developing linkage maps. Project 3 focuses on generating high quality, reliable genotypes, using a variety of robotic assists, on a subset of the 60 CEPH families. Genotypes are currently generated by body-labelling PCR products using 35S and analyzing the fragments on sequencing gels.

Genotypes are set up from formatted 96-well titre plates that include vacant wells at intervals to allow for controls and gel alignment. Multiplexing is currently done at the level of gel loading. Genotypes are scored and entered by hand in duplicate, with a subset of those generated also typed in duplicate through Project 4 to allow for data validity checks.

The project also has a limited ability to assist outside investigators in their own genotyping efforts. This would include hosting two-day to two-month visits for investigators who wish to carry out genotyping on their own samples, genotyping of newly-generated anonymous markers or shotgun linkage searches in familial disorders.

Jeffrey C. Murray, M.D.
Associate Professor of Pediatrics
The University of Iowa
Iowa City, IA 52242
TEL: (319) 356-3508
FAX: (319) 335-6970
e-mail: jeff-murray@umaxc.weeg.uiowa.edu

PROJECT 4

James L. Weber, Ph.D.

The major goals of Project 4 are to type newly developed STRPs through the CEPH families, to improve STRP genotyping technology, to collaboratively map disease genes, and to analyze several human meiotic parameters such as interference and sexual differences in recombination.

Typing of new STRPs will initially involve use of about 210 individuals from 14 of the largest CEPH families. Emphasis will be placed upon reduction of typing errors through the use of standard arrays of DNA templates within microtiter plates and 12-channel pipetting devices. Alleles will be assigned consistently among different families, leading to useful estimates of allele frequencies.

Improving STRP genotyping technology will initially involve efforts to maximize the numbers of genotypes obtained per sequencing gel. Routinely three to six markers will be amplified simultaneously and electrophoresed together on 144-lane gels. In this way, up to 850 genotypes will be obtained per gel. Image analysis software specifically designed for STRPs will be used to speed the scoring of the markers and to avoid inconsistencies in allele assignment among families. Hardware and software for fluorescence-based sizing of alleles will gradually be developed to decrease the amount of labor required for genotyping.
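The per-gel throughput figure follows from simple arithmetic: markers per lane multiplied by sample lanes. The sketch below makes that explicit. The function name and the idea of reserving lanes for controls or size standards are assumptions for illustration; the newsletter does not say how many lanes, if any, are reserved.

```python
def genotypes_per_gel(markers_per_lane: int, lanes: int, reserved_lanes: int = 0) -> int:
    """Upper bound on genotypes from one gel: one genotype per marker per sample lane."""
    if not 0 <= reserved_lanes <= lanes:
        raise ValueError("reserved_lanes must be between 0 and lanes")
    return markers_per_lane * (lanes - reserved_lanes)


# Three to six markers multiplexed on a 144-lane gel.
print(genotypes_per_gel(3, 144))  # 432
print(genotypes_per_gel(6, 144))  # 864
```

The six-marker bound of 864 is close to the "up to 850" quoted above; a few lanes set aside for controls or standards would plausibly account for the difference, though that is an inference rather than a stated fact.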

Collaborative disease gene mapping efforts which have already resulted in the localization of a dozen genes will be continued through the CHLC. Visitors will come to Marshfield for periods of up to two months to engage in concentrated genotyping efforts. Because of limited amounts of available equipment, generally only one visitor will be accepted at one time. Visitors are responsible for all travel costs and living expenses in Marshfield, but all supplies will be provided by the CHLC. Interested individuals should contact Jim Weber at the address below. Groups working on disorders prevalent in minority groups or disorders that primarily affect women are especially encouraged to apply.

As many as 10^6 new genotypes will be determined by the CHLC over the next few years. These data represent an enormous new resource of human meiotic information. Distributions of crossovers along the chromosomes, crossover interference, sex-specificity in recombination rates, recombination hotspots, and relationships between genetic and physical distances are among the meiotic parameters that will be analyzed.

James L. Weber, Ph.D.
Senior Scientist, Human Genetics
Marshfield Medical Research Foundation
Marshfield, WI 54449
TEL: (715) 387-9179
FAX: (715) 389-3808
e-mail: weberj@dgabby.mfldclin.edu

PROJECT 5

Kenneth H. Buetow, Ph.D.

It is the primary goal of this project to use the marker and genotype data generated in Projects 1-4 to construct a high integrity, fine structure, meiotic map of each human chromosome. Map construction will be conducted in a two-tiered manner. First, a high heterozygosity 10 cM resolution index map of PCR-detectable markers will be constructed. Next, likelihood and crossover minimization techniques will be used to integrate additional points to achieve a 2.5 cM resolution index map. These techniques will also be applied to obtain likely locations for previous RFLP typing from the CEPH panel and lower heterozygosity gene loci. It is recognized that the map construction here will parallel efforts in progress in other gene mapping laboratories. The centralized effort conducted in this investigation will be complementary to these investigations.

As the first step toward accomplishing the above goals, a collection of maps has been generated that combines publicly available data with new genotype data generated by CHLC investigators. These maps integrate the genetic maps generated by the NCHGR Index Map Consortium and Genethon. They are augmented by data on additional markers provided by CHLC and CEPH investigators. The datasets are available through anonymous FTP (see below).

To generate the maps, the CHLC is using a new, semi-automated, map construction algorithm. The mapping algorithm is a stepwise construction procedure that utilizes the program CRIMAP as its analytic engine. The dataset is initially diagnosed for pairwise observations that show heterogeneity in pairwise recombination estimates by family. Such loci are excluded from primary construction. Loci are initially added to the map in order of information content. As each locus is added, support for the map and map expansion is re-evaluated. Loci that expand the map and/or are not supported by lod 3 criteria are removed. Loci demonstrating map expansion are moved to the end of the list for consideration in locus placement. The process is repeated until no loci can be added to the map at lod 3 support. The maps built by this algorithm are somewhat more sparse than maps built by more traditional mapping algorithms (average marker density is 6.7 cM). However, they have very high confidence and low error rates. These maps, called skeletal maps, and their corresponding error profiles are available through anonymous FTP.
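The control flow of the stepwise procedure described above can be sketched roughly as follows. This is a simplified illustration, not the CHLC implementation: the evaluate callback stands in for the CRIMAP-based likelihood evaluation, the initial heterogeneity screen is omitted, and every name and value in the toy data is hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# (index at which to insert, lod support for that placement, expands the map?)
Placement = Tuple[int, float, bool]


def build_skeletal_map(
    loci: List[str],
    info_content: Dict[str, float],
    evaluate: Callable[[List[str], str], Placement],
    lod_threshold: float = 3.0,
) -> List[str]:
    """Add loci in order of information content; keep only well-supported placements."""
    queue = deque(sorted(loci, key=lambda name: info_content[name], reverse=True))
    order: List[str] = []
    added = True
    while added and queue:  # repeat passes until a full pass adds nothing
        added = False
        deferred = deque()
        while queue:
            locus = queue.popleft()
            position, lod, expands = evaluate(order, locus)
            if lod >= lod_threshold and not expands:
                order.insert(position, locus)
                added = True
            else:
                deferred.append(locus)  # reconsidered at the end of the list
        queue = deferred
    return order


# Toy stand-in for the analytic engine (hypothetical values only).
true_pos = {"A": 0.0, "B": 10.0, "C": 20.0, "D": 15.0}
info = {"A": 0.9, "B": 0.8, "C": 0.85, "D": 0.5}
support = {"A": 5.0, "B": 4.0, "C": 6.0, "D": 1.5}


def toy_evaluate(order: List[str], locus: str) -> Placement:
    position = sum(1 for placed in order if true_pos[placed] < true_pos[locus])
    return position, support[locus], False


print(build_skeletal_map(list(true_pos), info, toy_evaluate))  # ['A', 'B', 'C']
```

Locus D never reaches lod 3 support in the toy data, so it is left off the map, which mirrors how a skeletal map ends up sparser than one built with looser criteria.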

The CHLC group has also generated a more highly annotated collection of maps. These maps were constructed using the STRP-based skeletal maps as starting points and expanded using the CRIMAP-BUILD procedure with framework selection criteria for locus inclusion. These framework maps, their diagnostics, and likely locations for points that do not meet framework criteria, are also available through anonymous FTP. The sex-averaged version of these framework maps is included with this newsletter.

The map construction in this project will proceed simultaneously with development of statistical tools that allow the assessment of map quality and integrity. The primary focus of these efforts will be the development of statistical diagnostic methods for the evaluation of mapping outcomes. It is the goal of such diagnostics to identify error typings and biologically interesting observations.

Two concurrent approaches to the development of these tools will be taken. The first will use computational methods to assess the relative contributions to the final outcome of individual observations. These tests will be conducted at the individual typing, gamete, locus and family levels. As these methods are computer intensive, parallel/distributed algorithms for analysis/re-analysis of multipoint data are under development. In addition to these methods, explicit tests which are extensions of the statistical methods used in regression diagnostics will be explored.

Finally, means of applying goodness-of-fit tests will be evaluated. These will include the contrast of outcomes based on pairwise analysis (multiple pairwise likelihood analysis and seriation) as well as the use of empirical Bayes methods for assessing fit. The efficacy of using empirical Bayes methods to update linkage maps will also be examined.

Kenneth H. Buetow, Ph.D.
Fox Chase Cancer Center
7701 Burholme Ave.
Philadelphia, PA 19111
TEL: (215) 728-3152
FAX: (215) 728-3574
e-mail: kh_buetow@fccc.edu

INFORMATICS Core

Robert K. Stodola
Kenneth H. Buetow, Ph.D.

The objective of the Informatics Core is provision of computer based tools that facilitate scientific aims of the Center. Its responsibilities include the storage, retrieval, and interpretation of the map reagents and data generated in the proposed research. The Informatics Core is charged with the management of Center-generated mapping reagents (sequence information, primers, genotypes, etc.), distribution and storage of protocols, and management and distribution of mapping outcomes (chromosome maps, meiotic breakpoint locations, etc.).

The primary purpose of this core is to generate and maintain a "production" database. This database will provide access to common resources and information within the Center. As CHLC efforts are proceeding at four geographically disparate locations (Harvard, University of Iowa, Marshfield, and Fox Chase), the current strategy is to build client-server based applications using the internet as a medium of communication between the four sites. Work is currently proceeding in the areas of database construction, distributed applications, and Graphical User Interface (GUI) tools.

The preliminary database has been constructed and a number of graphical interfaces to the database have been developed. We have selected Sybase as the database system and are currently using it with DECStation 5000 series computers. Several DEC AXP systems running OSF/1 have been purchased, and we plan to port the database when Sybase becomes available on this platform. To avoid dependence on Sybase, we have isolated the applications from the database with a database-independent interface, and used code generation techniques to reduce the complexity of building this interface library.

We have created several interesting distributed applications. One such is a distributed Primer PipeLine. Marker generation is currently underway at Harvard and the University of Iowa. Raw sequences are produced using ABI sequencers with Macintosh interfaces. The raw sequence files are copied directly onto a CHLC DECStation at these sites, and transferred to Fox Chase for processing. The PipeLine then assembles, strips cloning vector, identifies repeat regions, selects primers using PRIMER, verifies uniqueness, applies user selection criteria, and generates primer synthesis orders. At each stage data and user selections are stored in the production database for further information and use.
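To illustrate one stage of such a pipeline, the sketch below locates tri- and tetranucleotide tandem repeats in a sequence with a regular expression. It is a minimal example under stated assumptions, not the PipeLine's actual method: the function name, the minimum copy number and the test sequence are all hypothetical, and only perfect (uninterrupted) repeats are detected.

```python
import re
from typing import List, Sequence, Tuple


def find_tandem_repeats(
    seq: str,
    unit_lengths: Sequence[int] = (3, 4),
    min_copies: int = 5,
) -> List[Tuple[int, str, int]]:
    """Return (start, repeat unit, copy number) for each perfect tandem repeat."""
    seq = seq.upper()
    hits = []
    for k in unit_lengths:
        # A k-base unit followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are repeats of a shorter motif
            # (e.g. "CACA" is a dinucleotide repeat, "AAA" a homopolymer).
            if unit in (unit + unit)[1:-1]:
                continue
            copies = (match.end() - match.start()) // k
            hits.append((match.start(), unit, copies))
    return sorted(hits)


# Hypothetical clone: a (GATA)6 tetranucleotide repeat and a (CA)10 dinucleotide repeat.
clone = "TTGACC" + "GATA" * 6 + "CCGTA" + "CA" * 10 + "GGT"
print(find_tandem_repeats(clone))  # [(6, 'GATA', 6)]
```

The dinucleotide run is deliberately ignored here, in keeping with the center's emphasis on tri- and tetranucleotide markers; primers would then be chosen in the sequence flanking each reported repeat.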

We are also developing a distributed linkage analysis program. Using the DCE (Distributed Computing Environment) component of OSF/1, we are partitioning the linkage analysis into a number of pieces which can be submitted to any available processor in the project. We anticipate making use of spare CPU cycles on all of the CHLC computer systems, including those at the remote sites, by running linkage servers as a background process.
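The partitioning idea can be illustrated with a small sketch in which pairwise marker comparisons are split into chunks and handed to whichever worker is free. A local thread pool is used here purely as a stand-in for DCE remote servers, and analyze_chunk is a placeholder rather than a real linkage computation; all names are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def chunked(items: Sequence[Pair], size: int) -> List[Sequence[Pair]]:
    """Split the work list into pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def analyze_chunk(pairs: Sequence[Pair]) -> List[Pair]:
    """Placeholder for the per-piece linkage computation (hypothetical)."""
    return [(a, b) for a, b in pairs]


def run_partitioned(markers: Sequence[str], chunk_size: int = 2, workers: int = 4) -> List[Pair]:
    """Partition all marker pairs into chunks and process them on a worker pool."""
    pairs = list(combinations(markers, 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with the chunks.
        results = list(pool.map(analyze_chunk, chunked(pairs, chunk_size)))
    return [item for chunk in results for item in chunk]


print(len(run_partitioned(["M1", "M2", "M3", "M4"])))  # 6 marker pairs
```

Keeping each piece independent of the others is what allows it to be sent to any idle processor, whether local or at a remote site.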

The CHLC Informatics Core is also responsible for the development and maintenance of a public access information system. This system will provide tools that facilitate the communication of the Center's mapping resources to the outside genetics communities. Primary assistance in gaining access to information or services beyond those described here can be requested via electronic mail at help@chlc.org. It is anticipated that the CHLC public access database server will not become operational until Fall/Winter of 1993. In the interim, CHLC data will be available via anonymous FTP to ftp.chlc.org and through a CHLC Gopher Server addressed gopher.chlc.org. Described below is the information currently available.

README
A file describing the current contents. Each of the folders below also may include a README file describing the contents

chlc/newsletters
The CHLC newsletters in plain text and postscript

chlc/genotypes/tables
Tabular descriptions of marker systems in the chromosome specific datasets

chlc/genotypes/typing
Chromosome-specific genotype sets in CRIMAP file format

chlc/maps/framework
Framework maps of all markers currently mapped by CHLC (including markers from other sources)

chlc/maps/skeletal
Maps generated using the stringent map build algorithm described above


Each maps folder contains three folders:

./diagnostics
Diagnostic data on maps

./figures
Postscript figures

./tables
Map information in text form

chlc/markers/chlc
CHLC-produced marker data

chlc/markers/marshfield
Marshfield-produced marker data
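For readers who prefer to script their retrievals, the folder layout above could be navigated along the following lines. This is a sketch only: the host name and folder names are those given in this newsletter, the helper functions are hypothetical, and whether the server is reachable at any given time is not something this example can guarantee.

```python
from ftplib import FTP
from typing import List

CHLC_FTP_HOST = "ftp.chlc.org"  # host as given in this newsletter

MAP_KINDS = ("framework", "skeletal")
MAP_SUBFOLDERS = ("diagnostics", "figures", "tables")


def map_folder(kind: str, subfolder: str) -> str:
    """Build the path of a maps subfolder, e.g. chlc/maps/skeletal/tables."""
    if kind not in MAP_KINDS:
        raise ValueError("unknown map kind: %s" % kind)
    if subfolder not in MAP_SUBFOLDERS:
        raise ValueError("unknown subfolder: %s" % subfolder)
    return "chlc/maps/%s/%s" % (kind, subfolder)


def list_folder(path: str, host: str = CHLC_FTP_HOST) -> List[str]:
    """List the files in one folder using an anonymous FTP login."""
    with FTP(host, timeout=30) as ftp:
        ftp.login()      # no arguments means an anonymous login
        ftp.cwd(path)
        return ftp.nlst()


print(map_folder("skeletal", "tables"))  # chlc/maps/skeletal/tables
```

Calling list_folder(map_folder("framework", "figures")) would then return the names of the Postscript figures for the framework maps, assuming the server is available.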

A collection of public analytic services will also be supported by the Informatics Core. These services will be a subset of the analysis and evaluation tools used within the project which do not require exceptional computational resources. They will be provided free of charge and without any implied commitment to any level of service, accuracy or usefulness.

These services will be provided via automated electronic mail servers, and we can take no responsibility for the privacy or confidentiality of these channels. The services provided will often include procedures developed by people outside the CHLC group. When these have not been placed in the public domain, we have asked permission to use these programs and procedures and kindly thank these individuals and groups for their use. In all cases, each automated response will include attribution supplied by the author for his or her work. Instructions for each automated service can be found by sending any electronic mail message to the server address.

An information server has been placed in service that provides descriptive information about the CHLC project and data. It can be reached by sending e-mail to:


    info-server@chlc.org 

Servers other than the info-server will reply to mail with instructions on how to correctly structure messages to receive service, and will describe the services provided. It is anticipated that as of June 1, 1993 a server to perform linkage mapping will be in place. Initially, this server will take an individual marker system's genotype data and return markers from the CHLC data sets that show linkage. This information will include recombination fractions and lod scores. Later versions will provide map position information. To check the status of the linkage server send e-mail to:


    linkage-server@chlc.org 
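For readers less familiar with the quantities the linkage server is expected to report, the sketch below computes a two-point lod score for the simplest case of phase-known, fully informative meioses. It is a textbook illustration with a hypothetical function name, not a description of how the server itself will compute its results.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # any recombinant rules out theta = 0
    non_recombinants = meioses - recombinants
    log_likelihood = 0.0
    if recombinants:
        log_likelihood += recombinants * math.log10(theta)
    if non_recombinants:
        log_likelihood += non_recombinants * math.log10(1.0 - theta)
    return log_likelihood - meioses * math.log10(0.5)


# Ten informative meioses with no recombinants, evaluated at theta = 0.
print(round(two_point_lod(0, 10, 0.0), 2))  # 3.01
```

Ten non-recombinant meioses are therefore just enough to reach the conventional lod 3 criterion used in the map construction described under Project 5.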

Questions about CHLC services may be directed to help@chlc.org. Since there are people on the other end of this address, please be patient. There aren't a lot of people on the other end, and all have lots to do!

In order to make it convenient to have CHLC announcements delivered via either USENET News or via electronic mail, and to avoid adding to the confusion of how to subscribe to yet another mail service, all CHLC postings will be presented via an appropriate BIOSCI newsgroup (currently, via BIOSCI/GENETIC-LINKAGE). If you have access to USENET news, this is the newsgroup:


    bionet.molbio.gene-linkage 

If you don't have access to USENET news or prefer to subscribe via electronic mail, the following instructions are taken from Dave Kristofferson's "BIOSCI/bionet Frequently Asked Questions" (posted to bionet.announce on May 1, 1993):

"For those who need e-mail subscriptions or who want to cancel current email subscriptions, please send a request to one of the following addresses. Please choose the site that serves your location. Simply pick the newsgroup(s) from the list above that you wish to subscribe to and request that your address be added to the chosen mailing lists. Please use plain English; no special message syntax is required in your subscription or cancellation request.


    Address                  Serving

    biosci@net.bio.net       The Americas 
                             and Pacific Rim

    biosci@daresbury.ac.uk   Europe, Africa, 
                             and Central Asia 

If you are changing e-mail addresses, please be sure to send a message to request that your subscriptions be changed or canceled!"

Dave also strongly recommends that all participants subscribe to the BIOSCI/ANNOUNCE group (USENET bionet.announce).

Robert K. Stodola
Kenneth H. Buetow
Fox Chase Cancer Center
7701 Burholme Ave.
Philadelphia, PA 19111
TEL: (215) 728-3660
FAX: (215) 728-2513
e-mail: rk_stodola@fccc.edu

ELSI Core

Robert F. Weir, Ph.D.
James W. Hanson, M.D.

The ELSI (ethical, legal, and social implications) core is funded to carry out two projects: an IRB-type committee on genetics research and a postdoctoral fellowship program. The ELSI Committee Chair and Core Director, Dr. Robert Weir, is the Director of the Program in Biomedical Ethics at the University of Iowa. Current committee members are listed below.

ELSI COMMITTEE MEMBERS


Robert Weir, Ph.D.          ELSI Core Chair
                            Biomedical Ethicist

Jeff Murray, M.D.           P.I., CHLC

James Hanson, M.D.          Medical Geneticist

Kathy Mathews, M.D.         Pediatric Neurologist
                            Genetics Researcher

Susan Johnson, M.D.         OB-Gynecologist
                            U of I IRB Chair

Laura Hart, R.N., Ph.D.     College of Nursing
                            IRB Member

Stanley Grant, R.N.         OB-GYN
                            Prenatal Diagnosis 

Still to be added to the committee are a consumer of genetics services and a health-law attorney.

The ELSI committee has undertaken an analysis of the consent documents currently being used in genetics research. A written request for examples of these documents has been mailed to 150 genetics researchers nationwide, who were selected at random from the American Society of Human Genetics (ASHG) membership directory. Part of the committee's long-range plan is to develop one or more consent form models for genetics research that will prove helpful to both scientific investigators and to persons who participate as subjects in genetics-related research. The ELSI committee also plans to provide educational materials to be used by IRBs when they consider proposals for genetics research. We will coordinate our work with some of the work already done by the ASHG, the Alliance of Genetic Support Groups, and the Poynter Center at Indiana University.

The ELSI Core's postdoctoral fellowship program will be advertised nationally in the near future. This program will be directed at professionals outside the biological sciences who teach courses, give presentations, publish articles or books, or do other work pertaining to the ethical and legal issues of modern genetics. Such individuals would include persons in the fields of philosophy, history, law, journalism or religion. They will be at the University of Iowa for 2-4 months. During that time they will have a variety of work-related experiences in a molecular genetics lab, one or more other genetics labs, and several clinical genetics settings. On completion of this fellowship program, participants will have achieved a broader understanding of the challenges, technical vocabulary and problems regularly confronted by persons who work in molecular genetics and/or clinical genetics settings.

Robert F. Weir, Ph.D.
Professor of Pediatrics
University of Iowa
Iowa City, IA 52242
TEL: (319) 335-6705
FAX: (319) 335-8318

James W. Hanson, M.D.
Professor of Pediatrics
University of Iowa
Iowa City, IA 52242
TEL: (319) 356-2674
FAX: (319) 356-3347

ADMINISTRATIVE Core

Jeffrey C. Murray, M.D.

The Administrative Core serves as a focus for the overall center activities and also includes within it an educational component designed to establish outreach to the lay public.

Secondary School Educational Outreach

The Administrative Core is currently exploring several mechanisms to improve the knowledge base of secondary school students in relation to the Human Genome Project. Funding is available for mini-sabbaticals in which secondary school teachers spend 1-2 months in the laboratory in a combination of didactic involvement related to human genetics and hands-on laboratory experience in genetic linkage analysis. In addition, collaborations are being developed with a number of external organizations, both in the development of textual materials for teaching secondary school students about the scientific and ethical implications of the Human Genome Project, and in direct outreach to such schools. The CHLC also participates in programs in which high school and undergraduate college students spend time in the laboratory, again in a combination of didactic and hands-on laboratory experiences.

Jeffrey C. Murray, M.D.
Cooperative Human Linkage Center
The University of Iowa


If you would like to receive future issues of the CHLC Report in hard copy, please complete and send in the following form:

On completion, return to:

  CHLC Administration, 
  #431 EMRB,
  The University of Iowa,
  Iowa City, IA 52242

_________________________________________________________________________
Name

_________________________________________________________________________
Institution

_________________________________________________________________________
Department

_________________________________________________________________________
Street/Building

_________________________________________________________________________
City, State, Zip (COUNTRY)