We are pleased to distribute this second edition of our CHLC Report. In addition to individual reports from each project, this newsletter contains an updated version of the skeletal maps that we first reported in our initial CHLC Report. These maps include, for the first time, a substantial number of markers developed through our own efforts, and they continue to integrate markers genotyped on CEPH families by the large number of CEPH collaborators around the world. We are particularly pleased that we have been able to provide significant numbers of tri- and tetranucleotide repeats and that these have been complementary to the efforts of other groups using PCR-based polymorphisms. The maps that we present integrate the efforts of all other CEPH mapping groups and, we hope, provide a resource that allows easy use of the genetic maps for a variety of purposes.
On-line access to the CHLC maps continues to be supported through our Informatics Core at the Fox Chase Cancer Center. Initial access can be obtained by sending an e-mail message to info-server@chlc.org. The returned e-mail message will provide detailed information on how to use the FTP server and gopher services, along with an overview of the types of information available. This includes not only skeletal and framework maps, but also genotypes, information on mapping methodologies, primer sequences, sequences from which primers were developed and mapping data on our initial battery of markers.
In addition, data are included on a subset of STRP markers whose low heterozygosity precluded mapping on the CEPH panel but which were assigned to specific chromosomes using somatic cell hybrids. These markers remain a useful resource for physical mapping efforts, as well as targeted genetic mapping efforts.
Finally, the primers are available at modest cost through:
Research Genetics
2130 Memorial Parkway
Huntsville, AL 35801
TEL: 1-800-533-4363
UK: 0-800-89-1393
FAX: (205) 536-9016
The maps will continue to be available on-line and are continually updated with the addition of new data.
Jeffrey C. Murray, M.D.
The primary goal of Project 1 is to develop and implement strategies for the isolation of saturating numbers of small insert clones containing short tandem repeats. This collection of clones fuels the development of genetic markers for inclusion in the high density human genetic map. Our focus has been on the development of trinucleotide and tetranucleotide repeat based markers. The following classes appear to be suitable for marker development: GATA (ATAG), GGAA (AAGG), CTAT (ATCT), GAAT (AATG), GGAT (ATGG), GGAAT (AATGG), ATC, ACT, CTT and AAT. We have also isolated a large number of clones containing GCT (AGC) repeats. Amplification of GCT repeats has been associated with a number of human genetic diseases (HD, SCA I, DM, SBMA), and this set of STSs may be useful both in understanding the mechanism of repeat expansion and in the search for other disease genes.
The majority of the marker development for the CHLC utilizes a "genome wide approach". We have constructed a total of nine small insert libraries (300-900 bp inserts) with a combined complexity of eleven genome equivalents. The first generation library (1X) was composed of three independent libraries individually derived from Alu I, Hae III and Eco RV/Ssp I digested human genomic DNA. The second generation library (10X) is composed of six sub-libraries (RS1, RS2...RS6) derived from random sheared human genomic DNA. Clones containing short tandem repeats were isolated following enrichment for the targeted repeat. Our enrichment procedure is termed "marker selection" and provides a 500-1000x enrichment for the targeted repeat. (Detailed descriptions and protocols are available upon request.)
Clones maintaining the desired STR are assembled into microtiter plates for storage. Sequencing templates are also prepared from replicas of these master plates at the University of Iowa core facility. Sequencing of the templates is completed both at the University of Iowa and Harvard Medical School. Data storage and exchange of the sequencing information and primer design is accomplished using a set of tools developed by Project 5, termed the "Primerpipeline". Primers are synthesized and distributed to the investigators by Research Genetics (Huntsville, Alabama). Markers are initially characterized by Project 2 and markers suitable for genotyping are passed on to Projects 3 and 4. Data management and map construction are coordinated by Project 5.
The actual completion of the human genetic map will require the development of strategies for gap filling. We plan to rely upon emerging physical mapping reagents as the primary template for targeted marker production. Efforts to implement efficient approaches for the recovery of genetic markers from large insert clones are currently under way. Such approaches will not only help fill gaps but will be important for the more general problem of construction of local high density genetic maps that facilitate positional cloning efforts.
Geoffrey M. Duyk, M.D., Ph.D.
In collaboration with Project 1, we have now developed over 500 short tandem repeat polymorphisms (STRPs). All of these markers are tri- or tetranucleotide repeats, with the majority being [GATA]n repeats. The markers are highly polymorphic, with an average heterozygosity > 0.7. The allele patterns of these markers are easily interpreted. Furthermore, the markers amplify by PCR under a single standardized set of conditions. In addition, we have been successful in multiplexing (at the PCR level) most of these markers (three markers per reaction). This has allowed us to greatly increase our efficiency of genotyping. We have assigned these markers to chromosomes using somatic cell hybrids. They are being mapped genetically by Projects 3, 4 and 5. Primers for these markers are being made available at low cost through Research Genetics. In the second year of this project we will continue to place major emphasis on developing additional tri- and tetranucleotide repeat polymorphic markers. We will also define a subset of approximately 200 of these STRPs as the most useful for primary genome wide linkage studies.
In addition to marker development, Project 2 has continued to collaborate with others in the mapping of genetic diseases. We have now mapped seven hereditary eye diseases. Our success in disease gene mapping is attributed in part to our ability to efficiently genotype using STRPs. We are actively seeking collaborations with others involved in disease gene mapping projects. Interested individuals can contact Val Sheffield at the address given below.
Val C. Sheffield, M.D., Ph.D.
Project 3 has continued its efforts at generating genotype information on 15 standard CEPH families to allow linkage mapping of the newly developed STRP markers provided by Projects 1 and 2. Most of our emphasis in the first year has focused on mapping of the GATA tetranucleotide-based STRPs, although we are now focusing some additional efforts on new classes of tetranucleotides and initiating some trials on trinucleotides, as well.
Following the lead of Project 2, we have also been switching from radioactive labelling to the more cost-efficient and less hazardous silver-staining protocols. We have found these protocols to be very effective and robust, and they allow a large number of markers to be multiplexed, either at the gel loading or PCR reaction level. We have genotyped approximately 200 CHLC markers in the course of these efforts and are accelerating our activities to increase the number of genotypes generated over the course of the next year.
In addition to the primary genotyping efforts of Project 3, we are also involved in a variety of educational outreach activities. These include providing sites for high school science teachers and students to learn more about the Human Genome Project. In particular, a core activity provides on site support for the 1-2 month stay of secondary school science teachers who wish to have direct experience with the technology of the genome project and also participate in didactic experiments on the Human Genome Project's science, social and ethical issues. Over the next year, we also plan on expanding this program so that we can assist the ELSI Core in providing similar experiences for the ELSI fellows.
We have also continued to provide educational outreach, not only throughout our own local community but also by working with the Biological Sciences Curriculum Study on developing specific modules aimed at instructing secondary school biology students and their teachers in the methods, goals and policy issues associated with the Human Genome Project.
Finally, our outreach activities include opportunities for linkage mapping of diseases with the support of our project. Interested individuals may spend up to three months in our laboratory using our reagents, maps and protocols on their own material. Contact Jeff Murray at the address below for additional information.
Jeffrey C. Murray, M.D.
One of the major responsibilities of Project 4 is to genotype new polymorphisms developed within the center through the CEPH reference families. During the first year, 380 STRPs were genotyped at Marshfield through CEPH reference families. This total included 200 CHLC markers, 110 new Marshfield (Mfd) markers and 70 additional STRPs developed at various other laboratories. Altogether, about 70,000 new genotypes were determined.
Marshfield marker development efforts have now ceased. A grand total of about 350 Mfd markers have been developed. Information characterizing nearly all of the Marshfield markers is available electronically through the Genome Data Base, GenBank and the CHLC Informatics Core.
Results of the CEPH family genotyping were used to derive preliminary characteristics of the new CHLC tetranucleotide STRPs. About 20% of the new markers had alleles with size differences other than multiples of four bases. Mean informativeness of the new CHLC markers was about 70%, similar to values reported for dinucleotide STRPs. However, evaluation of informativeness distributions for the tetranucleotide STRPs indicated that the tetranucleotides have fewer markers with heterozygosities above 85% than dinucleotide STRPs.
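The two characteristics discussed above, heterozygosity and allele spacing relative to the repeat length, can be illustrated with a minimal sketch. It assumes the standard expected-heterozygosity estimate (one minus the sum of squared allele frequencies), which may differ in detail from the calculation actually used by Project 4. The function names and the allele calls are hypothetical and serve only as an example.

```python
from collections import Counter


def heterozygosity(allele_sizes):
    """Expected heterozygosity, 1 - sum(p_i^2), from observed allele sizes (bases)."""
    counts = Counter(allele_sizes)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def off_ladder_alleles(allele_sizes, repeat_length=4):
    """Alleles whose offset from the most common allele is not a multiple of the repeat length."""
    counts = Counter(allele_sizes)
    reference = counts.most_common(1)[0][0]
    return sorted(a for a in counts if (a - reference) % repeat_length != 0)


# Hypothetical allele calls for one tetranucleotide marker (not CHLC data).
alleles = [150, 150, 154, 158, 146, 150, 154, 152]
print("heterozygosity:", round(heterozygosity(alleles), 3))
print("off-ladder alleles:", off_ladder_alleles(alleles))
```

In this made-up example the 152 base allele differs from the common 150 base allele by two bases, so it would be counted among the alleles with size differences other than multiples of four.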
Improvement of STRP genotyping technology continues to be a major goal of Project 4. One of the most important developments in the first grant year has been the construction of a simple program for entry of genotypes into the computer. The program, which was written by Matt Stephenson, utilizes the arrow keys on a standard PC keyboard to increase or decrease the size of alleles in multiples of the repeat length relative to a common standard allele for each marker. For a GATA tetranucleotide STRP with frequent alleles of 150 bases, for example, a 158 base allele is entered simply by striking the up arrow key twice and then striking the enter key. Similarly, a 146 base allele is entered by striking the down arrow key once and then the enter key. Lane assignments are preloaded into a file and are automatically called one after another across the gel as each pair of alleles is entered. Although extremely simple, this process offers dramatic improvement in the efficiency of genotype entry over standard data entry or even over sophisticated semiautomated image analysis systems. The genotype entry program (geno) may be obtained by anonymous ftp to: "dgabby.mfldclin.edu".
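The keystroke scheme described above can be sketched in a few lines. This is an illustration of the idea only, not the geno program itself; the function names, key labels and lane identifiers are hypothetical.

```python
def allele_from_keystrokes(standard_allele, repeat_length, keys):
    """Return the allele size implied by 'up'/'down' keystrokes ending with 'enter'."""
    size = standard_allele
    for key in keys:
        if key == "up":
            size += repeat_length      # one repeat unit larger
        elif key == "down":
            size -= repeat_length      # one repeat unit smaller
        elif key == "enter":
            return size                # accept the current size
        else:
            raise ValueError(f"unexpected key: {key!r}")
    raise ValueError("keystroke sequence did not end with 'enter'")


def enter_gel(lanes, standard_allele, repeat_length, keystrokes_per_allele):
    """Assign consecutive pairs of entered alleles to the preloaded lanes, in order."""
    sizes = [allele_from_keystrokes(standard_allele, repeat_length, keys)
             for keys in keystrokes_per_allele]
    if len(sizes) != 2 * len(lanes):
        raise ValueError("expected exactly two alleles per lane")
    return {lane: (sizes[2 * i], sizes[2 * i + 1]) for i, lane in enumerate(lanes)}


# Hypothetical gel with two lanes; the standard allele is 150 bases, the repeat is 4 bases.
genotypes = enter_gel(
    lanes=["lane01", "lane02"],
    standard_allele=150,
    repeat_length=4,
    keystrokes_per_allele=[
        ["up", "up", "enter"],   # 158 base allele
        ["down", "enter"],       # 146 base allele
        ["enter"],               # standard 150 base allele
        ["up", "enter"],         # 154 base allele
    ],
)
print(genotypes)
```

Storing each allele as an offset in repeat units from a standard allele is what keeps the entry to one or two keystrokes for the common alleles.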
Collaborative disease gene mapping efforts are continuing at Marshfield. During the first grant year, genes responsible for colon cancer, familial expansile osteolysis and a form of pseudoachondroplasia were mapped through these collaborative efforts. Other projects initiated during the first grant year are still in progress. Those interested in engaging in collaborative gene mapping projects should contact Jim Weber at the address below. As was previously announced, visitors are responsible for travel costs and living expenses while supplies are provided by the CHLC. Groups working on disorders prevalent in minorities or disorders that primarily affect women are especially encouraged to apply.
The large amount of genotyping carried out with the new STRPs has provided abundant substrate for study of the STRPs themselves and also for study of various human meiotic parameters. Efforts to date have focused on STRP mutation and on meiotic recombination interference.
One particularly useful result from the mutation study was that the majority of STRP mutations observed using DNA from the transformed CEPH lymphoblastoid cell lines occurred in vitro, either during EBV transformation of the cells or during propagation of the cell lines (Weber and Wong, Hum Molec Genet 2:1123-1128, 1993). It was also found that most mutations involved the gain or loss of a single repeat. Average mutation rates for a collection of STRPs on chromosome 19 were 10^-3 per locus per gamete per generation. Rates for tetranucleotide STRPs were about four times higher on average than rates for dinucleotide STRPs. Genotyping carried out during CHLC year 1 has confirmed this difference in rates on the basis of repeat length.
Evidence for interference on human chromosome 19 was obtained by comparing the observed distribution of distances between recombinations, for chromosomes in which there were exactly two genetic exchanges, to the distribution expected on the basis of zero interference (Weber et al., Am J Hum Genet 53:1079-1095, 1993). Preliminary extension of these results to the entire genome, using the fraction of the genetic length of each chromosome as a distance metric, has confirmed the results seen for chromosome 19. Positive recombination interference seems to be a general property of both male and female human meioses.
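The kind of comparison described above can be sketched as follows. The sketch assumes that, under zero interference, the two exchanges on a chromosome fall independently and uniformly along its genetic length (expressed as a fraction between 0 and 1). Under that assumption the distance d between them has cumulative distribution 2d - d^2 and mean 1/3. The observed distances are invented for illustration, and the statistic is a generic Kolmogorov-Smirnov style deviation rather than the test used in the cited paper.

```python
def expected_cdf_no_interference(d):
    """CDF of the distance between two independent uniform points on [0, 1]."""
    return 2.0 * d - d * d


def max_cdf_deviation(distances):
    """Largest gap between the empirical CDF and the zero-interference CDF."""
    xs = sorted(distances)
    n = len(xs)
    worst = 0.0
    for i, d in enumerate(xs):
        expected = expected_cdf_no_interference(d)
        # Check the empirical step just after and just before each observation.
        worst = max(worst, abs((i + 1) / n - expected), abs(i / n - expected))
    return worst


# Hypothetical distances between the two exchanges, as fractions of
# chromosome genetic length (not data from the study).
observed = [0.45, 0.52, 0.60, 0.38, 0.71, 0.55]
mean_observed = sum(observed) / len(observed)

print("observed mean distance:", round(mean_observed, 3))
print("expected mean with zero interference:", round(1.0 / 3.0, 3))
print("maximum CDF deviation:", round(max_cdf_deviation(observed), 3))
```

Positive interference would show up as exchanges lying farther apart than this zero-interference expectation, that is, as a mean distance above 1/3 and a deficit of short distances.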
James L. Weber, Ph.D.
A total of 241 tri- and tetranucleotide repeat markers have been submitted for map construction. As the first step of map construction, these markers have had their chromosomal assignment confirmed by linkage. After assignment to a chromosome, each marker was placed in the current collection of CHLC maps. This placement consisted of determining statistically the best interval within the map to which the marker could be assigned, as well as alternative intervals from which it could not be excluded with odds of 1000:1. The figures showing these assignments are included in the newsletter. The thick box indicates each marker's maximum likelihood assignment, while the thin lines indicate other intervals that could not be excluded. Markers with only a thick box were assigned a unique interval in the previous maps.
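The placement rule described above can be sketched as follows, assuming that a log10 likelihood is available for the marker in each candidate interval. Odds of 1000:1 then correspond to a difference of 3 log10 units from the best interval. The function name, interval labels and likelihood values are hypothetical; this is not the software used by Project 5.

```python
import math


def place_marker(interval_log10_likelihoods, odds_threshold=1000.0):
    """Return (best interval, other intervals not excluded at the given odds)."""
    best = max(interval_log10_likelihoods, key=interval_log10_likelihoods.get)
    best_ll = interval_log10_likelihoods[best]
    cutoff = math.log10(odds_threshold)  # 1000:1 odds -> 3 log10 units
    not_excluded = [
        interval
        for interval, ll in interval_log10_likelihoods.items()
        if interval != best and best_ll - ll < cutoff
    ]
    return best, not_excluded


# Hypothetical log10 likelihoods for one marker in three adjacent map intervals.
example = {"A-B": -120.4, "B-C": -118.1, "C-D": -122.9}
best, alternatives = place_marker(example)
print("maximum likelihood interval:", best)
print("intervals not excluded at 1000:1:", alternatives)
```

A marker whose list of alternatives is empty is one that can be assigned to a unique interval at these odds.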
Using these interval assignments, we derived a collection of "scaffold maps". These scaffold maps are maps of tri- and tetranucleotide repeat markers (from the CHLC and the combined CEPH datasets) that could be uniquely ordered with respect to the version 1.0 reference maps. This ordered set was extracted and interlocus distances were estimated. These scaffold maps have an average density of 18 cM and represent the first step in identifying a collection of high heterozygosity, user-friendly markers for genome wide searches. The new markers and maps are available electronically through anonymous ftp:
ftp.chlc.org
and gopher:
gopher.chlc.org
For ftp access, new data is located in the directories under:
chlc/markers/chlc/v2.
Kenneth H. Buetow, Ph.D.
On completion, return to:
CHLC Administration, #431 EMRB, The University of Iowa, Iowa City, IA 52242

Name: ______________________________
Institution: ______________________________
Department: ______________________________
Street/Building: ______________________________
City, State, Zip (COUNTRY): ______________________________