2011-05-25T21:12:49-07:00
Resource:Genome Reference Consortium
curated
http://www.ncbi.nlm.nih.gov/genome/assembly/grc/index.shtml
<br />
To improve the representation of the reference human genome, the GRC corrects the small number of regions in the reference that are currently misrepresented, closes as many remaining gaps as possible, and produces alternative assemblies of structurally variant loci when necessary. This resource additionally provides mechanisms by which the scientific community can report loci in need of further review.<br />
The most recent assembly for human is GRCh37. This is the first assembly produced by the GRC and is considered the next version of NCBI Build 36. GRCh37 is a haploid assembly constructed from multiple individuals, and it can be divided into a 'primary assembly' and a set of 'alternate loci'. The primary assembly comprises the assembled chromosomes plus any unlocalized or unplaced sequences, which together represent the non-redundant haploid assembly. The alternate loci represent regions of large-scale variation for which an alternate tiling path is available.<br />
The most recent assembly for the mouse is Build 37, which was produced by the Mouse Genome Sequencing Consortium (MGSC). This assembly is based on DNA from a single inbred strain (C57BL/6J) and is largely composed of finished clone sequences.<br />
The GRC maintains only genomes that have been generated using a hierarchical (clone-based) assembly method. These projects are typically considered complete, in that most of the genome is well represented and the funding for the main genome project has come to an end. Currently, the only genomes supported are mouse and human. In this phase of genome assembly, the GRC focuses on the following:<br />
- identifying and correcting assembly errors<br />
- identifying regions of allelic complexity that require the addition of a partial assembly for that locus<br />
- working with the research community to address questions and concerns<br />
- producing updated assemblies on a regular cycle<br />
A set of tiling path files (TPFs) is maintained for each assembled chromosome and partial assembly. These files are stored in a central database that manages TPF tracking and validation. Sequences (also known as components) that are adjacent on the TPF are expected to have a specific type of sequence alignment known as a full dovetail. A program called 'find_overlaps' assesses all adjacent component sequences to determine whether they have an appropriate overlap. Alignments can fall into the following categories:<br />
- Excellent alignment, meets all defined criteria.<br />
- Minor alignment problem.<br />
- Serious alignment problem requiring review.<br />
- An alignment certificate providing external evidence for accepting the join has been submitted, but not approved.<br />
- An alignment certificate has been approved for this join.<br />
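The five join categories above can be modeled as a small data structure. The following is a minimal sketch only, not the GRC's 'find_overlaps' program or its database schema: the names JoinStatus, adjacent_pairs and joins_needing_attention, the comp_A-style component identifiers, and the rule for which joins count as needing attention are all hypothetical and chosen for illustration.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple

Join = Tuple[str, str]


class JoinStatus(Enum):
    """The five categories listed above; member names are hypothetical."""
    EXCELLENT = "excellent alignment, meets all defined criteria"
    MINOR_PROBLEM = "minor alignment problem"
    SERIOUS_PROBLEM = "serious alignment problem requiring review"
    CERTIFICATE_SUBMITTED = "alignment certificate submitted, not approved"
    CERTIFICATE_APPROVED = "alignment certificate approved"


def adjacent_pairs(tpf: List[str]) -> List[Join]:
    """Return each pair of components that sit next to each other on a TPF."""
    return list(zip(tpf, tpf[1:]))


def joins_needing_attention(
    tpf: List[str], statuses: Dict[Join, JoinStatus]
) -> List[Tuple[Join, Optional[JoinStatus]]]:
    """List joins that are unassessed, serious, or awaiting certificate approval.

    Assumption: treating these three cases as 'needing attention' is an
    illustrative rule, not one stated by the GRC.
    """
    pending = {JoinStatus.SERIOUS_PROBLEM, JoinStatus.CERTIFICATE_SUBMITTED}
    flagged = []
    for pair in adjacent_pairs(tpf):
        status = statuses.get(pair)  # None means no assessment recorded
        if status is None or status in pending:
            flagged.append((pair, status))
    return flagged


if __name__ == "__main__":
    # Hypothetical component names standing in for real accessions.
    tpf = ["comp_A", "comp_B", "comp_C", "comp_D"]
    statuses = {
        ("comp_A", "comp_B"): JoinStatus.EXCELLENT,
        ("comp_B", "comp_C"): JoinStatus.SERIOUS_PROBLEM,
    }
    for pair, status in joins_needing_attention(tpf, statuses):
        print(pair, status.name if status else "UNASSESSED")
```

Keying the status on the ordered pair of adjacent components mirrors the description above, in which each join between neighbouring components on the TPF is assessed individually.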
Sponsors: GRC is supported by NIH.<br />
nif-0000-20983
Resource:Genome Reference Consortium
2011-05-18T00:00:00
GRC
Resource
Synonym
ModifiedDate
Label
Variation
Structurally
Strain C57BL/6J
Sequencing
Sequence
Region
Mouse
Mechanism
Locus
Cytoarchitectural
Human
Haploid
GRCh37
Genome
DNA
Cycle
Clone
Chromosome
Alternate locus
Allelic
Allele
Alignment
Gap
Keywords
Id
Topical portal
Data storage repository
Has role
Definition
DefiningCitation
CurationStatus