Genomic selection

Genomic selection is the basis for the G-Genomic bulls of GGI-SPERMEX. Read more about the aims, background and technique of German genomic selection.

Fundamental thoughts on animal breeding

The aim of animal breeding is to change the genetic makeup of animals and thereby improve their traits. This is not a matter of genetic manipulation of the animal itself, but of targeted selection for desirable traits. Before the introduction of genomic selection, these traits could only be assessed phenotypically, i.e. as they are visibly expressed. Production performance and conformation traits were therefore taken as indicators of the underlying genetic predisposition, and selection was carried out on that basis.

Of course, it has long been a dream of research and science to extract information directly from an animal’s genes instead of taking the long, indirect route via the phenotype. Genomic breeding value evaluation and selection now come very close to fulfilling this wish.

Even though the cattle genome has now been sequenced, little is known about the position and effect of the genes behind most heritable traits. Genomic breeding value estimation therefore does not work directly from gene information but from markers. These methods of genetic analysis and the corresponding adaptations of breeding value estimation were developed with support from the Federal Ministry of Education and Research (BMBF) as part of the FUGATO funding programme.

The markers used are so-called SNP (Single Nucleotide Polymorphism) markers. Each consists of a single genetic letter, and two different variants of every marker exist in the entire population. Every animal carries its genetic information as a double set of chromosomes: one set from the father and one set from the mother.

The result is that every marker in an animal can take one of three combinations: homozygous AA, homozygous BB or heterozygous AB.
Heterozygous AB means that the animal received different variants from the father and the mother.

Many hundreds of thousands of such SNP markers are known in cattle. 54,001 of them, distributed evenly across the entire genome, i.e. all chromosomes, can now be determined in a single step using a relatively inexpensive laboratory method. This process is called “typing”. The unit used in the laboratory to test the 54,001 markers is called a chip, or 54K chip, to indicate the number of markers tested simultaneously.

A small amount of the animal’s genetic material is needed for typing. As all cells of an animal contain genetic material, blood (approx. 2 ml) or semen (approx. 2 doses) can be used. Enough genetic material for typing can also be extracted from approx. 30 hair roots. However, hair samples carry a higher risk of contamination with genetic material from other animals, which could make the result incorrect.


The result of typing is one genotype, AA, BB or AB, for each of the 54,001 markers.
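As a minimal illustration of what such a typing result looks like as data, the sketch below stores genotype calls as strings and converts them to a count of the "B" variant (0, 1 or 2). This numeric coding, the function names and the four-marker example are assumptions made for illustration; the text does not describe how the laboratory or the evaluation centre stores the data.

```python
# Assumed coding for illustration: number of "B" variants at a marker.
GENOTYPE_CODES = {"AA": 0, "AB": 1, "BB": 2}


def offspring_genotype(from_sire, from_dam):
    """Combine the variant inherited from each parent into a genotype call.

    The two letters are sorted so that "A"+"B" and "B"+"A" both give "AB".
    """
    return "".join(sorted(from_sire + from_dam))


def encode_genotypes(calls):
    """Convert genotype calls such as ["AA", "AB", "BB"] to B-variant counts."""
    return [GENOTYPE_CODES[call] for call in calls]


# Hypothetical typing result for four markers (a real chip yields 54,001 calls).
calls = ["AA", "AB", "BB", offspring_genotype("B", "A")]
print(calls)
print(encode_genotypes(calls))
```

A numeric coding of this kind is convenient because the marker data of many animals can then be handled as a table of numbers with one row per animal and one column per marker.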

Because the many markers are spread evenly over the entire genome, it is assumed that one of the 54,001 markers lies close to every gene that influences the expression of one of the heritable traits. This means that the marker is almost always transmitted together with the corresponding gene variant. However, we do not know the individual genes or their effects. To be able to estimate an animal’s genomic breeding value from its markers, some preliminary work is therefore necessary.

To find out which SNPs, i.e. which markers, are associated with which trait, the SNP samples first have to be compared with the known (genetic) performance of selected animals. Animals with known genetic performance are daughter-proven bulls as well as typed cows with production data. By comparing the SNP samples with the genetic performance of these animals, it can be determined how much influence each SNP has on a trait; this is how the genomic evaluation formulas are derived. The more reliably proven reference animals are available for deriving these formulas, the better each SNP can be assigned to a trait and the size of its influence quantified. The proven bulls and typed cows, including their production data, that enter this analysis form the so-called “training sample”. The conversion from the pure bull reference sample to a mixed reference sample of bulls and cows will take place with the proof run in April 2019. The genomic evaluation formulas derived from the training sample are then used to calculate the genomic breeding values of other (usually younger) animals that have no reliable conventional breeding value information of their own.
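The two steps described above, deriving marker effects from a training sample and then applying them to untested animals, can be sketched on simulated data. The example uses ridge regression on centred genotype counts, which is one common way to estimate many marker effects at once. It is an assumption made for illustration, not a description of the official German evaluation model, and all names, sample sizes and the shrinkage value are hypothetical.

```python
import numpy as np


def estimate_snp_effects(genotypes, known_values, shrinkage=1.0):
    """Estimate one effect per marker by ridge regression (illustrative only).

    genotypes:    (animals x markers) array of B-variant counts 0/1/2
    known_values: known breeding values of the training animals
    """
    marker_means = genotypes.mean(axis=0)
    centred = genotypes - marker_means
    mean_value = known_values.mean()
    n_markers = centred.shape[1]
    # Solve (X'X + shrinkage * I) b = X'y for the marker effects b.
    lhs = centred.T @ centred + shrinkage * np.eye(n_markers)
    rhs = centred.T @ (known_values - mean_value)
    effects = np.linalg.solve(lhs, rhs)
    return marker_means, mean_value, effects


def direct_genomic_value(genotypes, marker_means, mean_value, effects):
    """Apply the estimated formula to animals without own performance data."""
    return mean_value + (genotypes - marker_means) @ effects


# Simulated data: 500 training animals, 100 young animals, 50 markers.
rng = np.random.default_rng(0)
n_training, n_young, n_markers = 500, 100, 50
true_effects = rng.normal(0.0, 1.0, n_markers)
training_genotypes = rng.integers(0, 3, size=(n_training, n_markers)).astype(float)
young_genotypes = rng.integers(0, 3, size=(n_young, n_markers)).astype(float)
training_values = training_genotypes @ true_effects + rng.normal(0.0, 2.0, n_training)
young_true_values = young_genotypes @ true_effects

model = estimate_snp_effects(training_genotypes, training_values, shrinkage=5.0)
young_dgv = direct_genomic_value(young_genotypes, *model)
accuracy = np.corrcoef(young_dgv, young_true_values)[0, 1]
print(f"correlation of predicted and simulated true values: {accuracy:.2f}")
```

In this toy setting there are far more training animals than markers, so the effects are estimated well. With 54,001 markers the number of effects exceeds the number of reference animals, which is one reason why the size and structure of the training sample matter so much.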

Some further notes on the mixed reference sample

The original, pure bull reference sample was based on bulls that had been selected for the test bull programme on their pedigree breeding value and had completed their test period. Selection intensity was relatively low, so these bulls reflected the entire genetic variety of their population. As genomic selection increases, selection intensity rises too, which carries the risk of a biased reference sample, because only bulls that have already been genomically pre-selected enter service. The reference sample then no longer represents a cross section of the genetic breadth of the actual population, and estimating the transmitting ability of bulls in conventional breeding value estimation becomes more difficult. Reliable conventional breeding value estimation is the basis for reliable genomic breeding value estimation. The solution to this problem is to type all female calves in the dairy herds in order to obtain a basis that is as unselected and representative as possible. This has been implemented in the project “KuhVision”, and numerous herds have been typed so far. In April 2019, the current reference sample will be changed to a mixed reference sample, which will then consist of more than 38,000 bulls and approx. 150,000 cows.


The reliability of genomic breeding values mainly depends on the size and structure of the training sample. The complexity and reliability of the conventional breeding values for all traits, including all functional traits, are No. 1 in the world. The size and structure of the German training sample are also unique worldwide because of the exchange of information with three European partners from France, Scandinavia and the Netherlands. No other training sample worldwide is as well structured, i.e. represents the entire, latest Holstein genetics from Europe and North America. The genomic formulas can only reliably predict the breeding values of younger cows when their genetics (SNP sample) are well represented in the training sample, preferably through many related animals. The training sample therefore has to be continuously expanded and updated in line with the development of the population. The introduction of the mixed reference sample will ensure this, as female animals realistically reflect the actual genetics of the current population.

The reliability of breeding values for younger animals based solely on genomic data (SNP typing) is shown in the middle column of the figure. The reliability shown is the actual reliability; i.e. it has already been corrected for the overestimation observed in all genomic evaluation methods.

Figure 1: Reliabilities of genomic breeding values compared to the reliability of the pedigree breeding value

CEpat. = calving ease paternal
CEmat. = calving ease maternal 

Combined genomic breeding values

The direct genomic values (dGW) are calculated from the typings (SNP samples) for all traits. For all animals with a known pedigree there is also conventional breeding value information, namely the pedigree breeding value. So that every animal has only one breeding value with maximum information and reliability at any given time, the direct genomic value is not published. Instead, the genomically improved breeding value (gZW) is given, which is a combination of the direct genomic value and the conventional breeding value.

The weighting is based on the reliability of both values; i.e. for young animals with only a pedigree breeding value, the direct genomic value counts the most, and the unreliable pedigree breeding value can increase the reliability of the genomically improved breeding value (gZW) by only approx. 3-5% (please see the right column of the figure). Once the conventional breeding value, based on daughter information, is clearly more reliable than the direct genomic value, it receives the higher weighting in the combined gZW. The combined gZW of daughter-proven bulls therefore usually differs little from the purely conventional breeding value.
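The effect of reliability-based weighting can be illustrated with a small sketch. It assumes that each value is weighted by reliability / (1 - reliability) and that the two sources can be treated as independent. This is a simplification chosen for illustration and not the official combination procedure, which also has to account for information shared between the two values. The function names and the example figures are hypothetical.

```python
def information_weight(reliability):
    """Assumed weight of a breeding value: reliability / (1 - reliability)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be at least 0 and below 1")
    return reliability / (1.0 - reliability)


def combine_breeding_values(dgv, dgv_reliability, conventional, conventional_reliability):
    """Blend a direct genomic value with a conventional breeding value."""
    weight_genomic = information_weight(dgv_reliability)
    weight_conventional = information_weight(conventional_reliability)
    total = weight_genomic + weight_conventional
    if total == 0.0:
        raise ValueError("at least one reliability must be above 0")
    return (weight_genomic * dgv + weight_conventional * conventional) / total


# Hypothetical young bull: only a pedigree breeding value with low reliability.
young = combine_breeding_values(120.0, 0.70, 100.0, 0.30)
# Hypothetical daughter-proven bull: conventional value with high reliability.
proven = combine_breeding_values(120.0, 0.70, 100.0, 0.95)
print(f"young bull:  {young:.1f}")
print(f"proven bull: {proven:.1f}")
```

With these assumed figures the combined value of the young bull stays close to its direct genomic value, while the combined value of the daughter-proven bull lies close to its conventional breeding value, which is the behaviour described above.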

What are genomic breeding values able to do?

The actual reliability of genomically improved breeding values of young bulls, approx. 75% for the milk yield traits and 50% for daughter fertility, is clearly higher than the reliability of the current pedigree breeding value of test bulls. Young bulls with an official gZW therefore formally qualify as sires, and test bulls as they were previously known no longer exist.
However, the comparison with the reliabilities of daughter-proven sires (Figure 2) shows that even bulls with only test daughters in first lactation have higher breeding value reliabilities for highly heritable and important traits like production, conformation and udder health.

Figure 2: Reliabilities of genomically improved breeding values (gZW) of young bulls compared to the reliability of daughter proven bulls

Even though German genomic breeding values are the most reliable in international comparison, because of the structure and size of the training sample, genomic breeding values are less reliable than the breeding values of daughter-proven sires. With the stronger emphasis on functional traits, sires with thousands of second-crop daughters have become more significant over the past years, because they also offer high reliability for traits like longevity and daughter fertility. The actual reliability of genomic breeding values for those functional traits is still limited to approx. 50%. So if you normally consider the reliability of the first published daughter-based breeding values of new sires too low, young bulls that are only genomically tested may not be an option for you.

On the other hand, the young generation of genomically tested bulls provides a new breeding opportunity. When using those young genomic bulls, you should always be aware of their limited reliability and spread the risk by using several such bulls.

The reliability calculations and data for genomic breeding values are not yet standardized internationally. The reliabilities stated for the German genomic breeding values are realistic, however, and the quality of the German genomic breeding values is the best internationally. To allow a correct evaluation of the German breeding values, the number of daughters with production information will continue to be given alongside the reliability expressed as a percentage. This means that everybody is able, on an impartial basis, to choose a suitable sire: a young genomically tested bull, the latest proven bull with test daughters, or a proven second-crop sire.
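The advice to spread the risk across several young genomic bulls can be made concrete with a short calculation. The sketch uses the standard relationship that the prediction error variance of a breeding value is (1 - reliability) times the genetic variance. It further assumes that the prediction errors of the bulls are independent and that all bulls are used equally. In practice, related bulls evaluated with the same formulas will have partly correlated errors, so the real gain is smaller than shown. The function name and the genetic standard deviation of 1.0 are assumptions.

```python
import math


def error_sd_of_bull_team(reliability, n_bulls, genetic_sd=1.0):
    """Standard deviation of the error in the average breeding value of a team.

    Assumes equal reliability, equal use and independent prediction errors.
    """
    if n_bulls < 1:
        raise ValueError("n_bulls must be at least 1")
    return genetic_sd * math.sqrt((1.0 - reliability) / n_bulls)


# Reliability of approx. 50% as quoted in the text for functional traits.
for n_bulls in (1, 2, 4, 8):
    print(n_bulls, round(error_sd_of_bull_team(0.50, n_bulls), 3))
```

Under these assumptions, using four bulls instead of one halves the error of the team average, which is why a group of young bulls can be used with more confidence than any single one of them.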