Estimated Breeding Values

(This article was submitted to the Norwich Terrier News, but was not published. Hopefully it will be published in a future issue.)

Estimated breeding values (EBVs) are a breeding tool widely used in the commercial livestock industry. EBVs were originally developed in the 1960s for use in the dairy industry to improve milk production; their use then spread to other areas of the livestock industry (beef, pigs, chickens, etc.), and they are now also used in plant agriculture. EBVs have been credited, along with improved feed, the (over)use of antibiotics, artificial insemination, fertilizer, and pesticides, with allowing farmers to feed the increasing world population. There are hundreds if not thousands of research articles on EBVs. Many major universities have at least one researcher (usually in the agriculture department) whose specialty is EBVs.

The difficulty in describing EBVs is that they rest on a statistical (i.e., mathematical) technique. I promise not to "talk mathematics" here; those of you who are interested can see the appendix to this article. Let me try to describe the basic idea of EBVs in dog breeding terms. Say you are interested in improving some trait in your line, and you are trying to decide between two stud dogs for your bitch. Both stud dogs offer improvement for your trait of interest, and it is difficult to decide which is better. You might look at relatives of both stud dogs. Perhaps stud number 1 has many relatives who have good values for the trait you are considering, whereas stud number 2 has relatives who have poor values for the trait. You would probably consider stud number 1 the better choice, as his line has a better chance of passing on improvement for your trait of interest. In other words, stud number 1 has a better (subjective) "estimated breeding value". I suspect that dog breeders make these kinds of decisions subconsciously all the time. In the commercial world, where the financial stakes are high, a more objective method of calculating EBVs is desired.

There are two pieces of information needed to calculate EBVs. The first is the relationship of all the animals being considered. The other is an assessment of the heritable trait of interest in as many animals as possible.

Calculating EBVs has two advantages over making a subjective assessment. The first is that the calculation uses the information from all the animals whose trait value is known. As any dog breeder who has tried to make a chart of near-relatives quickly learns, it is difficult to determine how far out to go. Do you only use first-degree relatives? Or do you go out to second-degree relatives (aunts, uncles, grandparents, etc.)? The further you go, the more information you get, but the less influence each relative has. Calculating EBVs lets you use all dogs whose value is known. The second advantage is that the calculation gives an EBV even for animals whose trait has not been assessed (by using the values of relatives for which the trait is known).

Since I am a retired applied mathematician, I enjoyed puzzling out the mathematics of EBVs to see if EBVs could help in my hobby of dog breeding. I already had a pedigree database of Norwich Terriers (see [1]). I decided to use Orthopedic Foundation for Animals (OFA) hip values as my initial trait to investigate. OFA has a database with the hip values of many dogs, and hips were something that I wanted to improve in my line. I programmed a basic EBV algorithm and used my pedigree database and Norwich OFA hip values to calculate EBVs for hips. I followed hip-EBVs for a couple of years, and saw that my calculation of hip-EBVs could predict hip values.

It is important to remember that calculating EBVs is a statistical technique; sometimes you get unlucky. As with weather forecasting (another statistical technique), sometimes the prediction is wrong. But the odds are that if you choose a dog with a higher EBV for a trait, you will see improvement in that trait. Since I have been using hip-EBVs to help me choose stud dogs, hips in my line have improved. (Keep in mind that an EBV is just one tool. One needs to take the entire dog into consideration when making breeding decisions.)

By the time that you read this, I will have made public in my online Norwich pedigree database [1] hip-EBVs for Norwich in North America. You will see an entry like this:

Hips.EBV= 0.356 (0.644)

The first number is the EBV; larger values indicate better estimated breeding value. The number in parentheses is the "accuracy" of the EBV; it reflects the number and closeness of relatives whose trait values are known. EBVs with higher accuracy (closer to 1.0) should be given more weight in decisions than ones with lower accuracy (closer to 0.0).

Some dogs will have Excellent hips yet have poor (negative) hip-EBVs. EBVs estimate the breeding value of an animal using not only the value of the animal but also the values of its relatives. So if a dog has relatives with poor or unknown hips, the dog is going to have a lower hip-EBV.

I have also calculated EBVs for upper airway syndrome (UAS) using publicly available scoping scores. For any mathematical model, the more information available, the better the model. This is why I say that information shared is more valuable than information kept private (and breeder information kept private is lost when a breeder stops breeding). These UAS-EBVs are also available in my online Norwich pedigree database.

The UK Kennel Club provides hip and elbow EBVs for British dogs (see [2]). Cornell University had a project to calculate hip and elbow EBVs using OFA data and pedigree information (see [3]). (The Cornell effort seems to have been discontinued.) For those who wish to read more about EBVs, both of these projects have web pages with good explanations. One note of caution: both of these projects calculated EBVs so that a smaller EBV is "better".

EBVs are an additional tool in a breeder's toolbox. If anyone has a trait of interest, I will be happy to work with you to calculate EBVs for that trait.

Appendix

The calculation of EBVs is normally taught in graduate-level statistics courses. The primary tool is a linear mixed model given by a matrix equation. The algorithm called Best Linear Unbiased Prediction (BLUP), developed by Henderson in 1949, allows fixed and random effects to be estimated simultaneously. See [4] for details. The BLUP algorithm is an example of the statistical idea of linear ridge regression.
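For readers who would like to see the idea in code, here is a minimal sketch in plain Python. It is not the program used for the Norwich calculations. It assumes a single trait with one record per animal, an overall mean as the only fixed effect, and a heritability h2 that is already known. Instead of Henderson's mixed model equations it uses the algebraically equivalent form EBV = h2 * A * Z' * V^(-1) * (y - mean), which avoids inverting the relationship matrix A. All function names and the three-dog pedigree are made up for illustration.

```python
def relationship_matrix(pedigree):
    """Additive relationship matrix by the tabular method.

    pedigree: list of (animal, sire, dam), parents listed before their
    offspring; use None for an unknown parent.
    """
    index = {animal: i for i, (animal, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (_, sire, dam) in enumerate(pedigree):
        s, d = index.get(sire), index.get(dam)
        for j in range(i):
            # Relationship to an earlier animal is the average of the
            # two parents' relationships to that animal.
            a = 0.0
            if s is not None:
                a += 0.5 * A[j][s]
            if d is not None:
                a += 0.5 * A[j][d]
            A[i][j] = A[j][i] = a
        # Diagonal is 1 plus the inbreeding coefficient.
        both_known = s is not None and d is not None
        A[i][i] = 1.0 + (0.5 * A[s][d] if both_known else 0.0)
    return A


def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(map(float, M[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def blup_ebvs(pedigree, records, h2):
    """EBVs for every animal in the pedigree, measured or not.

    records: {animal name: trait value}; h2: assumed-known heritability.
    """
    names = [animal for animal, _, _ in pedigree]
    A = relationship_matrix(pedigree)
    rec = [i for i, name in enumerate(names) if name in records]
    y = [records[names[i]] for i in rec]
    # Covariance of the observed records, scaled by phenotypic variance.
    V = [[h2 * A[i][j] + ((1.0 - h2) if i == j else 0.0) for j in rec]
         for i in rec]
    # Generalized least squares estimate of the overall mean.
    mean = sum(solve(V, y)) / sum(solve(V, [1.0] * len(rec)))
    w = solve(V, [yi - mean for yi in y])
    return {name: h2 * sum(A[k][i] * wi for i, wi in zip(rec, w))
            for k, name in enumerate(names)}


# Hypothetical example: two unrelated, measured parents and one
# unmeasured pup.
pedigree = [("Sire", None, None), ("Dam", None, None), ("Pup", "Sire", "Dam")]
records = {"Sire": 2.0, "Dam": 0.0}
for name, ebv in blup_ebvs(pedigree, records, h2=0.5).items():
    print(name, round(ebv, 3))
```

In this toy example the sire gets an EBV of 0.5, the dam -0.5, and the pup, which has no record of its own, gets 0.0, the average of its parents. A real calculation would involve thousands of animals and would also have to estimate the heritability.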

Variance component estimation can be done by restricted maximum likelihood (REML) estimation. I used the expectation-maximization (EM) algorithm as detailed in [5] and [6]. An alternative way to do variance estimation is by offspring-parent regression; see [7].
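The offspring-parent approach is the simpler of the two to illustrate. The sketch below assumes the trait is scored as a number and that both parents of each offspring have been measured; in that case the slope of the regression of offspring value on mid-parent value (the average of the two parents) estimates the heritability, as described in [7]. With only one parent measured, the slope estimates half the heritability instead. The function name and the data are invented for illustration and are not Norwich data.

```python
def heritability_from_midparent(midparent_values, offspring_values):
    """Least-squares slope of offspring value on mid-parent value."""
    n = len(midparent_values)
    mean_x = sum(midparent_values) / n
    mean_y = sum(offspring_values) / n
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(midparent_values, offspring_values))
    var_x = sum((x - mean_x) ** 2 for x in midparent_values)
    return cov_xy / var_x


# Made-up scores: each offspring deviates from the average by half as
# much as its mid-parent does, so the estimated heritability is 0.5.
midparent = [1.0, 2.0, 3.0, 4.0]
offspring = [1.5, 2.0, 2.5, 3.0]
print(heritability_from_midparent(midparent, offspring))
```

Real data are far noisier than this, so many parent-offspring pairs are needed before the slope is a trustworthy estimate.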

I used the SageMath computer algebra system to program the algorithms; see [8].

Calculating Norwich hip-EBVs took approximately 36 hours on a workstation for the initial estimation of parameters; a large part of that work was computing the rank of a matrix of size approximately 5000. The BLUP calculation itself took approximately 30 minutes on a workstation and involved working with matrices of size approximately 8000.

You can see the correlation between hip values and hip-EBVs in the following chart constructed using version 2.79 of my Norwich pedigree database and OFA hip values available at the time I constructed version 2.79.

hip value     # dogs     mean EBV     range of EBVs
Excellent        100        0.170     [-0.362,  0.448]
Good             927       -0.097     [-0.849,  0.295]
Fair             309       -0.405     [-1.108, -0.005]
Borderline         4       -0.944     [-1.472, -0.559]
Mild              38       -1.037     [-1.579, -0.662]
Moderate           5       -1.300     [-1.462, -1.014]
Severe             2       -1.612     [-1.685, -1.538]

[1] http://shakspernorwich.net/pedigreedb/

[2] https://www.thekennelclub.org.uk/health-and-dog-care/health/getting-started-with-health-testing-and-screening/estimated-breeding-values/

[3] https://www.vet.cornell.edu/health-topics/estimated-breeding-values-ebvs

[4] Linear Models for the Prediction of Animal Breeding Values, 3rd Edition, Raphael A. Mrode, 2014.

[5] Average Information REML: An Efficient Algorithm for Variance Parameter Estimation in Linear Mixed Models, Gilmour et al., Biometrics, 1995.

[6] http://morotalab.org/Mrode2005/vce/vce.html

[7] Introduction to Quantitative Genetics, 4th Edition, Falconer & Mackay, 1996.

[8] SageMath, the Sage Mathematics Software System (Version 9.2), The Sage Developers, 2020, https://www.sagemath.org.

Blair Kelly
Shaksper Norwich

20211102