Single-nucleotide polymorphism


Incorporating genetic and environmental risk factors into risk prediction for colorectal cancer

From: Incorporating non-genetic risk factors and behavioural modifications into risk prediction models for colorectal cancer

Yarnall, Crouch & Lewis (Division of Genetics and Molecular Medicine, King’s College London, United Kingdom). Cancer Epidemiology 2013 Jan 29


Background: Epidemiological studies have identified potentially modifiable risks for colorectal cancer, including alcohol intake, diet and a sedentary lifestyle. Modelling these environmental factors alongside genetic risk is critical in obtaining accurate estimates of disease risk and improving our understanding of behavioural modifications.

Methods: 14 independent single nucleotide polymorphisms identified through GWAS and reported on by the international consortium COGENT were used to model genetic disease risk at a population level. Six well validated environmental risks were selected for modelling together with the genetic risk factors (alcohol intake; smoking; exercise levels; BMI; fibre intake and consumption of red and processed meat). Through a simulation study using risk modelling software, we assessed the potential impact of behavioural modifications on disease risk.

Results: Modelling the genetic data alone leads to 24% of the population being classified as reduced risk; 60% average risk; 10% elevated risk and 6% high risk for colorectal cancer. Adding alcohol consumption to the model reduced the elevated and high risk categories to 9% and 5% respectively. The simulation study suggests that a substantial proportion of individuals could reduce their disease risk profile by altering their behaviour, including reclassification of over 62% of heavy drinkers.

Conclusion: Modelling lifestyle factors alongside genetic risk can provide useful strategies to select individuals for screening for colorectal cancer risk.

Impact: Quantifying the impact of moderating behaviour, particularly related to alcohol intake and obesity levels, is beneficial for informing health campaigns and tailoring prevention strategies.

Risk factors

Low risk (green), average risk (blue), moderate risk (yellow) and high risk (red) using combined genetic and environmental risk factors


Over the last 30 years the lifetime risk of colorectal cancer (CRC) for men has almost doubled, from 3.5% to 6.9% in the UK in 2008. For women the increase is more than a quarter, rising from 3.9% to 5.4%.  Since both genetic and environmental factors contribute to the susceptibility to colorectal cancer, this trend may be due to a change in the dietary and lifestyle factors of the general population leading to higher levels of obesity and more sedentary pastimes.

The major risk factor for colorectal cancer is age: over 85% of colorectal cancers occur in people over the age of 60. Other risk factors include the presence of polyps and Ashkenazi Jewish genetic heritage. Use of non-steroidal anti-inflammatory drugs (NSAIDs), hormone replacement therapy and aspirin has also been associated with disease risk. However, it is estimated that between 52 and 57% of colorectal cancers are associated with lifestyle and environmental factors. Many risk factors for colorectal cancer may be modified by intervention, including known risks such as a sedentary lifestyle and diet. The evidence for dietary factors indicates possible increased risk from diets low in fibre, garlic, calcium, fruit, vegetables and fish and high in red and processed meat. In addition to alcohol, BMI, smoking and exercise, we chose to model the most consistent and well validated dietary findings, which suggest that low levels of fibre and high levels of red and processed meat are both significant risk factors.

The international consortium COGENT (COlorectal cancer GENeTics) has identified many of the known genetic variants that predispose to CRC; the 14 single nucleotide polymorphisms (SNPs) found to be convincingly associated with CRC risk in GWA studies are summarized in Houlston et al.’s recent update. Across these 14 SNPs, the mean odds ratio per allele is 1.14, with the highest odds ratio reported for SNP rs16892766 near the EIF3H gene (OR 1.28).
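To make the per-allele odds ratios concrete, here is a minimal sketch of how such values are often combined into a single genotype-level relative risk. It assumes a log-additive (multiplicative) model with independent SNPs and treats odds ratios as approximate relative risks; the function name and the example genotype are hypothetical, and this is not the risk modelling software used in the study.

```python
import math

def genotype_relative_risk(risk_allele_counts, per_allele_ors):
    """Relative risk versus a person carrying no risk alleles.

    Assumes a log-additive model: each copy of a risk allele multiplies
    risk by that SNP's per-allele odds ratio, and SNPs act independently.
    """
    if len(risk_allele_counts) != len(per_allele_ors):
        raise ValueError("need one odds ratio per SNP")
    log_rr = 0.0
    for count, odds_ratio in zip(risk_allele_counts, per_allele_ors):
        if count not in (0, 1, 2):
            raise ValueError("risk allele count must be 0, 1 or 2")
        log_rr += count * math.log(odds_ratio)
    return math.exp(log_rr)

# Hypothetical genotype: 14 SNPs, each given the mean per-allele OR of 1.14.
ors = [1.14] * 14
counts = [1, 0, 2, 1, 0, 0, 1, 1, 0, 2, 0, 1, 0, 1]  # illustrative only
print(round(genotype_relative_risk(counts, ors), 2))
```

In practice the result would be rescaled against the population average rather than against a zero-risk-allele genotype, since almost nobody carries no risk alleles at all.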

The identification of SNPs that contribute to susceptibility for CRC has raised the prospect of genetic screening. Companies such as deCODEme and 23andMe include panels of SNPs for CRC in their genetic testing panels, yet research suggests that genetic risk prediction alone is of questionable utility. In this research study, we combined the known genetic risk with data on the environmental risks for CRC, enabling more complete risk prediction. We applied a statistical risk model to determine the impact of modelling environmental factors alongside the 14 genetic susceptibility loci identified by the COGENT consortium.

Early screening for colorectal cancer can be extremely helpful in identifying individuals with polyps and nonpolypoid lesions and preventing the development of cancer. Regular faecal occult blood tests (FOBT) in the over-50s, for example, have been found to reduce the number of deaths due to CRC by 15–33%. In the UK, screening is offered to all men and women aged between 60 and 69 at a cost of £77.3 million, and this will be extended to 74 year olds. However, it has been suggested that if individuals are provided with a personalized disease risk assessment from their combined genetic and environmental profile, they are likely to be more motivated to alter their lifestyle as a preventative measure, which would increase the effectiveness of health campaigns. In this study we develop predictions of CRC risk in different sub populations and assess the impact of modifying lifestyle factors on risk levels. By providing predictions of disease risk both before and after a lifestyle change for a given genetic profile, the study illustrates the potential benefits for both selection of candidates for screening programmes and the tailored promotion of healthier lifestyle choices in high risk groups.

There are several modifiable risk factors for colorectal cancer and building predictive models encompassing both genetic and environmental factors enables us to move in the direction of a complete assessment of disease risk. This paper describes a predictive model which takes account of the known genetic contribution as well as the modifiable risks. There is considerable evidence to suggest that detecting polyps in the early stages can reduce mortality rates for colorectal cancer and whilst the interactions between the genetic and environmental elements are undeniably complex, separating out the inherited risk from the lifestyle factors using this model helps to illustrate the potential gains from modifying lifestyle behaviour and could usefully inform healthy lifestyle campaigns.

Our findings indicate that cessation of alcohol consumption and reducing obesity levels lead to the most significant changes to the proportion of the population reducing their disease risk category. Whilst this could have been predicted to some extent by the higher odds ratios for these factors, it is the combination of relative risk together with the prevalence of the factor within the population that determines the overall impact. In addition, being able to create personalized risk predictions in this way has the potential to motivate greater behavioural change, showing, for example, that it is possible to significantly reduce disease risk by moving from a high risk category to an average risk category through increasing fibre levels, cessation of alcohol consumption or weight management, given a particular genetic profile. Further research is required to increase understanding of how individuals respond to risk assessment based on genetic information. This may increase their motivation since the results are personal, or decrease their motivation because they consider that their genetic risk cannot be modified.
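The point that impact depends on both relative risk and prevalence can be illustrated with Levin's formula for the population attributable fraction. This is a standard textbook formula, not necessarily the calculation used in the paper, and the prevalence and relative risk values below are made up for illustration.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1)).

    prevalence    -- proportion of the population exposed (0 to 1)
    relative_risk -- risk in the exposed relative to the unexposed
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)

# Illustrative inputs only: a common factor with a modest effect can
# account for more cases than a rare factor with a larger effect.
common_modest = population_attributable_fraction(0.50, 1.2)
rare_strong = population_attributable_fraction(0.05, 2.0)
print(f"common, modest: {common_modest:.3f}; rare, strong: {rare_strong:.3f}")
```

With these inputs the common factor gives the larger fraction, which is the pattern the paragraph above describes.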

Our focus has been on risk categorization, and not on the absolute level of risk estimated from the combination of genetic and environmental risk factors, which is modest for most categories. There are two advantages to this strategy. Firstly it moves away from the strategy used, for example, by direct-to-consumer genetic testing companies such as 23andme and deCODEme (who provide a single figure of risk with no confidence intervals) towards the strategy deployed in genetic counselling of using a qualitative risk level, which can be more easily interpreted for the purpose of risk prediction. Secondly, it puts a stronger statistical framework on the risk model: an assignment to elevated risk implies that the risk is statistically distinct from the risk of the average, baseline, individual, given the uncertainty of the parameters used in the model.

There are several limitations of the model. Firstly, the model is built from estimates in the literature extracted from different studies. This enables researchers to select the best study to capture information on each risk factor, but assumes that information is directly comparable between studies. This limits the precision with which risk estimates can be calculated. A further limitation is that the model assumes all risk factors entered are independent. For known gene and environment interactions, this can be overcome by either modelling the interaction explicitly as an environmental risk factor, or by omitting known genetic loci to prevent over-representation of a risk factor (such as SNPs on the FTO gene which are associated with BMI). Within the genetic component, linkage disequilibrium between SNPs can be tested to confirm no correlation at a population level; few interactions of risk between genetic loci have been identified, so the assumption of independence should not be a major problem. For the environmental component, assumptions of independence are more difficult to assess. Lack of independence may lead to inaccuracies in the population frequencies estimated, but the contribution of environmental factors to the model is based on relative risks that are estimated in the presence of relevant covariates, so levels of risk should not be inflated. Increasing our understanding of the association between lifestyle factors, as well as between genes and the environment, will be important in obtaining more accurate assessments of risk. In addition, the accuracy could be further improved by more specific modelling of the population being targeted. Applying data with relative risks by sex, by population group, or for individuals with a first degree relative with CRC for example, would provide more accurate estimations of disease risk specific to those populations.
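The linkage disequilibrium check mentioned above can be sketched as follows. This assumes phased haplotype frequencies are already available for a pair of biallelic SNPs and uses the standard r² statistic; the function name and the input frequencies are hypothetical, and real analyses would normally rely on an established genetics package.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab -- frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a  -- frequency of allele A at SNP 1
    p_b  -- frequency of allele B at SNP 2
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must be strictly between 0 and 1")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))

# Independent SNPs: the haplotype frequency equals the product of allele frequencies.
print(ld_r_squared(0.25, 0.5, 0.5))
# Perfectly correlated SNPs: allele A always travels with allele B.
print(ld_r_squared(0.5, 0.5, 0.5))
```

An r² close to zero for every pair of SNPs in the panel is consistent with the independence assumption at the population level.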

Colorectal cancer screening programmes are widespread, but are age-targeted and look for signs of cancer in early development. In contrast, the methods described here can be used to target lifestyle factors, and are relevant for younger age-groups. The approach could encourage behavioural changes and help to reduce CRC rates. Although the model indicates that certain individuals can reduce their CRC risk by changing their behaviour, the time taken for changes in environmental risk factors to have an effect on risk is unknown, and will differ by factor. Additional research is needed to further elucidate the genetic and environmental contributions to disease risk and to measure the longer term impact of behavioural change on disease outcomes.

Association of the insulin-like growth factor 1 (IGF1) microsatellite with predisposition to colorectal cancer

Gut 2011;60:A116 doi:10.1136/gut.2011.239301.245
  • Neoplasia and cancer pathogenesis


K J Monahan *, S Spain, H J Thomas, I P Tomlinson

Gastroenterology, West Middlesex University Hospital, London; Cancer Medicine, Imperial College, London; Molecular and Population Genetics, Cancer Research UK, London, UK


Introduction IGF1 may be important for colorectal cancer risk because of its role in cell growth and differentiation. High IGF1 serum levels have been associated with increased risk of colorectal cancer. Variations in these serum levels have been associated with a CA repeat microsatellite 1 kilobase upstream of the transcription start site. We sought to determine the association of germline variation of the IGF1 gene with colorectal cancer predisposition by performing a large case-control study.

Methods Genescan 500 was used to differentiate alleles of the IGF1 microsatellite among 2143 colorectal cancer cases (enriched for family history) and 1715 controls from the CORGI (COloRectal Gene Identification) study; about 5% of genotypes were subsequently checked by direct sequencing, with 100% confirmation. Associations of genotypes with the following clinicopathological features were tested: sex; site of tumour; Dukes stage; age of onset; presence of adenomas. Using genotype data obtained from the Hap550 platform by colleagues plus the genotypes at the insertion/deletion, we reconstructed haplotype blocks around the IGF1 gene in the controls using 68 tagging SNPs.

Results All the alleles confer increased risk for colorectal neoplasia except ‘192’ (192 copies of CA repeat), which is a protective allele (allelic OR for ‘192’ = 1.199; 95% CI 1.09 to 1.32; p = 0.000152). The population attributable risk (PAR) for the risk ‘allele’ (ie, where ‘X’ = not ‘192’) is 2.94%. The risk alleles occurred more frequently in more advanced Dukes’ stage tumours (p = 0.039, χ2). Several SNPs in close linkage disequilibrium with IGF1 are also significantly associated with colorectal neoplasia risk.


[Figures: IGF1 linkage disequilibrium (LD) plot; IGF1 association p values]

Conclusion This study demonstrates a novel association of IGF1 microsatellite with colorectal cancer risk. The association is stronger with advanced stage colorectal cancers, and in colonic rather than rectal cancers. This microsatellite is in linkage disequilibrium with other significant SNPs in the promoter region of this gene.
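For readers unfamiliar with how an allelic odds ratio and its 95% confidence interval (as quoted in the Results) are derived, here is a minimal sketch using a 2×2 table of allele counts and the Woolf (log) method. The counts are invented for illustration and are not the CORGI data; the abstract does not state which interval method the authors used.

```python
import math

def allelic_odds_ratio(case_risk, case_other, control_risk, control_other, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    Arguments are allele counts (two alleles per person), not numbers of people.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    counts = (case_risk, case_other, control_risk, control_other)
    if any(c <= 0 for c in counts):
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts, chosen only to show the mechanics.
or_, lo, hi = allelic_odds_ratio(1700, 2586, 1220, 2210)
print(f"OR {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level.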

P21 / CDKN1A germline variation and low penetrance predisposition to colorectal cancer

Gut 2011;60:A116-A117 doi:10.1136/gut.2011.239301.246

CDKN1A germline variation and low penetrance predisposition to colorectal cancer

K J Monahan *, S Spain, H J Thomas, I P Tomlinson

Family History of Bowel Cancer Clinic, West Middlesex University Hospital, London, UK; Cancer Medicine, Imperial College, London, UK; Molecular and Population Genetics, Cancer Research UK, London, UK


Introduction Progressive loss of cell cycle control is an important feature of the adenoma-carcinoma sequence of colorectal cancer. Cyclin-dependent kinase inhibitor 1A (P21/CDKN1A) is an important target of the TGFβ signalling pathway, and it is commonly under-expressed as colorectal neoplasia develops. The aim of this study was to identify low penetrance germline variation in this gene which predisposes individuals to colorectal cancer.

Methods Variation in the coding region of CDKN1A was assessed in 50 colorectal cancer cases with a strong family history and in 50 controls, using the Lightscanner and direct sequencing. Fifteen tagging SNPs around CDKN1A were typed in the CORGI cases and controls as part of a genome-wide analysis of 930 cases and 960 controls on the Illumina Hap550 platform, performed by colleagues (Tomlinson et al 2007). Allele-specific expression of the gene was examined using quantitative reverse transcriptase PCR linked to SNPs in an upstream promoter region.

Results A novel amino-acid changing variant, Phe22Leu, was identified in a single colorectal cancer patient. Arg31Ser was identified in six patients in the case group and five in the control group. In the association study the two most significant SNPs lie in an upstream promoter region and are in linkage disequilibrium; both are risk alleles for colorectal cancer (OR 1.13; 95% CI 1.06 to 1.2).

[Table columns: Allele A; Allele B; p (χ2); 95% CI. Figure: p21 LD plot with inverse p values]

There were significant differences in expression between CDKN1A and the controls in 94 samples (p=0.037, Student’s t test), demonstrating linkage of these upstream polymorphisms to allele-specific expression of CDKN1A.

[Figure: TaqMan assay, p21]

Conclusion Rare variants of P21/CDKN1A are an infrequent cause of predisposition to colorectal neoplasia. A promoter region upstream of CDKN1A is associated with predisposition to colorectal neoplasia, and is linked to allele-specific expression of this gene.


PLOS Genetics: Comparison of Family History and SNPs for Predicting Risk of Complex Disease

Authors: Chuong B. Do, David A. Hinds, Uta Francke, Nicholas Eriksson


The clinical utility of family history and genetic tests is generally well understood for simple Mendelian disorders and rare subforms of complex diseases that are directly attributable to highly penetrant genetic variants. However, little is presently known regarding the performance of these methods in situations where disease susceptibility depends on the cumulative contribution of multiple genetic factors of moderate or low penetrance. Using quantitative genetic theory, we develop a model for studying the predictive ability of family history and single nucleotide polymorphism (SNP)–based methods for assessing risk of polygenic disorders. We show that family history is most useful for highly common, heritable conditions (e.g., coronary artery disease), where it explains roughly 20%–30% of disease heritability, on par with the most successful SNP models based on associations discovered to date. In contrast, we find that for diseases of moderate or low frequency (e.g., Crohn disease) family history accounts for less than 4% of disease heritability, substantially lagging behind SNPs in almost all cases. These results indicate that, for a broad range of diseases, already identified SNP associations may be better predictors of risk than their family history–based counterparts, despite the large fraction of missing heritability that remains to be explained. Our model illustrates the difficulty of using either family history or SNPs for standalone disease prediction. On the other hand, we show that, unlike family history, SNP–based tests can reveal extreme likelihood ratios for a relatively large percentage of individuals, thus providing potentially valuable adjunctive evidence in a differential diagnosis.
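The abstract's remark about likelihood ratios as adjunctive evidence can be made concrete with the usual odds form of Bayes' rule. This sketch is not the quantitative genetic model developed in the paper; it only shows how a likelihood ratio from any test would update a pre-test probability, and the numbers are illustrative.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio.

    post-test odds = pre-test odds * likelihood ratio
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood ratio must be positive")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# Illustrative: a 10% pre-test probability combined with a likelihood ratio of 9.
print(post_test_probability(0.10, 9.0))
```

A likelihood ratio of 1 leaves the probability unchanged, which is why only fairly extreme ratios are informative in a differential diagnosis.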

Author Summary

In clinical practice, obtaining a detailed family history is often considered the standard-of-care for characterizing the inherited component of an individual’s disease risk. Recently, genetic risk assessments based on the cumulative effect of known single nucleotide polymorphism (SNP) disease associations have been proposed as another potentially useful source of information. To date, however, little is known regarding the predictive power of each approach. In this study, we develop models based on quantitative genetic theory to analyze and compare family history and SNP–based models. Our models explain the impact of disease frequency and heritability on performance for each method, and reveal a wide range of scenarios (16 out of the 23 diseases considered) where SNP associations may already be better predictors of risk than family history. Our results confirm the difficulty of obtaining accurate prediction when SNP or family history–based methods are used alone, and they show the benefits of combining information from the two approaches. They also suggest that, in some situations, SNP associations may be potentially useful as supporting evidence alongside other types of clinical information. To our knowledge, this study is the first broad comparison of family history– and SNP–based methods across a wide range of health conditions.

via PLOS Genetics: Comparison of Family History and SNPs for Predicting Risk of Complex Disease.

Much Of The Population Genetic Risk Of Colorectal Cancer Is Likely To Be Mediated Through Susceptibility To Adenomas


Several single nucleotide polymorphisms (SNPs) have been associated with colorectal cancer (CRC) susceptibility. Most CRCs arise from adenomas, and SNPs might therefore affect predisposition to CRC by increasing adenoma risk. We found that 8 of 18 known CRC-associated SNPs (rs10936599, rs6983267, rs10795668, rs3802842, rs4444235, rs1957636, rs4939827, and rs961253) were over-represented in CRC-free patients with adenomas, compared with controls. Ten other CRC-associated SNPs (rs6691170, rs6687758, rs16892766, rs7136702, rs11169552, rs4779584, rs9929218, rs10411210, rs4813802, and rs4925386) were not significantly associated with adenoma risk. Genetic susceptibility to CRC in the general population is likely to be mediated in part by predisposition to adenomas. – Gastroenterology – Much Of The Population Genetic Risk Of Colorectal Cancer Is Likely To Be Mediated Through Susceptibility To Adenomas. Carvajal-Carmona et al In Press September 2012

Low penetrance risk and colorectal cancer: A review

Low penetrance variants and colorectal tumours

Although inherited susceptibility is responsible for 30% of all CRC (Lichtenstein, Holm et al. 2000), high-penetrance mutations in APC, the mismatch repair (MMR) genes, MUTYH, SMAD4, BMPR1A and STK11 account for <5% of cases (Aaltonen et al. 2007). The nature of the residual inherited susceptibility to CRC is at present undefined, but a model in which high-risk alleles account for all of the excess inherited risk seems improbable. It is likely that the remaining CRC inherited risk is largely accounted for by common, low penetrance alleles. These alleles may either predispose directly to colorectal tumourigenesis or may have an additive effect on predisposition. Candidate alleles studied include variants on known tumour suppressor genes, oncogenes, DNA repair genes, folate metabolising genes, and others.

A global view of the genetic contribution to colorectal cancer.
The highly penetrant causative mutations in familial adenomatous polyposis (FAP), Lynch syndrome, the hamartomatous polyposis syndromes and other familial conditions underlie cases of colorectal cancer (CRC) that have a strong hereditary component, with little environmental influence. However, there are also several low-penetrance mutations that contribute to CRC susceptibility in an additive way, involving interactions between genes and with environmental factors. As well as accounting for cases of hereditary CRC, these mutations are also likely to contribute to cases of CRC that are classified as ‘sporadic’. In addition, although none has been identified so far, modifier genes are also likely to influence the effects of genetic and environmental factors that contribute to CRC. Therefore, the distinction between ‘sporadic’ and ‘familial’ cases and between ‘genetic’ and ‘environmental’ predisposing factors has become blurred and might be better thought of as a continuum of risks contributing to CRC development. APC, adenomatous polyposis coli; BLM, Bloom syndrome; MMR, mismatch repair; TGFβR2, transforming growth factor-β receptor 2. Nat Rev Cancer 4(10):769-780, 2004

The APC I1307K variant is present in about 6% of Ashkenazi Jews, but is much rarer in those of other ethnic groups. I1307K creates an A8 tract (eight consecutive adenine residues) which appears to be somatically unstable, leading to frameshift mutations (Laken et al. 1997). The tumour risk associated with I1307K has been controversial, but most recent reports suggest that it has a relatively small effect (perhaps only 1.5-fold risk of colorectal cancer), suggesting that the A8 tract is only modestly hypermutable (Gryfe et al. 1999).

A number of other low-penetrance alleles have been found with varying degrees of evidence and importance (table 1.1).  The ability to identify these genes and to understand their interactions with other relevant environmental and genetic factors remains important however. It will help to stratify an individual patient’s risk for entry into surveillance programs and to reveal causative factors, allowing more effective prevention strategies.

Genome-wide association studies in cancer

To date a number of genome-wide association studies have been performed in breast (Easton et al. 2007; Stacey et al. 2007; Stacey et al. 2008), lung (Amos et al. 2008), prostate (Gudmundsson et al. 2007; Gudmundsson et al. 2007; Eeles et al. 2008; Gudmundsson et al. 2008), melanoma (Gudbjartsson et al. 2008) as well as colorectal cancer (Broderick et al. 2007; Tomlinson, Webb et al. 2007; Jaeger, Webb et al. 2008; Tomlinson et al. 2008). Most of these studies have been published over the last 2 years. The odds ratios for the loci identified range from 1.1 to 1.75, the majority having an odds ratio <1.5 (Easton and Eeles 2008). There has been a certain amount of replication between these studies, particularly for the locus 8q24 which has been associated with risk of breast, prostate and colorectal cancer in separate studies. However, results so far suggest that these loci account for a small proportion of the overall risk.

(a) GWA studies identify common genetic variants (tag SNPs) associated with disease. (b) These tag SNPs are typically correlated, or in linkage disequilibrium, with other variants. (c–e) Integrating comparative sequence (c), chromatin profiling (d) and predictions of transcription factor binding sites (e) can identify putative functional SNPs (red asterisk). (f) There are a variety of functional assays for validating SNPs with predicted function.

It is difficult to speculate on the true function of these risk alleles.  There appears to be very little epistasis between the 28 loci identified in these 5 cancer types.  None of these loci are involved in DNA repair, frequently a cause of susceptibility to higher penetrance loci.  This may underlie why so many case control studies have failed to yield significant results consistently, as the underlying hypothesis may have been inaccurate.  One might speculate that many of the associations may be driven through their effects on gene expression, particularly as many lie in gene-poor regions.

Most GWAS have not been powered to detect the effects of polymorphisms with minor allele frequencies (MAFs) <0.05; such variants are therefore sometimes included in the rare variant class. More often, rare variants are considered to be subpolymorphic (MAF <0.01), with very rare or ‘private’ variants having MAF <0.001. Clearly much of the distinction between ‘common disease-common variant’ and ‘rare variant’ models is arbitrary. Nevertheless, it is probably worth defining them in order to illustrate important differences between common and rare variant models, in terms of gene discovery and possible clinical relevance. For example, rare variants are likely to have more biological impact than common variants, having arisen more recently in evolutionary terms (Bodmer and Bonilla 2008).

Rare variants as low-penetrance alleles


Rare variants will not be detectable by population association studies based on the use of linked polymorphic markers, even with very large case/control cohort studies.  This is because of low allelic frequency and individually small contributions to the overall inherited susceptibility of a disease.  These variants are less common than those studied in association studies (i.e. minor allele frequency (MAF) <0.05) but not as rare as obvious mutations (MAF >0.01), although such mutations may also be identified.  Finding rare variants requires nomination of candidate genes likely to have a role in disease aetiology, which are then directly screened for sequence variants which may affect protein function.  This is known as the ‘common-disease/rare-variant’ hypothesis (Pritchard 2001).

Allele frequency and effect sizes for genetic variants associated with colorectal cancer. Hindorff L A et al. Carcinogenesis 2011;32:945-954

So far there have been few rare variants identified in colorectal cancer, partially because candidate genes are not easily identified, and because there have only been a few studies performed.   In one such study variants in APC I1307K and E1317Q, in AXIN1, CTNNB1, and the mismatch repair genes hMLH1 and hMSH2 were more common in 124 multiple adenoma cases than in controls (Fearnhead et al. 2004).   Studies of other candidate genes have produced results of low or no significance however (Dallosso et al. 2008; Zogopoulos et al. 2008).

Labelling APC I1307K a rare variant may not be accurate, as the frequency of the polymorphism in the Ashkenazi population where it is present is 6%, thus potentially suitable for large association studies.  This distinction underlines the arbitrary nature of how such polymorphisms are labelled as rare or common variants.

Although the population attributable risk (PAR) of common variants may be relatively high, their relative influence is low, with reported odds ratios below 2 and peaking at approximately 1.2 (Easton and Eeles 2008). Most rare variants have odds ratios a little higher than 2 but not above 5, with a mean of 3.7 in observations thus far (Bodmer and Bonilla 2008). Their individual contributions are small, and they do not give rise to familial concentrations of cases. As techniques improve to interrogate genetic sequence in an inexpensive, high-throughput and efficient manner, this method of identifying variants is likely to generate a higher yield of significant results in the near future.

A candidate gene approach demonstrated rare novel low penetrance breast cancer predisposition loci in three genes, PALB2, BRIP1, and RAD51C (Seal et al 2006; Rahman et al 2007; Meindl et al 2010). This discovery was assisted by the identification of breast cancer cases in Fanconi Anaemia pedigrees. In general, however, it is not a simple task to prioritize candidates for rare variant studies. In the short term, it is likely that discovery efforts will be focused largely on sequencing candidate genes. Nevertheless, it is becoming feasible to sequence entire genomes to discover variants, due to decreased costs and increased efficiency of such methods. In a proof of principle study, complete exomic sequencing of a patient with familial pancreatic cancer identified a germline truncating mutation in PALB2 which appeared responsible for this individual’s predisposition to the disease (Jones et al 2009), although mutations in this gene are thought to be rare events in familial pancreatic cancer (Tischkowitz et al 2010).

The above mentioned rare variant loci for breast cancer in PALB2, BRIP1, and RAD51C were present in 10, 8 and 2 cases and 0, 1 and 0 controls respectively.  Due to lack of power rare variants are difficult to validate by frequency alone in an association-type study. If we assume that a single variant or a set of related variants (for example, in the same gene) occurs at a general population frequency of 0.01–0.001, as many as 1000 unselected cases or controls will be required to detect with probability of about 0.7 more than one variant in a discovery screen (Bodmer & Tomlinson 2010).
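The sample-size argument above can be explored with a simple binomial calculation. The sketch below treats the quoted frequency as the per-individual carrier frequency and assumes individuals are sampled independently; both are assumptions of this example, so it is not expected to reproduce the cited figure exactly, which depends on where in the 0.01–0.001 range the frequency falls and on how frequency is defined.

```python
import math

def prob_more_than_k_carriers(n_individuals, carrier_frequency, k=1):
    """P(X > k) for X ~ Binomial(n_individuals, carrier_frequency)."""
    if not 0.0 <= carrier_frequency <= 1.0:
        raise ValueError("carrier frequency must lie between 0 and 1")
    if n_individuals < 0 or k < 0:
        raise ValueError("n_individuals and k must be non-negative")
    p_at_most_k = sum(
        math.comb(n_individuals, i)
        * carrier_frequency ** i
        * (1.0 - carrier_frequency) ** (n_individuals - i)
        for i in range(min(k, n_individuals) + 1)
    )
    return 1.0 - p_at_most_k

# Scan the frequency range mentioned in the text for a screen of 1000 people.
for freq in (0.001, 0.002, 0.005, 0.01):
    p = prob_more_than_k_carriers(1000, freq)
    print(f"carrier frequency {freq}: P(more than one carrier) = {p:.3f}")
```

The probability rises steeply across this range, which shows how sensitive discovery screens are to the assumed variant frequency.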

Nevertheless, in principle the more common a variant is in the population the less its biological impact, thus allowing it to be passed on through generations without affecting reproductive ability.  Rare variants are likely to reveal more about the pathophysiology of the disease process than common variants, as they are likely to have functional significance, as opposed to common variants which are probably in linkage disequilibrium with the causative mutations.

However, it is more problematic to design useful studies of rare variants, as random variation identified cannot be readily assumed to be of functional significance. For example, over 1500 variants of uncertain significance (VUSs) have been identified in BRCA1 using a sequencing based approach in breast cancer cases. The difficulty with rare variant discovery, particularly with whole exomic sequence analysis, will be to sort out the candidate functional variation from an almost overwhelming background of functionally irrelevant variation. The choice of targets will, in general, require some a priori assessment of functional effects. In silico biometric approaches have been developed with increasing predictive ability, although in vitro demonstration of effects is generally preferable in order to determine functional effects, for example simple effects on expression or protein truncation.

Studying a cohort of affected cases and subsequently examining a control set for variants identified can cause ascertainment bias.  Thus it would be preferable to search for them in affected individuals and controls with equal rigour, and to use a statistical framework to determine whether variants are truly more common in the affected.  These studies are likely to require extremely large and/or enriched data sets in order to identify and verify significant rare variants.  Nevertheless it is becoming increasingly cost and time effective to perform even whole genome sequencing to determine genetic predisposition to both common and rare disease.
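One common choice for such a statistical framework with small carrier counts is Fisher's exact test. The sketch below is illustrative only: the carrier counts echo those quoted earlier (10 versus 0, and 2 versus 0), but the cohort sizes of 1000 cases and 1000 controls are hypothetical, since the source does not give the denominators.

```python
from math import comb

def fisher_one_sided(case_carriers: int, n_cases: int,
                     control_carriers: int, n_controls: int) -> float:
    """One-sided Fisher exact p-value: probability of seeing at least this
    many carriers among cases if carriers were distributed at random."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    favourable = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, carriers)

# Hypothetical cohort sizes (1000 cases, 1000 controls); not from the source.
print(fisher_one_sided(10, 1000, 0, 1000))  # 10 carriers in cases, 0 in controls
print(fisher_one_sided(2, 1000, 0, 1000))   # 2 carriers in cases, 0 in controls
```

With these assumed denominators, ten carriers against none gives a p-value just under 0.001, while two against none gives about 0.25, which shows why very rare variants cannot be validated by frequency alone without much larger or enriched data sets.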

Copy number variation and predisposition

A copy number polymorphism (CNP) in MTUS1 was found to be associated with breast cancer predisposition (Frank et al. 2007), but not colorectal cancer (Monahan et al 2008).  Recently, multiple studies have discovered an abundance of germline copy number variation (CNV) of DNA segments ranging from small to large chromosomal segments (e.g. Down syndrome results from trisomy 21), probably encompassing over 12% of the human genome (Redon et al. 2006). These include deletions, insertions, duplications and complex multi-site variants.  The extent and role of these copy number polymorphisms (CNPs) is increasingly understood with the development of new techniques which allow us to identify such variation (Lupski 2007).

Many new CNPs have been identified from studies using whole genome SNP chips (Redon et al. 2006).  However, the extent of linkage disequilibrium between SNPs and CNPs is unclear.  The biological impact of these types of variation, for example on gene expression, is strikingly different.  Expression profiles from SNPs and CNPs had little overlap (Stranger et al. 2007).  Multiplex ligation-probe amplification (MLPA) has revealed complex whole exon duplications and deletions in APC which lead to the classic FAP phenotype (Schouten et al. 2002; McCart et al. 2006; Pagenstecher et al. 2007).  High penetrance conditions such as FAP are rare whatever the type of mutation may be, e.g. point mutations or exon CNV.  In theory, complex disease might be more susceptible to subtle, lower penetrance forms of variation which alter whole gene copy number without disabling gene function.  In addition, the impact of individual CNPs may be even subtler, with disease phenotype being caused by combinations of low penetrance alleles.

Identification of significant CNPs is thus far hampered by the cost of performing such studies and the lack of techniques available.  Genome wide association studies using SNPs are better at identifying deletion copy number variation than duplication (Locke et al. 2006).  The new generation arrays (e.g. the Affymetrix 5.0 and 6.0, and Illumina 1 M) are being designed to offer the potential to simultaneously interrogate SNPs and CNPs in a single experiment.  However, it may be that more comprehensive genome wide CNP maps are first required, with the level of detail for CNPs that the HapMap project provided for SNPs, before such genome wide CNP arrays are truly useful.

Much as SNPs can be either common or rare variants, so can CNPs.  Using an array comparative genomic hybridisation (aCGH) platform, a large study concluded that common CNVs are well tagged on existing SNP platforms and probably contribute little to disease predisposition (Craddock et al 2010).  However, this study was limited by the selection of CNVs and did not examine the impact of rare CNVs.  While genome-wide association using common CNPs may be a potentially useful method to elucidate predisposition caused by such CNPs, this technique is not useful for rare variants.  The true role of these rare variants in human disease is as yet undetermined.

Functional consequences of risk alleles

When a Mendelian cancer predisposition gene is first identified, much of the evidence of its linkage to the phenotype derives from the finding of several different variants in that gene that

  • Have strong functional effects (for example, protein-truncating mutations).
  • Are often accompanied by ‘second hits’ in the cancers themselves.
  • Are essentially absent from the general population and are hence associated with a very high relative risk.

Conversely, the finding of a statistical association of low penetrance alleles with disease in association studies does not necessarily prove that the underlying variant has a biological consequence such as causing low-penetrance predisposition.  The likely disease-causing locus (with which the polymorphism is in linkage disequilibrium) has rarely been identified.  The IGF1 microsatellite and the TSER TYMS polymorphisms may be in linkage disequilibrium with a sequence variant which alters gene expression (Monahan et al 2009).  In a number of recent genome-wide and candidate gene association studies, the downstream effect of such variation on RNA and protein function is largely unknown.  Identification of a germline mutation in linkage disequilibrium with predisposition alleles has remained elusive, and it is felt that allele-specific expression may be an important aetiological factor in colorectal cancer predisposition, particularly as many observed significant variants are not close to any known coding regions (Houlston et al. 2008; Valle et al. 2008).  A SNP in SMAD7, whilst strongly associated with colorectal cancer risk, was not found to alter expression of the gene despite lying in its 3’UTR (Broderick et al. 2007).  This study may have been limited by the effects of tissue-specific expression, as it was performed on lymphoblastoid cell lines derived from cases.  In contrast, the colorectal cancer associated locus 8q24 lies in a gene desert but contains regulatory elements of MYC, and this region preferentially binds TCF4, the primary target of the canonical Wnt signalling pathway (Tuupanen et al 2009; Pomerantz et al 2009).

Whilst association studies may not easily reveal germline mutations, quantitative and qualitative gene expression studies may be a useful direction for future studies.

Understanding proteomics may be used to yield information as to epistasis between genes as protein-protein interactions are amongst the most important determinants of interaction between genes.  However, in variants identified to date there appears to be very little epistasis (Houlston et al. 2008).  There have been some significant advances in the understanding of diseases such as Crohn’s disease (Parkes et al. 2007) and Coeliac disease (van Heel et al. 2007) due to the results of non-hypothesis driven association studies.  A number of low-penetrance loci have been linked to specific biological pathways with likely biological relevance in these conditions.  Five of the 10 SNPs identified by GWAS of colorectal cancer are in close LD with genes of the TGF/BMP signalling pathway including SMAD7, BMP2 and BMP4.  In the next few years research is likely to reveal further advances in our understanding of the role of both common and rare low penetrance alleles in colorectal cancer by analysing the associated effects on expression and protein function, and by the identification of disease causing mutations.

Gene-environment interactions

Recently published data analysis from the CAPP2 study demonstrates significant modification of colorectal cancer risk in Lynch Syndrome patients by aspirin (Burn et al 2011).  Thus even highly penetrant syndromes may be modifiable by the environment.  A priori, environmental agents are even more likely to modify lower penetrance genetic risk factors.  An association of smoking-related cancers with polymorphisms at the cancer susceptibility locus 8q24 (identified by genome-wide association) has been suggested (Park et al. 2008).  When the odds ratios for predisposition alleles are well below 1.5 there is a possibility of interaction (or bias) through an unmeasured environmental factor, as in the context of lung cancer risk and association with 15q, which contains the nicotinic acetylcholine receptor (Chanock and Hunter 2008).  Furthermore, the role of gene-environment interactions remains poorly defined, and a reductionist approach to understanding the aetiology of colorectal neoplasia means that few such studies exist.  Naturally, common low penetrance susceptibility alleles will individually contribute little to overall risk, and it is likely that environmental ‘modification’ by smoking, exercise, body habitus, diet, etc. will provide a more complete explanation of what drives normal colonic crypts along the pathway to cancer.  Indeed, the odds ratios for environmental risk factors are comparable to those of many low penetrance alleles.

It is likely that combining data from genetic and environmental studies will provide clinicians with an increasingly powerful tool to understand an individual patient’s risk and tailor an appropriate management plan, whether this be colonoscopic screening, genetic testing, or lifestyle modification.  It has been proposed that these data may be used in future association studies in a two-step process whereby patients are first screened for epidemiological risk factors before entering the genotyping analysis (Murcray et al. 2009).
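As a rough illustration of how genetic and environmental odds ratios might be combined for one individual, the sketch below multiplies per-factor odds ratios. It assumes the factors act independently and multiplicatively with no gene-environment interaction, which is a simplification, and every factor name and odds ratio in it is hypothetical rather than taken from any study cited here.

```python
from math import prod

def combined_odds_ratio(per_factor_or: dict, exposure: dict) -> float:
    """Multiply odds ratios, raising each to the number of risk alleles
    carried (0, 1 or 2) or to 1 or 0 for a present or absent exposure."""
    return prod(
        odds_ratio ** exposure.get(name, 0)
        for name, odds_ratio in per_factor_or.items()
    )

# Hypothetical per-allele and per-exposure odds ratios, for illustration only.
odds_ratios = {"risk_allele_A": 1.2, "risk_allele_B": 1.1, "heavy_alcohol": 1.5}
patient = {"risk_allele_A": 2, "risk_allele_B": 1, "heavy_alcohol": 1}

print(round(combined_odds_ratio(odds_ratios, patient), 3))
```

Removing the modifiable exposure from the patient dictionary shows how much of the combined estimate a behavioural change could shift under this simple model.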

COloRectal Gene Identification (CORGI) Study

In 1997, the ColoRectal tumour Gene Identification (CoRGI) Study Consortium was formed to ascertain and collect biological samples and data from families segregating colorectal cancer, in order to identify novel predisposition genes.  This study, led by Prof Ian Tomlinson, has largely been undertaken in this laboratory by colleagues.  Families and individuals are being collected with the following entry criteria:

  • Bowel cancer aged < 75 years old
  • Colorectal adenoma < 45 years old
  • Three or more adenomas at any time
  • Severely dysplastic/villous/large (> 1cm) adenoma
  • Exclude patients with IBD, pathogenic germline mutations, Peutz-Jeghers & juvenile polyposis.

Families were collected from centres throughout England, Scotland and Ireland.

CORGI 1 – Linkage Analysis: A genome wide linkage analysis has been performed on 69 families with a history of bowel cancer and/or polyps using the GeneChip Mapping 10K Xba 142 arrays containing 10,204 SNP markers (Kemp et al. 2006).  Families in this study had at least 2 individuals (except parent/child) affected.  A maximum non-parametric linkage statistic of 3.40 (P=0.0003) was identified at chromosomal region 3q21–q24.  The Galway family is the largest pedigree with over 29 informative meioses, and a decision was taken for it to be studied separately (Chapters 3 and 4).
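The quoted P value is consistent with treating the non-parametric linkage statistic as approximately standard normal under the null hypothesis. That is an assumption made here for illustration; the source does not say how the P value was derived.

```python
from math import erfc, sqrt

def one_sided_p(z: float) -> float:
    """Upper-tail probability of a standard normal variate exceeding z."""
    return 0.5 * erfc(z / sqrt(2))

print(f"{one_sided_p(3.40):.4f}")  # prints 0.0003
```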

CORGI 1b: A second similar set of 34 families has been collected.  Linkage analysis was performed by colleagues which confirmed linkage at 3q22 (Papaemmanuil, Carvajal-Carmona et al. 2008).

CORGI 1c: Approximately 100 families where siblings are affected are being collected for sib-pair analysis.


CORGI 2 – Genome Wide Association (GWA): CORGI 2 is a GWA study using an Illumina SNP platform on cases with the same entry criteria as CORGI 1 but without a family history.  Colleagues initially genotyped 550,163 tag SNPs in 940 individuals with familial colorectal neoplasia and 965 controls using the Illumina Infinium platform (Tomlinson, Webb et al. 2007).  In CORGI 2b, approximately 42,000 candidate SNPs with the most significant association in CORGI 2 are being re-tested in a group of ~ 3000 colorectal cancer patients.  Several loci which contain SNPs associated with colorectal cancer susceptibility (at 8q23, 10p14, 11q24, 15q13.3 and 18q21) have been recently identified by colleagues in this cohort (Broderick, Carvajal-Carmona et al. 2007; Tomlinson, Webb et al. 2007; Jaeger, Webb et al. 2008; Tenesa et al. 2008; Tomlinson, Webb et al. 2008).  However, no mutations with proven functional relevance have yet been identified at these loci.
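Testing this many SNPs requires a stringent significance threshold. As a worked example, a Bonferroni correction for the 550,163 tag SNPs is sketched below; the family-wise error rate of 0.05 is an assumption, and the source does not state which threshold the study actually applied.

```python
def bonferroni_threshold(n_tests: int, family_wise_alpha: float = 0.05) -> float:
    """Per-test significance threshold that holds the family-wise error
    rate at family_wise_alpha across n_tests independent tests."""
    return family_wise_alpha / n_tests

threshold = bonferroni_threshold(550_163)  # number of tag SNPs genotyped
print(f"{threshold:.2e}")
```

This gives a per-SNP threshold of roughly 9 × 10⁻⁸, which helps explain why a two-stage design that re-tests only the most strongly associated SNPs in a larger cohort is attractive.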

CORGI 3 – Candidate gene screening: Patients in the CORGI 2 cohort are being screened for sequence abnormalities in functionally important genes such as those involved in DNA repair, the Wnt pathway, or other genes involved in the aetiology of colorectal neoplasia.  Colleagues are also screening the patients included in CORGI 1 and CORGI 2 for gene mutations at the loci identified by linkage or association respectively.  Candidate genes EPHB1 and MBD4 have been screened for mutations at 3q21-24 in the CORGI 1 family set but none were found (Kemp, Carvajal-Carmona et al. 2006).


Because of the evidence from the adenoma-to-carcinoma sequence model (Morson 1968; Fearon and Vogelstein 1990), the National Polyp Study (Winawer et al. 1993) and other prospective studies (Dove-Edwin et al. 2005; Dove-Edwin et al. 2006), we know that if polyps are removed during colonoscopy, cancer may be prevented.  Thus colorectal cancer is one of the most preventable of all cancers, and some early evidence is emerging that colonoscopic screening may reduce colorectal cancer related mortality (Baxter et al. 2009).  However, national colonoscopic screening programs are expensive and stretch the capacity of already busy services, and therefore do not reach the whole population they target.  In addition to lifestyle modification advice to reduce environmental risk factors, it may be possible to identify two groups of patients with inherited risk by understanding the underlying molecular aetiology.

(Copyright, Dr Kevin Monahan)

COlorectal Gene Identification Study

A study to find genes that may increase the risk of bowel tumours (CORGI study)


This study is trying to find genes that may increase your risk of bowel cancer or non cancerous tumours called polyps and adenomas.

We know from research that certain genetic factors you inherit from your parents may affect your risk of getting tumours in your bowel. This includes polyps and adenomas as well as bowel cancer.

In this study, the researchers will look at the genes of a large number of people who have had a tumour in their bowel. They will look for common gene faults that may increase the risk. And they will find out more about people’s family history to see if any of their relatives have also had tumours in the bowel.

The aim of the study is to identify genes that may cause bowel tumours.


Start 05/08/1999

End 31/05/2016

Who can enter

You can enter this study if you (or a relative) have had a bowel tumour.

You cannot enter this trial if you are known to have an inherited condition with a gene fault that increases your risk of bowel tumours.

Trial design

The study will recruit about 3,000 people. If you have had a bowel tumour, a doctor who has been involved in your care will talk to you about the study, or send you a letter explaining it.

If you agree to take part after talking to your doctor, they will give you a letter containing more information. They will ask you to sign a consent form and fill in a questionnaire with questions about your medical history and that of your family. Then they will arrange for you to give a blood sample. This can usually be done when you are due to have another routine blood test.

If you want longer to think about the study, you can take away some information and contact the research team later on if you decide you do want to take part.

If you receive an initial letter about the study and decide you want to take part, you return a reply slip to the research team. A member of the research team will then contact you by phone to give you more information about the study. If you agree to take part, they will then send you

  • A letter containing more information
  • The questionnaire about you and your family’s medical history
  • A consent form for you to sign
  • A blood sampling kit

You give a small sample of blood at your GP surgery or hospital. They will use the blood sampling kit provided by the researchers and send it back to them, along with the completed questionnaire and signed consent form.

The researchers will look at the DNA in the blood samples to try and find genes that may increase the risk of bowel tumours. By signing the consent form, you also give them permission to look at your medical records.

If you have had surgery to remove a bowel tumour, the researchers may get a sample of the tissue that was removed. They will do further genetic tests on this tissue sample.

If any of your close relatives have also had a bowel tumour, the researchers may ask you to contact them about the study. If a relative agrees, the researchers will contact them by phone and invite them to take part.

You may also ask a relative by marriage, who hasn’t had a bowel tumour to take part in the study. This is because the research team need blood samples from a group of people to use as a comparison. This is known as a control group. The control group will be made up of people who

  • Have not had a bowel tumour themselves
  • Don’t have any close relatives who have had a bowel tumour

All the information you and your relatives give will be kept confidential. No results will be given to the people taking part, or to their doctors. Taking part will have no effect on medical care for you or your family.

Hospital visits

The only extra visit will be if you need to go to your GP surgery for the blood test.

You may have already seen a genetics specialist, but if you haven’t and the study team think you have a strong family history, they will suggest you talk to your GP about being referred to a genetics service.

Side effects

You may have a small bruise where the blood sample is taken.

Location of trial

West Middlesex University Hospital and other sites throughout the UK

For more information

Please note: we cannot help you to join a specific trial. Unless we state otherwise in this trial summary, you need to print this page and take it to your own doctor to discuss.

Find out how to join a trial or contact our cancer information nurses for other questions about cancer by phone (0808 800 4040), by email at, or at

The Information Nurses,

Cancer Research UK

Angel Building

407 St John Street


Chief Investigator

Professor Ian Tomlinson

Supported by

Cancer Research UK

National Institute for Health Research Cancer Research Network (NCRN)

University of Oxford

