Group four - Bioinformatics analysis of mixed lineage leukemia (MLL)

Masako Harada, Department of Oncology and Pathology, CCK, KI

Goncalo Castelo-Branco, Department of Neuroscience and DBRM, KI

Outline

Introduction

Cis-regulatory regions

Gene Expression Profile

miRNA and other classes of non-coding RNAs

Protein/Domains

Protein Interaction

Conclusion

Introduction

MLL

MLL (mixed-lineage leukemia) is a large gene that spans approximately 88 kb of DNA and contains 36 exons. It encodes a 431-kD protein of more than 3,910 amino acids, containing 3 regions with homology to sequences within the Drosophila 'trithorax' gene, including cysteine-rich regions that can be folded into 6 zinc finger-like domains. The MLL protein has a DNA methyltransferase domain and a SET domain, which is a histone H3 lys4 (K4)-specific methyltransferase. This histone methylase activity was found to be associated with HOX gene activation and H3 K4 methylation at cis-regulatory sequences in vivo. MLL is present within a stable multiprotein supercomplex composed of at least 29 proteins. The majority of the complex proteins are components of transcription complexes, including TFIID. Other components are involved in RNA processing or histone methylation. The complex remodels, acetylates, deacetylates, and methylates nucleosomes and/or free histones. MLL is cleaved by the protease taspase-1 at 2 conserved sites to generate an N-terminal 320-kD fragment (N320) and a C-terminal 180-kD fragment (C180), which heterodimerize to stabilize the complex and confer its subnuclear destination.

Entrezgene_summary_MLL.jpg

Source (Entrez Gene) (OMIM)

MLL in disease

Recurring chromosomal translocations involving chromosome 11, band q23, have been observed in acute lymphoid leukemias and especially in acute myeloid leukemias (AML). The breakpoints in four 11q23 translocations associated with leukemia were contained within a yeast artificial chromosome (YAC) clone bearing the Cd3d and Cd3g gene loci. Within this YAC, a transcription unit was identified that spans the breakpoint junctions of 3 of these translocations: 4;11, 9;11, and 11;19. Two other related transcripts, which were upregulated in a translocation cell line, were also described. This gene was named MLL for myeloid/lymphoid, or mixed lineage, leukemia. The breakpoint cluster region within MLL spans 8 kb and encompasses several small exons, most of which begin in the same phase of the open reading frame. The types of acute lymphoblastic leukemia and acute myeloid leukemia that are particularly associated with translocations involving 11q23 are acute monoblastic leukemia (AML-M5) and acute myelomonocytic leukemia (AMML-M4).

The MLL gene spans the breakpoint in translocations involving 11q23, which are responsible for approximately 70% of AML and ALL in infants and are also observed in treatment-related leukemias, especially in patients previously treated with drugs inhibiting topoisomerase II. Unique or clonotypic MLL-AF4 genomic fusion sequences were detectable in neonatal blood spots from individuals who developed ALL at ages 5 months to 2 years, thus providing unequivocal evidence for a prenatal initiation of acute leukemia in young patients. Common subtypes due to other translocation fusion genes can be expected to have a similar prenatal initiation. Epidemiologic studies suggested that maternal exposure to various substances such as pesticides, marijuana, or an excess of flavonoids (naturally occurring inhibitors of topoisomerase II) might be associated with acute leukemia in infants. Clustering algorithms showed that lymphoblastic leukemias with MLL translocations can clearly be separated from conventional acute lymphoblastic and acute myelogenous leukemias. This led to the proposal that they constitute a distinct disease, denoted as MLL, and it was shown that the differences in gene expression are robust enough to classify leukemias correctly as MLL versus acute lymphoblastic leukemia or acute myelogenous leukemia.

Translocations involving 11q23 in leukemia result in the translocation of zinc finger domains with fusion to other genes on chromosome 4, chromosome 9, or chromosome 19. The gene on chromosome 19 with which it is fused is ENL. The genes with which it is fused on chromosome 4 (AF4) and chromosome 9 (AF9) show high sequence homology to ENL. The protein products of AF4, AF9, and ENL contain nuclear targeting sequences as well as serine-rich and proline-rich regions. Many genes can fuse with MLL, including AF6, AF9, FBP17, LPP, PNUTL1 and gephyrin, among others. The MLL gene is leukemogenic when it fuses with itself as well as when it fuses with one of the genes on other chromosomes. A direct tandem duplication involved a region spanning exons 2 to 6, and a partially duplicated protein product was demonstrated.

Source (OMIM)

Animal Models of MLL

Yu et al. (1995)

  • Mll deletion in mice was embryonic lethal.
  • Mll+/- mice had retarded growth, hemopoietic abnormalities and bidirectional homeotic transformation of the axial skeleton (with altered Hox gene expression), as well as sternal malformations.

Yamashita et al. (2006)

  • Role of MLL in the immune system using Mll+/- mice:
    • Mll+/- Cd4-positive T cells differentiated normally into antigen-specific effector Th1 and Th2 cells in vitro, but the ability of memory Th2 cells to produce Th2 cytokines was dramatically decreased.
    • Histone methylation and acetylation at Th2 cytokine gene loci were not maintained in Mll+/- memory Th2 cells. Levels of Gata3 mRNA were normal in Mll+/- effector Th2 cells, but they were substantially decreased in Mll+/- memory Th2 cells; mRNA levels of other transcription factors were not affected in Mll+/- memory Th2 cells. Histone modifications of Gata3 were also aberrant in Th2 cell lines in which Mll expression had been knocked down by small interfering RNA.
    • Ovalbumin-induced allergic eosinophilic inflammation was reduced in Mll+/- Th2 cell-transferred mice.

Barabe et al. (2007)

  • Upon transplantation into immunodeficient mice, primitive human hematopoietic cells expressing a mixed-lineage leukemia (MLL) fusion gene generated myeloid or lymphoid acute leukemias, with features that recapitulated human diseases.

McMahon et al. (2007)

  • Fetal liver from Mll-knockout mouse embryos showed defects in the hematopoietic stem and progenitor pool, including reductions in long-term and short-term hematopoietic stem cell numbers and a decrease in the quiescent hematopoietic stem cell fraction.

  • Adult mice with conditional Mll knockout had no apparent abnormalities in mature hematopoietic cells in bone marrow, spleen, and thymus. However, conditional Mll-knockout bone marrow cells produced reduced numbers of colony-forming units and showed reduced ability to compete in hematopoietic reconstitution assays.

Source (OMIM)

MLL transcripts and cis-regulatory regions

The MLL gene spans approximately 89 kb of DNA and contains 36 exons. Analysis of MLL in the UCSC genome browser indicates that, although several mRNAs and ESTs have been identified, there is only one human RefSeq sequence for MLL (NM_005933.2), suggesting that there are no alternative splice variants. However, UCSC Gene Predictions include other possible variants with fewer exons (uc001psz.1, uc001ptd.1 and uc001pte.1). The mouse and rat RefSeqs are similar to the human one, except for exon 27, which appears to be alternatively spliced. The Danio rerio RefSeq indicates the absence of certain exons.

UCSC_MLL_RefSeq.jpg

Analysis of MLL in the ENSEMBL genome browser also reveals another possible variant, with 20 exons (Q9HB80_HUMAN).

These possible transcript variants would translate into proteins containing only certain domains. For instance, Q9HB80 would contain only the DNA binding domain and the zinc binding domain, as assessed in UniProt.

MLL_EMSEMBL.jpg

Cis-regulatory regions of MLL

Conservation

To identify cis-regulatory regions of the MLL gene, we first analyzed the human MLL gene (chr11:117,812,415-117,901,146, March 2006 assembly) and its flanking regions in the UCSC genome browser for:

- the conservation among different species

- the conserved transcription factor binding sites

UCSC_conservation_MLL.jpg

Apart from the exon sequences, there were several regions of high sequence similarity between species that could be good candidates for cis-regulatory regions. These regions contained several transcription factor binding sites conserved in the human/mouse/rat alignment.
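As a quick sanity check on the genomic interval quoted above, the length of the region can be computed directly from the position string. This is a minimal sketch, assuming the position string is 1-based and inclusive (as UCSC browser position strings normally are); the function names are hypothetical and not part of any UCSC tool.

```python
def parse_ucsc_region(region):
    """Split a 'chrom:start-end' position string into (chrom, start, end)."""
    chrom, span = region.split(":")
    start_text, end_text = span.split("-")
    # Browser position strings often carry thousands separators.
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    if end < start:
        raise ValueError("end coordinate is smaller than start coordinate")
    return chrom, start, end


def region_length(region):
    """Length in bp, assuming 1-based inclusive coordinates."""
    _, start, end = parse_ucsc_region(region)
    return end - start + 1


mll_region = "chr11:117,812,415-117,901,146"
print(region_length(mll_region))  # 88732
```

The result, 88,732 bp, is in line with the gene size of roughly 88 to 89 kb given elsewhere on this page.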

Cis-regulatory regions analysis and transcription factor binding sites

To analyze the different cis-regulatory regions in more detail, we performed an evolutionary conservation analysis in the ECR Browser, comparing human to mouse, rat and chicken. The different evolutionarily conserved regions (ECRs) are highlighted with red bars.

MLL_ECRbrowser.jpg

The human and mouse sequences of the first ECR in the 5' region of the MLL gene were then captured, and a search for conserved transcription factor binding sites was performed in rVista.

RVista_e1.jpg

The following TFBS were found:

1 V$HNF4_Q6 - 53-61 gTGGACctt 432-440 gagGTCCAc 95.00
2 V$AHRHIF_Q6 + 69-77 ggCGTGaag 419-427 ctgCACGcc 90.00
3 V$TAXCREB_02 + 72-86 gTGAAGTGCACCCTC 410-424 GAGGGTGTACTGCAc 90.00
4 V$AP2_Q6 - 84-95 ctcCGGAGGGcc 401-412 agCCCACTGgag 80.00
5 V$IK2_01 - 133-144 tcctTCCCaaat 351-362 atttGGGAacga 80.00
6 V$E2_Q6_01 + 181-196 tgtACCGGTGCGGGAg 299-314 cTCCCGCACCGGTacc 90.00
7 V$AREB6_03 - 182-193 gtaCCGGTGcgg 302-313 ccgCACCGGtac 90.00
8 V$E2_01 - 182-197 gtACCGGTGCGGGAgt 298-313 tcTCCCGCACCGGTac 90.00
9 V$E2_Q6 - 182-197 gtACCGGTGCGGGAgt 298-313 tcTCCCGCACCGGTac 90.00
10 V$SMAD4_Q6 - 201-215 gGAAGGCTGCATGAC 280-294 GTCATGCAGCCGTCc 95.00
11 V$ZID_01 + 204-216 aGGCTGCATGACC 279-291 GGTCATGCAGCCg 95.00
12 V$FXR_IR1_Q6 + 205-217 GGCTGCATGACCt 278-290 aGGTCATGCAGCC 95.00
13 V$T3R_01 - 207-222 ctgcATGACCTTCcag 273-288 cagGAAGGTCATgcag 95.00
14 V$ER_Q6_02 - 208-218 tgcaTGACCTt 277-287 aAGGTCAtgca 95.00
15 V$T3R_Q6 + 210-218 catGACCTt 277-285 aAGGTCatg 95.00
16 V$AP2_Q3 - 230-245 cccctcGGCTTGGGgg 250-265 acCCCCAGCCgaggtg 95.00
17 V$TFIII_Q6 + 244-252 gGATGGAGG 235-243 CCTCCATCt 95.00
18 V$HIC1_02 + 248-262 ggaggcTGCcccggg 225-239 cccggaGCAgcctcc 95.00
19 V$HIC1_03 + 248-265 ggaggcTGCcccggggcc 222-239 agacccggaGCAgcctcc 95.00
20 V$AP2_Q3 + 263-278 gcCTGCAGGCtgtgta 210-225 tacactGCCTGCAGac 95.00
21 V$CDPCR1_01 + 277-286 tATCGATccc 202-211 gggATCGATa 95.00
22 V$CDPCR3HD_01 + 277-286 tATCGATccc 202-211 gggATCGATa 95.00
23 V$AHR_01 - 361-378 tctAACGCTATGCTCgag 110-127 ctcCAGCGAGGCGTTaga 80.00
24 V$STAT1_01 - 387-407 ccctctTTCCCGCAAtaaagt 81-101 actttaTTGCAGGAAatggga 90.00
25 V$PAX8_B - 390-407 tctttCCCGCaataaagt 81-98 actttattGCAGGaaatg 90.00
26 V$LPOLYA_B + 399-406 cAATAAAG 82-89 CTTTATTg 90.00
27 V$OCT1_02 + 415-429 gtGGATATTCAttcc 59-73 ggggTGAATATCCac 90.00
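Many of the hits listed above overlap one another, so it can help to group them by position before interpreting them. The following is a minimal sketch of one way to do that, using the first seven hits as sample input. It assumes that the fourth whitespace-separated field is the position range of the hit in the first sequence; the output gives no column headers, so this reading is an assumption, and the function names are hypothetical.

```python
# First seven hits from the list above, one hit per line.
ROWS = """\
1 V$HNF4_Q6 - 53-61 gTGGACctt 432-440 gagGTCCAc 95.00
2 V$AHRHIF_Q6 + 69-77 ggCGTGaag 419-427 ctgCACGcc 90.00
3 V$TAXCREB_02 + 72-86 gTGAAGTGCACCCTC 410-424 GAGGGTGTACTGCAc 90.00
4 V$AP2_Q6 - 84-95 ctcCGGAGGGcc 401-412 agCCCACTGgag 80.00
5 V$IK2_01 - 133-144 tcctTCCCaaat 351-362 atttGGGAacga 80.00
6 V$E2_Q6_01 + 181-196 tgtACCGGTGCGGGAg 299-314 cTCCCGCACCGGTacc 90.00
7 V$AREB6_03 - 182-193 gtaCCGGTGcgg 302-313 ccgCACCGGtac 90.00
"""


def parse_hit(line):
    """Turn one row into a dict; field 4 is assumed to be 'start-end'."""
    fields = line.split()
    start, end = (int(value) for value in fields[3].split("-"))
    return {"matrix": fields[1], "strand": fields[2], "start": start, "end": end}


def cluster_hits(hits):
    """Merge hits whose position ranges overlap into clusters."""
    clusters = []
    for hit in sorted(hits, key=lambda h: h["start"]):
        if clusters and hit["start"] <= clusters[-1]["end"]:
            # Overlaps the current cluster: extend it.
            clusters[-1]["end"] = max(clusters[-1]["end"], hit["end"])
            clusters[-1]["matrices"].append(hit["matrix"])
        else:
            clusters.append(
                {"start": hit["start"], "end": hit["end"], "matrices": [hit["matrix"]]}
            )
    return clusters


hits = [parse_hit(line) for line in ROWS.splitlines() if line.strip()]
clusters = cluster_hits(hits)
for cluster in clusters:
    print(cluster["start"], cluster["end"], ", ".join(cluster["matrices"]))
```

Grouping hits this way only summarizes where predictions pile up; it does not by itself indicate which of the overlapping factors, if any, actually binds.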

Gene expression

To determine the general expression pattern of MLL in different human cells and tissues, we queried MLL in Symatlas, a database on gene function and structure from the Genomics Institute of the Novartis Research Foundation. MLL is expressed at higher levels in CD4+ and CD8+ T cells, as shown below by the profile of the MLL probe 212076_at. Other MLL probes in this array experiment show similar expression patterns.

Symatlas.jpg

We then performed a query with human MLL in ArrayExpress and found the same array experiment under E-TABM-145. The graphic representation allowed us to see the profiles of the different MLL probes, which are consistent with the Symatlas analysis. We then performed a query to determine which other gene in the array had the most similar profile to MLL and identified ZBTB25 (depicted in red).

ETABM145.jpg

MLL_ZBTB25_Array.jpg

ZBTB25, also known as ZNF46 or KUP, contains 2 zinc-finger domains and is likely to be a transcription factor. It is mainly expressed in testis and, interestingly, in the hematopoietic system. However, its function has not been extensively characterized, as can be assessed in the following query in NCBI All databases. The correlation of its transcription with that of MLL in different cell lines suggests a relevant role for ZBTB25 in hematopoiesis and even in diseases such as leukemia.

ZBTB25.jpg
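The "most similar profile" query described above can be illustrated with a small sketch. The similarity measure used by the ArrayExpress interface is not stated on this page, so Pearson correlation is used here purely as an assumption. The gene names (GENE_A, GENE_B, GENE_C) and all expression values are made up for illustration and are not taken from E-TABM-145.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def most_similar(query_profile, candidates):
    """Return (name, r) of the candidate best correlated with the query."""
    scored = ((name, pearson(query_profile, profile))
              for name, profile in candidates.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical expression values across five samples (invented numbers).
query = [1.0, 2.0, 8.0, 9.0, 1.5]
candidates = {
    "GENE_A": [2.0, 4.1, 16.2, 17.9, 3.1],  # follows the query pattern
    "GENE_B": [9.0, 8.0, 2.0, 1.0, 8.5],    # opposite pattern
    "GENE_C": [5.0, 5.1, 4.9, 5.2, 5.0],    # nearly flat
}

best_name, best_r = most_similar(query, candidates)
print(best_name, round(best_r, 3))
```

Correlation ranks genes by the shape of their profile across samples rather than by absolute expression level, which is why a gene expressed at a different overall level can still come out as the closest match.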

BioinfoAnalysisCont

Conclusions:

  • The MLL gene can undergo translocation with several other genes on other chromosomes, leading to leukemia.
  • MLL has only one RefSeq sequence, although several putative alternative splice forms suggest other isoforms of the protein.
  • Animal models with MLL misexpression indicate an important role in hematopoiesis, consistent with its role in leukemia.
  • The MLL gene contains several evolutionarily conserved regions at the 5' end.
  • MLL is highly expressed in CD4+ and CD8+ T cells and has a similar expression profile to ZBTB25.
  • MLL has a long 3' UTR with 20 predicted miRNA target sites.
  • The MLL mRNA was shown to be a likely target of piwi-interacting RNA, which can cause gene silencing together with Piwi protein.
  • Several domains have been identified by sequence alignment, and some are highly conserved across species.
  • The zinc finger domain has DNA-binding properties, which are important for transcription factor function.
  • The SET domain has histone methylase activity and was found to be associated with HOX gene activation and H3 K4 methylation in vivo.
  • The zinc finger domain and the SET domain interact closely with each other, and both seem to have important roles in various biological processes (e.g. p53).
 








 
Topic revision: r17 - 2009-07-13 - MartinDahlo

Bioinformatics for Cell Biologists (Spring -08)
