17.04.2013 Views

Julio César Chávez Galarza - Universidade do Minho

Julio César Chávez Galarza - Universidade do Minho

Julio César Chávez Galarza - Universidade do Minho

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Phylogenetic relationships of the most common pathogenic<br />

Candida species inferred by sequence analysis of nuclear genes<br />

<strong>Julio</strong> <strong>César</strong> <strong>Chávez</strong> <strong>Galarza</strong><br />

<strong>Minho</strong> 2009<br />

U<br />

<strong>Universidade</strong> <strong>do</strong> <strong>Minho</strong><br />

Escola de Ciências<br />

<strong>Julio</strong> <strong>César</strong> <strong>Chávez</strong> <strong>Galarza</strong><br />

Phylogenetic relationships of the most<br />

common pathogenic Candida species<br />

inferred by sequence analysis of nuclear<br />

genes<br />

Abril de 2009


<strong>Universidade</strong> <strong>do</strong> <strong>Minho</strong><br />

Escola de Ciências<br />

<strong>Julio</strong> <strong>César</strong> <strong>Chávez</strong> <strong>Galarza</strong><br />

Phylogenetic relationships of the most<br />

common pathogenic Candida species<br />

inferred by sequence analysis of nuclear<br />

genes<br />

Tese de Mestra<strong>do</strong><br />

Mestra<strong>do</strong> em Genética Molecular<br />

Trabalho efectua<strong>do</strong> sob a orientação da<br />

Professora Doutora Ana Paula Fernandes<br />

Monteiro Sampaio Carvalho<br />

Abril de 2009


É AUTORIZADA A REPRODUÇÃO PARCIAL DESTA TESE,<br />

APENAS PARA EFEITOS DE INVESTIGAÇÃO, MEDIANTE DECLARAÇÃO<br />

ESCRITA DO INTERESSADO, QUE A TAL SE COMPROMETE.


This Thesis was supported by the Programme ALβAN, the European Union Programme of High<br />

Level Scholarships for Latin America, scholarship No E06M103915PE.<br />

iii


DEDICATORIA<br />

A Jehová Deus por tu<strong>do</strong><br />

Aos meus pais <strong>Julio</strong> y Claudia pelo seu<br />

amor e apoio em to<strong>do</strong> momento<br />

v


AGRADECIMENTOS<br />

Manifesto de maneira muito especial o meu agradecimiento à Prof. Doutora Ana Paula<br />

Sampaio, orienta<strong>do</strong>ra deste trabalho, por to<strong>do</strong> apoio, dedicação, determinação, constante<br />

estímulo, e em particular por to<strong>do</strong>s os ensinamentos que me transmitiu.<br />

À Prof. Doutora Célia Pais, pelo seu apoio, ensinamentos e valiosas críticas deste<br />

trabalho.<br />

À Prof. Cândida Lucas, por toda a sua ajuda que me proporcionou para começar os<br />

meus estu<strong>do</strong>s em Portugal.<br />

Ao Prof. Rui Oliveira, pelos seus conselhos e amizade.<br />

Ao Prof. Felipe Costa, pelo seu apoio e sugestões na realização desta tese.<br />

Ao Prof. Pedro Carvalho pela sua amizade e acolhida quan<strong>do</strong> cheguei a Portugal.<br />

Aos elementos que integram o Departamento de Biologia e, nomeadamente os seus<br />

<strong>do</strong>centes, investiga<strong>do</strong>res e técnicos, por to<strong>do</strong> o apoio demonstra<strong>do</strong>.<br />

Aos colegas que integram o laboratório Yolanda, Alexandra, Marlene, Eugénia, Ana,<br />

Rafael, Raquel, pelo auxilio no trabalho, pela companhia e pela amizade.<br />

Aos colegas <strong>do</strong> Departamento de Biologia, em especial à Andreia, Sílvia, Célia, Rita,<br />

Sofia, Emi ao Flavio e Jorge, pelos seus conselhos, pela amizade, alegria e longas<br />

tertúlias.<br />

Ao Cristovão pela sua ajuda na edição desta tese.<br />

À família Alves Pinto Pacheco, que me fizeram esquecer o meu Perú, para aprender<br />

querer Portugal.<br />

Aos colegas <strong>do</strong> mestra<strong>do</strong> á Dany, Dulcineia, Dulce, Ana, Nádia, ao Marcos e Ricar<strong>do</strong>,<br />

pelos bons momentos que partilhamos e apoio que me brindaram.<br />

vii


Aos professores da <strong>Universidade</strong> Nacional de Trujillo-Perú: Dr. Manuel Fernández<br />

Honores e Ms. C. Elmer Alvítez Izquier<strong>do</strong> por me terem apoia<strong>do</strong> e ensina<strong>do</strong> grandes<br />

valores e em forma muito especial ao Professor Roberto Rodríguez Rodríguez por ter<br />

si<strong>do</strong> o meu guía durante a vida universitária e modelo de pessoa.<br />

Aos professores Dr. Ricar<strong>do</strong> Fujita Alarcón <strong>do</strong> Instituto de Genética e Biologia<br />

Molecular da <strong>Universidade</strong> Particular San Martín de Porres e Dr. Enrique Pérez <strong>do</strong><br />

Instituto de Medicina Tropical da <strong>Universidade</strong> Cayetano Heredia, pelo apoio<br />

incondicional e confiança colocada em mim.<br />

Aos meus irmãos Gianmarco, Michael, Lizbeth, Pilar, Emilia, Ruth, Rommel, Maira e<br />

Milagros, pelo apoio incondicional.<br />

Ao meu tío Miguel pela sua confiança e apoio incondicional desde o início <strong>do</strong>s meus<br />

estu<strong>do</strong>s.<br />

Aos meus tíos Ramón, Picolín, Doris, Luz e Carlos pelo seu apoio e confiança que<br />

sempre têm posto em mim.<br />

À família Rodríguez Gonzales: Prof. Roberto, Sra. Rocio, Mireya, Lalo e Bran<strong>do</strong>n, pelo<br />

apoio e amizade nos momentos mais difíceis, este logro é produto da sua confiança<br />

posta em mim.<br />

Às famílias Salga<strong>do</strong> Rodríguez, Juárez Pita, Rebaza Rodríguez, Massé Miguel, sempre<br />

me fizeram sentir como parte delas.<br />

À Clarisa que sem a sua ajuda tivesse si<strong>do</strong> difícil este logro.<br />

À Yolanda pelo amor e carinho constante que nunca acabarão.<br />

Do mesmo mo<strong>do</strong>, aos meus amigos <strong>do</strong> Perú: Carlitos, Eduar<strong>do</strong>, Miguel, Victor Hugo,<br />

Daniel, Fibi, Ramón, Ludwig, Gabriela, Cecilia, Divián, Omar, Néstor, Marge, Sussy.<br />

viii


RESUMO<br />

As regiões genómicas codificantes são as mais usadas em estu<strong>do</strong>s filogenéticos,<br />

quer através de análises multigénicas, quer filogenómicas. Contu<strong>do</strong>, a maioria destas<br />

análises usa apenas as regiões com elevada similaridade entre as sequências ortólogas,<br />

não consideran<strong>do</strong> as restantes regiões, as quais nos podem dar informações muito úteis.<br />

Deste mo<strong>do</strong>, o principal objectivo desta tese foi avaliar o uso <strong>do</strong>s genes da família<br />

MADS-box, RLM1 e MCM1, assim como o gene IFF8, que codifica para uma proteína<br />

GPI, em estu<strong>do</strong>s de filogenia de fungos.<br />

Foram obtidas setenta e seis sequências ortólogas para RLM1 e MCM1 e oito<br />

para IFF8 através da pesquisa em várias bases de da<strong>do</strong>s. O alinhamento das sequências<br />

foi realiza<strong>do</strong> mediante o uso <strong>do</strong>s programas CLUSTALW e MUSCLE e a análise<br />

filogenética, utilizan<strong>do</strong> os programas PHYML 3.0 e MrBayes 3.2. Os resulta<strong>do</strong>s obti<strong>do</strong>s<br />

usan<strong>do</strong> o factor de transcrição RLM1 indicaram que este apresenta condições favoráveis<br />

para ser incluí<strong>do</strong> em estu<strong>do</strong>s filogenéticos de fungos, uma vez que a topologia obtida<br />

está muito próxima das estabelecidas em outras análises multigénicas e filogenómicas.<br />

O factor de transcrição MCM1 demonstrou limitações para ser utiliza<strong>do</strong> em estu<strong>do</strong>s<br />

filogenéticos a nível <strong>do</strong> reino, visto que as sequências obtidas apresentam uma grande<br />

variabilidade de tamanho, originan<strong>do</strong> problemas nos alinhamentos. Apesar desta<br />

limitação este gene pode ser utiliza<strong>do</strong>, quer independentemente quer combina<strong>do</strong> com<br />

outros genes, para resolver filogenias a nível <strong>do</strong> Filo ou de grupos de categoria inferior,<br />

uma vez que os resulta<strong>do</strong>s obti<strong>do</strong>s dentro <strong>do</strong> subfilo Saccharomycotina estão de acor<strong>do</strong><br />

com os estu<strong>do</strong>s publica<strong>do</strong>s. A utilização <strong>do</strong> gene IFF8 para inferir filogenia apresentou<br />

grandes limitações uma vez que foram identifica<strong>do</strong>s ortólogos deste gene apenas no<br />

grupo CUG e além disso a filogenia obtida não estava de acor<strong>do</strong> com outros estu<strong>do</strong>s<br />

publica<strong>do</strong>s, em particular no respeitante à posição de C. tropicalis na filogenia de<br />

Candida spp.<br />

É presentemente aceite que no Subfilo Saccharomycotina ocorreu um processo<br />

de duplicação completa <strong>do</strong> genoma, nos grupos sensu stricto e sensu lato <strong>do</strong> ‗Complexo<br />

Saccharomyces‘ com conservação de vários genes duplica<strong>do</strong>s. A pesquisa inicial por<br />

sequências ortólogas revelou que os factores de transcrição estuda<strong>do</strong>s neste trabalho<br />

estão dentro <strong>do</strong>s genes duplica<strong>do</strong>s que foram manti<strong>do</strong>s. Assim, outro objectivo deste<br />

trabalho foi determinar se o gene RLM1 esteve sob selecção positiva no Subfilo<br />

ix


Saccharomycotina. Esta análise identificou várias posições onde os aminoáci<strong>do</strong>s estão<br />

possivelmente sob selecção positiva e embora estas substituições tenham si<strong>do</strong><br />

observadas em diferentes locais da proteína, não se detectou nenhuma nas regiões<br />

conservadas. Esta observação sugere que a proteína executa uma função importante na<br />

célula que foi mantida durante o processo de divergência de espécies. A presença de<br />

aminoáci<strong>do</strong>s sob selecção positiva, no inicio da região repetitiva <strong>do</strong> terminal carboxílico<br />

da proteína, bem como a diferença no aminoáci<strong>do</strong> repeti<strong>do</strong> entre as espécies com o<br />

genoma duplica<strong>do</strong> e as espécies com genoma não duplica<strong>do</strong> sugeriam que uma mutação<br />

de frameshift seria a responsável pelas alterações observadas. Esta hipótese foi então<br />

testada no Subfilo Saccharomycotina, desenhan<strong>do</strong> as três grelhas de leitura e<br />

reconstruin<strong>do</strong> a sequência ancestral. Os resulta<strong>do</strong>s desta análise confirmaram que uma<br />

mutação de frameshift foi de facto responsável pelas alterações observadas na região<br />

repetitiva com substituição de aminoáci<strong>do</strong>s, diversifican<strong>do</strong> a função deste gene nas<br />

espécies que duplicaram o genoma.<br />

x<br />

Os principais resulta<strong>do</strong>s desta tese foram: (i) a identificação <strong>do</strong> potencial <strong>do</strong><br />

gene RLM1 para estu<strong>do</strong>s de filogenia <strong>do</strong> reino Fungi, com especial ênfase na filogenia<br />

das espécies <strong>do</strong> género Candida spp.; e (ii) a identificação <strong>do</strong> mecanismo molecular<br />

responsável pela alteração da região repetitiva <strong>do</strong> terminal carboxílico da proteína,<br />

ocorrida no Saccharomyces sensu stricto durante a divergência das espécies após<br />

duplicação <strong>do</strong> genoma.


ABSTRACT<br />

Coding regions are used to resolve phylogenetic relationships through<br />

multigenic and phylogenomic analyses. However, the majority of these analyses uses<br />

regions with similarity only among orthologue gene sequences and <strong>do</strong> not take into<br />

account other regions which could give useful information. Thus, the main purpose of<br />

this thesis was to evaluate the use of RLM1 and MCM1 MADS-box transcription<br />

factors, and IFF8, a GPI-anchor protein-coding gene, in fungi phylogeny.<br />

Seventy six putative orthologue sequences for RLM1 and MCM1 and eight for IFF8<br />

were obtained from different fungal databases. Sequence alignments were performed by<br />

using CLUSTALW and MUSCLE and phylogeny was inferred by using PHYML 3.0<br />

and Mr Bayes 3.2. Results obtained from the phylogenetic analysis, using the<br />

transcription factor RLM1, indicated that it presents conditions to be considered within a<br />

multigene analysis, since the obtained fungal phylogeny is closer to the ones established<br />

by multigene and phylogenomic analyses. The transcription factor MCM1 presented<br />

limitations to be used in phylogeny at the king<strong>do</strong>m level, because of its variable<br />

sequence sizes. However, despite this limitation this gene can be used to resolve<br />

phylogeny at the phylum or lower clade levels since the results obtained, independently<br />

and/or concatenated with the other genes used in this study, within the subphylum<br />

Saccharomycotina were in agreement with published studies. On the other hand, the use<br />

of IFF8 gene to infer phylogeny is limited and restricted to the CUG group, since other<br />

orthologues were not found within the king<strong>do</strong>m Fungi and the results obtained in the<br />

CUG group phylogeny presented conflicts, particularly in the position of Candida<br />

tropicalis which is not in agreement with previously determined relationship in Candida<br />

phylogeny.<br />

It is known that within the subphylum Saccharomycotina the process of genome<br />

duplication has occurred in the ‗Saccharomyces complex‘, groups sensu stricto and<br />

sensu lato, resulting in the duplication of some genes. The initial search for orthologue<br />

sequences showed that the transcription factors studied in this work are within the<br />

duplicated genes that were maintained. Thus, another objective of this work was to<br />

determine if RLM1 was under positive selection to search for an alternative explanation<br />

for the persistence and diversification of gene duplicates. These analyses identified<br />

several amino acid sites under positive selection within the subphylum<br />

xi


Saccharomycotina and although these substitutions were present in different positions,<br />

they were not inside conserved regions, suggesting that the protein plays an important<br />

role in fungi that was maintained during the evolution process of divergence of species.<br />

The presence of amino acids under positive selection at the beginning of Rlm1 C-<br />

terminal repetitive region and the differences in the amino acid under repetition between<br />

species that presented the duplicated genome (WGD) and species with non duplicated<br />

genome was indicative of a possible frameshift mutation. This hypothesis was tested in<br />

Saccharomycotina group by designing the three open reading frames and reconstructing<br />

the ancestral sequence. Results from this analysis showed that amino acid substitution<br />

that occurred during the divergence of species changed this repetitive region and<br />

diversified gene function in WGD species avoiding the loss of the protein function.<br />

xii<br />

The major findings of this work were (i) the identification of the potential use of<br />

RLM1 gene for inferring phylogeny in the king<strong>do</strong>m Fungi with special emphasis to<br />

Candida species, and (ii) the observation that this gene evolved within the<br />

Saccharomyces sensu stricto after genome duplication, being the molecular mechanism<br />

responsible for the change observed in the C-terminal of this protein most probably a<br />

frameshift mutation.


RESUMEN<br />

Las regiones genómicas codificantes son usadas en estudios filogenéticos através<br />

de análisis a niveles multigenicos y filogenómicos. Pero la mayoría de estos análisis<br />

usan solo regiones con similaridad entre secuencias de genes ortólogos y no toma en<br />

cuenta otras regiones las cuales podrían darnos informaciones útiles. Por lo tanto, el<br />

objetivo principal de esta tesis fue evaluar el uso de los genes MADS-box, RLM1 y<br />

MCM1, así como IFF8, un gen codificante de proteína de anclaje GPI, en la filogenia de<br />

hongos.<br />

Se obtuvieron setenta y seis secuencias putativas ortólogas para RLM1 y MCM1,<br />

y ocho para IFF8 a partir de las diferentes bases de datos de hongos. Los alineamientos<br />

de secuencias se realizaron mediante el uso de los programas CLUSTALW y MUSCLE,<br />

y la filogenia fue inferida por medio de los programas PHYML 3.0 y MrBayes 3.2. Los<br />

resulta<strong>do</strong>s obteni<strong>do</strong>s de los análisis filogenéticos, utilizan<strong>do</strong> el factor de transcripción<br />

RLM1 indicaron que este presenta condiciones para ser considera<strong>do</strong> dentro de análisis<br />

multigénico, ya que la filogenia obtenida para hongos está muy próxima a las<br />

establecidas por otros análisis multigénicos y filogenómicos. El factor de transcripción<br />

MCM1 presentan limitaciones para ser utiliza<strong>do</strong> en filogenia a nivel de reino, debi<strong>do</strong> a<br />

sus secuencias de tamaño variables, dan<strong>do</strong> lugar a problemas en el alineamiento. A<br />

pesar de esta limitación este gen se puede utilizar para resolver filogenias a nivel de Filo<br />

o cla<strong>do</strong>s inferiores debi<strong>do</strong> a los resulta<strong>do</strong>s obteni<strong>do</strong>s con su uso, de manera<br />

independiente y/o combina<strong>do</strong> con el resto de genes utiliza<strong>do</strong>s en este estudio, ya que<br />

dentro del Subfilo Saccharomycotina estuvieron de acuer<strong>do</strong> con estudios publica<strong>do</strong>s.<br />

Por otro la<strong>do</strong>, el uso de IFF8 gen para inferir filogenia es limita<strong>do</strong> y restringi<strong>do</strong> al grupo<br />

CUG, ya que otros ortólogos no se han encontra<strong>do</strong> en el reino Fungi y los resulta<strong>do</strong>s<br />

obteni<strong>do</strong>s en la filogenia del grupo CUG presentaron conflictos, en particular en la<br />

posición de C. tropicalis que no está de acuer<strong>do</strong> con lo que ya se ha determina<strong>do</strong> en la<br />

filogenia de Candida spp.<br />

Es actualmente acepta<strong>do</strong> que dentro del subfilo Saccharomycotina el proceso de<br />

duplicación del genoma se ha produci<strong>do</strong> en los grupos sensu stricto y sensu lato del<br />

'Complejo Saccharomyces‘, resultan<strong>do</strong> en la manutención de algunos genes. La<br />

búsqueda inicial de secuencias ortólogas mostró que los factores de transcripción<br />

estudia<strong>do</strong>s en este trabajo están dentro de los genes duplica<strong>do</strong>s que han si<strong>do</strong><br />

xiii


manteni<strong>do</strong>s. Así pues, otro objetivo de este trabajo fue determinar si RLM1 estuvo bajo<br />

selección positiva. Estos análisis identificaron varios sitios <strong>do</strong>nde los aminoáci<strong>do</strong>s<br />

estuvieron posiblemente bajo selección positiva dentro del Subfilo Saccharomycotina y<br />

aunque estas sustituciones estaban presentes en diferentes posiciones, no estuvieron<br />

presentes en las regiones conservadas, lo que sugiere que la proteína ejecuta una<br />

función importante en la célula que fue mantenida durante el proceso de divergencia de<br />

especies. La presencia de aminoáci<strong>do</strong>s bajo selección positiva en el inicio de la region<br />

repetitiva del C-terminal en Rlm1 y la diferencia en el aminoáci<strong>do</strong> bajo repetición entre<br />

las especies con el genoma duplica<strong>do</strong> y las especies con genoma no duplica<strong>do</strong> indicaba<br />

un posible frameshift mutation. Asi, esta hipótesis fue testada en el subfilo<br />

Saccharomycotina diseñan<strong>do</strong> tres open reading frames y reconstruyen<strong>do</strong> la secuencia<br />

ancestral. Los resulta<strong>do</strong>s de este análisis mostraron que la substitución de aminoáci<strong>do</strong>s<br />

ocurrida durante la divergencia de especies que alteró esa región fue mediante una<br />

frameshift mutation diversifican<strong>do</strong> la función de este gen en las especies que duplicaron<br />

su genoma.<br />

xiv<br />

Los principales resulta<strong>do</strong>s de esta tesis son: (i) la identificación del potencial del<br />

gen RLM1 para estudios de filogenia en el reino Fungi, con especial énfasis en la<br />

filogenia de las especies de Candida spp y (ii) la identificación del mecanismo<br />

molecular responsable por el cambio observa<strong>do</strong> en el C-terminal de esta proteína en el<br />

grupo Saccharomyces sensu stricto que alteró el gen durante la divergencia de especies<br />

después del proceso de duplicación de genoma.


INDEX<br />

DEDICATORIA .............................................................................................................. v<br />

AGRADECIMENTOS ................................................................................................. vii<br />

RESUMO ........................................................................................................................ ix<br />

ABSTRACT .................................................................................................................... xi<br />

RESUMEN ................................................................................................................... xiii<br />

ABBREVIATION LIST ............................................................................................. xvii<br />

FIGURE LIST .............................................................................................................. xxi<br />

TABLE LIST .............................................................................................................. xxiii<br />

1. INTRODUCTION ...................................................................................................... 1<br />

1.1. Classification of organisms .................................................................................. 3<br />

1.2. Tree of life ............................................................................................................. 5<br />

1.3. King<strong>do</strong>m Fungi ................................................................................................... 12<br />

1.4. Fungal taxonomy ................................................................................................ 15<br />

1.5. Fungal phylogeny ............................................................................................... 16<br />

1.6. Fungal identification .......................................................................................... 19<br />

1.6.1. Phenotypic methods .................................................................................... 20<br />

1.6.2. Genotypic methods ...................................................................................... 21<br />

1.7. Methods for phylogenetic inference.................................................................. 24<br />

1.7.1. Multiple sequence alignment ...................................................................... 25<br />

1.7.1.1. CLUSTAL series of programs .......................................................... 25<br />

1.7.1.2. MUSCLE ............................................................................................ 26<br />

1.7.2. Major methods for estimating phylogenetic trees and infer phylogeny . 26<br />

1.7.2.1. Distance methods ............................................................................... 26<br />

1.7.2.2. Character-based methods ................................................................. 28<br />

1.8. Adaptive evolution.............................................................................................. 30<br />

2. THESIS BACKGROUND AND OBJECTIVES .................................................... 33<br />

3. MATERIAL AND METHODS ............................................................................... 43<br />

3.1. Phylogenetic analysis ......................................................................................... 45<br />

3.1.1. Data retrieval ............................................................................................... 45<br />

3.1.2. Alignment of sequences ............................................................................... 46<br />

3.1.3. Selection of the best evolutionary model ................................................... 46<br />

xv


xvi<br />

3.1.4. Phylogenetic reconstruction ........................................................................ 46<br />

3.1.5. Interpretation of values considered for nodal support ............................. 47<br />

3.2. Analysis of adaptive evolution ........................................................................... 48<br />

3.3. Analysis of the repetitive region in the subphylum Saccharomycotina ......... 49<br />

4. RESULTS AND DISCUSSION................................................................................ 51<br />

4.1. Compilation of data ............................................................................................ 53<br />

4.2. Phylogeny based on MADS-box transcrition factors ...................................... 54<br />

4.2.1. Rlm1 transcription factor ........................................................................... 54<br />

4.2.2. Mcm1 transcription factor .......................................................................... 60<br />

4.2.3. Placement of fungal phylogeny based on MADS-box transcription factor<br />

within the current literature ................................................................................. 63<br />

4.3. Relationships within the subphylum Saccharomycotina ................................ 69<br />

4.3.1. Phylogeny of the ‘Saccharomyces complex’ ............................................... 69<br />

4.3.2. Phylogenetic relationships amongst Candida species ............................... 74<br />

4.4. Analysis of adaptive evolution ........................................................................... 77<br />

4.5. Event of frameshift mutation in S. cerevisiae RLM1 gene .............................. 82<br />

5. FINAL REMARKS ................................................................................................... 89<br />

6. REFERENCES .......................................................................................................... 95<br />

7. ANNEXES ................................................................................................................ 129


ABBREVIATION LIST<br />

AIC Akaike Information Criterion<br />

AICc Akaike Information Criterion corrected<br />

ACT1 Actin 1<br />

AFLP Amplified Fragment Length Polymorphism<br />

aLRT Approximate Likelihood Ratio Test<br />

AP-PCR Arbitrarily Primed - PCR<br />

ARG80 Arginine requirement 80<br />

BEB Bayesian Empirical Bayes<br />

BI Bayesian inference<br />

COX2 Cythocrome oxidase 2 gene<br />

CUG Organisms translate serine instead of leucine<br />

CytB Cythocrome oxidase B gene<br />

DHFR Dihydrofolate Reductase gene<br />

DNA Desoxiribonucleic acid<br />

dN Number of non-synonymous changes per non-synonymous site<br />

dS Number of synonymous changes per synonymous site<br />

EF-1α Elongation factor 1 alpha<br />

GPI Glycosylphosphatidylinositol<br />

HYPHY Hypothesis Testing Using Phylogenies<br />

IFF8 Individual proteins file family F 8<br />

ITS Intergenic Transcribed Spacer<br />

LPS Lipopolysaccharide<br />

LRTs Likelihood Ratio Tests<br />

MADS MCM1-AGAMOUS-DEFICIENS-SRF<br />

MAP1 Schizosaccharomyces pombe MCM1 homologue<br />

MCM1 MiniChromosome Maintenance<br />

MCMCMC Metropolis-coupled Markov Chain Monte Carlo<br />

MEC Mechanistical-Empirical Model<br />

MEF2 Myocite Enhancer Factor 2<br />

xvii


MEGA Molecular Evolutionary Genetics Analysis<br />

ML Maximum likelihood<br />

MMR Mismatch repair system<br />

MP Maximum parsimony<br />

mtDNA Mithocondrial Desoxiribonucleic acid<br />

mtLSUrDNA Mithocondrial large subunit ribosomal DNA gene<br />

mtSSUrDNA Mithocondrial small subunit ribosomal DNA gene<br />

MUSCLE Multiple Sequence Comparison by log-Expectation<br />

NJ Neighbor joining<br />

NonWGD Non Whole Genome Duplication<br />

nucLSUrDNA Nuclear large subunit ribosomal DNA gene<br />

nucSSUrDNA Nuclear small subunit ribosomal DNA gene<br />

OM Outer membrane<br />

Omp85 Outer membrane protein family 85<br />

ORF Open Reading Frame<br />

PAML Phylogenetic Analysis by Maximum Likelihood<br />

PAUP Phylogenetic Analysis Using Parsimony<br />

PCR Polymerase chain reaction<br />

PHYML Phylogeny by Maximum Likelihood<br />

PP Posterior Probability<br />

RAPD Ran<strong>do</strong>mly Amplified Polymorphic DNA<br />

rDNA ribosomal DNA gene<br />

rRNA ribosomal RNA<br />

RFLP Restriction Fragment Length Polymorphism<br />

RLM1 Resistance to Lethality of MKK1P386 overexpression gene<br />

RPB1 RNA polymerase II largest subunit<br />

RPB2 RNA polymerase II second largest subunit<br />

SAM SRF-Arg80-Mcm1<br />

S-H Shimodaira-Hasegawa suppot<br />

SMP1 Second MEF2-like Protein<br />

xviii


SRF Serum Response Factor<br />

TS Thymidylate Synthase gene<br />

Ty elements Transponsable element of yeast<br />

UMC1 Ustilago MCM1 homologue<br />

UPGMA Unweighted Pair Group Method with Aritmetic Mean<br />

WGD Whole Genome Duplication<br />

xix


FIGURE LIST<br />

Figure 1. Reconstruction of the Tree of Life based on transition analysis(…) ................ 7<br />

Figure 2. Reconstruction of the eukaryote evolutionary tree(…) .................................. 10<br />

Figure 3. Diversity of fungi(…) ..................................................................................... 14<br />

Figure 4. Phylogeny of the king<strong>do</strong>m Fungi(…) ............................................................. 19<br />

Figure 5. Schematic representation of Type I (SRF-like) and Type II (MEF2-like)<br />

protein <strong>do</strong>mains in plant, animal and fungi(…) .............................................................. 38<br />

Figure 6. Structure of Type I (SRF) and Type II (MEF2A) MADS <strong>do</strong>mains in complex<br />

with their DNA target (Homo sapiens)(…) .................................................................... 39<br />

Figure 7. GPI anchor structure(…) ................................................................................ 40<br />

Figure 8. Fungal phylogeny based on Rlm1 protein sequences(…) .............................. 56<br />

Figure 9. Fungal phylogeny based on RLM1 DNA sequences(…) ............................... 58<br />

Figure 10. Fungal phylogeny based on Mcm1 protein sequences(…) .......................... 62<br />

Figure 11. Phylogeny of the subphylum Saccharomycotina by using the RLM1(…) ... 70<br />

Figure 12. Phylogeny of the subphylum Saccharomycotina by using the MCM1(…) . 71<br />

Figure 13. Phylogeny of the subphylum Saccharomycotina inferred by using<br />

concatenated alignment of RLM1-MCM1(…) ............................................................... 72<br />

Figure 14. Phylogenetic network reconstructed using a concatenated alignment of<br />

RLM1-MCM1 in the subphylum Saccharomycotina(…) ............................................... 73<br />

Figure 15. Phylogeny of CUG Group based on IFF8(…) ............................................. 75<br />

Figure 16. Subphylum Saccharomycotina phylogeny reconstructed using concatenated<br />

alignment of RLM1-MCM1-IFF8(…) ............................................................................ 75<br />

Figure 17. Subphylum Saccharomycotina phylogenetic network reconstructed using a<br />

concatenated alignment of RLM1-MCM1-IFF8(…) ...................................................... 76<br />

xxi


Figure 18. Conserved regions in RLM1 gene of the subphylum Saccharomycotina:<br />

MADS-box, Acidic region, A-B conserved regions present in this subphylum(…) ...... 78<br />

Figure 19. Region of microsatellites present in some species from subphylum<br />

Saccharomycotina(…) ..................................................................................................... 79<br />

Figure 20. Saccharomyces cerevisiae amino acid sequence showing sites under positive<br />

selection by the MEC model(…) .................................................................................... 81<br />

Figure 21. The three RLM1 open reading frames of Candida albicans, Kluyveromyces<br />

lactis and Saccharomyces cerevisiae visualized with the Virtual Ribosome 1.0(…) ..... 83<br />

Figure 22. (A) Evolution of MADS-box genes based on Alvarez-Buylla et al. (2000)<br />

model, proposing a common origin for Animals, Fungi and Plants and a further<br />

duplication event in the Fungi lineage in some species within the subphylum<br />

Saccharomycotina (WGD group). (B) Comparison of Rlm1 protein/gene <strong>do</strong>mains in S.<br />

cerevisiae and C. albicans, showing the frameshift mutation ocurred after genome<br />

duplication in S. cerevisisae abd WGD species that changed the coding amino acid from<br />

Q to N .............................................................................................................................. 85<br />

Figure 23. Predicted putative ancestral RLM1 gene sequence from which CUG, WGD<br />

and Non-WGD groups diverged(…) ............................................................................... 87<br />

xxii


TABLE LIST<br />

Table 1. Evolution in the classification of living organisms throughout time. ................ 4<br />

Table 2. The six king<strong>do</strong>ms of life .................................................................................... 5<br />

Table 3. Phyla of the king<strong>do</strong>m Bacteria ........................................................................... 8<br />

xxiii


INTRODUCTION


1. INTRODUCTION<br />

1.1 Classification of organisms<br />

INTRODUCTION<br />

The classification of living organisms has always been a controversial subject.<br />

Attempts aiming at classifying living organisms were performed from the dawn of<br />

recorded history, however early classification systems were artificial and based<br />

primarily on habit and/or characteristics important to humans (i.e., medicines, food).<br />

The earliest known system of classification, from which current systems of<br />

classifying forms of life descend, comes from the thought presented by the Greek<br />

philosopher and scientist Aristotle (384-322 BC), who published in his metaphysical and<br />

logical works the first known classification of everything whatsoever, or "being". This<br />

was the scheme that gave moderns such words as substance, species and genus and was<br />

retained in a modified and less general form by Linnaeus. Aristotle‘s system classified<br />

all living organisms known at that time as either a plant or an animal on the basis of<br />

movement (air, land, or water), feeding mechanism, growth patterns, mode of<br />

reproduction and possession or lack of red blood cells. Microscopic organisms were<br />

unknown.<br />

In 1735, Swedish naturalist Carolus Linnaeus formalized the use of two latin names<br />

to identify each organism, a system called binomial nomenclature, in his great book the<br />

Systema Naturae that ran through twelve editions during his lifetime. In his work,<br />

nature was divided into three king<strong>do</strong>ms: mineral, vegetable and animal (Table1).<br />

Linnaeus grouped closely related organisms and introduced the modern classification<br />

groups: class, order, family, genus, and species.<br />

The german biologist Ernst Haeckel (1866) proposed a third king<strong>do</strong>m, Protista, to<br />

include all single-celled organisms. Some taxonomists also placed simple multicellular<br />

organisms, such as seaweeds, in the king<strong>do</strong>m Protista. Bacteria were also placed within<br />

Protista as a separate group called Monera.<br />

In 1938, American biologist Herbert Copeland was responsible for proposing the<br />

group Monera as a fourth king<strong>do</strong>m, which included only bacteria, separating it from<br />

king<strong>do</strong>m Protista. This was the first classification proposal to separate organisms<br />

3


4<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

without nuclei, the prokaryotes, from organisms with nuclei, called eukaryotes, at the<br />

king<strong>do</strong>m level.<br />

Table 1. Evolution in the Classification of Living Organisms throughout time.<br />

Linnaeus<br />

1735<br />

2<br />

King<strong>do</strong>ms<br />

Haeckel<br />

1866<br />

3<br />

King<strong>do</strong>ms<br />

Copeland<br />

1938<br />

4<br />

King<strong>do</strong>ms<br />

In 1969, American biologist Robert Whittaker proposed a fifth king<strong>do</strong>m, Fungi,<br />

based on fungi's unique structures and methods of obtaining food. Fungi <strong>do</strong> not ingest<br />

food as animals <strong>do</strong>, nor <strong>do</strong> they make their own food, as plants <strong>do</strong>; rather, they secrete<br />

digestive enzymes around their food and then absorb it into their cells.<br />

Woese and Fox (1977), divided the king<strong>do</strong>m Monera in two new king<strong>do</strong>ms:<br />

Eubacteria and Archeobacteria. Then, Carl Woese et al. (1990) proposed a new<br />

category, called Domain, to reflect evidence from nucleic acid studies that more<br />

precisely reveal evolutionary or family relationships. He suggested three <strong>do</strong>mains,<br />

Archaea, Bacteria, and Eukarya, based largely on studies of the small subunit rRNA.<br />

Recently, the system of classification has been modified well supported by<br />

molecular and cellular studies due to zoologist Thomas Cavalier-Smith (1981) with the<br />

proposal of a sixth king<strong>do</strong>m of life: the Chromista. In 2004, Chromista was accepted as<br />

king<strong>do</strong>m and the term Domain was changed to Empire, with the proposal of two:<br />

Prokaryota and Eukaryota (Table 2).<br />

Whittaker<br />

1969<br />

5<br />

King<strong>do</strong>ms<br />

Woese and Fox<br />

1977<br />

6<br />

King<strong>do</strong>ms<br />

Woese et<br />

al. 1990<br />

3 Domains<br />

Cavalier-<br />

Smith 2004<br />

2 Empires<br />

6<br />

King<strong>do</strong>ms<br />

Vegetabilia Protista Monera Monera Eubacteria Bacteria Prokaryota<br />

Archaebacteria Archea Bacteria<br />

Protista Protista Protista Eukarya Eukaryota<br />

Animalia Plantae Protozoa<br />

Plantae<br />

Fungi Fungi Chromista<br />

Animalia Plantae Plantae Fungi<br />

Animalia Animalia Animalia Plantae<br />

Animalia


Table 2. The six king<strong>do</strong>ms of life (Cavalier-Smith, 2004)<br />

1.2 Tree of Life<br />

INTRODUCTION<br />

The new insights coming from molecular sequence studies and their integration with<br />

numerous other lines of evidence, genetic, structural and biochemical have allowed an<br />

exact classification of organisms. Based on these facts, researchers are reconstructing<br />

the tree of life, and trying to place correctly the root of the evolutionary tree of all life<br />

what would enable us to deduce rigorously the major characteristics of the last common<br />

ancestor of all. It is probably the most difficult problem of all in phylogenetics, not yet<br />

solved and the most important to solve correctly because the result colours all<br />

interpretations of evolutionary history, influencing ideas of which features are primitive<br />

or derived and which branches are deeper and more ancient than others (Cavalier-Smith,<br />

2002a; Gribal<strong>do</strong> and Philippe, 2002).<br />

During a long time, researchers proposed that the root of the tree of life came<br />

from ancestral species of Archaebacteria, giving origin to other groups and ultimately to<br />

the formation of superior multicellular organisms. Other attempts of rooting the tree of<br />

life used protein paralogue trees that theoretically placed the root, but these results were<br />

contradictory due to tree-reconstruction artefacts or to poor resolution (Cavalier-Smith,<br />

2002a). Studies based on ribosome-related and DNA-handling enzymes suggested one<br />

root between Neomura (Eukaryotes plus Archaebacteria) and Eubacteria (Iwabe et al.,<br />

1989), whereas metabolic enzymes place the root within Eubacteria but in contradictory<br />

places (Peretó et al., 2004). Palaeontology shows that eubacteria are much more ancient<br />

5


6<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

than Eukaryotes (Cavalier-Smith, 1987; Douzery et al., 2004; Roger and Hug, 2006),<br />

with phylogenetic evidence that archaebacteria are sisters not ancestral to Eukaryotes,<br />

implying that the root is not within the Neomura.<br />

The first fully resolved prokaryote tree was inferred, by using 13 major<br />

biological transitions within eubacteria (Figure 1 and Table 3).This tree showed a basal<br />

stem comprising the new infraking<strong>do</strong>m Gli<strong>do</strong>bacteria (Chlorobacteria, Ha<strong>do</strong>bacteria,<br />

Cyanobacteria), which is entirely non-flagellate, and two derived branches comprising<br />

the infraking<strong>do</strong>m Gracilicutes and the subking<strong>do</strong>m Unibacteria that diverged<br />

immediately following the origin of flagella (Cavalier-Smith, 2006a).<br />

Proteasome evolution, a proteic structure responsible for degradating unneeded<br />

or damaged protein, indicated that the universal root would be outside the clade<br />

comprising Neomura and Actinomycetales (proteates), and thus lying within other<br />

eubacteria, contrary to the widespread assumption that it was between Eubacteria and<br />

neomura. Cell wall and flagellar evolution independently located the root outside<br />

Posibacteria (Actinobacteria and En<strong>do</strong>bacteria), and thus among Negibacteria, with two<br />

membranes, favouring a transition from de Negibacteria to Posibacteria and not from<br />

Posibacteria to Negibacteria as traditionally assumed. Posibacteria were derived from<br />

Eurybacteria and ancestral to Neomura.<br />

Studies based on the comparison of new amino acid insertions in the RNA<br />

polymerase gene strongly favor the monophyly of Gracilicutes which contain<br />

Proteobacteria, Planctobacteria, Sphingobacteria and Spirochaetes (Iyer et al., 2004).<br />

Evolution of the negibacterial outer membrane places the root within Eobacteria<br />

(Ha<strong>do</strong>bacteria and Chlorobacteria both primitive without lipopolysaccharide), as all<br />

phyla possesses the outer membrane β-barrel protein Omp85, and Chlorobacteria, the<br />

only Negibacteria without Omp85, or within Chlorobacteria (Cavalier-Smith, 2006a).<br />

A negibacterial root also fits the fossil record, which shows that Negibacteria are<br />

more than twice as old as Eukaryotes (Cavalier-Smith, 2002a; Cavalier-Smith, 2006b).<br />

Posibacteria, Archaebacteria and eukaryotes were probably all ancestrally heterotrophs,<br />

whereas Negibacteria are likely to have been ancestrally photosynthetic and diversified<br />

by evolving all the known types of photosystems and major antenna pigments.


INTRODUCTION<br />

Figure 1. Reconstruction of the Tree of Life based on biological transitions analysis. Thumbnail sketches<br />

show major variants in cell morphology (microtubular skeleton red, pepti<strong>do</strong>gglycan wall<br />

brown, outer membrane blue). The most likely root position is as shown; the possibility that it<br />

may lie within Chlorobacteria instead cannot yet be ruled out. Lowest level groups including or<br />

consisting enterily of photosynthetic organisms are in green or purple. The frequently<br />

misplaced hyperthermophilic eubacteria are in red (Adapted from Cavalier-Smith, 2006a).<br />

7


8<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Table 3. Phyla of the king<strong>do</strong>m Bacteria (Cavalier-Smith, 2006a). *Classification proposed by author.<br />

The last ancestor of all life forms would have been a eubacterium with acyl-ester<br />

membrane lipids, large genome, murein pepti<strong>do</strong>glycan walls, and with a fully developed<br />

eubacterial biology and cell division. It would be a non-flagellate negibacterium with<br />

two membranes, probably a photosynthetic green non-sulphur bacterium with relatively<br />

primitive secretory machinery, not a heterotrophic posibacterium with one membrane.<br />

As Negibacteria are the only prokaryotes that use sunlight to fix carbon dioxide this is<br />

also the only position that would have allowed the first ecosystems to have been based<br />

on photosynthesis, without which extensive evolution might have been impossible<br />

(Cavalier-Smith, 2002a; 2006a).


INTRODUCTION<br />

Ancestral neomura would have had their ancestral origin in an Arabobacteria<br />

(Actinobacteria). This ancestral would have been a thermophilic bacterium with<br />

cotranslational protein secretion, N-linked glicosylation, glycoprotein instead of murein,<br />

histones ¾ instead of DNA gyrase. This neomuran ancestor diverged sharply into two<br />

contrasting lineages. One formed a glycoprotein wall and became hyperthermophilic,<br />

evolving prenyl ether lipids and losing many eubacterial genes, as H1 histones, to form<br />

the Archaebacteria. The other became much more radically changed by using its<br />

glycoproteins as a flexible surface coat, evolving phagotrophy, an en<strong>do</strong>membrane<br />

system, en<strong>do</strong>skeleton, be uniciliate, unicentriolar aerobic zooflagellate and nucleus and<br />

enslaving an a-proteobacterium as a protomitochondrion to become the first eukaryote<br />

(Cavalier-Smith, 2000; 2002b).<br />

Eukaryotes comprise the basal king<strong>do</strong>m Protozoa and four derived king<strong>do</strong>ms:<br />

Animalia, Fungi, Plantae, and Chromista (Cavalier-Smith, 1998). Some thoughtful 19th<br />

century protozoologists suspected that flagellates preceded amoebae, as just asserted.<br />

But until DNA sequencing and molecular systematics burgeoned in the 1980s and<br />

1990s the often more popular idea of Haeckel that amoebae came first and flagellates<br />

were more advanced could not be disproved. However, it is now known that the primary<br />

diversification of eukaryote cells took place among zooflagellates with non-<br />

photosynthetic predatory cells having one or more flagella for swimming, and often also<br />

generating water currents for pulling in preys (Cavalier-Smith, 2006), from which two<br />

groups diverged, the bikonts and the unikonts that incorporate all eukaryotic king<strong>do</strong>ms<br />

(Figure 2).<br />

Plantae, Chromista, three protozoan infraking<strong>do</strong>ms (Alveolata, Rhizaria,<br />

Excavata), and the small zooflagellate phylum Apusozoa are all clearly ancestrally<br />

biciliate and together constitute a clade designated the bikonts (Cavalier-Smith, 2002b).<br />

Ancestral bikonts had two diverging cilia: the anterior undulated for swimming or prey<br />

attraction, and the posterior for gliding on surfaces in many groups (likely its ancestral<br />

condition), but secondarily adapted for swimming in other groups (or quite often was<br />

lost to make secondary uniciliates, not to be confused with genuine unikonts, which<br />

confusingly are sometimes secondarily biciliate). All major eukaryote groups except<br />

Ciliophora included severely modified organisms that lost cilia; some became highly<br />

amoeboid. The major shared derived character for all groups (not yet clearly<br />

9


10<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

demonstrated for Rhizaria) is the ciliary transformation in which the anterior<br />

cilium/centriole and its associated roots are always the first formed during the cellular<br />

division by bipartition; in the next cell cycle the cillium often undergo marked changes<br />

in structure and function to become the corresponding posterior organelles (Moestrup,<br />

2000).<br />

Figure 2. Reconstruction of the eukaryote evolutionary tree. Taxa in black comprise the basal king<strong>do</strong>m<br />

Protozoa. The thumbnail sketches show the contrasting microtubule skeletons of unikonts and<br />

bikonts in red. The dates highlighted in yellow are for the most ancient fossils known for each<br />

major group. Major innovations that help group higher taxa are shown by bars. Four major cell<br />

enslavements to form cellular chimaeras are shown by heavy arrows (enslaved bacteria) or<br />

dashed arrows (enslaved eukaryote algae) (Cavalier-Smith, 2006c).<br />

An independent derived character for bikonts is the fusion between thymidylate<br />

synthase (TS) and dihydrofolate reductase (DHFR) genes to encode a single<br />

bifunctional chimaeric protein. This fusion appears to have taken place in the common<br />

ancestor of the ancestral bikont after their invertion in the cenancestral eukaryote<br />

(Stechmann and Cavalier-Smith, 2002; 2003).<br />

Based on these facts, a new branching for the bikont group was proposed, in<br />

which the zooflagellate Phylum Apusozoa may be a clade sister of the Excavata. In an<br />

another group, it was observed that the clade chromalveolates was formed by the<br />

king<strong>do</strong>m Chromista and the protozoan Infraking<strong>do</strong>m Alveolata, which diverged from a<br />

common ancestor that enslaved a red algae and evolved novel plastid protein-targeting


INTRODUCTION<br />

machinery via the host rough en<strong>do</strong>plasmatic reticule (ER) and the enslaved algal plasma<br />

membrane (periplastid membrane). king<strong>do</strong>m Plantae, whose ancestor enslaved a<br />

cyanobacteria to form chloroplasts, may be sister of an ancestral to chromalveolates;<br />

Rhizaria and Excavata (jointly Cabozoa) are probably sisters if the formerly green algal<br />

plastid of euglenoids and chloarachneans (Cercozoa) was enslaved in a single event<br />

named symbiogenesis secondary in their common ancestor (Cavalier-Smith, 2002c;<br />

2003).<br />

The TS and DHFR genes are separately translated in Sarcomastigota, Animalia<br />

and Fungi. These three taxa are referred to as unikonts because it is argued that their<br />

common ancestor probably had only a single centriole and cilium per kinetid (Cavalier-<br />

Smith, 2002b). This unikont state is found in most distinct amoebozoan. On the other<br />

hand, unikonts share the same myosin II as in our muscles, being absent from all<br />

bikonts (Cavalier-Smith, 2006c). Unikont is argued to be the ancestral state for<br />

Amoebozoa (Cavalier-Smith, 2002b; Cavalier-Smith et al., 2004) as only myxogastrids<br />

and former protostelids arguably related to them have bicentriolar kinetids. The<br />

bicentriolar state of myxogastrids and many Choanozoa is developmentally different<br />

from that of bikonts, as their anterior cilium remains anterior in successive cell cycles<br />

and <strong>do</strong>es not transform into a posterior one. Thus, it is arguably not homologous to the<br />

bikont state with true ciliary transformation. A second gene fusion involving the first<br />

three enzymes of pyrimidine biosynthesis (carbamoyl phosphate synthase,<br />

dihydrorotase and aspartate carbamoyltransferase) is apparently a shared derived<br />

character for unikonts, absent from bikonts and the ancestral prokaryotes (Stechmann<br />

and Cavalier-Smith, 2003). As this involved two simultaneous fusion events, it is even<br />

less likely to ever have been reversed than the DHFR/TS fusion. If none of these gene<br />

fusions has ever been reversed during evolution, then they together indicate that the root<br />

of the eukaryote tree cannot lie within unikonts or bikonts, but must lie between these<br />

two clades (Stechmann and Cavalier-Smith, 2003).<br />

Earlier structural and molecular evidence would support the idea that Animalia,<br />

Choanozoa and Fungi together form a clade, characterized ancestrally by a single<br />

posterior cilium with a bicentriolar kinetid and flat mitochondrial cristae (Cavalier-<br />

Smith, 1987b). An exclusive grouping of animals and fungi was first noted in<br />

evolutionary trees of small subunit ribosomal RNA analysis (Wainright et al., 1993) and<br />

11


12<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

proteins such as Ef-1α, Act1 and α-β tubulin (Baldauf and Palmer, 1993). This grouping<br />

was subsequently designated as ‗‗Opisthokont‘‘ (Cavalier-Smith and Chao, 1995) and is<br />

now strongly supported by phylogenetic analyses of all taxonomically well-represented<br />

and well-characterized molecular data sets. These include concatenated multigene<br />

analyses (Baldauf et al., 2000; Bapteste et al. 2001; Lang et al., 2002) as well as many<br />

single-gene trees (Baldauf, 1999; Inagaki and Doolittle, 2000; Van de Peer et al., 2000).<br />

Recent molecular phylogenetic evidence has indicated that animals and fungi evolved<br />

independently from different unicellular protozoan choanozoan ancestors (Steenkamp et<br />

al., 2006; Shalchian-Tabrizi et al., 2008).<br />

The other clade that forms part of the Opisthokonts is the Phylum Choanozoa.<br />

Extant Choanozoa are either a clade that is sister to Animalia or a paraphyletic group<br />

ancestral to them (Cavalier-Smith and Chao, 2003; Rokas et al., 2003), and within the<br />

Choanozoa only the Nucleariid appears to be the closest sister taxon to Fungi<br />

(Steeankamp et al., 2006). A sister relationship between animals and choanoflagellates<br />

at least is supported by the probable gene fusion that generated the receptor tyrosine<br />

kinase from a cytoplasmic kinase and calcium-binding epidermal growth factor in their<br />

common ancestor after it diverged from Amoebozoa (King and Carroll, 2001). The<br />

Opisthokont clade is also supported by a shared derived 11–17 amino acid insertion in<br />

protein synthesis elongation factor EF-1α absent from prokaryotes and all other<br />

eukaryotes (Baldauf, 1999).<br />

1.3 King<strong>do</strong>m Fungi<br />

Fungi share a long history with human civilization. References in Greek literature,<br />

mushroom stones from Mesoamérica dating back to 1000-300 B.C. (Lowy, 1971), and<br />

dried mushrooms of Piptoporus betulinus found in a pouch around a Stone Age man´s<br />

neck in the Alps (Rensberger, 1992) all attest to this long relationship. These make up<br />

one of the major clades of life. Roughly 80000 species of fungi have been described, but<br />

the actual number of species has been estimated at approximately 1.5 million<br />

(Hawksworth, 1991; 2001). This number may yet underestimate the true magnitude of<br />

fungal diversity (Persoh and Rambold, 2003).<br />

For a long time, fungi were regarded as a single king<strong>do</strong>m belonging to the<br />

aforementioned five-king<strong>do</strong>m scheme. However, the organisms usually considered to be


INTRODUCTION<br />

fungi are very complex and diverse. Their recognition, delimitation and typification<br />

have been formidable problems since the discovery of these organisms. They can<br />

present chitin or not, multicellular and filamentous absorptive forms and unicellular<br />

assimilative forms, among others, and can reproduce by different types of propagules or<br />

even by fission (Alexopoulos et al., 1996; Guarro et al., 1999) (Figure 3). Despite this<br />

great variability in form and function, systematic studies based upon morphology and<br />

life cycle produced, by the middle 20th century, a manageable number of major phyla;<br />

and the use of molecular approaches have actually determined seven phyla well defined<br />

within the king<strong>do</strong>m Fungi (Hibbet et al., 2007).<br />

Fungi play a critical role impacting nearly all other forms of life in virtually all<br />

ecosystems as either friend or foe. Saprotrophic fungi are important in the environment<br />

in the cycling of nutrients through the decomposition of organic material, especially the<br />

carbon that is sequestered in wood and other plant tissues. Mutualistic symbiont fungi<br />

through relationships with prokaryotes, plant (including algae) and animals have<br />

enabled a diversity of other organisms to exploit novel habitats and resources. Indeed,<br />

the establishment of mycorrhizal associations may be a key factor that enabled plants to<br />

make the transition from aquatic to terrestrial habitats. Other group are pathogenic and<br />

parasitic Fungi that attack virtually all groups of organisms, including bacteria, plants,<br />

other Fungi, and animals, including humans. The economic impact of such fungi is<br />

massive either beneficial by the production of antibiotics or extremely detrimental by<br />

the devastating impacts in plant diseases, mycotoxins and mycosis (Pirozynski and<br />

Malloch, 1975; Moss, 1987; Alexopoulos et al., 1996; Blackwell, 2000).<br />

13


14<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Figure 3. Diversity of fungi. Microsporidia (Nosema apis (A), Octosporea bayeri (B));<br />

Blastocladiomycota (Flammulina velutipes (C)); Neocallimastigoycota (Caecomyces<br />

communis (D)); Chytridiomycota (Batrachochytrium dendrobatidis (E)); ‗Zygomycota‘-<br />

Mucoromycotina (Phycomyces blakesleeanus (H), Mucor circinelloides (G));<br />

Glomeromycota (Glomus coronatum (J)); Basidiomycota (Agaricus xanthodermus (F),<br />

Cryptococcus neoformans (I), Ustilago maydis (K)); Ascomycota (Microsporum gypseum<br />

(L), Aspergillus niger (M), Candida albicans (N), Blastomyces dermatitidis (O)).


1.4 Fungal taxonomy<br />

INTRODUCTION<br />

Historically the monophyletic Fungi have been classified into four phyla:<br />

Chytridiomycota, Zygomycota, Basidiomycota, and Ascomycota (Barr, 1992;<br />

Hawksworth et al., 1995; Alexopoulos et al. 1996). However, current studies have<br />

shown that such a simple classification <strong>do</strong>es not represent the phylogeny of those<br />

organisms, being currently recognized the phyla Microsporidia, Blastocladiomycota,<br />

Neocallimastigomycota, Chytridiomycota, Glomeromycota, Basidiomycota and<br />

Ascomycota. Presently, the phylum named Zygomycota is still to define due to the<br />

pending resolution of relationships among the clades that have traditionally been placed<br />

within this phylum (Hibbet et al., 2007).<br />

The current fungal classification accepts one king<strong>do</strong>m, one subking<strong>do</strong>m, seven<br />

phyla, ten subphyla, 35 classes, 12 subclasses, and 129 orders. The clade containing the<br />

Ascomycota and Basidiomycota is classified as the subking<strong>do</strong>m Dikarya (James et al.,<br />

2006), reflecting the putative synapomorphy of dikaryotic hyphae (Tehler, 1988). All of<br />

the other new names are based on automatically typified teleomorphic names. In<br />

Basidiomycota, the clades formerly called Basidiomycetes, Urediniomycetes, and<br />

Ustilaginomycetes in the last edition of Ainsworth & Bisby‘s Dictionary of the Fungi<br />

are now called the Agaricomycotina, Pucciniomycotina, and Ustilaginomycotina,<br />

respectively (Bauer et al., 2006). This was <strong>do</strong>ne to minimize confusion between taxon<br />

names and informal terms (basidiomycetes is a commonly used informal term for all<br />

Basidiomycota) and to refer to the included genera Agaricus (including the cultivated<br />

button mushroom) and Puccinia (which includes barberry-wheat rust). Another<br />

significant change in the Basidiomycota classification has been the inclusion of the<br />

Wallemiomycetes and Entorrhizomycetes as classes incertae sedis within the phylum,<br />

reflecting ambiguity about their higher-level placements (Matheny et al., 2007a).<br />

The most dramatic changes in the classification concern the ‗basal fungal<br />

lineages‘, which include the taxa that have traditionally been placed in the Zygomycota<br />

and Chytridiomycota. These groups have long been recognized to be polyphyletic,<br />

based on analyses of rRNA, EF-1α, and RPB1 (James et al., 2000; Nagahama et al.,<br />

1995; Tanabe et al., 2004, 2005). Recent multilocus analyses now provide the sampling,<br />

resolution, and support necessary to structure new classifications of these early-<br />

diverging groups, although significant questions remain (Liu et al., 2006; Lutzoni et<br />

15


16<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

al.,2004; James et al., 2006). The Chytridiomycota is retained in a highly restricted<br />

sense, including Chytridiomycetes and Monoblephari<strong>do</strong>mycetes. The Blastocladiales, a<br />

traditional member of the Chytridiomycota, is now treated as a phylum, the<br />

Blastocladiomycota (James et al., 2007). The Neocallimastigales, whose distinctiveness<br />

from other chytrids has long been recognized, is also elevated to phylum, based on both<br />

morphology and molecular phylogeny (Hibbett et al., 2007). The genera<br />

Caulochytrium, Olpidium, and Rozella, which have traditionally been placed in the<br />

Chytridiomycota, and Basidiobolus, previously classified in the Zygomycota<br />

(Entomophthorales), are not included in any higher taxa in the current classification,<br />

pending more definitive resolutions of their placements. The phylum Zygomycota has<br />

not been accepted in the current classification, and its resolution is pending due to the<br />

relationships among the groups that were traditionally part of this phylum. The<br />

traditional Zygomycota were distributed among the phylum Glomeromycota and four<br />

subphyla incertae sedis, including Mucoromycotina, Kickxellomycotina,<br />

Zoopagomycotina and Entomophthoromycotina (Hibbett et al., 2007). Microsporidia,<br />

unicellular parasites of animals and protists with highly reduced mitochondria, has been<br />

included as a phylum of the Fungi, based on multigenic analyses (Germot et al., 1997;<br />

Hirt et al., 1997; Peyretaillade et al., 1998; Keeling et al., 2000; Gill and Fast, 2006;<br />

James et al., 2006; Liu et al., 2006).<br />

1.5. Fungal phylogeny<br />

Until recently, the fungal phylogenetics has been well served by phenotypic<br />

characters such as morphological comparison, cell wall composition, cytologic testing,<br />

ultrastructure, cellular metabolism and fossil records (Le`John, 1974; Taylor, 1978;<br />

Heath, 1986; Hawksworth et al., 1995). However, higher-level relationships amongst<br />

these groups are less certain and are best elucidated using molecular techniques. More<br />

recently, the advent of cladistic and molecular approaches has changed the existing<br />

situation and provided new insights into fungal evolution. Genotypic analysis offers a<br />

straightforward means of resolving evolutionary questions where phenotypic characters<br />

mentioned are absent or contradictory.<br />

Many loci have been used by mycologists for evolutionary studies at single-gene<br />

level, but few of these loci revealed to be appropriated to resolve relationships among<br />

the main lineages of the Fungi. Even when trees are inferred by using multiple loci, the


INTRODUCTION<br />

phylogenetic signal may be limited strongly by the loci selected. Phylogenetic studies<br />

indicate that more than 83.9% of fungal phylogenies are based exclusively on sequences<br />

from the ribosomal RNA tandem repeats. The few protein-coding genes that have been<br />

sequenced for phylogenetic studies of fungi have demonstrated clearly that such genes<br />

can contribute greatly to resolving deep phylogenetic relationships with high support<br />

and/or increase support for topologies inferred by using ribosomal DNA genes (Liu et<br />

al., 1999). Studies using other protein-coding genes such as DNA topoisomerase II,<br />

Actina-1 (ACT1), first large subunit of RNA polymerase (RPB1), Citochrome oxidase 2<br />

(COX2) and Citochrome oxidase B (Cyt B) combined with traditional genes used in<br />

phylogeny are already being used to infer fungal phylogeny (Kurtzman and Robnett,<br />

2003; Diezmman et al., 2004; Lutzoni et al., 2004; Reeb et al. 2004; Wang et al., 2004;<br />

Liu et al., 2006; Matheny et al., 2007b).<br />

Until 2004, genome databases revealed that 21075 ITS, 7990 nucSSU, 5373<br />

nucLSU, 1991 mitSSU, and 349 RPB2 sequences were available and the number is<br />

continuosly increasing (Lutzoni et al., 2004). As impressive as these numbers is the<br />

collective effort to generate DNA sequence data for the Fungi but none of these loci<br />

alone can resolve the fungal tree of life with a satisfactory level of phylogenetic<br />

confidence (Kurtzman and Robnett, 1998; Tehler et al., 2000; Berbee, 2001; Binder and<br />

Hibbett, 2002; Moncalvo et al., 2002; Tehler et al., 2003). Combining sequence data<br />

from multiple loci is an integral part of large scale phylogenetic inference and is central<br />

to assembling the fungal tree of life. However, most published phylogenetic studies<br />

have restricted their analyses to one locus maximizing the number of fungal taxa. To<br />

quantify this observation, 560 publications reporting published fungal phylogenetic<br />

trees were surveyed from 1990 through 2003. Of the 595 trees considered in those<br />

studies, 489 (82.2%) were based on a single locus, only 77 trees were based on two<br />

combined loci, 19 on three combined loci, and 10 on four or more combined loci. Seven<br />

of the latter 10 studies were restricted to closely related species or strains within a<br />

species (Lutzoni et al.; 2004). Exceptions included studies with 93 species representing<br />

most major clades of Homobasidiomycetes (Binder and Hibbett, 2002), with 15 species<br />

representing 10 orders (Binder et al., 2001), and with 45 species representing nine<br />

orders (Hibbett and Binder, 2001).<br />

17


18<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

The largest phylogenetic tree based on one locus included 1551 nucSSU<br />

sequences representing 60 orders (Tehler et al., 2003). The largest multilocus trees<br />

included 162 ITS + ß-tubulin sequences representing a single order of Fungi (Stenroos<br />

et al., 2002); 158 species representing 10 orders based on nucSSU, nucLSU, and<br />

mitSSU (Hibbett et al., 2000); 110 species in a single order sequenced for ITS and<br />

nucLSU (Peterson, 2000); and 108 nucSSU + nucLSU sequences representing 19 orders<br />

of Fungi (Miadlikowska and Lutzoni, 2004). A more recent phylogenetic study directed<br />

toward resolving the fungal tree of life based on six combined loci, included members<br />

from all four traditionally recognized phyla of Fungi (Ascomycota, Basidiomycota,<br />

Chytridiomycota, and Zygomycota), and two new phyla Glomeromycota and<br />

Microsporidia (James et al., 2006).<br />

Although much effort has been invested in defining orders (compared to<br />

families, for example), few studies have focused on resolving relationships among<br />

orders of Fungi: 354 of the 595 trees examined (59.5%) conveyed relationships within<br />

single orders. The largest number of orders considered in a single study (N=62) resulted<br />

in a tree based only on nucSSU data (Tehler et al., 2000). The fungal trees based on<br />

combined data from multiple loci and encompassing the largest number of orders<br />

included 38 species representing 25 orders (Bhattacharya et al., 2000), 52 species<br />

representing 20 orders (Lutzoni et al., 2001), and 108 species representing 19 orders<br />

(Miadlikowska and Lutzoni, 2004). All of these studies focused on ascomycetes and<br />

were based on nucSSU and nucLSU rDNA. An exceptional study covering 16 orders of<br />

fungi (34 species) using a combined analysis of two protein-coding genes (α and ß-<br />

tubulin) was carried out to infer the phylogenetic placement of Microsporidia within<br />

king<strong>do</strong>m Fungi (Keeling, 2003). The major fungal phylogenomic study based on 42<br />

complete genomes was carried out by using all conserved regions of genes, resolving<br />

deep phylogenetic relationship of the majority of known taxa. The major drawback of<br />

this study is the low number of sequenced fungal genomes being an obstacle to resolve<br />

basal taxa in some clades (Fitzpatrick et al., 2006).<br />

Figure 4 shows a fungal phylogeny based on genetic and geologic data, which<br />

present the five main major groups of fungal organisms that have their genomes<br />

sequenced or in progress, but uses the previous taxonomic denomination. In this tree, it<br />

is evident that the Ascomycetes and Basidiomycetes are sister groups (subking<strong>do</strong>m


INTRODUCTION<br />

Dykaria), and that there is a close relationship of Glomeromycetes to the subking<strong>do</strong>m<br />

Dykaria, which diverged about 600 myrs ago. The basal groups in the king<strong>do</strong>m Fungi<br />

are Chytrids and Zygomycetes. Currently, studies for determining the correct placement<br />

of species in these groups are being carried out, since some of these produced paraphyly<br />

(Chytrids) or poliphyly (Zygomycetes) in the fungal phylogeny. Due to this result, the<br />

species from Zygomycetes have been reclassified (Hibbet et al., 2007).<br />

Figure 4. Phylogeny of the king<strong>do</strong>m Fungi. Major fungal groups colored as indicated by text at the right.<br />

Diamonds indicate evolutionary branch points, and their approximate dating (bottom), captured<br />

by fungal genomes sequenced in or in progress (modified Berbee and Taylor, 2001).<br />

1.6. Fungal identification<br />

The identification of fungi to species level and also the distinction of<br />

morphologically identical strains of the same species has been a challenge for the<br />

mycologists. Hence, most species have received only limited study, and the<br />

classification has been mainly traditional and based on readily observable<br />

morphological features. Other features besides morphology, such as susceptibility to<br />

chemicals and antifungal drugs, physiological and biochemical tests, production of<br />

secondary metabolites, ubiquinone systems, fatty acid composition, cell wall<br />

composition, protein composition and molecular approaches, have been used in<br />

classification and also in identification of fungal species (Guarro et al., 1999).<br />

19


20<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

1.6.1. Phenotypic methods<br />

The classification and identification of fungi using phenotypic methods relies<br />

mainly on morphological criteria. The classical light microscopic methods have been<br />

enhanced by Nomarski differential interference contrast, fluorescence, cytochemistry,<br />

and the development of new staining techniques such as those for ascus apical structures<br />

(Romero and Winter, 1988). Unfortunately, during infections most pathogenic fungi<br />

show only the vegetative phase (absence of sporulation), in host tissue only hyphal<br />

elements or other nonspecific structures are observed. Although the pigmentation and<br />

shape of these hyphae and the presence or absence of septa can give us an idea of their<br />

identity, fungal culture is required for accurate identification. For the identification and<br />

classification of these fungi, the type of conidia and conidiogenesis are considered the<br />

most important sets of characteristics to be observed (Cole and Samson, 1979; Guarro et<br />

al., 1999). In the rare instances that opportunistic fungi develop the teleomorph in vitro<br />

(this happens in numerous species of Ascomycota and in a few species of Zygomycota<br />

and Basidiomycota), there are many morphological details associated with sexual<br />

sporulation which can be extremely useful in their classification. The type of fruiting<br />

body and type of ascus (structure that contain ascospores produced after meiosis) are<br />

vital for classification. Shape, color, and the presence of an apical opening (ostiole) in<br />

the fruiting bodies are important features in the recognition of higher taxa (Guarro et al.,<br />

1999). In recent years, morphological techniques have been influenced by modern<br />

procedures, which allow more reliable phenotypic studies to be performed. Numerical<br />

taxonomy, effective statistical packages, and the application of computer facilities to the<br />

development of identification keys offer some solutions and the possibility of a<br />

renaissance of morphological studies.<br />

Besides the use of morphological identification, other approaches have been<br />

proposed to identify fungi, which are: a) Biotyping, which is based on the use of<br />

physiological and biochemical test such as the commercially avaliable kits, API and<br />

VITEK systems, and selective media as CHROMOAGAR have been used to identify<br />

unicellular fungi (San-Millan et al., 1996; Bernal et al., 1998; Graf et al., 2000; Ballesté<br />

et al., 2005), b) Serotyping, a method based on aglutination reaction between antigens<br />

and policlonal antibodies, used for pathogenic yeast typing as Candida spp,<br />

Cryptococcus neoformans at the strain level (Evans et al., 1950; Haesenclever and<br />

Mitchell, 1961), c) Production of secondary metabolites, which consists in the


INTRODUCTION<br />

identication of fungi by the metabolites that they produce (Carlile and Watkinson, 1994;<br />

Frisvad, 1994; Whaley and Edwards, 1995), d) Cell wall composition, based on studies<br />

of structure and composition of the cell wall of fungi, delimiting high taxa (Bartnicki-<br />

Garcia, 1987; Carlile and Watkinson, 1994), and e) Isoenzymatic analysis, that enable<br />

the differentiation between species by the electrophoretic pattern of specific enzyme<br />

activity (Busch and Nitscko, 1999; Soll, 2000). All of these techniques have presented<br />

limitations in the identification of fungal species, being some of them restricted to some<br />

species or fungal group. Hence, the use of techniques that included studies at the<br />

genomic level may resolve these disadvantages.<br />

1.6.2. Genotypic methods<br />

Methods based on studies directly in DNA are within the genotypic methods. The<br />

electrophoretic karyotyping is a method, which use pulsed field gel electrophoresis to<br />

assess variations of chromosome sizes and the precise determination of karyotypes,<br />

allowing the discrimination at the strain and species level. The electrophoretic<br />

karyotyping consist in the migration of chromosomes under the influence of electric<br />

fields of alternating orientation to move intact chromosomes according to size through<br />

the agarose gel matrix, which can be visualized by ethidium bromide staining (Guarro et<br />

al., 1999; Taylor et al., 1999). Electrophoretic karyotyping has been used to fingerprint<br />

different fungal species as Candida spp, Micromucor spp, Paracoccidioides<br />

brasiliensis, Pyrenophora spp (Dib et al., 1996; Nogueira et al., 1998; Aragona et al.,<br />

2000; Klempp-Selb et al., 2000; Shin et al., 2001, Nagy et al., 2004).<br />

One of the first DNA fingerprinting methods used to assess strain relatedness and<br />

taxonomy in fungi was restriction fragment analysis, or restriction fragment length<br />

polymorphism (RFLP) comparisons, without probe hybridization. DNA is extracted,<br />

digested with one or more restriction en<strong>do</strong>nucleases to sample short pieces of DNA<br />

sequence. Then, the fragments are separated by electrophoresis in an agarose gel. The<br />

banding pattern of digested DNA is then visualized, usually by staining with ethidium<br />

bromide. The obtained band pattern is based on different fragment lengths determined<br />

by the restriction sites identified by the particular en<strong>do</strong>nuclease employed (Taylor et al.,<br />

1999; Soll, 2000). Any region of DNA can be used for RFLP analysis if variation<br />

(alleles) can be visualized, directly due to multiple copies (mtDNA, rDNA) or indirectly<br />

either by hybridization with a probe or by PCR amplification. RFLPs were the first<br />

21


22<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

DNA markers used for fungal evolutionary biology. The band patterns can be tabulated<br />

and compared, and phenetic trees constructed (Edel et al., 1996). Critics of RFLP<br />

analysis note that while restriction sites in different individuals are most likely to be<br />

identical by descent that is not the case for missing sites, because it is easier to lose a<br />

site than to gain one. There are many ways in which a restriction en<strong>do</strong>nuclease site can<br />

be lost: any of the several nucleotides in the recognition sequence can be substituted, or<br />

the site can suffer length mutations. Missing sites that have arisen, unknowingly, by<br />

different routes confound evolutionary analysis (Taylor et al., 1999).<br />

Ran<strong>do</strong>mly amplified polymorphic DNA (RAPD) analysis or arbitrarily primed PCR<br />

(AP-PCR) analysis is similar to RFLP analysis in that it assays DNA sequence variation<br />

in short regions, but instead of analyzing restriction en<strong>do</strong>nuclease recognition<br />

sequences, it focuses on PCR priming regions (Williams et al., 1990). Using ran<strong>do</strong>m<br />

primers of approximately 10 bases and a low annealing temperature, amplicons<br />

throughout the genome are targeted and amplified. Amplified products are separated on<br />

an agarose gel and stained with ethidium bromide. When a single ran<strong>do</strong>m<br />

oligonucleotide primer is used in a reaction, it hybridizes to homologous sequences in<br />

the genome and if the primer hybridizes to sequences on alternative DNA strands within<br />

roughly 3 kb, the DNA region between the two hybridization sites will be amplified.<br />

Criticisms of RAPDs initially focused on reproducibility. RAPD analysis succeeds<br />

because just one nucleotide substitution can allow or prevent priming, and so it is not<br />

surprising that small differences in any aspect will have the same effect of PCR profile.<br />

Even if there were no problems with repeatability, there would still be the concern that<br />

bands of equal electrophoretic mobility may not be homologous and the related concern<br />

that missing bands may not be homologous because they can be lost by several possible<br />

nucleotide substitutions in either PCR priming site as well as by length mutations. There<br />

is also the problem of <strong>do</strong>minant and null alleles; in haploid organisms both the<br />

<strong>do</strong>minant (presence) and null (absence) alleles can be scored, but in diploids it is not<br />

possible to distinguish genotypes that are homozygous for the <strong>do</strong>minant allele from<br />

those that are heterozygous. It is also tempting to score more than one variable band per<br />

RAPD reaction, although they may not be independent (Taylor et al., 1999). Thus,<br />

RAPD has not been found useful for evolutionary studies. Because of the repetitive<br />

character of the target sequences, genetic distances calculated from RAPD could be


INTRODUCTION<br />

affected by parology, namely, recombination and duplication events not parallel with<br />

speciation events (Melo et al., 1998).<br />

Amplified fragment length polymorphism (AFLP) analysis is a relatively new<br />

technique which has a discriminatory power that makes it suitable for identification as<br />

well as for strain typing. In short, in AFLP analysis genomic DNA is digested with two<br />

restriction enzymes (EcoRI and MseI) and <strong>do</strong>uble-stranded oligonucleotide adapters are<br />

ligated to the fragments. These adapters serve as targets for the primers during PCR<br />

amplification. To increase the specificity, it is possible to elongate the primers at their<br />

3`ends with one to three selective nucleotides. One of the primers is labeled with a<br />

fluorescent dye (Vos et al 1995). The polymorphism revealed by amplified fragment<br />

length polymorphism (AFLP) analysis depends on restriction en<strong>do</strong>nuclease site<br />

differences, just like RFLP. The variable fragments (loci) have two alleles (positive and<br />

null), and so the same concerns raised about null alleles with RFLP and RAPD analysis<br />

apply (Taylor et al., 1999). The advantage of AFLP analysis is that only a limited<br />

amount of DNA is needed since the fragments are PCR amplified producing more than<br />

with RFLP. Furthermore, since stringent annealing temperatures are used during<br />

amplification, the technique is more reproducible and robust than other methods such as<br />

RAPD analysis (Savelkoul et al., 1999). The value and potential of the AFLP technique<br />

in differentiation and identification from strains and/or species and the similarity of<br />

genetic analysis has been demonstrated in yeast ecology and evolutionary studies<br />

(Barros Lopes et al., 1999).<br />

Sequencing is the best method for measuring phylogenetic relationships and<br />

discriminating between species and strains through comparison of the DNA sequences<br />

of a variety of noncoding and coding regions. The underlying assumption is that the<br />

rates at which mutations accumulate in gene regions reflect evolutionary clocks.<br />

Because particular noncoding and coding regions may be highly resistant to change and<br />

therefore may have very slow evolutionary clocks while other coding regions, such as<br />

those of genes involved in pathogenesis, may be far less resistant to change and<br />

therefore may have very fast evolutionary clocks. The studies of rRNA genes were<br />

based on the assumption that they were less prone to biases resulting from selective<br />

pressure (Karl and Avise, 1993). Until recently, cloning and sequencing genes were<br />

slow, technically demanding and expensive, that seemed beyond the technical capacity<br />

23


24<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

of most medical mycology laboratories. However, the emergence of PCR, automated<br />

DNA-sequencing technologies, and gene data banks have made these studies easier<br />

(Soll, 2000). The sequencing of several genes both, nuclear and mithocondrial, is being<br />

used to infer phylogeny and discriminate fungal species and strains.<br />

In the first sequencing studies, noncoding regions have been selected to infer<br />

phylogeny, using the small subunit ribosomal DNA sequences from eukaryotes and<br />

prokaryotes (Woese and Fox, 1977), allowing the reconstruction of the first tree of life.<br />

Intergenic Transcribed Spacer (ITS) regions are mainly used in fungal taxonomy and<br />

phylogeny, but to resolve the phylogenetic placement of some fungal species that<br />

present an uncertain position new genes are being used as nuclear large and small<br />

subunit ribosomal DNA (nucLSU and nucSSU), mithocondrial large and small<br />

ribosomal DNA (mitLSU and mitSSU), second large subunit of RNA polymerase<br />

(RPB2), α and β tubulin and elongation factor 1 (EF-1), clarifying the divergence of<br />

species (Lutzoni et al., 2004; Bridge et al., 2005; James et al., 2006). Currently, the<br />

sequencing is being performed at the genome level what has enabled to explain better<br />

the phylogenetic relationships in fungi (Fitzpatrick et al., 2006).<br />

1.7. Methods for phylogenetic inference<br />

Previously, the attempt of inferring the phylogeny was based in the identification of<br />

common morphological characters amongst species. Though these data enabled us to<br />

group close species, it did not always resolve deep phylogenetic relationships especially<br />

in organisms with particular characters like fungi that were for long time related with<br />

plants. Currently, the methods for inferring phylogenetic relationships between species<br />

are using amino acid or nucleotide sequences, being these data more reliable because<br />

we can obtain information that is not possible to visualize by morphological analysis,<br />

such as presence and difference among homologue genes, mutations, and duplicated<br />

genes, for instance.<br />

In phylogenetic studies the first step is the single-gene or multigenic homologue<br />

sequence alignment. This procediment will allow the identification of regions with high<br />

homologies and how close the sequences are by using both nucleotides and amino acid<br />

sequences. Several programs have been proposed for alignment of sequences, being the<br />

most used the CLUSTAL series of programs. The obtained alignment is imported into a


INTRODUCTION<br />

program of phylogenetic inference, which can use two primary approaches to<br />

phylogenetic tree construction: an algorithmic and a tree-searching. The first uses an<br />

algorithm to construct a tree from the data. The second constructs many trees, and then<br />

the best tree or best set of trees is selected. These two methods are also known as<br />

distance and character-based methods, respectively. Below, the two methods of<br />

alignment as well as methods of phylogenetic construction will be described.<br />

1.7.1. Multiple sequence alignment<br />

1.7.1.1. CLUSTAL series of programs<br />

The Clustal series of programs are widely used in molecular biology for the multiple<br />

alignments of both nucleic acid and protein sequences and for preparing phylogenetic<br />

trees. The first Clustal program was designed specifically to work efficiently on<br />

personal computers, which at that time, had feeble computing power by today‘s<br />

standards. It combined a memory-efficient dynamic programming algorithm with the<br />

progressive alignment strategy. The multiple alignments are built up progressively by a<br />

series of pairwise alignments, following the branching order in a guide tree. The initial<br />

pre-comparison used a rapid word-based alignment algorithm and the guide tree was<br />

constructed by using the Unweighted Pair-Group Method with Arithmetic Mean<br />

(UPGMA) method (Higgins and Sharp, 1988). In 1992, a new release was made, called<br />

ClustalV, which incorporated profile alignments (alignments of existing alignments)<br />

and the facility to generate trees from the multiple alignment by using the Neighbour<br />

Joining (NJ) method, and the user could also test the tree for robustness by using a<br />

simple bootstrap test of tree topology (Higgins et al., 1992). In 1994, the third<br />

generation of the series Clustal W was released. Clustal W incorporated position-<br />

specific gap penalties so that gap penalties can be lowered at hydrophilic residues and<br />

wherever gaps are already introduced into the alignment. The sequence pre-comparison<br />

in Clustal W uses more sensitive dynamic programming, which yields a much better<br />

dendrogram by means of NJ, which improves tree topology and provides a method for<br />

weighting sequences on the basis of their divergences (Thompson et al., 1994). After,<br />

Clustal X was developed, displaying many improvements. Within alignments,<br />

conserved columns are highlighted by using a colour scheme that the user can<br />

customize. Beneath the sequence alignment, Clustal X provides a plot of residue<br />

conservation. Quality-analysis tools that highlight misaligned regions are also available<br />

(Thompson et al., 1997). The popularity of the programs depends on a number of<br />

25


26<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

factors, including not only the accuracy of the results, but also the robustness,<br />

portability and user-friendliness of the programs.<br />

1.7.1.2. MUSCLE<br />

Multiple sequence comparison by log-expectation (MUSCLE) is a new<br />

computer program for creating multiple alignments of protein sequences or alternatively<br />

nucleotides. This program uses two distance measures for a pair of sequences: a kmer<br />

distance (for an unaligned pair) and the Kimura distance (for an aligned pair). A kmer is<br />

a contiguous subsequence of length k, also known as a word or k-tuple. Related<br />

sequences tend to have more kmers in common than expected by chance. The kmer<br />

distance is derived from the fraction of kmers in common in a compressed amino acid<br />

alphabet. This measure <strong>do</strong>es not require an alignment, giving a significant speed<br />

advantage. Given an aligned pair of sequences, it computes the pairwise identity and<br />

converts it to an additive distance estimate, applying the Kimura correction for multiple<br />

substitutions at a single site. Distance matrices are clustered using UPGMA, which give<br />

slightly improved results over neighbor-joining, despite the expectation that neighbor-<br />

joining will give a more reliable estimate of the evolutionary tree. This can be explained<br />

by assuming that in progressive alignment, the best accuracy is obtained at each node by<br />

aligning the two profiles that have fewest differences, even if they are not evolutionary<br />

neighbors. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of<br />

these sets. Without refinement, MUSCLE achieves average accuracy statistically<br />

indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods<br />

for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min<br />

on a current desktop computer (Edgar, 2004).<br />

1.7.2. Major methods for estimating phylogenetic trees and infer<br />

phylogeny<br />

1.7.2.1. Distance methods<br />

These methods convert aligned sequences into a distance matrix of pairwise<br />

differences (distances) between the sequences. The matrix is used as the data from<br />

which branching order and branch lengths are computed. Within the most used methods<br />

we have the NJ and the UPGMA.<br />

UPGMA is a straightforward example of a clustering method. It starts with grouping<br />

two taxa with the smallest distance between them according to the distance matrix.


INTRODUCTION<br />

Then, a new node is added in the midpoint of the two, and the two original taxa are put<br />

on the tree. The distance from the new node to other nodes will be the arithmetic<br />

average. Then, a reduced distance matrix is obtained by replacing two taxa with one<br />

new node. That process is repeated on the new matrix and reiterated until the matrix<br />

consists of a single entry. That set of matrices is then used to build up the tree by<br />

starting at the root and moving out to the first two nodes represented by the last two<br />

clusters. In fact, the outcome phylogenetic tree from UPGMA is not an arbitrary tree.<br />

Instead, it is a special type of tree, which satisfies the molecular clock property. That is,<br />

the total time travelling <strong>do</strong>wn a path to the leaves from any node is the same, regardless<br />

the choice of path. Thus, if the original phylogenetic tree from observation <strong>do</strong>es not<br />

have this property, it will not reconcile well with the tree from UPGMA. One necessary<br />

condition for producing a better phylogenetic tree is the ultrametric condition, which<br />

requires that any triangle must be at least equilateral with regard to the weights between<br />

them, what is very unlikely. For that and other reasons, UPGMA is considered a bad<br />

algorithm for construction of phylogenetic trees, since it relies on the rates of evolution<br />

among different lineages to be approximately equal (Sneath and Sokal, 1973; Hall,<br />

2001).<br />

Neighbor Joining is similar to UPGMA in that it manipulates a distance matrix,<br />

reducing it in size at each step, and reconstructing the tree from that series of matrices.<br />

It differs from UPGMA in that it <strong>do</strong>es not construct clusters but directly calculates<br />

distances to internal nodes. In this method, the total sum of the individual distances is<br />

not computed for all or many different topologies, but the examination of different<br />

topologies is imbedded in the algorithm, so that only one final tree is produced. From<br />

the original matrix, NJ starts first with a star phylogeny. The next step is to calculate for<br />

each taxon its net divergence from all other taxa as the sum of the individual distances<br />

from the taxon. It then uses that net divergence to calculate a corrected distance matrix.<br />

NJ then finds the pair of taxa with the lowest corrected distance and calculates the<br />

distance from each of those taxa to the node that joins them, so that we can identify a<br />

pair of neighbors. Once this pair is identified, they are combined as a single unit and<br />

treated as a single sequence in the next step. This process is continued until all<br />

multifurcating nodes are resolved into bifurcating ones. NJ <strong>do</strong>es not assume that all taxa<br />

are equidistant from a root. Unlike UPGMA, NJ method produces unrooted trees. One<br />

of the disadvantages of the NJ method is that it generates only one tree and <strong>do</strong>es not test<br />

27


28<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

other possible tree topologies. This can cause problems because in many cases in the<br />

initial set up of NJ, there might be multiple equally close pair of neighbors to join,<br />

leading to multiple trees. Since there is no way the algorithm would know which one is<br />

the most optimal tree, choosing a wrong option may produce a suboptimal tree. To<br />

overcome this problem, a generalized NJ method has been developed in which multiple<br />

trees with different initial taxon groupings are generated (Saitou and Nei, 1987; Nei,<br />

1996; Hall, 2001).<br />

1.7.2.2. Character-based methods<br />

These methods are characterized by using multiple alignments directly for<br />

comparison of characters within each column (each site) in the alignment. Three main<br />

methods are applied in phylogeny: maximum parsimony (MP), maximum likelihood<br />

(ML) and Bayesian inference (BI).<br />

The MP method, a given set of nucleotide (or amino acid) sequences are<br />

considered, and the nucleotides (or amino acids) of ancestral sequences for a<br />

hypothetical topology are inferred under the assumption that mutational changes occur<br />

in all directions among the four different nucleotides or the 20 amino acids. The<br />

smallest number of nucleotide substitutions that explain the entire evolutionary process<br />

for the given topology is then computed. This computation is <strong>do</strong>ne for all other<br />

topologies, and the topology that requires the smallest number of substitutions is chosen<br />

as the best tree. If there are no multiple substitutions at each site, MP is expected to<br />

generate the correct topology as long as enough parsimony-informative sites are<br />

examined. The disadvantage is that in practice the nucleotide sequences are often<br />

subject to backward and parallel substitutions and this introduces uncertainties in<br />

phylogenetic inference. When the true tree has a special type of topology and branch<br />

lengths MP may generate an incorrect topology even if an infinite number of<br />

nucleotides are examined and this can happen even if the rate of nucleotide substitution<br />

is constant for all evolutionary lineages (Felsenstein, 1978; Takezaki and Nei, 1994).<br />

Furthermore, in parsimony analysis it is difficult to treat the phylogenetic inference in a<br />

statistical framework because there is no natural way to compute the means and<br />

variances of minimum numbers of substitutions obtained by the parsimony procedure.<br />

However, under certain circumstances, MP is quite efficient in obtaining the correct<br />

topology. It should also be noted that MP is the only method that can easily take care of


INTRODUCTION<br />

insertions and deletions of nucleotides, which sometimes give important phylogenetic<br />

information (Nei, 1996).<br />

Cavalli-Sforza and Edwards (1967) were the first to propose the idea of using a<br />

ML method for phylogenetic inference from gene frequency data, but they encountered<br />

a number of problems when implementing this method. Later, considering nucleotide<br />

sequence data, an algorithm was developed for constructing a phylogenetic tree by the<br />

ML method (Felsenstein, 1981). In the case of amino acid sequence data, the likelihood<br />

method was proposed by using an empirical transition matrix for 20 different amino<br />

acids (Kishino et al., 1990). Later, this method was extended by using various transition<br />

matrices for nuclear and mitochondrial proteins (Adachi and Hasegawa, 1995; 1996).<br />

The first step is the selection of the substitution model of interest that gives the<br />

instantaneous rates at which each of the four possible nucleotides (or 20 amino acids)<br />

changes to each of the other three possible nucleotides (or 19 amino acids). Alignments<br />

of homologous DNA and amino acid sequences can be examined under a wide range of<br />

evolutive models (JC69, K80, F81, F84, HKY85, TN93 and GTR for nucleotides, and<br />

Dayhoff, JTT, mtREV, WAG and DCMut for amino acids) (Guin<strong>do</strong>n et al., 2005). ML<br />

looks for the tree that, under some model of evolution, maximizes the likelihood of<br />

observing the data that is expressed as log-likelihood. ML program seeks the tree with<br />

the largest log likelihood. The advantages of the ML method are that it allows users to<br />

specify the evolutionary model they want to use, and that the likelihood of the resulting<br />

tree is known. A disadvantage is that ML method is considerably slower than either<br />

Parsimony or NJ, and it is not difficult to exceed the capacity of even the most up-to-<br />

date desktop computer (Hall, 2001).<br />

In the BI, inferences of phylogenies are based upon the posterior probability<br />

(PP) of phylogenetic trees. Bayesian analysis of phylogenies is similar to ML in that the<br />

user postulates a model of evolution and the program searches for the best trees that are<br />

consistent with both the model and the data (the alignment). It differs somewhat from<br />

ML in that while ML seeks the tree that maximizes the probability of observing the data<br />

given by that tree, bayesian analysis seeks the tree that maximizes the probability of the<br />

tree given the data and model for evolution. Unlike ML, that seeks the single most<br />

likely tree, Bayesian analysis searches for the best set of trees. MrBayes program uses<br />

the Metropolis-coupled Markov Chain Monte Carlo (MCMCMC) method, which works<br />

29


30<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

proposing a new state (new tree) by using a substitution model, and then the acceptance<br />

probability for this state is calculated (Hasting, 1970; Green, 1995). A ran<strong>do</strong>m number<br />

between 0 and 1 is drawn, and if that number is less than the calculated probability, the<br />

new state is accepted; otherwise the state remains the same. The proposed new state<br />

involves moving a branch and/or changing length of a branch to create a modified tree.<br />

If that new tree is more likely than the existing tree, given the model and the data, it is<br />

more likely to be accepted. This process of proposing and accepting/rejecting new state<br />

is repeated many thousands or millions of times (Hall, 2001; Huelsenbeck and<br />

Ronquist, 2001). Although BI present computational efficiency with respect to the other<br />

character-based methods some questions are unsolved as convergence of Markov chain,<br />

discrepancy between Bayesian probabilities and nonparametric bootstrap test values<br />

(Huelsenbeck et al., 2002).<br />

1.8. Adaptive evolution<br />

Most amino acid substitutions are deleterious and are eventually eliminated from the<br />

population by purifying selection. Some amino acid substitutions are neutral and pure<br />

chance determines whether they will be fixed into the population. In rare instances, an<br />

amino acid substitution is beneficial and is fixed into the population by positive<br />

selection. At the nucleotide level, some base substitutions cause amino acid<br />

replacements and others are silent. The nucleotide position within a co<strong>do</strong>n affects<br />

whether an amino acid replacement takes place when the nucleotide changes. Most<br />

mutations at the first position in a co<strong>do</strong>n result in amino acid replacements; all<br />

mutations at second positions are replacement mutations, but because of the redundancy<br />

of the genetic code, many third position mutations are silent. The number of non-<br />

synonymous changes per non-synonymous site is called dN, and the number of<br />

synonymous changes per synonymous is called dS. When the sequences are compared a<br />

dN/dS ratio1 is taken as evidence of<br />

positive selection. To identify evidence of positive or purifying selection, studies of<br />

adaptive evolution normalize the number of non-synonymous and synonymous<br />

mutations to the number of non-synonymous and synonymous sites in the gene.<br />

Several methods have been proposed to estimate nonsynonymous and synonymous<br />

rate of substitution by comparison of nucleotides and amino acid sequences. The first


INTRODUCTION<br />

methods were based on estimating the numbers of synonymous and nonsynonymous<br />

substitutions from comparison of two coding DNA sequences (Miyata and Yasunaga,<br />

1980; Li et al., 1985; Nei and Gojobori, Li, 1993; Pamilo and Bianchi, 1993). The<br />

method of Nei and Gojobori (1986) is the simplest and performs a co<strong>do</strong>n-by-co<strong>do</strong>n<br />

comparison. The model of Jukes and Cantor (1969) is applied to correct multiple<br />

substitutions that have occurred at one site, and it normally produces results very similar<br />

to those obtained by Nei-Gojobori method.<br />

Methods based on a likelihood approach were proposed for determining positive<br />

selection by comparison of multiple sequences, whose models of co<strong>do</strong>n substitution<br />

assume a single ω=dN/dS for all lineages and sites and has been extended to account for<br />

variation of ω, either among lineages or among sites (Goldman and Yang, 1994; Muse<br />

and Gaut, 1994). The lineage-specific models allow for variable ωs among lineages and<br />

are thus suitable for detecting positive selection along lineages. They assume no<br />

variation in ω among sites. As a result, they detect positive selection for a lineage only<br />

if the average dN over all sites is higher than average dS (Muse and Gaut, 1994; Yang,<br />

1998; Yang and Nielsen, 1998). The specific-site models allow the ω ratio to vary<br />

among sites but not among lineages. Positive selection is detected at individual sites<br />

only if the average dN over all lineages is higher than average dS (Nielsen and Yang,<br />

1998; Yang et al., 2000). The model branch-site allows the ω ratio to vary both among<br />

sites and among lineages. This model assumes that positive selection occurs at a few<br />

time points and affects a few amino acids (Yang and Nielsen, 2002).<br />

The co<strong>do</strong>n-based models, based on likelihood approach, include parameters for<br />

transition-transversion bias and for co<strong>do</strong>n frequencies. In the case of the Goldman and<br />

Yang (1994) model it also includes the different replacement probabilities between<br />

amino acids based on the Grantham (1974) physicochemical distance matrix, and<br />

Bayesian models that assume a prior distribution dN/dS ratio (Nielsen and Yang, 1998;<br />

Yang et al., 2000). All methods are based on theoretical assumptions and ignore the<br />

empirical observations that distinct amino acids differ in their replacement rates.<br />

Recently, a co<strong>do</strong>n-based model, named mechanistic-empirical model (MEC), was<br />

published what takes into account all the parameters mentioned above and the<br />

assimilation of empirical amino acid replacement probabilities into the co<strong>do</strong>n-<br />

substitution matrix (Doron-Faigenboim and Pupko, 2007).<br />

31


THESIS BACKGROUND AND OBJECTIVES


2. THESIS BACKGROUND AND OBJECTIVES<br />

THESIS BACKGROUND AND OBJECTIVES<br />

Throughout the last decade, yeasts belonging to the genus Candida have emerged as<br />

major opportunistic pathogens broadly distributed in the nature, being frequently<br />

present as innocuous commensal of the skin, mouth, vagina, and pharynx as well as of<br />

the urinary and gastrointestinal tracts in humans and other warm-blooded animals<br />

(Coleman et al., 1998; Pedrós, 2003). These yeasts present small cells about 4 a 6 um<br />

diameter, hyphaes and pseu<strong>do</strong>hyphae and reproduce majorly by budding. As a relatively<br />

harmless commensal organism C. albicans exists in about 50% of the human population<br />

(Odds, 1988; Calderone, 2002).<br />

Candida species produce infections denomined candidiasis. These infections can<br />

be superficial form particularly in the mucosas (epitelium and en<strong>do</strong>telium), or<br />

generalized causing systemic candidiasis also designated as candidemia (Gudlaugsson<br />

et al., 2003). These infections have increased over the last two decades owing to the<br />

increase of immunocompromised patients due to hematological disorders, transplants,<br />

chemotherapy, AIDS, among others. They are also frequent in the extremes of age such<br />

as in the elderly and in neonates, particularly the underweight, and in women of<br />

childbearing age (López Martínez et al., 1984; Musial et al., 1988; Horowitz et al.,<br />

1992; Ball et al., 2004; Estrella et al., 2000; Clark et al., 2004; Chen et al., 2005;<br />

Tavanti et al., 2005a). As opportunistic they can present pathogenic character under<br />

determined conditions, becoming an important cause of infection (Coleman et al., 1998;<br />

Pedrós, 2003). The infections caused by yeasts from the genera Candida became the<br />

fourth cause of invasive infections in United States and the eighth in Europe (Jarvis,<br />

1995; Fluit et al., 2000).<br />

Approximately 200 Candida species are known, of which nearly 20 species of<br />

Candida have been identified as etiologic agents of infections, but the list of medically<br />

important yeasts continues to grow (Calderone, 2002). Although Candida albicans has<br />

been, in the past, the most common causative organism of fungemia and disseminated<br />

candidiasis, other Candida species are becoming increasingly more prevalent. A<br />

worldwide study of bloodstream infections from 1997 to 1999 showed that at least 45%<br />

of yeast infections were caused by other Candida species than C. albicans (Pfaller et al.,<br />

2001). The most commonly isolated species, apart from C. albicans, are C. parapsilosis<br />

(20 to 40% of all reported episodes of candidemia), C. tropicalis (10 to 30%), C. krusei<br />

35


36<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

(10 to 35%), C. glabrata (5 to 40%), C. guilliermondii (2 to 10%) and C. lusitaniae (up<br />

to 8%) (Sandven 2000, Krcmery and Barnes, 2002). The emergence of these non-<br />

albicans species is associated with inherent or acquired resistance to fluconazole in up<br />

to 75% of all C. krusei infections and 35% of all C. glabrata infections; conversely, at<br />

present C. lusitaniae remains highly susceptible to azole antifungal agents but it is<br />

resistant to Amphotericine B (Wingard et al., 1991; Krcmery and Barnes, 2002; Pfaller<br />

et al., 2003). C. dubliniensis has emerged as an opportunistic pathogen closely related to<br />

C. albicans. It shares many diagnostic characteristics with C. albicans but differs with<br />

respect to its epidemiology, virulence, and the ability to develop fluconazole resistance<br />

(Sullivan and Coleman, 1997; Gutierrez et al., 2002). The incidence of C. parapsilosis<br />

is associated with handling central venous catheters and its ability ofgrowing as biofilm<br />

on implanted medical devices (Baillie and Douglas, 1998; Kuhn et al., 2004). C.<br />

tropicalis is related to patients with cancer, particularly due to its increased invasiveness<br />

when the gastrointestinal mucosa is damaged and the intestinal flora is suppressed by<br />

pharmacological treatments (Kullberg and Lashof, 2002).<br />

Other new species have been appearing as C. rugosa, C. famata, C. africana, C.<br />

bracarensis, C. fabianii, C. ciferri, C. membranaefaciens, C. metapsilosis, C.<br />

orthopsilosis, C. pseu<strong>do</strong>haemulonii and C. pseu<strong>do</strong>rugosa, affecting mainly<br />

immunocompromised patients, taking advantage of host weakness to pass from<br />

commensal to infectious. (Hazen, 1995; Tietz et al., 2001; García-Martos et al., 2004;<br />

Fanci and Pecile, 2005; Tavanti et al., 2005b; Correia et al., 2006; Li et al., 2006;<br />

Sugita et al., 2006; Valenza et al., 2006).<br />

All Candida species belong to the subphylum Saccharomycotina, which is<br />

included in the phylum Ascomycota. Both mithocondrial and nuclear genes have been<br />

used to construct the phylogeny of the subphylum Saccharomycotina, making use of<br />

one gene or using multigenic analysis (Barns et al., 1991; Bowman et al., 1992;<br />

Diezmann et al., 2004; Lutzoni et al., 2004, Lopandic et al., 2005; James et al., 2006).<br />

Others have carried out phylogenomic analysis, with very good results, but some<br />

species still need to resolve their correct place in the phylogeny due to the lack of other<br />

supporting sequenced genomes (Robbertse et al., 2006; Fitzpatrick et al., 2006).


THESIS BACKGROUND AND OBJECTIVES<br />

The search of new genes and the evaluation of their usefulness in phylogenetic<br />

studies are very important to understand the phylogenetic relationships and their<br />

potential uses in fungal molecular systematics. Genes that are considered good<br />

candidates for use in phylogeny present specific features as vertical transmission,<br />

ubiquity and slowly evolving sites. Within these new genes we can consider MADS-box<br />

transcription factors and GPI-anchor proteins as potential candidates since they present<br />

all the above characteristics. The MADS-box genes encode a eukaryotic family of<br />

transcriptional regulators involved in diverse and important biological functions and the<br />

GPI-anchor proteins participate in the interaction with the environment.<br />

The MADS-box proteins have been identified in yeasts, plants, insects,<br />

nematodes, lower vertebrates and mammals. These proteins are characterized by<br />

containing a conserved DNA binding and dimerization <strong>do</strong>main named the MADS-box<br />

(Schwarz-Sommer et al., 1990) after the five founding members of the family: Mcm1<br />

(yeast) (Passmore et al., 1989), Arg80 (yeast) (Dubois et al., 1987) or Agamous (plant)<br />

(Yanofsky et al., 1990), Deficiens (plant) (Sommer et al., 1990) and SRF (human)<br />

(Norman et al., 1988). In animal and fungi, two distinct types of MADS-box genes have<br />

been identified, the SRF-like (type I) and MEF2-like (type II) classes (Shore and<br />

Sharrocks, 1995). These MEF2-like amino acid sequences are more closely related to<br />

most plant MADS-<strong>do</strong>main sequences, suggesting that at least one gene-duplication<br />

event occurred before the divergence of plants and animals (Alvarez-Buylla et al.,<br />

2000). Thus, both type I and type II MADS-box proteins can be found in each king<strong>do</strong>m.<br />

The organization of type I and type II MADS-box proteins is represented in Figure 5.<br />

Type I and type II proteins can be further classified in subfamilies on the basis of shared<br />

sequence similarity between their C-terminal extensions and the highly conserved<br />

MADS <strong>do</strong>main. These <strong>do</strong>mains are referred to as SAM in fungi and animal SRF-like<br />

proteins, as MEF2 in fungi and animal MEF2-like proteins and as I, K and C in plant<br />

type II proteins (Mueller and Nordheim, 1991; Sharrocks et al., 1993; Riechmann et al.,<br />

1996; Riechmann and Meyerowitz, 1997). MADS-box family members generally<br />

recognize AT rich consensus sequences, with a highly conserved core of 10 bp:<br />

CC(A/T)6GG is the binding site of SRF-like proteins known as CArG box (Treisman,<br />

1990), and CTA(A/T)4TAG is the binding site of MEF2-like proteins (Figure 6)<br />

(Pollock and Treisman, 1991).<br />

37


38<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Figure 5. Schematic representation of Type I (SRF-like) and Type II (MEF2-like) protein <strong>do</strong>mains in<br />

plant, animal, and fungi. The scale indicates the number of amino acids along the protein. Plant<br />

Type II-like proteins have carboxyl-terminal <strong>do</strong>mains that go beyond 200 amino acids. In plant<br />

Type I-like proteins the ―?‖ indicates carboxyl-terminal <strong>do</strong>mains not well defined yet and of<br />

variable lengths (Based on Alvarez-Buylla et al., 2000).<br />

In the yeast Saccharomyces cerevisiae four MADS-box proteins have been<br />

found: Mcm1 and Arg80 which are related to the human SRF, and Rlm1 and Smp1<br />

which belong to the MEF2-like family (Alvarez-Buylla et al., 2000). Rlm1 controls the<br />

expression of genes required for cell wall integrity, while Smp1 is involved in osmotic<br />

stress response mediated by the Hog1 signal transduction pathway (Watanabe et al.,<br />

1995; Do<strong>do</strong>u and Treisman, 1997; Nadal Ed et al., 2003). Arg80 is only required for<br />

repression of arginine anabolic genes and induction of arginine catabolic genes<br />

(Messenguy and Dubois, 2000), whereas Mcm1, also involved in the control of arginine<br />

metabolism, playing a pleiotropic role in the cell (Messenguy and Dubois, 1993). Mcm1<br />

is essential for cell viability and controls M/G1 and G2/M cell-cycle-dependent<br />

transcription (Althoefer et al., 1995; McInerny et al., 1997), mating (Jarvis et al., 1989),<br />

minichromosome maintenance (Passmore et al., 1988), recombination (Elble and Tye,<br />

1992), transcription of TY elements (Errede, 1993; Yu and Fassler, 1993) and<br />

osmotolerance (Kuo et al., 1997).


THESIS BACKGROUND AND OBJECTIVES<br />

Figure 6. Structure of Type I (SRF) and Type II (MEF2A) MADS <strong>do</strong>mains in complex with their DNA<br />

target (Homo sapiens). (a) Aligned sequences of SRF and MEF2A MADS <strong>do</strong>mains with α and<br />

β secondary structure assignments. (b) The SRF– core DNA complex with one monomer in red<br />

and the other monomer in green (taken from Pellegrini et al., 1995). DNA is in grey, with<br />

upper and lower DNA strands labelled W and C. (c) SRF DNA sequence (18 bp) with the<br />

palindromic CArG recognition element boxed. (d) MEF2A–DNA complex with one monomer<br />

in red and the other monomer in green (adapted from Santelli and Richmond, 2000). DNA is in<br />

grey. (e) MEF DNA sequence (17 bp) in the crystal structure with MEF2 consensus site boxed<br />

(taken from Santelli and Richmond, 2000).<br />

Other MADS-box proteins type I have also been studied, for instance in<br />

Schizosaccharomyces pombe, MAP1 gene encodes a protein required for cell-type-<br />

specific gene expression, in Ustilago maydis, UMC1 encodes a protein that regulates the<br />

expression of pheromone-inducible genes (Yabana and Yamamoto, 1996; Kruger et al.,<br />

1997) and in C. albicans, CaMCM1 is crucial for morphogenesis (Rottmann et al.,<br />

2003). In the case of MADS-box type II, putative RLM1 orthologues were identified in<br />

C. albicans, Paracoccidioides brasiliensis and Aspergillus niger (Damveld et al., 2005;<br />

Fernandes et al., 2005; Bruno et al., 2006).<br />

GPI-anchor proteins (GPiPs) contain a C-terminal signal sequence that allows<br />

the linkage to a glycosylphosphatidylinositol (GPI) anchor. Proteins destined to be GPI<br />

anchored share conserved features: an N-terminal signal sequence for localization to the<br />

39


40<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

en<strong>do</strong>plasmic reticulum (ER), a C-terminal hydrophobic <strong>do</strong>main (9 to 24 residues) for<br />

transient attachment to the ER membrane, and the so-called ω site, where the protein is<br />

cleaved to be ligated to the GPI anchor (Thomas et al., 1990). The ω site is localized 9<br />

to 10 amino acids before the C-terminal hydrophobic <strong>do</strong>main, and its amino acid<br />

environment determines whether the protein has a high probability to be GPI anchored.<br />

Amino acids at positions ω-1 to ω-11 form a linker region, usually with no charge and<br />

no secondary structure (α-helix or β-sheet). The ω site region is composed of small<br />

amino acids to fit the transamidase protease catalytic pocket, and finally, the spacer<br />

region (ω + 3 to ω + 9) is comparable to the linker, having no charge and being flexible<br />

(Eisenhaber et al., 2004). Shortly after protein synthesis in the ER, the preformed GPI<br />

anchor replaces the C-terminal transmembrane region (Figure 7).<br />

Figure 7. GPI anchor structure. The structure of core GPI anchor which consists of a lipid group (serving<br />

as the membrane anchor), a myoinositol group, an N-acetylglucosamine group, three<br />

mannose groups, and a phosphoethanol phosphoethanolamine group, which ultimately<br />

connects the GPI anchor to the protein via an amide linkage represented in black (Tiede et<br />

al., 1999). The addition of mannose groups and the positioning of side chains like<br />

phosphoethanolamine on the GPI anchor contributes to the variety of GPI anchors identified<br />

so far. The additional gray groups illustrate the side chains added to the GPI anchor in S.<br />

cerevisiae (side chains are specific to each organism).<br />

GPI anchoring is encountered in every eukaryotic cell from unicellular yeast<br />

cells to the highly specialized mammalian cells (Bowman et al., 2006; Ferguson, 1999;<br />

McConville and Menon, 2000). In S. cerevisiae, most are involved in cell wall<br />

compound biosynthesis, flocculation, protease activity, sporulation, and mating (Caro et<br />

al., 1997). These proteins have also been found in A. nidulans, C. albicans, C. glabrata,


THESIS BACKGROUND AND OBJECTIVES<br />

Neurospora crassa and Schizosacch. pombe (Caro et al., 1997; de Groot et al., 2003;<br />

Eisenhaber et al., 2003; Eisenhaber et al., 2004; Sundstrom, 2002; Weig et al., 2004). It<br />

is notable that S. cerevisiae and Schizosacch. pombe seem to possess a smaller number<br />

of GpiPs than C. albicans.<br />

IFF8 gene encodes a putative GPI-anchored protein of unknown function that<br />

forms part of the group of genes, also known as the IFF genes (for individual protein<br />

file family F). Analysis of the coding sequences in this family revealed gene products<br />

with a large N-terminal <strong>do</strong>main of 340 amino acids shared by each member of the<br />

family, with high homology. The proteins diverge strongly after this <strong>do</strong>main; both in<br />

size and sequence, including its C-terminal hydrophobic <strong>do</strong>main that allows for linkage<br />

to a glycosylphosphatidylinositol (GPI) anchor (Richard and Plaine, 2007).<br />

Objectives<br />

Currently, protein-coding genes are used to resolve deep level phylogenetic<br />

relationships through multigenic and phylogenomic analysis, but the majority of these<br />

analyses uses regions with similarity only among orthologue gene sequences and <strong>do</strong>es<br />

not take into account the other regions which could give us useful information to infer a<br />

more robust phylogeny. Hence, the main purpose of this thesis was to evaluate the use<br />

of the MADS-box transcription factors in fungi phylogeny and the use of IFF8 gene to<br />

help to resolve the phylogeny in the particular CUG group.<br />

Therefore, the specific aims of the present work were:<br />

* Search for MADS-box and IFF8 ortohologue genes in the king<strong>do</strong>m Fungi.<br />

* Analyze the potential use of these genes to infer phylogenetic relationships<br />

according to current proposal in the king<strong>do</strong>m Fungi.<br />

* Infer the phylogeny of Candida species based on concatenated alignment of the<br />

previously predicted genes.<br />

* Analyze the putative adaptive evolution of RLM1 gene in the subphylum<br />

Saccharomycotina, particularly in the Whole Genome Duplicate (WGD) group.<br />

* Identify the molecular mechanism that putatively changed RLM1 gene in the<br />

subphylum Saccharomycotina.<br />

41


MATERIAL AND METHODS


3. MATERIAL AND METHODS<br />

3.1. Phylogenetic analysis<br />

3.1.1. Data retrieval<br />

MATERIAL AND METHODS<br />

Sequences used in this study were obtained from several different databases (DB)<br />

and Rlm1 and Mcm1 protein and DNA sequences from Saccharomyces cerevisiae as<br />

well as the Iff8 protein and DNA sequences of Candida albicans were used as<br />

references. The S. cerevisiae Rlm1 and Mcm1 protein/gene sequences were obtained<br />

from the Genbank under accession numbers BAA09658/D63340 and<br />

CAA88409/Z48502, respectively. The Iff8 protein/gene sequences of C. albicans were<br />

obtained from Candida Database orf19.8201.<br />

To obtain both nucleotide and amino acid putative orthologue sequences for the 76<br />

fungal species, the protein sequences referred above were used as reference to carry out<br />

TBLASTN searches (Matrix BLOSUM62, cut-off value E > 10 -5 ) in the following<br />

databases: Candida DB (http://www.candidagenome.org), Saccharomyces DB<br />

(http://www.yeastgenome.org), Broad Institute (http://www.broad.mit.edu), Joint<br />

Genome Institute (JGI, http://www.jgi.<strong>do</strong>e.gov), Genolevures (http://cbi.labri.fr), The<br />

Institute for Genomic Research (TIGR, http://www.tigr.org), Washington University St.<br />

Louis (WUSTL, http://www.wustl.edu), National Center for Biotechnology Information<br />

(NCBI, http://www.ncbi.nlm.nih.gov), Sanger Institute (http://www.sanger.ac.uk),<br />

Baylor College of Medicine (http://www.hgsc.bcm.edu), Oklahoma University<br />

(http://www.genome.ou.edu) and Murcia University (http://mucorgen.um.es). To avoid<br />

errors in the following alignments, if partial orthologue sequences were obtained in this<br />

search they were completed with the information from the closest species. Animals and<br />

plants MADS box sequences were used as outgroup.<br />

Both outgroup protein/gene sequences were retrieved from Genbank database under<br />

the following accession numbers, Type II: Homo sapiens MEF2D<br />

NP_005911/NM_005920, Xenopus laevis MEF2 AAI12917/BC112916, Danio rerio<br />

MEF2D AAH98522/BC098522, Drosophila melanogaster DMEF2<br />

NP_477021/NM_057673, Caenorhabditis elegans CEMEF2 NP_492441/NM_060040,<br />

Arabi<strong>do</strong>psis thaliana APETALA1 NP_177074/NM_105581 and Oryza sativa<br />

AGAMOUS ABG21913/DP000011; Type I: Homo sapiens SRF<br />

NP_003122/NM_003131, Xenopus laevis SRF NP_001095218/NM_001101748, Danio<br />

45


46<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

rerio MEF2D NP_001103996/NM_001110526, Drosophila melanogaster DSRF<br />

NP_726438/NM_166667, Caenorhabditis elegans NP_492226/NM_059895,<br />

Arabi<strong>do</strong>psis thaliana AGL36 NP_850880/NM_180549 and Oryza sativa AGL35<br />

BAD81343/AP002070. No outgroup was used in Iff8 protein and gene analyses.<br />

3.1.2 Sequence alignments<br />

The sequences obtained were aligned by using MUSCLE program (Edgar, 2004),<br />

and then imported into MEGA 4.0 software (Tamura et al., 2007) to be manually edited.<br />

The alignments were based on amino acids, therefore nucleotides were aligned in<br />

co<strong>do</strong>ns. As some parts of the C-terminal regions were too divergent to be confidently<br />

aligned, a preliminary NJ tree in Clustal W (Thompson et al., 1994) and UPGMA tree<br />

in MUSCLE was produced for each phylum, aligning the conserved C-terminal<br />

<strong>do</strong>mains. After this previous analysis, the alignment of the less-conserved C-terminal<br />

regions from different phyla became much easier and a new and final total alignment<br />

was produced.<br />

3.1.3. Selection of the best evolutionary model to infer phylogeny<br />

The model selection approach, Akaike Information Criterion (AIC), was used to<br />

estimate the best-fit protein evolutionary model for ML. The ProtTest 1.4 program was<br />

used and the best model was selected as the one that presented the greatest likelihood<br />

value, from the 48 models produced by the program (Abascal et al., 2005). The same<br />

approach was used to estimate the best-fit DNA evolutionary model for ML but by<br />

using the jModeltest 1.0 program (Posada, 2008) for likelihood calculations, the best<br />

model was also the one that presented the greatest likelihood value from the 88 models<br />

of evolution produced. For Bayesian inference a model selection approach was also<br />

performed, using the 4 hierarchies for likelihood ratio test to determine the simplest and<br />

most appropriate DNA evolutionary model by using MrModeltest 2.3 (Nylander, 2004)<br />

and PAUP 4.0b10 (Swofford, 2002) programs as well as the required file<br />

Mrmodelblock (anexo I).<br />

3.1.4. Phylogenetic reconstruction<br />

Phylogenetic analysis was performed by ML in PHYML 3.0 program (Guin<strong>do</strong>n and<br />

Gascuel, 2003), for each protein and/or gene and protein and/or gene concatenated<br />

alignments previously obtained. To obtain the phylogenetic tree, the best evolutionary


MATERIAL AND METHODS<br />

model determined previously was considered and the branch/nodal supports were<br />

estimated with the approximate likelihood ratio test by using the Shimodaira-Hasegawa<br />

like support option (Anisimova and Gascuel, 2006).<br />

Bayesian inference (BI) was performed by using MrBayes 3.2 program<br />

(Huelsenbeck and Ronquist, 2001) for each alignment previously obtained. As for ML,<br />

to obtain the phylogenetic tree by bayesian inference the best evolutionary model<br />

determined by the previous analysis was also considered and the branch/nodal support,<br />

given as the posterior probability (PP) values in this analysis, was calculated by using<br />

the Metropolis-coupled Markov chain Monte Carlo method after calculating a large<br />

number of different trees. To infer fungal phylogeny using RLM1 gene four chains with<br />

4.000.000 analysis were run. For Saccharomycotina phylogeny inferred with RLM1<br />

gene, four chains with 800.000 analyses were run and for phylogeny inferred with<br />

MCM1 gene four chains with 1.250.000 analyses. Saccharomycotina phylogeny with<br />

RLM1+MCM1 genes was inferred by running four chains with 5.500.000 analyses. To<br />

infer CUG group phylogeny using IFF8 gene four chains with 1.000.000 analyses were<br />

run and by using RLM1+MCM1+IFF8 genes 2.000.000 generations. Trees were<br />

sampled every 100 generations and trees obtained before the convergence of the<br />

Markov chain were not included in the consensus tree. The remaining samples were<br />

used to construct the consensus tree and the branches/nodes probability posterior values<br />

were converted from frequency to percentage, importing into PAUP 4.0b10. In the BI,<br />

these percentages for the consensus tree are the rough equivalent of a ML search with<br />

bootstrapping analysis (Huelsenbeck and Ronquist, 2001).<br />

To visualize the degree of phylogenetic conflict within concatenated alignments a<br />

phylogenetic network was generated by using the NeighborNet method in the<br />

SPLITSTREE 4.1 program (Huson and Bryant, 2006), with 1000 bootstrap analysis.<br />

3.1.5. Interpretation of values considered for nodal support<br />

Previous studies have demonstrated that to infer about the nodal support in<br />

phylogenetic trees the information from posterior probabilities and bootstrap analysis<br />

should be complemented (Alfaro et al., 2003; Douady et al., 2003; Reeb et al., 2004).<br />

Bayesian methods are more efficient in recovering accurate nodal support values, but<br />

same authors have indicated that they could be less conservative than boostrap (Suzuki<br />

47


48<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

et al., 2002; Alfaro et al., 2003), unlike Shimodaira-Hasegawa (S-H) approach used in<br />

the ML analyses which is more conservative (http://atgc.lirmm.fr/alrt). Therefore, a<br />

combination of both, posterior probabilities and Shimodaira-Hasegawa like support,<br />

were used to assess the level of confidence of a specific node in the phylogenetic DNA<br />

trees. Throughout this work, the following scale was used:<br />

- High (strong) support - PP ≥ 95% and S-H ≥ 0.70;<br />

- Medium (moderate) support - PP ≥ 95% and 0.70 > S-H ≥ 0.50, or PP < 95%<br />

and S-H ≥ 0.70;<br />

- Low (poor or weak) support - PP ≥ 95% and S-H < 0.50, or PP < 95% and 0.70<br />

> S-H ≥ 0.50;<br />

- No support - PP < 95% and S-H < 0.50.<br />

In protein phylogenetic trees Shimodaira-Hasegawa like support (S-H) analyses was<br />

the only considered, using the same scale values.<br />

3.2. Analysis of adaptive evolution<br />

We applied the approach of Yang and coworkers (Yang et al., 2000; Swanson et al.,<br />

2001) to test for amino acids under positive selection in Rlm1 protein. To determine<br />

which model best fitted the data, likelihood ratio tests (LRTs) were performed by<br />

comparing the differences in log likelihood values between two models by using the χ 2<br />

distribution. The co<strong>do</strong>n-substitution models M0, M1a, M2a, M3, M7, and M8 that use a<br />

statistical distribution to describe the ran<strong>do</strong>m variation among co<strong>do</strong>n sites (ω=dN/dS)<br />

were used (Nielsen and Yang, 1998; Wong et al., 2004; Yang et al., 2005). We used<br />

LRTs to make 3 comparisons to find out whether positive selection has played a role in<br />

the molecular evolution of this gene; (i) the one ratio model (M0) was compared with<br />

the discrete model (M3), (ii) the neutral model (M1a) was compared with the selection<br />

model (M2a), and (iii) the beta model (M7) was compared with the beta and ω model<br />

(M8).<br />

To identify particular sites in the gene that were likely to have evolved under<br />

positive selection the Bayesian Empirical Bayes (BEB) analysis was used (Yang and<br />

Bielawski, 2000; Yang et al., 2005). This estimates the posterior probability (PP) values<br />

at each site. If, a particular co<strong>do</strong>n site presents PP value higher than 95% and ω higher


MATERIAL AND METHODS<br />

than 1 that site is inferred to be under positive selection (Nielsen and Yang, 1998; Yang<br />

and Bielawski, 2000; Yang et al., 2005). These calculations were performed by using<br />

the CODEML program inside PAML 4.1 (Yang, 1997) and HYPHY 1.0 program<br />

(Kosakovsky et al. 2005) to analyze positive selection in the MADS-box orthologues<br />

from RLM1 gene in plants, animals and fungi, and in the entire RLM1 gene within the<br />

Saccharomycotina group.<br />

A seventh model, named mechanistic empirical combined (MEC), was further<br />

used to test positive selection in the subphylum Saccharomycotina (Doron-Faigenboim<br />

and Pupko, 2006). This model was statistically tested by comparing the second order<br />

Akaike Information Criterion corrected (AICc) likelihood scores with Model 8a (Wong<br />

et al., 2004). The calculations were performed by using SELECTON 2.4 program (Stern<br />

et al., 2007).<br />

3.3. Analysis of the repetitive region in the subphylum Saccharomycotina<br />

To identify the putative molecular mechanism that contributed to the divergence of<br />

the MADS-box RLM1 gene in the subphylum Saccharomycotina, the C-terminal was<br />

analysed. Three reading frames from the sequences that presented a repetitive region in<br />

the C-terminal were analyzed by using VIRTUAL RIBOSOME 1.1 program<br />

(Wernersson, 2006) to determine the possible amino acids encoded by that region. The<br />

corresponding ancestral RLM1 gene, from which the most recent Saccharomycotina<br />

species reported in this study diverged, was constructed by using MrBayes 3.2 program.<br />

The Saccharomycotina Rlm1 protein sequences were aligned with this ancestral<br />

sequence to identify the presence of an ancestral repetitive region in the C-terminal.<br />

49


RESULTS AND DISCUSSION


4. RESULTS AND DISCUSSION<br />

4.1. Compilation of data<br />

RESULTS AND DISCUSSION<br />

The fungal Rlm1 and Mcm1 orthologue protein/gene sequences identified in this<br />

analysis consisted of 76 putative orthologues. From these, 47 putative Rlm1 orthologue<br />

sequences were directly identified in the databases, 22 were predicted by homology and<br />

7 presented partial sequences (Annex II). Regarding Mcm1, 60 putative orthologue<br />

sequences were found in the database, 14 were predicted by homology and 2 presented<br />

partial sequences (Annex III). In the case of Iff8 protein/gene only 9 putative orthologue<br />

sequences were identified (Annex IV). Other members of MADS-box protein, Smp1<br />

and Arg80 protein orthologue sequences were also identified in Saccharomyces sensu<br />

stricto and Saccharomyces sensu lato groups, but not being used in this study (Annexe<br />

V and VI).<br />

The presence of introns was detected only in RLM1 and MCM1 genes (Annexes II<br />

and III) IFF8 sequences did not contain introns (Annex IV). The number of introns<br />

varied in RLM1 and MCM1 genes, from the lack of them to 8 introns. The higher<br />

variability in the number of introns was observed within Mucoromycotina and<br />

Basidiomycota while the Ascomycota group presented a more constant pattern, from 0<br />

to 2 introns. In view of these facts, a bias towards an intron loss in MADS-box<br />

transcription factors in the king<strong>do</strong>m Fungi seems to be occurring, not a gain, which was<br />

evidenced by the absence or presence of only 2 introns in the most recently diverged<br />

phylum Ascomycota, unlike the older phyla that presented a varied number that could<br />

reach eight introns.<br />

In the present work, the results suggested that during the divergence of fungal<br />

species the intron loss would have been toward RML1 3‘ end because in some species<br />

that diverged recently the intron in the 5‘ end inside the MADS-box of this gene is the<br />

only maintained. In MCM1 a bias to lose introns both in 5‘ and in 3‘ ends was<br />

identified, but like in RLM1 gene one intron was maintained inside the MADS-box.<br />

Curiously, the presence of the first intron towards the 5‘ end was inside the MADS-box<br />

a feature present in all the genes with introns. In the case of MCM1 this was also true,<br />

the first intron is also present inside the MADS-box, but not in the 5‘ end since the<br />

MADS-box is in the middle of the gene.<br />

53


54<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

It is known that in intron-rich organisms, introns are evenly distributed within the<br />

coding sequences, but are biased toward the 5‘ end of genes in intron–poor organisms<br />

(Nielsen et al., 2004). Current models attribute the bias to 3‘ end intron loss due to a<br />

poly-adenosine-primed reverse transcription mechanism which was demonstrated in<br />

experiments with intron-containing Ty elements in yeast (Fink, 1987; Boecke et al.,<br />

1985). However, in a study carried out in different fungal species no increased<br />

frequency in intron loss toward the 3‘ end of genes was found, the loss occurred in the<br />

middle of the genes, suggesting either other mutational mechanisms (e.g., reverse<br />

transcription primed internally) or the presence of selective pressure to preferentially<br />

conserve introns near the 5‘ and 3‘ ends of genes (Nielsen et al., 2004). To date, the<br />

mechanism by which introns are inserted or delected from gene loci is not well<br />

understood.<br />

4.2. Phylogeny based on MADS-box transcription factors<br />

4.2.1. Rlm1 transcription factor<br />

Fungal phylogeny based on Rlm1 transcription factor was inferred both by using<br />

amino acids and DNA sequences. The phylogeny based on amino acids sequences was<br />

inferred only by maximum likelihood (ML) by using the PHYML 3.0 program. From<br />

the various models of evolution tested the Jones, Taylor and Thornton (JTT) model was<br />

the one selected, including the options to calculate the proportion of invariant sites, the<br />

gamma distribution and the frequency of amino acids. In the phylogeny based on<br />

nucleotides the General Time Reversible (GTR) model, including the proportion of<br />

invariant sites and the gamma distribution options, were the selected, both for ML<br />

analysis and bayesian inference.<br />

The phylogeny based on protein sequences resolved the Fungi as a clade strongly<br />

supported by a Shimodaira-Hasegawa (S-H) like value of 0.96 in the ML analysis.<br />

However the resolution of Fungi into a clade based on DNA sequences revealed the<br />

existence of a conflict. The S-H value obtained was of 0.24 for ML and of 100% for<br />

bayesian posterior probability (PP), which by the scale value means a low nodal support<br />

(Figure 8 and 9). This result could be due to the low number of DNA RLM1 sequences<br />

from species other than fungi that were used in this study. But, although the low nodal<br />

support in the phylogeny based on DNA sequences, the obtained topology is strongly<br />

supported by previous studies (Wainright et al., 1993; Baldauf and Palmer, 1993).


RESULTS AND DISCUSSION<br />

The Mucoromycotina ‗Zygomycota‘ are primarily coenocytic (sometimes producing<br />

cell septa) and undergo sexual reproduction by formation of a thick-walled resting spore<br />

called a zygospore, which is clearly different from the Basidiomycota and Ascomycota<br />

(Dikaryomycota) and other phyla. Morphologicaly the Ascomycota and Basidiomycota<br />

both possess regularly septate hyphae and a dikaryotic life stage but differ in the<br />

structures involved in meiosis and sporulation.<br />

The fungal phylogeny inferred by these analyses clearly placed the 76 fungal<br />

species into one subphyla incertae sedis, the Mucoromycotina ‗Zygomycota‘, and two<br />

well supported phyla, the Basidiomycota and the Ascomycota (Figures 8 and 9). The<br />

Mucoromycotina ‗Zygomycota‘ formed a clade moderately supported by protein (S-H<br />

value of 0.64) as well as by nucleotide analysis (S-H 0.69 ML and 99% PP). Both<br />

Ascomycota and Basidiomycota formed clades highly supported by S-H values of 0.84<br />

and 0.91, respectively with the protein analysis, but these clades were medium or low<br />

supported in the nucleotide ML analysis by S-H of 0.67 and 0.57 and PP values of 98%<br />

and 61%, respectively. Based on the literature a sister relationship between the<br />

Basidiomycota and Ascomycota (the ‗‗Dikaryomycota‘‘) has been proposed, being<br />

recently recognized as the subking<strong>do</strong>m Dikarya (James et al., 2006, Hibbett et al.,<br />

2007). The results obtained in these analyses highly support this sister relationship (S-H<br />

= 0.98 ML in protein, S-H= 0.97 ML and PP = 100% in nucleotides).<br />

In the present analysis three species belonging to the subphylum Mucoromycotina<br />

‗Zygomycota‘ were used two of which the Mucor circinelloides and the Rhyzopus<br />

oryzae are sisters by a well-supported node in the phylogenetic analysis using<br />

nucleotide (SH=0.87, PP=99%) and amino acids (SH=0.92) sequences (Figures 8 and<br />

9). This result agrees with the current taxonomic classification in which these two<br />

species form part of the family Mucoraceae (Voigt and Wöstemeyer, 2001). The other<br />

species included in this subphylum, Phycomyces blakeleeanus, member of the family<br />

Phycomycetaceae. The topology obtained is in agreement with previous results since<br />

both the Mucoraceae and Phycomycetaceae families are part of the order Mucorales<br />

within the subphylum Mucoromycotina (Hibbet et al., 2007).<br />

55


56<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Figure 8. Fungal phylogeny based on Rlm1 protein sequences. This phylogeny was obtained by<br />

maximum likelihood. The nodes are supported by Shimodaira-Hasegawa like support. Species<br />

with partial sequences are: Kluyveromyces waltii, Saccharomyces para<strong>do</strong>xus, Ascosphaera<br />

apis, Botrytis cinerea, Chaetomium globosum, Ephicloa festucae, Coccidioides immitis.


RESULTS AND DISCUSSION<br />

The phylum Basidiomycota is strongly supported as a monophyletic clade as well as<br />

classes Agaricomycetes (S-H=0.99 ML in protein, S-H=0.97 ML and PP=100% in<br />

nucleotides), Tremellomycetes (S-H=1.0 ML in protein, S-H=1.0 ML and PP=100% in<br />

nucleotides), and Ustilaginomycetes (S-H=0.97 ML in protein, S-H=0.97 ML and<br />

PP=91% in nucleotides), within this phylum by both protein and nucleotide analyses.<br />

The classification of Ustilaginomycetes, Microbotryomycetes, Tremellomycetes and<br />

Agaricomycetes a<strong>do</strong>pted here followed the current taxonomic proposal (Hibbet et al.,<br />

2007). The Ustilaginomycetes, which is a class of the subphylum Ustilaginomycotina,<br />

are represented by one species of the order Ustilaginales, Ustilago maydis, and by<br />

another of the order Malasseziales, Malassezia globosa, class incertae sedis. These two<br />

species received high support as sister groups in protein and nucleotide analyses.<br />

However, in the phylogeny inferred by the bayesian method a topology conflict was<br />

observed with Ustilaginomycetes and Schizosaccharomycetes grouping them as sister<br />

clades with a low support (PP=62%), being the latter a member of the phylum<br />

Ascomycota. The class Microbotryomycetes, which is a member of subphylum<br />

Pucciniomycotina, is only represented by one species, Sporobolomyces roseus, and is<br />

grouped as a sister group of the subphylum Agaricomycotina with high nodal support<br />

(S-H=0.96 ML in protein, S-H=0.86 ML and PP=95% in nucleotides). The classes<br />

Tremellomycetes and Agaricomycetes are members of the subphylum<br />

Agaricomycotina. These two classes form well supported sister clades in protein (S-<br />

H=0.99), but obtained a low support with the nucleotide analysis ((S-H=0.42 ML and<br />

PP=86%). The class Tremellomycetes is represented by two species of the order<br />

Tremellales, Cryptococcus neoformans and Cryptococcus gatti. These two species are<br />

strongly supported as sister taxa by protein (S-H= 1.0) and nucleotide (S-H= 1.0,<br />

PP=100%) analyses. The class Agaricomycetes is represented by four species included<br />

in the orders Agaricales (Coprinus cinereus, Laccaria bicolor), Polyporales (Postia<br />

placenta), and Corticiales (Phanerochaete chrysosporium), being these latter two<br />

considered as subclass incertae sedis. The order Polyporales and Corticiales are<br />

resolved as well-supported sister groups in protein (S-H=0.95) but with a low support in<br />

nucleotide (S-H= 0.0, PP=80%) analyses. The order Agaricales is resolved as the sister<br />

group of the Polysporales/Corticiales group with high nodal support in protein (S-<br />

H=0.99) and nucleotide (S-H=0.97, PP=100%) analyses.<br />

57


58<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Figure 9. Fungal phylogeny based on RLM1 DNA sequences. Nodes values correspond to posterior<br />

probabilities (left) and Shimodaira-Hasegawa (right). In the maximum likelihood analysis a S-<br />

H value of 0.67 (a) for Schizosaccharomycetes clade within phylum Ascomycota, value of 1.0<br />

(b) for V. polyspora as basal taxon from ‗Saccharomyces complex‘, value of 0.32 (c) for<br />

Ephicloa festucae as basal taxon from Trichoderma group and a value of 0.04 (d) for A. terreus<br />

as basal taxon of A. nidulans- A. niger group was observed. Species with partial sequence are:<br />

Kluyveromyces waltii, Saccharomyces para<strong>do</strong>xus, Ascosphaera apis, Botrytis cinerea,<br />

Chaetomium globosum, E. festucae, and Coccidioides immitis.


RESULTS AND DISCUSSION<br />

In this study, of the three subphyla recognized within the Ascomycota (Eriksson<br />

et al., 2004, Hibbet et al., 2007), only the Taphrinomycotina presented a conflict<br />

topology, as mentioned before. The species in this subphylum represented a<br />

monophyletic clade with a high nodal support in protein (S-H= 0.99) and nucleotide (S-<br />

H=1.0, PP=100%) analyses. The subphylum Saccharomycotina also presented<br />

monophyly with well-supported nodal values (S-H=0.89 in protein and S-H=0.88,<br />

PP=99% in nucleotide analyses). The subphylum Pezizomycotina was also represented<br />

as a monophyletic clade with high nodal support values (S-H=0.98 in protein and S-<br />

H=0.98, PP= 100% in nucleotide analyses), as well as its classes Leotiomycetes,<br />

Sordariomycetes, Dothideomycetes and Eurotiomycetes (Figures 8 and 9). The class<br />

Leotiomycetes was only represented by the order Helotiales with a high nodal support<br />

(S-H=1.0 in protein and S-H=1.0, PP=100% in nucleotide analyses), being the species<br />

Botrytis cinerea and Sclerotinia sclerotiorum members of this order.<br />

The class Sordariomycetes is represented by the orders Sordariales, Hypocreales,<br />

Phyllacorales – subclass incertae sedis, and the family Magnaporthaceae – order<br />

incertae sedis. The orders Sordariales, Phyllacorales and the family Magnaportheceae<br />

form a group highly supported (S-H=0.96 in protein and S-H= 0.98, PP=100% in<br />

nucleotide analyses) with only a nodal conflict in the placement of Neurospora crassa<br />

with the protein analysis, since this species should form a clade with Chaetomium<br />

globosum and Po<strong>do</strong>spora anserina. The order Hypocreales is a sister clade of the group<br />

Sordariales/Phyllacorales/Magnaporthaceae with a high nodal support (S-H=1.0 in<br />

protein and S-H=1.0, PP=100% in nucleotide analyses) but a conflict was also observed<br />

regarding the placement of Ephicloe festucae (family Clavicipitaceae). This species was<br />

placed as a sister clade with the family Hypocraceae (Trichoderma atroviride, T. reesei,<br />

T. virens) in ML both in protein and in nucleotide analysis, but as a sister clade of the<br />

family Nectriaceae (Nectria haematoccoca, Fusarium graminearum, F. oxysporum, F.<br />

verticillioides) in the BI, although with low nodal support in all analyses (S-H=0.12 ML<br />

in protein, S-H=0.32 ML and PP=63% in nucleotide).<br />

The class Dothideomycetes is represented by the orders Capnodiales and<br />

Pleosporales, which grouped as monophyletic sister clades with high nodal support<br />

(SH=0.99 in protein, S-H=0.97 PP=90% in nucleotide analyses) (Figures 8 and 9). The<br />

class Eurotiomycetes is represented by the orders Onygenales and Eurotiales and<br />

59


60<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

presented monophyly. In the order Onygenales, the families Ascosphaeraceae<br />

(Ascosphaera apis) and Arthrodermateceae (Microsporum gypseum) are sister clades<br />

with high nodal support in protein (S-H=0.92) and nucleotide (S-H=0.92, PP=99%)<br />

analyses as well as the families Onygenaceae (Uncinocarpus reesii) and Gymnoascaceae<br />

(Coccidioides immitis, Coccidioides posadasii). The family Ajellomycetaceae is<br />

represented by Paracoccidioides brasiliensis, Histoplasma capsulatum and Blastomyces<br />

dermatitidis, forming a highly supported clade by protein (S-H=1.0) and nucleotide (S-<br />

H=0.99, PP=100%) analyses. A nodal conflict was observed regarding the placement of<br />

the group Ascosphaeraceae/Arthrodermateceae, since the protein analysis showed it as a<br />

sister group of the group Onygenaceae/Gymnoascaceae and the nucleotide analyses as a<br />

sister group of the clade Ajellomycetaceae. However, both analyses showed low nodal<br />

suppor by protein (S-H=0.58) and nucleotide (S-H=0.04, PP=67%) analyses. The order<br />

Eurotiales is represented only by the family Trichocomaceae with a high nodal support<br />

(S-H=0.99 in protein, S-H=1.0, PP=100% in nucleotide analyses). Talaromyces<br />

stipitatus and Penicillium marneffei are sister taxa (Figures 8 and 9). The species<br />

Aspergillus clavatus, A. fumigatus and Neosartorya fischeri formed a group with a well<br />

nodal support that is a sister group of the one formed by A. oryzae, A. flavus, A. terreus,<br />

A. nidulans and A. niger, also well supported (Figures 8 and 9). Only regarding A.<br />

terreus placement by bayesian analysis a conflict was observed. This species grouped as<br />

sister taxa of A. oryzae and A. flavus, while in ML analysis it was placed together with<br />

A. nidulans and A. niger. However, both options had low supports. The orders<br />

Leotiomycetes and Sordariomycetes formed sister clades with a well nodal support<br />

while the orders Dothideomycetes and Eurotiomycetes also formed sister clades but<br />

with a moderate support (S-H=0.85 in protein, S-H=0.89, PP=86% in nucleotide<br />

analyses).<br />

4.2.2. Mcm1 transcription factor<br />

In fungal phylogeny based on the analysis of Mcm1 protein/gene only ML of<br />

protein sequences was considered, since the nucleotide analysis showed a contradictory<br />

topology to the one obtained with protein due to the high difference of MCM1 sequence<br />

sizes (Figure 10). BI was not performed.<br />

The phylogeny inferred from Mcm1 protein analysis coincides in the main<br />

points with the one inferred from Rlm1 protein/gene analyses. The Fungi were resolved


RESULTS AND DISCUSSION<br />

as a single clade with S-H= 0.82, strongly supported. The major differences in the<br />

overall topology fall on some clades:<br />

-The subphylum Mucoromycotina ‗Zygomycota‘ presented well supported<br />

polyphyly between the families Phycomycetaceae and Mucoraceae, while with Rlm1<br />

presented monophyly.<br />

-Within the phylum Basidiomycota, the species Malassezia globosa formed a<br />

sister group with the family Schizosaccharomycetes from the phylum Ascomycota with<br />

a good support value, contrary to the obtained with Rlm1 protein analysis, that correctly<br />

placed the Schizosaccharomycetes within the Ascomycota.<br />

-In the phylum Ascomycota, the class Saccharomycetes is present as a<br />

monophyletic group, but with a lower nodal support, while with Rlm1 analyses is well<br />

supported.<br />

-Within the subphylum Pezizomycotina, in the class Sordariomycetes (formed<br />

by the orders Sordariales, Phyllacorales) and the family Magnaportheceae presented<br />

Magnaporthe grisea as the basal clade, with a moderate nodal support, contrary to<br />

results found in the phylogeny with Rlm1 protein/gene that presented Verticillium<br />

dahliae as the basal taxon.<br />

-The basal taxon in the family Hypocraceae is Trichoderma atroviride, which<br />

disagrees with the obtained topology with Rlm1 protein analyses that places T. reesei as<br />

the basal taxa.<br />

-In the class Dothideomycetes, the latest taxa of the order Pleosporales is<br />

different regarding the result found in the topology with Rlm1 protein/gene analyses.<br />

With Mcm1 the latest taxa is Alternaria brassicicola and with Rlm1 is Pyrenophora<br />

tritici-repentis, but with Mcm1 analysis the nodal support is very low.<br />

-Within the class Eurotiomycetes Ascosphaera apis was found as basal of all the<br />

orders that constitute this clade, while the Rlm1 analyses placed as a sister taxon to<br />

Microsporum gypseum.<br />

-The clade Arthrodermateceae (Microsporum gypseum) is a sister clade of the<br />

group formed by Onygenaceae/Gymnoascaceae but presented a low nodal support<br />

contrary to results obtained with the nucleotide RLM1 analysis that joined it to the clade<br />

Ajellomycetaceae, also with low support.<br />

61


62<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Figure 10. Fungal phylogeny based on Mcm1 protein sequences. This phylogeny was obtained by<br />

maximum likelihood. The nodes are supported by Shimodaira-Hasegawa like support.<br />

Species with partial sequence: Kluyveromyces waltii, Blastomyces dermatitidis, Botrytis<br />

cinerea, Cryptococcus gatti.


RESULTS AND DISCUSSION<br />

-The order Eurotiales is weakly supported, while the group formed by the taxa<br />

A. niger, A. fumigatus and Neosarterya fischeri is well supported, contrary to results<br />

obtained in Rlm1 phylogeny, where A. clavatus was joined to the two latter taxa<br />

mentioned before.<br />

-The placement of the taxa A. terreus is also different in this analysis, it was<br />

placed as a sister taxon to A. oryzae and A.flavus, but only in the Rlm1 gene/ protein<br />

analyses its placement is not well supported. The same for A. clavatus that with Rlm1<br />

protein/genes analyses formed a highly supported group with Neosartorya fisheri and A.<br />

fumigatus, but with Mcm1 presented an uncertain placement.<br />

4.2.3. Placement of fungal phylogeny based on MADS-box transcription<br />

factor within the current literature<br />

Several phylogenetric studies have demonstrated the close relationship of the<br />

king<strong>do</strong>ms Animalia and Fungi (Baldauf and Palmer, 1993; Baldauf et al., 2000; Lang et<br />

al., 2002). These two king<strong>do</strong>ms are now known to be a part of a larger group that<br />

includes the phylum Choanozoa, termed the Opisthokonts (Cavalier-Smith and Chao,<br />

1995; Ragan et al., 1996). Together, the Opisthokonts and the phylum Amoebozoa form<br />

the group Unikonts (Cavalier-Smith, 2002b). In the present work the phylum<br />

Choanozoa has not been included but the close relationship between the two king<strong>do</strong>ms,<br />

Animalia and Fungi, was observed using species from the king<strong>do</strong>m Plantaea as<br />

outgroup. The king<strong>do</strong>m Plantae is part of the group Bikonts (Cavalier-Smith, 2002b).<br />

The phylogenetic topology obtained in this study is also in agreement with the topology<br />

obtained in a study that explain the origin of the MADS-box proteins in Animalia,<br />

Fungi, and Plantae (Alvarez-Buylla et al., 2000).<br />

Regarding the king<strong>do</strong>m Fungi, previous studies based on multigenic and rDNA<br />

analysis indicated that the phyla Microsporidia, Chytridiomycota,<br />

Neocallimastigomycota, Blastocladiomycota and the subphyla incertae sedis<br />

Mucoromycotina, Zoopagomycotina, Entomophthoromycotina and Kickxellomycotina<br />

are part of the earliest known divergence that took place during fungal evolution (Bruns<br />

et al., 1992; Berbee and Taylor, 1993; Tanabe et al., 2000, Lutzoni et al., 2004;<br />

Blackwell et al., 2006; Fitzpatrick et al., 2006; James et al., 2006). However, due to the<br />

low number of avalaible sequences within all the phyla that represents the king<strong>do</strong>m<br />

63


64<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Fungi, only sequences from species within Basidiomycota and Ascomycota phyla and<br />

the subphylum Mucoromycotina were used in this study.<br />

The subphylum Mucoromycotina is represented by species that belonged to the<br />

ex-phylum Zygomycota (Hibbet et al., 2007). The results obtained in our work, showed<br />

that the subphylum Mucoromycotina is an early diverging clade (Figures 8 and 9) which<br />

agrees with previous studies. The subphylum Mucoromycotina presented monophyly<br />

with moderate support with Rlm1 sequence analyses, in agreement with phylogenetic<br />

analysis inferred with ACT1, RPB1, RPB2, EF-1α, small and large subunit rDNA genes<br />

(Voigt and Wöstemeyer, 2001; James et al., 2006), and polyphyly in the analysis with<br />

MCM1 gene (Figure 10), like what was found in other studies (Lutzoni et al., 2004). To<br />

resolve this monophyly/polyphyly conflict the inclusion of sequences from other<br />

species within the subphyla Mucoromycotina, as well as from the other subphyla<br />

incertae sedis would be important. The use of additional protein-coding genes would<br />

also help to resolve this conflict.<br />

Basidiomycota includes about 30,000 species of rusts, smuts, yeasts, and<br />

mushroom fungi (Kirk et al., 2001). Most of them are characterized by their meiospores<br />

(basidiospores) on the exterior of typically club-shaped meiosporangia (basidia).<br />

Phylogenetic relationships among the three subphyla of Basidiomycota are still<br />

uncertain. The subphylum Pucciniomycotina is primarily distinguished by containing<br />

the rust fungi (7000 species), which are primarily pathogens of land plants. In our study,<br />

the species representing the subphylum Pucciniomycotina grouped together with the<br />

species of subphylum Agaricomycotina (Figures 8, 9 and 10). This result is not<br />

consistent with previous studies that suggested a sister group relationship with the group<br />

formed by the subphyla Ustilaginomycotina and Agaricomycotina (Lutzoni et al., 2004,<br />

James et al. 2006). The result from our study could be due to the use of only one<br />

representant of the subpylum Puccioniomycotina, the Sporobolomyces roseus, and also<br />

due to long-branch attraction artifact since the species used presented large MADS-box<br />

genes, like the ones observed within the representants of the class Tremellomycetes<br />

from the subphylum Agaricomycotina.<br />

The Ustilaginomycotina includes 1500 species of true smut fungi and yeasts,<br />

most of which cause systemic infections of angiosperm plant hosts. The


RESULTS AND DISCUSSION<br />

Agaricomycotina includes almost two-thirds of the known basidiomycetes, including<br />

the vast majority of mushroom forming fungi. Much of the morphological diversity<br />

exemplified in mushroom fruiting bodies is the result of radiations of certain lineages<br />

within the Agaricomycotina, and recovering their relationships with confidence has<br />

proven very difficult (Binder et al., 2005; Moncalvo et al., 2002). In our work, early-<br />

diverging lineages in the Agaricomycotina are strongly supported, which include<br />

parasitic and/or saprotrophic fungi capable of dimorphism or yeast-like phases, in<br />

agreement with previous studies that found the same relationships among species of this<br />

subphylum (Lutzoni et al., 2004; Fitzpatrick et al., 2006, James et al., 2006).<br />

Ascomycota is the largest phylum within the Fungi characterized by the<br />

production of meiospores (ascospores) in specialized sac-shaped meiosporangia (asci),<br />

which may or may not be produced within a sporocarp (ascoma). The majority of the<br />

species studied in our analysis belonged to this phylum. Within the Ascomycota three<br />

main subphyla are recognized, the Taphrinomycotina, the Pezizomycotina and the<br />

Saccharomycotina. Results from this study showed that all subphyla of Ascomycota<br />

were well supported as monophyletic groups in the phylogenetic tree presented in<br />

Figure 9.<br />

In the results obtained with the Mcm1 protein and RLM1 gene (Bayesian<br />

inference), the class Schizosaccharomycetes from the subphylum Taphrinomycotina sits<br />

outside the other subphyla from Ascomycota. Taphrinomycotina has been resolved as<br />

the earliest diverging monophyletic clade (Liu et al., 1999; Fitzpatrick et al., 2006;<br />

James et al., 2006; Liu et al., 2006; Liu et al., 2009). It includes a diverse group of<br />

species that exhibits yeast-like (for example, Pneumocystis carinii) and dimorphic (for<br />

example, Taphrina deformans) growth forms. The incongruence observed in the<br />

placement of the Schizosaccharomycetes could be due to the fact that in the MADS-box<br />

proteins identified for Ustilaginomycotina and Schizosaccharomycotina a high<br />

variability in the C‘-terminal region was observed. This variability caused a conflict in<br />

the topology determined by bayesian inference and maximum likelihood for RLM1 gene<br />

and Mcm1 protein, respectively (Figure 9 and 10), producing the paraphyly observed,<br />

and grouping these species. Another hypothesis to explain the incongruence could be<br />

the wrong and partial assembly of genomes, which would affect the correct sequence of<br />

genes in the databases; for instance the sequence from MCM1 gene of Blastomyces<br />

65


66<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

dermatitidis was identified, but part of the N‘-terminal was not matching with the query<br />

sequence. Thus, this sequence had to be corrected and completed with the N‘-terminal<br />

from closer species. Other reason for this incongruence could be the robustness of the<br />

method for phylogenetic inference, since the ALRT approach in maximum likelihood<br />

estimates the probability of a branch being correct better than bootstrapping, and is<br />

more conservative than Bayesian posterior probabilities (Anisimova and Gascuel, 2006;<br />

Hall and Salipante, 2007). That could explain why, the class Schizosaccharomycetes<br />

was correctly placed in the analysis with Rlm1 protein and DNA analyses by maximum<br />

likelihood and aLRT test.<br />

The subphylum Saccharomycotina consists of the ‗true yeasts‘, including<br />

bakers‘ yeast (Saccharomyces cerevisiae) and Candida albicans, the most frequently<br />

encountered fungal pathogen of humans, and its monophyly has been demonstrated in<br />

several studies (Lutzoni et al., 2004; Fitzpatrick et al., 2006; James et al. 2006; Suh et<br />

al., 2006). Results obtained in this work are also in agreement with the previous studies<br />

in which three clades, the CUG, the Saccharomyces sensu stricto and the<br />

Saccharomyces sensu lato are clearly shown (Figure 8, 9 and 10).<br />

Pezizomycotina is the largest subphylum of Ascomycota and includes the vast<br />

majority of filamentous and fruit-body-producing species. Within the Pezizomycotina a<br />

number of well-defined classes are observed, namely the Sordariomycetes, the<br />

Leotiomycetes, the Eurotiomycetes and Dothideomycetes whose relationships has been<br />

the subject of many debates in previous studies (Berbee, 1996; Liu et al., 1996). The<br />

topology obtained with Rlm1 and Mcm1 protein/gene in this study indicated that<br />

Leotiomycetes and Sordariomycetes are well supported sister clades (Figure 8, 9 and<br />

10). This result is in agreement with several previous analyses, like the rDNA-based<br />

analysis (Lumbsch et al., 2005), multigenic analyses (James et al., 2006; Spatafora et<br />

al., 2006) and analyses based on complete fungal genomes (Fitzpatrick et al., 2006;<br />

Robbertse et al., 2006).), but is in disagreement with one analysis of four gene<br />

combined dataset that placed the Dothideomycetes as a sister group to the<br />

Sordariomycetes (Lutzoni et al., 2004).<br />

In this study, within the Sordariomycetes, the inferred phylogenetic relationships<br />

amongst the Sordariomycetidae organisms concur with previous phylogenetic studies


RESULTS AND DISCUSSION<br />

(Berbee, 2001). However, in the order Sordariales the placement of Magnaporthe grisea<br />

is not well-defined as the analysis with Mcm1 protein considers it as basal taxon, unlike<br />

the analysis with Rlm1 protein/gene considers Verticillium dahliae as basal taxon.<br />

These results disagree with the phylogeny of Sordariomycetes obtained by other studies,<br />

which placed V. dahliae as the sister taxon of the Hypocreales (Zhang et al., 2006).<br />

Likewise, in this study the inferred phylogeny based on Mcm1 protein and RLM1 gene<br />

(Bayesian inference) propose that Ephicloe festucae is a sister taxon of the group<br />

formed by Fusarium species (Figure 9 and 10), and the analysis of Rlm1 protein/gene<br />

(maximum likelihood) placed it as sister taxon of group formed by Trichoderma<br />

species. However, in our study only one species from family Clavicipitaceae was used.<br />

The Dothideomycetes class form a well-supported sister group with the<br />

Eurotiomycetes in the analysis of Rlm1 protein/gene, but not in the analysis of Mcm1<br />

protein (S-H=0.0). The former result agrees with the previous analysis carried out with<br />

four combined genes which also grouped together the Dothideomycetes and the<br />

Eurotiomycetes (Lutzoni et al; 2004). Conversely, a concatenated alignment of 42<br />

fungal genomes inferred that Stagonospor no<strong>do</strong>rum (a member of the Dothideomycetes)<br />

is more closely related to the Sordariomycetes and Leotiomycetes lineages (Fitzpatrick<br />

et al., 2006). Another study also based on 17 Ascomycota genomes, which used<br />

concatenated alignment, reported conflicting inferences regarding the phylogentic<br />

position of S. no<strong>do</strong>rum (Robbertse et al., 2006). The phylogenies reconstructed based<br />

on 17 genomes, using NJ and ML methods, inferred a sister group relationship between<br />

S. no<strong>do</strong>rum and Eurotiomycetes (Robbertse et al., 2006). This work is in accordance<br />

with a supertree inferred based on 42 complete genomes (Fitzpatrick et al., 2006) and<br />

the tree obtained with Rlm1 protein/gene and Mcm1 protein analyses. However, the<br />

same phylogenomic study inferred, by using maximum parsimony, the placement of S.<br />

no<strong>do</strong>rum at the base of the Pezizomycotina (Robbertse et al., 2006). The topology<br />

inferred in this work clearly showed two sister clades (orders Capnodiales and<br />

Pleosporales), which concurs with previous phylogenetic studies (Schoch et al., 2006).<br />

In this study, within the Eurotiomycetes class there is only a clade corresponding<br />

to the order Onygenales (Histoplasma capsulatum; Blastomyces dermatitidis<br />

Paracoccidioides brasiliensis, Coccidioides immitis, Coccidioides posadassii,<br />

Uncinocarpus reesii, Microsporum gypseum and Ascosphaera apis). The Onygenales<br />

67


68<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

clade is particularly interesting as it contains mainly animal pathogenic species. Some<br />

of them namely Coccidioides immitis have initially been classified as a protist, but<br />

further research showed it were a fungus and separate studies placed it in three different<br />

divisions of the former group named Eumycota (Rixford and Gilchrist, 1896; Ophuls,<br />

1905; Ciferri and Redaelli, 1936; Baker et al., 1943). Subsequent ribosomal phylogeny<br />

studies (Pan et al., 1994; Bowman et al., 1996) suggested a close phylogenetic<br />

relationship between C. immitis and U. reesii, excluding H. capsulatum. The<br />

phylogenies based on Rlm1 and Mcm1 protein/gene agree with the placement of C.<br />

immitis, C. posadassii and U. reesii as sister taxa, representing two families,<br />

Gymnoascaceae and Onygenaceae. However, a conflict was observed in the placement<br />

of Ascosphaera apis, which formed a clade with Microsporum gypseum with the Rlm1<br />

protein/gene analysis and appears as a basal taxon in Eurotiomycetes, with the Mcm1<br />

protein analysis. The obtained results disagree with the Eurotiomycetes phylogeny<br />

study, in which Ascosphaera apis (Ascosphaeraceae) formed a sister clade in the<br />

Ajellomycetaceae (H. capsulatum, P. brasiliensis and B. dermatitidis) and Microsporum<br />

gypseum, a sister clade of Gymnoascaceae (Geiser et al., 2006). The Eurotiomycetes<br />

branch containing the Eurotiales clade inferred the close relationship among<br />

Talaromyces stipitatus and Penicillium marneffei (Figures 8, 9 and 10). A minor<br />

difference in the Aspergillus clade was observed between the phylogenetic analyses<br />

with Rlm1 and Mcm1 regarding the position of A. nidulans, A. terreus, A. niger and A.<br />

fumigatus (Figure 8, 9 and 10). In this study, Neosatoria fischeri and A. fumigatus<br />

formed well-supported sister taxa, as well as A. oryzae and A. flavus, in accordance to<br />

Fitzpatrick et al.(2006) and James et al. (2006) studies.<br />

Due to the fact that the majority of fungi are still undiscovered, a robust phylogeny<br />

of known taxonomic groups will be essential for placement of unknown species as these<br />

are discovered. As demonstrated for Bacteria and Archaea (Pace, 1997), the Fungi are<br />

likely to harbor many lineages whose discovery is dependent on phylogenetic analyses.<br />

Therefore, it will be necessary to carry out phylogenetic studies including a wider set of<br />

fungal species from other phyla, as the Glomeromycota and the Chytridiomycota, what<br />

will enable to resolve the phylogenetic relationships between misclassified and<br />

misplaced fungal species.


4.3. Relationships within the subphylum Saccharomycotina<br />

RESULTS AND DISCUSSION<br />

Due to the relationship that exists between Candida species and ‗Saccharomyces<br />

complex‘, these groups were further analysed by using both single gene and<br />

concatenated gene phylogenies.<br />

4.3.1. Phylogeny of the ‘Saccharomyces complex’<br />

In figures 8, 9 and 10, the subphylum Saccharomycotina is clearly shown as a<br />

monophyletic clade grouping Yarrowia lipolytica, Candida species and ‗Saccharomyces<br />

complex‘ species, in the phylogenetic tree inferred for the king<strong>do</strong>m Fungi. The<br />

phylogenies deduced based on single gene or concatenated genes, including only<br />

species belonging to the subphylum Saccharomycotina are present in Figures 11 to 14.<br />

Results from this analysis showed a topology very similar to the previously inferred<br />

topologies that included all other fungal species, by using Rlm1 and Mcm1 genes.<br />

The Non Whole Genome Duplication (Non-WGD) group, formed by<br />

Kluyveromyces lactis, Ashbya gossypii, Saccharomyces kluyveri and Kluyveromyces<br />

waltii, presented monophyly in all analyses, except in the topologies based on MCM1<br />

gene by bayesian inference (Figure 12B). The relationships amongst the Non-WGD<br />

species presented a variable placement of taxa, with their grouping poorly defined<br />

(Figures 11, 12 and 13). These results were not conclusive due to the low number of<br />

taxa within this complex that have available sequences for this study. In other studies,<br />

based on multigenic and phylogenomic analyses, the relationship amongst the Non-<br />

WGD species has been demonstrated showing K. waltii – S. kluyveri and K. lactis – A.<br />

gossypii grouping together (Fitzpatrick et al., 2006; James et al., 2006; Jeffroy et al.,<br />

2006; Liu et al., 2006).<br />

Another study of the ‗Saccharomyces complex‘ based on multigenic analysis,<br />

that used 75 species showed A. gossypii and K. lactis forming basal separated clades<br />

from S. kluyveri and K. waltii, these latter clade with a medium support (Kurtzman and<br />

Robnett, 2003). Another phylogenetic study based on SSU and LSU rDNA sequences<br />

from 64 species of the family Saccharomycetales placed K. lactis as an early clade and<br />

A. gossypi as a basal clade (Suh et al., 2006). Thus, the correct relationship amongst the<br />

Non-WGD species is still to be resolved.<br />

69


70<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

In our study, a noticeable difference occured in the placement of Saccharomyces<br />

castellii in the WGD clade with respect to previous studies. In the analyses based on<br />

RLM1 gene and protein, S. castellii was a well-supported early taxon within the WGD<br />

group (Figure 11).<br />

A B<br />

Figure 11. Phylogeny of the subphylum Saccharomycotina inferred by using RLM1. (A) Protein<br />

phylogeny obtained by maximum likelihood using the evolutive model JTT + I + G with<br />

nodal support by Shimodaira-Hasegawa; (B) Gene phylogeny obtained by using the evolutive<br />

model GTR + I + G. Nodes values correspond to posterior probabilities (left) and maximum<br />

likelihood (right), Shimodaira-Hasegawa like support). Species with partial sequence are<br />

Kluyveromyces waltii, Saccharomyces para<strong>do</strong>xus.<br />

However, the analysis based on Mcm1 protein showed a low support of S.<br />

castellii as an early taxon from Saccharomyces sensu stricto group (S. bayanus, S.<br />

kudriavzevii, S. mikatae, S. cerevisiae and S. para<strong>do</strong>xus). The analysis with MCM1 gene<br />

indicated a weak monophyly by ML with Saccharomyces sensu lato group (S. castellii,<br />

C. glabrata, Vanderwaltozyma polyspora) and BI presented a medium support as early<br />

taxon from Saccharomyces sensu stricto group (Figure 12).<br />

The Rlm1-Mcm1 concatenated phylogeny placed S. castellii as an early taxon from<br />

Saccharomyces sensu stricto group, both by maximum likelihood and bayesian methods<br />

(Figure 13). Additionally, in the global fungal phylogeny analysis based on Rlm1 and<br />

Mcm1 sequences, S. castellii also presented an inaccurate placement (Figures 11 - 13).


RESULTS AND DISCUSSION<br />

Inacurrate placement of S. castellii was also obtained by past multigenic studies, in<br />

which it either competed with C. glabrata to the basal placement in the Saccharomyces<br />

sensu stricto group, or formed a clade together with this group (Diezmann et al., 2004;<br />

Fitzpatrick et al., 2006; James et al., 2006; Jeffroy et al., 2006; Liu et al., 2006).<br />

A B<br />

Figure 12. Phylogeny of the subphylum Saccharomycotina inferred by using MCM1. (A) Protein<br />

phylogeny maximum likelihood using the evolutive model JTT + I + G with nodal support by<br />

Shimodaira-Hasegawa; (B) Gene phylogeny using the evolutive model GTR + I + G. Nodes<br />

values correspond to posterior probabilities (left) and maximum likelihood (right,<br />

Shimodaira-Hasegawa like support). b 0.16 is the support from group formed by S. castelli,<br />

C. glabrata and V.polypsora by maximum likelihood. Species with partial sequence:<br />

Kluyveromyces waltii<br />

However, a study including a higher number of Saccharomycetales species and<br />

based on SSU and LSU rDNA showed that V. polyspora was closer to S. cerevisiae than<br />

S. castellii and C. glabrata (Suh et al., 2006). Another study based on mitochondrial<br />

and nuclear genes as COX1, mit SSU rDNA, EF-1α, rDNA (ITS1, ITS2), nuc LSU<br />

rDNA (D1/D2), placed S. castellii forming a closer clade to Saccharomyces sensu<br />

stricto group, and C. glabrata and V. polyspora forming earlier clades than S. castellii<br />

with respect to Saccharomyces sensu stricto group (Kurtzman and Robnett, 2003).<br />

The sister group relationships amongst the Saccharomyces sensu stricto species<br />

also differed between Rlm1, Mcm1 and concatenated phylogenies (Figures 11 - 13). For<br />

example, the Saccharomycetales Rlm1 protein/gene phylogeny placed S. bayanus and S.<br />

71


72<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

kudriavzevii as sister taxa at the base of the Saccharomyces sensu stricto node and S.<br />

mikatae, S. cerevisiae and S. para<strong>do</strong>xus forming a group within it, by ML and BI. But<br />

Mcm1 protein phylogeny placed S. mikatae as basal taxon while MCM1 gene<br />

phylogeny did not define with accuracy the basal taxon between S. bayanus or S.<br />

kudriavzevii.<br />

A B<br />

Figure 13. Phylogeny of the subphylum Saccharomycotina inferred by using concatenated alignment of<br />

RLM1 - MCM1. (A) Protein phylogeny obtained by maximum likelihood with the evolutive<br />

model JTT + I + G and nodal support by Shimodaira-Hasegawa; and (B) Gene phylogeny<br />

obtained using the evolutive model GTR + I + G. Nodes values correspond to posterior<br />

probabilities (left) and maximum likelihood (right, Shimodaira-Hasegawa like support).<br />

Species with partial sequence: Kluyveromyces waltii.<br />

The concatenated alignment of Rlm1 and Mcm1 proteins (Figure 13) inferred a<br />

result similar to Rlm1 gene/protein phylogenies (Figure 12), and the analysis based on<br />

concatenated alignment of genes (RLM1 and MCM1) placed S. bayanus at the base of<br />

the Saccharomyces sensu stricto node and inferred a ladderised topology amongst the<br />

Saccharomyces sensu stricto species (Figure 13). These phylogenetic trees, based on<br />

RLM1 and concatenated genes, agree with supertrees that indicated both ladderised<br />

topology and S. bayanus and S. kudriavzevii as sister taxa (Fitzpatrick et al., 2006,<br />

Jeffroy et al., 2006). The study of the ‗Saccharomyces complex‘ based on multigenic<br />

analysis also showed a ladderised topology in Saccharomyces sensu stricto species,<br />

having S. bayanus as basal taxon (Kurtzman and Robnett, 2003).


RESULTS AND DISCUSSION<br />

To analyze the degree of conflicting phylogenetic signal within the concatenated<br />

alignment, a phylogenetic network was constructed and a bootstrap analysis was<br />

performed on this phylogenetic network (Figure 14). It is interesting to note that<br />

numerous alternative splits were obtained (60 in total), however no splits were observed<br />

excluding C. glabrata, V. polyspora and S. castellii from the remaining WGD<br />

organisms, despite these species form clades within the WGD group. These results are<br />

in conflict with the concatenated phylogeny (Figure 13), which inferred that C. glabrata<br />

and V. polyspora formed a well-defined clade, and S. castellii was the basal species<br />

from the remaining WGD organisms. The syntenic information clearly shows that S.<br />

castellii diverged from the Saccharomyces sensu stricto lineage before C. glabrata<br />

(Scanell et al., 2006). Therefore topologies that place C. glabrata as outgroup to the<br />

Saccharomyces sensu stricto lineage and S. castellii or vice-versa are unreliable<br />

(Scanell et al., 2006; Fitzpatrick et al., 2006; Suh et al., 2006), like the ones observed in<br />

our study with the simple and concatenated genes. It is known that systematic bias can<br />

influence the resulting tree (Phillips et al., 2004). Hence, a closer scrutiny would be<br />

necessary.The incongruences found in these analyses suggest that the use of additional<br />

Saccharomycotina taxa would be important to help solve the placement of species<br />

belonging to these groups. In addition, it is likely that stochastic errors and erroneous<br />

inferences could be eradicated with the addition of extra genome data, when it becomes<br />

available.<br />

Figure 14. Phylogenetic network reconstructed using a concatenated alignment of RLM1 – MCM1 genes<br />

in the subphylum Saccharomycotina. The NeighborNet method was used to infer splits within<br />

the alignment (1000 bootstrap nodal support).<br />

73


74<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

4.3.2. Phylogenetic relationship amongst Candida species<br />

The group of organisms that translate mainly CUG as serine instead of leucine has<br />

been designated as the CUG group (Kawaguchi et al., 1989; Santos and Tuite, 1995).<br />

This co<strong>do</strong>n reassignment has been proposed to have occurred about 170 million years<br />

ago (Massey et al., 2003). Rlm1, Mcm1 and concatenated topologies inferred a robust<br />

monophyletic clade containing organisms within the CUG group (Figures 11 - 13). The<br />

topology analysis showed that there are four putative CUG sub-clades, the first<br />

containing C. lusitaniae, the second C. guilliermondii and Debaromyces hansenii, the<br />

third Pichia stipitis and the fourth containing C. tropicalis, C. albicans, C. dubliniensis,<br />

C. parapsilosis, and Lodderomyces longisporus. These topologies have also been<br />

recognized in other phylogenetic studies inferred by rRNA genes (Suh et al., 2006).<br />

Some phenotypic and morphologic characteristics could correlate with the topology<br />

observed since C. lusitaniae and C. guilliermondii are haploid yeasts, and their<br />

teleomorphic states present sexual reproduction (Wickerham and Burton, 1954;<br />

Rodriguez de Miranda, 1979; Young et al., 2000). D. hansenii and Pichia stipitis are<br />

homothallic, with a fused mating locus (Dujon et al., 2004; Fabre et al., 2005). In<br />

contrast, members of the fourth sub-clade have at best a cryptic sexual cycle and have<br />

never been observed to undergo meiosis (Hull and Johnson, 1999; Hull et al., 2000;<br />

Magee and Magee, 2000; Pujol et al., 2004; Logue et al., 2005).<br />

The phylogeny of Candida species was also investigated in this study by using a<br />

further analysis with Iff8 protein and gene (Figure 15). The resultant topologies placed<br />

L. elongisporus within the asexual clade with high support, which is in agreement with<br />

other phylogenetic studies based on multigenic and phylogenomic analyses (James et<br />

al., 1994; Fitzpatrick et al., 2006; Diezmann et al., 2004).<br />

The CUG specific topology based on three concatenated proteins/genes, RLM1,<br />

MCM1 and IFF8, suggested that D. hansenii and C. guilliermondii are sister taxa, as<br />

they are grouped together with high support by maximum likelihood and bayesian<br />

inference, excluding C. lusitaniae and P. stipitis (Figure 16). Topologies obtained with<br />

Rlm1 protein/gene and RLM1-MCM1 concatenated genes presented conflicts in the<br />

placement of P. stipitis, being included within C. guilliermondii – Debaryomyces<br />

hansenii clade (Figure 11 and 13A).


A B<br />

RESULTS AND DISCUSSION<br />

Figure 15. Phylogeny of CUG Group based on IFF8. (A) Protein phylogeny obtained by maximum<br />

likelihood, using the JTT + I + G evolutive model with nodal support by Shimodaira-<br />

Hasegawa; (B) Gene phylogeny obtained by using the evolutive model used was GTR + I +<br />

G. Nodes values correspond to posterior probabilities (left) and maximum likelihood (right,<br />

Shimodaira-Hasegawa like support). *In the maximum likelihood analysis, C. tropicalis,<br />

C.parapsilosis and L.elongisporus form a clade with 0.41 S-H and C.parapsilosis-<br />

L.elongisporus are sister taxa with 0.98 S-H. The species mentioned before with C.albicans<br />

and C.dubliniensis form a clade well supported 0.97S-H.<br />

A B<br />

Figure 16. Subphylum Saccharomycotina phylogeny reconstructed using concatenated alignment of<br />

RLM1 - MCM1 – IFF8. (A) Protein phylogeny obtained by maximum likelihood with the<br />

evolutive model JTT + I + G and Shimodaira-Hasegawa like nodal support; (B) Genes<br />

phylogeny obtained by using the evolutive model used was GTR + I + G. Nodes values<br />

correspond to posterior probabilities (left) and maximum likelihood (right, Shimodaira-<br />

Hasegawa like support).<br />

75


76<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Other studies have placed C. lusitaniae in a clade with C. guilliermondii, and<br />

inferred a closer relationship between the two with Debaryomyces species (Daniel et al.,<br />

2001; Diezmann et al., 2004). These results raise interesting questions regarding the<br />

sexual status of the Candida species. It is possible that the "asexual" species are in fact<br />

fully sexual. C. albicans, C. dubliniensis and C. tropicalis have been observed to mate<br />

(Soll et al., 1988; Pujol et al., 2004), and in addition C. albicans genome contains most<br />

of the required genes for the development of meiosis (Tzung et al., 2001). In contrast<br />

the evidence that L. elongisporus reproduces sexually is based on the appearance of<br />

asci, with one (or sometimes two) spores (Van der Walt, 1966).<br />

The phylogenetic network constructed based on the three concatenated genes<br />

(Figure 17) corroborated the CUG-specific concatenated tree regarding the grouping of<br />

C. guilliermondii and D. hansenii as sister taxa, excluding C. lusitianiae and P.stipitis,<br />

but when the network was inferred by two concatenated genes C. guilliermondii, D.<br />

hansenii and P. stipitis formed sister taxa (Figure 14). This analysis showed a topology<br />

conflict with the phylogeny based on Iff8 protein and gene for L. elongisporus group<br />

with C. parapsilosis (Figure 15). As expected there was no conflict for the grouping of<br />

C. albicans and C. dubliniensis, illustrating their high genotypic similarity (Sullivan et<br />

al., 1995), and those species formed with C. tropicalis a highly supported clade, which<br />

agree with other recent published studies (Fitzpatrick et al., 2006; James et al., 2006;<br />

Suh et al., 2006).<br />

Figure 17. Subphylum Saccharomycotina phylogenetic network reconstructed using a concatenated<br />

alignment of RLM1 – MCM1 – IFF8 genes. The NeighborNet method was used to infer splits<br />

within the alignment (1000 bootstrap nodal support).


4.4. Analysis of adaptive evolution<br />

RESULTS AND DISCUSSION<br />

Genome duplication has occurred in the WGD group as previously described (Kellis<br />

et al., 2004). Understanding the fate of duplicates is fundamental to clarify mechanisms<br />

of genetic redundancy and the link between gene family diversification and phenotypic<br />

evolution. Models to identify possible sites of positive selection can provide alternative<br />

explanations for the persistence and diversification of gene duplicates.<br />

The detection of an excess in the ratio of the rate of nonsynonymous (dN) over the<br />

synonymous (dS) substitutions (dN/dS, also denoted ω) is a nonambiguous indicator of<br />

positive selection at the co<strong>do</strong>n level. Positive selection (ω>1) is considered as the<br />

selection of fixing advantageous mutations, and this term is used interchangeably with<br />

molecular adaptation and adaptive molecular evolution. Purifying selection (ω


78<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

avoid the presence of a high number of gaps and to analyze closer species. This new<br />

analysis was performed by using the HYPHY 1.0 package because it consider the<br />

presence of gaps calculating a dN/dS value for those positions (Annex VIII). To better<br />

analyze this sequence the conserved moieties were identified. In the alignment of<br />

Saccharomycotina Rlm1 proteins four conserved moities were recognized, the first<br />

motif is the MADS-box conserved region at the N-terminal, the second is an acidic<br />

region close to MADS-box, third and fourth motif were named A and B, respectively<br />

(Figure 18).<br />

MADS-BOX<br />

Acidic region A<br />

B<br />

Figure 18. Conserved regions in RLM1 gen of the subphylum Saccharomycotina: MADS-box, Acidic<br />

region, A-B Conserved regions present in this subphylum.


RESULTS AND DISCUSSION<br />

At the C-terminal of Rlm1 a repetitive region was identified in Candida albicans,<br />

Candida dubliniensis, Kluyveromyces lactis and Saccharomyces sensu stricto group<br />

(Figure 19).<br />

Results obtained by applying the models to identify sites of positive selection<br />

showed exactly the same evidence, no positive selection not only in the MADS-box<br />

<strong>do</strong>main but also in the all RLM1 sequence, confirming the previous findings. The<br />

significance of the values obtained with the complex models against their corresponding<br />

null models was tested by using a Likelihood Ratio Test (LRT), according to Anisimova<br />

et al. (2001) (Annex IX).<br />

The LRT test confirmed the results supporting no positive selection. Overall, the<br />

results from maximum-likelihood models of co<strong>do</strong>n evolution indicated that the<br />

evolution of MADS-box from fungi and Saccharomycotina RLM1 gene are mainly<br />

under purifying selection, suggesting that the expressed protein plays an important role<br />

in the fungi that must be maintained.<br />

Figure 19. Region of microsatellites present in some species from subphylum Saccharomycotina.<br />

Alignment was performed by using MEGA 4.0 with ClustalW.<br />

A new co<strong>do</strong>n-based model the Mechanistic Empirical Combined (MEC), based<br />

on the junction of all parameters used by the traditional models mentioned before and a<br />

new empirical amino acid replacement model was recently described (Doron-<br />

Faigenboim and Pupko, 2006). This MEC model was also tested on the<br />

Saccharomycotina sequences to detect sites in Rlm1 under positive selection. By means<br />

of this new model it was possible to identify sites under positive selection in the<br />

79


80<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Saccharomycotina Rlm1 protein. The protein sequence from S. cerevisiae was used as<br />

pattern to identify those sites (Figure 20).<br />

The MEC model produced a loglikelihood value of -38544.7 that is higher than -<br />

39363.5 obtained for M8a (used as the null model) which indicates the presence of sites<br />

under positive selection. To test the significancy of the MEC model result, an AICc<br />

(Akaike Information Criterion correction) was performed and the value of AICc for the<br />

MEC model (77099.41602) was lower than the one obtained for the M8a model<br />

(78735.01068), which means that the hypothesis of positive selection can be accepted.<br />

Several sites in the Rlm1 sequence were identified as putatively under positive<br />

selection. It is curious to observe that 57% (20 out of 35) of the places under positive<br />

selection were observed in the first part of the protein, right after the MADS-Box and<br />

before the region A. Between the MADS-box and the acidic region 12% of the amino<br />

acid residues were identified as being under positive selection, between the acidic<br />

region and the region A 8%, between the A and B region 1.7%, and between the B and<br />

the repetitive region 10% (Figure 20).<br />

A particular fact was the observation of amino acids under positive selection<br />

near the microsatellite region. Taking into consideration this repetitive region, we can<br />

observe that in C. albicans, C. dubliniensis and K. lactis the amino acid under repetition<br />

is glutamine, coded by CAA, while in the group of Saccharomyces sensu stricto /S.<br />

cerevisiae is asparagine, encoded by AAC. This observation showed that amino acid<br />

substitutions that would have occurred during the divergence of species, due to events<br />

of mutations and natural selection changed this microsatellite region and probably<br />

diversified gene function in WGD species, avoiding the loss of protein function.<br />

Numerous lines of evidence have demonstrated that genomic distribution of<br />

repetitive DNA, particularly of microsatellites, is nonran<strong>do</strong>m presumably because of<br />

their effects on chromatin organization, regulation of gene activity, recombination,<br />

DNA replication, cell cycle, mismatch repair (MMR) system, etc. (Li et al., 2002).<br />

Microsatellites may provide an evolutionary advantage of fast adaptation to new<br />

environments as evolutionary tuning knobs (Kashi et al., 1997; Trifonov, 2003). Due to<br />

the presence of trinucleotide microsatellites within protein coding regions tracts of


RESULTS AND DISCUSSION<br />

repeated amino acid are present in some proteins and different amino acid repeats are<br />

concentrated in different classes of proteins. In yeast proteins, the most abundant amino<br />

acid repeats are Gln, Asn, Asp, Glu, and Ser (Richard and Dujon 1997; Alba et al.,<br />

1999). In this work within Saccharomycotina Rlm1 microsatellite region tracts of poly-<br />

Gln and poly-Asn have been found (Figure 19). The change in the amino acid tract at<br />

the microsatellite region, from Gln (CAA) to Asn (AAC) could be due to a possible<br />

frameshift mutation that occurred in the gene before the microsatellite region. Since,<br />

this kind of mutations has been already observed in MADS-Box genes, exactly at the C-<br />

terminal (Kramer at al. 2006). Thus, the microsatellite repetitive region of this gene was<br />

further analysed.<br />

Figure 20. Saccharomyces cerevisiae amino acid sequence showing sites under positive selection by the<br />

MEC model. Conserved regions: MADS-box (blue), Acidic region (orange), Region A<br />

(purple), Region B (green) and microsatellites (red).<br />

81


82<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

4.5. Event of frameshift mutation in Saccharomyces cerevisiae RLM1<br />

gene<br />

In the sequence alignment of Saccharomycotina Rlm1 proteins four conserved<br />

moities were recognized, and at the C-terminal a repetitive region was identified in C.<br />

albicans, C. dubliniensis, Kluyveromyces lactis and Saccharomyces sensu stricto group,<br />

which seems to be under positive selection. It is difficult to determine the molecular<br />

events that changed the amino acid residues in the sites identified as under positive<br />

selection but from the nucleotide sequence analysis, the molecular event that could have<br />

changed the tract at the repetitive region might have been a frameshift mutation.<br />

Additionally, this type of mutations has been already observed at the C- terminal of<br />

MADS-box genes, as previously mentioned, and thus this hypothesis was tested by<br />

analyzing the different RML1 reading frames.<br />

The three open reading frames from C. albicans, C. dubliniensis, K. lactis and S.<br />

cerevisiae were analyzed to find all the potential repetitive amino acids in the C-<br />

terminal region of this gene. The first reading frame showed a poli- Glutamine (Gln–Q)<br />

tract in the C-terminal of C. albicans, C. dubliniensis and K. lactis, and a poli-<br />

Asparagine (Asn–N) in the C - terminal of S. cerevisiae (Figure 21). These reading<br />

frames are the actual coding frames of the mentioned yeast species. The S. cerevisiae<br />

second reading frame showed a translation to Threonine (The–T), while the third<br />

reading frame encoded a poli-Glutamine tract, just like C. albicans, C. dubliniensis and<br />

K. lactis (Figure 21).<br />

The mutational mechanisms known to contribute to microsatellite polymorphism<br />

and that could affect the reading frame in the microsatellite region are the following: i)<br />

DNA polymerase slippage during DNA replication (Tachida and Izuka, 1992) and ii)<br />

recombination (unequal crossover or gene conversion) between DNA strands (Harding<br />

et al., 1992; Li et al., 2002). Microsatellites studies indicated that trinucleotide repeats<br />

in coding sequences of many organisms are selected against possible frameshift<br />

mutation since these mutations are generally considered detrimental by causing<br />

nonfunctional transcripts and /or protein, through possible insertion of premature stop<br />

co<strong>do</strong>ns (Ohno, 1970, Tóth et al., 2000; Wren et al., 2000; Cordeiro et al., 2001).<br />

Although trinucleotide repeats are under selective pressure to avoid frameshift<br />

mutations, exceptional situations have been described when a protein is temporarily


RESULTS AND DISCUSSION<br />

freed from selective pressure. If selective pressure is relieved, for example, through a<br />

second copy of a gene, this duplicate can compensate for the possible loss of function<br />

caused by the frameshift mutation and enable such mutations to lead to functional<br />

divergence (Raes and Van de Peer, 2005).<br />

Figure 21. The three RLM1 open reading frames of Candida albicans, Kluyveromyces lactis and<br />

Saccharomyces cerevisisae visualized with the Virtual Ribosome 1.0. The microsatellite<br />

region, as well as the most likely position to insert a nucleotide and change S. cerevisisae<br />

reading frame, avoiding the presence of stop co<strong>do</strong>ns, are indicated<br />

Genomic duplication has been proposed as an advantageous path to evolutionary<br />

innovation, because duplicated genes can supply genetic raw material for the emergence<br />

of new functions through the forces of mutation and natural selection (Ohno, 1970).<br />

Such duplication can involve individual genes, genomic segments or whole genomes<br />

and coordinated duplication of an entire genome may allow for large-scale adaptation to<br />

new environments.<br />

Nucleotide insertion<br />

C. albicans<br />

K. lactis<br />

S. cerevisiae<br />

In this work, the phylogeny analysis clearly inferred two groups that suffered<br />

genome duplication: Saccharomyces sensu strict and sensu lato and it was in the<br />

Saccharomyces sensu stricto group that the amino acid encoded by trinucleotide repeats<br />

suffered alteration. The genome duplication of S. cerevisisae was firstly proposed by the<br />

locations of paralogues that revealed the existence of ancestral duplication blocks<br />

(Wolfe and Shields, 1997; Langkjaer et al., 2003). The model WGD was then proposed<br />

3<br />

2<br />

1<br />

3<br />

2<br />

1<br />

3<br />

2<br />

1<br />

83


84<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

and corroboted by syntenic and genomic studies (Wolfe and Shields, 1997; Seoighe and<br />

Wolfe, 1999, Kellis et al., 2004). Those studies postulated that the WGD event occurred<br />

after the divergence of S. cerevisiae and K. lactis, being also identified a group of Non-<br />

WGD species represented by K. lactis, A. gossypii, S. kluyveri and K. waltii (Byrne and<br />

Wolfe, 2006). In accordance to the inferred phylogenies firstly the CUG group diverged<br />

from the Non WGD – WGD groups, and then the Non WGD – WGD groups separated.<br />

Therefore, the microsatellite within RLM1 must have been present in same ancestral<br />

taxon within Saccharomycotina, evolving during the speciation process, and changing<br />

in recent species, possibly due to a frameshift mutation, as this work is indicating.<br />

Additionally, frameshift mutations have been detected in MADS-box transcription<br />

factor families of plant, causing mutations exactly in the C–terminal and yielding<br />

functional diversification (Lachtman, 1998, Kramer et al., 2006).<br />

In the present study two members of the MADS- box transcription factor family<br />

were studied, the Rlm1 and the Mcm1. However, in S. cerevisiae two other members of<br />

this family were identified, and this species presents four MADS- box in total: the Smp1<br />

whose function is involved in regulating the response to osmotic stress (de Nadal et al.,<br />

2003) and with orthologues in the WGD group and none in the Non WGD and CUG<br />

groups; and the Arg80 whose function is involved in the regulation of arginine-<br />

responsive genes (Dubois et al., 1987), being identified in S. cerevisiae and in the WGD<br />

group (Annexes V and VI). Recently, the complete genomes of Zygosaccharomyces<br />

rouxii and Kluyveromyces thermotolerans have been sequenced, showing that Z. rouxii<br />

presents four MADS-box transcription factors, like the WGD group, while K.<br />

thermotolerans presents just two MADS-box transcription factors, like Non-WGD<br />

group. Therefore, we could infer that the genome duplication in Saccharomycotina<br />

would have occurred in taxa that presented the four MADS-box transcription factors.<br />

Previous studies suggested that at least one MADS-box gene was present in the<br />

common ancestor of plants, animals, and fungi, and that probably the duplication that<br />

gave rise to the animal MADS- box type II and type I genes occurred after animals<br />

diverged from plants, but before fungi diverged from animals (Theissen et al., 1996).<br />

However, Alvarez-Buylla et al. (2000) proposed that at least one ancestral MADS- box<br />

gene duplicated in the common ancestor of the major eukaryotic king<strong>do</strong>ms, more than a<br />

billion years ago, to give rise to the distinct type I (SRF-like) and type II (MEF2-like)


RESULTS AND DISCUSSION<br />

lineages found in plants, fungi, and animals today. Figure 22 presents a scheme of the<br />

MADS-box genes evolution model based on Alvarez-Buylla et al. (2000).<br />

B)<br />

A)<br />

Animal and Fungi<br />

Animals<br />

Lineage<br />

MEF2-like and<br />

SRF-like<br />

isoforms<br />

Lineages<br />

Fungi<br />

Lineage<br />

WGD – Saccharomycotina<br />

Lineage<br />

al., 2000)<br />

Plants<br />

Lineage<br />

Common<br />

Ancestral<br />

1 billion<br />

years ago<br />

(Based on<br />

Alvarez-Buylla et<br />

APETALA,<br />

PISTILLATA,<br />

DEFICIENS,<br />

CAULIFLOWER,<br />

FRUITFULL,<br />

AGAMOUS and<br />

SQUAMOSA<br />

Saccharomyces cervisiae Candida albicans<br />

Figure 22. A) Evolution of MADS-box genes based on the Alvarez-Buylla et al.(2000) model, proposing<br />

a common origin for Animals, Fungi and Plants and a further duplication event in the Fungi<br />

lineage in some species within the subphylum of Saccharomycotina (WGD group). B)<br />

Comparison of Rlm1 protein/gene <strong>do</strong>mains in S. cerevisiae and C. albicans, showing the<br />

frameshift mutations occurred after genome duplication in S. cerevisiae and WGD species<br />

that changed the coding amino acid from Q to N.<br />

85


86<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Animal and fungal MADS-box sequences type II are more closely related to most<br />

plant MADS-box sequences type II than to animal MADS-box sequences type I,<br />

suggesting the occurrence of at least one gene-duplication event before the divergence<br />

of plants and animals. This model also proposes that after the divergence of animal-<br />

fungi and plant lineages both types of MADS-box genes were transferred to each<br />

lineage. In the plant lineages different evolutive mechanisms such as gene duplication,<br />

frameshift mutation, and alternative splicing, occurred, producing a variety of MADS-<br />

box transcription factor genes, while in fungi and animals each diverged with both<br />

MADS-box types. In animals the alternative splicing has been reported as the main<br />

evolutive mechanism that changed these genes. Fungi lineages maintained both MADS-<br />

box types represented by RLM1 and MCM1 genes, but within the ‗Saccharomyces<br />

complex‘ in subphylum Saccharomycotina, with the whole genome duplication event<br />

occurring in Saccharomyces sensu stricto and sensu lato clades, the genes SMP1 and<br />

ARG80 appeared, that are paralogues to Rlm1 and Mcm1, respectively. After this<br />

genome duplication the coded amino acid in the repetitive region of RLM1 gene within<br />

the Saccharomyces sensu stricto group changed. This was due to a frameshift mutation<br />

to encode Asn instead of Gln as in the Non-WGD and CUG groups, that represent more<br />

basal clades in the subphylum Saccharomycotina.<br />

The putative ancestral RLM1 gene was predicted to try to confirm the microsatellite<br />

co<strong>do</strong>n present in the commom ancestor from which CUG, Non-WGD and WGD groups<br />

diverged (Figure 23). This ancestral gene presented the main features characteristic of<br />

the MADS-box transcription factor that has been previously identified namely, the<br />

MADS-box, the acidic region and the regions A and B (Figure 18), but the repetitive<br />

region was not observed. However, in the C-teminal of the protein this predicted<br />

putative ancestral gene/protein showed a low accuracy percentage (59.9% - 43%),<br />

presenting 194 unreliable sites of 1407. These unreliable sites are present in different<br />

regions other than conserved ones, indicating a large variability on the unreliable<br />

regions that were mainly replaced by amino acids with similar properties. On the other<br />

hand, the conserved regions will be a primordial in the function of the protein and thus,<br />

their presence in basal taxa in the subphylum Saccharomycotina.<br />

Several reports indicate that transcription factors and protein kinases are<br />

significantly associated with acidic and polar amino acid repeats, whereas Ser repeats


RESULTS AND DISCUSSION<br />

are significantly associated with membrane transporter proteins (Alba et al., 1999).<br />

Rlm1 is a transcription factor and the repetitive tracts observed are poly-Gln and poly-<br />

Asn, which agrees with the previous refered works.<br />

M G R R K I E I Q P I T D D R N R T V T F I K R K A G L F K<br />

5' ATGGGTAGAAGAAAGATTGAAATTCAGCCGATCACGGACGATCGAAACCGGACAGTGACGTTTATCAAGCGTAAGGCCGGTCTATTCAAA 90<br />

>>>.......................................................................................<br />

* * *<br />

K A H E L A V L C Q V D V A V I I F G N N N K L Y E F S S V<br />

5' AAGGCTCACGAGTTGGCTGTGCTTTGCCAGGTAGATGTTGCGGTGATCATCTTTGGCAACAACAATAAGCTGTACGAGTTCTCCTCTGTT 180<br />

............)))......................................................)))..................<br />

*<br />

D T N E L I K R Y Q K N M P H E M K A P E D Y G D Y K K K K<br />

5' GACACAAACGAGTTGATTAAAAGATACCAGAAAAACATGCCGCACGAAATGAAGGCGCCTGAGGACTACGGAGACTACAAGAAGAAAAAA 270<br />

............))).....................>>>.........>>>.......................................<br />

* * * * *<br />

H V G D D R P T A H F N V N N N N D D D D D D D D D E D D D<br />

5' CATGTGGGTGATGATAGACCAACGGCACATTTTAATGTTAATAACAATAATGACGATGATGATGACGATGATGATGATGAAGATGATGAT 360<br />

..........................................................................................<br />

* * * * * * * * * * * * *<br />

D T D N D T P D A N P K D K N E T E K K T Q N Q T P K E L N<br />

5' GATACTGATAACGATACTCCTGATGCCAATCCTAAAGATAAAAACGAGACGGAAAAAAAAACTCAAAATCAAACTCCTAAGGAGCTGAAC 450<br />

....................................................................................)))...<br />

* * * * * * * * * * * * * * * *<br />

H P Q Q P P P P Q H T S Q Q Q V Q K P E A P K D Q P A P P T<br />

5' CATCCTCAGCAACCCCCGCCGCCTCAACATACTTCTCAACAACAAGTACAAAAACCAGAAGCACCTAAAGACCAACCCGCGCCGCCGACA 540<br />

..........................................................................................<br />

* * * * * * * * * * * * *<br />

P P N H T H K N P D T M N Q R P E P P V Q I P T D V Q T N H<br />

5' CCTCCAAACCACACACACAAGAACCCGGATACTATGAACCAAAGACCTGAACCGCCAGTTCAAATTCCAACTGATGTCCAGACTAATCAT 630<br />

.................................>>>......................................................<br />

* * * * * * * * * * * * * * * * * *<br />

Q N D N A K T I T A I H T H K Q K N Q K N T K N K N N K N N<br />

5' CAGAACGATAATGCTAAAACCATTACGGCCATTCACACCCATAAACAAAAAAATCAAAAGAATACTAAGAATAAAAATAATAAAAATAAT 720<br />

..........................................................................................<br />

* * * * * * * * * * * * * * * * * * * * * * * *<br />

L S N Q N H I N S E T T P T T N Y S A S I K T P D S S N H T<br />

5' CTAAGTAATCAGAATCATATTAATAGTGAAACTACTCCAACTACCAATTATTCTGCATCGATCAAAACACCTGATTCATCAAACCATACT 810<br />

..........................................................................................<br />

* * * * * * * * * * * * * *<br />

L P I P T T T K E L P P T P V T A T A P G L P I S K N N P Y<br />

5' TTGCCAATACCAACAACAACTAAAGAACTACCTCCTACTCCAGTTACTGCAACTGCTCCTGGATTACCAATAAGTAAAAATAATCCATAT 900<br />

))).......................................................................................<br />

* * * * * * * * *<br />

F F G S E P P P Q V S P P Y P S I T P I L Q H E I P Q Q D P<br />

5' TTTTTTGGATCCGAACCTCCACCTCAAGTTTCTCCTCCATATCCTTCTATTACACCTATACTTCAACATGAAATTCCTCAACAAGATCCA 990<br />

..........................................................................................<br />

* * * * * * * * * *<br />

P P Q P V G Q H T P Q Y Q Q T T Q T D T N N K K K L P P P H<br />

5' CCGCCGCAACCGGTAGGACAACATACACCTCAATACCAACAAACTACTCAAACAGATACTAACAATAAGAAAAAATTACCACCACCACAT 1080<br />

..........................................................................................<br />

* * * * * * * * * * * * * *<br />

P N L N H P Q N N A A P I P P P E T I E H N P T G E E Q T P<br />

5' CCTAATCTAAATCATCCACAAAATAATGCGGCACCAATACCTCCACCTGAAACAATTGAACATAATCCTACTGGTGAAGAACAAACGCCA 1170<br />

..........................................................................................<br />

* * * * * * * * * * * * * * * * * *<br />

I S G L P S R Y V N D M F P S P S N F Y A P Q D W P T G M T<br />

5' ATATCAGGACTACCATCGCGATATGTTAACGACATGTTTCCATCTCCATCTAACTTCTACGCTCCTCAAGATTGGCCCACTGGTATGACA 1260<br />

.................................>>>................................................>>>...<br />

* * *<br />

P I N A N M P Q Y V A G T I P A G G P K K E Q T R T R K K T<br />

5' CCCATTAATGCTAATATGCCTCAATACGTTGCGGGTACGATTCCTGCAGGAGGTCCTAAGAAAGAACAAACTCGAACTCGCAAAAAAACA 1350<br />

...............>>>........................................................................<br />

* * * * * * * * * * * * * * * * * * *<br />

K G N D K T D N K E D N N K E K K R K<br />

5' AAGGGAAATGATAAAACGGATAATAAGGAAGACAATAATAAAGAAAAAAAGAGAAAA 1407<br />

.........................................................<br />

* * * * * * * * * * * * * *<br />

Figure 23. Predicted putative ancestral RLM1 gene sequence from which CUG, WGD and Non-WGD<br />

groups diverged. MADS.box (red), Acidic region (orange), conserved region A (yellow) and<br />

conserved region B (green). *unreliable sites.<br />

Earlier investigations speculated that eukaryotes incorporating more DNA<br />

repeats, might provide a molecular device for faster adaptation to environmental<br />

87


88<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

stresses (Kashi et al., 1997; Marcotte et al. 1999; Wren et al., 2000; Li et al. 2002;<br />

Trifonov, 2003). This speculation has been supported by an increasing number of<br />

experiments. For instance, in S. cerevisiae, microsatellites are overrepresented among<br />

ORFs encoding for regulatory proteins (e.g., transcription factors and protein kinases)<br />

rather than for structural ones, indicating the role of microsatellites as a factor<br />

contributing to fast evolution of adaptive phenotypes (Young et al., 2000). In<br />

prokaryotes, microsatellites are not as abundant as in eukaryotes. Most of the<br />

microsatellites in bacteria are located in virulence genes and/or regulatory regions, and<br />

they affect pathogenesis and bacterial adaptive behavior (Hood et al., 1996; Peak et al.,<br />

1996; van Belkum et al., 1998; Field and Wills, 1998). The contingency genes<br />

containing microsatellites show high mutation rates, allowing the bacterium to act<br />

swiftly on deleterious environmental conditions, showing the role of these repetitive<br />

DNA sequences in natural selection (Moxon et al. 1994). In C. albicans, it has been<br />

demonstrated that the microsatellite region in the 3‘end of RLM1 gene, denominated<br />

CAI, presents high polymorphism (Sampaio et al., 2003, 2009). Additionally, the<br />

analysis of CAI polymorphism in strains from recurrent infections revealed<br />

microevolutionary changes at CAI, suggesting a putative involvement of RLM1 in<br />

evolutive and/or adaptative response of C. albicans strains during recurrence (Sampaio<br />

et al. 2005).<br />

The C-terminal of proteins from the MADS box family is necessary for dimerization<br />

and required for transcripticional activation (Messenguy and Dubois, 2003). It is known<br />

that their regulatory specificity depends on accessory factors and in many cases the<br />

cofactor with which the protein interacts specifies which genes are regulated, when they<br />

are regulated and if these genes are transcriptionally activated or repressed (Shore and<br />

Sharrocks, 1995). Moreover, Sampaio et al., (2009) demonstrated that the CAI<br />

repetitive region confers a high genetic variability to RLM1 gene which is reflected in<br />

strain susceptibility to different stress conditions. Thus, and although not evident in the<br />

putative ancestral sequence, this regions seemed to be important for the protein<br />

function. After genome duplication the coded amino acid in the repetitive region of<br />

RLM1 gene within the Saccharomyces sensu stricto group changed, possibly due to a<br />

frameshift mutation to encode Asn instead of Gln. This change certainly created<br />

additional /new function for this protein.


FINAL REMARKS


5. FINAL REMARKS<br />

FINAL REMARKS<br />

Presently, the search of protein-encoding genes that present the desired<br />

characteristics to infer a robust phylogeny is a challenge. Several genes have been used<br />

to construct phylogenetic trees of different species among which the ribosomal genes<br />

were used to infer most of the phylogenetic relationships among the different king<strong>do</strong>ms<br />

of life. In recent years, the use of concatenated gene alignments (multigene analysis)<br />

and of complete genome analysis has become the best approach for resolving<br />

phylogenetic relationships among species, whose placements were not well determined<br />

by single-gene approach. However, the concatenated alignments of genes have only<br />

been based in conserved regions, which can lead to a loss of genetic information that<br />

might be important to resolve certain species placement that even with a large number<br />

of genes may not be possible to infer accurate position.<br />

In this work we used three nuclear protein-encoding genes, RLM1 and MCM1,<br />

belonging to the family of MADS-box transcription factors, and Iff8 a GPI-anchor<br />

protein, to infer phylogeny in the king<strong>do</strong>m Fungi, using the entire gene sequence in<br />

contrast with the previously mentioned studies. Orthologue sequences for the two genes,<br />

RLM1 and MCM1, were identified in 76 species from the king<strong>do</strong>m Fungi that have their<br />

genome sequenced, and 8 putative orthologue IFF8 genes in the group CUG.<br />

Results obtained from the phylogenetic analysis using the transcription factor RLM1<br />

indicated that it presents conditions to be considered within a multigene analysis, since<br />

the obtained phylogeny is closer to the ones established by multigene and phylogenomic<br />

analyses. The transcription factor Mcm1 presented limitations to be used in phylogeny<br />

at the king<strong>do</strong>m level, because of its variable size sequences, leading to possible errors in<br />

the alignment. This problem could be overcome with extra sequences of other species<br />

that would help to infer a more accurate phylogeny and still be used at the king<strong>do</strong>m<br />

level. Despite this limitation this gene can be used to resolve phylogeny at the phylum<br />

or lower clades level, because its use, independently and/or concatenated, to infer<br />

phylogeny within the subphylum Saccharomycotina was in agreement with published<br />

studies. On the other hand, the use of IFF8 gene to infer phylogeny is limited and<br />

restricted to the CUG group, since other orthologues were not found within the king<strong>do</strong>m<br />

Fungi and the results obtained in CUG group phylogeny presented conflicts, particularly<br />

91


92<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

in the position of C. tropicalis that is not in agreement with what has already been<br />

determined as its correct relationship in Candida phylogeny.<br />

It is known that, within subphylum Saccharomycotina, the process of genome<br />

duplication has occurred in the ‗Saccharomyces complex‘, in known groups as sensu<br />

stricto and sensu lato, resulting in duplication of some genes. The transcription factors<br />

studied in this work are within the duplicated genes that were maintained.<br />

Understanding the fate of duplicates is fundamental to clarify mechanisms of genetic<br />

redundancy and the link between gene family diversification and phenotypic evolution.<br />

Thus, in this study, the RLM1 gene was analyzed particularly within the subphylum<br />

Saccharomycotina to identify possible sites of positive selection that could provide<br />

alternative explanations for the persistence and diversification of this gene. The analysis<br />

of adaptive evolution within the subphylum Saccharomycotina, by using the<br />

conventional evolutive models, indicated no positive selection in the RLM1 gene<br />

(purifying selection), suggesting that this protein plays an important role in these<br />

organisms. However, a more complete model, the MEC, identified positive selection in<br />

several amino acid sites, indicating nonsynonymous substitutions in some amino acid<br />

positions. Although these substitutions were present in different positions, these<br />

positions were not inside conserved regions, suggesting that these changes occurred due<br />

to different molecular events during the evolution process of divergence of species. A<br />

particular situation that was further analyzed in this study was the observation of the<br />

presence of amino acids under positive selection at the beginning of the repetitive<br />

region, a trinucleotide microsatellite, and differences in the amino acid under repetition<br />

between the species that presented the duplicated the genome and the non duplicated.<br />

The S. cerevisiae Rlm1 protein presents a common microsatellite region in C.<br />

albicans, C. dubliniensis and K. lactis in the C-terminal of the protein, but there is one<br />

peculiarity about this repetitive region. The amino acid encoded by this trinucleotide<br />

microsatellite in S. cerevisiae is the Asn, and in the other species is Gln. The closer<br />

analyses of the different reading frame of RLM1 gene lead to the conclusion that the<br />

most likely molecular mechanism that changed the amino acids at the microsatellite<br />

region was a frameshift mutation by insertion of nucleotides and not an alteration of<br />

nucleotides in the co<strong>do</strong>ns, as it is generally the case in nonsynonymous substitutions.<br />

This mechanism of frameshift mutation prevented the insertion of stop co<strong>do</strong>ns in the C-


FINAL REMARKS<br />

terminal, avoiding the loss of this region in the protein. This change of amino acids in<br />

the microsatellites of S. cerevisiae RLM1 gene should have brought structural and<br />

functional alterations to the protein.<br />

Another peculiarity of this gene is the absence of these microsatellites in basal<br />

species, within the subphylum Saccharomycotina, indicating that their evolution has<br />

occurred during the divergence of species. This was further confirmed by the<br />

observation that the predicted ancestral RLM1 gene presented the main features<br />

characteristic of the MADS-box transcription factor that has been previously identified<br />

namely the MADS-box, the acidic region and the regions A and B, but the repetitive<br />

region was absent. This absence would be related with the role that the microsatellites<br />

play in the functionality of these proteins in some species and adaptation to different<br />

environmental conditions. Thus, the process of duplication, and genome reorganization<br />

influenced the restructuring of the S. cerevisiae RLM1 gene, decreasing the selective<br />

pressure on this gene and enabling a frameshift mutation, which shifted the reading of<br />

co<strong>do</strong>ns from Gln to Asn. Additionally, this process of gene duplication affected not only<br />

RLM1 gene but also MCM1 gene in the Saccharomyces sensu stricto and<br />

Saccharomyces sensu lato groups. This observation points to the potential use of<br />

MADS-box transcription factors in the studies of identification and process of genomes<br />

duplication within subphylum Saccharomycotina, due to the presence of two or four<br />

MADS-box genes in these yeast species.<br />

The major finding of this work were (i) the identification of the potential use of<br />

RLM1 gene for inferring phylogeny in the king<strong>do</strong>m Fungi with special emphasis to<br />

Candida species, and (ii) the observation that this gene evolved within the<br />

Saccharomyces sensu strict group after genome duplication, being the molecular<br />

mechanism responsible for the change observed in the C-terminal of this protein most<br />

probably a frameshift mutation.<br />

93


REFERENCES


6. REFERENCES<br />

REFERENCES<br />

Abascal, F., Zar<strong>do</strong>ya, R., and Posada, D. (2005). ProtTest: Selection of best-fit<br />

models of protein evolution. Bioinformatics. 21(9): 2104-2105.<br />

Adachi, J., and Hasegawa, M. (1995). MOLPHY: Programs for Molecular<br />

Phylogenetics. Tokyo: Inst. Statist. Math.<br />

Alba, M. M., Santibáñez-Koref, M. F., and Hancock, J. M. (1999). Amino acid<br />

reiterations in yeast are overrepresented in particular classes of proteins and<br />

show evidence of a slippage-like mutational process. J Mol Evol. 49: 789–797.<br />

Alexopoulos, C.J., Mims, C.W., and Blackwell, M. (1996). Introductory<br />

Micology. 4 th edition. John Wiley & Sons Inc., New York, EE.UU.<br />

Alfaro, M. E., Zoller, S., and Lutzoni, F. (2003). Bayes or bootstrap? A<br />

simulation study comparing the performance of Bayesian Markov chain Monte<br />

Carlo sampling and bootstrapping in assessing phylogenetic confidence.<br />

Molecular Biology and Evolution. 20: 255–266.<br />

Althoefer, H., Schleiffer, A., Wassmann, K., Nordheim, A., and Ammerer, G.<br />

(1995). Mcm1 is required to coordinate G2-specific transcription in<br />

Saccharomyces cerevisiae. Mol Cell Biol. 15: 5917–5928.<br />

Alvarez-Buylla, E. R., Pelaz, S., Liljegren, S. J., Gold, S. E., Burgeff, C., Ditta,<br />

G. S., Ribas de Pouplana, L., Martinez-Castilla, L., and Yanofsky, M. F. (2000).<br />

An ancestral MADS-box gene duplication occurred before the divergence of<br />

plants and animals. Proc Natl Acad Sci. 97: 5328–5333.<br />

Anisimova, M., Bielawski, J. P., and Yang, Z. (2001). Accuracy and power of<br />

the likelihood ratio test in detecting adaptive molecular evolution. Mol Biol<br />

Evol, 18(8): 1585-1592.<br />

Anisimova, M., Bielawski, J. P., and Yang, Z. (2002). Accuracy and power of<br />

Bayes prediction of amino acid sites under positive selection. Mol Biol Evol. 19:<br />

950-958.<br />

Anisimova, M., and Gascuel, O. (2006). Approximate Likelihood-Ratio Test for<br />

branches: A fast, accurate and powerful alternative. Systematic Biology. 55(4):<br />

539-552.<br />

Aragona, M., Montigiani, M., and Porta-Puglia, A. (2000). Electrophoretic<br />

karyotypes of the phytopathogenic Pyrenophora graminea and P. teres.<br />

Mycological Research. 104: 853-857.<br />

97


98<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Baillie, G. S., and Douglas, L. J. (1998). Effect of growth rate on resistance of<br />

Candida albicans biofilms to antifungal agents. Antimicrob Agents Chemother.<br />

42: 1900–1905.<br />

Baldauf, S. L. (1999). A search for the origins of animals and fungi: comparing<br />

and combining molecular data. American Naturalist. 154: S178-S188.<br />

Baldauf, S. L., and Palmer, J. D. (1993). Animals and fungi are each other‘s<br />

closest relatives: congruent evidence from multiple proteins. Proc Natl Acad Sci.<br />

90: 11558–11562.<br />

Baldauf, S. L., Roger, A. J., Wenk-Sierfert, I., and Doolittle, W. F. (2000). A<br />

king<strong>do</strong>m-level phylogeny of eukaryotes based on combined protein data.<br />

Science. 290: 972–977.<br />

Ball, L. M., Bes, M. A., Theelen, B., Boekhout, T., Egeler, R., and Kuijper, E. J.<br />

(2004). Significance of amplified fragment length polymorphism in<br />

identification and epidemiological examination of Candida species colonization<br />

in children undergoing allogeneic stem cell transplantation. J Clin Microbiol.<br />

42(4): 1673-1679.<br />

Ballesté, R, Arteta, Z., Fernández, N., Mier, C., Mousqués, N., Xavier, B.,<br />

Cabrera, M. J., Acosta, G., Combol, A., and Gezuele, E. (2005). Evaluación del<br />

medio cromógeno CHROMagar CandidaTM para la identificación de levaduras<br />

de interés médico. Rev Med Uruguay. 21: 186-193.<br />

Baker, E. E., Mrak, M., and Smith, C. E. (1943). The morphology, taxonomy,<br />

and distribution of Coccidioides irnmitis Rixford and Gilchrist 1896. Farlowia.<br />

1: 199-244.<br />

Bapteste, E., Brinkmann, H., Lee, J. A., Moore, D. V., Sensen, C. W., Gor<strong>do</strong>n,<br />

P., Duruflé, L., Gaasterland, T., Lopez, P., Müller, M., and Philippe, H. (2001).<br />

The analysis of 100 genes supports the grouping of three highly divergent<br />

amoebae: Dictyostelium, Entamoeba, and Mastigamoeba. Proc Natl Acad Sci.<br />

99: 1414 – 1419.<br />

Barr, D. J. S. (1992). Evolution and king<strong>do</strong>ms of organisms from the perspective<br />

of a mycologist. Mycologia. 84: 1–11.<br />

Barros Lopes, M., Rainieri, S., Henschke, P.A., and Langridge, P. (1999). AFLP<br />

fingerprinting for analysis of yeast genetic variation. Int J Syst Bacteriol. 49:<br />

915-924.


REFERENCES<br />

Bartnicki-Garcia, S. (1987). The cell wall in fungal evolution., p. 389-403. In A.<br />

D. M. Rayner, C. M. Brasier, and D. Moore (ed.), Evolutionary biology of the<br />

fungi. Cambridge University Press, New York, N. Y.<br />

Bhattacharya, D., Lutzoni, F., Reeb, V., Simon, D., Nason, J., and Fernandez, F.<br />

(2000). Widespread occurrence of spliceosomal introns in the rDNA genes of<br />

Ascomycetes. Molecular Biology and Evolution. 17: 1971–1984.<br />

Bauer, R., Begerow, D., Sampaio, J.P., Weiß, M., and Oberwinkler, F. (2006).<br />

The simple-septate basidiomycetes: a synopsis. Mycological Progress. 5: 41–66.<br />

Beckmann, J. S., and Weber, J. L. (1992). Survey of human and rat<br />

microsatellites. Genomics. 12: 627–631.<br />

Berbee, M. L. (1996). Loculoascomycete origins and evolution of filamentous<br />

ascomycete morphology based on 18S rDNA gene sequence data. Molecular<br />

Biology and Evolution. 13: 462–470.<br />

Berbee, M. L. (2001). The phylogeny of plant and animal pathogens in the<br />

Ascomycota. Physiology and Molecular Plant Pathology. 59: 165–187.<br />

Berbee, M. L., and Taylor, J. W. (1993). Dating the evolutionary radiations of<br />

the true fungi. Canadian Journal of Botany. 71: 1114–1127.<br />

Bernal, S., Mazuelos, E. M., Chavez, M., Coronilla, J., and Valverde, A. (1998).<br />

Evaluation of the new API Candida system for identification of the most<br />

clinically important yeast species. Diagnostic Microbiology and Infectuos<br />

Diseases. 32(3): 217-221.<br />

Binder, M., and Hibbett, D. S. (2002). Higher-level phylogenetic relationships of<br />

Homobasidiomycetes (mushroom-forming fungi) inferred from four rDNA<br />

regions. Molecular Phylogenetics and Evolution. 22: 76–90.<br />

Binder, M., Hibbett, D. S, and Molitoris, H. P. (2001). Phylogenetic<br />

relationships of the marine gasteromycete Nia vibrissa. Mycologia. 93: 679–<br />

688.<br />

Binder, M., Hibbett, D. S., Larsson, K-H., Larsson, E., Langer, E., and Langer,<br />

G. (2005). The phylogenetic distribution of resupinate forms across the major<br />

clades of mushroom-forming fungi (Homobasidiomycetes). Systematics and<br />

Biodiversity. 3(2): 113-157.<br />

Blackwell, M. (2000). Evolution. Terrestrial life-fungal from the start?. Science.<br />

293: 1129-1133.<br />

99


100<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Blackwell, M., Hibbett, D. S., Taylor, J., and Spatafora, J. W. (2006). Research<br />

coordination networks: a phylogeny for king<strong>do</strong>m Fungi (Deep Hypha).<br />

Mycologia. 98(6): 829-837.<br />

Boeke, J. D., Garfinkel, D. J., Styles, C. A., and Fink, G. R. (1985). Ty elements<br />

transpose through an RNA intermediate. Cell. 40: 491–500.<br />

Botterel, F., Cesterke, C., Costa, C., and Bretagne, S. (2001). Analysis of<br />

microsatellite markers of Candida albicans used for rapid typing. J Clin<br />

Microbiol. 39: 4076-4081.<br />

Bowman, B. H., White, T. J., and Taylor, J. W. (1996). Human pathogeneic<br />

fungi and their close nonpathogenic relatives. Mol Phylogenet Evol. 6(1): 89-96.<br />

Bowman, S. M., Piwowar, A., Al Dabbous, M., Vierula, J., and Free, S. J.<br />

(2006). Mutational analysis of the glycosylphosphatidylinositol (GPI) anchor<br />

pathway demonstrates that GPI-anchored proteins are required for cell wall<br />

biogenesis and normal hyphal growth in Neurospora crassa. Eukaryot Cell 5:<br />

587–600.<br />

Bretagne, S., Costa, J. M., Besmond, C., Carsique, R., and Calderone, R. (1997).<br />

Microsatellite polymorphism in the promoter sequence of the elongation factor 3<br />

gene of Candida albicans as the basis for a typing system. J Clin Microbiol. 35:<br />

1777-1780.<br />

Bridge, P. D., Spooner, B. M., and Roberts, P. J. (2005). The impact of<br />

molecular data in fungal systematics. Adv Bot Res. 42: 33 – 67.<br />

Bruno, V. M., Kalachikov, S., Subaran, R., Nobile, C. J., Kyratsous, C., and<br />

Mitchell, A. P. (2006). Control of the C. albicans cell wall damage response by<br />

transcriptional regulatorCas5. Plos Pathogens. 2(3): e21.<br />

Bruns, T. D., Vilgalys, R., Barns, S. M., Gonzalez, D., Hibbett, D. S., Lane, D.<br />

J., Simon, L., Stickel, S., Szaro, T. M., Weisburg, W. G., and Sogin, M. L.<br />

(1992). Evolutionary relationships within the fungi: analyses of nuclear small<br />

subunit RNA sequences. Molecular Phylogenetics and Evolution. 1: 231–241.<br />

Busch, U., and Nitschko, H. (1999). Methods for the differentiation of<br />

microorganisms. Journal of Chromatography B. 722: 263 - 278.<br />

Byrne, K. P., and Wolfe, K. H. (2006). Visualizing syntenic relationships among<br />

the hemiascomycetes with the Yeast Gene Order Browser.Nucleic Acids Res.<br />

34: D452-D455.


REFERENCES<br />

Calderone, R. A. (2002). Candida and Candidiasis. ASM Press. Washington,<br />

DC, EE.UU.<br />

Carlile, M. J., and Watkinson, S. C. (1994). The Fungi. Academic Press,Ltd.,<br />

Lon<strong>do</strong>n, United King<strong>do</strong>m.<br />

Cavalier-Smith, T. (1981). Eukaryote king<strong>do</strong>ms: seven or nine?. Biosystems.<br />

14(3-4): 461-481.<br />

Cavalier-Smith, T. (1987a). The origin of eukaryotic and archaebacterial cells.<br />

Ann N Y Acad Sci. 503: 17-54.<br />

Cavalier-Smith, T. (1987b). The origin of Fungi and pseu<strong>do</strong>fungi. In: Rayner A.<br />

D. M., Brasier C. M. and Moore D. (eds): Evolutionary Biology of the Fungi,<br />

pp. 339–353. Cambridge University Press.<br />

Cavalier-Smith, T. (1998). A revised six-king<strong>do</strong>m system of life. Biol. Rev.<br />

Camb. Philos. Soc. 73: 203–266.<br />

Cavalier-Smith, T. (2000). Flagellate megaevolution: the basis for eukaryote<br />

diversification. In: Green J. R. and Leadbeater B. S. C. (eds): The Flagellates,<br />

pp. 361-390. Taylor and Francis. Lon<strong>do</strong>n.<br />

Cavalier-Smith, T. (2002a). The neomuran origin of archaebacteria, the<br />

negibacterial root of the universal tree and bacterial megaclassification. Int J<br />

Syst Evol Microbiol. 52(Pt 1): 7-76.<br />

Cavalier-Smith, T. (2002b). The phagotrophic origin of eukaryotes and<br />

phylogenetic classification of Protozoa. Int J Syst Evol Microbiol. 52(part2):<br />

297-354.<br />

Cavalier-Smith, T. (2002c). Chloroplast evolution: Secundary symbiogenesis<br />

and multiple losses. Curr Biol. 12: R62-R64.<br />

Cavalier-Smith, T. (2003). Protist phylogeny and the high-level classification of<br />

Protozoa. Europ J Protistol. 39: 338-348.<br />

Cavalier-Smith, T. (2004). Only six king<strong>do</strong>ms of life. Proc R Soc Lond B. 271:<br />

1251-1262.<br />

Cavalier-Smith, T. (2006a). Rooting of tree of life by transition analysis.<br />

Biology Direct. 1: 19.<br />

Cavalier-Smith, T. (2006b). Cell evolution and earth history: stasis and<br />

revolution. Phil Trans Roy Soc B. 361: 969-1006.<br />

101


102<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Cavalier-Smith, T. (2006c). Protozoa: the most abundant predators on earth.<br />

Microbiology today. 166-169.<br />

Cavalier-Smith, T., and Chao, E. E. (1995). The opalozoan Apusomonas is<br />

related to the common ancestor of animals, fungi, and choanoflagellates. Proc R<br />

Soc Lond B. 261: 1–9.<br />

Cavalier-Smith, T., and Chao, E. E. (2003). Phylogeny of Choanozoa,<br />

Apusozoa, and other Protozoa and early eukaryote megaevolution. J Mol Evol.<br />

56: 540–563.<br />

Cavalier-Smith T., Chao E. E., and Oates B. (2004): Molecular phylogeny of<br />

Amoebozoa and the evolutionary significance of the unikont Phalansterium. Eur<br />

J Protistol. 40: 21-48.<br />

Cavalli-Sforza, L. L., and Edwards, A. W. F. (1967). Phylogenetic analysis:<br />

models and estimation procedures. Am J Hum Genet. 19: 122–257.<br />

Chapela, I. H. (1991). Spore size revisited: analysis of spore populations using<br />

automated particle size. Sy<strong>do</strong>wia. 43: 1–14.<br />

Chen, T. C., Chen, Y. H., Tsai, J. J., Peng, C. F., Lu, P. L., Chang, K., Hsieh, H.<br />

C., and Chen, T .P. (2005). Epidemiologic analysis and antifungal susceptibility<br />

of Candida blood isolates in Southern Taiwan. Journal of Microbiology and<br />

Immunological Infectious. 38: 200-210.<br />

Chistiakov, D. A., Hellemans, B., and Volckaert, F. A. M. (2006).<br />

Microsatellites and their genomic distribution, evolution, function and<br />

applications: A review with special reference to fish genetics. Aquaculture. 255:<br />

1-29.<br />

Ciferri, R., and Redaelli, P. (1936). Morfologia, biologia e posizione sistemática<br />

di Coccidioides immitis stiles e delle sue varieta, con notizie sul granuloma<br />

coccidioide. R Accad Ital. 7: 399-474.<br />

Clark, T. A., Slavinski, S. A, Morgan, J., Lott, T., Arthington-Skaggs, B. A.,<br />

Brandt, M. E., Webb, R. M., Currier, M., Flowers, R. H., Fridkin, S. K., and<br />

Hajjeh, R. A. (2004). Epidemiologic and molecular characterization of an<br />

outbreak of Candida parapsilosis bloodstream infections in a community<br />

hospital. J Clin Microbiol. 42(10): 4468-4472.<br />

Cole, G. T., Samson, R. A. (1979). Patterns of development in conidial fungi.<br />

Pittman, Lon<strong>do</strong>n, United King<strong>do</strong>m.


REFERENCES<br />

Coleman, D. C., Rinaldi, M. G., Haynes, K. A., Rex, J. H., Summerbell, R. C.,<br />

Anaissie, E. J., Li, A., and Sullivan, D. J. (1998). Importance of Candida species<br />

other than Candida albicans as opportunistic pathogens. Medical Mycology. 36:<br />

156–165.<br />

Copeland, H. (1956). The Classification of Lower Organisms. Palo Alto: Pacific<br />

Books.<br />

Caro, L. H., Tettelin, H., Vossen, J. H., Ram, A. F., van den Ende, H., and Klis,<br />

F. M. (1997). In silico identification of glycosyl-phosphatidylinositol-anchored<br />

plasma-membrane and cell wall proteins of Saccharomyces cerevisiae. Yeast.<br />

13: 1477–1489.<br />

Cordeiro, G. M., Casu, R., McIntyre, C. L., Manners, J. M., and Henry, R. J.<br />

(2001). Microsatellite markers from sugarcane (Saccharum spp) ESTs across<br />

transferable to erianthus and sorghum. Plant Sci. 160: 1115–1123.<br />

Correia, A., Sampaio, P., James, S., and Pais, C. (2006). Candida bracarensis<br />

sp. nov., a novel anamorphic yeast species phenotypically similar to Candida<br />

glabrata. Int J Syst Evol Microbiol. 56: 313-317.<br />

Damveld, R. A., Arentshorst, M., Franken, A., vanKuyk, P. A., Klis, F. M., van<br />

den Hondel, C. A. M. J. J., and Ram, A. F. J. (2005). The Aspergillus niger<br />

MADS-box transcription factor RlmA is required for cell wall reinforcement in<br />

response to cell wall stress. Molecular Microbiology. 58(1): 305-319.<br />

Daniel, H. M., Sorrell, T. C., and Meyer, W. (2001). Partial sequence analysis of<br />

the actin gene and its potential for studying the phylogeny of Candida species<br />

and their teleomorphs. Int J Syst Evol Microbiol. 51(Pt 4): 1593-1606.<br />

De Groot, P. W., de Boer, A. D., Cunningham, J., Dekker, H. L., de Jong, L.,<br />

Hellingwerf, K. J., de Koster, C., and Klis, F. M. (2004). Proteomic analysis of<br />

Candida albicans cell walls reveals covalently bound carbohydrate-active<br />

enzymes and adhesins. Eukaryot Cell, 3: 955–965.<br />

De Nadal., E., Casa<strong>do</strong>mé, L., and Posas, F. (2003). Targeting the MEF2-like<br />

transcription factor Smp1 by the stress-activated Hog1 mitogen-activated protein<br />

kinase. Mol Cel Biol. 23(1): 229-237.<br />

Dib, J.C., Dube, M., Kelly, C., Rinaldi, M.G., and Patterson, J. E. (1996).<br />

Evaluation of pulsed-field gel electrophoresis as a typing system for Candida<br />

103


104<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

rugosa: comparison of karyotype and restriction fragment length<br />

polymorphisms. J Clin Microbiol. 34: 1494–1496.<br />

Diezmann, S., Cox, C. J., Schönian, G., Vilgalys, R., and Mitchell, T. (2004).<br />

Phylogeny and evolution of medical species of Candida and related taxa: A<br />

multigenic analysis. J Clin Microbiol. 42(12): 5624-5635.<br />

Do<strong>do</strong>u, E., and Treisman, R. (1997). The Saccharomyces cerevisiae MADS-box<br />

transcription factor Rlm1 is a target for the Mpk1 mitogen-activated protein<br />

kinase pathway. Mol Cell Biol. 17: 1848– 1859.<br />

Doron-Faigenboim, A., and Pupko, T. 2006. A combined empirical and<br />

mechanistic co<strong>do</strong>n model. Mol Biol Evol. 24(2): 388-397.<br />

Douady, C. J., Delsuc, F., Boucher, Y., Doolittle, W. F., and Douzery, E. J. P.<br />

(2003). Comparison of the Bayesian and maximum likelihood bootstrap<br />

measures of phylogenetic reliability. Molecular Biology and Evolution. 20: 248–<br />

254.<br />

Douzery, E. J., Snell, E. A., Bapteste, E., Delsuc, F., and Philippe, H. (2004) The<br />

timing of eukaryotic evolution: <strong>do</strong>es a relaxed molecular clock reconcile<br />

proteins and fossils? Proc Natl Acad Sci. 101(43): 15386-15391.<br />

Dubois, E., Bercy, J., and Messenguy, F. (1987). Characterization of two genes,<br />

ARGRI and ARGRIII required for specific regulation of arginine metabolism in<br />

yeast. Mol Gen Genet. 207: 142–148.<br />

Dujon, B., Sherman, D., Fischer, G., Durrens, P., Casaregola, S., Lafontaine, I.,<br />

De Montigny, J., Marck, C., Neuveglise, C., Talla, E., Goffard, N., Frangeul, L.,<br />

Aigle, M., Anthouard, V., Babour, A., Barbe, V., Barnay, S., Blanchin, S.,<br />

Beckerich, J. M., Beyne, E., Bleykasten, C., Boisrame, A., Boyer, J., Cattolico,<br />

L., Confanioleri, F., De Daruvar, A., Despons, L., Fabre, E., Fairhead, C., Ferry-<br />

Dumazet, H., Groppi, A., Hantraye, F., Hennequin, C., Jauniaux, N., Joyet, P.,<br />

Kachouri, R., Kerrest, A., Koszul, R., Lemaire, M., Lesur, I., Ma, L., Muller, H.,<br />

Nicaud, J. M., Nikolski, M., Oztas, S., Ozier-Kalogeropoulos, O., Pellenz, S.,<br />

Potier, S., Richard, G. F., Straub, M. L., Suleau, A., Swennen, D., Tekaia, F.,<br />

Wesolowski-Louvel, M., Westhof, E., Wirth, B., Zeniou-Meyer, M., Zivanovic,<br />

I., Bolotin-Fukuhara, M., Thierry, A., Bouchier, C., Caudron, B., Scarpelli, C.,<br />

Gaillardin, C., Weissenbach, J., Wincker, P., and Souciet, J. L. (2004). Genome<br />

evolution in yeasts. Nature. 430(6995): 35-44.


REFERENCES<br />

Edel, V., Steinberg, C., Gautheron, N., and Alabouvette, C. (1996). Evaluation<br />

of restriction analysis of polymerase chain reaction (PCR)-amplified ribosomal<br />

DNA for the identification of Fusarium species. Mycol Res. 101: 179–187.<br />

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy<br />

and high throughput. Nucleic Acids Research. 32(5): 1792-1797.<br />

Eisenhaber, B., Maurer-Stroh, S., Novatchkova, M., Schneider, G., and<br />

Eisenhaber, F. (2003). Enzymes and auxiliary factors for GPI lipid anchor<br />

biosynthesis and post-translational transfer to proteins. Bioessays. 25: 367–385.<br />

Eisenhaber, B., Schneider, G., Wildpaner, M., and Eisenhaber, F. (2004). A<br />

sensitive predictor for potential GPI lipid modification sites in fungal protein<br />

sequences and its application to genome-wide studies for Aspergillus nidulans,<br />

Candida albicans, Neurospora crassa, Saccharomyces cerevisiae and<br />

Schizosaccharomyces pombe. J Mol Biol. 337: 243–253.<br />

Elble, R., and Tye, B. K. (1992). Chromosome loss, hyperrecombination, and<br />

cell cycle arrest in a yeast mcm1 mutant. Mol Biol Cell. 3: 971 – 980.<br />

Ellegren, H. (2000). Microsatellite mutations in the germline: implications for<br />

evolutionary inference. Trends Genet. 16: 551–558.<br />

Eriksson, O. E., Baral, H. O., Currah, R. S., Hansen, K., Kurtzman, C. P.,<br />

Rambold, G., and Laessøe, T (eds). (2004). Outline of Ascomycota— 2004.<br />

Myconet, 10: 1–99.<br />

Errede, B. (1993). MCM1 binds to a transcriptional control element in Ty1. Mol<br />

Cell Biol. 13: 57 – 62.<br />

Estrella, M. C., Mella<strong>do</strong>, E., Guerra, T. M. D., Monzón, A., and Tudela, J. L. R.<br />

(2000). Susceptibility of fluconazole-resistant clinical isolates of Candida spp.<br />

to echinocandin LY303366, itraconazole and amphotericin B. Journal of<br />

Antimicrobial Chemotherapy. 46: 475-477.<br />

Evans, E. E. (1950). The antigenic composition of Cryptococcus neoformans. I.<br />

A serologic classification by means of the capsular and agglutination reactions. J<br />

Immunol, 64: 423 – 430.<br />

Fabre, E., Muller, H., Therizols, P., Lafontaine, I., Dujon, B., and Fairhead, C.<br />

(2005). Comparative genomics in hemiascomycete yeasts: evolution of sex,<br />

silencing, and subtelomeres. Mol Biol Evol. 22(4): 856-873.<br />

105


106<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Fanci, R., and Pecile, P. (2005). Central venous catheter-related infection due to<br />

Candida membranaefaciens, a new opportunistic azole-resistant yeast in a<br />

cancer patient: a case report and a review of literature. Mycoses, 48: 357-359.<br />

Felsenstein, J. (1978). Cases in which parsimony or compatibility methods will<br />

be positively misleading. Syst Zool. (27): 401– 410.<br />

Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum<br />

likelihood approach. J Mol Evol. 17: 368–376.<br />

Ferguson, M. A. (1999). The structure, biosynthesis and functions of<br />

glycosylphosphatidylinositol anchors, and the contributions of trypanosome<br />

research. J Cell Sci. 112: 2799–2809.<br />

Fernandes, L., Araújo, M. A. M., Amaral, A., Reis, V. C. B., Martins, N. F., and<br />

Felipe, M. S. (2005). Cell signalling pathways in Paracocci<strong>do</strong>ides brasiliensis –<br />

inferred from comparisons with other fungi. Genet Mol Res. 4(2): 216-231.<br />

Field, D., and Wills, C. (1996). Long, polymorphic microsatellites in simple<br />

organisms. Proc R Soc Lon<strong>do</strong>n Ser B. 263: 209–251.<br />

Fink, G. R. (1987). Pseu<strong>do</strong>genes in yeast? Cell. 49: 5–6.<br />

Fitzpatrick, D., Logue, M., Stajich, J., and Butler, G. (2006) A fungal phylogeny<br />

bosed on 42 complete genomes derived from supertree and combined gene<br />

analysis. BMC Evolutionary Biology. 6: 99.<br />

Fluit, A. C., Jones, M. E., Schmitz, F-J., Acar, J., Gupta, R., Verhoef, J.,<br />

SENTRY participants Groupl. (2000). Antimicrobial susceptibility and<br />

frequency of occurrence of clinical blood isolates in Europe from the SENTRY<br />

Antimicrobial Surveillance Program, 1997 and 1998. Clin Infect Dis. 30: 454–<br />

60.<br />

Frisvad, J. C. (1994). Classification of organisms by secondary metabolites,p.<br />

303-321. In D. L. Hawsworth (ed.), The identification and characterization of<br />

pest organisms. CAB International, Wallingford, United King<strong>do</strong>m.<br />

García-Martos, P., Ruiz-Aragón, J., Garcia-Agu<strong>do</strong>, L., Saldarreaga, A., Lozano,<br />

M., and Marín, P. (2004). Aislamiento de Candida ciferri en un paciente<br />

inmunodeficiente. Rev Iberoam Micol. 21: 85-86.<br />

Germot, A., Philipe, H., and Le Guyader, H. (1997). Evidence for loss of<br />

mitochondria in microsporidia from a mitochondrial HSP70 in Nosema locustae.<br />

Molecular and Biochemical Parasitology. 87: 159–168.


REFERENCES<br />

Geiser, D. M., Gueidan, C., Miadlikowska, J., Lutzoni, F., Kauff, F., Hofstetter,<br />

V., Fraker, E., Schoch, C. L., Tibell, L., Untereiner, W. A., and Aptroot, A.<br />

(2006). Eurotiomycetes: Eurotiomycetidae and Chaetothyriomycetidae.<br />

Mycologia, 98(6): 1053-1064.<br />

Gill, E. E., and Fast, N. M. (2006). Assessing the microsporidia–fungi<br />

relationship: combined phylogenetic analysis of eight genes. Gene. 375: 103–<br />

109.<br />

Graf, B., Adam, T., Zill, E., and Göbel, U. B. (2000). Evaluation of the VITEK 2<br />

system for rapid identification of yeasts and yeast-like organisms. Journal of<br />

Clinical Microbiology, 38(5): 1782-1785.<br />

Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation<br />

and Bayesian model determination. Biometrika. 82: 711-732.<br />

Gribal<strong>do</strong>, S., and Philippe, H. (2002). Ancient phylogenetic relationships. Theor<br />

Popul Biol. 61(4): 391-408.<br />

Guarro, J., Gené, J., and Stchigel, A. 1999. Developments in fungal taxonomy.<br />

Clin Microbiol Rev. 12(3): 454-500.<br />

Gudlaugsson, O., Gillespie, S., Lee, K., Van de Berg, J., Hu, J., Messer, S.,<br />

Herwaldt, L., Pfaller, M., and Diekema, D. (2003). Attributable mortality of<br />

nosocomial candidemia. Clin Infect Dis. 37: 1172-1177.<br />

Guin<strong>do</strong>n, S., and Gascuel, O. (2003). A simple, fast, and accurate algorithm to<br />

estimate large phylogenies by maximum likelihood. Systematic Biology. 52(5):<br />

696-704.<br />

Guin<strong>do</strong>n, S., Lethiec, F., Duroux, P., and Gascuel, O. (2005). PHYML Online –<br />

a web server for fast maximum likelihood-based phylogenetic inference. Nucleic<br />

Acids Research. 33: W557-W559.<br />

Gur-Arie, R., Cohen, C., Eitan, L., Shelef, E., Hallerman, E., and Kashi, Y.<br />

(2000). Abundance, non-ran<strong>do</strong>m genomic distribution, nucleotide composition,<br />

and polymorphism of simple sequence repeats in E. coli. Genome Res. 10: 61–<br />

70.<br />

Gutierrez, J., Morales, P., Gonzalez, M. A, and Quin<strong>do</strong>s, G. (2002). Candida<br />

dubliniensis, a new fungal pathogen. J. Basic Microbiol. 42: 207–227.<br />

Haeckel, E. (1866). Generelle Morphologie der Organismen. Reimer, Berlin.<br />

107


108<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Hall, B. G. (2008). Phylogenetic trees made easy: A how to manual for<br />

molecular biologists. Third Edition. Sinauer Associates, Inc. Sunderland,<br />

Massachusetts, EE UU.<br />

Harding, R. M., Boyce, A. J., and Clegg, J. B. (1992). The evolution of tandemly<br />

repetitive DNA: recombination rules. Genetics. 132: 847–859.<br />

Hasenclever, H. F., and Mitchell, W. O. (1961). Antigenic studies of Candida. I.<br />

Observations of two antigenic groups in Candida albicans. J Bacteriol. 82: 570<br />

– 573.<br />

Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains<br />

and their applications. Biometrika. 57: 97-109.<br />

Hawksworth, D. L. (1991). The fungal dimension of biodiversity-magnitude,<br />

significance, and conservation. Mycological Research. 95: 641.<br />

Hawksworth, D. L. (2001). The magnitude of fungal diversity: the 1-5 million<br />

species estimate revisited. Mycological Research. 105: 1422-1433.<br />

Hawksworth, D. L., Kirk, P. M., Sutton, B. C., and Pegler, D. N. (1995).<br />

Ainsworth & Bisby‘s Dictionary of the Fungi. 8th ed. CAB International,<br />

Wallingford, Oxon, United King<strong>do</strong>m.<br />

Hazen, K. C. (1995). New and emerging yeast pathogens. Clin Microbiol<br />

Reviews. 8: 462–478.<br />

Heath, I. B. (1986). Nuclear division: a marker for protist phylogeny? Progress<br />

in Protistology. 1: 115-162.<br />

Hibbett, D. S., and Binder, M. (2001). Evolution of marine mushrooms.<br />

Biological Bulletin. 201: 319–322.<br />

Hibbett, D. S., Gilbert, L. B., and Donoghue, M.J. (2000). Evolutionary<br />

instability of ectomycorrhizal symbiosis in basidiomycetes. Nature. 407: 506–<br />

508.<br />

Hibbett, D. S., Binder, M., Bischoff, J. F., Blackwell, M., Cannon, P. F.,<br />

Eriksson, O. E., Huhn<strong>do</strong>rf, S., James, T., Kirk, P. M., Lücking, R., Lumbsch, T.,<br />

Lutzoni, F., Matheny, P. B., Mclaughlin, D. J., Powell, M. J., Redhead, S.,<br />

Schoch, C. L., Spatafora, J. W., Stalpers, J. A., Vilgalys, R., Aime, M. C.,<br />

Aptroot, A., Bauer, R., Begerow, D., Benny, G. L., Castlebury, L. A., Crous, P.<br />

W., Dai, Y-C., Gams, W., Geiser, D. M., Griffith, G. W., Gueidan, C.,<br />

Hawksworth, D. L., Hestmark, G., Hosaka, K., Humber, R. A., Hyde, K.,


REFERENCES<br />

Ironside, J. E., Kõljalg, U., Kurtzman, C. P., Larsson, K-H., Lichtwardt, R.,<br />

Longcore, J., Midlikowska, J., Miller, A., Moncalvo, J-M., Mozley-Standridge,<br />

S., Oberwinkler, F., Parmasto, E., Reeb, V., Rogers, J. D., Roux, C., Ryvarden,<br />

L., Sampaio, J. P., Schüßler, A., Sugiyama, J., Thorn, R. G., Tibell, L.,<br />

Untereiner, W. A., Walker, C., Wang, Z., Weir, A., Weiß, M., White, M. M.,<br />

Winka, K., Yao, Y-J., and Zhang, N. (2007). A higher-level phylogenetic<br />

classification of the Fungi. Mycological Research III. 509-247.<br />

Hirt, R. P., Healy, B., Vossbrinck, C. R., Canning, E. U., and Embley, T. M.<br />

(1997). A mitochondrial Hsp70 orthologue in Vairimorpha necatrix: molecular<br />

evidence that Microsporidia once contained mitochondria. Current Biology. 7:<br />

995–998.<br />

Hood, D. W., Deadman, M. E., Jennings, M. P., Bisercic, M., Fleischmann, R.<br />

D., Venter, J. C., and Moxon, E. R. (1996). DNA repeats identify novel<br />

virulence genes in Haemophilus influenzae. Proc Natl Acad Sci. 93: 11121–<br />

11125.<br />

Horowitz, B. J., Giaquinta, D., and Ito, S. (1992). Evolving pathogens in<br />

vulvovaginal candidiasis: Implications for patient care. J Clin Pharmacol. 32(3):<br />

248-255.<br />

Huelsenbeck, J. P., and Ronquist, F. (2001). MrBayes: Bayesian inference of<br />

phylogenetic trees. Bioinformatics Applications Note. 17(8): 754-755.<br />

Helsenbeck, J. P., Larget, B., Miller, R. E., and Ronquist, F. (2002). Potential<br />

applications and pitfalls of bayesian inference of phylogeny. Syst Biol. 51(5):<br />

673 – 688.<br />

Hull, C. M., and Johnson, A. D. (1999). Identification of a mating type-like<br />

locus in the asexual pathogenic yeast Candida albicans. Science. 285(5431):<br />

1271-1275.<br />

Hull, C. M., Raisner, R. M., and Johnson, A. D. (2000). Evidence for mating of<br />

the "asexual" yeast Candida albicans in a mammalian host. Science. 289(5477):<br />

307-310.<br />

Huson, D., and Bryant, D. (2006). Application of Phylogenetic Networks in<br />

evolutionary studies. Mol Biol Evol. 23(2): 254-267.<br />

Inagaki, Y., and Doolittle, W. F. (2000). Evolution of the eukaryotic translation<br />

termination system: origins of release factors. Mol Biol Evol. 17: 882 – 889.<br />

109


110<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Iyer, L. M., Koonin, E. V., and Aravind L. (2004). Evolution of bacterial RNA<br />

polymerase: implications for large-scale bacterial phylogeny, <strong>do</strong>main accretion,<br />

and horizontal gene transfer. Gene. 335: 73-88.<br />

Iwabe, N., Kuma, K., Hasegawa, M., Osawa, S., and Miyata, T. (1989).<br />

Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred<br />

from phylogenetic trees of duplicated genes. Proc Natl Acad Sci. 86(23): 9355-<br />

9359.<br />

James, T. Y., Porter, D., Leander, C. A., Vilgalys, R., and Longcore, J. E.<br />

(2000). Molecular phylogenetics of the Chytridiomycota supports the utility of<br />

ultrastructural data in chytrid systematics. Canadian Journal of Botany. 78: 336–<br />

350.<br />

James, T. Y., Kauff, F., Schoch, C. L., Matheny, P. B., Hofstetter, V., Cox, C.,<br />

Celio, G., Gueidan, C., Fraker, E., Miädlikowska, J., Lumbsch, H. T., Rauhut,<br />

A., Reeb, V., Arnold, E. A., Amtoft, A., Stajich, J. E., Hosaka, K., Sung, G-H.,<br />

Johnson, D., O‘Rourke, B., Crockett, M., Binder, M., Curtis, J. M., Slot, J. C.,<br />

Wang, Z., Wilson, A. W., Schüßler, A., Longcore, J. E., O‘Donnell, K., Mozley-<br />

Standridge, S., Porter, D., Letcher, P. M., Powell, M. J., Taylor, J. W., White,<br />

M. M., Griffith, G. W., Davies, D. R., Humber, R. A., Morton, J., Sugiyama, J.,<br />

Rossman, A. Y., Rogers, J. D., Pfister, D. H., Hewitt, D., Hansen, K.,<br />

Hambleton, S., Shoemaker, R. A., Kohlmeyer, J., Volkmann-Kohlmeyer, B.,<br />

Spotts, R. A., Serdani, M., Crous, P. W., Hughes, K. W., Matsuura, K., Langer,<br />

E., Langer, G., Untereiner, W. A., Lücking, R., Büdel, B., Geiser, D. M.,<br />

Aptroot, A., Diederich, P., Schmitt, I., Schultz, M., Yahr, R., Hibbett, D. S.,<br />

Lutzoni, F., McLaughlin, D., Spatafora, J., and Vilgalys, R. (2006).<br />

Reconstructing the early evolution of the fungi using a six gene phylogeny.<br />

Nature. 443: 818–822.<br />

James, T. Y., Letcher, P. M., Longcore, J. E., Mozley-Standridge, S. E., Porter,<br />

D., Powell, M. J., Griffith, G. W., and Vilgalys, R. (2007). A molecular<br />

phylogeny of the flagellated fungi (Chytridiomycota) and a proposal for a new<br />

phylum (Blastocladiomycota). Mycologia. 98: 860–871.<br />

Jarvis, W. R. (1995). Epidemiology of nosocomial fungal infections, with<br />

emphasis on Candida species. Clin Infect Dis. 20: 1526–1530.


REFERENCES<br />

Jarvis, E. E., Clark, K. L., and Sprague Jr., G. F. (1989). The yeast transcription<br />

activator PRTF, a homolog of the mammalian serum response factor, is encoded<br />

by the MCM1 gene. Genes Dev. 3: 936–945.<br />

Jeffroy, O., Brinkmann, H., Delsuc, F., and Philippe, H. (2006). Phylogenomics:<br />

the beginning of incongruence? Trends Genet. 22(4): 225-231.<br />

Karl, S. A., and Avise, J. C. (1993). PCR-based assays of mendelian<br />

polymorphisms from anonymous single-copy nuclear DNA-techniques and<br />

applications for population genetics. Mol Biol Evol. 10: 342–361.<br />

Kashi, Y., King, D., and Soller, M. (1997). Simple sequence repeats as a source<br />

of quantitative genetic variation. Trends Genet. 13: 74–78.<br />

Kawaguchi, Y., Honda, H., Taniguchi-Morimura, J., and Iwasaki, S. (1989). The<br />

co<strong>do</strong>n CUG is read as serine in an asporogenic yeast Candida cylindracea.<br />

Nature. 341(6238): 164-166.<br />

Keeling, P. J. (2003). Congruent evidence from alpha-tubulin and beta-tubulin<br />

gene phylogenies for a zygomycete origin of microsporidia. Fungal Genetics and<br />

Biology. 38: 298–309.<br />

Keeling, P. J., Luker, M. A., and Palmer, J. D. (2000). Evidence from beta-<br />

tubulin phylogeny that Microsporidia evolved from within the fungi. Molecular<br />

Biology and Evolution. 17: 23–31.<br />

King, N., and Carroll, S. B. (2001): A receptor tyrosine kinase from<br />

choanoflagellates: molecular insights into early animal evolution. Proc. Natl<br />

Acad. Sci. 98: 15032–15037.<br />

Kirk, P. M., Cannon, P. F., David, J. C., and Stalpers, J. A. (eds). (2001).<br />

Ainsworth & Bisby‘s; Dictionary of the Fungi 60 (CAB International,<br />

Wallingford, UK).<br />

Klempp-Selb, B., Rimek, D., and Kappe, R. (2000). Karyotyping of Candida<br />

albicans and Candida glabrata from patients with Candida sepsis. Mycoses. 43:<br />

159-163.<br />

Kohn, L. M. 1992. Developing new characters for fungal systematics: an<br />

experimental approach for determining the rank of resolution. Mycologia. 84:<br />

139–153.<br />

Kosakovsky, S. L., Frost, S. D. W., and Muse, S. V. (2005). HyPhy: Hypothesis<br />

testing using phylogenies. Bioinformatics. 21(5): 676–9.<br />

111


112<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Kramer, E. M., Su, H-J., Wu, C-C., and Hu, J-M. (2006). A simplified<br />

explanation for the frameshift mutation that created novel C-terminal motif in<br />

the APETALA3 gene lineage. BMC Evolutionary Biology. 6: 30.<br />

Krcmery, V., and Barnes, A. J. (2002). Non-albicans Candida spp. causing<br />

fungaemia: pathogenicity and antifungal resistance. J Hosp Infect. 50: 243–260.<br />

Kruger, J., Aichinger, C., Kahmann, R., and Bolker, M. (1997). A MADS-box<br />

homologue in Ustilago maydis regulates the expression of pheromoneinducible<br />

genes but is nonessential. Genetics. 147: 1643– 1652.<br />

Kuhn, D. M., Mukherjee, P. K., Clark, T. A., Pujol, C., Chandra, J., Hajjeh, R.<br />

A., Warnock, D. W., Soll, D. R., and Ghannoum, M. A. (2004). Candida<br />

parapsilosis characterization in an outbreak setting. Emerg Infect Dis. 10: 1074–<br />

1081.<br />

Kullberg, B. J., and Lashof, A. M. L. O. (2002). Epidemiology of opportunistic<br />

invasive mycoses. European Journal of Medical Research. 7: 183-191.<br />

Kuo, M. H., Nadeau, E. T., and Grayhack, E. J. (1997). Multiple phosphorylated<br />

forms of the Saccharomyces cerevisiae Mcm1 protein include an isoform<br />

induced in response to high salt concentrations. Mol Cell Biol. 17: 819–832.<br />

Kurtzman, C. P., and Robnett, C. J. (1998). Identification and phylogeny of<br />

ascomycetous yeasts from analysis of nuclear large subunit (26S) ribosomal<br />

DNA partial sequences. Antonie van Leeuwenhoek, 73: 331–371.<br />

Kurtzman, C. P., and Robnett, C. J. (2003). Phylogenetic relationships among<br />

yeasts of the ‗Saccharomyces complex‘ determined from multigene sequence<br />

analyses. FEMS Yeast Res. 3: 417–432.<br />

Latchman, D. S. (1998). Eukaryotic transcription factors, Academic Press.<br />

Lang, B. F., O‘Kelly, C., Nerad, T., Gray, M. W., and Burger, G. (2002). The<br />

closest unicellular relatives of animals. Curr Biol. 12: 1773–1778.<br />

Langkjaer, R. B., Cliften, P. F., Johnston, M., and Piskur, J. (2003). Yeast<br />

genome duplication was followed by asynchronous differentiation of duplicated<br />

genes. Nature. 421: 848–852.<br />

Linnaei, C. (1735). Systema naturae sive Regna Tria Naturae Systematice<br />

proposita per Classes, Ordines, Genera et Species. Lugduni Batavorum: apud T.<br />

Hakk.


REFERENCES<br />

Le´John, H. B. (1974). Biochemical parameters of fungal phylogenetics. Evol<br />

Biol. 7: 79–125.<br />

Li, J., Xu, J., and Bai, F. (2006). Candida pseu<strong>do</strong>rugosa sp. nov., a novel yeast<br />

species from sputum. J Clin Microbiol. 44(12): 4486-4490.<br />

Li, Y-C., Korol, A. B., Fahima, T., Beiles, A., and Nevo, E. (2002).<br />

Microsatellites: genomic distribution, putative functions and mutational<br />

mechanisms. Molecular Ecology. 11: 2453-2465.<br />

Li, Y-C., Korol, A.B., Fahima, T., and Nevo, E. (2004). Microsatellites within<br />

genes: structure, function, and evolution. Mol Biol Evol. 21: 991–1007.<br />

Liu, Y. J., Whelen, S., and Hall, B. D. (1999). Phylogenetic relationships among<br />

ascomycetes: evidence from an RNA polymerase II subunit. Molecular Biology<br />

and Evolution. 16: 1799–1808.<br />

Liu, Y. J., Hodson, M. C., and Hall, B. D. (2006). Loss of the flagellum<br />

happened only once in the fungal lineage: phylogenetic structure of king<strong>do</strong>m<br />

Fungi inferred from RNA polymerase II subunit genes. BMC Evolutionary<br />

Biology. 6: 74.<br />

Liu, Y. J., Leigh, J. W., Brinkmann, H., Cushion, M. T., Rodriguez-Ezpeleta, N.,<br />

Philippe, H., and Lang, B. F. (2009). Phylogenomic analysis support the<br />

monophyly of Taphrinomycotina, including Schizosaccharomyces fission<br />

yeasts. Mol Biol Evol. 26: 27-34.<br />

Logue, M. E., Wong, S., Wolfe, K. H., and Butler, G. (2005). A genome<br />

sequence survey shows that the pathogenic yeast Candida parapsilosis has a<br />

defective MTLa1 allele at its mating type locus. Eukaryot Cell. 4(6): 1009-1017.<br />

López Martínez, R., Ruiz Sánchez, J., and Vértiz <strong>Chávez</strong>, E. (1984). Vaginal<br />

candidiasis. Mycopathologia. 85: 167-170.<br />

Lowy B. 1971. New records of mushroom stones from Guatemala. Mycologia.<br />

63: 983-993.<br />

Lumbsch, H. T., Schmitt, I., Lindemuth, R., Miller, A., Mangold, A., Fernandez,<br />

F., and Huhn<strong>do</strong>rf, S. (2005). Performance of four ribosomal DNA regions to<br />

infer higher-level phylogenetic relationships of inoperculate euascomycetes<br />

(Leotiomyceta). Mol Phylogenet Evol. 34(3):512-524.<br />

Lutzoni, F., Pagel, M., and Reeb, V. (2001). Major fungal lineages are derived<br />

from lichen symbiotic ancestors. Nature. 411: 937–940.<br />

113


114<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Lutzoni, F., Kauff, F., Cox, C. J., McLaughlin, D., Celio, G., Dentinger, B.,<br />

Padamsee, M., Hibbett, D. S., James, T. Y., Baloch, E., Grube, M., Reeb, V.,<br />

Hofstetter, V., Schoch, C., Arnold, A. E., Miädlikowska, J., Spatafora, J.,<br />

Johnson, D., Hambleton, S., Crockett, M., Shoemaker, R., Sung, G-H., Lücking,<br />

R., Lumbsch, T., O‘Donnell, K., Binder, M., Diederich, P., Ertz, D., Gueidan,<br />

C., Hansen, K., Harris, R. C., Hosaka, K., Lim, Y-W., Matheny, B., Nishida, H.,<br />

Pfister, D., Rogers, J., Rossman, A., Schmitt, I., Sipman, H., Stone, J.,<br />

Sugiyama, J., Yahr, R., and Vilgalys, R. (2004). Assembling the fungal tree of<br />

life: progress, classification, and evolution of subcellular traits. American<br />

Journal of Botany. 91: 1446–1480.<br />

Magee, B. B., and Magee, P. T. (1987). Electrophoretic karyotypes and<br />

chromosome numbers in Candida species. J Gen Microbiol. 133: 425–430.<br />

Magee, B. B., and Magee, P. T. (2000). Induction of mating in Candida albicans<br />

by construction of MTLa and MTLalpha strains. Science. 289(5477): 310-313.<br />

Marcotte, E. M., Pellegrini, M., Yeates, T. O., and Eisenberg, D. (1999). A<br />

census of protein repeats. J Mol Biol. 293: 151–160.<br />

Massey, S. E., Moura, G., Beltrao, P., Almeida, R., Garey, J. R., Tuite, M. F.,<br />

and Santos, M. A. (2003). Comparative evolutionary genomics unveils the<br />

molecular mechanism of reassignment of the CTG co<strong>do</strong>n in Candida spp.<br />

Genome Res. 13(4): 544-557.<br />

Matheny, P. B., Gossman, J. A., Zalar, P., Arun Kumar, T. K., and Hibbett, D. S.<br />

(2007a). Resolving the phylogenetic position of the Wallemiomycetes: an<br />

enigmatic major lineage of Basidiomycota. Canadian Journal of Botany. 84:<br />

1794–1805.<br />

Matheny, P. B., Wang, Z., Binder, M., Curtis, J. M., Lim, Y. W., Nilsson, R. H.,<br />

Hughes, K. W., Hofstetter, V., Ammirati, J. F., Schoch, C. L., Langer, G. E.,<br />

McLaughlin, D. J., Wilson, A. W., Frøslev, T., Ge, Z. W., Kerrigan, R. W., Slot,<br />

J. C., Vellinga, E. C., Liang, Z. L., Baroni, T. J., Fischer, M., Hosaka, K.,<br />

Matsuura, K., Seidl, M. T., Vaura, J., and Hibbett, D. S. (2007b). Contributions<br />

of Rpb2 and Tef1 to the phylogeny of mushrooms and allies (Basidiomycota,<br />

Fungi). Molecular Phylogenetics and Evolution. 43: 430-451.<br />

Mayer, V. W., and Aguilera, A. (1990). High levels of chromosome instability in<br />

polyploids of Saccharomyces cerevisiae. Mutat Res. 231: 177–186 (1990).


REFERENCES<br />

McConville, M. J., and Menon, A. K. (2000). Recent developments in the cell<br />

biology and biochemistry of glycosylphosphatidylinositol lipids (review). Mol<br />

Membr Biol. 17: 1–16.<br />

McInerny, C. J., Partridge, J. F., Mikesell, G. E., Creemer, D. P., and Breeden,<br />

L. L. (1997). A novel Mcm1-dependent element in the SWI4, CLN3, CDC6, and<br />

CDC47 promoters activates M/G1-specific transcription. Genes Dev. 11: 1277–<br />

1288.<br />

Melo, A., de Almeida, L., Colombo, A., and Briones, M. (1998). Evolutionary<br />

distances and identification of Candida species in clinical isolates by ran<strong>do</strong>mly<br />

amplified polymorphic DNA (RAPD). Mycopathologia. 142: 57-66.<br />

Messenguy, F., and Dubois, E. (1993). Genetic evidence for a role for MCM1 in<br />

the regulation of arginine metabolism in Saccharomyces cerevisiae. Mol Cell<br />

Biol. 13: 2586– 2592.<br />

Messenguy, F., and Dubois, E. (2000). Regulation of arginine metabolism in<br />

Saccharomyces cerevisiae: a network of specific and pleiotropic proteins in<br />

response to multiple environmental signals. Food Technol Biotechnol. 38: 277–<br />

285.<br />

Metzgar, D., van Belkum, A., Field, D., Haubrich, R., and Wills, C. (1998).<br />

Ran<strong>do</strong>m amplification of polymorphic DNA and microsatellite genotyping of<br />

pre- and posttreatment isolates of Candida spp. from human immunodeficiency<br />

virus-infected patients on different fluconazole regimens. J Clin Microbiol. 36:<br />

2308-2313.<br />

Metzgar, D., Bytof, J., and Wills, C. (2000). Selection against frameshift<br />

mutations limits microsatellite expansion in coding DNA. Genome Res. 10: 72–<br />

80.<br />

Miadlikowska, J., and Lutzoni, F. (2004). Phylogenetic classification of<br />

peltigeralean fungi (Peltigerales, Ascomycota) based on ribosomal RNA small<br />

and large subunits. American Journal of Botany. 91: 449–464.<br />

Micales, J. A., Bonde, M. R., and Peterson, G. L. (1986). The use of isozyme<br />

analysis in fungal taxonomy and genetics. Mycotaxon. 27: 405-409.<br />

Moestrup, Ø. (2000): The flagellate cytoskeleton: introduction of a general<br />

terminology for microtubular roots in protists. In: Leadbeater B. S. and Green J.<br />

115


116<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

C. (eds): The flagellates: unity, diversity and evolution, pp. 69–94. Taylor and<br />

Francis, Lon<strong>do</strong>n.<br />

Moncalvo, J. M., Vilgalys, R., Redhead, S. A., Johnson, J. E., James, T. Y.,<br />

Aime, M. C., Hofstetter, V., Verduin; S. J. W., Larsson, E., Baroni, T. J., Thorn,<br />

R.G., Jacobsson, S., Clémençon, H., and Miller, jr, O. K. (2002). One hundred<br />

and seventeen clades of euagarics. Molecular Phylogenetics and Evolution. 23:<br />

357–400.<br />

Moss, M. O. (1987). Fungal biotechnology roundup. Mycologist, 21: 55-58.<br />

Moxon, E. R., Rainey, P. B., Nowak, M. A., and Lenski, R. E. (1994). Adaptive<br />

evolution of highly mutable loci in pathogenic bacteria. Curr Biol. 4: 24–33.<br />

Mueller, C. G., and Nordheim, A. (1991). A protein <strong>do</strong>main conserved between<br />

yeast MCM1 and human SRF directs ternary complex formation. EMBO J. 10:<br />

4219– 4229.<br />

Musial, C. E., Cockerill, F. R., and Roberts, G. D. (1988). Fungal infections of<br />

the immunocompromised host: Clinical and laboratory aspects. Clin Microbiol<br />

Rev. 1: 349–364.<br />

Nagahama, T., Sato, H., Shimazu, M., and Sugiyama, J. (1995). Phylogenetic<br />

divergence of the entomophthoralean fungi: evidence from nuclear 18S<br />

ribosomal RNA gene sequences. Mycologia. 87: 203–209.<br />

Nagy, Á., Pesti, M., Galgóczy, M., and Vágvölyi, C. (2004). Electrophoretic<br />

karyotype of two Micromucor species. Journal of Basic Microbiology. 44(1):<br />

36-41.<br />

Nadal Ed, E., Casa<strong>do</strong>me, L., and Posas, F. (2003). Targeting the MEF2-like<br />

transcription factor Smp1 by the stress-activated Hog1 mitogen-activated protein<br />

kinase. Mol Cell Biol. 23: 229– 237.<br />

Nei, M. (1996). Phylogenetic analysis in molecular evolutionary genetics. Annu<br />

Rev Genet. 30: 371-403.<br />

Nielsen, R., and Yang, Z. (1998). Likelihood models for detecting positively<br />

selected amino acid sites and applications to the HIV-1 envelope gene. Genetics.<br />

148: 929–936.<br />

Nielsen, C. B., Friedman, B., Birren, B., Burgue, C. B., and Galagan, J. E.<br />

(2004). Patterns of intron gain and loss in Fungi. Plos Biology. 2(12): e422.


REFERENCES<br />

Nogueira, M. I., Silva, P., Galin<strong>do</strong>, I., Ramirez, J. L., Arruda, R., and Franco, J.<br />

(1998). Electrophoretic karyotypes and genome sizing of the pathogenic fungus<br />

Paracoccidioides brasiliensis. J Clin Microbiol. 36(3): 742-747.<br />

Norman, C., Runswick, M., Pollock, R., and Treisman, R. (1988). Isolation and<br />

properties of cDNA clones encoding SRF, a transcription factor that binds to the<br />

c-fos serum response element. Cell. 55: 989– 1003.<br />

Nylander, J. A. A. (2004). MrModeltest v2. Program distributed by the author.<br />

Evolutionary Biology Centre, Uppsala University.<br />

Odds, F. C. (1988). Candida and Candi<strong>do</strong>sis: A Review and Bibliography. 2 ed.<br />

Baillière Tindall, Lon<strong>do</strong>n.<br />

Ohno, S. (1970). Evolution by Gene Duplication (Allen and Unwin, Lon<strong>do</strong>n).<br />

Ophuls, M. D. (1905). Further observations on a pathogenic mould formerly<br />

described as a protozoon (Coccidioides immitis, Coccidioides pyrogenes). J Exp<br />

Med. 6: 443-485.<br />

Pace, N. R. (1997). A molecular view of microbial diversity and the biosphere.<br />

Science. 276: 734–740.<br />

Pan, S., Sigler, L., and Cole, G. T. (1994). Evidence for a phylogenetic<br />

connection between Coccidioides immitis and Uncinocarpus reesii<br />

(Onygenaceae). Microbiology. 140 (Pt 6): 1481-1494.<br />

Passmore, S., Maine, G. T., Elble, R., Christ, C., and Tye, B. K. (1988).<br />

Saccharomyces cerevisiae protein involved in plasmid maintenance is necessary<br />

for mating of MAT alpha cells. J Mol Biol. 204: 593– 606.<br />

Passmore, S., Elble, R., and Tye, B. K.. (1989). A protein involved in<br />

minichromosome maintenance in yeast binds a transcriptional enhancer<br />

conserved in eukaryotes. Genes Dev. 3: 921– 935.<br />

Peak, I. R. A., Jennings, M. P., Hood, D. W., Bisercic, M., and Moxon, E. R.<br />

(1996). Tetrameric repeat units associated with virulence factor phase variation<br />

in Haemophilus also occur in Neiserria spp. and Moraxella catarrhalis. FEMS<br />

Microbiol Lett. 137: 109–114.<br />

Pedrós, M. (2003). Clonación y caracterización de una hidrofobina de clase II<br />

(CaHPB) de Candida albicans. Tesis Doctoral. Facultad de Farmacia.<br />

Universitat de Valencia. Valencia, España. 215 pag.<br />

117


118<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Pellegrini, L., Tan, S., and Richmond, T. J. (1995). Structure of serum response<br />

factor core bound to DNA. Nature. 376: 490– 498.<br />

Peretó, J., López-García, P., and Moreira, D. (2004). Ancestral lipid biosynthesis<br />

and early membrane evolution. Trends Biochem Sci. 29: 469-477.<br />

Persoh, D., and Rambold, G. (2003). Fungal diversity in the host lichen Letharia<br />

vulpine. In J. Stadler, I. Hensen, S. Klotz, and H. Feldmann (eds.),<br />

Biodiversity—from patterns to processes. Kurzfassungen der Beiträge zur 33.<br />

Jahrestagung der Gesellschaft für Ökologie, 161. Verhandlungen der<br />

Gesellschaft für Ökologie, Halle/Saale, Germany.<br />

Peterson, S. W. (2000). Phylogenetic relationships in Aspergillus based on<br />

rDNA sequences analysis. In R. A. Samson and J. I. Pitt (eds.), Integration of<br />

modern taxonomic methods for Penicillium and Aspergillus classification.<br />

Harwood Academic Publishers, Amsterdam, Netherlands.<br />

Peyretaillade, E., Broussolle, V., Peyret, P., Méténier, G., Gouy, M., Vivarès, C.<br />

P. (1998). Microsporidia, amitochondrial protists, possess a 70-kDa heat shock<br />

protein gene of mitochondrial evolutionary origin. Molecular Biology and<br />

Evolution. 15: 683–689.<br />

Pfaller, M. A., Diekema, D. J., Jones, R. N., Sader, H. S., Fluit, A. C., Hollis, R.<br />

J., and Messer, S. A. (2001). International surveillance of bloodstream infections<br />

due to Candida species: frequency of occurrence and in vitro susceptibilities to<br />

fluconazole, ravuconazole, and voriconazole of isolates collected from 1997<br />

through 1999 in the SENTRY antimicrobial surveillance program. J Clin<br />

Microbiol. 39: 3254–3259.<br />

Pfaller, M. A., Diekema, D. J., Messer, S. A., Boyken, L., Hollis, R. J., and<br />

Jones, R. N. (2003). In vitro activities of voriconazole, posaonazole, and four<br />

licensed systemic antifungal agents against Candida species infrequently<br />

isolated from blood. J Clin Microbiol. 41: 78–83.<br />

Phillips, M. J., Delsuc, F., and Penny, D. (2004). Genome-scale phylogeny and<br />

the detection of systematic biases. Mol Biol Evol. 21(7): 1455-1458.<br />

Pirozynski, K. A., and Malloch, D. (1975). The origin of land plants: a matter of<br />

mycotropism. BioSystems. 6: 153–164.


REFERENCES<br />

Pollock, R., and Treisman, R. (1991). Human SRF-related proteins:<br />

DNAbinding properties and potential regulatory targets. Genes Dev. 5: 2327–<br />

2341.<br />

Polonelli, L., Conti, S., Magliani, W., and Horace, G. (1989). Biotyping of<br />

pathogenic fungi by the killer system and with monoclonal antibodies.<br />

Mycopathologia. 107(1): 17 – 23.<br />

Posada, D. (2008). jModelTest: Phylogeny model averaging. Mol Biol Evol.<br />

25(7): 1253-1256.<br />

Pujol, C., Daniels, K. J., Lockhart, S. R., Srikantha, T., Radke, J. B., Geiger, J.,<br />

and Soll, D. R. (2004). The closely related species Candida albicans and<br />

Candida dubliniensis can mate. Eukaryot Cell. 3(4): 1015-1027.<br />

Raes J., and van de Peer, Y. (2005). Functional divergenceof proteins through<br />

frameshift mutations. Trends in Genetics. 21(8): 428-431.<br />

Ragan, M. A., Goggins, C. L., Cawthorn, R. J., Cerenius, L., Jamieson, A. V. C.,<br />

Plourde, S. M., Rand, T. G., Söderhäll, K., and Gutell, R. R. (1996). A novel<br />

clade of protistan parasites near the animal-fungal divergence. Proc Natl Acad<br />

Sci. 93: 11907–11912.<br />

Reeb, V., Lutzoni, F., and Roux, C. (2004). Contribution of RPB2 to multilocus<br />

phylogenetic studies of the euascomycetes (Pezizomycotina, Fungi) with special<br />

emphasis on the lichen-forming Acarosporaceae and evolution of polyspory.<br />

Molecular Phylogenetics and Evolution. 32: 1036-1060.<br />

Ribeiro, E., Pereira, C., Takaki, R., and Höfling, J.F. (2000). Grouping oral<br />

Candida species by multilocus enzyme electrophoresis. Int J Syst Evol<br />

Microbiol. 50: 1343-1349.<br />

Richard, G.–F., and Dujon, B. (1997). Trinucleotide repeats in yeast. Res<br />

Microbiol, 148: 731–744.<br />

Riechmann, J. L., and Meyerowitz, E. M. (1997). Determination of floral organ<br />

identity by Arabi<strong>do</strong>psis MADS <strong>do</strong>main homeotic proteins AP1, AP3, PI, and<br />

AG is independent of their DNA-binding specificity. Mol Biol Cell. 8: 1243–<br />

1259.<br />

Riechmann, J. L., Krizek, B. A., and Meyerowitz, E. M. (1996). Dimerization<br />

specificity of Arabi<strong>do</strong>psis MADS <strong>do</strong>main homeotic proteins APETALA1,<br />

119


120<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

APETALA3, PISTILLATA, and AGAMOUS. Proc Natl Acad Sci. 93: 4793 –<br />

4798.<br />

Rixford, E., and Gilchrist, C. (1896). Two cases of protozoan (coccidioidal)<br />

infection of the skin and other organs. Johns Hopkins Hospital Report. 1: 209-<br />

268.<br />

Rensberger, B. (1992). The Iceman: Now the research is on ice. J NIIT Res. 4:<br />

25–27.<br />

Rodrigues de Miranda, L. (1979). Clavispora, a new yeast genus of the<br />

Saccharomycetales. Antonie Van Leeuwenhoek. 45(3): 479-483.<br />

Roger, A. J, and Hug, L. A. (2006). The origin and diversification of eukaryotes:<br />

problems with molecular phylogenetics and molecular clock estimation. Philos<br />

Trans R Soc Lond B Biol Sci. 361: 1039-1054.<br />

Rokas, A., King, N., Finnerty, J., and Carroll, S. B. (2003). Conflicting<br />

phylogenetic signals at the base of the metazoan tree. Evol. Dev. 5: 346–359.<br />

Romero, A. J., and Minter, D. W. (1988). Fluorescence microscopy: an aid to the<br />

elucidation of ascomycete structures. Trans Br Mycol Soc. 90: 457–470.<br />

Rottmann, M., Dieter, S., Brunner, H., and Rupp, S. (2003). A screen in<br />

Saccharomyces cerevisiae identified CaMCM1, an essential gene in Candida<br />

albicans crucial for morphogenesis. Mol Microbiol. 47: 943–959.<br />

Saitou, N., and Nei, M. (1987). The neighbor-joining method: a new method for<br />

reconstructing phylogenetic trees. Mol Biol Evol. 4: 406–425.<br />

Sampaio, P., Gusmão, L., Alves, C., Pina-Vaz, C., Amorim, A., and Pais, C.<br />

(2003). Highly polymorphic microsatellite for identification of Candida albicans<br />

strains. J Clin Microbiol. 41(2): 552-557.<br />

Sampaio, P., Gusmão, L., Correia, A., Alves, C., Rodrigues, A., Pina-Vaz, C.,<br />

Amorim, A., and Pais, C. (2005). New microsatellite multiplex PCR for Candida<br />

albicans strain typing reveals microevolutionary changes. J Clin Microbiol.<br />

43(8): 3869-3876.<br />

Sampaio, P., Nogueira, E., Loureiro, A. S., Delga<strong>do</strong>-Silva, Y, Correia, A., Pais,<br />

C. (2009). Increased number of glutamine repeats in the C-terminal of Candida<br />

albicans Rlm1p enhances the resistant to stress agents. Antonie van<br />

Leeuwehoek. In press.


REFERENCES<br />

Sandven, P. (2000). Epidemiology of candidemia. Rev Iberoam Micol. 17: 73–<br />

81.<br />

San-Millan, R., Ribacoba, L., Ponton, J., and Quin<strong>do</strong>s, G. (1996). Evaluation of<br />

a commercial medium for identification of Candida species. J Clin Microbiol<br />

Infect Dis. 15: 153 – 158.<br />

Santos, M. A., and Tuite, M. F. (1995). The CUG co<strong>do</strong>n is decoded in vivo as<br />

serine and not leucine in Candida albicans. Nucleic Acids Res. 23(9): 1481-<br />

1486.<br />

Santelli, E., and Richmond, T.J. (2000). Crystal structure of MEF2A core bound<br />

to DNA at 1.5 A resolution. J Mol Biol. 297: 437– 449.<br />

Savelkoul, P. H. M., Aarts, H. J. M., de Haas, J., Dijkshoorn, L., Duim, B.,<br />

Otsen, M., Rademaker, J. L. W., Schouls, L., and Lenstra, J. A. 1999. Amplified<br />

fragment length polymorphism analysis: the state of an art. J Clin Microbiol. 37:<br />

3083–3091.<br />

Scannell, D. R., Byrne, K. P., Gor<strong>do</strong>n, J. L., Wong, S., and Wolfe, K. H. (2006).<br />

Multiple rounds of speciation associated with reciprocal gene loss in polyploid<br />

yeasts. Nature. 440(7082): 341-345.<br />

Shalchian-Tabrizi, K., Minge, M. A., Espelund, M., Orr, R., Ruden, T.,<br />

Jakobsen, K. S., and Cavalier-Smith, T. (2008). Multigene phylogeny of<br />

Choanozoaand the Origin of Animals. Plos ONE. 3(5): e2098.<br />

Schoch, C. L., Shoemaker, R. A., Seifert, K. A. Hambleton, S., Spatafora, J. W.,<br />

and Crous, P. W. (2006). A multigene phylogeny of the Dothideomycetes using<br />

four nuclear loci. Mycologia. 98(6): 1041-1052.<br />

Schüßler, A., Schwarzott, D., and Walker, C. (2001). A new fungal phylum, the<br />

Glomeromycota: phylogeny and evolution. Mycologycal Research. 105: 1413-<br />

1421.<br />

Schwarz-Sommer, Z., Huijser, P., Nacken, W., Saedler, H., and Sommer, H.<br />

(1990). Genetic control of flower development by homeotic genes in<br />

Antirrhinum majus. Science. 250: 931– 936.<br />

Seifert, K. A., Wingfield, B. D., and Wingfield, M. J. 1995. A critique of DNA<br />

sequence analysis in the taxonomy of filamentous Ascomycetes and<br />

ascomycetous anamorphs. Can J Bot. 73 (Suppl. 1): S760–S767.<br />

121


122<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Seoighe, C., and Wolfe, K. H. (1999). Updated map of duplicated regions in the<br />

yeast genome. Gene. 238: 253–261.<br />

Sharrocks, A. D., Gille, H., and Shaw, P. E. (1993). Identification of amino acids<br />

essential for DNA binding and dimerization in p67SRF: implications for a novel<br />

DNA-binding motif. Mol Cell Biol. 13: 123– 132.<br />

Shemer, R., Weissman, Z., Hashman, N., and Kornitzer, D. 2001. A highly<br />

polymorphic degenerate microsatellite for molecular strain typing of Candida<br />

krusei. Microbiology. 147: 2021-2028.<br />

Shin, J. H., Shin, D. H., Song, J. W., Kee, S. J., Suh, S. P., and Ryang, D. W.<br />

(2001). Electrophoretic karyotype analysis of sequential Candida parapsilosis<br />

isolates from patients with persistent or recurrent fungemia. J Clin Microbiol.<br />

39(4): 1258-1263.<br />

Sneath, P. H. A., and Sokal, R. R. (1973). Numerical Taxonomy. W.H. Freeman<br />

and Company, San Francisco, pp 230-234.<br />

Soll, D. R., Staebell, M., Langtimm, C.J., Pfaller, M., Hicks, J., and Rao, T.V.G.<br />

(1988). Multiple Candida strains in the course of a single systemic infection. J.<br />

Clin. Microbiol. 26: 1448-1459.<br />

Soll, D. (2000). The ins and out of DNA fingerprinting the infectious fungi. Clin<br />

Microbiol Rev. 13(2): 332-370.<br />

Sommer, H., Beltran, J. P., Huijser, P., Pape, H., Lonnig, W. E., Saedler, H., and<br />

Schwarz-Sommer, Z. (1990). Deficiens, a homeotic gene involved in the control<br />

of flower morphogenesis in Antirrhinum majus: the protein shows homology to<br />

transcription factors. EMBO J. 9: 605– 613.<br />

Spatafora, J. W., Sung, G-H., Johnson, D., Hesse, C., O‘Rourke, B., Serdani, M.,<br />

Spotts, R., Lutzoni, F., Hofstetter, V., Miadlikowska, J., Reeb, V., Gueidan, C.,<br />

Fraker, E., Lumbsch, T., Lücking, R., Schmitt, I., Hosaka, K., Aptroot, A.,<br />

Roux, C., Miller, A. N., Geiser, D. M., Hafellner, J., Hestmark, G., Arnold, A.<br />

E., Büdel, B., Rauhut, A., Hewitt, D., Untereiner, W. A., Cole, M. S.,<br />

Scheidegger, C., Schultz, M., Sipman, H., and Schoch, C. L. (2006). A five-gene<br />

phylogeny of Pezizomycotina. Mycologia. 98(6): 1018-1028.<br />

Stechmann, A., and Cavalier-Smith, T. (2002). Rooting the eukaryote tree by<br />

using a derived gene fusion. Science. 297: 89–91.


REFERENCES<br />

Stechmann, A., and Cavalier-Smith, T. (2003). The root of the eukaryote tree<br />

pinpointed. Curr Biol. 13: R665–R666.<br />

Steenkamp, E. T., Wright, J., and Baldauf, S. (2006). The protistan origins of<br />

Animals and Fungi. Mol Biol Evol. 23(1): 93 -106.<br />

Stenroos, S., Hyvönen, J., Myllys, L., Thell, A., and Ahti, T. (2002). Phylogeny<br />

of the genus Cla<strong>do</strong>nia s. lat. (Cla<strong>do</strong>niaceae, Ascomycetes) inferred from<br />

molecular, morphological, and chemical data. Cladistics. 18: 237–278.<br />

Stern, A., Doron-Faigenboim, A., Erez, E., Martz, E., Bacharach, E., and Pupko,<br />

T. (2007). Selecton 2007: advanced models for detecting positive and purifying<br />

selection using a Bayesian inference approach. Nucleic Acids Research. 35:<br />

W506-W511.<br />

Sugita, T., Takashima, M., Poonwan, N., and Mekha, N. (2006). Candida<br />

pseu<strong>do</strong>haemulonii an amphotericin B- and azole-resistant yeast species, isolated<br />

from the blood of a patient from Thailand. Microbiol Immunol. 50(6): 469-473.<br />

Suh, S-O., Blackwell, M., Kurtzman, C., and Lachance, M-A. (2006).<br />

Phylogenetics of Saccharomycetales, the ascomycete yeasts. Mycologia. 98(6):<br />

1006-1017.<br />

Sullivan, D., and Coleman, D. (1997). Candida dubliniensis: an emerging<br />

opportunistic pathogen. Curr Top Med Mycol. 8: 15–25.<br />

Sullivan, D. J., Westerneng, T. J., Haynes, K. A., Bennet, D. E., and Coleman,<br />

D. C. (1995). Candida dubliniensis sp. nov.: Phenotypic and molecular<br />

characterization of a novel species associated with oral candi<strong>do</strong>sis in HIV-<br />

infected individuals. Microbiology. 141: 1507-1521.<br />

Sundstrom, P. (2002). Adhesion in Candida spp. Cell Microbiol. 4: 461–469.<br />

Suzuki, Y., Glazko, G. V., and Nei, M. (2002). Overcredibility of molecular<br />

phylogenies obtained by Bayesian phylogenetics. Proc Natl Acad Sci. 99:<br />

16138–16143.<br />

Swanson, W. J., Yang, Z., Wolfner, M., and Aqua<strong>do</strong>, C. F. (2001). Positive<br />

Darwinian selection drives the evolution of several female reproductive proteins<br />

in mammals. Proc Natl Acad Sci. 98: 2509–2514.<br />

Swofford, D. L. (2002). PAUP: Phylogenetic analysis using parsimony (*and<br />

other methods). Sunderland, Massachusetts, Sinauer Associates.<br />

123


124<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Tachida, H., and Izuka, M. (1992). Persistence of repeated sequences that evolve<br />

by replication slippage. Genetics. 131: 471–478.<br />

Takezaki, N., and Nei, M. (1994). Inconsistency of the maximum parsimony<br />

method when the rate of nucleotide substitution is constant. J Mol Evol. 39:<br />

210–218.<br />

Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007). MEGA4: Molecular<br />

Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular<br />

Biology and Evolution. 24: 1596-1599.<br />

Tanabe, Y., O‘Donnell, K., Saikawa, M., and Sugiyama, J. (2000). Molecular<br />

phylogeny of parasitic Zygomycota (Dimargaritales, Zoopagales) based on<br />

nuclear small subunit ribosomal DNA sequences. Molecular Phylogenetics and<br />

Evolution. 16: 253–262.<br />

Tanabe, Y., Saikawa, M., Watanabe, M. M., and Sugiyama, J. (2004). Molecular<br />

phylogeny of Zygomycota based on EF-1 and RPB1 sequences: limitations and<br />

utility of alternative markers to rDNA. Molecular Phylogenetics and Evolution.<br />

30: 438–449.<br />

Tanabe, Y., Watanabe, M. M., and Sugiyama, J. (2005). Evolutionary<br />

relationships among basal fungi (Chytridiomycota and Zygomycota): Insights<br />

from molecular phylogenetics. Journal of General and Applied Microbiology.<br />

51: 267–276.<br />

Tateishi, T., Murayama, S. Y., Otsuka, F., and Yamaguchi, H. (1996).<br />

Karyotyping by PFGE of clinical isolates of Sporothrix schenckii. FEMS<br />

Immunol Med Microbiol. 13: 147–154.<br />

Tavanti, A., Davidson, A., Johnson, E., Maiden, M., Shaw, D., Gow, N., and<br />

Odds, F. (2005a). Multilocus sequence typing for differentiation of strains of<br />

Candida tropicalis. J Clin Microbiol. 43(11): 5593-5600.<br />

Tavanti, A., Davidson, A., Gow, N., Maiden, M., and Odds, F. (2005b). Candida<br />

orthopsilosis and Candida metapsilosis spp. nov. to replace Candida<br />

parapsilosis groups II and III. J Clin Microbiol. 43(1): 284-292.<br />

Taylor, F. J. R. (1978). Problems in the development of an explicit hypothetical<br />

phylogeny of the lower eukaryotes. BioSystems. 10: 67–89.


REFERENCES<br />

Taylor, J. W., Geiser, D. M., Burt, A., and Koufopanou, V. (1999). The<br />

evolutionary biology and population genetics underlying fungal strain typing.<br />

Clin Microbiol Rev. 12(1): 126-146.<br />

Tehler, A. (1988). A cladistic outline of the Eumycota. Cladistics. 4: 227–277.<br />

Tehler, A., Farris, J. S., Lipscomb, D. L., Källerrsjö, M. (2000). Phylogenetic<br />

analyses of the fungi based on large rDNA data sets. Mycologia. 92: 459–474.<br />

Tehler, A., Little, D. P., and Farris, J. S. 2003. The full-length phylogenetic tree<br />

from 1551 ribosomal sequences of chitinous fungi, Fungi. Mycological<br />

Research. 107: 901–916.<br />

Theissen, G., Kim, J., and Saedler, H. 1996. Classification and phylogeny of the<br />

MADS-box multigene family suggest defined roles of MADS-box gene<br />

subfamilies in the morphological evolution of eukaryotes. J Mol Evol. 43: 484-<br />

516.<br />

Thomas, J. R., Dwek, R. A., and Rademacher, T. W. (1990). Structure,<br />

biosynthesis, and function of glycosylphosphatidylinositols. Biochemistry. 29:<br />

5413–5422.<br />

Thompson, J. D., Higgins, D. G., and Gibson T. J. (1994). CLUSTALW:<br />

improving the sensitivity of progressive multiple sequence alignment through<br />

sequence weighting, position-specific gap penalties and weight matrix choice.<br />

Nucleic Acids Res. 22: 4673–4680.<br />

Tiede, A., Bastisch, I., Schubert, J., Orlean, P., and Schmidt, R. E. (1999).<br />

Biosynthesis of glycosylphosphatidylinositols in mammals and unicellular<br />

microbes. Biol Chem. 380: 503–523.<br />

Tietz, H., Hopp, M., Schmalreck, A., Sterry, W., and Czaika, V. (2001).<br />

Candida africana sp. nov. a new human pathogen or a variant of Candida<br />

albicans?. Mycoses. 44: 437-445.<br />

Toth, G., Gaspari, Z., and Jurka, J. (2000). Microsatellites in different eukaryotic<br />

genomes: survey and analysis. Genome Res. 10: 967–981.<br />

Treisman, R. (1990). The SRE: a growth factor responsive transcriptional<br />

regulator. Semin. Cancer Biol. 1: 47– 58.<br />

Trifonov, E. N. (2003). Tuning function of tandemly repeating sequences: a<br />

molecular device for fast adaptation. Pp. 1–24 in S. P. Wasser, ed., Evolutionary<br />

125


126<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

theory and processes: nodern horizons, papers in honor of Eviatar Nevo. Kluwer<br />

Academic Publishers. Amsterdam, The Netherlands.<br />

Tzung, K. W., Williams, R. M., Scherer, S., Federspiel, N., Jones, T., Hansen,<br />

N., Bivolarevic, V., Huizar, L., Komp, C., Surzycki, R., Tamse, R., Davis, R.<br />

W., and Agabian, N. (2001). Genomic evidence for a complete sexual cycle in<br />

Candida albicans. Proc Natl Acad Sci. 98(6): 3249-3253.<br />

Valenza, G., Valenza, R., Brederlau, J., Frosch, M., and Kurzai, O. (2006).<br />

Identification of Candida fabianii as a cause of letal septicaemia. Mycoses. 49:<br />

331-334.<br />

Van Belkum, A., Scherer, S., van Alphen, L., and Verbrugh, H. (1998). Short-<br />

sequence DNA repeats in prokaryotic genomes. Microbiol Mol Biol Rev. 62:<br />

275–293.<br />

Van de Peer, Y., Baldauf, S. L., Doolittle, W. F., and Meyer, A. (2000). An<br />

updated and comprehensive rRNA phylogeny of (crown) eukaryotes based on<br />

rate-calibrated evolutionary distances. J Mol Evol. 51: 565 – 576.<br />

Van der Walt, J. P. (1966). Lodderomyces, a new genus of the<br />

Saccharomycetaceae. Antonie Van Leeuwenhoek 1966. 32(1): 1-5.<br />

Voigt, K., and Wöstemeyer, J. (2001). Phylogeny and origin of 82 zygomycetes<br />

from all 54 genera of the Mucorales and Mortierellales based on combined<br />

analysis of actin and translation elongation factor EF-1α genes. Gene. 270: 113-<br />

120.<br />

Vos, P., Hogers, R., Bleeker, M., Reijans, M., Van de Lee, T., Hornes, M.,<br />

Frijters, A., Pot, J., Peleman, J., Kuiper, M., and Zabeau, M. (1995). AFLP: a<br />

new technique for DNA fingerprinting. Nucleic Acids Res. 23: 4407-4414.<br />

Wainright, P. O., Hinkle, G., Sogin, M. L., and Stickel, S. K. (1993).<br />

Monophyletic origins of the Metazoa: an evolutionary link with fungi. Science.<br />

260: 340 – 342.<br />

Wang, Z.; Binder, M.; Dai, Y.C.; and Hibbett, D.S. 2004. Phylogenetic<br />

relationships of Sparassis inferred from nuclear and mitochondrial ribosomal<br />

DNA and RNA polymerase sequences. Mycologia. 96: 1013-1027.<br />

Watanabe, Y., Irie, K., and Matsumoto, K. (1995). Yeast RLM1 encodes a<br />

serum response factor-like protein that may function <strong>do</strong>wnstream of the Mpk1<br />

(Slt2) mitogen-activated protein kinase pathway. Mol Cell Biol. 15: 5740–5749.


REFERENCES<br />

Weig, M., Jansch, L., Gross, U., De Koster, C. G., Klis, F. M., and De Groot, P.<br />

W. (2004). Systematic identification in silico of covalently bound cell wall<br />

proteins and analysis of protein-polysaccharide linkages of the human pathogen<br />

Candida glabrata. Microbiology. 150: 3129–3144.<br />

Wernersson, R. (2006). Virtual Ribosome-a comprehensive DNA translation<br />

tool with support for integration of sequence feature annotation. Nucleic Acids<br />

Research. 34: W385-W388.<br />

Whalley, A. J. S., and Edwards, R. L. (1995). Secondary metabolites and<br />

systematic arrangement within the Xylariaceae. Can J Bot. 73(Suppl. 1): S802-<br />

S810.<br />

Wickerham, L. J., and Burton, K. A. (1954). A clarification of the relationship of<br />

Candida guilliermondii to other yeasts by a study of their mating types. J<br />

Bacteriol. 68(5): 594-597.<br />

Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S.<br />

V. (1990). DNA polymorphisms amplified by arbitrary primers are useful as<br />

genetic markers. Nucleic Acids Research. 18: 6531-6535.<br />

Whittaker, R. (1969). New concepts of king<strong>do</strong>ms of organisms. Science. 163:<br />

150–160.<br />

Wingard, J. R., Mertz, W. G., Rinaldi, M. G., Johnson, T. R., Karp, J. E., and<br />

Saral, R. (1991). Increase in Candida krusei infection among patients with bone<br />

marrow transplant and neutropenia treated prophylactically with fluconazole. N<br />

Engl J Med. 325: 1274–1277.<br />

Wolfe, K. H., and Shields, D. C. (1997). Molecular evidence for an ancient<br />

duplication of the entire yeast genome. Nature. 387: 708–713.<br />

Woese, C., and Fox, G. (1977). Phylogenetic structure of the prokaryotic<br />

<strong>do</strong>main: The primary king<strong>do</strong>ms. Proc Natl Acad Sci. 74(11): 5088 – 5090.<br />

Woese, C.; Kandler, O.; and Wheelis, M. (1990). Towards a Natural System of<br />

Organisms: Proposal for the <strong>do</strong>mains Archaea, Bacteria, and Eucarya. Proc Natl<br />

Acad Sci. 87(12): 4576-4579.<br />

Wong, W. S. W., Yang, Z., Goldman, N., and Nielsen, R. (2004). Accurancy and<br />

power of statistical methods for detecting adaptive evolution in protein coding<br />

sequences and for identifying positive selective sites. Genetics. 168: 1041–1051.<br />

127


128<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Wren, J. D., Forgacs, E., Fon<strong>do</strong>n, J. W., 3rd, Pertsemlidis, A., Cheng, S. Y.,<br />

Gallar<strong>do</strong>, T., Williams, R. S., Shohet, R. V., Minna, J. D., and Garner, H. R.<br />

(2000). Repeat polymorphisms within gene regions: phenotypic and<br />

evolutionary implications. Am J Hum Genet. 67: 345–356.<br />

Yabana, N., and Yamamoto, M. (1996). Schizosaccharomyces pombe map1+<br />

encodes a MADS-box-family protein required for cell-type-specific gene<br />

expression. Mol Cell Biol. 16: 3420 – 3428.<br />

Yanofsky, M. F., Ma, H., Bowman, J. L., Drews, G. N., Feldmann, K. A., and<br />

Meyerowitz, E. M. (1990). The protein encoded by the Arabi<strong>do</strong>psis homeotic<br />

gene agamous resembles transcription factors. Nature. 346: 35– 39.<br />

Yang, Z. (1997). PAML: a program package for phylogenetic analysis by<br />

maximum likelihood. Comput Appl Biosci. 13: 555–556.<br />

Yang, Z., and Bielawski, J. P. (2000). Statistical methods for detecting<br />

molecular adaptation. Trends Ecol Evolut. 15: 496–503.<br />

Yang, Z., Wong, W. S., and Nielsen, R. (2005) Bayes empirical Bayes inference<br />

of amino acid sites under positive selection. Mol Biol Evol. 22: 1107–1118.<br />

Yang, Z., Nielsen, R., Goldman, N., and Pedersen, A. M. K, (2000). Co<strong>do</strong>n<br />

substitution models for heterogeneous selection pressure at amino acid sites.<br />

Genetics. 155: 431–449.<br />

Young, L. Y., Lorenz, M. C., and Heitman, J. (2000). A STE12 homolog is<br />

required for mating but dispensable for filamentation in Candida lusitaniae.<br />

Genetics. 155(1): 17-29.<br />

Young, E. T., Sloan, J. S., and van Riper, K. (2000). Trinucleotide repeats are<br />

clustered in regulatory genes in Saccharomyces cerevisiae. Genetics. 154: 1053–<br />

1068.<br />

Yu, G., and Fassler, J. S. (1993). SPT13 (GAL11) of Saccharomyces cerevisiae<br />

negatively regulates activity of the MCM1 transcription factor in Ty1 elements.<br />

Mol Cell Biol. 13: 63 – 71.<br />

Zhang, N., Clastebury, L. A., Miller, A. N., Huhn<strong>do</strong>rf, S. M., Schoch, C. L.,<br />

Seifert, K. A., Rossman, A. Y., Rogers, J. D., Kohlmeyer, J., Volkmann-<br />

Kohlmeyer, B., and Sung. G-H. (2006). An overview of the systematic of the<br />

Sordariomycetes based on a four-gene phylogeny. Mycologia. 98(6): 1076-<br />

1087.


ANNEXES


7. ANNEXES<br />

MrModelblock<br />

#NEXUS<br />

[Johan Nylander 2004-07-01]<br />

ANNEX I<br />

[! ***** MrModeltest block -- Modified from MODELTEST 3.0 *****]<br />

ANNEXES<br />

[The following command will calculate a NJ tree using the JC69 model of evolution]<br />

BEGIN PAUP;<br />

Log file= mrmodelfit.log replace;<br />

DSet distance=JC objective=ME base=equal rates=equal pinv=0<br />

subst=all negbrlen=setzero;<br />

NJ showtree=no breakties=ran<strong>do</strong>m;<br />

End;<br />

[!<br />

***** BEGIN TESTING 24 MODELS OF EVOLUTION ***** ]<br />

BEGIN PAUP;<br />

Default lscores longfmt=yes; [Workaround for the bug in PAUP 4b10]<br />

Set criterion=like;<br />

[!<br />

** Model 1 of 24 * Calculating JC **]<br />

lscores 1/ nst=1 base=equal rates=equal pinv=0<br />

scorefile=mrmodel.scores replace;<br />

[!<br />

** Model 2 of 24 * Calculating JC+I **]<br />

lscores 1/ nst=1 base=equal rates=equal pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 3 of 24 * Calculating JC+G **]<br />

lscores 1/ nst=1 base=equal rates=gamma shape=est pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 4 of 24 * Calculating JC+I+G **]<br />

lscores 1/ nst=1 base=equal rates=gamma shape=est pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 5 of 24 * Calculating F81 **]<br />

lscores 1/ nst=1 base=est rates=equal pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 6 of 24 * Calculating F81+I **]<br />

lscores 1/ nst=1 base=est rates=equal pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

131


132<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

pinv=0<br />

** Model 7 of 24 * Calculating F81+G **]<br />

lscores 1/ nst=1 base=est rates=gamma shape=est pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 8 of 24 * Calculating F81+I+G **]<br />

lscores 1/ nst=1 base=est rates=gamma shape=est pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 9 of 24 * Calculating K80 **]<br />

lscores 1/ nst=2 base=equal tratio=est rates=equal pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 10 of 24 * Calculating K80+I **]<br />

lscores 1/ nst=2 base=equal tratio=est rates=equal pin=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 11 of 24 * Calculating K80+G **]<br />

lscores 1/ nst=2 base=equal tratio=est rates=gamma shape=est pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 12 of 24 * Calculating K80+I+G **]<br />

lscores 1/ nst=2 base=equal tratio=est rates=gamma shape=est pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 13 of 24 * Calculating HKY **]<br />

lscores 1/ nst=2 base=est tratio=est rates=equal pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 14 of 24 * Calculating HKY+I **]<br />

lscores 1/ nst=2 base=est tratio=est rates=equal pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 15 of 24 * Calculating HKY+G **]<br />

lscores 1/ nst=2 base=est tratio=est rates=gamma shape=est pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 16 of 24 * Calculating HKY+I+G **]<br />

lscores 1/ nst=2 base=est tratio=est rates=gamma shape=est pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 17 of 24 * Calculating SYM **]<br />

lscores 1/ nst=6 base=equal rmat=est rclass= (a b c d e f) rates=equal<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 18 of 24 * Calculating SYM+I **]<br />

lscores 1/ nst=6 base=equal rmat=est rates=equal pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 19 of 24 * Calculating SYM+G **]


END;<br />

ANNEXES<br />

lscores 1/ nst=6 base=equal rmat=est rates=gamma shape=est pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 20 of 24 * Calculating SYM+I+G **]<br />

lscores 1/ nst=6 base=equal rmat=est rates=gamma shape=est pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 21 of 24 * Calculating GTR **]<br />

lscores 1/ nst=6 base=est rmat=est rates=equal pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 22 of 24 * Calculating GTR+I **]<br />

lscores 1/ nst=6 base=est rmat=est rates=equal pinv=est<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 23 of 24 * Calculating GTR+G **]<br />

lscores 1/ nst=6 base=est rmat=est rates=gamma shape=est pinv=0<br />

scorefile=mrmodel.scores append;<br />

[!<br />

** Model 24 of 24 * Calculating GTR+I+G **]<br />

lscores 1/ nst=6 base=est rmat=est rates=gamma shape=est pinv=est<br />

scorefile=mrmodel.scores append;<br />

Log stop=yes;<br />

133


134<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Annex II. Fungal species where were identified ortologue RLM1 genes. Taxonomy, Nº intron, gene location and database.<br />

Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />

Phycomyces blakeleeanus Mucoromycotina 5 Scaffold 1: 2877572 – 2879654 (-) Joint Genome Institute<br />

Mucor circinelloides Mucoromycotina 3 Scaffold 2: 32637 – 34277 (-) Broad Institute, Murcia University<br />

Rhizopus oryzae Mucoromycotina 4 Supercontig 2: 4147507 – 4149079 (-) Broad Institute<br />

Ustilago maydis Ustilagomycetes - Contig 191: 196940 – 198718 (+) Broad Institute<br />

Malassezia globosa Malasseziales - Sf 7.1: 204470 – 205939 (+) NCBI<br />

Sporobolomyces roseus Microbotryomycetes 2 Scaffold 10: 85788 – 88407 (+) Joint Genome Institute<br />

Cryptococcus neoformans Tremellomycetes 4 Chromosome 2: 1360064 - 1362029 (-) Broad Institute, Genome Sciences<br />

Center Canada, Duke Center for<br />

Genome Research, Stanford Genome<br />

Technology<br />

Cryptococcus gatti Tremellomycetes 4 Supercontig 16: 244132 – 246064 (+) Broad Institute<br />

Laccaria bicolor Agaricomycetes 4 Scaffold 20: 147515 - 149297 (+) Joint Genome Institute<br />

Coprinus cinereus Agaricomycetes 7 Contig 317: 81588 – 83599 (-)‖ Broad Institute<br />

Phanerochaete chrysosporium Agaricomycetes 4 Scaffold 3: 1745299 – 1747266 (+) Joint Genome Institute<br />

Postia placenta Agaricomycetes 8 Scaffold 174: 171333 – 174904 (+) Joint Genome Institute<br />

Schizosaccharomyces japonicus Schizosaccharomycetes - Supercontig 5: 941790 - 942323 (+) Broad Institute<br />

Schizosaccharomyces pombe Schizosaccharomycetes - Chromosome 2: 3626639-3627757 (+) Sanger Institute<br />

Schizosaccharomyces octosporus Schizosaccharomycetes - Supercontig 2: 351924 – 352948 (+) Broad Institute<br />

Yarrowia lipolytica Saccharomycetes 1 Chromosome F: 2507846 – 2509480 (-) Génolevures<br />

Candida lusitaniae Saccharomycetes - Supercontig 5: 493096 - 494358 (-) Broad Institute<br />

Candida guilliermondii Saccharomycetes - Supercontig 3: 493468 - 494775 (+) Broad Institute


Pichia stipitis Saccharomycetes - Chromosome 1: 1519668 – 1521421 (+) Joint Genome Institute<br />

Debaryomyces hansenii Saccharomycetes - Chromosome 2: 237182 - 238732 (+) Génolevures<br />

Lodderomyces elongisporus Saccharomycetes - Supercontig 6: 1060953 - 1063316 (+) Broad Institute<br />

Candida parapsilosis Saccharomycetes - Contig 005806: 536977 – 539238 (-) Sanger Institute<br />

Candida tropicalis Saccharomycetes - Supercontig 1: 49905 - 51623 (-) Broad Institute, Génolevures<br />

135<br />

ANNEXES<br />

Candida albicans Saccharomycetes - Chromosome 4: 253044 – 254879 (+) Stanford Genome Technology Center,<br />

Candida dubliniensis Saccharomycetes - Chromosome 4: 265932 – 267662 (+) Sanger Institute<br />

Kluyveromyces lactis Saccharomycetes - Chromosome E: 2138785 – 2140983 (+) Génolevures<br />

Sanger Institute and Broad Institute<br />

Ashbya gossypii Saccharomycetes 1 Chromosome 7: 1125892 - 1127486 (-) Biozentrum and Syngenta<br />

Saccharomyces kluyveri Saccharomycetes 1 Contig 4.8: 312314 – 314049 (-) Washington University Genome<br />

Kluyveromyces waltii Saccharomycetes 1 Contig 62: 28422 – 29929 (+) Broad Institute<br />

Candida glabrata Saccharomycetes - Chromosome H: 556498 – 558378 (+) Génolevures<br />

Sequencing Center, Génolevures<br />

Vanderwaltozyma polyspora Saccharomycetes - Contig 1048b: 36130 – 38160 (-) Trinity College Dublin, Ireland<br />

Saccharomyces castellii Saccharomycetes - Contig 664: 5363 - 7027 (-) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces bayanus Saccharomycetes - Contig 737: 12450 – 14405 (+) Washington University Genome<br />

Sequencing Center, Broad Institute,<br />

Génolevures<br />

Saccharomyces kudriavzevii Saccharomycetes - Contig 02.2091: 16474 – 18432 (-) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces mikatae Saccharomycetes - Contig 34: 13635 – 15605 (-) Washington University Genome<br />

Sequencing Center, Broad Institute


136<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Saccharomyces cerevisiae Saccharomycetes - Chromosome 16: 379117 – 379117 (-) Stanford Genome Technology Center,<br />

Saccharomyces para<strong>do</strong>xus Saccharomycetes - Contig 387: 39147 – 40522 (-) Broad Institute<br />

Sanger Institute, Broad Institute<br />

Botrytis cinerea Leotiomycetes 2* Supercontig 4: 131839 – 133900 (+) Broad Institute and Syngenta AG,<br />

Genoscope<br />

Sclerotinia sclerotiorum Leotiomycetes 2 Supercontig 7: 1397346 - 1399404 (-) Broad Institute<br />

Magnoporthe grisea Sordariomycetes 2 Supercontig 27: 2198205 - 2200477 (+) Broad Institute<br />

Verticillium dahliae Sordariomycetes 2 Supercontig 2: 411495 - 413529 (-) Broad Institute<br />

Po<strong>do</strong>spora anserina Sordariomycetes 1 Chromosome 1_SC1: 2337785 – 2339800 (+) Genoscope<br />

Chaetomium globosum Sordariomycetes 3* Supercontig 1: 4237276 - 4239345 (+) Broad Institute<br />

Neurospora crassa Sordariomycetes 2 Contig 6: 745962 - 748345 (-) Broad Institute<br />

Nectria haematococca Sordariomycetes 2 Contig 1: 4240418 – 4242558 (+) Joint Genome Institute<br />

Fusarium oxysporum Sordariomycetes 2 Supercontig 6: 882635 - 884756 (-) Broad Institute<br />

Fusarium verticillioides Sordariomycetes 2 Supercontig 4: 851617 - 853730 (-) Broad Institute and Syngenta AG<br />

Fusarium graminearum Sordariomycetes 2 Supercontig 6: 1217222 – 1219438 (+) Broad Institute<br />

Ephicloa festucae Sordariomycetes 2* Contig 09290: 7786 – 8664 (+); Contig 10180:<br />

1- 1318 (+)<br />

Oklahoma University<br />

Trichoderma reesei Sordariomycetes 2 Contig 1: 484378 – 486398 (+) Joint Genome Institute<br />

Trichoderma virens Sordariomycetes 2 Locus ABDF01000081: 60794 – 62787 (-) Joint Genome Institute<br />

Trichoderma atroviride Sordariomycetes 2 Locus ABDG01000180: 45972 – 47983 (-) Joint Genome Institute<br />

Alternaria brassicicola Dothideomycetes 2 Contig 2.99: 20908 – 22912 (+) Washington University<br />

Mycosphaerella fijiensis Dothideomycetes 2 Scaffold 11: 1715901 – 1717838 (+) Joint Genome Institute<br />

Mycosphaerella graminicola Dothideomycetes 2 Scaffold 5: 2832610 – 2834512 (-) Joint Genome Institute<br />

Pyrenophora tritici-repentis Dothideomycetes 2 Supercontig 8: 1656744 – 1658773 (+) Broad institute


137<br />

ANNEXES<br />

Stagonospora no<strong>do</strong>rum Dothideomycetes 2 Supercontig 36: 142061 – 144139 (-) Broad Institute, International<br />

Stagonospora no<strong>do</strong>rum Genomics<br />

Consortium<br />

Cochliobolus heterostrophus Dothideomycetes 2 Scaffold 3: 15429 - 17447 (-) Joint Genome Insttitute<br />

Coccidioides immitis Eurotiomycetes 2 Chromosome 4: 837325 – 839343 (-) Broad Institute<br />

Coccidioides posadasii Eurotiomycetes 2 Chromosome 5: 696037 – 698131 (+) TIGR and Broad Institute<br />

Blastomyces dermatitidis Eurotiomycetes 2 Contig 5.48: 866 – 3141 (-) Washington University<br />

Histoplasma capsulatum Eurotiomycetes 2 Contig 1161: 255583 – 257713 (+) Broad institute, Washington University<br />

Genome Sequencing Center<br />

Paracoccidioides brasiliensis Eurotiomycetes 2 Supercontig 2: 1922491 - 1924634 (-) Broad Institute<br />

Uncinocarpus reesii Eurotiomycetes 2 Supercontig 5: 1771068 – 1772993 (+) Broad Institute<br />

Ascosphaera apis Eurotiomycetes 1* Contig 6709: 1 – 1360 (-); Contig 28: 3916 –<br />

5847 (-)<br />

Baylor College of Medicine<br />

Microsporum gypseum Eurotiomycetes 1 Supercontig 8: 169115 – 170914 (-) Broad Institute<br />

Talaromyces stipitatus Euriotiomycetes 2 Locus ABAS01000017: 261779 – 263672 (+) TIGR<br />

Penicillium marneffei Euriotiomycetes 2 Locus ABAR01000011: 131838 – 133723 (-) TIGR<br />

Neosartorya fischeri Eurotiomycetes 2 Contig 574: 1982352 - 1984281 (-) TIGR<br />

Aspergillus clavatus Eurotiomycetes 2 Contig 83: 930562 – 932534 (+) TIGR<br />

Aspergillus fumigatus Eurotiomycetes 2 Chromosome 3: 2185740 - 2187668 ( +) Sanger Institute and TIGR<br />

Aspergillus oryzae Eurotiomycetes 2 Supercontig 2: 3771013 - 3773030 (-) Broad Institute<br />

Aspergillus terreus Eurotiomycetes 2 Supercontig 2: 1754327 - 1756250 (-) Broad Institute<br />

Aspergillus flavus Eurotiomycetes 2 Contig 1: 3827471 – 3829494 (-) TIGR<br />

Aspergillus nidulans Euriotiomycetes 2 Contig 51: 528863 - 530802 (+) Broad Institute and Monsanto<br />

Aspergillus niger Eurotiomycetes 2 Contig 2: 3360920 – 3362951 (+) Joint Genome Institute


138<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Annex III. Fungal species where were identified ortologue MCM1 genes. Taxonomy, Nº intron, gene location and database.<br />

Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />

Phycomyces blakeleeanus Mucoromycotina 3 Scaffold 1: 2124764 – 2126294 (+) Joint Genome Institute<br />

Mucor circinelloides Mucoromycotina 2 Scaffold 4: 3273504 – 3274607 (-) Broad Institute, Murcia University<br />

Rhizopus oryzae Mucoromycotina 3 Supercontig 4: 1692147-1693115 (-) Broad Institute<br />

Ustilago maydis Ustilagomycetes - Contig 43: 64341-65705 (+) Broad Institute<br />

Malassezia globosa Malasseziales 1 Sf 19.1: 27642 – 28805 (+) NCBI<br />

Sporobolomyces roseus Microbotryomycetes 5 Scaffold 5: 1185071 – 1188512 (-) Joint Genome Institute<br />

Cryptococcus neoformans Tremellomycetes 4 Chromosome 13: 39688-42628 (+) Broad Institute, Genome Sciences<br />

Center Canada, Duke Center for<br />

Genome Research, Stanford Genome<br />

Technology<br />

Cryptococcus gatti Tremellomycetes 3 Supercontig 14: 629109 – 632178 (-) Broad Institute<br />

Laccaria bicolor Agaricomycetes 2 Scaffold 3: 1957445 - 1958697 (+) Joint Genome Institute<br />

Coprinus cinereus Agaricomycetes 2 Contig 175: 86341-87583 (+) Broad Institute<br />

Phanerochaete chrysosporium Agaricomycetes 1 Scaffold 8: 889662 – 890764 (-) Joint Genome Institute<br />

Postia placenta Agaricomycetes 1 Scaffold 93: 242447 – 243724 (+) Joint Genome Institute<br />

Schizosaccharomyces japonicus Schizosaccharomycetes 2 Supercontig 1: 436734-438566 (+) Broad Institute<br />

Schizosaccharomyces pombe Schizosaccharomycetes 2 Chromosome 1: 5292795-5294344 (+) Sanger Institute<br />

Schizosaccharomyces octosporus Schizosaccharomycetes 2 Supercontig 1: 2428314-2429843 (+) Broad Institute<br />

Yarrowia lipolytica Saccharomycetes 1 Chromosome C: 913297 – 914639 (-) Génolevures<br />

Candida lusitaniae Saccharomycetes - Supercontig 5: 694152-694868 (-) Broad Institute<br />

Candida guilliermondii Saccharomycetes - Supercontig 1: 710234-710938 (+) Broad Institute


Pichia stipitis Saccharomycetes - Chromosome 5: 1545724 – 1546455 (-) Joint Genome Institute<br />

Debaryomyces hansenii Saccharomycetes - Chromosome F: 1872703 - 1873428 (+) Génolevures<br />

Lodderomyces elongisporus Saccharomycetes - Supercontig 10: 288713-289579 (-) Broad Institute<br />

Candida parapsilosis Saccharomycetes - Contig 005806: 325917 – 326762 (+) Sanger Institute<br />

Candida tropicalis Saccharomycetes - Supercontig 7: 104018-104854 (+) Broad Institute, Génolevures<br />

139<br />

ANNEXES<br />

Candida albicans Saccharomycetes - Chromosome 7: 188023-188811 (-) Stanford Genome Technology Center,<br />

Candida dubliniensis Saccharomycetes - Chromosome 7: 207792 - 208493 (-) Sanger Institute<br />

Kluyveromyces lactis Saccharomycetes - Chromosome F: 2474742 – 2475452 (-) Génolevures<br />

Sanger Institute and Broad Institute<br />

Ashbya gossypii Saccharomycetes - Chromosome 2: 561146 - 561766 (-) Biozentrum and Syngenta<br />

Saccharomyces kluyveri Saccharomycetes - Contig 5.10: 2448 – 3053 (+) Washington University Genome<br />

Kluyveromyces waltii Saccharomycetes - Contig 206: 1 – 307 (-); Contig 205: 7934 –<br />

8165 (-)<br />

Sequencing Center, Génolevures<br />

Broad Institute<br />

Candida glabrata Saccharomycetes - Chromosome F: 618976 – 619632 (-) Génolevures<br />

Vanderwaltozyma polyspora Saccharomycetes - Contig 1073b: 43103 – 43882 (-) Trinity College Dublin, Ireland<br />

Saccharomyces castellii Saccharomycetes - Contig 649: 39880 - 40560 (-) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces bayanus Saccharomycetes - Contig 579: 25293 – 26144 (-) Washington University Genome<br />

Sequencing Center, Broad Institute,<br />

Génolevures<br />

Saccharomyces kudriavzevii Saccharomycetes - Contig 02.820: 794 – 1651 (-) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces mikatae Saccharomycetes - Contig 578: 4300 – 5130 (+) Washington University Genome


140<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Sequencing Center, Broad Institute<br />

Saccharomyces cerevisiae Saccharomycetes - Chromosome 13: 353870 – 354730 (+) Stanford Genome Technology Center,<br />

Saccharomyces para<strong>do</strong>xus Saccharomycetes - Contig 178: 11877 – 12719 (-) Broad Institute<br />

Sanger Institute, Broad Institute<br />

Botrytis cinerea Leotiomycetes 2* Supercontig 165: 67693 – 68463 (-) Broad Institute and Syngenta AG,<br />

Genoscope<br />

Sclerotinia sclerotiorum Leotiomycetes 2 Supercontig 6: 1981916-1982762 (+) Broad Institute<br />

Magnoporthe grisea Sordariomycetes 2 Supercontig 23: 1178204-1179169 (-) Broad Institute<br />

Verticillium dahliae Sordariomycetes 2 Supercontig 3: 440666-441518 (+) Broad Institute<br />

Po<strong>do</strong>spora anserina Sordariomycetes 2 Chromosome 1_SC4: 1831500 – 1832359 (+) Genoscope<br />

Chaetomium globosum Sordariomycetes 2 Supercontig 3: 3648631 - 3649505 (-) Broad Institute<br />

Neurospora crassa Sordariomycetes 2 Contig 39: 300711 - 301579 (+) Broad Institute<br />

Nectria haematococca Sordariomycetes 2 Contig 1: 1904784 – 1905808 (+) Joint Genome Institute<br />

Fusarium oxysporum Sordariomycetes 2 Supercontig 3: 1396623 - 1397638 (-) Broad Institute<br />

Fusarium verticillioides Sordariomycetes 2 Supercontig 2: 1298954 - 1299971 (-) Broad Institute and Syngenta AG<br />

Fusarium graminearum Sordariomycetes 2 Supercontig 5: 973273 – 974303 (-) Broad Institute<br />

Ephicloa festucae Sordariomycetes 2 Contig 03502: 2589 – 3725 (-) Oklahoma University<br />

Trichoderma reesei Sordariomycetes 2 Contig 1: 293329 – 294473 (+) Joint Genome Institute<br />

Trichoderma virens Sordariomycetes 2 Locus ABDF01000238: 119788 – 120837 (-) Joint Genome Institute<br />

Trichoderma atroviride Sordariomycetes 2 Locus ABDG01000087 : 9802 – 10882 (-) Joint Genome Institute<br />

Alternaria brassicicola Dothideomycetes 2 Contig 7.54: 1842 – 2668 (+) Washington University<br />

Mycosphaerella fijiensis Dothideomycetes 2 Scaffold 5: 691696 – 692478 (+) Joint Genome Institute<br />

Mycosphaerella graminicola Dothideomycetes 2 Scaffold 7: 131733 – 132524 (-) Joint Genome Institute<br />

Pyrenophora tritici-repentis Dothideomycetes 2 Supercontig 5: 186836-187667 (+) Broad institute


141<br />

ANNEXES<br />

Stagonospora no<strong>do</strong>rum Dothideomycetes 2 Supercontig 11: 1046577 – 1047421 (+) Broad Institute, International<br />

Stagonospora no<strong>do</strong>rum Genomics<br />

Consortium<br />

Cochliobolus heterostrophus Dothideomycetes 2 Scaffold 2: 392134 - 393018 (-) Joint Genome Insttitute<br />

Coccidioides immitis Eurotiomycetes 2 Chromosome 3: 322326 – 323200 (-) Broad Institute<br />

Coccidioides posadasii Eurotiomycetes 2 Chromosome 2: 790014 – 790884 (-) TIGR and Broad Institute<br />

Blastomyces dermatitidis Eurotiomycetes 2 Contig 5593.1: 398 – 885 (-) Washington University<br />

Histoplasma capsulatum Eurotiomycetes 2 Supercontig 1.3: 251153 – 252083 (+) Broad institute, Washington University<br />

Genome Sequencing Center<br />

Paracoccidioides brasiliensis Eurotiomycetes 2 Supercontig 12: 450142-451180 (+) Broad Institute<br />

Uncinocarpus reesii Eurotiomycetes 2 Supercontig 2: 4576943-4577833 (+) Broad Institute<br />

Ascosphaera apis Eurotiomycetes 1 Contig 2371: 17 – 538 (-); Contig 6891: 66 –<br />

1133 (+)<br />

Baylor College of Medicine<br />

Microsporum gypseum Eurotiomycetes 2 Supercontig 1: 3737424-3738317 (+) Broad Institute<br />

Talaromyces stipitatus Euriotiomycetes 2 Locus ABAS01000004: 938690 – 939519 (+) TIGR<br />

Penicillium marneffei Euriotiomycetes 2 Locus ABAR01000013 : 274356 – 275187 (+) TIGR<br />

Neosartorya fischeri Eurotiomycetes 2 Contig 582: 357403-358305 (-) TIGR<br />

Aspergillus clavatus Eurotiomycetes 2 Contig 85: 946418-947301 (+) TIGR<br />

Aspergillus fumigatus Eurotiomycetes 2 Chromosome 6: 357383-358283 –() Sanger Institute and TIGR<br />

Aspergillus oryzae Eurotiomycetes 2 Supercontig 17: 221157-222072 (-) Broad Institute<br />

Aspergillus terreus Eurotiomycetes 2 Supercontig 10: 1111982-1112918 (+) Broad Institute<br />

Aspergillus flavus Eurotiomycetes 2 Contig 9: 471345-472255 (-) TIGR<br />

Aspergillus nidulans Euriotiomycetes 2 Contig 159: 33815-34722 (-) Broad Institute and Monsanto<br />

Aspergillus niger Eurotiomycetes 2 Contig 9: 222598 - 223521 (-) Joint Genome Institute


142<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Annex IV. Fungal species where identified orthologue IFF8 genes. Taxonomy, Nº de intron, gene location and database.<br />

Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />

Candida lusitaniae Saccharomycetes - Supercontig 7: 753415 - 755439 (+) Broad Institute<br />

Candida guilliermondii Saccharomycetes - Supercontig 2: 36309 - 39698 (+) Broad Institute<br />

Pichia stipitis Saccharomycetes - Chromosome 1: 307099 – 311559 (+) Joint Genome Institute<br />

Debaryomyces hansenii Saccharomycetes - Chromosome B: 107946 - 112736 (-) Génolevures<br />

Lodderomyces elongisporus Saccharomycetes - Contig 1.123: 280245 - 287281 (-) Broad Institute<br />

Candida parapsilosis Saccharomycetes - Contig 005807: 1470770 – 1475785 (+) Sanger Institute<br />

Candida tropicalis Saccharomycetes - Supercontig 10: 128215 - 134073 (+) Broad Institute, Génolevures<br />

Candida albicans Saccharomycetes - Chromosome 5: 164521 - 166665 (-) Stanford Genome Technology Center,<br />

Candida dubliniensis Saccharomycetes - Chromosome 5: 187922 - 189868 (+) Sanger Institute<br />

Sanger Institute and Broad Institute


Annex V. Fungal species where were identified ortologue SMP1 genes. Taxonomy, Nº intron, gene location and database.<br />

143<br />

Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />

Candida glabrata Saccharomycetes - Chromosome M: 657290 – 659269 (-) Génolevures<br />

Vanderwaltozyma polyspora Saccharomycetes - Contig 397: 8567 – 10588 (-) Trinity College Dublin, Ireland<br />

ANNEXES<br />

Saccharomyces castellii Saccharomycetes - Contig 721: 27914 - 29635 (-) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces bayanus Saccharomycetes - Contig 3: 5420 – 6778 (+) Washington University Genome<br />

Sequencing Center, Broad Institute,<br />

Génolevures<br />

Saccharomyces kudriavzevii Saccharomycetes - Contig 02.1807: 8672 – 9925 (+) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces mikatae Saccharomycetes - Contig 100: 1952 – 3295 (+) Washington University Genome<br />

Sequencing Center, Broad Institute<br />

Saccharomyces cerevisiae Saccharomycetes - Chromosome 2: 594859 – 593501 (+) Stanford Genome Technology Center,<br />

Saccharomyces para<strong>do</strong>xus Saccharomycetes - Contig 210: 35931 – 37187 (-) Broad Institute<br />

Sanger Institute, Broad Institute


144<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

Annex VI. Fungal species where were identified ortologue ARG80 genes. Taxonomy, Nº intron, gene location and database.<br />

Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />

Candida glabrata Saccharomycetes - Chromosome F: 620431 – 621027 (-) Génolevures<br />

Vanderwaltozyma polyspora Saccharomycetes - Contig 1073b: 45942 – 46289 (-) Trinity College Dublin, Ireland<br />

Saccharomyces castellii Saccharomycetes - Contig 709: 49895 - 50539 (-) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces bayanus Saccharomycetes - Contig 579: 26917 – 27447 (-) Washington University Genome<br />

Sequencing Center, Broad Institute,<br />

Génolevures<br />

Saccharomyces kudriavzevii Saccharomycetes - Contig 02.1514: 585 – 1130 (-) Washington University Genome<br />

Sequencing Center<br />

Saccharomyces mikatae Saccharomycetes - Contig 2531: 1149 – 1703 (-) Washington University Genome<br />

Sequencing Center, Broad Institute<br />

Saccharomyces cerevisiae Saccharomycetes - Chromosome 13: 352602 – 353135 (+) Stanford Genome Technology Center,<br />

Saccharomyces para<strong>do</strong>xus Saccharomycetes - Contig 178: 13466 – 14017 (-) Broad Institute<br />

Sanger Institute, Broad Institute


ANNEX VII<br />

ANNEXES<br />

Likelihood value, parameter estimates, and sites under positive selection as inferred under the six models<br />

proposed to calculate ω over co<strong>do</strong>ns in MADS-box of orthologue RLM1 genes. (PAML 4.1)<br />

Model l Estimates of parameters dN/dS Positively<br />

One-ratio (M0)<br />

Neutral (M1a)<br />

Selection (M2a)<br />

Free-ratio (M3)<br />

Beta (M7)<br />

Beta & w (M8)<br />

-8907.095177<br />

-8881.920467<br />

-8881.920467<br />

-8595.963248<br />

-8575.525557<br />

-8575.526257<br />

ω = 0.0185<br />

p: 0.97143 0.02857<br />

w: 0.01692 1.00000<br />

p: 0.97143 0.00817 0.02040<br />

w: 0.01692 1.00000 1.00000<br />

p: 0.35665 0.42127 0.22208<br />

w: 0.00122 0.01527 0.06434<br />

p= 0.42029 q= 13.55916<br />

p0= 0.99999 p= 0.42029 q= 13.55916<br />

(p1= 0.00001) w= 1.00000<br />

0.0185<br />

0.0450<br />

0.0450<br />

0.0212<br />

0.0283<br />

0.0283<br />

Selected sites<br />

None<br />

Not allowed<br />

None<br />

None<br />

Not allowed<br />

None<br />

145


146<br />

Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />

ANNEX VIII<br />

Likelihood value, parameter estimates, and sites under positive selection as inferred under the six models<br />

proposed to calculate ω over co<strong>do</strong>ns in Saccharomycotina RLM1 gene. (HYPHY 1.0)<br />

Model l Estimates of parameters dN/dS Positively<br />

One-ratio (M0)<br />

Neutral (M1a)<br />

Selection (M2a)<br />

Free-ratio (M3)<br />

Beta (M7)<br />

Beta & w (M8)<br />

-39464.90296<br />

-40768.67683<br />

-39216.75175<br />

-38569.21552<br />

-38570.03576<br />

-38570.03383<br />

ω = 0.13616<br />

p: 0.9999343 0.0000657<br />

w: 1.00000 1.00000<br />

p: 0.66539 0.33461 0.00000<br />

w: 0.13860 1.00000 2.98229<br />

p: 0.11323 0.46676 0.42001<br />

w: 0.01243 0.10299 0.28431<br />

p= 0.96651 q= 4.56744<br />

p0= 0.99999 p= 0.96573 q= 4.56035<br />

(p1= 0.00001) w= 1.05491<br />

0.13616<br />

1.000<br />

0.42684<br />

0.16889<br />

0.17465<br />

0.17476<br />

Selected sites<br />

None<br />

Not allowed<br />

None<br />

None<br />

Not allowed<br />

None


ANNEX IX<br />

ANNEXES<br />

Likelihood ratio test statistics for comparing site specific models for MADS-box and Saccharomycotina<br />

RLM1 gene.<br />

Analysis<br />

(DF)<br />

MADS-box<br />

M0 vs M3 M1a vs M2a M7 vs M8<br />

(4)<br />

(2)<br />

(2)<br />

622.26* 0 0.0014<br />

Saccharomycotina<br />

Rlm1 gene<br />

1791.37* 3103.85* 0.0039<br />

p value p> 0.05 p>0.05 p>0.05<br />

Degrees of free<strong>do</strong>m (DF), corresponding to the difference of the number of parameters in the models, are<br />

reported in brackets, significant values are marked by *.<br />

147

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!