Julio César Chávez Galarza - Universidade do Minho
Universidade do Minho
Escola de Ciências
Julio César Chávez Galarza
Phylogenetic relationships of the most common pathogenic Candida species inferred by sequence analysis of nuclear genes
Master's thesis
Master's in Molecular Genetics
Work carried out under the supervision of Professor Ana Paula Fernandes Monteiro Sampaio Carvalho
April 2009
PARTIAL REPRODUCTION OF THIS THESIS IS AUTHORIZED, FOR RESEARCH PURPOSES ONLY, UPON A WRITTEN DECLARATION BY THE INTERESTED PARTY, WHO COMMITS TO THAT CONDITION.
This Thesis was supported by the Programme ALβAN, the European Union Programme of High Level Scholarships for Latin America, scholarship No E06M103915PE.
DEDICATION
To Jehovah God, for everything.
To my parents, Julio and Claudia, for their love and support at all times.
ACKNOWLEDGEMENTS
I express my very special thanks to Prof. Ana Paula Sampaio, supervisor of this work, for all her support, dedication, determination and constant encouragement, and in particular for everything she taught me.
To Prof. Célia Pais, for her support, teaching and valuable criticism of this work.
To Prof. Cândida Lucas, for all the help she gave me in beginning my studies in Portugal.
To Prof. Rui Oliveira, for his advice and friendship.
To Prof. Felipe Costa, for his support and suggestions during the preparation of this thesis.
To Prof. Pedro Carvalho, for his friendship and for welcoming me when I arrived in Portugal.
To the members of the Department of Biology, namely its teaching staff, researchers and technicians, for all the support shown.
To my laboratory colleagues Yolanda, Alexandra, Marlene, Eugénia, Ana, Rafael and Raquel, for their help with the work, their company and their friendship.
To my colleagues in the Department of Biology, especially Andreia, Sílvia, Célia, Rita, Sofia, Emi, Flavio and Jorge, for their advice, friendship, cheerfulness and long conversations.
To Cristovão, for his help in editing this thesis.
To the Alves Pinto Pacheco family, who made me forget my Peru so that I could learn to love Portugal.
To my master's colleagues Dany, Dulcineia, Dulce, Ana, Nádia, Marcos and Ricardo, for the good times we shared and the support they gave me.
To the professors of the Universidade Nacional de Trujillo, Peru: Dr. Manuel Fernández Honores and M.Sc. Elmer Alvítez Izquierdo, for having supported me and taught me great values, and very especially to Professor Roberto Rodríguez Rodríguez, for having been my guide throughout university life and a role model.
To Dr. Ricardo Fujita Alarcón of the Instituto de Genética e Biologia Molecular, Universidade Particular San Martín de Porres, and Dr. Enrique Pérez of the Instituto de Medicina Tropical, Universidade Cayetano Heredia, for their unconditional support and the trust they placed in me.
To my siblings Gianmarco, Michael, Lizbeth, Pilar, Emilia, Ruth, Rommel, Maira and Milagros, for their unconditional support.
To my uncle Miguel, for his trust and unconditional support since the beginning of my studies.
To my aunts and uncles Ramón, Picolín, Doris, Luz and Carlos, for the support and trust they have always placed in me.
To the Rodríguez Gonzales family: Prof. Roberto, Mrs. Rocio, Mireya, Lalo and Brandon, for their support and friendship in the most difficult moments; this achievement is the product of the trust they placed in me.
To the Salgado Rodríguez, Juárez Pita, Rebaza Rodríguez and Massé Miguel families, who always made me feel like one of their own.
To Clarisa, without whose help this achievement would have been difficult.
To Yolanda, for the constant love and affection that will never end.
Likewise, to my friends in Peru: Carlitos, Eduardo, Miguel, Victor Hugo, Daniel, Fibi, Ramón, Ludwig, Gabriela, Cecilia, Divián, Omar, Néstor, Marge, Sussy.
RESUMO
Coding genomic regions are the most widely used in phylogenetic studies, whether through multigene or phylogenomic analyses. However, most of these analyses use only the regions with high similarity among orthologous sequences and disregard the remaining regions, which may provide very useful information. Thus, the main objective of this thesis was to evaluate the use of the MADS-box family genes RLM1 and MCM1, as well as the IFF8 gene, which encodes a GPI protein, in fungal phylogeny studies.
Seventy-six orthologous sequences were obtained for RLM1 and MCM1, and eight for IFF8, by searching several databases. Sequences were aligned with CLUSTALW and MUSCLE, and phylogenetic analysis was carried out with PHYML 3.0 and MrBayes 3.2. The results obtained with the transcription factor RLM1 indicated that it is well suited for inclusion in phylogenetic studies of fungi, since the topology obtained is very close to those established in other multigene and phylogenomic analyses. The transcription factor MCM1 showed limitations for use in phylogenetic studies at the kingdom level, since the sequences obtained vary greatly in length, which causes alignment problems. Despite this limitation, this gene can be used, either on its own or combined with other genes, to resolve phylogenies at the phylum level or at lower taxonomic ranks, since the results obtained within the subphylum Saccharomycotina agree with published studies. The use of the IFF8 gene to infer phylogeny showed major limitations, since orthologues of this gene were identified only in the CUG group and, in addition, the phylogeny obtained did not agree with other published studies, in particular regarding the position of C. tropicalis in the phylogeny of Candida spp.
It is currently accepted that, within the subphylum Saccharomycotina, a whole-genome duplication occurred in the sensu stricto and sensu lato groups of the 'Saccharomyces complex', with several duplicated genes being retained. The initial search for orthologous sequences revealed that the transcription factors studied in this work are among the duplicated genes that were maintained. Thus, another objective of this work was to determine whether the RLM1 gene was under positive selection in the subphylum Saccharomycotina. This analysis identified several positions where amino acids are possibly under positive selection and, although these substitutions were observed at different locations in the protein, none was detected in the conserved regions. This observation suggests that the protein performs an important function in the cell that was maintained during species divergence. The presence of amino acids under positive selection at the start of the repetitive region of the protein's carboxyl terminus, together with the difference in the repeated amino acid between species with a duplicated genome and species with a non-duplicated genome, suggested that a frameshift mutation was responsible for the observed changes. This hypothesis was then tested in the subphylum Saccharomycotina by designing the three reading frames and reconstructing the ancestral sequence. The results of this analysis confirmed that a frameshift mutation was indeed responsible for the changes observed in the repetitive region, with amino acid substitution, diversifying the function of this gene in the species that duplicated their genome.
The main results of this thesis were: (i) the identification of the potential of the RLM1 gene for phylogenetic studies of the kingdom Fungi, with special emphasis on the phylogeny of Candida species; and (ii) the identification of the molecular mechanism responsible for the change in the repetitive region of the protein's carboxyl terminus, which occurred in Saccharomyces sensu stricto during species divergence after genome duplication.
ABSTRACT
Coding regions are used to resolve phylogenetic relationships through multigene and phylogenomic analyses. However, most of these analyses use only regions that are similar among orthologous gene sequences and do not take into account other regions that could provide useful information. Thus, the main purpose of this thesis was to evaluate the use of the MADS-box transcription factors RLM1 and MCM1, and of IFF8, a gene encoding a GPI-anchored protein, in fungal phylogeny.
Seventy-six putative orthologous sequences for RLM1 and MCM1 and eight for IFF8 were obtained from different fungal databases. Sequences were aligned using CLUSTALW and MUSCLE, and phylogeny was inferred using PHYML 3.0 and MrBayes 3.2. The phylogenetic analysis based on the transcription factor RLM1 indicated that it is suitable for inclusion in multigene analyses, since the fungal phylogeny obtained is close to those established by multigene and phylogenomic analyses. The transcription factor MCM1 showed limitations for phylogeny at the kingdom level because of its variable sequence lengths. Despite this limitation, this gene can be used to resolve phylogeny at the phylum or lower clade levels, since the results obtained within the subphylum Saccharomycotina, with MCM1 alone and/or concatenated with the other genes used in this study, agreed with published studies. In contrast, the use of the IFF8 gene to infer phylogeny is limited and restricted to the CUG group, since no other orthologues were found within the kingdom Fungi and the CUG group phylogeny obtained presented conflicts, particularly in the position of Candida tropicalis, which does not agree with previously determined relationships in Candida phylogeny.
It is known that, within the subphylum Saccharomycotina, genome duplication occurred in the sensu stricto and sensu lato groups of the 'Saccharomyces complex', resulting in the duplication of some genes. The initial search for orthologous sequences showed that the transcription factors studied in this work are among the duplicated genes that were maintained. Thus, another objective of this work was to determine whether RLM1 was under positive selection, in search of an alternative explanation for the persistence and diversification of gene duplicates. These analyses identified several amino acid sites under positive selection within the subphylum Saccharomycotina. Although these substitutions were present at different positions, none lay inside conserved regions, suggesting that the protein plays an important role in fungi that was maintained during species divergence. The presence of amino acids under positive selection at the beginning of the Rlm1 C-terminal repetitive region, together with the difference in the repeated amino acid between species with a duplicated genome (WGD) and species with a non-duplicated genome, indicated a possible frameshift mutation. This hypothesis was tested in the Saccharomycotina group by designing the three open reading frames and reconstructing the ancestral sequence. The results showed that the amino acid substitution that occurred during species divergence changed this repetitive region and diversified gene function in WGD species, avoiding the loss of protein function.
The major findings of this work were (i) the identification of the potential use of the RLM1 gene for inferring phylogeny in the kingdom Fungi, with special emphasis on Candida species, and (ii) the observation that this gene evolved within Saccharomyces sensu stricto after genome duplication, the molecular mechanism responsible for the change observed in the C-terminus of this protein most probably being a frameshift mutation.
RESUMEN
Coding genomic regions are used in phylogenetic studies through analyses at the multigene and phylogenomic levels. However, most of these analyses use only regions with similarity among orthologous gene sequences and do not take into account other regions that could provide useful information. Therefore, the main objective of this thesis was to evaluate the use of the MADS-box genes RLM1 and MCM1, as well as IFF8, a gene encoding a GPI-anchor protein, in fungal phylogeny.
Seventy-six putative orthologous sequences were obtained for RLM1 and MCM1, and eight for IFF8, from different fungal databases. Sequence alignments were performed with CLUSTALW and MUSCLE, and phylogeny was inferred with PHYML 3.0 and MrBayes 3.2. The results of the phylogenetic analyses using the transcription factor RLM1 indicated that it is suitable for inclusion in multigene analysis, since the phylogeny obtained for fungi is very close to those established by other multigene and phylogenomic analyses. The transcription factor MCM1 presents limitations for use in phylogeny at the kingdom level because of its variable sequence lengths, which give rise to alignment problems. Despite this limitation, this gene can be used to resolve phylogenies at the phylum level or in lower clades, since the results obtained with it within the subphylum Saccharomycotina, independently and/or combined with the other genes used in this study, agreed with published studies. On the other hand, the use of the IFF8 gene to infer phylogeny is limited and restricted to the CUG group, since no other orthologues have been found in the kingdom Fungi and the results obtained for the CUG group phylogeny presented conflicts, in particular in the position of C. tropicalis, which does not agree with what has already been determined in the phylogeny of Candida spp.
It is currently accepted that, within the subphylum Saccharomycotina, genome duplication occurred in the sensu stricto and sensu lato groups of the 'Saccharomyces complex', resulting in the retention of some genes. The initial search for orthologous sequences showed that the transcription factors studied in this work are among the duplicated genes that have been maintained. Thus, another objective of this work was to determine whether RLM1 was under positive selection. These analyses identified several sites where amino acids were possibly under positive selection within the subphylum Saccharomycotina and, although these substitutions were present at different positions, they were not present in the conserved regions, which suggests that the protein performs an important function in the cell that was maintained during species divergence. The presence of amino acids under positive selection at the start of the C-terminal repetitive region of Rlm1, and the difference in the repeated amino acid between species with a duplicated genome and species with a non-duplicated genome, indicated a possible frameshift mutation. This hypothesis was therefore tested in the subphylum Saccharomycotina by designing three open reading frames and reconstructing the ancestral sequence. The results of this analysis showed that the amino acid substitution that occurred during species divergence and altered this region arose through a frameshift mutation, diversifying the function of this gene in the species that duplicated their genome.
The main results of this thesis are: (i) the identification of the potential of the RLM1 gene for phylogenetic studies in the kingdom Fungi, with special emphasis on the phylogeny of Candida species; and (ii) the identification of the molecular mechanism responsible for the change observed in the C-terminus of this protein in the Saccharomyces sensu stricto group, which altered the gene during species divergence after genome duplication.
INDEX
DEDICATION
ACKNOWLEDGEMENTS
RESUMO
ABSTRACT
RESUMEN
ABBREVIATION LIST
FIGURE LIST
TABLE LIST
1. INTRODUCTION
1.1. Classification of organisms
1.2. Tree of life
1.3. Kingdom Fungi
1.4. Fungal taxonomy
1.5. Fungal phylogeny
1.6. Fungal identification
1.6.1. Phenotypic methods
1.6.2. Genotypic methods
1.7. Methods for phylogenetic inference
1.7.1. Multiple sequence alignment
1.7.1.1. CLUSTAL series of programs
1.7.1.2. MUSCLE
1.7.2. Major methods for estimating phylogenetic trees and inferring phylogeny
1.7.2.1. Distance methods
1.7.2.2. Character-based methods
1.8. Adaptive evolution
2. THESIS BACKGROUND AND OBJECTIVES
3. MATERIAL AND METHODS
3.1. Phylogenetic analysis
3.1.1. Data retrieval
3.1.2. Alignment of sequences
3.1.3. Selection of the best evolutionary model
3.1.4. Phylogenetic reconstruction
3.1.5. Interpretation of values considered for nodal support
3.2. Analysis of adaptive evolution
3.3. Analysis of the repetitive region in the subphylum Saccharomycotina
4. RESULTS AND DISCUSSION
4.1. Compilation of data
4.2. Phylogeny based on MADS-box transcription factors
4.2.1. Rlm1 transcription factor
4.2.2. Mcm1 transcription factor
4.2.3. Placement of fungal phylogeny based on MADS-box transcription factor within the current literature
4.3. Relationships within the subphylum Saccharomycotina
4.3.1. Phylogeny of the ‘Saccharomyces complex’
4.3.2. Phylogenetic relationships amongst Candida species
4.4. Analysis of adaptive evolution
4.5. Event of frameshift mutation in S. cerevisiae RLM1 gene
5. FINAL REMARKS
6. REFERENCES
7. ANNEXES
ABBREVIATION LIST
AIC Akaike Information Criterion
AICc Akaike Information Criterion corrected
ACT1 Actin 1
AFLP Amplified Fragment Length Polymorphism
aLRT Approximate Likelihood Ratio Test
AP-PCR Arbitrarily Primed PCR
ARG80 Arginine requirement 80
BEB Bayes Empirical Bayes
BI Bayesian inference
COX2 Cytochrome oxidase 2 gene
CUG Organisms that translate the CUG codon as serine instead of leucine
CytB Cytochrome b gene
DHFR Dihydrofolate Reductase gene
DNA Deoxyribonucleic acid
dN Number of non-synonymous changes per non-synonymous site
dS Number of synonymous changes per synonymous site
EF-1α Elongation factor 1 alpha
GPI Glycosylphosphatidylinositol
HYPHY Hypothesis Testing Using Phylogenies
IFF8 Individual proteins file family F 8
ITS Internal Transcribed Spacer
LPS Lipopolysaccharide
LRTs Likelihood Ratio Tests
MADS MCM1-AGAMOUS-DEFICIENS-SRF
MAP1 Schizosaccharomyces pombe MCM1 homologue
MCM1 MiniChromosome Maintenance
MCMCMC Metropolis-coupled Markov Chain Monte Carlo
MEC Mechanistic-Empirical Combination model
MEF2 Myocyte Enhancer Factor 2
MEGA Molecular Evolutionary Genetics Analysis
ML Maximum likelihood
MMR Mismatch repair system
MP Maximum parsimony
mtDNA Mitochondrial deoxyribonucleic acid
mtLSUrDNA Mitochondrial large subunit ribosomal DNA gene
mtSSUrDNA Mitochondrial small subunit ribosomal DNA gene
MUSCLE Multiple Sequence Comparison by log-Expectation
NJ Neighbor joining
NonWGD Non Whole Genome Duplication
nucLSUrDNA Nuclear large subunit ribosomal DNA gene
nucSSUrDNA Nuclear small subunit ribosomal DNA gene
OM Outer membrane
Omp85 Outer membrane protein family 85
ORF Open Reading Frame
PAML Phylogenetic Analysis by Maximum Likelihood
PAUP Phylogenetic Analysis Using Parsimony
PCR Polymerase chain reaction
PHYML Phylogeny by Maximum Likelihood
PP Posterior Probability
RAPD Randomly Amplified Polymorphic DNA
rDNA ribosomal DNA gene
rRNA ribosomal RNA
RFLP Restriction Fragment Length Polymorphism
RLM1 Resistance to Lethality of MKK1P386 overexpression gene
RPB1 RNA polymerase II largest subunit
RPB2 RNA polymerase II second largest subunit
SAM SRF-Arg80-Mcm1
S-H Shimodaira-Hasegawa support
SMP1 Second MEF2-like Protein
SRF Serum Response Factor
TS Thymidylate Synthase gene
Ty elements Transposable element of yeast
UMC1 Ustilago MCM1 homologue
UPGMA Unweighted Pair Group Method with Arithmetic Mean
WGD Whole Genome Duplication
FIGURE LIST
Figure 1. Reconstruction of the Tree of Life based on transition analysis(…)
Figure 2. Reconstruction of the eukaryote evolutionary tree(…)
Figure 3. Diversity of fungi(…)
Figure 4. Phylogeny of the kingdom Fungi(…)
Figure 5. Schematic representation of Type I (SRF-like) and Type II (MEF2-like) protein domains in plant, animal and fungi(…)
Figure 6. Structure of Type I (SRF) and Type II (MEF2A) MADS domains in complex with their DNA target (Homo sapiens)(…)
Figure 7. GPI anchor structure(…)
Figure 8. Fungal phylogeny based on Rlm1 protein sequences(…)
Figure 9. Fungal phylogeny based on RLM1 DNA sequences(…)
Figure 10. Fungal phylogeny based on Mcm1 protein sequences(…)
Figure 11. Phylogeny of the subphylum Saccharomycotina by using the RLM1(…)
Figure 12. Phylogeny of the subphylum Saccharomycotina by using the MCM1(…)
Figure 13. Phylogeny of the subphylum Saccharomycotina inferred by using concatenated alignment of RLM1-MCM1(…)
Figure 14. Phylogenetic network reconstructed using a concatenated alignment of RLM1-MCM1 in the subphylum Saccharomycotina(…)
Figure 15. Phylogeny of CUG Group based on IFF8(…)
Figure 16. Subphylum Saccharomycotina phylogeny reconstructed using concatenated alignment of RLM1-MCM1-IFF8(…)
Figure 17. Subphylum Saccharomycotina phylogenetic network reconstructed using a concatenated alignment of RLM1-MCM1-IFF8(…)
Figure 18. Conserved regions in RLM1 gene of the subphylum Saccharomycotina: MADS-box, Acidic region, A-B conserved regions present in this subphylum(…)
Figure 19. Region of microsatellites present in some species from subphylum Saccharomycotina(…)
Figure 20. Saccharomyces cerevisiae amino acid sequence showing sites under positive selection by the MEC model(…)
Figure 21. The three RLM1 open reading frames of Candida albicans, Kluyveromyces lactis and Saccharomyces cerevisiae visualized with the Virtual Ribosome 1.0(…)
Figure 22. (A) Evolution of MADS-box genes based on the Alvarez-Buylla et al. (2000) model, proposing a common origin for Animals, Fungi and Plants and a further duplication event in the Fungi lineage in some species within the subphylum Saccharomycotina (WGD group). (B) Comparison of Rlm1 protein/gene domains in S. cerevisiae and C. albicans, showing the frameshift mutation that occurred after genome duplication in S. cerevisiae and WGD species and changed the encoded amino acid from Q to N
Figure 23. Predicted putative ancestral RLM1 gene sequence from which CUG, WGD and Non-WGD groups diverged(…)
TABLE LIST<br />
Table 1. Evolution in the classification of living organisms throughout time.<br />
Table 2. The six kingdoms of life<br />
Table 3. Phyla of the kingdom Bacteria<br />
INTRODUCTION
1. INTRODUCTION<br />
1.1 Classification of organisms<br />
The classification of living organisms has always been a controversial subject.<br />
Attempts to classify living organisms date from the dawn of<br />
recorded history; however, early classification systems were artificial and based<br />
primarily on habit and/or characteristics important to humans (e.g., medicines, food).<br />
The earliest known system of classification, from which current systems of<br />
classifying forms of life descend, comes from the Greek<br />
philosopher and scientist Aristotle (384-322 BC), who published in his metaphysical and<br />
logical works the first known classification of everything whatsoever, or "being". This<br />
was the scheme that gave us such words as substance, species and genus, and it was<br />
retained in a modified and less general form by Linnaeus. Aristotle's system classified<br />
all living organisms known at that time as either a plant or an animal on the basis of<br />
movement (air, land, or water), feeding mechanism, growth patterns, mode of<br />
reproduction and possession or lack of red blood. Microscopic organisms were<br />
unknown.<br />
In 1735, Swedish naturalist Carolus Linnaeus formalized the use of two Latin names<br />
to identify each organism, a system called binomial nomenclature, in his great book the<br />
Systema Naturae, which ran through twelve editions during his lifetime. In his work,<br />
nature was divided into three kingdoms: mineral, vegetable and animal (Table 1).<br />
Linnaeus grouped closely related organisms and introduced the modern classification<br />
groups: class, order, family, genus, and species.<br />
The German biologist Ernst Haeckel (1866) proposed a third kingdom, Protista, to<br />
include all single-celled organisms. Some taxonomists also placed simple multicellular<br />
organisms, such as seaweeds, in the kingdom Protista. Bacteria were also placed within<br />
Protista as a separate group called Monera.<br />
In 1938, American biologist Herbert Copeland proposed the<br />
group Monera as a fourth kingdom, which included only bacteria, separating it from<br />
kingdom Protista. This was the first classification proposal to separate organisms<br />
without nuclei, the prokaryotes, from organisms with nuclei, the eukaryotes, at the<br />
kingdom level.<br />
Table 1. Evolution in the classification of living organisms throughout time.<br />
Linnaeus (1735), 2 kingdoms: Vegetabilia, Animalia<br />
Haeckel (1866), 3 kingdoms: Protista, Plantae, Animalia<br />
Copeland (1938), 4 kingdoms: Monera, Protista, Plantae, Animalia<br />
Whittaker (1969), 5 kingdoms: Monera, Protista, Fungi, Plantae, Animalia<br />
Woese and Fox (1977), 6 kingdoms: Eubacteria, Archaebacteria, Protista, Fungi, Plantae, Animalia<br />
Woese et al. (1990), 3 domains: Bacteria, Archaea, Eukarya<br />
Cavalier-Smith (2004), 2 empires and 6 kingdoms: Prokaryota (Bacteria); Eukaryota (Protozoa, Chromista, Fungi, Plantae, Animalia)<br />
In 1969, American biologist Robert Whittaker proposed a fifth kingdom, Fungi,<br />
based on fungi's unique structures and methods of obtaining food. Fungi do not ingest<br />
food as animals do, nor do they make their own food, as plants do; rather, they secrete<br />
digestive enzymes around their food and then absorb it into their cells.<br />
Woese and Fox (1977) divided the kingdom Monera into two new kingdoms:<br />
Eubacteria and Archaebacteria. Then, Woese et al. (1990) proposed a new<br />
category, called Domain, to reflect evidence from nucleic acid studies that more<br />
precisely reveal evolutionary or family relationships. They suggested three domains,<br />
Archaea, Bacteria, and Eukarya, based largely on studies of the small subunit rRNA.<br />
The system of classification was further modified, with support from<br />
molecular and cellular studies, by zoologist Thomas Cavalier-Smith (1981), who<br />
proposed a sixth kingdom of life: the Chromista. In 2004, Chromista was accepted as a<br />
kingdom and the term Domain was changed to Empire, with the proposal of two:<br />
Prokaryota and Eukaryota (Table 2).<br />
Table 2. The six kingdoms of life (Cavalier-Smith, 2004)<br />
1.2 Tree of Life<br />
The new insights coming from molecular sequence studies and their integration with<br />
numerous other lines of evidence (genetic, structural and biochemical) have allowed a<br />
more accurate classification of organisms. Based on these data, researchers are reconstructing<br />
the tree of life and trying to place correctly the root of the evolutionary tree of all life,<br />
which would enable us to deduce rigorously the major characteristics of the last common<br />
ancestor of all. It is probably the most difficult problem of all in phylogenetics, not yet<br />
solved, and the most important to solve correctly because the result colours all<br />
interpretations of evolutionary history, influencing ideas of which features are primitive<br />
or derived and which branches are deeper and more ancient than others (Cavalier-Smith,<br />
2002a; Gribaldo and Philippe, 2002).<br />
For a long time, researchers proposed that the root of the tree of life lay<br />
among ancestral species of Archaebacteria, which gave rise to other groups and ultimately to<br />
higher multicellular organisms. Other attempts at rooting the tree of<br />
life used protein paralogue trees that theoretically placed the root, but these results were<br />
contradictory due to tree-reconstruction artefacts or to poor resolution (Cavalier-Smith,<br />
2002a). Studies based on ribosome-related and DNA-handling enzymes suggested a<br />
root between Neomura (Eukaryotes plus Archaebacteria) and Eubacteria (Iwabe et al.,<br />
1989), whereas metabolic enzymes place the root within Eubacteria but in contradictory<br />
places (Peretó et al., 2004). Palaeontology shows that Eubacteria are much more ancient<br />
than Eukaryotes (Cavalier-Smith, 1987; Douzery et al., 2004; Roger and Hug, 2006),<br />
and phylogenetic evidence indicates that Archaebacteria are sisters of, not ancestral to, Eukaryotes,<br />
implying that the root is not within the Neomura.<br />
The first fully resolved prokaryote tree was inferred by using 13 major<br />
biological transitions within eubacteria (Figure 1 and Table 3). This tree showed a basal<br />
stem comprising the new infrakingdom Glidobacteria (Chlorobacteria, Hadobacteria,<br />
Cyanobacteria), which is entirely non-flagellate, and two derived branches comprising<br />
the infrakingdom Gracilicutes and the subkingdom Unibacteria that diverged<br />
immediately following the origin of flagella (Cavalier-Smith, 2006a).<br />
The evolution of the proteasome, a protein complex responsible for degrading unneeded<br />
or damaged proteins, indicated that the universal root would be outside the clade<br />
comprising Neomura and Actinomycetales (proteates), and thus lie within other<br />
eubacteria, contrary to the widespread assumption that it was between Eubacteria and<br />
Neomura. Cell wall and flagellar evolution independently located the root outside<br />
Posibacteria (Actinobacteria and Endobacteria), and thus among Negibacteria, with two<br />
membranes, favouring a transition from Negibacteria to Posibacteria and not from<br />
Posibacteria to Negibacteria as traditionally assumed. Posibacteria were derived from<br />
Eurybacteria and ancestral to Neomura.<br />
Studies based on the comparison of new amino acid insertions in the RNA<br />
polymerase gene strongly favour the monophyly of Gracilicutes, which contain<br />
Proteobacteria, Planctobacteria, Sphingobacteria and Spirochaetes (Iyer et al., 2004).<br />
Evolution of the negibacterial outer membrane places the root within Eobacteria<br />
(Hadobacteria and Chlorobacteria, both primitively without lipopolysaccharide): as all<br />
phyla possessing the outer membrane β-barrel protein Omp85 are probably derived, the root<br />
would lie either between them and Chlorobacteria, the<br />
only Negibacteria without Omp85, or within Chlorobacteria (Cavalier-Smith, 2006a).<br />
A negibacterial root also fits the fossil record, which shows that Negibacteria are<br />
more than twice as old as Eukaryotes (Cavalier-Smith, 2002a; Cavalier-Smith, 2006b).<br />
Posibacteria, Archaebacteria and eukaryotes were probably all ancestrally heterotrophs,<br />
whereas Negibacteria are likely to have been ancestrally photosynthetic and diversified<br />
by evolving all the known types of photosystems and major antenna pigments.
Figure 1. Reconstruction of the Tree of Life based on biological transitions analysis. Thumbnail sketches<br />
show major variants in cell morphology (microtubular skeleton red, peptidoglycan wall<br />
brown, outer membrane blue). The most likely root position is as shown; the possibility that it<br />
may lie within Chlorobacteria instead cannot yet be ruled out. Lowest level groups including or<br />
consisting entirely of photosynthetic organisms are in green or purple. The frequently<br />
misplaced hyperthermophilic eubacteria are in red (Adapted from Cavalier-Smith, 2006a).<br />
Table 3. Phyla of the kingdom Bacteria (Cavalier-Smith, 2006a). *Classification proposed by author.<br />
The last ancestor of all life forms would have been a eubacterium with acyl-ester<br />
membrane lipids, large genome, murein peptidoglycan walls, and with a fully developed<br />
eubacterial biology and cell division. It would be a non-flagellate negibacterium with<br />
two membranes, probably a photosynthetic green non-sulphur bacterium with relatively<br />
primitive secretory machinery, not a heterotrophic posibacterium with one membrane.<br />
As Negibacteria are the only prokaryotes that use sunlight to fix carbon dioxide this is<br />
also the only position that would have allowed the first ecosystems to have been based<br />
on photosynthesis, without which extensive evolution might have been impossible<br />
(Cavalier-Smith, 2002a; 2006a).
The Neomura would have originated from an ancestor within the Arabobacteria<br />
(Actinobacteria). This ancestor would have been a thermophilic bacterium with<br />
cotranslational protein secretion, N-linked glycosylation, glycoprotein instead of murein, and<br />
histones H3/H4 instead of DNA gyrase. This neomuran ancestor diverged sharply into two<br />
contrasting lineages. One formed a glycoprotein wall and became hyperthermophilic,<br />
evolving prenyl ether lipids and losing many eubacterial genes, such as those for H1 histones, to form<br />
the Archaebacteria. The other changed much more radically, using its<br />
glycoproteins as a flexible surface coat, evolving phagotrophy, an endomembrane<br />
system, an endoskeleton and a nucleus, becoming a uniciliate, unicentriolar aerobic zooflagellate, and<br />
enslaving an α-proteobacterium as a protomitochondrion to become the first eukaryote<br />
(Cavalier-Smith, 2000; 2002b).<br />
Eukaryotes comprise the basal kingdom Protozoa and four derived kingdoms:<br />
Animalia, Fungi, Plantae, and Chromista (Cavalier-Smith, 1998). Some thoughtful 19th-century<br />
protozoologists suspected that flagellates preceded amoebae, as just asserted.<br />
But until DNA sequencing and molecular systematics burgeoned in the 1980s and<br />
1990s, the often more popular idea of Haeckel that amoebae came first and flagellates<br />
were more advanced could not be disproved. However, it is now known that the primary<br />
diversification of eukaryote cells took place among zooflagellates, non-photosynthetic<br />
predatory cells having one or more flagella for swimming and often also<br />
for generating water currents to pull in prey (Cavalier-Smith, 2006), from which two<br />
groups diverged, the bikonts and the unikonts, which together incorporate all eukaryotic kingdoms<br />
(Figure 2).<br />
Plantae, Chromista, three protozoan infrakingdoms (Alveolata, Rhizaria,<br />
Excavata), and the small zooflagellate phylum Apusozoa are all clearly ancestrally<br />
biciliate and together constitute a clade designated the bikonts (Cavalier-Smith, 2002b).<br />
Ancestral bikonts had two diverging cilia: the anterior undulated for swimming or prey<br />
attraction, and the posterior served for gliding on surfaces in many groups (likely its ancestral<br />
condition), but was secondarily adapted for swimming in other groups (or quite often was<br />
lost to make secondary uniciliates, not to be confused with genuine unikonts, which<br />
confusingly are sometimes secondarily biciliate). All major eukaryote groups except<br />
Ciliophora include severely modified organisms that lost cilia; some became highly<br />
amoeboid. The major shared derived character for all bikont groups (not yet clearly<br />
demonstrated for Rhizaria) is ciliary transformation, in which the anterior<br />
cilium/centriole and its associated roots are always the first formed during<br />
cell division; in the next cell cycle the cilium often undergoes marked changes<br />
in structure and function to become the corresponding posterior organelle (Moestrup,<br />
2000).<br />
Figure 2. Reconstruction of the eukaryote evolutionary tree. Taxa in black comprise the basal kingdom<br />
Protozoa. The thumbnail sketches show the contrasting microtubule skeletons of unikonts and<br />
bikonts in red. The dates highlighted in yellow are for the most ancient fossils known for each<br />
major group. Major innovations that help group higher taxa are shown by bars. Four major cell<br />
enslavements to form cellular chimaeras are shown by heavy arrows (enslaved bacteria) or<br />
dashed arrows (enslaved eukaryote algae) (Cavalier-Smith, 2006c).<br />
An independent derived character for bikonts is the fusion between thymidylate<br />
synthase (TS) and dihydrofolate reductase (DHFR) genes to encode a single<br />
bifunctional chimaeric protein. This fusion appears to have taken place in the common<br />
ancestor of the bikonts, after their divergence from the cenancestral eukaryote<br />
(Stechmann and Cavalier-Smith, 2002; 2003).<br />
Based on these facts, a new branching for the bikont group was proposed, in<br />
which the zooflagellate phylum Apusozoa may be a sister clade of the Excavata. In<br />
another group, it was observed that the clade chromalveolates was formed by the<br />
kingdom Chromista and the protozoan infrakingdom Alveolata, which diverged from a<br />
common ancestor that enslaved a red alga and evolved novel plastid protein-targeting<br />
machinery via the host rough endoplasmic reticulum (ER) and the enslaved algal plasma<br />
membrane (periplastid membrane). Kingdom Plantae, whose ancestor enslaved a<br />
cyanobacterium to form chloroplasts, may be sister of or ancestral to chromalveolates;<br />
Rhizaria and Excavata (jointly Cabozoa) are probably sisters if the formerly green algal<br />
plastid of euglenoids and chlorarachneans (Cercozoa) was enslaved in a single<br />
secondary symbiogenesis event in their common ancestor (Cavalier-Smith, 2002c;<br />
2003).<br />
The TS and DHFR genes are separately translated in Sarcomastigota, Animalia<br />
and Fungi. These three taxa are referred to as unikonts because it is argued that their<br />
common ancestor probably had only a single centriole and cilium per kinetid (Cavalier-Smith,<br />
2002b). This unikont state is found in most distinct amoebozoan groups. In addition,<br />
unikonts share myosin II, the same type found in our muscles, which is absent from all<br />
bikonts (Cavalier-Smith, 2006c). Unikonty is argued to be the ancestral state for<br />
Amoebozoa (Cavalier-Smith, 2002b; Cavalier-Smith et al., 2004), as only myxogastrids<br />
and former protostelids arguably related to them have bicentriolar kinetids. The<br />
bicentriolar state of myxogastrids and many Choanozoa is developmentally different<br />
from that of bikonts, as their anterior cilium remains anterior in successive cell cycles<br />
and does not transform into a posterior one. Thus, it is arguably not homologous to the<br />
bikont state with true ciliary transformation. A second gene fusion involving the first<br />
three enzymes of pyrimidine biosynthesis (carbamoyl phosphate synthase,<br />
dihydroorotase and aspartate carbamoyltransferase) is apparently a shared derived<br />
character for unikonts, absent from bikonts and the ancestral prokaryotes (Stechmann<br />
and Cavalier-Smith, 2003). As this involved two simultaneous fusion events, it is even<br />
less likely to ever have been reversed than the DHFR/TS fusion. If none of these gene<br />
fusions has ever been reversed during evolution, then they together indicate that the root<br />
of the eukaryote tree cannot lie within unikonts or bikonts, but must lie between these<br />
two clades (Stechmann and Cavalier-Smith, 2003).<br />
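The rooting argument above can be sketched as a small compatibility check. The following is a minimal illustration, not the method used by Stechmann and Cavalier-Smith: it assumes that each fusion arose once from an unfused ancestral state and was never reversed, so that all carriers of a fusion must fall on the same side of the root. The taxon sampling is a simplified, hypothetical one built from the group names in the text, and the function name `root_is_compatible` is invented for this example.

```python
# Simplified, hypothetical taxon sampling using the group names from the text.
UNIKONTS = frozenset({"Sarcomastigota", "Animalia", "Fungi"})
BIKONTS = frozenset(
    {"Plantae", "Chromista", "Alveolata", "Rhizaria", "Excavata", "Apusozoa"}
)
ALL_TAXA = UNIKONTS | BIKONTS

# Assumption: each fusion is a derived character that arose once and was
# never reversed; the root ancestor carried neither fusion.
FUSION_CARRIERS = {
    "DHFR-TS fusion": BIKONTS,
    "pyrimidine-biosynthesis fusion": UNIKONTS,
}


def root_is_compatible(side_a):
    """Return True if a root splitting the taxa into side_a and the rest
    keeps the carriers of every fusion together on one side of the root."""
    side_a = frozenset(side_a)
    side_b = ALL_TAXA - side_a
    return all(
        carriers <= side_a or carriers <= side_b
        for carriers in FUSION_CARRIERS.values()
    )


candidate_roots = {
    "between unikonts and bikonts": UNIKONTS,
    "within unikonts (Fungi vs. rest)": {"Fungi"},
    "within bikonts (Excavata vs. rest)": {"Excavata"},
}
for label, side in candidate_roots.items():
    print(label, "->", root_is_compatible(side))
```

Under these assumptions only the split between unikonts and bikonts passes the check, which mirrors the conclusion drawn in the text; relaxing the irreversibility assumption would remove the constraint.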
Earlier structural and molecular evidence supports the idea that Animalia,<br />
Choanozoa and Fungi together form a clade, characterized ancestrally by a single<br />
posterior cilium with a bicentriolar kinetid and flat mitochondrial cristae (Cavalier-Smith,<br />
1987b). An exclusive grouping of animals and fungi was first noted in<br />
evolutionary trees based on small subunit ribosomal RNA (Wainright et al., 1993) and<br />
proteins such as EF-1α, Act1 and α-β tubulin (Baldauf and Palmer, 1993). This grouping<br />
was subsequently designated "Opisthokont" (Cavalier-Smith and Chao, 1995) and is<br />
now strongly supported by phylogenetic analyses of all taxonomically well-represented<br />
and well-characterized molecular data sets. These include concatenated multigene<br />
analyses (Baldauf et al., 2000; Bapteste et al., 2001; Lang et al., 2002) as well as many<br />
single-gene trees (Baldauf, 1999; Inagaki and Doolittle, 2000; Van de Peer et al., 2000).<br />
Recent molecular phylogenetic evidence has indicated that animals and fungi evolved<br />
independently from different unicellular protozoan choanozoan ancestors (Steenkamp et<br />
al., 2006; Shalchian-Tabrizi et al., 2008).<br />
The other clade that forms part of the Opisthokonts is the phylum Choanozoa.<br />
Extant Choanozoa are either a clade that is sister to Animalia or a paraphyletic group<br />
ancestral to them (Cavalier-Smith and Chao, 2003; Rokas et al., 2003), and within the<br />
Choanozoa only the nucleariids appear to be the closest sister taxon to Fungi<br />
(Steenkamp et al., 2006). A sister relationship between animals and choanoflagellates<br />
at least is supported by the probable gene fusion that generated the receptor tyrosine<br />
kinase from a cytoplasmic kinase and calcium-binding epidermal growth factor in their<br />
common ancestor after it diverged from Amoebozoa (King and Carroll, 2001). The<br />
Opisthokont clade is also supported by a shared derived 11–17 amino acid insertion in<br />
protein synthesis elongation factor EF-1α absent from prokaryotes and all other<br />
eukaryotes (Baldauf, 1999).<br />
1.3 Kingdom Fungi<br />
Fungi share a long history with human civilization. References in Greek literature,<br />
mushroom stones from Mesoamerica dating back to 1000-300 B.C. (Lowy, 1971), and<br />
dried mushrooms of Piptoporus betulinus found in a pouch around a Stone Age man's<br />
neck in the Alps (Rensberger, 1992) all attest to this long relationship. Fungi make up<br />
one of the major clades of life. Roughly 80000 species of fungi have been described, but<br />
the actual number of species has been estimated at approximately 1.5 million<br />
(Hawksworth, 1991; 2001). This number may yet underestimate the true magnitude of<br />
fungal diversity (Persoh and Rambold, 2003).<br />
For a long time, fungi were regarded as a single kingdom belonging to the<br />
aforementioned five-kingdom scheme. However, the organisms usually considered to be<br />
fungi are very complex and diverse. Their recognition, delimitation and typification<br />
have been formidable problems since the discovery of these organisms. They may or may not<br />
contain chitin, include multicellular and filamentous absorptive forms as well as unicellular<br />
assimilative forms, among others, and can reproduce by different types of propagules or<br />
even by fission (Alexopoulos et al., 1996; Guarro et al., 1999) (Figure 3). Despite this<br />
great variability in form and function, systematic studies based upon morphology and<br />
life cycle produced, by the middle of the 20th century, a manageable number of major phyla;<br />
molecular approaches have since defined seven well-supported phyla<br />
within the kingdom Fungi (Hibbett et al., 2007).<br />
Fungi play a critical role impacting nearly all other forms of life in virtually all<br />
ecosystems as either friend or foe. Saprotrophic fungi are important in the environment<br />
in the cycling of nutrients through the decomposition of organic material, especially the<br />
carbon that is sequestered in wood and other plant tissues. Mutualistic symbiotic fungi,<br />
through relationships with prokaryotes, plants (including algae) and animals, have<br />
enabled a diversity of other organisms to exploit novel habitats and resources. Indeed,<br />
the establishment of mycorrhizal associations may be a key factor that enabled plants to<br />
make the transition from aquatic to terrestrial habitats. Another group comprises pathogenic and<br />
parasitic fungi that attack virtually all groups of organisms, including bacteria, plants,<br />
other fungi, and animals, including humans. The economic impact of such fungi is<br />
massive, either beneficial through the production of antibiotics or extremely detrimental through<br />
their devastating impacts as plant diseases, mycotoxins and mycoses (Pirozynski and<br />
Malloch, 1975; Moss, 1987; Alexopoulos et al., 1996; Blackwell, 2000).<br />
Figure 3. Diversity of fungi. Microsporidia (Nosema apis (A), Octosporea bayeri (B));<br />
Blastocladiomycota (Flammulina velutipes (C)); Neocallimastigomycota (Caecomyces<br />
communis (D)); Chytridiomycota (Batrachochytrium dendrobatidis (E)); 'Zygomycota'-<br />
Mucoromycotina (Phycomyces blakesleeanus (H), Mucor circinelloides (G));<br />
Glomeromycota (Glomus coronatum (J)); Basidiomycota (Agaricus xanthodermus (F),<br />
Cryptococcus neoformans (I), Ustilago maydis (K)); Ascomycota (Microsporum gypseum<br />
(L), Aspergillus niger (M), Candida albicans (N), Blastomyces dermatitidis (O)).
1.4 Fungal taxonomy<br />
Historically, the monophyletic Fungi have been classified into four phyla:<br />
Chytridiomycota, Zygomycota, Basidiomycota, and Ascomycota (Barr, 1992;<br />
Hawksworth et al., 1995; Alexopoulos et al., 1996). However, current studies have<br />
shown that such a simple classification does not represent the phylogeny of these<br />
organisms; the phyla currently recognized are Microsporidia, Blastocladiomycota,<br />
Neocallimastigomycota, Chytridiomycota, Glomeromycota, Basidiomycota and<br />
Ascomycota. At present, the phylum Zygomycota remains undefined, pending<br />
resolution of relationships among the clades that have traditionally been placed<br />
within this phylum (Hibbett et al., 2007).<br />
The current fungal classification accepts one kingdom, one subkingdom, seven<br />
phyla, ten subphyla, 35 classes, 12 subclasses, and 129 orders. The clade containing the<br />
Ascomycota and Basidiomycota is classified as the subkingdom Dikarya (James et al.,<br />
2006), reflecting the putative synapomorphy of dikaryotic hyphae (Tehler, 1988). All of<br />
the other new names are based on automatically typified teleomorphic names. In<br />
Basidiomycota, the clades formerly called Basidiomycetes, Urediniomycetes, and<br />
Ustilaginomycetes in the last edition of Ainsworth & Bisby's Dictionary of the Fungi<br />
are now called the Agaricomycotina, Pucciniomycotina, and Ustilaginomycotina,<br />
respectively (Bauer et al., 2006). This was done to minimize confusion between taxon<br />
names and informal terms (basidiomycetes is a commonly used informal term for all<br />
Basidiomycota) and to refer to the included genera Agaricus (including the cultivated<br />
button mushroom) and Puccinia (which includes barberry-wheat rust). Another<br />
significant change in the Basidiomycota classification has been the inclusion of the<br />
Wallemiomycetes and Entorrhizomycetes as classes incertae sedis within the phylum,<br />
reflecting ambiguity about their higher-level placements (Matheny et al., 2007a).<br />
The most dramatic changes in the classification concern the 'basal fungal<br />
lineages', which include the taxa that have traditionally been placed in the Zygomycota<br />
and Chytridiomycota. These groups have long been recognized to be polyphyletic,<br />
based on analyses of rRNA, EF-1α, and RPB1 (James et al., 2000; Nagahama et al.,<br />
1995; Tanabe et al., 2004, 2005). Recent multilocus analyses now provide the sampling,<br />
resolution, and support necessary to structure new classifications of these early-diverging<br />
groups, although significant questions remain (Liu et al., 2006; Lutzoni et<br />
al., 2004; James et al., 2006). The Chytridiomycota is retained in a highly restricted<br />
sense, including Chytridiomycetes and Monoblepharidomycetes. The Blastocladiales, a<br />
traditional member of the Chytridiomycota, is now treated as a phylum, the<br />
Blastocladiomycota (James et al., 2007). The Neocallimastigales, whose distinctiveness<br />
from other chytrids has long been recognized, is also elevated to phylum, based on both<br />
morphology and molecular phylogeny (Hibbett et al., 2007). The genera<br />
Caulochytrium, Olpidium, and Rozella, which have traditionally been placed in the<br />
Chytridiomycota, and Basidiobolus, previously classified in the Zygomycota<br />
(Entomophthorales), are not included in any higher taxa in the current classification,<br />
pending more definitive resolutions of their placements. The phylum Zygomycota has<br />
not been accepted in the current classification, pending resolution of the<br />
relationships among the groups that were traditionally part of this phylum. The<br />
traditional Zygomycota were distributed among the phylum Glomeromycota and four<br />
subphyla incertae sedis, including Mucoromycotina, Kickxellomycotina,<br />
Zoopagomycotina and Entomophthoromycotina (Hibbett et al., 2007). Microsporidia,<br />
unicellular parasites of animals and protists with highly reduced mitochondria, has been<br />
included as a phylum of the Fungi, based on multigenic analyses (Germot et al., 1997;<br />
Hirt et al., 1997; Peyretaillade et al., 1998; Keeling et al., 2000; Gill and Fast, 2006;<br />
James et al., 2006; Liu et al., 2006).<br />
1.5. Fungal phylogeny<br />
Until recently, fungal phylogenetics was well served by phenotypic<br />
characters such as morphological comparison, cell wall composition, cytologic testing,<br />
ultrastructure, cellular metabolism and fossil records (LéJohn, 1974; Taylor, 1978;<br />
Heath, 1986; Hawksworth et al., 1995). However, higher-level relationships amongst<br />
fungal groups are less certain and are best elucidated using molecular techniques. More<br />
recently, the advent of cladistic and molecular approaches has changed the existing<br />
situation and provided new insights into fungal evolution. Genotypic analysis offers a<br />
straightforward means of resolving evolutionary questions where the phenotypic characters<br />
mentioned above are absent or contradictory.<br />
Many loci have been used by mycologists for evolutionary studies at single-gene<br />
level, but few of these loci revealed to be appropriated to resolve relationships among<br />
the main lineages of the Fungi. Even when trees are inferred by using multiple loci, the
INTRODUCTION<br />
phylogenetic signal may be limited strongly by the loci selected. Phylogenetic studies<br />
indicate that more than 83.9% of fungal phylogenies are based exclusively on sequences<br />
from the ribosomal RNA tandem repeats. The few protein-coding genes that have been<br />
sequenced for phylogenetic studies of fungi have demonstrated clearly that such genes<br />
can contribute greatly to resolving deep phylogenetic relationships with high support<br />
and/or increase support for topologies inferred by using ribosomal DNA genes (Liu et<br />
al., 1999). Other protein-coding genes, such as DNA topoisomerase II,<br />
actin (ACT1), the largest subunit of RNA polymerase II (RPB1), cytochrome oxidase 2<br />
(COX2) and cytochrome b (Cyt B), combined with the genes traditionally used in<br />
phylogeny, are already being used to infer fungal phylogeny (Kurtzman and Robnett,<br />
2003; Diezmann et al., 2004; Lutzoni et al., 2004; Reeb et al., 2004; Wang et al., 2004;<br />
Liu et al., 2006; Matheny et al., 2007b).<br />
By 2004, sequence databases held 21075 ITS, 7990 nucSSU, 5373<br />
nucLSU, 1991 mitSSU, and 349 RPB2 fungal sequences, and the number is<br />
continuously increasing (Lutzoni et al., 2004). These numbers reflect an impressive<br />
collective effort to generate DNA sequence data for the Fungi, but none of these loci<br />
alone can resolve the fungal tree of life with a satisfactory level of phylogenetic<br />
confidence (Kurtzman and Robnett, 1998; Tehler et al., 2000; Berbee, 2001; Binder and<br />
Hibbett, 2002; Moncalvo et al., 2002; Tehler et al., 2003). Combining sequence data<br />
from multiple loci is an integral part of large scale phylogenetic inference and is central<br />
to assembling the fungal tree of life. However, most published phylogenetic studies<br />
have restricted their analyses to one locus in order to maximize the number of fungal taxa. To<br />
quantify this observation, 560 publications reporting fungal phylogenetic<br />
trees, published from 1990 through 2003, were surveyed. Of the 595 trees considered in those<br />
studies, 489 (82.2%) were based on a single locus, only 77 trees were based on two<br />
combined loci, 19 on three combined loci, and 10 on four or more combined loci. Seven<br />
of the latter 10 studies were restricted to closely related species or strains within a<br />
species (Lutzoni et al., 2004). Exceptions included studies with 93 species representing<br />
most major clades of Homobasidiomycetes (Binder and Hibbett, 2002), with 15 species<br />
representing 10 orders (Binder et al., 2001), and with 45 species representing nine<br />
orders (Hibbett and Binder, 2001).<br />
The largest phylogenetic tree based on one locus included 1551 nucSSU<br />
sequences representing 60 orders (Tehler et al., 2003). The largest multilocus trees<br />
included 162 ITS + β-tubulin sequences representing a single order of Fungi (Stenroos<br />
et al., 2002); 158 species representing 10 orders based on nucSSU, nucLSU, and<br />
mitSSU (Hibbett et al., 2000); 110 species in a single order sequenced for ITS and<br />
nucLSU (Peterson, 2000); and 108 nucSSU + nucLSU sequences representing 19 orders<br />
of Fungi (Miadlikowska and Lutzoni, 2004). A more recent phylogenetic study directed<br />
toward resolving the fungal tree of life, based on six combined loci, included members<br />
from all four traditionally recognized phyla of Fungi (Ascomycota, Basidiomycota,<br />
Chytridiomycota, and Zygomycota), and two new phyla Glomeromycota and<br />
Microsporidia (James et al., 2006).<br />
Although much effort has been invested in defining orders (compared to<br />
families, for example), few studies have focused on resolving relationships among<br />
orders of Fungi: 354 of the 595 trees examined (59.5%) conveyed relationships within<br />
single orders. The largest number of orders considered in a single study (N=62) resulted<br />
in a tree based only on nucSSU data (Tehler et al., 2000). The fungal trees based on<br />
combined data from multiple loci and encompassing the largest number of orders<br />
included 38 species representing 25 orders (Bhattacharya et al., 2000), 52 species<br />
representing 20 orders (Lutzoni et al., 2001), and 108 species representing 19 orders<br />
(Miadlikowska and Lutzoni, 2004). All of these studies focused on ascomycetes and<br />
were based on nucSSU and nucLSU rDNA. An exceptional study covering 16 orders of<br />
fungi (34 species), using a combined analysis of two protein-coding genes (α- and β-tubulin),<br />
was carried out to infer the phylogenetic placement of Microsporidia within<br />
the kingdom Fungi (Keeling, 2003). The major fungal phylogenomic study, based on 42<br />
complete genomes, used all conserved regions of genes and resolved the<br />
deep phylogenetic relationships of the majority of known taxa. The major drawback of<br />
this study is the low number of sequenced fungal genomes, which hinders the resolution of<br />
basal taxa in some clades (Fitzpatrick et al., 2006).<br />
Figure 4 shows a fungal phylogeny based on genetic and geologic data, which<br />
presents the five major groups of fungal organisms whose genomes have been<br />
sequenced or are being sequenced, but uses the previous taxonomic nomenclature. In this tree, it<br />
is evident that the Ascomycetes and Basidiomycetes are sister groups (subkingdom
Dikarya), and that the Glomeromycetes are closely related to the subkingdom<br />
Dikarya, which diverged about 600 million years ago. The basal groups in the kingdom Fungi<br />
are the Chytrids and Zygomycetes. Studies to determine the correct placement<br />
of species in these groups are currently under way, since these groups appear paraphyletic<br />
(Chytrids) or polyphyletic (Zygomycetes) in the fungal phylogeny. As a result, the<br />
species of the Zygomycetes have been reclassified (Hibbett et al., 2007).<br />
Figure 4. Phylogeny of the kingdom Fungi. Major fungal groups colored as indicated by text at the right.<br />
Diamonds indicate evolutionary branch points, and their approximate dating (bottom), captured<br />
by fungal genomes sequenced or in progress (modified from Berbee and Taylor, 2001).<br />
1.6. Fungal identification<br />
The identification of fungi to the species level, and the distinction of<br />
morphologically identical strains of the same species, have been a challenge for<br />
mycologists. Hence, most species have received only limited study, and the<br />
classification has been mainly traditional and based on readily observable<br />
morphological features. Other features besides morphology, such as susceptibility to<br />
chemicals and antifungal drugs, physiological and biochemical tests, production of<br />
secondary metabolites, ubiquinone systems, fatty acid composition, cell wall<br />
composition, protein composition and molecular approaches, have been used in<br />
classification and also in identification of fungal species (Guarro et al., 1999).<br />
1.6.1. Phenotypic methods<br />
The classification and identification of fungi using phenotypic methods relies<br />
mainly on morphological criteria. The classical light microscopic methods have been<br />
enhanced by Nomarski differential interference contrast, fluorescence, cytochemistry,<br />
and the development of new staining techniques such as those for ascus apical structures<br />
(Romero and Winter, 1988). Unfortunately, during infections most pathogenic fungi<br />
show only the vegetative phase (absence of sporulation); in host tissue only hyphal<br />
elements or other nonspecific structures are observed. Although the pigmentation and<br />
shape of these hyphae and the presence or absence of septa can give us an idea of their<br />
identity, fungal culture is required for accurate identification. For the identification and<br />
classification of these fungi, the type of conidia and conidiogenesis are considered the<br />
most important sets of characteristics to be observed (Cole and Samson, 1979; Guarro et<br />
al., 1999). In the rare instances that opportunistic fungi develop the teleomorph in vitro<br />
(this happens in numerous species of Ascomycota and in a few species of Zygomycota<br />
and Basidiomycota), there are many morphological details associated with sexual<br />
sporulation which can be extremely useful in their classification. The type of fruiting<br />
body and type of ascus (the structure that contains the ascospores produced after meiosis) are<br />
vital for classification. Shape, color, and the presence of an apical opening (ostiole) in<br />
the fruiting bodies are important features in the recognition of higher taxa (Guarro et al.,<br />
1999). In recent years, morphological techniques have been influenced by modern<br />
procedures, which allow more reliable phenotypic studies to be performed. Numerical<br />
taxonomy, effective statistical packages, and the application of computer facilities to the<br />
development of identification keys offer some solutions and the possibility of a<br />
renaissance of morphological studies.<br />
Besides morphological identification, other approaches have been<br />
proposed to identify fungi: a) Biotyping, which is based on<br />
physiological and biochemical tests, such as the commercially available API and<br />
VITEK systems, and on selective media such as CHROMagar, and has been used to identify<br />
unicellular fungi (San-Millan et al., 1996; Bernal et al., 1998; Graf et al., 2000; Ballesté<br />
et al., 2005), b) Serotyping, a method based on the agglutination reaction between antigens<br />
and polyclonal antibodies, used for typing pathogenic yeasts such as Candida spp. and<br />
Cryptococcus neoformans at the strain level (Evans et al., 1950; Hasenclever and<br />
Mitchell, 1961), c) Production of secondary metabolites, which consists of the
identification of fungi by the metabolites that they produce (Carlile and Watkinson, 1994;<br />
Frisvad, 1994; Whaley and Edwards, 1995), d) Cell wall composition, based on studies<br />
of the structure and composition of the fungal cell wall, which delimit higher taxa<br />
(Bartnicki-Garcia, 1987; Carlile and Watkinson, 1994), and e) Isoenzymatic analysis, which enables<br />
the differentiation of species by the electrophoretic pattern of specific enzyme<br />
activities (Busch and Nitschko, 1999; Soll, 2000). All of these techniques have<br />
limitations in the identification of fungal species, some of them being restricted to certain<br />
species or fungal groups. Hence, techniques based on studies at the<br />
genomic level may overcome these disadvantages.<br />
1.6.2. Genotypic methods<br />
Genotypic methods are those based directly on the study of DNA.<br />
Electrophoretic karyotyping uses pulsed-field gel electrophoresis to<br />
assess variations in chromosome size and to determine karyotypes precisely,<br />
allowing discrimination at the strain and species levels. In electrophoretic<br />
karyotyping, electric<br />
fields of alternating orientation move intact chromosomes according to size through<br />
an agarose gel matrix, where they can be visualized by ethidium bromide staining (Guarro et<br />
al., 1999; Taylor et al., 1999). Electrophoretic karyotyping has been used to fingerprint<br />
different fungal species such as Candida spp., Micromucor spp., Paracoccidioides<br />
brasiliensis and Pyrenophora spp. (Dib et al., 1996; Nogueira et al., 1998; Aragona et al.,<br />
2000; Klempp-Selb et al., 2000; Shin et al., 2001; Nagy et al., 2004).<br />
One of the first DNA fingerprinting methods used to assess strain relatedness and<br />
taxonomy in fungi was restriction fragment analysis, or restriction fragment length<br />
polymorphism (RFLP) comparisons, without probe hybridization. DNA is extracted and<br />
digested with one or more restriction endonucleases, which sample short pieces of DNA<br />
sequence. The fragments are then separated by electrophoresis in an agarose gel. The<br />
banding pattern of the digested DNA is then visualized, usually by staining with ethidium<br />
bromide. The band pattern obtained reflects the different fragment lengths determined<br />
by the restriction sites recognized by the particular endonuclease employed (Taylor et al.,<br />
1999; Soll, 2000). Any region of DNA can be used for RFLP analysis if variation<br />
(alleles) can be visualized, directly due to multiple copies (mtDNA, rDNA) or indirectly<br />
either by hybridization with a probe or by PCR amplification. RFLPs were the first<br />
DNA markers used for fungal evolutionary biology. The band patterns can be tabulated<br />
and compared, and phenetic trees constructed (Edel et al., 1996). Critics of RFLP<br />
analysis note that, while restriction sites shared by different individuals are most likely to be<br />
identical by descent, that is not the case for missing sites, because it is easier to lose a<br />
site than to gain one. There are many ways in which a restriction endonuclease site can<br />
be lost: any of the several nucleotides in the recognition sequence can be substituted, or<br />
the site can suffer length mutations. Missing sites that have arisen by different,<br />
unrecognized routes confound evolutionary analysis (Taylor et al., 1999).<br />
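The way a single substitution removes a restriction site and changes the banding pattern can be illustrated with a minimal in silico sketch. The two toy sequences, the helper name `digest` and the choice of EcoRI (recognition site GAATTC, cut after the first G) are assumptions made for illustration only; they are not data from the studies cited above.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return the fragment lengths obtained by cutting a linear sequence.

    cut_offset is the position inside the recognition site where the
    enzyme cuts (EcoRI cuts G^AATTC, hence an offset of 1).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical strains: strain_b carries one substitution (A -> G)
# that destroys the second EcoRI site.
strain_a = "AAAA" + "GAATTC" + "TTTTTT" + "GAATTC" + "CC"
strain_b = "AAAA" + "GAATTC" + "TTTTTT" + "GAGTTC" + "CC"

fragments_a = sorted(digest(strain_a), reverse=True)
fragments_b = sorted(digest(strain_b), reverse=True)
print(fragments_a)  # three bands
print(fragments_b)  # two bands: the lost site merges two fragments
```

Note that the band pattern of `strain_b` would be the same whichever nucleotide of the site had been substituted, which is the reason why shared missing sites are weak evidence of common descent.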
Randomly amplified polymorphic DNA (RAPD) analysis, or arbitrarily primed PCR<br />
(AP-PCR) analysis, is similar to RFLP analysis in that it assays DNA sequence variation<br />
in short regions, but instead of analyzing restriction endonuclease recognition<br />
sequences, it focuses on PCR priming regions (Williams et al., 1990). Using random<br />
primers of approximately 10 bases and a low annealing temperature, regions<br />
throughout the genome are targeted and amplified. Amplified products are separated on<br />
an agarose gel and stained with ethidium bromide. When a single random<br />
oligonucleotide primer is used in a reaction, it hybridizes to complementary sequences in<br />
the genome, and if the primer hybridizes to sequences on opposite DNA strands within<br />
roughly 3 kb, the DNA region between the two hybridization sites will be amplified.<br />
Criticisms of RAPDs initially focused on reproducibility. RAPD analysis succeeds<br />
because just one nucleotide substitution can allow or prevent priming, and so it is not<br />
surprising that small differences in any aspect of the reaction can likewise alter the PCR profile.<br />
Even if there were no problems with repeatability, there would still be the concern that<br />
bands of equal electrophoretic mobility may not be homologous and the related concern<br />
that missing bands may not be homologous because they can be lost by several possible<br />
nucleotide substitutions in either PCR priming site as well as by length mutations. There<br />
is also the problem of dominant and null alleles; in haploid organisms both the<br />
dominant (presence) and null (absence) alleles can be scored, but in diploids it is not<br />
possible to distinguish genotypes that are homozygous for the dominant allele from<br />
those that are heterozygous. It is also tempting to score more than one variable band per<br />
RAPD reaction, although they may not be independent (Taylor et al., 1999). Thus,<br />
RAPD has not been found useful for evolutionary studies. Because of the repetitive<br />
character of the target sequences, genetic distances calculated from RAPD could be
affected by paralogy, namely, recombination and duplication events not parallel with<br />
speciation events (Melo et al., 1998).<br />
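The priming rule described for RAPD (one primer, two sites on opposite strands within roughly 3 kb) can be sketched as follows. This is a simplified illustration that requires exact matches and ignores annealing conditions; the primer, the toy genome and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_all(seq, word):
    """Return every start position of word in seq (overlaps allowed)."""
    hits, start = [], seq.find(word)
    while start != -1:
        hits.append(start)
        start = seq.find(word, start + 1)
    return hits


def rapd_amplicons(genome, primer, max_len=3000):
    """Sizes of products expected from a single arbitrary primer.

    A site where the primer anneals to the opposite strand appears in the
    given strand as the reverse complement of the primer.
    """
    forward = find_all(genome, primer)
    reverse = find_all(genome, reverse_complement(primer))
    sizes = []
    for f in forward:
        for r in reverse:
            size = r + len(primer) - f
            if r > f and size <= max_len:
                sizes.append(size)
    return sorted(sizes)


primer = "AACCGGTTAC"  # hypothetical 10-base primer
genome = primer + "T" * 50 + reverse_complement(primer) + "T" * 5
# One substitution inside the forward priming site (T -> A)
mutant = genome.replace(primer, "AACCGGTAAC", 1)

bands_wild = rapd_amplicons(genome, primer)
bands_mutant = rapd_amplicons(mutant, primer)
print(bands_wild, bands_mutant)
```

In the mutant the band is simply absent, and the same absence would result from a substitution at any other position of either priming site, which is the homology concern raised above.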
Amplified fragment length polymorphism (AFLP) analysis is a relatively new<br />
technique whose discriminatory power makes it suitable for identification as<br />
well as for strain typing. In short, in AFLP analysis genomic DNA is digested with two<br />
restriction enzymes (EcoRI and MseI) and double-stranded oligonucleotide adapters are<br />
ligated to the fragments. These adapters serve as targets for the primers during PCR<br />
amplification. To increase the specificity, it is possible to elongate the primers at their<br />
3′ ends with one to three selective nucleotides. One of the primers is labeled with a<br />
fluorescent dye (Vos et al., 1995). The polymorphism revealed by<br />
AFLP analysis depends on restriction endonuclease site<br />
differences, just like RFLP. The variable fragments (loci) have two alleles (positive and<br />
null), and so the same concerns raised about null alleles with RFLP and RAPD analysis<br />
apply (Taylor et al., 1999). The advantage of AFLP analysis is that only a limited<br />
amount of DNA is needed since the fragments are PCR amplified producing more than<br />
with RFLP. Furthermore, since stringent annealing temperatures are used during<br />
amplification, the technique is more reproducible and robust than other methods such as<br />
RAPD analysis (Savelkoul et al., 1999). The value and potential of the AFLP technique<br />
for the differentiation and identification of strains and/or species and for the analysis of<br />
genetic similarity have been demonstrated in yeast ecology and evolutionary studies<br />
(Barros Lopes et al., 1999).<br />
Sequencing is the best method for measuring phylogenetic relationships and<br />
discriminating between species and strains through comparison of the DNA sequences<br />
of a variety of noncoding and coding regions. The underlying assumption is that the<br />
rates at which mutations accumulate in gene regions reflect evolutionary clocks.<br />
Particular noncoding and coding regions may be highly resistant to change and<br />
therefore may have very slow evolutionary clocks, while other coding regions, such as<br />
those of genes involved in pathogenesis, may be far less resistant to change and<br />
therefore may have very fast evolutionary clocks. The studies of rRNA genes were<br />
based on the assumption that they were less prone to biases resulting from selective<br />
pressure (Karl and Avise, 1993). Until recently, cloning and sequencing genes were so<br />
slow, technically demanding and expensive that they seemed beyond the technical capacity<br />
of most medical mycology laboratories. However, the emergence of PCR, automated<br />
DNA-sequencing technologies, and gene data banks has made these studies easier<br />
(Soll, 2000). The sequencing of several genes, both nuclear and mitochondrial, is being<br />
used to infer phylogeny and discriminate fungal species and strains.<br />
In the first sequencing studies, regions that do not code for proteins were selected to infer<br />
phylogeny, using the small subunit ribosomal DNA sequences from eukaryotes and<br />
prokaryotes (Woese and Fox, 1977), which allowed the reconstruction of the first tree of life.<br />
Internal Transcribed Spacer (ITS) regions are mainly used in fungal taxonomy and<br />
phylogeny, but to resolve the phylogenetic placement of fungal species whose<br />
position is uncertain, additional genes are being used, such as nuclear large and small<br />
subunit ribosomal DNA (nucLSU and nucSSU), mitochondrial large and small subunit<br />
ribosomal DNA (mitLSU and mitSSU), the second largest subunit of RNA polymerase II<br />
(RPB2), α- and β-tubulin and elongation factor 1 (EF-1), clarifying the divergence of<br />
species (Lutzoni et al., 2004; Bridge et al., 2005; James et al., 2006). Currently,<br />
sequencing is being performed at the genome level, which has made it possible to explain better<br />
the phylogenetic relationships in fungi (Fitzpatrick et al., 2006).<br />
1.7. Methods for phylogenetic inference<br />
Previously, attempts to infer phylogeny were based on the identification of<br />
common morphological characters among species. Although these data made it possible to<br />
group close species, they did not always resolve deep phylogenetic relationships, especially<br />
in organisms with peculiar characters like fungi, which were for a long time grouped with<br />
plants. Currently, the methods for inferring phylogenetic relationships between species<br />
use amino acid or nucleotide sequences. These data are more reliable because<br />
they provide information that cannot be obtained by morphological analysis,<br />
such as the presence of and differences among homologous genes, mutations, and duplicated<br />
genes.<br />
In phylogenetic studies the first step is the alignment of homologous sequences from a<br />
single gene or from multiple genes. This procedure allows the identification of regions of high<br />
similarity and shows how close the sequences are, using either nucleotide or amino acid<br />
sequences. Several programs have been proposed for sequence alignment, the<br />
most widely used being the CLUSTAL series of programs. The alignment obtained is imported into a
phylogenetic inference program, which can use one of two primary approaches to<br />
phylogenetic tree construction: algorithmic or tree-searching. The first uses an<br />
algorithm to construct a tree from the data. The second constructs many trees, and then<br />
the best tree or best set of trees is selected. These two approaches are also known as<br />
distance and character-based methods, respectively. Below, two alignment<br />
methods and the main methods of phylogenetic reconstruction are described.<br />
1.7.1. Multiple sequence alignment<br />
1.7.1.1. CLUSTAL series of programs<br />
The Clustal series of programs is widely used in molecular biology for the multiple<br />
alignment of both nucleic acid and protein sequences and for preparing phylogenetic<br />
trees. The first Clustal program was designed specifically to work efficiently on<br />
personal computers, which at that time had feeble computing power by today's<br />
standards. It combined a memory-efficient dynamic programming algorithm with the<br />
progressive alignment strategy. The multiple alignments are built up progressively by a<br />
series of pairwise alignments, following the branching order in a guide tree. The initial<br />
pre-comparison used a rapid word-based alignment algorithm and the guide tree was<br />
constructed by using the Unweighted Pair-Group Method with Arithmetic Mean<br />
(UPGMA) method (Higgins and Sharp, 1988). In 1992, a new release was made, called<br />
ClustalV, which incorporated profile alignments (alignments of existing alignments)<br />
and the facility to generate trees from the multiple alignment by using the Neighbour<br />
Joining (NJ) method, and the user could also test the tree for robustness by using a<br />
simple bootstrap test of tree topology (Higgins et al., 1992). In 1994, the third<br />
generation of the series, Clustal W, was released. Clustal W incorporated position-specific<br />
gap penalties, so that gap penalties can be lowered at hydrophilic residues and<br />
wherever gaps are already introduced into the alignment. The sequence pre-comparison<br />
in Clustal W uses more sensitive dynamic programming and yields a much better<br />
dendrogram by means of NJ, which improves tree topology and provides a method for<br />
weighting sequences on the basis of their divergences (Thompson et al., 1994). Later,<br />
Clustal X was developed, displaying many improvements. Within alignments,<br />
conserved columns are highlighted by using a colour scheme that the user can<br />
customize. Beneath the sequence alignment, Clustal X provides a plot of residue<br />
conservation. Quality-analysis tools that highlight misaligned regions are also available<br />
(Thompson et al., 1997). The popularity of the programs depends on a number of<br />
factors, including not only the accuracy of the results, but also the robustness,<br />
portability and user-friendliness of the programs.<br />
1.7.1.2. MUSCLE<br />
Multiple sequence comparison by log-expectation (MUSCLE) is a new<br />
computer program for creating multiple alignments of protein or, alternatively,<br />
nucleotide sequences. This program uses two distance measures for a pair of sequences: a kmer<br />
distance (for an unaligned pair) and the Kimura distance (for an aligned pair). A kmer is<br />
a contiguous subsequence of length k, also known as a word or k-tuple. Related<br />
sequences tend to have more kmers in common than expected by chance. The kmer<br />
distance is derived from the fraction of kmers in common in a compressed amino acid<br />
alphabet. This measure does not require an alignment, giving a significant speed<br />
advantage. Given an aligned pair of sequences, MUSCLE computes the pairwise identity and<br />
converts it to an additive distance estimate, applying the Kimura correction for multiple<br />
substitutions at a single site. Distance matrices are clustered using UPGMA, which gives<br />
slightly improved results over neighbor-joining, despite the expectation that neighbor-joining<br />
will give a more reliable estimate of the evolutionary tree. This can be explained<br />
by assuming that in progressive alignment, the best accuracy is obtained at each node by<br />
aligning the two profiles that have fewest differences, even if they are not evolutionary<br />
neighbors. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of<br />
the benchmark sets tested. Without refinement, MUSCLE achieves average accuracy statistically<br />
indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods<br />
for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min<br />
on a current desktop computer (Edgar, 2004).<br />
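The two distance measures can be sketched in a few lines. This is a simplified illustration and not MUSCLE's actual implementation: the compressed amino acid alphabet is omitted, the normalisation of the kmer distance and the form of the Kimura correction, d = -ln(1 - D - D²/5) with D the fraction of differing aligned residues, are stated here as assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous subsequence (kmer) of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(x, y, k=3):
    """1 minus the fraction of kmers shared by two unaligned sequences."""
    shared = sum((kmer_counts(x, k) & kmer_counts(y, k)).values())
    return 1.0 - shared / (min(len(x), len(y)) - k + 1)


def kimura_distance(aligned_x, aligned_y):
    """Identity-based distance corrected for multiple substitutions.

    Columns containing a gap are ignored. The correction is only defined
    while 1 - D - D*D/5 stays positive (D below roughly 0.85).
    """
    pairs = [(a, b) for a, b in zip(aligned_x, aligned_y)
             if a != "-" and b != "-"]
    identity = sum(a == b for a, b in pairs) / len(pairs)
    d = 1.0 - identity
    return -math.log(1.0 - d - d * d / 5.0)


print(kmer_distance("ACDEFG", "ACDEFG"))   # identical sequences
print(kmer_distance("AAAAAA", "CCCCCC"))   # no kmer in common
print(kimura_distance("ACDE", "ACFG"))     # 50% identity
```

The kmer distance needs no alignment, which is where the speed advantage mentioned above comes from; the Kimura distance requires the pair to be aligned first.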
1.7.2. Major methods for estimating phylogenetic trees and inferring<br />
phylogeny<br />
1.7.2.1. Distance methods<br />
These methods convert aligned sequences into a distance matrix of pairwise<br />
differences (distances) between the sequences. The matrix is used as the data from<br />
which branching order and branch lengths are computed. The most widely used<br />
distance methods are NJ and UPGMA.<br />
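The conversion of an alignment into a distance matrix can be sketched as follows. The uncorrected proportion of differing sites (p-distance) and the Jukes-Cantor (JC69) correction are used here as assumed examples of a distance measure; the toy alignment and the function names are hypothetical.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(a, b):
    """p-distance corrected for multiple hits under the JC69 model."""
    p = p_distance(a, b)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(alignment, dist=p_distance):
    """Return {(name_i, name_j): distance} for every pair of sequences."""
    return {(i, j): dist(alignment[i], alignment[j])
            for i, j in combinations(alignment, 2)}


alignment = {            # hypothetical aligned sequences
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGAACGA",
}
matrix = distance_matrix(alignment)
print(matrix)
```

Passing `dist=jc69_distance` gives the corrected matrix; the corrected distance is always at least as large as the p-distance because it accounts for substitutions hidden by later changes at the same site.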
UPGMA is a straightforward example of a clustering method. It starts by grouping<br />
the two taxa with the smallest distance between them according to the distance matrix.
Then, a new node is added at the midpoint between the two, and the two original taxa are placed<br />
on the tree. The distance from the new node to each other node is the arithmetic<br />
average of the distances from the two taxa. Then, a reduced distance matrix is obtained by<br />
replacing the two taxa with the new node. That process is repeated on the new matrix and<br />
reiterated until the matrix<br />
consists of a single entry. That set of matrices is then used to build up the tree by<br />
starting at the root and moving out to the first two nodes represented by the last two<br />
clusters. The tree produced by UPGMA is not an arbitrary tree.<br />
Instead, it is a special type of tree, which satisfies the molecular clock property. That is,<br />
the total time travelling down a path to the leaves from any node is the same, regardless<br />
of the choice of path. Thus, if the true phylogeny of the observed data does not<br />
have this property, it will not be well reconstructed by UPGMA. A necessary<br />
condition for UPGMA to recover the correct tree is the ultrametric condition, which<br />
requires that, for any three taxa, the two largest of the three pairwise distances be equal<br />
(every triangle is isosceles), which is very unlikely in real data. For that and other reasons,<br />
UPGMA is considered a poor<br />
algorithm for the construction of phylogenetic trees, since it relies on the rates of evolution<br />
among different lineages being approximately equal (Sneath and Sokal, 1973; Hall,<br />
2001).<br />
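The clustering steps above can be sketched as a short function. This is a minimal illustration, assuming a distance dictionary with one entry per pair of taxa; the function name, the nested-tuple tree representation and the toy distances are hypothetical.

```python
from itertools import combinations


def upgma(names, matrix):
    """Cluster taxa by UPGMA.

    names  : list of taxon labels
    matrix : dict {(name_i, name_j): distance}, one entry per pair
    Returns (tree, root_height), the tree being a nested tuple.
    """
    size = {n: 1 for n in names}        # number of leaves in each cluster
    height = {n: 0.0 for n in names}    # height of each cluster's node
    d = {}
    for (a, b), value in matrix.items():
        d[(a, b)] = d[(b, a)] = value
    active = list(names)
    while len(active) > 1:
        # join the two clusters separated by the smallest distance
        a, b = min(combinations(active, 2), key=lambda pair: d[pair])
        new = (a, b)
        size[new] = size[a] + size[b]
        height[new] = d[(a, b)] / 2.0   # new node sits at the midpoint
        rest = [c for c in active if c != a and c != b]
        for c in rest:
            # arithmetic mean over all leaves of the merged cluster
            mean = (size[a] * d[(a, c)] + size[b] * d[(b, c)]) / size[new]
            d[(new, c)] = d[(c, new)] = mean
        active = rest + [new]
    return active[0], height[active[0]]


# Hypothetical ultrametric distances for four taxa
names = ["A", "B", "C", "D"]
matrix = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
          ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
tree, root_height = upgma(names, matrix)
print(tree, root_height)
```

Because every leaf is placed at the same distance from the root, the result is always clock-like, whatever the input distances, which is the limitation discussed above.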
Neighbor Joining is similar to UPGMA in that it manipulates a distance matrix,<br />
reducing it in size at each step, and reconstructing the tree from that series of matrices.<br />
It differs from UPGMA in that it does not construct clusters but directly calculates<br />
distances to internal nodes. In this method, the total sum of the individual distances is<br />
not computed for all or many different topologies; instead, the examination of different<br />
topologies is embedded in the algorithm, so that only one final tree is produced. From<br />
the original matrix, NJ starts first with a star phylogeny. The next step is to calculate for<br />
each taxon its net divergence from all other taxa as the sum of the individual distances<br />
from the taxon. It then uses that net divergence to calculate a corrected distance matrix.<br />
NJ then finds the pair of taxa with the lowest corrected distance and calculates the<br />
distance from each of those taxa to the node that joins them, so that we can identify a<br />
pair of neighbors. Once this pair is identified, they are combined as a single unit and<br />
treated as a single sequence in the next step. This process is continued until all<br />
multifurcating nodes are resolved into bifurcating ones. NJ does not assume that all taxa<br />
are equidistant from a root. Unlike UPGMA, the NJ method produces unrooted trees. One<br />
of the disadvantages of the NJ method is that it generates only one tree and does not test<br />
other possible tree topologies. This can cause problems because, in many cases, in the<br />
initial setup of NJ there may be multiple equally close pairs of neighbors to join,<br />
leading to multiple trees. Since the algorithm has no way of knowing which one is<br />
the optimal tree, choosing a wrong option may produce a suboptimal tree. To<br />
overcome this problem, a generalized NJ method has been developed in which multiple<br />
trees with different initial taxon groupings are generated (Saitou and Nei, 1987; Nei,<br />
1996; Hall, 2001).<br />
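A single NJ step (net divergence, corrected distance, choice of neighbors and branch lengths to the joining node) can be sketched as below. The formulas used, corrected distance d(i,j) - (r_i + r_j)/(N - 2) and branch length d(i,j)/2 + (r_i - r_j)/(2(N - 2)), are the usual textbook form and are stated here as assumptions; the five-taxon distances and the function names are hypothetical.

```python
def to_nested(names, pairs):
    """Turn {(i, j): distance} into a symmetric nested dictionary."""
    d = {i: {} for i in names}
    for (i, j), value in pairs.items():
        d[i][j] = d[j][i] = value
    return d


def nj_step(names, d):
    """Pick the pair of neighbors to join in one NJ iteration.

    Returns (i, j, length_i, length_j), the lengths being the branch
    lengths from i and j to the new internal node.
    """
    n = len(names)
    # net divergence of each taxon from all the others
    r = {i: sum(d[i][j] for j in names if j != i) for i in names}
    best = None
    for x in range(n):
        for y in range(x + 1, n):
            i, j = names[x], names[y]
            corrected = d[i][j] - (r[i] + r[j]) / (n - 2)
            if best is None or corrected < best[0]:
                best = (corrected, i, j)
    _, i, j = best
    length_i = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
    length_j = d[i][j] - length_i
    return i, j, length_i, length_j


names = ["a", "b", "c", "d", "e"]
pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
joined = nj_step(names, to_nested(names, pairs))
print(joined)
```

In this example the pair joined is not the pair with the smallest raw distance (d and e), and the two branch lengths differ, which shows that NJ does not assume equal rates among lineages. When two pairs tie for the lowest corrected distance, this sketch simply keeps the first one found, which is the arbitrary choice criticised above.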
1.7.2.2. Character-based methods<br />
These methods are characterized by using multiple alignments directly for<br />
comparison of characters within each column (each site) in the alignment. Three main<br />
methods are applied in phylogeny: maximum parsimony (MP), maximum likelihood<br />
(ML) and Bayesian inference (BI).<br />
In the MP method, a given set of nucleotide (or amino acid) sequences is<br />
considered, and the nucleotides (or amino acids) of ancestral sequences for a<br />
hypothetical topology are inferred under the assumption that mutational changes occur<br />
in all directions among the four different nucleotides or the 20 amino acids. The<br />
smallest number of nucleotide substitutions that explains the entire evolutionary process<br />
for the given topology is then computed. This computation is done for all other<br />
topologies, and the topology that requires the smallest number of substitutions is chosen<br />
as the best tree. If there are no multiple substitutions at any site, MP is expected to<br />
generate the correct topology as long as enough parsimony-informative sites are<br />
examined. The disadvantage is that in practice nucleotide sequences are often<br />
subject to backward and parallel substitutions, and this introduces uncertainties in<br />
phylogenetic inference. When the true tree has a special type of topology and branch<br />
lengths, MP may generate an incorrect topology even if an infinite number of<br />
nucleotides is examined, and this can happen even if the rate of nucleotide substitution<br />
is constant for all evolutionary lineages (Felsenstein, 1978; Takezaki and Nei, 1994).<br />
Furthermore, in parsimony analysis it is difficult to treat the phylogenetic inference in a<br />
statistical framework because there is no natural way to compute the means and<br />
variances of minimum numbers of substitutions obtained by the parsimony procedure.<br />
However, under certain circumstances, MP is quite efficient in obtaining the correct<br />
topology. It should also be noted that MP is the only method that can easily take care of
INTRODUCTION<br />
insertions and deletions of nucleotides, which sometimes give important phylogenetic<br />
information (Nei, 1996).<br />
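The counting step of MP can be sketched as follows. The example uses Fitch's algorithm for unordered characters, which is one standard way to obtain the minimum number of changes on a fixed topology; the text above does not name a specific algorithm, so this choice is an assumption. The four-taxon alignment and the nested-tuple tree encoding are invented for illustration.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one column."""
    if isinstance(tree, str):  # a leaf: its observed state, no change needed
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:  # children agree on at least one state: no extra change
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all alignment columns."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


# Hypothetical alignment of four taxa and the three possible topologies.
alignment = {"A": "AAGT", "B": "AAGC", "C": "ACTT", "D": "ACTC"}
topologies = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}
scores = {name: parsimony_score(t, alignment) for name, t in topologies.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With four taxa all three topologies can be scored exhaustively, as done here; for larger data sets the number of topologies grows too fast, and heuristic searches are normally used instead.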
Cavalli-Sforza and Edwards (1967) were the first to propose the idea of using a<br />
ML method for phylogenetic inference from gene frequency data, but they encountered<br />
a number of problems when implementing this method. Later, considering nucleotide<br />
sequence data, an algorithm was developed for constructing a phylogenetic tree by the<br />
ML method (Felsenstein, 1981). In the case of amino acid sequence data, the likelihood<br />
method was proposed by using an empirical transition matrix for 20 different amino<br />
acids (Kishino et al., 1990). Later, this method was extended by using various transition<br />
matrices for nuclear and mitochondrial proteins (Adachi and Hasegawa, 1995; 1996).<br />
The first step is the selection of a substitution model, which gives the
instantaneous rates at which each of the four possible nucleotides (or 20 amino acids)
changes to each of the other three nucleotides (or 19 amino acids). Alignments
of homologous DNA and amino acid sequences can be examined under a wide range of
evolutionary models (JC69, K80, F81, F84, HKY85, TN93 and GTR for nucleotides, and
Dayhoff, JTT, mtREV, WAG and DCMut for amino acids) (Guindon et al., 2005). ML
looks for the tree that, under a given model of evolution, maximizes the likelihood of
observing the data. Because the likelihood is usually expressed as a log-likelihood, an
ML program seeks the tree with the largest log-likelihood. The advantages of the ML
method are that it allows users to specify the evolutionary model they want to use, and
that the likelihood of the resulting tree is known. A disadvantage is that the ML method
is considerably slower than either parsimony or NJ, and it is not difficult to exceed the
capacity of even the most up-to-date desktop computer (Hall, 2001).
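As a minimal illustration of the likelihood idea, the sketch below treats the simplest possible case: two aligned sequences under the JC69 model, where the only parameter is the distance d (expected substitutions per site). It is not a tree search and does not reproduce any program used in this work; the sequences and function names are invented for illustration.

```python
import math


def jc69_log_likelihood(seq1, seq2, d):
    """Log-likelihood of two aligned sequences separated by distance d (JC69)."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability that a site keeps its base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific different base
    total = 0.0
    for a, b in zip(seq1, seq2):
        # 0.25 is the equilibrium frequency of the base in the first sequence.
        total += math.log(0.25 * (p_same if a == b else p_diff))
    return total


def jc69_ml_distance(seq1, seq2):
    """Closed-form maximum-likelihood distance under JC69."""
    p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical sequences: 20 sites, 4 of which differ.
s1 = "ACGTACGTACGTACGTACGT"
s2 = "ACGTACGAACGTTCGTACCA"

# Brute-force search over a grid of distances, as a stand-in for optimisation.
grid = [i / 1000 for i in range(1, 1001)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(s1, s2, d))
print(best_d, jc69_ml_distance(s1, s2))
```

The grid search and the closed-form estimate agree to within the grid spacing, which shows that "maximizing the log-likelihood" and the familiar JC69 distance formula are the same operation in this two-sequence case.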
In BI, inferences of phylogenies are based upon the posterior probability
(PP) of phylogenetic trees. Bayesian analysis of phylogenies is similar to ML in that the
user postulates a model of evolution and the program searches for the best trees that are
consistent with both the model and the data (the alignment). It differs from
ML in that, while ML seeks the tree that maximizes the probability of observing the data
given the tree, Bayesian analysis seeks the tree that maximizes the probability of the
tree given the data and the model of evolution. Unlike ML, which seeks the single most
likely tree, Bayesian analysis searches for the best set of trees. The MrBayes program uses
the Metropolis-coupled Markov chain Monte Carlo (MCMCMC) method, which works
by proposing a new state (a new tree) under a substitution model and then calculating
the acceptance probability for this state (Hastings, 1970; Green, 1995). A random number
between 0 and 1 is drawn, and if that number is less than the calculated probability, the
new state is accepted; otherwise the state remains the same. The proposed new state
involves moving a branch and/or changing the length of a branch to create a modified tree.
If the new tree is more likely than the existing tree, given the model and the data, it is
more likely to be accepted. This process of proposing and accepting or rejecting new states
is repeated many thousands or millions of times (Hall, 2001; Huelsenbeck and
Ronquist, 2001). Although BI is computationally efficient compared with the other
character-based methods, some questions remain unresolved, such as the convergence of
the Markov chain and the discrepancy between Bayesian posterior probabilities and
nonparametric bootstrap values (Huelsenbeck et al., 2002).
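The propose/accept cycle described above can be sketched with a toy sampler. This is a plain single-chain Metropolis sampler with a symmetric proposal, not the Metropolis-coupled scheme with heated chains implemented in MrBayes, and the three "trees" and their unnormalised posterior weights are invented for illustration.

```python
import random


def metropolis_sample(weights, steps, seed=0):
    """Visit states in proportion to their unnormalised posterior weights."""
    rng = random.Random(seed)
    states = list(weights)
    current = states[0]
    counts = dict.fromkeys(states, 0)
    for _ in range(steps):
        # Symmetric proposal: pick one of the other states uniformly.
        proposal = rng.choice([s for s in states if s != current])
        accept_prob = min(1.0, weights[proposal] / weights[current])
        # Draw a number in [0, 1) and accept if it falls below accept_prob.
        if rng.random() < accept_prob:
            current = proposal
        counts[current] += 1
    return {state: n / steps for state, n in counts.items()}


# Hypothetical unnormalised posterior weights for three candidate trees.
weights = {"tree1": 6.0, "tree2": 3.0, "tree3": 1.0}
frequencies = metropolis_sample(weights, steps=100_000)
print(frequencies)
```

After many steps the visit frequencies approach the normalised weights (0.6, 0.3 and 0.1 here), which is why the frequency of a tree or clade in the sample can be read as an estimate of its posterior probability.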
1.8. Adaptive evolution<br />
Most amino acid substitutions are deleterious and are eventually eliminated from the<br />
population by purifying selection. Some amino acid substitutions are neutral and pure<br />
chance determines whether they will be fixed into the population. In rare instances, an<br />
amino acid substitution is beneficial and is fixed into the population by positive<br />
selection. At the nucleotide level, some base substitutions cause amino acid<br />
replacements and others are silent. The nucleotide position within a codon affects
whether an amino acid replacement takes place when the nucleotide changes. Most
mutations at the first position in a codon result in amino acid replacements; all
mutations at second positions are replacement mutations, but because of the redundancy
of the genetic code, many third-position mutations are silent. The number of
non-synonymous changes per non-synonymous site is called dN, and the number of
synonymous changes per synonymous site is called dS. When sequences are compared, a
dN/dS ratio below 1 is taken as evidence of purifying selection, whereas a ratio above 1
is taken as evidence of positive selection. To identify evidence of positive or purifying
selection, studies of adaptive evolution normalize the number of non-synonymous and
synonymous mutations to the number of non-synonymous and synonymous sites in the gene.
Several methods have been proposed to estimate the nonsynonymous and synonymous
rates of substitution by comparing nucleotide and amino acid sequences. The first
methods were based on estimating the numbers of synonymous and nonsynonymous
substitutions from the comparison of two coding DNA sequences (Miyata and Yasunaga,
1980; Li et al., 1985; Nei and Gojobori, 1986; Li, 1993; Pamilo and Bianchi, 1993). The
method of Nei and Gojobori (1986) is the simplest and performs a codon-by-codon
comparison. The model of Jukes and Cantor (1969) is applied to correct for multiple
substitutions that have occurred at the same site, and it normally produces results very
similar to those obtained by the Nei-Gojobori method.
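A simplified sketch of a codon-by-codon comparison in the spirit of the Nei-Gojobori approach is given below. It rests on several assumptions that should be read as such: the standard genetic code is used, changes to stop codons are counted as nonsynonymous, and codon pairs that differ at more than one position are rejected rather than averaged over mutational pathways as a full implementation would do. The two short sequences and all function names are invented for illustration.

```python
import math
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Expected number of synonymous sites (between 0 and 3) in one codon."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Assumption: a change to a stop codon counts as nonsynonymous.
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                total += 1.0 / 3.0
    return total


def jukes_cantor(p):
    """Jukes-Cantor correction of a proportion of differences."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(seq1, seq2):
    """Return (dN, dS, synonymous sites) for two aligned coding sequences."""
    codons1 = [seq1[i:i + 3] for i in range(0, len(seq1), 3)]
    codons2 = [seq2[i:i + 3] for i in range(0, len(seq2), 3)]
    # Site counts are averaged over the two sequences.
    syn_sites = sum(synonymous_sites(c) for c in codons1 + codons2) / 2.0
    nonsyn_sites = len(seq1) - syn_sites
    syn_diffs = nonsyn_diffs = 0
    for c1, c2 in zip(codons1, codons2):
        n_diff = sum(a != b for a, b in zip(c1, c2))
        if n_diff > 1:
            raise ValueError("codons differing at several positions are not handled")
        if n_diff == 1:
            if CODON_TABLE[c1] == CODON_TABLE[c2]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    d_s = jukes_cantor(syn_diffs / syn_sites)
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    return d_n, d_s, syn_sites


# Hypothetical sequences: one synonymous (TTT/TTC) and one replacement (AAA/AGA) change.
d_n, d_s, s_sites = dn_ds("ATGTTTCTAGGAAAA", "ATGTTCCTAGGAAGA")
print(d_n, d_s, d_n / d_s)
```

In this toy comparison dN is smaller than dS, so the ratio falls below 1, the pattern expected under purifying selection.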
Methods based on a likelihood approach were proposed for detecting positive
selection by comparison of multiple sequences. Their models of codon substitution
initially assumed a single ω = dN/dS for all lineages and sites and have been extended to
account for variation of ω, either among lineages or among sites (Goldman and Yang,
1994; Muse and Gaut, 1994). The lineage-specific models allow ω to vary among lineages
and are thus suitable for detecting positive selection along lineages. They assume no
variation in ω among sites. As a result, they detect positive selection for a lineage only
if the average dN over all sites is higher than the average dS (Muse and Gaut, 1994; Yang,
1998; Yang and Nielsen, 1998). The site-specific models allow the ω ratio to vary
among sites but not among lineages. Positive selection is detected at individual sites
only if the average dN over all lineages is higher than the average dS (Nielsen and Yang,
1998; Yang et al., 2000). The branch-site model allows the ω ratio to vary both among
sites and among lineages. This model assumes that positive selection occurs at a few
time points and affects a few amino acids (Yang and Nielsen, 2002).
The codon-based models that follow the likelihood approach include parameters for
transition-transversion bias and for codon frequencies. The Goldman and
Yang (1994) model also includes different replacement probabilities between
amino acids based on the Grantham (1974) physicochemical distance matrix, and
Bayesian models assume a prior distribution for the dN/dS ratio (Nielsen and Yang, 1998;
Yang et al., 2000). All these methods are based on theoretical assumptions and ignore the
empirical observation that distinct amino acids differ in their replacement rates.
Recently, a codon-based model named the mechanistic-empirical model (MEC) was
published that takes into account all the parameters mentioned above and
incorporates empirical amino acid replacement probabilities into the codon-substitution
matrix (Doron-Faigenboim and Pupko, 2007).
2. THESIS BACKGROUND AND OBJECTIVES<br />
Throughout the last decade, yeasts belonging to the genus Candida have emerged as
major opportunistic pathogens. They are broadly distributed in nature and are frequently
present as innocuous commensals of the skin, mouth, vagina and pharynx, as well as of
the urinary and gastrointestinal tracts, in humans and other warm-blooded animals
(Coleman et al., 1998; Pedrós, 2003). These yeasts have small cells about 4 to 6 µm in
diameter, form hyphae and pseudohyphae, and reproduce mainly by budding. As a relatively
harmless commensal organism, C. albicans exists in about 50% of the human population
(Odds, 1988; Calderone, 2002).
Candida species produce infections termed candidiasis. These infections can
be superficial, particularly in the mucosae (epithelium and endothelium), or
generalized, causing systemic candidiasis, also designated candidemia (Gudlaugsson
et al., 2003). These infections have increased over the last two decades owing to the
growing number of patients immunocompromised by hematological disorders, transplants,
chemotherapy and AIDS, among others. They are also frequent at the extremes of age,
as in the elderly and in neonates, particularly the underweight, and in women of
childbearing age (López Martínez et al., 1984; Musial et al., 1988; Horowitz et al.,
1992; Ball et al., 2004; Estrella et al., 2000; Clark et al., 2004; Chen et al., 2005;
Tavanti et al., 2005a). As opportunists, they can turn pathogenic under
certain conditions, becoming an important cause of infection (Coleman et al., 1998;
Pedrós, 2003). Infections caused by yeasts of the genus Candida have become the
fourth leading cause of invasive infections in the United States and the eighth in Europe
(Jarvis, 1995; Fluit et al., 2000).
Approximately 200 Candida species are known, of which nearly 20
have been identified as etiologic agents of infections, and the list of medically
important yeasts continues to grow (Calderone, 2002). Although Candida albicans has
been, in the past, the most common causative organism of fungemia and disseminated
candidiasis, other Candida species are becoming increasingly prevalent. A
worldwide study of bloodstream infections from 1997 to 1999 showed that at least 45%
of yeast infections were caused by Candida species other than C. albicans (Pfaller et al.,
2001). The most commonly isolated species, apart from C. albicans, are C. parapsilosis
(20 to 40% of all reported episodes of candidemia), C. tropicalis (10 to 30%), C. krusei
(10 to 35%), C. glabrata (5 to 40%), C. guilliermondii (2 to 10%) and C. lusitaniae (up
to 8%) (Sandven, 2000; Krcmery and Barnes, 2002). The emergence of these non-albicans
species is associated with inherent or acquired resistance to fluconazole in up
to 75% of all C. krusei infections and 35% of all C. glabrata infections; conversely, at
present C. lusitaniae remains highly susceptible to azole antifungal agents but is
resistant to amphotericin B (Wingard et al., 1991; Krcmery and Barnes, 2002; Pfaller
et al., 2003). C. dubliniensis has emerged as an opportunistic pathogen closely related to
C. albicans. It shares many diagnostic characteristics with C. albicans but differs with
respect to its epidemiology, virulence and ability to develop fluconazole resistance
(Sullivan and Coleman, 1997; Gutierrez et al., 2002). The incidence of C. parapsilosis
is associated with the handling of central venous catheters and with its ability to grow
as a biofilm on implanted medical devices (Baillie and Douglas, 1998; Kuhn et al., 2004). C.
tropicalis is associated with cancer patients, particularly because of its increased
invasiveness when the gastrointestinal mucosa is damaged and the intestinal flora is
suppressed by pharmacological treatments (Kullberg and Lashof, 2002).
Other species have emerged more recently, such as C. rugosa, C. famata, C. africana, C.
bracarensis, C. fabianii, C. ciferri, C. membranaefaciens, C. metapsilosis, C.
orthopsilosis, C. pseudohaemulonii and C. pseudorugosa. They affect mainly
immunocompromised patients, taking advantage of host weakness to pass from
commensal to infectious (Hazen, 1995; Tietz et al., 2001; García-Martos et al., 2004;
Fanci and Pecile, 2005; Tavanti et al., 2005b; Correia et al., 2006; Li et al., 2006;
Sugita et al., 2006; Valenza et al., 2006).
All Candida species belong to the subphylum Saccharomycotina, which is
included in the phylum Ascomycota. Both mitochondrial and nuclear genes have been
used to construct the phylogeny of the subphylum Saccharomycotina, using either a
single gene or multigenic analysis (Barns et al., 1991; Bowman et al., 1992;
Diezmann et al., 2004; Lutzoni et al., 2004; Lopandic et al., 2005; James et al., 2006).
Others have carried out phylogenomic analyses, with very good results, but the correct
placement of some species in the phylogeny remains unresolved owing to the lack of
other supporting sequenced genomes (Robbertse et al., 2006; Fitzpatrick et al., 2006).
The search for new genes and the evaluation of their usefulness in phylogenetic
studies are very important for understanding phylogenetic relationships and for assessing
their potential use in fungal molecular systematics. Genes that are considered good
candidates for use in phylogeny present specific features such as vertical transmission,
ubiquity and slowly evolving sites. Among such new genes, MADS-box
transcription factors and GPI-anchor proteins can be considered potential candidates, since
they present all the above characteristics. The MADS-box genes encode a eukaryotic family of
transcriptional regulators involved in diverse and important biological functions, and the
GPI-anchor proteins participate in the interaction with the environment.
The MADS-box proteins have been identified in yeasts, plants, insects,
nematodes, lower vertebrates and mammals. These proteins are characterized by
a conserved DNA-binding and dimerization domain named the MADS-box
(Schwarz-Sommer et al., 1990) after the five founding members of the family: Mcm1
(yeast) (Passmore et al., 1989), Arg80 (yeast) (Dubois et al., 1987) or Agamous (plant)
(Yanofsky et al., 1990), Deficiens (plant) (Sommer et al., 1990) and SRF (human)
(Norman et al., 1988). In animals and fungi, two distinct types of MADS-box genes have
been identified, the SRF-like (type I) and MEF2-like (type II) classes (Shore and
Sharrocks, 1995). The MEF2-like amino acid sequences are more closely related to
most plant MADS-domain sequences, suggesting that at least one gene-duplication
event occurred before the divergence of plants and animals (Alvarez-Buylla et al.,
2000). Thus, both type I and type II MADS-box proteins can be found in each kingdom.
The organization of type I and type II MADS-box proteins is represented in Figure 5.
Type I and type II proteins can be further classified into subfamilies on the basis of
sequence similarity in the extensions C-terminal to the highly conserved
MADS domain. These extension domains are referred to as SAM in fungal and animal
SRF-like proteins, as MEF2 in fungal and animal MEF2-like proteins and as I, K and C in
plant type II proteins (Mueller and Nordheim, 1991; Sharrocks et al., 1993; Riechmann et al.,
1996; Riechmann and Meyerowitz, 1997). MADS-box family members generally
recognize AT-rich consensus sequences with a highly conserved 10-bp core:
CC(A/T)₆GG, known as the CArG box, is the binding site of SRF-like proteins (Treisman,
1990), and CTA(A/T)₄TAG is the binding site of MEF2-like proteins (Figure 6)
(Pollock and Treisman, 1991).
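The two consensus cores translate directly into regular expressions, which gives a quick way to scan a DNA sequence for candidate binding sites. The sketch below is only a pattern search; the example sequence and the function name `find_sites` are invented, and a match says nothing about whether a site is actually bound in vivo.

```python
import re

# CC(A/T)6GG: the CArG box recognised by SRF-like (type I) proteins.
CARG_BOX = re.compile(r"CC[AT]{6}GG")
# CTA(A/T)4TAG: the core recognised by MEF2-like (type II) proteins.
MEF2_SITE = re.compile(r"CTA[AT]{4}TAG")


def find_sites(sequence):
    """Return 0-based start positions of both consensus cores in a sequence."""
    seq = sequence.upper()
    return {
        "CArG": [m.start() for m in CARG_BOX.finditer(seq)],
        "MEF2": [m.start() for m in MEF2_SITE.finditer(seq)],
    }


# Hypothetical sequence containing one copy of each core.
example = "GGCCATATATGGTTCTAAAAATAGCC"
print(find_sites(example))
```

Both patterns are their own reverse complement, so scanning a single strand is enough for these two cores.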
Figure 5. Schematic representation of Type I (SRF-like) and Type II (MEF2-like) protein domains in
plants, animals and fungi. The scale indicates the number of amino acids along the protein. Plant
Type II-like proteins have carboxyl-terminal domains that extend beyond 200 amino acids. In plant
Type I-like proteins, the "?" indicates carboxyl-terminal domains that are not yet well defined and
are of variable length (based on Alvarez-Buylla et al., 2000).
In the yeast Saccharomyces cerevisiae, four MADS-box proteins have been
found: Mcm1 and Arg80, which are related to the human SRF, and Rlm1 and Smp1,
which belong to the MEF2-like family (Alvarez-Buylla et al., 2000). Rlm1 controls the
expression of genes required for cell wall integrity, while Smp1 is involved in the osmotic
stress response mediated by the Hog1 signal transduction pathway (Watanabe et al.,
1995; Dodou and Treisman, 1997; de Nadal et al., 2003). Arg80 is only required for
repression of arginine anabolic genes and induction of arginine catabolic genes
(Messenguy and Dubois, 2000), whereas Mcm1, which is also involved in the control of
arginine metabolism, plays a pleiotropic role in the cell (Messenguy and Dubois, 1993). Mcm1
is essential for cell viability and controls M/G1 and G2/M cell-cycle-dependent
transcription (Althoefer et al., 1995; McInerny et al., 1997), mating (Jarvis et al., 1989),
minichromosome maintenance (Passmore et al., 1988), recombination (Elble and Tye,
1992), transcription of TY elements (Errede, 1993; Yu and Fassler, 1993) and
osmotolerance (Kuo et al., 1997).
Figure 6. Structure of Type I (SRF) and Type II (MEF2A) MADS domains in complex with their DNA
target (Homo sapiens). (a) Aligned sequences of SRF and MEF2A MADS domains with α and
β secondary structure assignments. (b) The SRF-core DNA complex with one monomer in red
and the other monomer in green (taken from Pellegrini et al., 1995). DNA is in grey, with
upper and lower DNA strands labelled W and C. (c) SRF DNA sequence (18 bp) with the
palindromic CArG recognition element boxed. (d) MEF2A–DNA complex with one monomer
in red and the other monomer in green (adapted from Santelli and Richmond, 2000). DNA is in
grey. (e) MEF DNA sequence (17 bp) in the crystal structure with the MEF2 consensus site boxed
(taken from Santelli and Richmond, 2000).
Other type I MADS-box proteins have also been studied. For instance, in
Schizosaccharomyces pombe, the MAP1 gene encodes a protein required for
cell-type-specific gene expression; in Ustilago maydis, UMC1 encodes a protein that
regulates the expression of pheromone-inducible genes (Yabana and Yamamoto, 1996;
Kruger et al., 1997); and in C. albicans, CaMCM1 is crucial for morphogenesis (Rottmann
et al., 2003). In the case of type II MADS-box proteins, putative RLM1 orthologues were
identified in C. albicans, Paracoccidioides brasiliensis and Aspergillus niger (Damveld
et al., 2005; Fernandes et al., 2005; Bruno et al., 2006).
GPI-anchor proteins (GpiPs) contain a C-terminal signal sequence that allows
linkage to a glycosylphosphatidylinositol (GPI) anchor. Proteins destined to be GPI
anchored share conserved features: an N-terminal signal sequence for localization to the
endoplasmic reticulum (ER), a C-terminal hydrophobic domain (9 to 24 residues) for
transient attachment to the ER membrane, and the so-called ω site, where the protein is
cleaved and ligated to the GPI anchor (Thomas et al., 1990). The ω site is located 9
to 10 amino acids before the C-terminal hydrophobic domain, and its amino acid
environment determines whether the protein has a high probability of being GPI anchored.
Amino acids at positions ω-1 to ω-11 form a linker region, usually with no charge and
no secondary structure (α-helix or β-sheet). The ω site region is composed of small
amino acids that fit the catalytic pocket of the transamidase protease, and, finally, the
spacer region (ω + 3 to ω + 9) is comparable to the linker, having no charge and being flexible
(Eisenhaber et al., 2004). Shortly after protein synthesis in the ER, the preformed GPI
anchor replaces the C-terminal transmembrane region (Figure 7).
Figure 7. GPI anchor structure. The core GPI anchor consists of a lipid group (serving
as the membrane anchor), a myoinositol group, an N-acetylglucosamine group, three
mannose groups and a phosphoethanolamine group, which ultimately
connects the GPI anchor to the protein via an amide linkage, represented in black (Tiede et
al., 1999). The addition of mannose groups and the positioning of side chains such as
phosphoethanolamine on the GPI anchor contribute to the variety of GPI anchors identified
so far. The additional gray groups illustrate the side chains added to the GPI anchor in S.
cerevisiae (side chains are specific to each organism).
GPI anchoring is encountered in every eukaryotic cell, from unicellular yeast
cells to highly specialized mammalian cells (Bowman et al., 2006; Ferguson, 1999;
McConville and Menon, 2000). In S. cerevisiae, most GPI-anchor proteins are involved in
the biosynthesis of cell wall compounds, flocculation, protease activity, sporulation and
mating (Caro et al., 1997). These proteins have also been found in A. nidulans, C. albicans, C. glabrata,
Neurospora crassa and Schizosacch. pombe (Caro et al., 1997; de Groot et al., 2003;<br />
Eisenhaber et al., 2003; Eisenhaber et al., 2004; Sundstrom, 2002; Weig et al., 2004). It<br />
is notable that S. cerevisiae and Schizosacch. pombe seem to possess a smaller number<br />
of GpiPs than C. albicans.<br />
The IFF8 gene encodes a putative GPI-anchored protein of unknown function that
belongs to a group of genes known as the IFF genes (for individual protein
file family F). Analysis of the coding sequences in this family revealed gene products
with a large, highly homologous N-terminal domain of 340 amino acids shared by every
member of the family. The proteins diverge strongly after this domain, both in
size and in sequence, including the C-terminal hydrophobic domain that allows linkage
to a glycosylphosphatidylinositol (GPI) anchor (Richard and Plaine, 2007).
Objectives<br />
Currently, protein-coding genes are used to resolve deep-level phylogenetic
relationships through multigenic and phylogenomic analyses, but the majority of these
analyses use only regions with similarity among orthologous gene sequences and do
not take into account the other regions, which could provide useful information to infer a
more robust phylogeny. Hence, the main purpose of this thesis was to evaluate the use
of the MADS-box transcription factors in fungal phylogeny and the use of the IFF8 gene to
help resolve the phylogeny of the CUG group in particular.
Therefore, the specific aims of the present work were:<br />
* Search for MADS-box and IFF8 orthologue genes in the kingdom Fungi.<br />
* Analyze the potential use of these genes to infer phylogenetic relationships
according to the current proposal for the kingdom Fungi.<br />
* Infer the phylogeny of Candida species based on a concatenated alignment of the
previously predicted genes.<br />
* Analyze the putative adaptive evolution of the RLM1 gene in the subphylum
Saccharomycotina, particularly in the whole-genome duplication (WGD) group.<br />
* Identify the molecular mechanism that putatively changed the RLM1 gene in the
subphylum Saccharomycotina.<br />
3. MATERIAL AND METHODS<br />
3.1. Phylogenetic analysis<br />
3.1.1. Data retrieval<br />
Sequences used in this study were obtained from several different databases (DB).
The Rlm1 and Mcm1 protein and DNA sequences from Saccharomyces cerevisiae, as
well as the Iff8 protein and DNA sequences of Candida albicans, were used as
references. The S. cerevisiae Rlm1 and Mcm1 protein/gene sequences were obtained
from GenBank under accession numbers BAA09658/D63340 and
CAA88409/Z48502, respectively. The Iff8 protein/gene sequences of C. albicans were
obtained from the Candida Database under orf19.8201.
To obtain both nucleotide and amino acid putative orthologue sequences for the 76
fungal species, the protein sequences referred to above were used as queries in
TBLASTN searches (BLOSUM62 matrix, E-value cut-off of 10⁻⁵) in the following
databases: Candida DB (http://www.candidagenome.org), Saccharomyces DB
(http://www.yeastgenome.org), Broad Institute (http://www.broad.mit.edu), Joint
Genome Institute (JGI, http://www.jgi.doe.gov), Genolevures (http://cbi.labri.fr), The
Institute for Genomic Research (TIGR, http://www.tigr.org), Washington University St.
Louis (WUSTL, http://www.wustl.edu), National Center for Biotechnology Information
(NCBI, http://www.ncbi.nlm.nih.gov), Sanger Institute (http://www.sanger.ac.uk),
Baylor College of Medicine (http://www.hgsc.bcm.edu), Oklahoma University
(http://www.genome.ou.edu) and Murcia University (http://mucorgen.um.es). To avoid
errors in the subsequent alignments, partial orthologue sequences obtained in this
search were completed with information from the closest species. Animal and
plant MADS-box sequences were used as outgroups.
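The filtering step implied by the E-value cut-off can be sketched as below. The searches in this work were run through database web interfaces, so the sketch is only an illustration under assumptions: it supposes that hits are available in the 12-column tab-separated layout written by NCBI BLAST+ (query, subject, identity, length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score), and that the cut-off means hits with an E-value at or below 10⁻⁵ are retained. The hit rows and the function name `best_hits` are invented.

```python
def best_hits(tabular_text, max_evalue=1e-5):
    """Keep, for each subject sequence, the best-scoring hit under the cut-off."""
    best = {}
    for line in tabular_text.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject = fields[1]
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue > max_evalue:
            continue  # hit is not significant enough to be retained
        if subject not in best or bitscore > best[subject][1]:
            best[subject] = (evalue, bitscore)
    return best


# Hypothetical hits for an Rlm1 protein query against two contigs.
rows = [
    ("Rlm1", "contigA", "62.0", "150", "50", "2", "1", "150", "3000", "3449", "2e-40", "160"),
    ("Rlm1", "contigA", "35.0", "60", "35", "1", "200", "259", "9000", "9179", "0.5", "30"),
    ("Rlm1", "contigB", "30.0", "80", "50", "3", "10", "89", "120", "359", "1e-3", "45"),
]
table = "\n".join("\t".join(row) for row in rows)
print(best_hits(table))
```

Only the first hit survives the filter; the two weaker hits have E-values above the cut-off and would not be considered putative orthologues.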
Outgroup protein/gene sequences were retrieved from the GenBank database under
the following accession numbers. Type II: Homo sapiens MEF2D
NP_005911/NM_005920, Xenopus laevis MEF2 AAI12917/BC112916, Danio rerio
MEF2D AAH98522/BC098522, Drosophila melanogaster DMEF2
NP_477021/NM_057673, Caenorhabditis elegans CEMEF2 NP_492441/NM_060040,
Arabidopsis thaliana APETALA1 NP_177074/NM_105581 and Oryza sativa
AGAMOUS ABG21913/DP000011; Type I: Homo sapiens SRF
NP_003122/NM_003131, Xenopus laevis SRF NP_001095218/NM_001101748, Danio
rerio MEF2D NP_001103996/NM_001110526, Drosophila melanogaster DSRF
NP_726438/NM_166667, Caenorhabditis elegans NP_492226/NM_059895,
Arabidopsis thaliana AGL36 NP_850880/NM_180549 and Oryza sativa AGL35
BAD81343/AP002070. No outgroup was used in the Iff8 protein and gene analyses.
3.1.2. Sequence alignments<br />
The sequences obtained were aligned using the MUSCLE program (Edgar, 2004)
and then imported into MEGA 4.0 (Tamura et al., 2007) to be manually edited.
The alignments were based on amino acids; nucleotides were therefore aligned as
codons. As some parts of the C-terminal regions were too divergent to be confidently
aligned, a preliminary NJ tree in Clustal W (Thompson et al., 1994) and a UPGMA tree
in MUSCLE were produced for each phylum, aligning the conserved C-terminal
domains. After this preliminary analysis, the alignment of the less-conserved C-terminal
regions from different phyla became much easier, and a new, final complete alignment
was produced.
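Aligning nucleotides as codons on the basis of an amino acid alignment amounts to threading each coding sequence onto its aligned protein, inserting three gap characters wherever the protein has a gap. The sketch below shows this operation in isolation; it assumes the coding sequence has no stop codon and matches the protein exactly, and the function name and example sequences are invented. It does not reproduce the internal procedure of MUSCLE or MEGA.

```python
def thread_codons(aligned_protein, cds):
    """Return the codon alignment implied by an aligned protein sequence."""
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) != 3 * n_residues:
        raise ValueError("coding sequence length does not match the protein")
    # Generator that yields the coding sequence one codon at a time.
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    # Each protein gap becomes a three-nucleotide gap; each residue takes a codon.
    return "".join("---" if aa == "-" else next(codons) for aa in aligned_protein)


# Hypothetical aligned protein with one gap and its unaligned coding sequence.
print(thread_codons("M-FK", "ATGTTTAAA"))
```

Because gaps are always inserted in multiples of three, the reading frame is preserved across the whole alignment, which is what later codon-based analyses of dN and dS require.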
3.1.3. Selection of the best evolutionary model to infer phylogeny<br />
The Akaike Information Criterion (AIC) model selection approach was used to
estimate the best-fit protein evolutionary model for ML. The ProtTest 1.4 program was
used, and the best model was selected as the one that presented the greatest likelihood
value among the 48 models evaluated by the program (Abascal et al., 2005). The same
approach was used to estimate the best-fit DNA evolutionary model for ML, using
the jModeltest 1.0 program (Posada, 2008) for likelihood calculations; the best
model was again the one that presented the greatest likelihood value among the 88 models
of evolution evaluated. For Bayesian inference, model selection was also
performed, using the four hierarchies of the likelihood ratio test to determine the simplest
and most appropriate DNA evolutionary model with the MrModeltest 2.3 (Nylander, 2004)
and PAUP 4.0b10 (Swofford, 2002) programs, together with the required
Mrmodelblock file (Annex I).
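For reference, the AIC score of a model is AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; the model with the lowest AIC is preferred. The sketch below applies this formula to three candidate models. The log-likelihoods and parameter counts are invented for illustration and are not results from this work, and the sketch does not reproduce the output of ProtTest or jModeltest.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical (log-likelihood, number of free parameters) for three models.
candidates = {
    "JC69": (-5230.4, 10),
    "HKY85": (-5105.2, 14),
    "GTR": (-5101.9, 18),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best_by_aic = min(scores, key=scores.get)
best_by_likelihood = max(candidates, key=lambda name: candidates[name][0])
print(scores)
print(best_by_aic, best_by_likelihood)
```

In this invented example the richest model has the highest likelihood, but its extra parameters are penalized, so a simpler model obtains the lowest AIC. The two ranking rules can therefore disagree, although in many data sets they pick the same model.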
3.1.4. Phylogenetic reconstruction<br />
Phylogenetic analysis was performed by ML in the PHYML 3.0 program (Guindon and
Gascuel, 2003) for each protein and/or gene alignment and for each concatenated protein
and/or gene alignment previously obtained. To obtain the phylogenetic tree, the best evolutionary
model determined previously was used, and the branch/nodal supports were
estimated with the approximate likelihood ratio test using the Shimodaira-Hasegawa-like
support option (Anisimova and Gascuel, 2006).
Bayesian inference (BI) was performed with the MrBayes 3.2 program<br />
(Huelsenbeck and Ronquist, 2001) for each alignment previously obtained. As for ML,<br />
the best evolutionary model determined by the previous analysis was used to obtain the<br />
phylogenetic tree by Bayesian inference, and the branch/nodal support,<br />
given as posterior probability (PP) values in this analysis, was calculated with<br />
the Metropolis-coupled Markov chain Monte Carlo method after sampling a large<br />
number of different trees. To infer the fungal phylogeny using the RLM1 gene, four chains were<br />
run for 4,000,000 generations. For the Saccharomycotina phylogeny inferred with the RLM1<br />
gene, four chains were run for 800,000 generations, and for the phylogeny inferred with the<br />
MCM1 gene, four chains were run for 1,250,000 generations. The Saccharomycotina phylogeny with<br />
the RLM1+MCM1 genes was inferred by running four chains for 5,500,000 generations. To<br />
infer the CUG group phylogeny, four chains were run for 1,000,000 generations using the IFF8 gene,<br />
and 2,000,000 generations were run using the RLM1+MCM1+IFF8 genes. Trees were<br />
sampled every 100 generations, and trees obtained before the convergence of the<br />
Markov chain were not included in the consensus tree. The remaining samples were<br />
used to construct the consensus tree, and the posterior probability values of the branches/nodes<br />
were converted from frequencies to percentages by importing the trees into PAUP 4.0b10. In BI,<br />
these percentages for the consensus tree are roughly equivalent to the support obtained from an ML search with<br />
bootstrap analysis (Huelsenbeck and Ronquist, 2001).<br />
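The burn-in and frequency-to-percentage steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `clade_support`, the representation of each sampled tree as a set of clades, and the toy data are hypothetical, and the sketch does not reproduce the MrBayes or PAUP implementations.<br />

```python
from collections import Counter


def clade_support(sampled_trees, burn_in):
    """Percentage of post-burn-in trees that contain each clade.

    sampled_trees: list of trees in sampling order; each tree is a set of
    clades, and each clade is a frozenset of taxon names (hypothetical format).
    burn_in: number of initial samples discarded as pre-convergence.
    """
    kept = sampled_trees[burn_in:]
    if not kept:
        raise ValueError("burn-in discards every sampled tree")
    counts = Counter(clade for tree in kept for clade in tree)
    # Convert clade frequencies to percentages of the retained sample
    return {clade: 100.0 * n / len(kept) for clade, n in counts.items()}


# Toy sample of five trees; the first one is treated as burn-in
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
ac = frozenset({"A", "C"})
trees = [{ac}, {ab, cd}, {ab, cd}, {ab}, {ab, cd}]
support = clade_support(trees, burn_in=1)
```

In this toy sample the clade {A, B} appears in all four retained trees and {C, D} in three of them, so their support values are 100% and 75%.<br />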
To visualize the degree of phylogenetic conflict within concatenated alignments, a<br />
phylogenetic network was generated using the NeighborNet method in the<br />
SPLITSTREE 4.1 program (Huson and Bryant, 2006), with 1000 bootstrap replicates.<br />
3.1.5. Interpretation of values considered for nodal support<br />
Previous studies have demonstrated that, to assess nodal support in<br />
phylogenetic trees, the information from posterior probabilities and bootstrap analysis<br />
should be combined (Alfaro et al., 2003; Douady et al., 2003; Reeb et al., 2004).<br />
Bayesian methods are more efficient in recovering accurate nodal support values, but<br />
some authors have indicated that they could be less conservative than bootstrap (Suzuki<br />
et al., 2002; Alfaro et al., 2003), unlike the Shimodaira-Hasegawa-like (S-H) approach used in<br />
the ML analyses, which is more conservative (http://atgc.lirmm.fr/alrt). Therefore, a<br />
combination of both posterior probabilities and Shimodaira-Hasegawa-like support<br />
was used to assess the level of confidence of a specific node in the phylogenetic DNA<br />
trees. Throughout this work, the following scale was used:<br />
- High (strong) support - PP ≥ 95% and S-H ≥ 0.70;<br />
- Medium (moderate) support - PP ≥ 95% and 0.70 > S-H ≥ 0.50, or PP < 95%<br />
and S-H ≥ 0.70;<br />
- Low (poor or weak) support - PP ≥ 95% and S-H < 0.50, or PP < 95% and 0.70<br />
> S-H ≥ 0.50;<br />
- No support - PP < 95% and S-H < 0.50.<br />
In protein phylogenetic trees only the Shimodaira-Hasegawa-like support (S-H) was<br />
considered, using the same scale values.<br />
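The scale above for DNA trees can be written as a small decision function. This is a minimal sketch; the function name `nodal_support` and the returned labels are chosen here for illustration and are not part of the original analysis.<br />

```python
def nodal_support(pp, sh):
    """Classify a node of a DNA tree using the scale described in the text.

    pp: Bayesian posterior probability in percent (0-100).
    sh: Shimodaira-Hasegawa-like support from the ML analysis (0-1).
    """
    pp_ok = pp >= 95.0
    if sh >= 0.70:
        return "high" if pp_ok else "medium"
    if sh >= 0.50:
        return "medium" if pp_ok else "low"
    return "low" if pp_ok else "none"
```

For example, a node with PP = 100% and S-H = 0.24 is classified as low support, whereas PP = 98% and S-H = 0.67 gives medium support.<br />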
3.2. Analysis of adaptive evolution<br />
We applied the approach of Yang and coworkers (Yang et al., 2000; Swanson et al.,<br />
2001) to test for amino acids under positive selection in the Rlm1 protein. To determine<br />
which model best fitted the data, likelihood ratio tests (LRTs) were performed by<br />
comparing the differences in log likelihood values between two models using the χ²<br />
distribution. The codon-substitution models M0, M1a, M2a, M3, M7, and M8, which use a<br />
statistical distribution to describe the random variation among codon sites (ω=dN/dS),<br />
were used (Nielsen and Yang, 1998; Wong et al., 2004; Yang et al., 2005). We used<br />
LRTs to make three comparisons to find out whether positive selection has played a role in<br />
the molecular evolution of this gene: (i) the one-ratio model (M0) was compared with<br />
the discrete model (M3), (ii) the neutral model (M1a) was compared with the selection<br />
model (M2a), and (iii) the beta model (M7) was compared with the beta and ω model<br />
(M8).<br />
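A likelihood ratio test of this kind can be sketched as below. The degrees of freedom in the comments (4 for M0 vs M3 with three site classes, 2 for M1a vs M2a and for M7 vs M8) are the values commonly used for these comparisons and are an assumption here, since the text does not state them; the log-likelihood values are invented for illustration. The closed-form survival function is valid only for even degrees of freedom, which covers these three comparisons.<br />

```python
import math


def chi2_sf_even(x, df):
    """P(X >= x) for a chi-square variable with an even number of df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the LRT statistic 2*(lnL_alt - lnL_null) and its p-value."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf_even(stat, df)


# Hypothetical log-likelihoods for a nested pair such as M7 (null) vs M8
# (assumed df = 2); these numbers are not results from the thesis.
stat, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
```

A p-value below the chosen significance level indicates that the model allowing positive selection fits the data significantly better than its null model.<br />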
To identify particular sites in the gene that were likely to have evolved under<br />
positive selection, the Bayes Empirical Bayes (BEB) analysis was used (Yang and<br />
Bielawski, 2000; Yang et al., 2005). This estimates the posterior probability (PP) values<br />
at each site. If a particular codon site presents a PP value higher than 95% and ω higher<br />
than 1, that site is inferred to be under positive selection (Nielsen and Yang, 1998; Yang<br />
and Bielawski, 2000; Yang et al., 2005). These calculations were performed using<br />
the CODEML program in PAML 4.1 (Yang, 1997) and the HYPHY 1.0 program<br />
(Kosakovsky et al., 2005) to analyze positive selection in the MADS-box orthologues<br />
of the RLM1 gene in plants, animals and fungi, and in the entire RLM1 gene within the<br />
Saccharomycotina group.<br />
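The site-level criterion described above (PP higher than 95% and ω higher than 1) amounts to a simple filter. The sketch below assumes a hypothetical list of per-site tuples; it does not parse CODEML output, and the example values are invented.<br />

```python
def positively_selected_sites(sites, pp_cutoff=0.95):
    """Return codon positions with PP > pp_cutoff and omega > 1.

    sites: iterable of (position, posterior_probability, omega) tuples,
    with the posterior probability expressed as a fraction (0-1).
    """
    return [pos for pos, pp, omega in sites if pp > pp_cutoff and omega > 1.0]


# Invented per-site values, for illustration only
example_sites = [(12, 0.99, 3.2), (45, 0.80, 2.5), (78, 0.97, 0.6), (101, 0.96, 1.4)]
selected = positively_selected_sites(example_sites)
```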
A seventh model, the mechanistic empirical combined (MEC) model, was further<br />
used to test for positive selection in the subphylum Saccharomycotina (Doron-Faigenboim<br />
and Pupko, 2006). This model was statistically tested by comparing its second-order<br />
(corrected) Akaike Information Criterion (AICc) score with that of model M8a (Wong<br />
et al., 2004). The calculations were performed with the SELECTON 2.4 program (Stern<br />
et al., 2007).<br />
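The AICc comparison can be illustrated with the standard formula AICc = -2 lnL + 2k + 2k(k+1)/(n-k-1), where k is the number of free parameters and n the sample size. In the sketch below, treating the number of alignment positions as n is an assumption, and the values in the usage example are invented.<br />

```python
def aicc(lnl, k, n):
    """Second-order (corrected) Akaike Information Criterion.

    lnl: maximized log-likelihood of the model.
    k: number of free parameters.
    n: sample size (assumed here to be the number of alignment positions).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return -2.0 * lnl + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


# Hypothetical comparison: the model with the lower AICc is preferred
score_a = aicc(lnl=-100.0, k=2, n=10)
score_b = aicc(lnl=-99.5, k=4, n=10)
preferred = "A" if score_a < score_b else "B"
```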
3.3. Analysis of the repetitive region in the subphylum Saccharomycotina<br />
To identify the putative molecular mechanism that contributed to the divergence of<br />
the MADS-box RLM1 gene in the subphylum Saccharomycotina, the C-terminal region was<br />
analysed. The three reading frames of the sequences that presented a repetitive region in<br />
the C-terminal region were analyzed with the VIRTUAL RIBOSOME 1.1 program<br />
(Wernersson, 2006) to determine the possible amino acids encoded by that region. The<br />
corresponding ancestral RLM1 gene, from which the most recent Saccharomycotina<br />
species reported in this study diverged, was reconstructed with the MrBayes 3.2 program.<br />
The Saccharomycotina Rlm1 protein sequences were aligned with this ancestral<br />
sequence to identify the presence of an ancestral repetitive region in the C-terminal region.<br />
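Translating a sequence in its three forward reading frames can be sketched as follows. This is an illustration only and does not reproduce VIRTUAL RIBOSOME; the function name and the `ctg_as_serine` switch (which reads CTG as serine, as in the alternative yeast nuclear code used by the CUG group) are assumptions, since the text does not state which translation table was applied.<br />

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_frames(dna, ctg_as_serine=False):
    """Translate a DNA sequence in the three forward reading frames."""
    table = dict(STANDARD)
    if ctg_as_serine:
        table["CTG"] = "S"  # CUG-group reassignment (assumed option)
    dna = dna.upper().replace("U", "T")
    frames = []
    for offset in range(3):
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        # Codons with ambiguous bases are reported as X
        frames.append("".join(table.get(c, "X") for c in codons))
    return frames


# A short CAG repeat read in the three frames
frames = translate_frames("ATGCAGCAGCAGTAA")
```

The example shows how the same repeat encodes a run of glutamines, serines or alanines depending on the frame, which is why all three frames are inspected for a repetitive region.<br />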
4. RESULTS AND DISCUSSION<br />
4.1. Compilation of data<br />
The fungal Rlm1 and Mcm1 orthologue protein/gene sequences identified in this<br />
analysis consisted of 76 putative orthologues for each protein. Of these, 47 putative Rlm1 orthologue<br />
sequences were directly identified in the databases, 22 were predicted by homology and<br />
7 presented partial sequences (Annex II). Regarding Mcm1, 60 putative orthologue<br />
sequences were found in the databases, 14 were predicted by homology and 2 presented<br />
partial sequences (Annex III). In the case of the Iff8 protein/gene only 9 putative orthologue<br />
sequences were identified (Annex IV). Orthologue sequences of other MADS-box proteins, Smp1<br />
and Arg80, were also identified in the Saccharomyces sensu<br />
stricto and Saccharomyces sensu lato groups, but were not used in this study (Annexes<br />
V and VI).<br />
The presence of introns was detected only in the RLM1 and MCM1 genes (Annexes II<br />
and III); IFF8 sequences did not contain introns (Annex IV). The number of introns<br />
varied in the RLM1 and MCM1 genes, from none to 8 introns. The highest<br />
variability in the number of introns was observed within Mucoromycotina and<br />
Basidiomycota, while the Ascomycota group presented a more constant pattern, from 0<br />
to 2 introns. In view of these facts, there seems to be a bias towards intron loss, not gain, in MADS-box<br />
transcription factors in the kingdom Fungi, as<br />
evidenced by the presence of at most 2 introns in the most recently diverged<br />
phylum Ascomycota, unlike the older phyla, which presented a varied number that could<br />
reach eight introns.<br />
In the present work, the results suggested that during the divergence of fungal<br />
species intron loss would have occurred towards the 3' end of RLM1, because in some species<br />
that diverged recently the intron at the 5' end, inside the MADS-box of this gene, is the<br />
only one maintained. In MCM1 a bias to lose introns at both the 5' and the 3' ends was<br />
identified but, as in the RLM1 gene, one intron was maintained inside the MADS-box.<br />
Curiously, the first intron towards the 5' end was located inside the MADS-box,<br />
a feature present in all the genes with introns. In the case of MCM1 this was also true:<br />
the first intron is also present inside the MADS-box, although not at the 5' end, since the<br />
MADS-box is in the middle of the gene.<br />
It is known that in intron-rich organisms introns are evenly distributed within the<br />
coding sequences, but are biased toward the 5' end of genes in intron-poor organisms<br />
(Nielsen et al., 2004). Current models attribute this bias to intron loss at the 3' end owing to a<br />
poly-adenosine-primed reverse transcription mechanism, which was demonstrated in<br />
experiments with intron-containing Ty elements in yeast (Fink, 1987; Boecke et al.,<br />
1985). However, in a study carried out in different fungal species no increased<br />
frequency of intron loss toward the 3' end of genes was found; the loss occurred in the<br />
middle of the genes, suggesting either other mutational mechanisms (e.g., reverse<br />
transcription primed internally) or the presence of selective pressure to preferentially<br />
conserve introns near the 5' and 3' ends of genes (Nielsen et al., 2004). To date, the<br />
mechanism by which introns are inserted into or deleted from gene loci is not well<br />
understood.<br />
4.2. Phylogeny based on MADS-box transcription factors<br />
4.2.1. Rlm1 transcription factor<br />
Fungal phylogeny based on the Rlm1 transcription factor was inferred using both<br />
amino acid and DNA sequences. The phylogeny based on amino acid sequences was<br />
inferred only by maximum likelihood (ML) using the PHYML 3.0 program. From<br />
the various models of evolution tested, the Jones, Taylor and Thornton (JTT) model was<br />
the one selected, including the options to calculate the proportion of invariant sites, the<br />
gamma distribution and the frequency of amino acids. In the phylogeny based on<br />
nucleotides the General Time Reversible (GTR) model, including the proportion of<br />
invariant sites and the gamma distribution options, was selected, both for ML<br />
analysis and Bayesian inference.<br />
The phylogeny based on protein sequences resolved the Fungi as a clade strongly<br />
supported by a Shimodaira-Hasegawa-like (S-H) value of 0.96 in the ML analysis.<br />
However, the resolution of Fungi into a clade based on DNA sequences revealed the<br />
existence of a conflict. The S-H value obtained was 0.24 for ML and the<br />
Bayesian posterior probability (PP) was 100%, which according to the scale means low nodal support<br />
(Figures 8 and 9). This result could be due to the low number of RLM1 DNA sequences<br />
from species other than fungi that were used in this study. Despite the low nodal<br />
support in the phylogeny based on DNA sequences, the topology obtained is strongly<br />
supported by previous studies (Wainright et al., 1993; Baldauf and Palmer, 1993).<br />
The Mucoromycotina ‘Zygomycota’ are primarily coenocytic (sometimes producing<br />
cell septa) and undergo sexual reproduction by formation of a thick-walled resting spore<br />
called a zygospore, which clearly distinguishes them from the Basidiomycota and Ascomycota<br />
(Dikaryomycota) and other phyla. Morphologically, the Ascomycota and Basidiomycota<br />
both possess regularly septate hyphae and a dikaryotic life stage but differ in the<br />
structures involved in meiosis and sporulation.<br />
The fungal phylogeny inferred by these analyses clearly placed the 76 fungal<br />
species into one subphylum incertae sedis, the Mucoromycotina ‘Zygomycota’, and two<br />
well supported phyla, the Basidiomycota and the Ascomycota (Figures 8 and 9). The<br />
Mucoromycotina ‘Zygomycota’ formed a clade moderately supported by protein (S-H<br />
value of 0.64) as well as by nucleotide analysis (S-H 0.69 ML and 99% PP). Both<br />
Ascomycota and Basidiomycota formed clades highly supported by S-H values of 0.84<br />
and 0.91, respectively, in the protein analysis, but these clades received medium and low<br />
support in the nucleotide analyses, with S-H values of 0.67 and 0.57 and PP values of 98%<br />
and 61%, respectively. Based on the literature, a sister relationship between the<br />
Basidiomycota and Ascomycota (the “Dikaryomycota”) has been proposed, and this group was<br />
recently recognized as the subkingdom Dikarya (James et al., 2006; Hibbett et al.,<br />
2007). The results obtained in these analyses highly support this sister relationship (S-H<br />
= 0.98 ML in protein, S-H= 0.97 ML and PP = 100% in nucleotides).<br />
In the present analysis three species belonging to the subphylum Mucoromycotina<br />
‘Zygomycota’ were used, two of which, Mucor circinelloides and Rhizopus<br />
oryzae, are sisters joined by a well-supported node in the phylogenetic analysis using<br />
nucleotide (SH=0.87, PP=99%) and amino acid (SH=0.92) sequences (Figures 8 and<br />
9). This result agrees with the current taxonomic classification in which these two<br />
species form part of the family Mucoraceae (Voigt and Wöstemeyer, 2001). The other<br />
species included in this subphylum, Phycomyces blakesleeanus, is a member of the family<br />
Phycomycetaceae. The topology obtained is in agreement with previous results, since<br />
both the Mucoraceae and Phycomycetaceae families are part of the order Mucorales<br />
within the subphylum Mucoromycotina (Hibbett et al., 2007).<br />
Figure 8. Fungal phylogeny based on Rlm1 protein sequences. This phylogeny was obtained by<br />
maximum likelihood. The nodes are supported by Shimodaira-Hasegawa-like support. Species<br />
with partial sequences are: Kluyveromyces waltii, Saccharomyces paradoxus, Ascosphaera<br />
apis, Botrytis cinerea, Chaetomium globosum, Epichloë festucae, Coccidioides immitis.<br />
The phylum Basidiomycota is strongly supported as a monophyletic clade, as are the<br />
classes Agaricomycetes (S-H=0.99 ML in protein, S-H=0.97 ML and PP=100% in<br />
nucleotides), Tremellomycetes (S-H=1.0 ML in protein, S-H=1.0 ML and PP=100% in<br />
nucleotides), and Ustilaginomycetes (S-H=0.97 ML in protein, S-H=0.97 ML and<br />
PP=91% in nucleotides) within this phylum, by both protein and nucleotide analyses.<br />
The classification of Ustilaginomycetes, Microbotryomycetes, Tremellomycetes and<br />
Agaricomycetes adopted here followed the current taxonomic proposal (Hibbett et al.,<br />
2007). The Ustilaginomycetes, which is a class of the subphylum Ustilaginomycotina,<br />
are represented by one species of the order Ustilaginales, Ustilago maydis, and by<br />
another of the order Malasseziales, Malassezia globosa, class incertae sedis. These two<br />
species received high support as sister groups in protein and nucleotide analyses.<br />
However, in the phylogeny inferred by the Bayesian method a topological conflict was<br />
observed: the Ustilaginomycetes and the Schizosaccharomycetes, the latter being a member of the<br />
phylum Ascomycota, were grouped as sister<br />
clades with low support (PP=62%). The class Microbotryomycetes, which is a member of the subphylum<br />
Pucciniomycotina, is represented by only one species, Sporobolomyces roseus, and is<br />
grouped as a sister group of the subphylum Agaricomycotina with high nodal support<br />
(S-H=0.96 ML in protein, S-H=0.86 ML and PP=95% in nucleotides). The classes<br />
Tremellomycetes and Agaricomycetes are members of the subphylum<br />
Agaricomycotina. These two classes form well supported sister clades in protein<br />
(S-H=0.99), but obtained low support in the nucleotide analysis (S-H=0.42 ML and<br />
PP=86%). The class Tremellomycetes is represented by two species of the order<br />
Tremellales, Cryptococcus neoformans and Cryptococcus gattii. These two species are<br />
strongly supported as sister taxa by protein (S-H= 1.0) and nucleotide (S-H= 1.0,<br />
PP=100%) analyses. The class Agaricomycetes is represented by four species included<br />
in the orders Agaricales (Coprinus cinereus, Laccaria bicolor), Polyporales (Postia<br />
placenta), and Corticiales (Phanerochaete chrysosporium), the latter two being<br />
considered as subclass incertae sedis. The orders Polyporales and Corticiales are<br />
resolved as well-supported sister groups in protein (S-H=0.95) but with low support in<br />
nucleotide (S-H= 0.0, PP=80%) analyses. The order Agaricales is resolved as the sister<br />
group of the Polyporales/Corticiales group with high nodal support in protein<br />
(S-H=0.99) and nucleotide (S-H=0.97, PP=100%) analyses.<br />
Figure 9. Fungal phylogeny based on RLM1 DNA sequences. Node values correspond to posterior<br />
probabilities (left) and Shimodaira-Hasegawa-like support (right). In the maximum likelihood analysis an<br />
S-H value of 0.67 (a) for the Schizosaccharomycetes clade within the phylum Ascomycota, a value of 1.0<br />
(b) for V. polyspora as basal taxon of the ‘Saccharomyces complex’, a value of 0.32 (c) for<br />
Epichloë festucae as basal taxon of the Trichoderma group and a value of 0.04 (d) for A. terreus<br />
as basal taxon of the A. nidulans-A. niger group were observed. Species with partial sequences are:<br />
Kluyveromyces waltii, Saccharomyces paradoxus, Ascosphaera apis, Botrytis cinerea,<br />
Chaetomium globosum, E. festucae, and Coccidioides immitis.<br />
In this study, of the three subphyla recognized within the Ascomycota (Eriksson<br />
et al., 2004; Hibbett et al., 2007), only the Taphrinomycotina presented a topological<br />
conflict, as mentioned before. The species in this subphylum represented a<br />
monophyletic clade with high nodal support in protein (S-H= 0.99) and nucleotide<br />
(S-H=1.0, PP=100%) analyses. The subphylum Saccharomycotina also presented<br />
monophyly with well-supported nodal values (S-H=0.89 in protein and S-H=0.88,<br />
PP=99% in nucleotide analyses). The subphylum Pezizomycotina was also represented<br />
as a monophyletic clade with high nodal support values (S-H=0.98 in protein and<br />
S-H=0.98, PP= 100% in nucleotide analyses), as were its classes Leotiomycetes,<br />
Sordariomycetes, Dothideomycetes and Eurotiomycetes (Figures 8 and 9). The class<br />
Leotiomycetes was represented only by the order Helotiales, with high nodal support<br />
(S-H=1.0 in protein and S-H=1.0, PP=100% in nucleotide analyses); the species<br />
Botrytis cinerea and Sclerotinia sclerotiorum are members of this order.<br />
The class Sordariomycetes is represented by the orders Sordariales, Hypocreales and<br />
Phyllachorales (subclass incertae sedis), and by the family Magnaporthaceae (order<br />
incertae sedis). The orders Sordariales and Phyllachorales and the family Magnaporthaceae<br />
form a highly supported group (S-H=0.96 in protein and S-H= 0.98, PP=100% in<br />
nucleotide analyses), with only one nodal conflict, in the placement of Neurospora crassa<br />
in the protein analysis, since this species should form a clade with Chaetomium<br />
globosum and Podospora anserina. The order Hypocreales is a sister clade of the group<br />
Sordariales/Phyllachorales/Magnaporthaceae with high nodal support (S-H=1.0 in<br />
protein and S-H=1.0, PP=100% in nucleotide analyses), but a conflict was also observed<br />
regarding the placement of Epichloë festucae (family Clavicipitaceae). This species was<br />
placed as a sister clade of the family Hypocreaceae (Trichoderma atroviride, T. reesei,<br />
T. virens) in ML, both in protein and in nucleotide analysis, but as a sister clade of the<br />
family Nectriaceae (Nectria haematococca, Fusarium graminearum, F. oxysporum, F.<br />
verticillioides) in the BI, although with low nodal support in all analyses (S-H=0.12 ML<br />
in protein, S-H=0.32 ML and PP=63% in nucleotide).<br />
The class Dothideomycetes is represented by the orders Capnodiales and<br />
Pleosporales, which grouped as monophyletic sister clades with high nodal support<br />
(S-H=0.99 in protein, S-H=0.97, PP=90% in nucleotide analyses) (Figures 8 and 9). The<br />
class Eurotiomycetes is represented by the orders Onygenales and Eurotiales and<br />
presented monophyly. In the order Onygenales, the families Ascosphaeraceae<br />
(Ascosphaera apis) and Arthrodermataceae (Microsporum gypseum) are sister clades<br />
with high nodal support in protein (S-H=0.92) and nucleotide (S-H=0.92, PP=99%)<br />
analyses, as are the families Onygenaceae (Uncinocarpus reesii) and Gymnoascaceae<br />
(Coccidioides immitis, Coccidioides posadasii). The family Ajellomycetaceae is<br />
represented by Paracoccidioides brasiliensis, Histoplasma capsulatum and Blastomyces<br />
dermatitidis, forming a clade highly supported by protein (S-H=1.0) and nucleotide<br />
(S-H=0.99, PP=100%) analyses. A nodal conflict was observed regarding the placement of<br />
the group Ascosphaeraceae/Arthrodermataceae, since the protein analysis showed it as a<br />
sister group of the group Onygenaceae/Gymnoascaceae and the nucleotide analyses as a<br />
sister group of the clade Ajellomycetaceae. However, both placements showed low nodal<br />
support in protein (S-H=0.58) and nucleotide (S-H=0.04, PP=67%) analyses. The order<br />
Eurotiales is represented only by the family Trichocomaceae, with high nodal support<br />
(S-H=0.99 in protein, S-H=1.0, PP=100% in nucleotide analyses). Talaromyces<br />
stipitatus and Penicillium marneffei are sister taxa (Figures 8 and 9). The species<br />
Aspergillus clavatus, A. fumigatus and Neosartorya fischeri formed a group with good<br />
nodal support that is a sister group of the one formed by A. oryzae, A. flavus, A. terreus,<br />
A. nidulans and A. niger, also well supported (Figures 8 and 9). A conflict was observed<br />
only regarding the placement of A. terreus: in the Bayesian analysis this species grouped as<br />
a sister taxon of A. oryzae and A. flavus, while in the ML analysis it was placed together with<br />
A. nidulans and A. niger. However, both options had low support. The classes<br />
Leotiomycetes and Sordariomycetes formed sister clades with good nodal support,<br />
while the classes Dothideomycetes and Eurotiomycetes also formed sister clades but<br />
with moderate support (S-H=0.85 in protein, S-H=0.89, PP=86% in nucleotide<br />
analyses).<br />
4.2.2. Mcm1 transcription factor<br />
In the fungal phylogeny based on the analysis of the Mcm1 protein/gene only ML of<br />
protein sequences was considered, since the nucleotide analysis showed a topology<br />
that contradicted the one obtained with protein, owing to the large differences in MCM1 sequence<br />
length (Figure 10). BI was not performed.<br />
The phylogeny inferred from Mcm1 protein analysis coincides in the main<br />
points with the one inferred from Rlm1 protein/gene analyses. The Fungi were resolved
as a single clade with S-H= 0.82, strongly supported. The major differences in the<br />
overall topology concern the following clades:<br />
-The subphylum Mucoromycotina ‘Zygomycota’ presented well-supported<br />
polyphyly between the families Phycomycetaceae and Mucoraceae, whereas with Rlm1<br />
it presented monophyly.<br />
-Within the phylum Basidiomycota, the species Malassezia globosa formed a<br />
sister group with the class Schizosaccharomycetes from the phylum Ascomycota with<br />
a good support value, contrary to the result obtained with the Rlm1 protein analysis, which correctly<br />
placed the Schizosaccharomycetes within the Ascomycota.<br />
-In the phylum Ascomycota, the class Saccharomycetes is present as a<br />
monophyletic group, but with lower nodal support, whereas in the Rlm1 analyses it is well<br />
supported.<br />
-Within the subphylum Pezizomycotina, in the class Sordariomycetes, the group formed<br />
by the orders Sordariales and Phyllachorales and the family Magnaporthaceae presented<br />
Magnaporthe grisea as the basal clade, with moderate nodal support, contrary to the<br />
results found in the phylogeny with the Rlm1 protein/gene, which presented Verticillium<br />
dahliae as the basal taxon.<br />
-The basal taxon in the family Hypocreaceae is Trichoderma atroviride, which<br />
disagrees with the topology obtained with the Rlm1 protein analyses, which place T. reesei as<br />
the basal taxon.<br />
-In the class Dothideomycetes, the most recently diverged taxon of the order Pleosporales<br />
differs from the one found in the topology with the Rlm1 protein/gene analyses.<br />
With Mcm1 it is Alternaria brassicicola and with Rlm1 it is Pyrenophora<br />
tritici-repentis, but in the Mcm1 analysis the nodal support is very low.<br />
-Within the class Eurotiomycetes, Ascosphaera apis was found to be basal to all the<br />
orders that constitute this clade, whereas the Rlm1 analyses placed it as a sister taxon to<br />
Microsporum gypseum.<br />
-The clade Arthrodermataceae (Microsporum gypseum) is a sister clade of the<br />
group formed by Onygenaceae/Gymnoascaceae, but with low nodal support,<br />
contrary to the results obtained with the nucleotide RLM1 analysis, which joined it to the clade<br />
Ajellomycetaceae, also with low support.<br />
Figure 10. Fungal phylogeny based on Mcm1 protein sequences. This phylogeny was obtained by<br />
maximum likelihood. The nodes are supported by Shimodaira-Hasegawa-like support.<br />
Species with partial sequences are: Kluyveromyces waltii, Blastomyces dermatitidis, Botrytis<br />
cinerea, Cryptococcus gattii.<br />
-The order Eurotiales is weakly supported, while the group formed by the taxa<br />
A. niger, A. fumigatus and Neosartorya fischeri is well supported, contrary to the results<br />
obtained in the Rlm1 phylogeny, where A. clavatus was joined to the two latter taxa<br />
mentioned before.<br />
-The placement of the taxon A. terreus is also different in this analysis: it was<br />
placed as a sister taxon to A. oryzae and A. flavus, but only in the Rlm1 gene/protein<br />
analyses its placement is not well supported. The same applies to A. clavatus, which with the Rlm1<br />
protein/gene analyses formed a highly supported group with Neosartorya fischeri and A.<br />
fumigatus, but with Mcm1 presented an uncertain placement.<br />
4.2.3. Placement of fungal phylogeny based on MADS-box transcription<br />
factor within the current literature<br />
Several phylogenetic studies have demonstrated the close relationship of the<br />
kingdoms Animalia and Fungi (Baldauf and Palmer, 1993; Baldauf et al., 2000; Lang et<br />
al., 2002). These two kingdoms are now known to be part of a larger group that<br />
includes the phylum Choanozoa, termed the Opisthokonts (Cavalier-Smith and Chao,<br />
1995; Ragan et al., 1996). Together, the Opisthokonts and the phylum Amoebozoa form<br />
the group Unikonts (Cavalier-Smith, 2002b). In the present work the phylum<br />
Choanozoa has not been included, but the close relationship between the two kingdoms,<br />
Animalia and Fungi, was observed using species from the kingdom Plantae as<br />
outgroup. The kingdom Plantae is part of the group Bikonts (Cavalier-Smith, 2002b).<br />
The phylogenetic topology obtained in this study is also in agreement with the topology<br />
obtained in a study that explains the origin of the MADS-box proteins in Animalia,<br />
Fungi, and Plantae (Alvarez-Buylla et al., 2000).<br />
Regarding the kingdom Fungi, previous studies based on multigenic and rDNA<br />
analysis indicated that the phyla Microsporidia, Chytridiomycota,<br />
Neocallimastigomycota, Blastocladiomycota and the subphyla incertae sedis<br />
Mucoromycotina, Zoopagomycotina, Entomophthoromycotina and Kickxellomycotina<br />
are part of the earliest known divergence that took place during fungal evolution (Bruns<br />
et al., 1992; Berbee and Taylor, 1993; Tanabe et al., 2000; Lutzoni et al., 2004;<br />
Blackwell et al., 2006; Fitzpatrick et al., 2006; James et al., 2006). However, owing to the<br />
low number of available sequences across all the phyla that represent the kingdom<br />
Fungi, only sequences from species within the Basidiomycota and Ascomycota phyla and<br />
the subphylum Mucoromycotina were used in this study.<br />
The subphylum Mucoromycotina is represented by species that belonged to the<br />
former phylum Zygomycota (Hibbett et al., 2007). The results obtained in our work showed<br />
that the subphylum Mucoromycotina is an early diverging clade (Figures 8 and 9), which<br />
agrees with previous studies. The subphylum Mucoromycotina presented monophyly<br />
with moderate support in the Rlm1 sequence analyses, in agreement with phylogenetic<br />
analyses inferred with ACT1, RPB1, RPB2, EF-1α, and small and large subunit rDNA genes<br />
(Voigt and Wöstemeyer, 2001; James et al., 2006), and polyphyly in the analysis with the<br />
MCM1 gene (Figure 10), as found in other studies (Lutzoni et al., 2004). To<br />
resolve this monophyly/polyphyly conflict, the inclusion of sequences from other<br />
species within the subphylum Mucoromycotina, as well as from the other subphyla<br />
incertae sedis, would be important. The use of additional protein-coding genes would<br />
also help to resolve this conflict.<br />
Basidiomycota includes about 30,000 species of rusts, smuts, yeasts, and<br />
mushroom fungi (Kirk et al., 2001). Most of them are characterized by meiospores<br />
(basidiospores) borne on the exterior of typically club-shaped meiosporangia (basidia).<br />
Phylogenetic relationships among the three subphyla of Basidiomycota are still<br />
uncertain. The subphylum Pucciniomycotina is primarily distinguished by containing<br />
the rust fungi (7000 species), which are primarily pathogens of land plants. In our study,<br />
the species representing the subphylum Pucciniomycotina grouped together with the<br />
species of the subphylum Agaricomycotina (Figures 8, 9 and 10). This result is not<br />
consistent with previous studies that suggested a sister group relationship with the group<br />
formed by the subphyla Ustilaginomycotina and Agaricomycotina (Lutzoni et al., 2004;<br />
James et al., 2006). The result from our study could be due to the use of only one<br />
representative of the subphylum Pucciniomycotina, Sporobolomyces roseus, and also<br />
to a long-branch attraction artifact, since the species used presented large MADS-box<br />
genes, like the ones observed in the representatives of the class Tremellomycetes<br />
from the subphylum Agaricomycotina.<br />
The Ustilaginomycotina includes 1500 species of true smut fungi and yeasts,<br />
most of which cause systemic infections of angiosperm plant hosts. The<br />
Agaricomycotina includes almost two-thirds of the known basidiomycetes, including<br />
the vast majority of mushroom forming fungi. Much of the morphological diversity<br />
exemplified in mushroom fruiting bodies is the result of radiations of certain lineages<br />
within the Agaricomycotina, and recovering their relationships with confidence has<br />
proven very difficult (Binder et al., 2005; Moncalvo et al., 2002). In our work, the<br />
early-diverging lineages of the Agaricomycotina, which include parasitic and/or<br />
saprotrophic fungi capable of dimorphism or yeast-like phases, are strongly supported, in<br />
agreement with previous studies that found the same relationships among species of this<br />
subphylum (Lutzoni et al., 2004; Fitzpatrick et al., 2006; James et al., 2006).<br />
Ascomycota is the largest phylum within the Fungi and is characterized by the<br />
production of meiospores (ascospores) in specialized sac-shaped meiosporangia (asci),<br />
which may or may not be produced within a sporocarp (ascoma). The majority of the<br />
species studied in our analysis belonged to this phylum. Within the Ascomycota three<br />
main subphyla are recognized, the Taphrinomycotina, the Pezizomycotina and the<br />
Saccharomycotina. Results from this study showed that all subphyla of Ascomycota<br />
were well supported as monophyletic groups in the phylogenetic tree presented in<br />
Figure 9.<br />
In the results obtained with the Mcm1 protein and RLM1 gene (Bayesian<br />
inference), the class Schizosaccharomycetes from the subphylum Taphrinomycotina sits<br />
outside the other subphyla from Ascomycota. Taphrinomycotina has been resolved as<br />
the earliest diverging monophyletic clade (Liu et al., 1999; Fitzpatrick et al., 2006;<br />
James et al., 2006; Liu et al., 2006; Liu et al., 2009). It includes a diverse group of<br />
species that exhibits yeast-like (for example, Pneumocystis carinii) and dimorphic (for<br />
example, Taphrina deformans) growth forms. The incongruence observed in the<br />
placement of the Schizosaccharomycetes could be due to the high variability observed<br />
in the C-terminal region of the MADS-box proteins identified for Ustilaginomycotina<br />
and Schizosaccharomycetes. This variability caused a conflict in<br />
the topology determined by Bayesian inference and maximum likelihood for the RLM1 gene<br />
and the Mcm1 protein, respectively (Figures 9 and 10), producing the paraphyly observed<br />
and grouping these species. Another hypothesis to explain the incongruence could be<br />
the incorrect or partial assembly of genomes, which would affect the gene sequences<br />
deposited in the databases; for instance, the sequence of the MCM1 gene of Blastomyces<br />
dermatitidis was identified, but part of the N-terminal region did not match the query<br />
sequence. Thus, this sequence had to be corrected and completed with the N-terminal region<br />
from closely related species. Another reason for this incongruence could be the robustness of the<br />
method of phylogenetic inference, since the aLRT approach in maximum likelihood<br />
estimates the probability of a branch being correct better than bootstrapping, and is<br />
more conservative than Bayesian posterior probabilities (Anisimova and Gascuel, 2006;<br />
Hall and Salipante, 2007). That could explain why the class Schizosaccharomycetes<br />
was correctly placed in the Rlm1 protein and DNA analyses by maximum<br />
likelihood with the aLRT test.<br />
The subphylum Saccharomycotina consists of the ‘true yeasts’, including<br />
baker’s yeast (Saccharomyces cerevisiae) and Candida albicans, the most frequently<br />
encountered fungal pathogen of humans, and its monophyly has been demonstrated in<br />
several studies (Lutzoni et al., 2004; Fitzpatrick et al., 2006; James et al., 2006; Suh et<br />
al., 2006). The results obtained in this work also agree with the previous studies,<br />
in which three clades, the CUG, the Saccharomyces sensu stricto and the<br />
Saccharomyces sensu lato, are clearly shown (Figures 8, 9 and 10).<br />
Pezizomycotina is the largest subphylum of Ascomycota and includes the vast<br />
majority of filamentous and fruit-body-producing species. Within the Pezizomycotina a<br />
number of well-defined classes are observed, namely the Sordariomycetes, the<br />
Leotiomycetes, the Eurotiomycetes and the Dothideomycetes, whose relationships have been<br />
the subject of much debate in previous studies (Berbee, 1996; Liu et al., 1996). The<br />
topology obtained with Rlm1 and Mcm1 protein/gene in this study indicated that<br />
Leotiomycetes and Sordariomycetes are well-supported sister clades (Figures 8, 9 and<br />
10). This result agrees with several previous analyses, such as the rDNA-based<br />
analysis (Lumbsch et al., 2005), multigenic analyses (James et al., 2006; Spatafora et<br />
al., 2006) and analyses based on complete fungal genomes (Fitzpatrick et al., 2006;<br />
Robbertse et al., 2006), but disagrees with one analysis of a combined four-gene<br />
dataset that placed the Dothideomycetes as a sister group to the<br />
Sordariomycetes (Lutzoni et al., 2004).<br />
In this study, within the Sordariomycetes, the inferred phylogenetic relationships<br />
amongst the Sordariomycetidae organisms concur with previous phylogenetic studies
(Berbee, 2001). However, in the order Sordariales the placement of Magnaporthe grisea<br />
is not well defined: the analysis with Mcm1 protein considers it the basal taxon, whereas<br />
the analysis with Rlm1 protein/gene considers Verticillium dahliae the basal taxon.<br />
These results disagree with the phylogeny of Sordariomycetes obtained by other studies,<br />
which placed V. dahliae as the sister taxon of the Hypocreales (Zhang et al., 2006).<br />
Likewise, in this study the inferred phylogeny based on Mcm1 protein and RLM1 gene<br />
(Bayesian inference) proposes that Epichloë festucae is a sister taxon of the group<br />
formed by Fusarium species (Figures 9 and 10), whereas the analysis of Rlm1 protein/gene<br />
(maximum likelihood) placed it as the sister taxon of the group formed by Trichoderma<br />
species. However, in our study only one species from the family Clavicipitaceae was used.<br />
The class Dothideomycetes forms a well-supported sister group with the<br />
Eurotiomycetes in the analysis of Rlm1 protein/gene, but not in the analysis of Mcm1<br />
protein (S-H=0.0). The former result agrees with the previous analysis carried out with<br />
four combined genes, which also grouped together the Dothideomycetes and the<br />
Eurotiomycetes (Lutzoni et al., 2004). Conversely, a concatenated alignment of 42<br />
fungal genomes inferred that Stagonospora nodorum (a member of the Dothideomycetes)<br />
is more closely related to the Sordariomycetes and Leotiomycetes lineages (Fitzpatrick<br />
et al., 2006). Another study, based on 17 Ascomycota genomes and also using a<br />
concatenated alignment, reported conflicting inferences regarding the phylogenetic<br />
position of S. nodorum (Robbertse et al., 2006). The phylogenies reconstructed based<br />
on 17 genomes, using NJ and ML methods, inferred a sister group relationship between<br />
S. nodorum and Eurotiomycetes (Robbertse et al., 2006). This result is in accordance<br />
with a supertree inferred from 42 complete genomes (Fitzpatrick et al., 2006) and<br />
the tree obtained with Rlm1 protein/gene and Mcm1 protein analyses. However, the<br />
same phylogenomic study inferred, by using maximum parsimony, the placement of S.<br />
nodorum at the base of the Pezizomycotina (Robbertse et al., 2006). The topology<br />
inferred in this work clearly showed two sister clades (orders Capnodiales and<br />
Pleosporales), which concurs with previous phylogenetic studies (Schoch et al., 2006).<br />
In this study, within the Eurotiomycetes class there is only a clade corresponding<br />
to the order Onygenales (Histoplasma capsulatum, Blastomyces dermatitidis,<br />
Paracoccidioides brasiliensis, Coccidioides immitis, Coccidioides posadasii,<br />
Uncinocarpus reesii, Microsporum gypseum and Ascosphaera apis). The Onygenales<br />
clade is particularly interesting as it contains mainly animal pathogenic species. One<br />
of them, Coccidioides immitis, was initially classified as a protist, but<br />
further research showed that it is a fungus, and separate studies placed it in three different<br />
divisions of the former group named Eumycota (Rixford and Gilchrist, 1896; Ophuls,<br />
1905; Ciferri and Redaelli, 1936; Baker et al., 1943). Subsequent ribosomal phylogeny<br />
studies (Pan et al., 1994; Bowman et al., 1996) suggested a close phylogenetic<br />
relationship between C. immitis and U. reesii, excluding H. capsulatum. The<br />
phylogenies based on Rlm1 and Mcm1 protein/gene agree with the placement of C.<br />
immitis, C. posadasii and U. reesii as sister taxa, representing two families,<br />
Gymnoascaceae and Onygenaceae. However, a conflict was observed in the placement<br />
of Ascosphaera apis, which formed a clade with Microsporum gypseum in the Rlm1<br />
protein/gene analysis but appeared as a basal taxon in Eurotiomycetes in the Mcm1<br />
protein analysis. The obtained results disagree with the Eurotiomycetes phylogeny<br />
study in which Ascosphaera apis (Ascosphaeraceae) formed a sister clade to the<br />
Ajellomycetaceae (H. capsulatum, P. brasiliensis and B. dermatitidis) and Microsporum<br />
gypseum a sister clade of the Gymnoascaceae (Geiser et al., 2006). The Eurotiomycetes<br />
branch containing the Eurotiales clade showed a close relationship between<br />
Talaromyces stipitatus and Penicillium marneffei (Figures 8, 9 and 10). A minor<br />
difference in the Aspergillus clade was observed between the phylogenetic analyses<br />
with Rlm1 and Mcm1 regarding the position of A. nidulans, A. terreus, A. niger and A.<br />
fumigatus (Figures 8, 9 and 10). In this study, Neosartorya fischeri and A. fumigatus<br />
formed well-supported sister taxa, as did A. oryzae and A. flavus, in accordance with<br />
the studies of Fitzpatrick et al. (2006) and James et al. (2006).<br />
Because the majority of fungi are still undiscovered, a robust phylogeny<br />
of known taxonomic groups will be essential for the placement of unknown species as these<br />
are discovered. As demonstrated for Bacteria and Archaea (Pace, 1997), the Fungi are<br />
likely to harbor many lineages whose discovery is dependent on phylogenetic analyses.<br />
Therefore, it will be necessary to carry out phylogenetic studies including a wider set of<br />
fungal species from other phyla, such as the Glomeromycota and the Chytridiomycota, which<br />
will help to resolve the phylogenetic relationships of misclassified and<br />
misplaced fungal species.
4.3. Relationships within the subphylum Saccharomycotina<br />
Due to the relationship that exists between Candida species and the ‘Saccharomyces<br />
complex’, these groups were further analysed by using both single-gene and<br />
concatenated-gene phylogenies.<br />
4.3.1. Phylogeny of the ‘Saccharomyces complex’<br />
In Figures 8, 9 and 10, the subphylum Saccharomycotina is clearly shown as a<br />
monophyletic clade grouping Yarrowia lipolytica, Candida species and ‘Saccharomyces<br />
complex’ species in the phylogenetic tree inferred for the kingdom Fungi. The<br />
phylogenies deduced from single or concatenated genes, including only<br />
species belonging to the subphylum Saccharomycotina, are presented in Figures 11 to 14.<br />
Results from this analysis showed a topology very similar to the previously inferred<br />
topologies that included all other fungal species, by using Rlm1 and Mcm1 genes.<br />
The Non-Whole Genome Duplication (Non-WGD) group, formed by<br />
Kluyveromyces lactis, Ashbya gossypii, Saccharomyces kluyveri and Kluyveromyces<br />
waltii, was monophyletic in all analyses, except in the topology based on the MCM1<br />
gene by Bayesian inference (Figure 12B). The relationships amongst the Non-WGD<br />
species presented a variable placement of taxa, with their grouping poorly defined<br />
(Figures 11, 12 and 13). These results were not conclusive due to the low number of<br />
taxa within this complex that have available sequences for this study. In other studies,<br />
based on multigenic and phylogenomic analyses, the relationship amongst the Non-<br />
WGD species has been demonstrated showing K. waltii – S. kluyveri and K. lactis – A.<br />
gossypii grouping together (Fitzpatrick et al., 2006; James et al., 2006; Jeffroy et al.,<br />
2006; Liu et al., 2006).<br />
Another study of the ‘Saccharomyces complex’ based on multigenic analysis<br />
that used 75 species showed A. gossypii and K. lactis forming basal clades separated<br />
from S. kluyveri and K. waltii, the latter clade with medium support (Kurtzman and<br />
Robnett, 2003). Another phylogenetic study based on SSU and LSU rDNA sequences<br />
from 64 species of the order Saccharomycetales placed K. lactis as an early clade and<br />
A. gossypii as a basal clade (Suh et al., 2006). Thus, the correct relationship amongst the<br />
Non-WGD species is still to be resolved.<br />
In our study, a noticeable difference occurred in the placement of Saccharomyces<br />
castellii in the WGD clade with respect to previous studies. In the analyses based on<br />
RLM1 gene and protein, S. castellii was a well-supported early taxon within the WGD<br />
group (Figure 11).<br />
Figure 11. Phylogeny of the subphylum Saccharomycotina inferred by using RLM1. (A) Protein<br />
phylogeny obtained by maximum likelihood using the evolutionary model JTT + I + G with<br />
nodal support by Shimodaira-Hasegawa; (B) Gene phylogeny obtained by using the evolutionary<br />
model GTR + I + G. Node values correspond to posterior probabilities (left) and maximum<br />
likelihood (right, Shimodaira-Hasegawa-like support). Species with partial sequence are<br />
Kluyveromyces waltii and Saccharomyces paradoxus.<br />
However, the analysis based on Mcm1 protein gave low support to S.<br />
castellii as an early taxon of the Saccharomyces sensu stricto group (S. bayanus, S.<br />
kudriavzevii, S. mikatae, S. cerevisiae and S. paradoxus). The analysis with the MCM1 gene<br />
indicated a weakly supported monophyly by ML with the Saccharomyces sensu lato group (S. castellii,<br />
C. glabrata, Vanderwaltozyma polyspora), while BI gave medium support to it as an early<br />
taxon of the Saccharomyces sensu stricto group (Figure 12).<br />
The Rlm1-Mcm1 concatenated phylogeny placed S. castellii as an early taxon from<br />
Saccharomyces sensu stricto group, both by maximum likelihood and Bayesian methods<br />
(Figure 13). Additionally, in the global fungal phylogeny analysis based on Rlm1 and<br />
Mcm1 sequences, S. castellii also presented an inaccurate placement (Figures 11 - 13).
Inaccurate placement of S. castellii was also obtained in past multigenic studies, in<br />
which it either competed with C. glabrata for the basal position in the Saccharomyces<br />
sensu stricto group, or formed a clade together with this group (Diezmann et al., 2004;<br />
Fitzpatrick et al., 2006; James et al., 2006; Jeffroy et al., 2006; Liu et al., 2006).<br />
Figure 12. Phylogeny of the subphylum Saccharomycotina inferred by using MCM1. (A) Protein<br />
phylogeny obtained by maximum likelihood using the evolutionary model JTT + I + G with nodal support by<br />
Shimodaira-Hasegawa; (B) Gene phylogeny obtained using the evolutionary model GTR + I + G. Node<br />
values correspond to posterior probabilities (left) and maximum likelihood (right,<br />
Shimodaira-Hasegawa-like support). b: 0.16 is the support for the group formed by S. castellii,<br />
C. glabrata and V. polyspora by maximum likelihood. Species with partial sequence:<br />
Kluyveromyces waltii.<br />
However, a study including a higher number of Saccharomycetales species and<br />
based on SSU and LSU rDNA showed that V. polyspora was closer to S. cerevisiae than<br />
S. castellii and C. glabrata (Suh et al., 2006). Another study based on mitochondrial<br />
and nuclear genes, such as COX1, mit SSU rDNA, EF-1α, rDNA (ITS1, ITS2) and nuc LSU<br />
rDNA (D1/D2), placed S. castellii in a clade closer to the Saccharomyces sensu<br />
stricto group, with C. glabrata and V. polyspora forming earlier clades than S. castellii<br />
with respect to the Saccharomyces sensu stricto group (Kurtzman and Robnett, 2003).<br />
The sister group relationships amongst the Saccharomyces sensu stricto species<br />
also differed between Rlm1, Mcm1 and concatenated phylogenies (Figures 11 - 13). For<br />
example, the Saccharomycetales Rlm1 protein/gene phylogeny placed S. bayanus and S.<br />
kudriavzevii as sister taxa at the base of the Saccharomyces sensu stricto node and S.<br />
mikatae, S. cerevisiae and S. paradoxus forming a group within it, by ML and BI. However, the<br />
Mcm1 protein phylogeny placed S. mikatae as the basal taxon, while the MCM1 gene<br />
phylogeny did not clearly resolve whether S. bayanus or S. kudriavzevii is the basal<br />
taxon.<br />
Figure 13. Phylogeny of the subphylum Saccharomycotina inferred by using a concatenated alignment of<br />
RLM1 - MCM1. (A) Protein phylogeny obtained by maximum likelihood with the evolutionary<br />
model JTT + I + G and nodal support by Shimodaira-Hasegawa; and (B) Gene phylogeny<br />
obtained using the evolutionary model GTR + I + G. Node values correspond to posterior<br />
probabilities (left) and maximum likelihood (right, Shimodaira-Hasegawa-like support).<br />
Species with partial sequence: Kluyveromyces waltii.<br />
The concatenated alignment of Rlm1 and Mcm1 proteins (Figure 13) inferred a<br />
result similar to the Rlm1 gene/protein phylogenies (Figure 11), and the analysis based on the<br />
concatenated alignment of genes (RLM1 and MCM1) placed S. bayanus at the base of<br />
the Saccharomyces sensu stricto node and inferred a ladderised topology amongst the<br />
Saccharomyces sensu stricto species (Figure 13). These phylogenetic trees, based on<br />
RLM1 and concatenated genes, agree with supertrees that indicated both a ladderised<br />
topology and S. bayanus and S. kudriavzevii as sister taxa (Fitzpatrick et al., 2006;<br />
Jeffroy et al., 2006). The study of the ‘Saccharomyces complex’ based on multigenic<br />
analysis also showed a ladderised topology in Saccharomyces sensu stricto species,<br />
having S. bayanus as basal taxon (Kurtzman and Robnett, 2003).
To analyze the degree of conflicting phylogenetic signal within the concatenated<br />
alignment, a phylogenetic network was constructed and a bootstrap analysis was<br />
performed on this phylogenetic network (Figure 14). It is interesting to note that<br />
numerous alternative splits were obtained (60 in total); however, no splits were observed<br />
excluding C. glabrata, V. polyspora and S. castellii from the remaining WGD<br />
organisms, even though these species form clades within the WGD group. These results are<br />
in conflict with the concatenated phylogeny (Figure 13), which inferred that C. glabrata<br />
and V. polyspora formed a well-defined clade, and S. castellii was the basal species<br />
from the remaining WGD organisms. The syntenic information clearly shows that S.<br />
castellii diverged from the Saccharomyces sensu stricto lineage before C. glabrata<br />
(Scannell et al., 2006). Therefore, topologies that place C. glabrata as outgroup to the<br />
Saccharomyces sensu stricto lineage and S. castellii, or vice versa, are unreliable<br />
(Scannell et al., 2006; Fitzpatrick et al., 2006; Suh et al., 2006), like the ones observed in<br />
our study with the single and concatenated genes. It is known that systematic bias can<br />
influence the resulting tree (Phillips et al., 2004). Hence, a closer scrutiny would be<br />
necessary. The incongruences found in these analyses suggest that the use of additional<br />
Saccharomycotina taxa would be important to help resolve the placement of species<br />
belonging to these groups. In addition, it is likely that stochastic errors and erroneous<br />
inferences could be eradicated with the addition of extra genome data, when it becomes<br />
available.<br />
Figure 14. Phylogenetic network reconstructed using a concatenated alignment of RLM1 – MCM1 genes<br />
in the subphylum Saccharomycotina. The NeighborNet method was used to infer splits within<br />
the alignment (1000 bootstrap nodal support).<br />
4.3.2. Phylogenetic relationship amongst Candida species<br />
The group of organisms that translate CUG mainly as serine instead of leucine has<br />
been designated the CUG group (Kawaguchi et al., 1989; Santos and Tuite, 1995).<br />
This codon reassignment has been proposed to have occurred about 170 million years<br />
ago (Massey et al., 2003). Rlm1, Mcm1 and concatenated topologies inferred a robust<br />
monophyletic clade containing organisms within the CUG group (Figures 11 - 13). The<br />
topology analysis showed that there are four putative CUG sub-clades, the first<br />
containing C. lusitaniae, the second C. guilliermondii and Debaryomyces hansenii, the<br />
third Pichia stipitis and the fourth containing C. tropicalis, C. albicans, C. dubliniensis,<br />
C. parapsilosis and Lodderomyces elongisporus. These topologies have also been<br />
recognized in other phylogenetic studies inferred from rRNA genes (Suh et al., 2006).<br />
Some phenotypic and morphologic characteristics could correlate with the topology<br />
observed since C. lusitaniae and C. guilliermondii are haploid yeasts, and their<br />
teleomorphic states present sexual reproduction (Wickerham and Burton, 1954;<br />
Rodriguez de Miranda, 1979; Young et al., 2000). D. hansenii and Pichia stipitis are<br />
homothallic, with a fused mating locus (Dujon et al., 2004; Fabre et al., 2005). In<br />
contrast, members of the fourth sub-clade have at best a cryptic sexual cycle and have<br />
never been observed to undergo meiosis (Hull and Johnson, 1999; Hull et al., 2000;<br />
Magee and Magee, 2000; Pujol et al., 2004; Logue et al., 2005).<br />
The phylogeny of Candida species was also investigated in this study by using a<br />
further analysis with Iff8 protein and gene (Figure 15). The resultant topologies placed<br />
L. elongisporus within the asexual clade with high support, which is in agreement with<br />
other phylogenetic studies based on multigenic and phylogenomic analyses (James et<br />
al., 1994; Fitzpatrick et al., 2006; Diezmann et al., 2004).<br />
The CUG-specific topology based on three concatenated proteins/genes, RLM1,<br />
MCM1 and IFF8, suggested that D. hansenii and C. guilliermondii are sister taxa, as<br />
they are grouped together with high support by maximum likelihood and Bayesian<br />
inference, excluding C. lusitaniae and P. stipitis (Figure 16). Topologies obtained with<br />
Rlm1 protein/gene and RLM1-MCM1 concatenated genes presented conflicts in the<br />
placement of P. stipitis, which was included within the C. guilliermondii – Debaryomyces<br />
hansenii clade (Figures 11 and 13A).
Figure 15. Phylogeny of the CUG group based on IFF8. (A) Protein phylogeny obtained by maximum<br />
likelihood, using the JTT + I + G evolutionary model with nodal support by<br />
Shimodaira-Hasegawa; (B) Gene phylogeny obtained by using the GTR + I + G<br />
evolutionary model. Node values correspond to posterior probabilities (left) and maximum likelihood (right,<br />
Shimodaira-Hasegawa-like support). *In the maximum likelihood analysis, C. tropicalis,<br />
C. parapsilosis and L. elongisporus form a clade with 0.41 S-H, and C. parapsilosis and<br />
L. elongisporus are sister taxa with 0.98 S-H. These species, together with C. albicans<br />
and C. dubliniensis, form a well-supported clade (0.97 S-H).<br />
Figure 16. Subphylum Saccharomycotina phylogeny reconstructed using a concatenated alignment of<br />
RLM1 - MCM1 – IFF8. (A) Protein phylogeny obtained by maximum likelihood with the<br />
evolutionary model JTT + I + G and Shimodaira-Hasegawa-like nodal support; (B) Gene<br />
phylogeny obtained by using the evolutionary model GTR + I + G. Node values<br />
correspond to posterior probabilities (left) and maximum likelihood (right,<br />
Shimodaira-Hasegawa-like support).<br />
Other studies have placed C. lusitaniae in a clade with C. guilliermondii, and<br />
inferred a closer relationship between the two with Debaryomyces species (Daniel et al.,<br />
2001; Diezmann et al., 2004). These results raise interesting questions regarding the<br />
sexual status of the Candida species. It is possible that the "asexual" species are in fact<br />
fully sexual. C. albicans, C. dubliniensis and C. tropicalis have been observed to mate<br />
(Soll et al., 1988; Pujol et al., 2004), and in addition the C. albicans genome contains most<br />
of the genes required for meiosis (Tzung et al., 2001). In contrast,<br />
the evidence that L. elongisporus reproduces sexually is based on the appearance of<br />
asci with one (or sometimes two) spores (Van der Walt, 1966).<br />
The phylogenetic network constructed from the three concatenated genes<br />
(Figure 17) corroborated the CUG-specific concatenated tree regarding the grouping of<br />
C. guilliermondii and D. hansenii as sister taxa, excluding C. lusitaniae and P. stipitis,<br />
but when the network was inferred from two concatenated genes, C. guilliermondii, D.<br />
hansenii and P. stipitis formed sister taxa (Figure 14). This analysis showed a topology<br />
conflict with the phylogeny based on Iff8 protein and gene regarding the grouping of L. elongisporus<br />
with C. parapsilosis (Figure 15). As expected, there was no conflict for the grouping of<br />
C. albicans and C. dubliniensis, illustrating their high genotypic similarity (Sullivan et<br />
al., 1995), and those species formed with C. tropicalis a highly supported clade, which<br />
agrees with other recently published studies (Fitzpatrick et al., 2006; James et al., 2006;<br />
Suh et al., 2006).<br />
Figure 17. Subphylum Saccharomycotina phylogenetic network reconstructed using a concatenated<br />
alignment of RLM1 – MCM1 – IFF8 genes. The NeighborNet method was used to infer splits<br />
within the alignment (1000 bootstrap nodal support).
4.4. Analysis of adaptive evolution<br />
Genome duplication has occurred in the WGD group as previously described (Kellis<br />
et al., 2004). Understanding the fate of duplicates is fundamental to clarify mechanisms<br />
of genetic redundancy and the link between gene family diversification and phenotypic<br />
evolution. Models to identify possible sites of positive selection can provide alternative<br />
explanations for the persistence and diversification of gene duplicates.<br />
The detection of an excess in the ratio of the rate of nonsynonymous (dN) over<br />
synonymous (dS) substitutions (dN/dS, also denoted ω) is an unambiguous indicator of<br />
positive selection at the codon level. Positive selection (ω>1) is the<br />
selection that fixes advantageous mutations, and this term is used interchangeably with<br />
molecular adaptation and adaptive molecular evolution. Purifying selection (ω
avoid the presence of a high number of gaps and to analyze more closely related species. This new<br />
analysis was performed by using the HYPHY 1.0 package because it takes the<br />
presence of gaps into account, calculating a dN/dS value for those positions (Annex VIII). To better<br />
analyze this sequence, the conserved motifs were identified. In the alignment of<br />
Saccharomycotina Rlm1 proteins four conserved motifs were recognized: the first<br />
is the conserved MADS-box region at the N-terminal, the second is an acidic<br />
region close to the MADS-box, and the third and fourth motifs were named A and B, respectively<br />
(Figure 18).<br />
Figure 18. Conserved regions in the RLM1 gene of the subphylum Saccharomycotina: MADS-box, acidic<br />
region, and conserved regions A and B present in this subphylum.
At the C-terminal of Rlm1 a repetitive region was identified in Candida albicans,<br />
Candida dubliniensis, Kluyveromyces lactis and Saccharomyces sensu stricto group<br />
(Figure 19).<br />
Results obtained by applying the models to identify sites of positive selection<br />
showed the same outcome: no evidence of positive selection, either in the MADS-box<br />
domain or in the whole RLM1 sequence, confirming the previous findings. The<br />
significance of the values obtained with the complex models against their corresponding<br />
null models was tested by using a Likelihood Ratio Test (LRT), according to Anisimova<br />
et al. (2001) (Annex IX).<br />
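The comparison of a complex model against its null model can be sketched as follows. This is only an illustration of the general LRT calculation, not the procedure used in this thesis: the log-likelihood values and the two degrees of freedom are hypothetical, and the helper names `lrt_statistic` and `chi2_sf` are introduced here for the example.<br />

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic: 2 * (lnL_alternative - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail chi-square probability (closed forms for df = 1 or 2 only)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Hypothetical log-likelihoods (NOT values from the thesis).
lnl_null, lnl_alt = -1000.0, -998.0
stat = lrt_statistic(lnl_null, lnl_alt)
p_value = chi2_sf(stat, df=2)  # df = 2 is an assumption for this example
print(f"2*delta(lnL) = {stat:.2f}, p = {p_value:.4f}")
```

With these illustrative numbers the p-value is above 0.05, so the complex model would not be preferred and positive selection would not be supported.<br />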
The LRT confirmed the results supporting no positive selection. Overall, the<br />
results from maximum-likelihood models of codon evolution indicated that the<br />
fungal MADS-box and the Saccharomycotina RLM1 gene evolve mainly<br />
under purifying selection, suggesting that the expressed protein plays an important role<br />
in the fungi that must be maintained.<br />
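As a reminder of how ω is read, the following minimal sketch classifies a dN/dS ratio into the usual selection regimes (ω>1 positive, ω<1 purifying, ω=1 neutral). The function names and the numeric rates are hypothetical and serve only as an illustration; real analyses estimate ω by maximum likelihood under codon models.<br />

```python
def omega(dn, ds):
    """Return the dN/dS ratio; dS must be positive for the ratio to be defined."""
    if ds <= 0:
        raise ValueError("dS must be > 0 to compute dN/dS")
    return dn / ds


def selection_regime(w):
    """Map an omega value to the conventional selection regime."""
    if w > 1:
        return "positive"
    if w < 1:
        return "purifying"
    return "neutral"


# Hypothetical substitution rates (NOT values from the thesis).
w = omega(dn=0.02, ds=0.10)
print(f"omega = {w:.2f} -> {selection_regime(w)} selection")
```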
Figure 19. Region of microsatellites present in some species from subphylum Saccharomycotina.<br />
Alignment was performed by using MEGA 4.0 with ClustalW.<br />
A new codon-based model, the Mechanistic Empirical Combined (MEC) model, which<br />
joins the parameters used by the traditional models mentioned before with a<br />
new empirical amino acid replacement model, was recently described (Doron-Faigenboim<br />
and Pupko, 2006). This MEC model was also tested on the<br />
Saccharomycotina sequences to detect sites in Rlm1 under positive selection. By means<br />
of this new model it was possible to identify sites under positive selection in the<br />
Saccharomycotina Rlm1 protein. The protein sequence from S. cerevisiae was used as<br />
the reference to identify those sites (Figure 20).<br />
The MEC model produced a log-likelihood value of -38544.7, which is higher than the<br />
-39363.5 obtained for M8a (used as the null model), indicating the presence of sites<br />
under positive selection. To test the significance of the MEC model result, the corrected<br />
Akaike Information Criterion (AICc) was calculated; the AICc value for the<br />
MEC model (77099.41602) was lower than the one obtained for the M8a model<br />
(78735.01068), which means that the hypothesis of positive selection can be accepted.<br />
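The AICc comparison can be illustrated with the short sketch below. It uses the standard small-sample formula AICc = -2 lnL + 2k + 2k(k+1)/(n-k-1); the number of parameters k and the sample size n are not given in this section, so the values passed to the hypothetical helper `aicc` are purely illustrative, while the two AICc values compared are the ones reported above.<br />

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected AIC: -2 lnL + 2k + 2k(k + 1) / (n - k - 1)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


# AICc values as reported in the text.
aicc_mec = 77099.41602
aicc_m8a = 78735.01068
delta = aicc_m8a - aicc_mec  # positive difference favours the MEC model
print(f"delta AICc (M8a - MEC) = {delta:.5f}")

# Illustrative call with hypothetical k and n (NOT taken from the thesis).
example = aicc(-100.0, n_params=2, n_obs=10)
print(f"example AICc = {example:.4f}")
```

The model with the lower AICc is preferred, which is why the lower value for MEC is read as support for that model over M8a.<br />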
Several sites in the Rlm1 sequence were identified as putatively under positive<br />
selection. Interestingly, 57% (20 out of 35) of the sites under positive<br />
selection were observed in the first part of the protein, right after the MADS-box and<br />
before region A. Between the MADS-box and the acidic region, 12% of the amino<br />
acid residues were identified as being under positive selection; between the acidic<br />
region and region A, 8%; between regions A and B, 1.7%; and between region B and<br />
the repetitive region, 10% (Figure 20).<br />
A particular observation was the presence of amino acids under positive selection<br />
near the microsatellite region. In this repetitive region,<br />
the repeated amino acid in C. albicans, C. dubliniensis and K. lactis<br />
is glutamine, encoded by CAA, while in the Saccharomyces sensu stricto group<br />
(including S. cerevisiae) it is asparagine, encoded by AAC. This observation suggests that amino acid<br />
substitutions arising during the divergence of species, through<br />
mutation and natural selection, changed this microsatellite region and probably<br />
diversified gene function in WGD species while avoiding the loss of protein function.<br />
Numerous lines of evidence have demonstrated that the genomic distribution of<br />
repetitive DNA, particularly of microsatellites, is nonrandom, presumably because of<br />
its effects on chromatin organization, regulation of gene activity, recombination,<br />
DNA replication, cell cycle, mismatch repair (MMR) system, etc. (Li et al., 2002).<br />
Microsatellites may provide an evolutionary advantage of fast adaptation to new<br />
environments as evolutionary tuning knobs (Kashi et al., 1997; Trifonov, 2003). Due to<br />
the presence of trinucleotide microsatellites within protein coding regions tracts of
repeated amino acids are present in some proteins, and different amino acid repeats are<br />
concentrated in different classes of proteins. In yeast proteins, the most abundant amino<br />
acid repeats are Gln, Asn, Asp, Glu, and Ser (Richard and Dujon, 1997; Alba et al.,<br />
1999). In this work, tracts of poly-Gln and poly-Asn were found within the<br />
Saccharomycotina Rlm1 microsatellite region (Figure 19). The change in the amino acid tract at<br />
the microsatellite region, from Gln (CAA) to Asn (AAC), could be due to a<br />
frameshift mutation that occurred in the gene before the microsatellite region, since<br />
this kind of mutation has already been observed in MADS-box genes, precisely at the<br />
C-terminal (Kramer et al., 2006). Thus, the microsatellite repetitive region of this gene was<br />
further analysed.<br />
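As an illustration of how a trinucleotide microsatellite such as the CAA or AAC tract can be located in a coding sequence, the following minimal Python sketch searches for perfect tandem repeats of a 3-nucleotide unit. The function name `find_trinucleotide_tracts`, the threshold of four repeats and the toy sequence are all hypothetical choices made for this example; they are not the method or the data used in this work.

```python
import re

def find_trinucleotide_tracts(seq: str, min_repeats: int = 4):
    """Return (start, unit, repeat_count) for each perfect tandem run of a 3-nt unit.

    The scan is leftmost-first, so the reported unit is the one that starts
    the run in the sequence, which is not necessarily the in-frame codon.
    """
    # One captured unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{3})\1{" + str(min_repeats - 1) + ",}")
    return [(m.start(), m.group(1), len(m.group(0)) // 3)
            for m in pattern.finditer(seq.upper())]

# Toy sequence (assumption): start codon, one codon, six CAA repeats, tail.
toy_gene = "ATGGCT" + "CAA" * 6 + "GGTTAA"
tracts = find_trinucleotide_tracts(toy_gene)

for start, unit, count in tracts:
    in_frame = (start % 3 == 0)  # True when the tract begins on a codon boundary
    print(f"unit {unit} x {count} at position {start}, in frame 1: {in_frame}")
```

In this toy case the tract starts on a codon boundary, so each CAA unit is read as Gln. The same repeat read one nucleotide later would be seen as AAC units, which is the relationship between the two tracts discussed above.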
Figure 20. Saccharomyces cerevisiae amino acid sequence showing sites under positive selection by the<br />
MEC model. Conserved regions: MADS-box (blue), Acidic region (orange), Region A<br />
(purple), Region B (green) and microsatellites (red).<br />
4.5. Event of frameshift mutation in Saccharomyces cerevisiae RLM1<br />
gene<br />
In the sequence alignment of Saccharomycotina Rlm1 proteins four conserved<br />
moieties were recognized, and at the C-terminal a repetitive region was identified in C.<br />
albicans, C. dubliniensis, Kluyveromyces lactis and the Saccharomyces sensu stricto group,<br />
which seems to be under positive selection. It is difficult to determine the molecular<br />
events that changed the amino acid residues at the sites identified as under positive<br />
selection, but the nucleotide sequence analysis suggests that the molecular event that<br />
changed the tract at the repetitive region might have been a frameshift mutation.<br />
Additionally, this type of mutation has already been observed at the C-terminal of<br />
MADS-box genes, as previously mentioned, and thus this hypothesis was tested by<br />
analyzing the different RLM1 reading frames.<br />
The three reading frames of RLM1 from C. albicans, C. dubliniensis, K. lactis and S.<br />
cerevisiae were analyzed to find all the potential repetitive amino acids in the<br />
C-terminal region of this gene. The first reading frame showed a poly-glutamine (Gln, Q)<br />
tract in the C-terminal of C. albicans, C. dubliniensis and K. lactis, and a<br />
poly-asparagine (Asn, N) tract in the C-terminal of S. cerevisiae (Figure 21). These reading<br />
frames are the actual coding frames of the mentioned yeast species. The S. cerevisiae<br />
second reading frame translated to threonine (Thr, T), while the third<br />
reading frame encoded a poly-glutamine tract, just like C. albicans, C. dubliniensis and<br />
K. lactis (Figure 21).<br />
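The reading-frame comparison described above can be illustrated with a minimal Python sketch. It assumes the standard nuclear genetic code and uses a short synthetic AAC repeat rather than the actual RLM1 sequence; the `translate` helper and the toy tract are hypothetical names introduced only for this example and do not reproduce the Virtual Ribosome tool used in this work.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(seq: str, frame: int) -> str:
    """Translate a DNA string in forward reading frame 1, 2 or 3."""
    start = frame - 1
    # Step through complete codons only; trailing 1 or 2 bases are ignored.
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(start, len(seq) - 2, 3))

# Synthetic tract (assumption): four AAC repeats, mimicking the repeat unit
# reported for the S. cerevisiae coding frame.
toy_tract = "AAC" * 4
frames = {f: translate(toy_tract, f) for f in (1, 2, 3)}

for f, peptide in frames.items():
    print(f"frame {f}: {peptide}")
```

For this toy tract, frame 1 yields a poly-Asn peptide, frame 2 poly-Thr and frame 3 poly-Gln, which is the same pattern described for the three S. cerevisiae reading frames.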
The mutational mechanisms known to contribute to microsatellite polymorphism<br />
that could affect the reading frame in the microsatellite region are: i)<br />
DNA polymerase slippage during DNA replication (Tachida and Izuka, 1992) and ii)<br />
recombination (unequal crossover or gene conversion) between DNA strands (Harding<br />
et al., 1992; Li et al., 2002). Microsatellite studies indicate that trinucleotide repeats<br />
in the coding sequences of many organisms are selected against<br />
frameshift mutations, since these mutations are generally considered detrimental: they yield<br />
nonfunctional transcripts and/or proteins through the possible insertion of premature stop<br />
codons (Ohno, 1970; Tóth et al., 2000; Wren et al., 2000; Cordeiro et al., 2001).<br />
Although trinucleotide repeats are under selective pressure to avoid frameshift<br />
mutations, exceptional situations have been described when a protein is temporarily
freed from selective pressure. If selective pressure is relieved, for example, through a<br />
second copy of a gene, this duplicate can compensate for the possible loss of function<br />
caused by the frameshift mutation and enable such mutations to lead to functional<br />
divergence (Raes and Van de Peer, 2005).<br />
Figure 21. The three RLM1 reading frames of Candida albicans, Kluyveromyces lactis and<br />
Saccharomyces cerevisiae visualized with the Virtual Ribosome 1.0. The microsatellite<br />
region, as well as the most likely position to insert a nucleotide and change the S. cerevisiae<br />
reading frame, avoiding the presence of stop codons, are indicated.<br />
Genomic duplication has been proposed as an advantageous path to evolutionary<br />
innovation, because duplicated genes can supply genetic raw material for the emergence<br />
of new functions through the forces of mutation and natural selection (Ohno, 1970).<br />
Such duplication can involve individual genes, genomic segments or whole genomes<br />
and coordinated duplication of an entire genome may allow for large-scale adaptation to<br />
new environments.<br />
In this work, the phylogenetic analysis clearly inferred two groups that underwent<br />
genome duplication, Saccharomyces sensu stricto and sensu lato, and it was in the<br />
Saccharomyces sensu stricto group that the amino acid encoded by the trinucleotide repeats<br />
was altered. The genome duplication of S. cerevisiae was first proposed on the basis of the<br />
locations of paralogues, which revealed the existence of ancestral duplication blocks<br />
(Wolfe and Shields, 1997; Langkjaer et al., 2003). The WGD model was then proposed<br />
and corroborated by syntenic and genomic studies (Wolfe and Shields, 1997; Seoighe and<br />
Wolfe, 1999; Kellis et al., 2004). Those studies postulated that the WGD event occurred<br />
after the divergence of S. cerevisiae and K. lactis, and also identified a group of<br />
Non-WGD species represented by K. lactis, A. gossypii, S. kluyveri and K. waltii (Byrne and<br />
Wolfe, 2006). According to the inferred phylogenies, the CUG group first diverged<br />
from the Non-WGD and WGD groups, and then the Non-WGD and WGD groups separated.<br />
Therefore, the microsatellite within RLM1 must have been present in an ancestral<br />
taxon within Saccharomycotina, evolving during the speciation process and changing<br />
in recent species, possibly due to a frameshift mutation, as this work indicates.<br />
Additionally, frameshift mutations have been detected in MADS-box transcription<br />
factor families of plants, causing changes precisely in the C-terminal and yielding<br />
functional diversification (Lachtman, 1998; Kramer et al., 2006).<br />
In the present study two members of the MADS-box transcription factor family<br />
were studied, Rlm1 and Mcm1. However, in S. cerevisiae two other members of<br />
this family have been identified, so this species has four MADS-box proteins in total: Smp1,<br />
which is involved in regulating the response to osmotic stress (de Nadal et al.,<br />
2003) and has orthologues in the WGD group but none in the Non-WGD and CUG<br />
groups; and Arg80, which is involved in the regulation of<br />
arginine-responsive genes (Dubois et al., 1987) and was identified in S. cerevisiae and in the WGD<br />
group (Annexes V and VI). Recently, the complete genomes of Zygosaccharomyces<br />
rouxii and Kluyveromyces thermotolerans have been sequenced, showing that Z. rouxii<br />
has four MADS-box transcription factors, like the WGD group, while K.<br />
thermotolerans has just two MADS-box transcription factors, like the Non-WGD<br />
group. Therefore, we could infer that the genome duplication in Saccharomycotina<br />
occurred in taxa that have the four MADS-box transcription factors.<br />
Previous studies suggested that at least one MADS-box gene was present in the<br />
common ancestor of plants, animals, and fungi, and that the duplication that<br />
gave rise to the animal MADS-box type II and type I genes probably occurred after animals<br />
diverged from plants, but before fungi diverged from animals (Theissen et al., 1996).<br />
However, Alvarez-Buylla et al. (2000) proposed that at least one ancestral MADS-box<br />
gene duplicated in the common ancestor of the major eukaryotic kingdoms, more than a<br />
billion years ago, to give rise to the distinct type I (SRF-like) and type II (MEF2-like)
lineages found in plants, fungi, and animals today. Figure 22 presents a scheme of the<br />
MADS-box genes evolution model based on Alvarez-Buylla et al. (2000).<br />
[Figure 22 diagram labels. A) Common ancestor, 1 billion years ago (based on Alvarez-Buylla et al., 2000); Plants lineage: APETALA, PISTILLATA, DEFICIENS, CAULIFLOWER, FRUITFULL, AGAMOUS and SQUAMOSA; Animal and Fungi lineages: MEF2-like and SRF-like isoforms; Animals lineage; Fungi lineage; WGD – Saccharomycotina lineage. B) Saccharomyces cerevisiae, Candida albicans]<br />
Figure 22. A) Evolution of MADS-box genes based on the Alvarez-Buylla et al. (2000) model, proposing<br />
a common origin for Animals, Fungi and Plants and a further duplication event in the Fungi<br />
lineage in some species within the subphylum Saccharomycotina (WGD group). B)<br />
Comparison of Rlm1 protein/gene domains in S. cerevisiae and C. albicans, showing the<br />
frameshift mutations that occurred after genome duplication in S. cerevisiae and WGD species<br />
and changed the encoded amino acid from Q to N.<br />
Animal and fungal type II MADS-box sequences are more closely related to most<br />
plant type II MADS-box sequences than to animal type I MADS-box sequences,<br />
suggesting the occurrence of at least one gene-duplication event before the divergence<br />
of plants and animals. This model also proposes that after the divergence of the<br />
animal-fungi and plant lineages both types of MADS-box genes were passed on to each<br />
lineage. In the plant lineages different evolutionary mechanisms, such as gene duplication,<br />
frameshift mutation, and alternative splicing, produced a variety of<br />
MADS-box transcription factor genes, while fungi and animals each diverged with both<br />
MADS-box types. In animals, alternative splicing has been reported as the main<br />
evolutionary mechanism that changed these genes. Fungal lineages maintained both<br />
MADS-box types, represented by the RLM1 and MCM1 genes, but within the 'Saccharomyces<br />
complex' in subphylum Saccharomycotina, with the whole genome duplication event<br />
occurring in the Saccharomyces sensu stricto and sensu lato clades, the genes SMP1 and<br />
ARG80 appeared, which are paralogues of RLM1 and MCM1, respectively. After this<br />
genome duplication the amino acid encoded in the repetitive region of the RLM1 gene within<br />
the Saccharomyces sensu stricto group changed, owing to a frameshift mutation,<br />
to Asn instead of the Gln found in the Non-WGD and CUG groups, which represent more<br />
basal clades in the subphylum Saccharomycotina.<br />
The putative ancestral RLM1 gene was predicted in an attempt to confirm the microsatellite<br />
codon present in the common ancestor from which the CUG, Non-WGD and WGD groups<br />
diverged (Figure 23). This ancestral gene presented the main characteristic features of<br />
the MADS-box transcription factor previously identified, namely the<br />
MADS-box, the acidic region and regions A and B (Figure 18), but the repetitive<br />
region was not observed. However, in the C-terminal of the protein this predicted<br />
putative ancestral gene/protein showed a low accuracy percentage (59.9% - 43%),<br />
presenting 194 unreliable sites of 1407. These unreliable sites lie in<br />
regions other than the conserved ones, indicating a large variability in the unreliable<br />
regions, where residues were mainly replaced by amino acids with similar properties. On the other<br />
hand, the conserved regions are probably essential for the function of the protein, which would explain<br />
their presence in basal taxa in the subphylum Saccharomycotina.<br />
Several reports indicate that transcription factors and protein kinases are<br />
significantly associated with acidic and polar amino acid repeats, whereas Ser repeats
are significantly associated with membrane transporter proteins (Alba et al., 1999).<br />
Rlm1 is a transcription factor and the repetitive tracts observed are poly-Gln and<br />
poly-Asn, which agrees with the works cited above.<br />
M G R R K I E I Q P I T D D R N R T V T F I K R K A G L F K<br />
5' ATGGGTAGAAGAAAGATTGAAATTCAGCCGATCACGGACGATCGAAACCGGACAGTGACGTTTATCAAGCGTAAGGCCGGTCTATTCAAA 90<br />
>>>.......................................................................................<br />
* * *<br />
K A H E L A V L C Q V D V A V I I F G N N N K L Y E F S S V<br />
5' AAGGCTCACGAGTTGGCTGTGCTTTGCCAGGTAGATGTTGCGGTGATCATCTTTGGCAACAACAATAAGCTGTACGAGTTCTCCTCTGTT 180<br />
............)))......................................................)))..................<br />
*<br />
D T N E L I K R Y Q K N M P H E M K A P E D Y G D Y K K K K<br />
5' GACACAAACGAGTTGATTAAAAGATACCAGAAAAACATGCCGCACGAAATGAAGGCGCCTGAGGACTACGGAGACTACAAGAAGAAAAAA 270<br />
............))).....................>>>.........>>>.......................................<br />
* * * * *<br />
H V G D D R P T A H F N V N N N N D D D D D D D D D E D D D<br />
5' CATGTGGGTGATGATAGACCAACGGCACATTTTAATGTTAATAACAATAATGACGATGATGATGACGATGATGATGATGAAGATGATGAT 360<br />
..........................................................................................<br />
* * * * * * * * * * * * *<br />
D T D N D T P D A N P K D K N E T E K K T Q N Q T P K E L N<br />
5' GATACTGATAACGATACTCCTGATGCCAATCCTAAAGATAAAAACGAGACGGAAAAAAAAACTCAAAATCAAACTCCTAAGGAGCTGAAC 450<br />
....................................................................................)))...<br />
* * * * * * * * * * * * * * * *<br />
H P Q Q P P P P Q H T S Q Q Q V Q K P E A P K D Q P A P P T<br />
5' CATCCTCAGCAACCCCCGCCGCCTCAACATACTTCTCAACAACAAGTACAAAAACCAGAAGCACCTAAAGACCAACCCGCGCCGCCGACA 540<br />
..........................................................................................<br />
* * * * * * * * * * * * *<br />
P P N H T H K N P D T M N Q R P E P P V Q I P T D V Q T N H<br />
5' CCTCCAAACCACACACACAAGAACCCGGATACTATGAACCAAAGACCTGAACCGCCAGTTCAAATTCCAACTGATGTCCAGACTAATCAT 630<br />
.................................>>>......................................................<br />
* * * * * * * * * * * * * * * * * *<br />
Q N D N A K T I T A I H T H K Q K N Q K N T K N K N N K N N<br />
5' CAGAACGATAATGCTAAAACCATTACGGCCATTCACACCCATAAACAAAAAAATCAAAAGAATACTAAGAATAAAAATAATAAAAATAAT 720<br />
..........................................................................................<br />
* * * * * * * * * * * * * * * * * * * * * * * *<br />
L S N Q N H I N S E T T P T T N Y S A S I K T P D S S N H T<br />
5' CTAAGTAATCAGAATCATATTAATAGTGAAACTACTCCAACTACCAATTATTCTGCATCGATCAAAACACCTGATTCATCAAACCATACT 810<br />
..........................................................................................<br />
* * * * * * * * * * * * * *<br />
L P I P T T T K E L P P T P V T A T A P G L P I S K N N P Y<br />
5' TTGCCAATACCAACAACAACTAAAGAACTACCTCCTACTCCAGTTACTGCAACTGCTCCTGGATTACCAATAAGTAAAAATAATCCATAT 900<br />
))).......................................................................................<br />
* * * * * * * * *<br />
F F G S E P P P Q V S P P Y P S I T P I L Q H E I P Q Q D P<br />
5' TTTTTTGGATCCGAACCTCCACCTCAAGTTTCTCCTCCATATCCTTCTATTACACCTATACTTCAACATGAAATTCCTCAACAAGATCCA 990<br />
..........................................................................................<br />
* * * * * * * * * *<br />
P P Q P V G Q H T P Q Y Q Q T T Q T D T N N K K K L P P P H<br />
5' CCGCCGCAACCGGTAGGACAACATACACCTCAATACCAACAAACTACTCAAACAGATACTAACAATAAGAAAAAATTACCACCACCACAT 1080<br />
..........................................................................................<br />
* * * * * * * * * * * * * *<br />
P N L N H P Q N N A A P I P P P E T I E H N P T G E E Q T P<br />
5' CCTAATCTAAATCATCCACAAAATAATGCGGCACCAATACCTCCACCTGAAACAATTGAACATAATCCTACTGGTGAAGAACAAACGCCA 1170<br />
..........................................................................................<br />
* * * * * * * * * * * * * * * * * *<br />
I S G L P S R Y V N D M F P S P S N F Y A P Q D W P T G M T<br />
5' ATATCAGGACTACCATCGCGATATGTTAACGACATGTTTCCATCTCCATCTAACTTCTACGCTCCTCAAGATTGGCCCACTGGTATGACA 1260<br />
.................................>>>................................................>>>...<br />
* * *<br />
P I N A N M P Q Y V A G T I P A G G P K K E Q T R T R K K T<br />
5' CCCATTAATGCTAATATGCCTCAATACGTTGCGGGTACGATTCCTGCAGGAGGTCCTAAGAAAGAACAAACTCGAACTCGCAAAAAAACA 1350<br />
...............>>>........................................................................<br />
* * * * * * * * * * * * * * * * * * *<br />
K G N D K T D N K E D N N K E K K R K<br />
5' AAGGGAAATGATAAAACGGATAATAAGGAAGACAATAATAAAGAAAAAAAGAGAAAA 1407<br />
.........................................................<br />
* * * * * * * * * * * * * *<br />
Figure 23. Predicted putative ancestral RLM1 gene sequence from which CUG, WGD and Non-WGD<br />
groups diverged. MADS-box (red), Acidic region (orange), conserved region A (yellow) and<br />
conserved region B (green). * unreliable sites.<br />
Earlier investigations speculated that the incorporation of more DNA<br />
repeats in eukaryotes might provide a molecular device for faster adaptation to environmental<br />
stresses (Kashi et al., 1997; Marcotte et al. 1999; Wren et al., 2000; Li et al. 2002;<br />
Trifonov, 2003). This speculation has been supported by an increasing number of<br />
experiments. For instance, in S. cerevisiae, microsatellites are overrepresented among<br />
ORFs encoding for regulatory proteins (e.g., transcription factors and protein kinases)<br />
rather than for structural ones, indicating the role of microsatellites as a factor<br />
contributing to fast evolution of adaptive phenotypes (Young et al., 2000). In<br />
prokaryotes, microsatellites are not as abundant as in eukaryotes. Most of the<br />
microsatellites in bacteria are located in virulence genes and/or regulatory regions, and<br />
they affect pathogenesis and bacterial adaptive behavior (Hood et al., 1996; Peak et al.,<br />
1996; van Belkum et al., 1998; Field and Wills, 1998). The contingency genes<br />
containing microsatellites show high mutation rates, allowing the bacterium to act<br />
swiftly on deleterious environmental conditions, showing the role of these repetitive<br />
DNA sequences in natural selection (Moxon et al. 1994). In C. albicans, it has been<br />
demonstrated that the microsatellite region at the 3' end of the RLM1 gene, denominated<br />
CAI, is highly polymorphic (Sampaio et al., 2003, 2009). Additionally, the<br />
analysis of CAI polymorphism in strains from recurrent infections revealed<br />
microevolutionary changes at CAI, suggesting a putative involvement of RLM1 in the<br />
evolutionary and/or adaptive response of C. albicans strains during recurrence (Sampaio<br />
et al. 2005).<br />
The C-terminal of proteins from the MADS-box family is necessary for dimerization<br />
and required for transcriptional activation (Messenguy and Dubois, 2003). It is known<br />
that their regulatory specificity depends on accessory factors, and in many cases the<br />
cofactor with which the protein interacts specifies which genes are regulated, when they<br />
are regulated and whether these genes are transcriptionally activated or repressed (Shore and<br />
Sharrocks, 1995). Moreover, Sampaio et al. (2009) demonstrated that the CAI<br />
repetitive region confers high genetic variability on the RLM1 gene, which is reflected in<br />
strain susceptibility to different stress conditions. Thus, although not evident in the<br />
putative ancestral sequence, this region seems to be important for protein<br />
function. After genome duplication the amino acid encoded in the repetitive region of the<br />
RLM1 gene within the Saccharomyces sensu stricto group changed, possibly due to a<br />
frameshift mutation, to Asn instead of Gln. This change most likely created an<br />
additional or new function for this protein.
5. FINAL REMARKS<br />
Presently, the search for protein-encoding genes that present the desired<br />
characteristics to infer a robust phylogeny is a challenge. Several genes have been used<br />
to construct phylogenetic trees of different species, among which the ribosomal genes<br />
were used to infer most of the phylogenetic relationships among the different kingdoms<br />
of life. In recent years, the use of concatenated gene alignments (multigene analysis)<br />
and of complete genome analysis has become the best approach for resolving<br />
phylogenetic relationships among species whose placements were not well determined<br />
by the single-gene approach. However, the concatenated alignments of genes have<br />
been based only on conserved regions, which can lead to a loss of genetic information that<br />
might be important to resolve the placement of certain species whose position may not be<br />
accurately inferred even with a large number of genes.<br />
In this work we used three nuclear protein-encoding genes: RLM1 and MCM1,<br />
belonging to the family of MADS-box transcription factors, and IFF8, which encodes a GPI-anchored<br />
protein, to infer phylogeny in the kingdom Fungi, using the entire gene sequence in<br />
contrast with the previously mentioned studies. Orthologue sequences for the two genes<br />
RLM1 and MCM1 were identified in 76 species from the kingdom Fungi that have their<br />
genome sequenced, and 8 putative orthologue IFF8 genes were identified in the CUG group.<br />
Results obtained from the phylogenetic analysis using the transcription factor RLM1<br />
indicated that it is suitable for inclusion in a multigene analysis, since<br />
the obtained phylogeny is close to the ones established by multigene and phylogenomic<br />
analyses. The transcription factor Mcm1 presented limitations for use in phylogeny<br />
at the kingdom level because of its variable sequence length, which can lead to errors in<br />
the alignment. This problem could be overcome with extra sequences from other species,<br />
which would help to infer a more accurate phylogeny so that the gene could still be used at the kingdom<br />
level. Despite this limitation, this gene can be used to resolve phylogeny at the phylum<br />
level or in lower clades, because its use, independently and/or concatenated, to infer<br />
phylogeny within the subphylum Saccharomycotina was in agreement with published<br />
studies. On the other hand, the use of the IFF8 gene to infer phylogeny is limited and<br />
restricted to the CUG group, since other orthologues were not found within the kingdom<br />
Fungi and the results obtained for the CUG group phylogeny presented conflicts, particularly<br />
in the position of C. tropicalis that is not in agreement with what has already been<br />
determined as its correct relationship in Candida phylogeny.<br />
It is known that, within subphylum Saccharomycotina, genome<br />
duplication has occurred in the 'Saccharomyces complex', in the groups known as sensu<br />
stricto and sensu lato, resulting in the duplication of some genes. The transcription factors<br />
studied in this work are among the duplicated genes that were maintained.<br />
Understanding the fate of duplicates is fundamental to clarify mechanisms of genetic<br />
redundancy and the link between gene family diversification and phenotypic evolution.<br />
Thus, in this study, the RLM1 gene was analyzed particularly within the subphylum<br />
Saccharomycotina to identify possible sites of positive selection that could provide<br />
alternative explanations for the persistence and diversification of this gene. The analysis<br />
of adaptive evolution within the subphylum Saccharomycotina, using the<br />
conventional evolutionary models, indicated no positive selection in the RLM1 gene<br />
(purifying selection), suggesting that this protein plays an important role in these<br />
organisms. However, a more complete model, the MEC, identified positive selection at<br />
several amino acid sites, indicating nonsynonymous substitutions at some amino acid<br />
positions. These substitutions were present at different positions, but none of these<br />
positions was inside a conserved region, suggesting that these changes occurred due<br />
to different molecular events during the divergence of species. A<br />
particular situation that was further analyzed in this study was the<br />
presence of amino acids under positive selection at the beginning of the repetitive<br />
region, a trinucleotide microsatellite, and the difference in the repeated amino acid<br />
between the species with a duplicated genome and those without.<br />
The S. cerevisiae Rlm1 protein shares a microsatellite region with C.<br />
albicans, C. dubliniensis and K. lactis in the C-terminal of the protein, but there is one<br />
peculiarity about this repetitive region. The amino acid encoded by this trinucleotide<br />
microsatellite in S. cerevisiae is Asn, and in the other species it is Gln. Closer<br />
analysis of the different reading frames of the RLM1 gene led to the conclusion that the<br />
most likely molecular mechanism that changed the amino acids at the microsatellite<br />
region was a frameshift mutation by insertion of nucleotides and not an alteration of<br />
nucleotides in the codons, as is generally the case in nonsynonymous substitutions.<br />
This frameshift mutation did not introduce stop codons in the<br />
C-terminal, thereby avoiding the loss of this region in the protein. This change of amino acids in<br />
the microsatellite of the S. cerevisiae RLM1 gene is expected to have brought structural and<br />
functional alterations to the protein.<br />
Another peculiarity of this gene is the absence of these microsatellites in basal<br />
species, within the subphylum Saccharomycotina, indicating that their evolution has<br />
occurred during the divergence of species. This was further confirmed by the<br />
observation that the predicted ancestral RLM1 gene presented the main features<br />
characteristic of the MADS-box transcription factor that has been previously identified<br />
namely the MADS-box, the acidic region and the regions A and B, but the repetitive<br />
region was absent. This absence would be related with the role that the microsatellites<br />
play in the functionality of these proteins in some species and adaptation to different<br />
environmental conditions. Thus, the process of duplication and genome reorganization<br />
influenced the restructuring of the S. cerevisiae RLM1 gene, decreasing the selective<br />
pressure on this gene and enabling a frameshift mutation, which shifted the reading of<br />
codons from Gln to Asn. Additionally, this process of gene duplication affected not only<br />
the RLM1 gene but also the MCM1 gene in the Saccharomyces sensu stricto and<br />
Saccharomyces sensu lato groups. This observation points to the potential use of<br />
MADS-box transcription factors in studies of the identification and process of genome<br />
duplication within subphylum Saccharomycotina, given the presence of two or four<br />
MADS-box genes in these yeast species.<br />
The major findings of this work were (i) the identification of the potential use of the<br />
RLM1 gene for inferring phylogeny in the kingdom Fungi, with special emphasis on<br />
Candida species, and (ii) the observation that this gene evolved within the<br />
Saccharomyces sensu stricto group after genome duplication, the molecular<br />
mechanism responsible for the change observed in the C-terminal of this protein most<br />
probably being a frameshift mutation.<br />
6. REFERENCES<br />
Abascal, F., Zardoya, R., and Posada, D. (2005). ProtTest: Selection of best-fit<br />
models of protein evolution. Bioinformatics. 21(9): 2104-2105.<br />
Adachi, J., and Hasegawa, M. (1995). MOLPHY: Programs for Molecular<br />
Phylogenetics. Tokyo: Inst. Statist. Math.<br />
Alba, M. M., Santibáñez-Koref, M. F., and Hancock, J. M. (1999). Amino acid<br />
reiterations in yeast are overrepresented in particular classes of proteins and<br />
show evidence of a slippage-like mutational process. J Mol Evol. 49: 789–797.<br />
Alexopoulos, C. J., Mims, C. W., and Blackwell, M. (1996). Introductory<br />
Mycology. 4th edition. John Wiley & Sons Inc., New York, USA.<br />
Alfaro, M. E., Zoller, S., and Lutzoni, F. (2003). Bayes or bootstrap? A<br />
simulation study comparing the performance of Bayesian Markov chain Monte<br />
Carlo sampling and bootstrapping in assessing phylogenetic confidence.<br />
Molecular Biology and Evolution. 20: 255–266.<br />
Althoefer, H., Schleiffer, A., Wassmann, K., Nordheim, A., and Ammerer, G.<br />
(1995). Mcm1 is required to coordinate G2-specific transcription in<br />
Saccharomyces cerevisiae. Mol Cell Biol. 15: 5917–5928.<br />
Alvarez-Buylla, E. R., Pelaz, S., Liljegren, S. J., Gold, S. E., Burgeff, C., Ditta,<br />
G. S., Ribas de Pouplana, L., Martinez-Castilla, L., and Yanofsky, M. F. (2000).<br />
An ancestral MADS-box gene duplication occurred before the divergence of<br />
plants and animals. Proc Natl Acad Sci. 97: 5328–5333.<br />
Anisimova, M., Bielawski, J. P., and Yang, Z. (2001). Accuracy and power of<br />
the likelihood ratio test in detecting adaptive molecular evolution. Mol Biol<br />
Evol. 18(8): 1585-1592.<br />
Anisimova, M., Bielawski, J. P., and Yang, Z. (2002). Accuracy and power of<br />
Bayes prediction of amino acid sites under positive selection. Mol Biol Evol. 19:<br />
950-958.<br />
Anisimova, M., and Gascuel, O. (2006). Approximate Likelihood-Ratio Test for<br />
branches: A fast, accurate and powerful alternative. Systematic Biology. 55(4):<br />
539-552.<br />
Aragona, M., Montigiani, M., and Porta-Puglia, A. (2000). Electrophoretic<br />
karyotypes of the phytopathogenic Pyrenophora graminea and P. teres.<br />
Mycological Research. 104: 853-857.<br />
Baillie, G. S., and Douglas, L. J. (1998). Effect of growth rate on resistance of<br />
Candida albicans biofilms to antifungal agents. Antimicrob Agents Chemother.<br />
42: 1900–1905.<br />
Baldauf, S. L. (1999). A search for the origins of animals and fungi: comparing<br />
and combining molecular data. American Naturalist. 154: S178-S188.<br />
Baldauf, S. L., and Palmer, J. D. (1993). Animals and fungi are each other's<br />
closest relatives: congruent evidence from multiple proteins. Proc Natl Acad Sci.<br />
90: 11558–11562.<br />
Baldauf, S. L., Roger, A. J., Wenk-Siefert, I., and Doolittle, W. F. (2000). A<br />
kingdom-level phylogeny of eukaryotes based on combined protein data.<br />
Science. 290: 972–977.<br />
Ball, L. M., Bes, M. A., Theelen, B., Boekhout, T., Egeler, R., and Kuijper, E. J.<br />
(2004). Significance of amplified fragment length polymorphism in<br />
identification and epidemiological examination of Candida species colonization<br />
in children undergoing allogeneic stem cell transplantation. J Clin Microbiol.<br />
42(4): 1673-1679.<br />
Ballesté, R., Arteta, Z., Fernández, N., Mier, C., Mousqués, N., Xavier, B.,<br />
Cabrera, M. J., Acosta, G., Combol, A., and Gezuele, E. (2005). Evaluación del<br />
medio cromógeno CHROMagar Candida™ para la identificación de levaduras<br />
de interés médico. Rev Med Uruguay. 21: 186-193.<br />
Baker, E. E., Mrak, M., and Smith, C. E. (1943). The morphology, taxonomy,<br />
and distribution of Coccidioides immitis Rixford and Gilchrist 1896. Farlowia.<br />
1: 199-244.<br />
Bapteste, E., Brinkmann, H., Lee, J. A., Moore, D. V., Sensen, C. W., Gordon,<br />
P., Duruflé, L., Gaasterland, T., Lopez, P., Müller, M., and Philippe, H. (2001).<br />
The analysis of 100 genes supports the grouping of three highly divergent<br />
amoebae: Dictyostelium, Entamoeba, and Mastigamoeba. Proc Natl Acad Sci.<br />
99: 1414–1419.<br />
Barr, D. J. S. (1992). Evolution and kingdoms of organisms from the perspective<br />
of a mycologist. Mycologia. 84: 1–11.<br />
Barros Lopes, M., Rainieri, S., Henschke, P.A., and Langridge, P. (1999). AFLP<br />
fingerprinting for analysis of yeast genetic variation. Int J Syst Bacteriol. 49:<br />
915-924.
Bartnicki-Garcia, S. (1987). The cell wall in fungal evolution, p. 389-403. In A.<br />
D. M. Rayner, C. M. Brasier, and D. Moore (ed.), Evolutionary biology of the<br />
fungi. Cambridge University Press, New York, N. Y.<br />
Bhattacharya, D., Lutzoni, F., Reeb, V., Simon, D., Nason, J., and Fernandez, F.<br />
(2000). Widespread occurrence of spliceosomal introns in the rDNA genes of<br />
Ascomycetes. Molecular Biology and Evolution. 17: 1971–1984.<br />
Bauer, R., Begerow, D., Sampaio, J.P., Weiß, M., and Oberwinkler, F. (2006).<br />
The simple-septate basidiomycetes: a synopsis. Mycological Progress. 5: 41–66.<br />
Beckmann, J. S., and Weber, J. L. (1992). Survey of human and rat<br />
microsatellites. Genomics. 12: 627–631.<br />
Berbee, M. L. (1996). Loculoascomycete origins and evolution of filamentous<br />
ascomycete morphology based on 18S rDNA gene sequence data. Molecular<br />
Biology and Evolution. 13: 462–470.<br />
Berbee, M. L. (2001). The phylogeny of plant and animal pathogens in the<br />
Ascomycota. Physiology and Molecular Plant Pathology. 59: 165–187.<br />
Berbee, M. L., and Taylor, J. W. (1993). Dating the evolutionary radiations of<br />
the true fungi. Canadian Journal of Botany. 71: 1114–1127.<br />
Bernal, S., Mazuelos, E. M., Chavez, M., Coronilla, J., and Valverde, A. (1998).<br />
Evaluation of the new API Candida system for identification of the most<br />
clinically important yeast species. Diagnostic Microbiology and Infectious<br />
Disease. 32(3): 217-221.<br />
Binder, M., and Hibbett, D. S. (2002). Higher-level phylogenetic relationships of<br />
Homobasidiomycetes (mushroom-forming fungi) inferred from four rDNA<br />
regions. Molecular Phylogenetics and Evolution. 22: 76–90.<br />
Binder, M., Hibbett, D. S., and Molitoris, H. P. (2001). Phylogenetic<br />
relationships of the marine gasteromycete Nia vibrissa. Mycologia. 93: 679–<br />
688.<br />
Binder, M., Hibbett, D. S., Larsson, K-H., Larsson, E., Langer, E., and Langer,<br />
G. (2005). The phylogenetic distribution of resupinate forms across the major<br />
clades of mushroom-forming fungi (Homobasidiomycetes). Systematics and<br />
Biodiversity. 3(2): 113-157.<br />
Blackwell, M. (2000). Evolution. Terrestrial life-fungal from the start? Science.<br />
293: 1129-1133.<br />
Blackwell, M., Hibbett, D. S., Taylor, J., and Spatafora, J. W. (2006). Research<br />
coordination networks: a phylogeny for kingdom Fungi (Deep Hypha).<br />
Mycologia. 98(6): 829-837.<br />
Boeke, J. D., Garfinkel, D. J., Styles, C. A., and Fink, G. R. (1985). Ty elements<br />
transpose through an RNA intermediate. Cell. 40: 491–500.<br />
Botterel, F., Desterke, C., Costa, C., and Bretagne, S. (2001). Analysis of<br />
microsatellite markers of Candida albicans used for rapid typing. J Clin<br />
Microbiol. 39: 4076-4081.<br />
Bowman, B. H., White, T. J., and Taylor, J. W. (1996). Human pathogenic<br />
fungi and their close nonpathogenic relatives. Mol Phylogenet Evol. 6(1): 89-96.<br />
Bowman, S. M., Piwowar, A., Al Dabbous, M., Vierula, J., and Free, S. J.<br />
(2006). Mutational analysis of the glycosylphosphatidylinositol (GPI) anchor<br />
pathway demonstrates that GPI-anchored proteins are required for cell wall<br />
biogenesis and normal hyphal growth in Neurospora crassa. Eukaryot Cell. 5:<br />
587–600.<br />
Bretagne, S., Costa, J. M., Besmond, C., Carsique, R., and Calderone, R. (1997).<br />
Microsatellite polymorphism in the promoter sequence of the elongation factor 3<br />
gene of Candida albicans as the basis for a typing system. J Clin Microbiol. 35:<br />
1777-1780.<br />
Bridge, P. D., Spooner, B. M., and Roberts, P. J. (2005). The impact of<br />
molecular data in fungal systematics. Adv Bot Res. 42: 33–67.<br />
Bruno, V. M., Kalachikov, S., Subaran, R., Nobile, C. J., Kyratsous, C., and<br />
Mitchell, A. P. (2006). Control of the C. albicans cell wall damage response by<br />
transcriptional regulator Cas5. PLoS Pathogens. 2(3): e21.<br />
Bruns, T. D., Vilgalys, R., Barns, S. M., Gonzalez, D., Hibbett, D. S., Lane, D.<br />
J., Simon, L., Stickel, S., Szaro, T. M., Weisburg, W. G., and Sogin, M. L.<br />
(1992). Evolutionary relationships within the fungi: analyses of nuclear small<br />
subunit RNA sequences. Molecular Phylogenetics and Evolution. 1: 231–241.<br />
Busch, U., and Nitschko, H. (1999). Methods for the differentiation of<br />
microorganisms. Journal of Chromatography B. 722: 263-278.<br />
Byrne, K. P., and Wolfe, K. H. (2006). Visualizing syntenic relationships among<br />
the hemiascomycetes with the Yeast Gene Order Browser. Nucleic Acids Res.<br />
34: D452-D455.
Calderone, R. A. (2002). Candida and Candidiasis. ASM Press. Washington,<br />
DC, USA.<br />
Carlile, M. J., and Watkinson, S. C. (1994). The Fungi. Academic Press, Ltd.,<br />
London, United Kingdom.<br />
Cavalier-Smith, T. (1981). Eukaryote kingdoms: seven or nine? Biosystems.<br />
14(3-4): 461-481.<br />
Cavalier-Smith, T. (1987a). The origin of eukaryotic and archaebacterial cells.<br />
Ann N Y Acad Sci. 503: 17-54.<br />
Cavalier-Smith, T. (1987b). The origin of Fungi and pseudofungi. In: Rayner A.<br />
D. M., Brasier C. M. and Moore D. (eds): Evolutionary Biology of the Fungi,<br />
pp. 339–353. Cambridge University Press.<br />
Cavalier-Smith, T. (1998). A revised six-kingdom system of life. Biol. Rev.<br />
Camb. Philos. Soc. 73: 203–266.<br />
Cavalier-Smith, T. (2000). Flagellate megaevolution: the basis for eukaryote<br />
diversification. In: Green J. R. and Leadbeater B. S. C. (eds): The Flagellates,<br />
pp. 361-390. Taylor and Francis. London.<br />
Cavalier-Smith, T. (2002a). The neomuran origin of archaebacteria, the<br />
negibacterial root of the universal tree and bacterial megaclassification. Int J<br />
Syst Evol Microbiol. 52(Pt 1): 7-76.<br />
Cavalier-Smith, T. (2002b). The phagotrophic origin of eukaryotes and<br />
phylogenetic classification of Protozoa. Int J Syst Evol Microbiol. 52(Pt 2):<br />
297-354.<br />
Cavalier-Smith, T. (2002c). Chloroplast evolution: Secondary symbiogenesis<br />
and multiple losses. Curr Biol. 12: R62-R64.<br />
Cavalier-Smith, T. (2003). Protist phylogeny and the high-level classification of<br />
Protozoa. Europ J Protistol. 39: 338-348.<br />
Cavalier-Smith, T. (2004). Only six kingdoms of life. Proc R Soc Lond B. 271:<br />
1251-1262.<br />
Cavalier-Smith, T. (2006a). Rooting of tree of life by transition analysis.<br />
Biology Direct. 1: 19.<br />
Cavalier-Smith, T. (2006b). Cell evolution and earth history: stasis and<br />
revolution. Phil Trans Roy Soc B. 361: 969-1006.<br />
Cavalier-Smith, T. (2006c). Protozoa: the most abundant predators on earth.<br />
Microbiology today. 166-169.<br />
Cavalier-Smith, T., and Chao, E. E. (1995). The opalozoan Apusomonas is<br />
related to the common ancestor of animals, fungi, and choanoflagellates. Proc R<br />
Soc Lond B. 261: 1–9.<br />
Cavalier-Smith, T., and Chao, E. E. (2003). Phylogeny of Choanozoa,<br />
Apusozoa, and other Protozoa and early eukaryote megaevolution. J Mol Evol.<br />
56: 540–563.<br />
Cavalier-Smith, T., Chao, E. E., and Oates, B. (2004). Molecular phylogeny of<br />
Amoebozoa and the evolutionary significance of the unikont Phalansterium. Eur<br />
J Protistol. 40: 21-48.<br />
Cavalli-Sforza, L. L., and Edwards, A. W. F. (1967). Phylogenetic analysis:<br />
models and estimation procedures. Am J Hum Genet. 19: 122–257.<br />
Chapela, I. H. (1991). Spore size revisited: analysis of spore populations using<br />
automated particle size. Sydowia. 43: 1–14.<br />
Chen, T. C., Chen, Y. H., Tsai, J. J., Peng, C. F., Lu, P. L., Chang, K., Hsieh, H.<br />
C., and Chen, T. P. (2005). Epidemiologic analysis and antifungal susceptibility<br />
of Candida blood isolates in Southern Taiwan. Journal of Microbiology,<br />
Immunology and Infection. 38: 200-210.<br />
Chistiakov, D. A., Hellemans, B., and Volckaert, F. A. M. (2006).<br />
Microsatellites and their genomic distribution, evolution, function and<br />
applications: A review with special reference to fish genetics. Aquaculture. 255:<br />
1-29.<br />
Ciferri, R., and Redaelli, P. (1936). Morfologia, biologia e posizione sistemática<br />
di Coccidioides immitis stiles e delle sue varieta, con notizie sul granuloma<br />
coccidioide. R Accad Ital. 7: 399-474.<br />
Clark, T. A., Slavinski, S. A., Morgan, J., Lott, T., Arthington-Skaggs, B. A.,<br />
Brandt, M. E., Webb, R. M., Currier, M., Flowers, R. H., Fridkin, S. K., and<br />
Hajjeh, R. A. (2004). Epidemiologic and molecular characterization of an<br />
outbreak of Candida parapsilosis bloodstream infections in a community<br />
hospital. J Clin Microbiol. 42(10): 4468-4472.<br />
Cole, G. T., and Samson, R. A. (1979). Patterns of development in conidial fungi.<br />
Pitman, London, United Kingdom.
Coleman, D. C., Rinaldi, M. G., Haynes, K. A., Rex, J. H., Summerbell, R. C.,<br />
Anaissie, E. J., Li, A., and Sullivan, D. J. (1998). Importance of Candida species<br />
other than Candida albicans as opportunistic pathogens. Medical Mycology. 36:<br />
156–165.<br />
Copeland, H. (1956). The Classification of Lower Organisms. Palo Alto: Pacific<br />
Books.<br />
Caro, L. H., Tettelin, H., Vossen, J. H., Ram, A. F., van den Ende, H., and Klis,<br />
F. M. (1997). In silico identification of glycosyl-phosphatidylinositol-anchored<br />
plasma-membrane and cell wall proteins of Saccharomyces cerevisiae. Yeast.<br />
13: 1477–1489.<br />
Cordeiro, G. M., Casu, R., McIntyre, C. L., Manners, J. M., and Henry, R. J.<br />
(2001). Microsatellite markers from sugarcane (Saccharum spp) ESTs cross<br />
transferable to erianthus and sorghum. Plant Sci. 160: 1115–1123.<br />
Correia, A., Sampaio, P., James, S., and Pais, C. (2006). Candida bracarensis<br />
sp. nov., a novel anamorphic yeast species phenotypically similar to Candida<br />
glabrata. Int J Syst Evol Microbiol. 56: 313-317.<br />
Damveld, R. A., Arentshorst, M., Franken, A., vanKuyk, P. A., Klis, F. M., van<br />
den Hondel, C. A. M. J. J., and Ram, A. F. J. (2005). The Aspergillus niger<br />
MADS-box transcription factor RlmA is required for cell wall reinforcement in<br />
response to cell wall stress. Molecular Microbiology. 58(1): 305-319.<br />
Daniel, H. M., Sorrell, T. C., and Meyer, W. (2001). Partial sequence analysis of<br />
the actin gene and its potential for studying the phylogeny of Candida species<br />
and their teleomorphs. Int J Syst Evol Microbiol. 51(Pt 4): 1593-1606.<br />
De Groot, P. W., de Boer, A. D., Cunningham, J., Dekker, H. L., de Jong, L.,<br />
Hellingwerf, K. J., de Koster, C., and Klis, F. M. (2004). Proteomic analysis of<br />
Candida albicans cell walls reveals covalently bound carbohydrate-active<br />
enzymes and adhesins. Eukaryot Cell. 3: 955–965.<br />
De Nadal, E., Casadomé, L., and Posas, F. (2003). Targeting the MEF2-like<br />
transcription factor Smp1 by the stress-activated Hog1 mitogen-activated protein<br />
kinase. Mol Cell Biol. 23(1): 229-237.<br />
Dib, J. C., Dube, M., Kelly, C., Rinaldi, M. G., and Patterson, J. E. (1996).<br />
Evaluation of pulsed-field gel electrophoresis as a typing system for Candida<br />
rugosa: comparison of karyotype and restriction fragment length<br />
polymorphisms. J Clin Microbiol. 34: 1494–1496.<br />
Diezmann, S., Cox, C. J., Schönian, G., Vilgalys, R., and Mitchell, T. (2004).<br />
Phylogeny and evolution of medical species of Candida and related taxa: A<br />
multigenic analysis. J Clin Microbiol. 42(12): 5624-5635.<br />
Dodou, E., and Treisman, R. (1997). The Saccharomyces cerevisiae MADS-box<br />
transcription factor Rlm1 is a target for the Mpk1 mitogen-activated protein<br />
kinase pathway. Mol Cell Biol. 17: 1848–1859.<br />
Doron-Faigenboim, A., and Pupko, T. (2006). A combined empirical and<br />
mechanistic codon model. Mol Biol Evol. 24(2): 388-397.<br />
Douady, C. J., Delsuc, F., Boucher, Y., Doolittle, W. F., and Douzery, E. J. P.<br />
(2003). Comparison of the Bayesian and maximum likelihood bootstrap<br />
measures of phylogenetic reliability. Molecular Biology and Evolution. 20: 248–<br />
254.<br />
Douzery, E. J., Snell, E. A., Bapteste, E., Delsuc, F., and Philippe, H. (2004). The<br />
timing of eukaryotic evolution: does a relaxed molecular clock reconcile<br />
proteins and fossils? Proc Natl Acad Sci. 101(43): 15386-15391.<br />
Dubois, E., Bercy, J., and Messenguy, F. (1987). Characterization of two genes,<br />
ARGRI and ARGRIII required for specific regulation of arginine metabolism in<br />
yeast. Mol Gen Genet. 207: 142–148.<br />
Dujon, B., Sherman, D., Fischer, G., Durrens, P., Casaregola, S., Lafontaine, I.,<br />
De Montigny, J., Marck, C., Neuveglise, C., Talla, E., Goffard, N., Frangeul, L.,<br />
Aigle, M., Anthouard, V., Babour, A., Barbe, V., Barnay, S., Blanchin, S.,<br />
Beckerich, J. M., Beyne, E., Bleykasten, C., Boisrame, A., Boyer, J., Cattolico,<br />
L., Confanioleri, F., De Daruvar, A., Despons, L., Fabre, E., Fairhead, C., Ferry-<br />
Dumazet, H., Groppi, A., Hantraye, F., Hennequin, C., Jauniaux, N., Joyet, P.,<br />
Kachouri, R., Kerrest, A., Koszul, R., Lemaire, M., Lesur, I., Ma, L., Muller, H.,<br />
Nicaud, J. M., Nikolski, M., Oztas, S., Ozier-Kalogeropoulos, O., Pellenz, S.,<br />
Potier, S., Richard, G. F., Straub, M. L., Suleau, A., Swennen, D., Tekaia, F.,<br />
Wesolowski-Louvel, M., Westhof, E., Wirth, B., Zeniou-Meyer, M., Zivanovic,<br />
I., Bolotin-Fukuhara, M., Thierry, A., Bouchier, C., Caudron, B., Scarpelli, C.,<br />
Gaillardin, C., Weissenbach, J., Wincker, P., and Souciet, J. L. (2004). Genome<br />
evolution in yeasts. Nature. 430(6995): 35-44.
Edel, V., Steinberg, C., Gautheron, N., and Alabouvette, C. (1996). Evaluation<br />
of restriction analysis of polymerase chain reaction (PCR)-amplified ribosomal<br />
DNA for the identification of Fusarium species. Mycol Res. 101: 179–187.<br />
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy<br />
and high throughput. Nucleic Acids Research. 32(5): 1792-1797.<br />
Eisenhaber, B., Maurer-Stroh, S., Novatchkova, M., Schneider, G., and<br />
Eisenhaber, F. (2003). Enzymes and auxiliary factors for GPI lipid anchor<br />
biosynthesis and post-translational transfer to proteins. Bioessays. 25: 367–385.<br />
Eisenhaber, B., Schneider, G., Wildpaner, M., and Eisenhaber, F. (2004). A<br />
sensitive predictor for potential GPI lipid modification sites in fungal protein<br />
sequences and its application to genome-wide studies for Aspergillus nidulans,<br />
Candida albicans, Neurospora crassa, Saccharomyces cerevisiae and<br />
Schizosaccharomyces pombe. J Mol Biol. 337: 243–253.<br />
Elble, R., and Tye, B. K. (1992). Chromosome loss, hyperrecombination, and<br />
cell cycle arrest in a yeast mcm1 mutant. Mol Biol Cell. 3: 971–980.<br />
Ellegren, H. (2000). Microsatellite mutations in the germline: implications for<br />
evolutionary inference. Trends Genet. 16: 551–558.<br />
Eriksson, O. E., Baral, H. O., Currah, R. S., Hansen, K., Kurtzman, C. P.,<br />
Rambold, G., and Laessøe, T (eds). (2004). Outline of Ascomycota— 2004.<br />
Myconet, 10: 1–99.<br />
Errede, B. (1993). MCM1 binds to a transcriptional control element in Ty1. Mol<br />
Cell Biol. 13: 57–62.<br />
Estrella, M. C., Mellado, E., Guerra, T. M. D., Monzón, A., and Tudela, J. L. R.<br />
(2000). Susceptibility of fluconazole-resistant clinical isolates of Candida spp.<br />
to echinocandin LY303366, itraconazole and amphotericin B. Journal of<br />
Antimicrobial Chemotherapy. 46: 475-477.<br />
Evans, E. E. (1950). The antigenic composition of Cryptococcus neoformans. I.<br />
A serologic classification by means of the capsular and agglutination reactions. J<br />
Immunol. 64: 423–430.<br />
Fabre, E., Muller, H., Therizols, P., Lafontaine, I., Dujon, B., and Fairhead, C.<br />
(2005). Comparative genomics in hemiascomycete yeasts: evolution of sex,<br />
silencing, and subtelomeres. Mol Biol Evol. 22(4): 856-873.<br />
Fanci, R., and Pecile, P. (2005). Central venous catheter-related infection due to<br />
Candida membranaefaciens, a new opportunistic azole-resistant yeast in a<br />
cancer patient: a case report and a review of literature. Mycoses. 48: 357-359.<br />
Felsenstein, J. (1978). Cases in which parsimony or compatibility methods will<br />
be positively misleading. Syst Zool. 27: 401–410.<br />
Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum<br />
likelihood approach. J Mol Evol. 17: 368–376.<br />
Ferguson, M. A. (1999). The structure, biosynthesis and functions of<br />
glycosylphosphatidylinositol anchors, and the contributions of trypanosome<br />
research. J Cell Sci. 112: 2799–2809.<br />
Fernandes, L., Araújo, M. A. M., Amaral, A., Reis, V. C. B., Martins, N. F., and<br />
Felipe, M. S. (2005). Cell signalling pathways in Paracoccidioides brasiliensis –<br />
inferred from comparisons with other fungi. Genet Mol Res. 4(2): 216-231.<br />
Field, D., and Wills, C. (1996). Long, polymorphic microsatellites in simple<br />
organisms. Proc R Soc London Ser B. 263: 209–251.<br />
Fink, G. R. (1987). Pseudogenes in yeast? Cell. 49: 5–6.<br />
Fitzpatrick, D., Logue, M., Stajich, J., and Butler, G. (2006). A fungal phylogeny<br />
based on 42 complete genomes derived from supertree and combined gene<br />
analysis. BMC Evolutionary Biology. 6: 99.<br />
Fluit, A. C., Jones, M. E., Schmitz, F-J., Acar, J., Gupta, R., Verhoef, J., and the<br />
SENTRY Participants Group. (2000). Antimicrobial susceptibility and<br />
frequency of occurrence of clinical blood isolates in Europe from the SENTRY<br />
Antimicrobial Surveillance Program, 1997 and 1998. Clin Infect Dis. 30: 454–<br />
60.<br />
Frisvad, J. C. (1994). Classification of organisms by secondary metabolites, p.<br />
303-321. In D. L. Hawksworth (ed.), The identification and characterization of<br />
pest organisms. CAB International, Wallingford, United Kingdom.<br />
García-Martos, P., Ruiz-Aragón, J., Garcia-Agudo, L., Saldarreaga, A., Lozano,<br />
M., and Marín, P. (2004). Aislamiento de Candida ciferri en un paciente<br />
inmunodeficiente. Rev Iberoam Micol. 21: 85-86.<br />
Germot, A., Philippe, H., and Le Guyader, H. (1997). Evidence for loss of<br />
mitochondria in microsporidia from a mitochondrial HSP70 in Nosema locustae.<br />
Molecular and Biochemical Parasitology. 87: 159–168.
Geiser, D. M., Gueidan, C., Miadlikowska, J., Lutzoni, F., Kauff, F., Hofstetter,<br />
V., Fraker, E., Schoch, C. L., Tibell, L., Untereiner, W. A., and Aptroot, A.<br />
(2006). Eurotiomycetes: Eurotiomycetidae and Chaetothyriomycetidae.<br />
Mycologia. 98(6): 1053-1064.<br />
Gill, E. E., and Fast, N. M. (2006). Assessing the microsporidia–fungi<br />
relationship: combined phylogenetic analysis of eight genes. Gene. 375: 103–<br />
109.<br />
Graf, B., Adam, T., Zill, E., and Göbel, U. B. (2000). Evaluation of the VITEK 2<br />
system for rapid identification of yeasts and yeast-like organisms. Journal of<br />
Clinical Microbiology. 38(5): 1782-1785.<br />
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation<br />
and Bayesian model determination. Biometrika. 82: 711-732.<br />
Gribaldo, S., and Philippe, H. (2002). Ancient phylogenetic relationships. Theor<br />
Popul Biol. 61(4): 391-408.<br />
Guarro, J., Gené, J., and Stchigel, A. (1999). Developments in fungal taxonomy.<br />
Clin Microbiol Rev. 12(3): 454-500.<br />
Gudlaugsson, O., Gillespie, S., Lee, K., Van de Berg, J., Hu, J., Messer, S.,<br />
Herwaldt, L., Pfaller, M., and Diekema, D. (2003). Attributable mortality of<br />
nosocomial candidemia. Clin Infect Dis. 37: 1172-1177.<br />
Guindon, S., and Gascuel, O. (2003). A simple, fast, and accurate algorithm to<br />
estimate large phylogenies by maximum likelihood. Systematic Biology. 52(5):<br />
696-704.<br />
Guindon, S., Lethiec, F., Duroux, P., and Gascuel, O. (2005). PHYML Online –<br />
a web server for fast maximum likelihood-based phylogenetic inference. Nucleic<br />
Acids Research. 33: W557-W559.<br />
Gur-Arie, R., Cohen, C., Eitan, L., Shelef, E., Hallerman, E., and Kashi, Y.<br />
(2000). Abundance, non-random genomic distribution, nucleotide composition,<br />
and polymorphism of simple sequence repeats in E. coli. Genome Res. 10: 61–<br />
70.<br />
Gutierrez, J., Morales, P., Gonzalez, M. A., and Quindos, G. (2002). Candida<br />
dubliniensis, a new fungal pathogen. J. Basic Microbiol. 42: 207–227.<br />
Haeckel, E. (1866). Generelle Morphologie der Organismen. Reimer, Berlin.<br />
Hall, B. G. (2008). Phylogenetic trees made easy: A how to manual for<br />
molecular biologists. Third Edition. Sinauer Associates, Inc. Sunderland,<br />
Massachusetts, USA.<br />
Harding, R. M., Boyce, A. J., and Clegg, J. B. (1992). The evolution of tandemly<br />
repetitive DNA: recombination rules. Genetics. 132: 847–859.<br />
Hasenclever, H. F., and Mitchell, W. O. (1961). Antigenic studies of Candida. I.<br />
Observations of two antigenic groups in Candida albicans. J Bacteriol. 82: 570<br />
– 573.<br />
Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains<br />
and their applications. Biometrika. 57: 97-109.<br />
Hawksworth, D. L. (1991). The fungal dimension of biodiversity-magnitude,<br />
significance, and conservation. Mycological Research. 95: 641.<br />
Hawksworth, D. L. (2001). The magnitude of fungal diversity: the 1.5 million<br />
species estimate revisited. Mycological Research. 105: 1422-1433.<br />
Hawksworth, D. L., Kirk, P. M., Sutton, B. C., and Pegler, D. N. (1995).<br />
Ainsworth & Bisby's Dictionary of the Fungi. 8th ed. CAB International,<br />
Wallingford, Oxon, United Kingdom.<br />
Hazen, K. C. (1995). New and emerging yeast pathogens. Clin Microbiol<br />
Reviews. 8: 462–478.<br />
Heath, I. B. (1986). Nuclear division: a marker for protist phylogeny? Progress<br />
in Protistology. 1: 115-162.<br />
Hibbett, D. S., and Binder, M. (2001). Evolution of marine mushrooms.<br />
Biological Bulletin. 201: 319–322.<br />
Hibbett, D. S., Gilbert, L. B., and Donoghue, M. J. (2000). Evolutionary<br />
instability of ectomycorrhizal symbiosis in basidiomycetes. Nature. 407: 506–<br />
508.<br />
Hibbett, D. S., Binder, M., Bischoff, J. F., Blackwell, M., Cannon, P. F.,<br />
Eriksson, O. E., Huhndorf, S., James, T., Kirk, P. M., Lücking, R., Lumbsch, T.,<br />
Lutzoni, F., Matheny, P. B., Mclaughlin, D. J., Powell, M. J., Redhead, S.,<br />
Schoch, C. L., Spatafora, J. W., Stalpers, J. A., Vilgalys, R., Aime, M. C.,<br />
Aptroot, A., Bauer, R., Begerow, D., Benny, G. L., Castlebury, L. A., Crous, P.<br />
W., Dai, Y-C., Gams, W., Geiser, D. M., Griffith, G. W., Gueidan, C.,<br />
Hawksworth, D. L., Hestmark, G., Hosaka, K., Humber, R. A., Hyde, K.,
Ironside, J. E., Kõljalg, U., Kurtzman, C. P., Larsson, K-H., Lichtwardt, R.,<br />
Longcore, J., Miadlikowska, J., Miller, A., Moncalvo, J-M., Mozley-Standridge,<br />
S., Oberwinkler, F., Parmasto, E., Reeb, V., Rogers, J. D., Roux, C., Ryvarden,<br />
L., Sampaio, J. P., Schüßler, A., Sugiyama, J., Thorn, R. G., Tibell, L.,<br />
Untereiner, W. A., Walker, C., Wang, Z., Weir, A., Weiß, M., White, M. M.,<br />
Winka, K., Yao, Y-J., and Zhang, N. (2007). A higher-level phylogenetic<br />
classification of the Fungi. Mycological Research. 111: 509-547.<br />
Hirt, R. P., Healy, B., Vossbrinck, C. R., Canning, E. U., and Embley, T. M.<br />
(1997). A mitochondrial Hsp70 orthologue in Vairimorpha necatrix: molecular<br />
evidence that Microsporidia once contained mitochondria. Current Biology. 7:<br />
995–998.<br />
Hood, D. W., Deadman, M. E., Jennings, M. P., Bisercic, M., Fleischmann, R.<br />
D., Venter, J. C., and Moxon, E. R. (1996). DNA repeats identify novel<br />
virulence genes in Haemophilus influenzae. Proc Natl Acad Sci. 93: 11121–<br />
11125.<br />
Horowitz, B. J., Giaquinta, D., and Ito, S. (1992). Evolving pathogens in<br />
vulvovaginal candidiasis: Implications for patient care. J Clin Pharmacol. 32(3):<br />
248-255.<br />
Huelsenbeck, J. P., and Ronquist, F. (2001). MrBayes: Bayesian inference of<br />
phylogenetic trees. Bioinformatics. 17(8): 754-755.<br />
Huelsenbeck, J. P., Larget, B., Miller, R. E., and Ronquist, F. (2002). Potential<br />
applications and pitfalls of Bayesian inference of phylogeny. Syst Biol. 51(5):<br />
673–688.<br />
Hull, C. M., and Johnson, A. D. (1999). Identification of a mating type-like<br />
locus in the asexual pathogenic yeast Candida albicans. Science. 285(5431):<br />
1271-1275.<br />
Hull, C. M., Raisner, R. M., and Johnson, A. D. (2000). Evidence for mating of<br />
the "asexual" yeast Candida albicans in a mammalian host. Science. 289(5477):<br />
307-310.<br />
Huson, D., and Bryant, D. (2006). Application of Phylogenetic Networks in<br />
evolutionary studies. Mol Biol Evol. 23(2): 254-267.<br />
Inagaki, Y., and Doolittle, W. F. (2000). Evolution of the eukaryotic translation<br />
termination system: origins of release factors. Mol Biol Evol. 17: 882–889.<br />
Iyer, L. M., Koonin, E. V., and Aravind L. (2004). Evolution of bacterial RNA<br />
polymerase: implications for large-scale bacterial phylogeny, domain accretion,<br />
and horizontal gene transfer. Gene. 335: 73-88.<br />
Iwabe, N., Kuma, K., Hasegawa, M., Osawa, S., and Miyata, T. (1989).<br />
Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred<br />
from phylogenetic trees of duplicated genes. Proc Natl Acad Sci. 86(23): 9355-<br />
9359.<br />
James, T. Y., Porter, D., Leander, C. A., Vilgalys, R., and Longcore, J. E.<br />
(2000). Molecular phylogenetics of the Chytridiomycota supports the utility of<br />
ultrastructural data in chytrid systematics. Canadian Journal of Botany. 78: 336–<br />
350.<br />
James, T. Y., Kauff, F., Schoch, C. L., Matheny, P. B., Hofstetter, V., Cox, C.,<br />
Celio, G., Gueidan, C., Fraker, E., Miadlikowska, J., Lumbsch, H. T., Rauhut,<br />
A., Reeb, V., Arnold, E. A., Amtoft, A., Stajich, J. E., Hosaka, K., Sung, G-H.,<br />
Johnson, D., O'Rourke, B., Crockett, M., Binder, M., Curtis, J. M., Slot, J. C.,<br />
Wang, Z., Wilson, A. W., Schüßler, A., Longcore, J. E., O'Donnell, K., Mozley-<br />
Standridge, S., Porter, D., Letcher, P. M., Powell, M. J., Taylor, J. W., White,<br />
M. M., Griffith, G. W., Davies, D. R., Humber, R. A., Morton, J., Sugiyama, J.,<br />
Rossman, A. Y., Rogers, J. D., Pfister, D. H., Hewitt, D., Hansen, K.,<br />
Hambleton, S., Shoemaker, R. A., Kohlmeyer, J., Volkmann-Kohlmeyer, B.,<br />
Spotts, R. A., Serdani, M., Crous, P. W., Hughes, K. W., Matsuura, K., Langer,<br />
E., Langer, G., Untereiner, W. A., Lücking, R., Büdel, B., Geiser, D. M.,<br />
Aptroot, A., Diederich, P., Schmitt, I., Schultz, M., Yahr, R., Hibbett, D. S.,<br />
Lutzoni, F., McLaughlin, D., Spatafora, J., and Vilgalys, R. (2006).<br />
Reconstructing the early evolution of the fungi using a six gene phylogeny.<br />
Nature. 443: 818–822.<br />
James, T. Y., Letcher, P. M., Longcore, J. E., Mozley-Standridge, S. E., Porter,<br />
D., Powell, M. J., Griffith, G. W., and Vilgalys, R. (2007). A molecular<br />
phylogeny of the flagellated fungi (Chytridiomycota) and a proposal for a new<br />
phylum (Blastocladiomycota). Mycologia. 98: 860–871.<br />
Jarvis, W. R. (1995). Epidemiology of nosocomial fungal infections, with<br />
emphasis on Candida species. Clin Infect Dis. 20: 1526–1530.
Jarvis, E. E., Clark, K. L., and Sprague Jr., G. F. (1989). The yeast transcription<br />
activator PRTF, a homolog of the mammalian serum response factor, is encoded<br />
by the MCM1 gene. Genes Dev. 3: 936–945.<br />
Jeffroy, O., Brinkmann, H., Delsuc, F., and Philippe, H. (2006). Phylogenomics:<br />
the beginning of incongruence? Trends Genet. 22(4): 225-231.<br />
Karl, S. A., and Avise, J. C. (1993). PCR-based assays of mendelian<br />
polymorphisms from anonymous single-copy nuclear DNA-techniques and<br />
applications for population genetics. Mol Biol Evol. 10: 342–361.<br />
Kashi, Y., King, D., and Soller, M. (1997). Simple sequence repeats as a source<br />
of quantitative genetic variation. Trends Genet. 13: 74–78.<br />
Kawaguchi, Y., Honda, H., Taniguchi-Morimura, J., and Iwasaki, S. (1989). The<br />
codon CUG is read as serine in an asporogenic yeast Candida cylindracea.<br />
Nature. 341(6238): 164-166.<br />
Keeling, P. J. (2003). Congruent evidence from alpha-tubulin and beta-tubulin<br />
gene phylogenies for a zygomycete origin of microsporidia. Fungal Genetics and<br />
Biology. 38: 298–309.<br />
Keeling, P. J., Luker, M. A., and Palmer, J. D. (2000). Evidence from beta-<br />
tubulin phylogeny that Microsporidia evolved from within the fungi. Molecular<br />
Biology and Evolution. 17: 23–31.<br />
King, N., and Carroll, S. B. (2001). A receptor tyrosine kinase from<br />
choanoflagellates: molecular insights into early animal evolution. Proc. Natl<br />
Acad. Sci. 98: 15032–15037.<br />
Kirk, P. M., Cannon, P. F., David, J. C., and Stalpers, J. A. (eds). (2001).<br />
Ainsworth & Bisby's Dictionary of the Fungi 60 (CAB International,<br />
Wallingford, UK).<br />
Klempp-Selb, B., Rimek, D., and Kappe, R. (2000). Karyotyping of Candida<br />
albicans and Candida glabrata from patients with Candida sepsis. Mycoses. 43:<br />
159-163.<br />
Kohn, L. M. (1992). Developing new characters for fungal systematics: an<br />
experimental approach for determining the rank of resolution. Mycologia. 84:<br />
139–153.<br />
Kosakovsky Pond, S. L., Frost, S. D. W., and Muse, S. V. (2005). HyPhy: Hypothesis<br />
testing using phylogenies. Bioinformatics. 21(5): 676–679.<br />
Kramer, E. M., Su, H-J., Wu, C-C., and Hu, J-M. (2006). A simplified<br />
explanation for the frameshift mutation that created novel C-terminal motif in<br />
the APETALA3 gene lineage. BMC Evolutionary Biology. 6: 30.<br />
Krcmery, V., and Barnes, A. J. (2002). Non-albicans Candida spp. causing<br />
fungaemia: pathogenicity and antifungal resistance. J Hosp Infect. 50: 243–260.<br />
Kruger, J., Aichinger, C., Kahmann, R., and Bolker, M. (1997). A MADS-box<br />
homologue in Ustilago maydis regulates the expression of pheromone-inducible<br />
genes but is nonessential. Genetics. 147: 1643–1652.<br />
Kuhn, D. M., Mukherjee, P. K., Clark, T. A., Pujol, C., Chandra, J., Hajjeh, R.<br />
A., Warnock, D. W., Soll, D. R., and Ghannoum, M. A. (2004). Candida<br />
parapsilosis characterization in an outbreak setting. Emerg Infect Dis. 10: 1074–<br />
1081.<br />
Kullberg, B. J., and Lashof, A. M. L. O. (2002). Epidemiology of opportunistic<br />
invasive mycoses. European Journal of Medical Research. 7: 183-191.<br />
Kuo, M. H., Nadeau, E. T., and Grayhack, E. J. (1997). Multiple phosphorylated<br />
forms of the Saccharomyces cerevisiae Mcm1 protein include an isoform<br />
induced in response to high salt concentrations. Mol Cell Biol. 17: 819–832.<br />
Kurtzman, C. P., and Robnett, C. J. (1998). Identification and phylogeny of<br />
ascomycetous yeasts from analysis of nuclear large subunit (26S) ribosomal<br />
DNA partial sequences. Antonie van Leeuwenhoek. 73: 331–371.<br />
Kurtzman, C. P., and Robnett, C. J. (2003). Phylogenetic relationships among<br />
yeasts of the 'Saccharomyces complex' determined from multigene sequence<br />
analyses. FEMS Yeast Res. 3: 417–432.<br />
Latchman, D. S. (1998). Eukaryotic transcription factors, Academic Press.<br />
Lang, B. F., O'Kelly, C., Nerad, T., Gray, M. W., and Burger, G. (2002). The<br />
closest unicellular relatives of animals. Curr Biol. 12: 1773–1778.<br />
Langkjaer, R. B., Cliften, P. F., Johnston, M., and Piskur, J. (2003). Yeast<br />
genome duplication was followed by asynchronous differentiation of duplicated<br />
genes. Nature. 421: 848–852.<br />
Linnaei, C. (1735). Systema naturae sive Regna Tria Naturae Systematice<br />
proposita per Classes, Ordines, Genera et Species. Lugduni Batavorum: apud T.<br />
Haak.
LéJohn, H. B. (1974). Biochemical parameters of fungal phylogenetics. Evol<br />
Biol. 7: 79–125.<br />
Li, J., Xu, J., and Bai, F. (2006). Candida pseudorugosa sp. nov., a novel yeast<br />
species from sputum. J Clin Microbiol. 44(12): 4486-4490.<br />
Li, Y-C., Korol, A. B., Fahima, T., Beiles, A., and Nevo, E. (2002).<br />
Microsatellites: genomic distribution, putative functions and mutational<br />
mechanisms. Molecular Ecology. 11: 2453-2465.<br />
Li, Y-C., Korol, A. B., Fahima, T., and Nevo, E. (2004). Microsatellites within<br />
genes: structure, function, and evolution. Mol Biol Evol. 21: 991–1007.<br />
Liu, Y. J., Whelen, S., and Hall, B. D. (1999). Phylogenetic relationships among<br />
ascomycetes: evidence from an RNA polymerase II subunit. Molecular Biology<br />
and Evolution. 16: 1799–1808.<br />
Liu, Y. J., Hodson, M. C., and Hall, B. D. (2006). Loss of the flagellum<br />
happened only once in the fungal lineage: phylogenetic structure of kingdom<br />
Fungi inferred from RNA polymerase II subunit genes. BMC Evolutionary<br />
Biology. 6: 74.<br />
Liu, Y. J., Leigh, J. W., Brinkmann, H., Cushion, M. T., Rodriguez-Ezpeleta, N.,<br />
Philippe, H., and Lang, B. F. (2009). Phylogenomic analyses support the<br />
monophyly of Taphrinomycotina, including Schizosaccharomyces fission<br />
yeasts. Mol Biol Evol. 26: 27-34.<br />
Logue, M. E., Wong, S., Wolfe, K. H., and Butler, G. (2005). A genome<br />
sequence survey shows that the pathogenic yeast Candida parapsilosis has a<br />
defective MTLa1 allele at its mating type locus. Eukaryot Cell. 4(6): 1009-1017.<br />
López Martínez, R., Ruiz Sánchez, J., and Vértiz Chávez, E. (1984). Vaginal<br />
candidiasis. Mycopathologia. 85: 167-170.<br />
Lowy, B. (1971). New records of mushroom stones from Guatemala. Mycologia.<br />
63: 983-993.<br />
Lumbsch, H. T., Schmitt, I., Lindemuth, R., Miller, A., Mangold, A., Fernandez,<br />
F., and Huhndorf, S. (2005). Performance of four ribosomal DNA regions to<br />
infer higher-level phylogenetic relationships of inoperculate euascomycetes<br />
(Leotiomyceta). Mol Phylogenet Evol. 34(3): 512-524.<br />
Lutzoni, F., Pagel, M., and Reeb, V. (2001). Major fungal lineages are derived<br />
from lichen symbiotic ancestors. Nature. 411: 937–940.<br />
Lutzoni, F., Kauff, F., Cox, C. J., McLaughlin, D., Celio, G., Dentinger, B.,<br />
Padamsee, M., Hibbett, D. S., James, T. Y., Baloch, E., Grube, M., Reeb, V.,<br />
Hofstetter, V., Schoch, C., Arnold, A. E., Miadlikowska, J., Spatafora, J.,<br />
Johnson, D., Hambleton, S., Crockett, M., Shoemaker, R., Sung, G-H., Lücking,<br />
R., Lumbsch, T., O'Donnell, K., Binder, M., Diederich, P., Ertz, D., Gueidan,<br />
C., Hansen, K., Harris, R. C., Hosaka, K., Lim, Y-W., Matheny, B., Nishida, H.,<br />
Pfister, D., Rogers, J., Rossman, A., Schmitt, I., Sipman, H., Stone, J.,<br />
Sugiyama, J., Yahr, R., and Vilgalys, R. (2004). Assembling the fungal tree of<br />
life: progress, classification, and evolution of subcellular traits. American<br />
Journal of Botany. 91: 1446–1480.<br />
Magee, B. B., and Magee, P. T. (1987). Electrophoretic karyotypes and<br />
chromosome numbers in Candida species. J Gen Microbiol. 133: 425–430.<br />
Magee, B. B., and Magee, P. T. (2000). Induction of mating in Candida albicans<br />
by construction of MTLa and MTLalpha strains. Science. 289(5477): 310-313.<br />
Marcotte, E. M., Pellegrini, M., Yeates, T. O., and Eisenberg, D. (1999). A<br />
census of protein repeats. J Mol Biol. 293: 151–160.<br />
Massey, S. E., Moura, G., Beltrao, P., Almeida, R., Garey, J. R., Tuite, M. F.,<br />
and Santos, M. A. (2003). Comparative evolutionary genomics unveils the<br />
molecular mechanism of reassignment of the CTG codon in Candida spp.<br />
Genome Res. 13(4): 544-557.<br />
Matheny, P. B., Gossman, J. A., Zalar, P., Arun Kumar, T. K., and Hibbett, D. S.<br />
(2007a). Resolving the phylogenetic position of the Wallemiomycetes: an<br />
enigmatic major lineage of Basidiomycota. Canadian Journal of Botany. 84:<br />
1794–1805.<br />
Matheny, P. B., Wang, Z., Binder, M., Curtis, J. M., Lim, Y. W., Nilsson, R. H.,<br />
Hughes, K. W., Hofstetter, V., Ammirati, J. F., Schoch, C. L., Langer, G. E.,<br />
McLaughlin, D. J., Wilson, A. W., Frøslev, T., Ge, Z. W., Kerrigan, R. W., Slot,<br />
J. C., Vellinga, E. C., Liang, Z. L., Baroni, T. J., Fischer, M., Hosaka, K.,<br />
Matsuura, K., Seidl, M. T., Vaura, J., and Hibbett, D. S. (2007b). Contributions<br />
of Rpb2 and Tef1 to the phylogeny of mushrooms and allies (Basidiomycota,<br />
Fungi). Molecular Phylogenetics and Evolution. 43: 430-451.<br />
Mayer, V. W., and Aguilera, A. (1990). High levels of chromosome instability in<br />
polyploids of Saccharomyces cerevisiae. Mutat Res. 231: 177–186.
McConville, M. J., and Menon, A. K. (2000). Recent developments in the cell<br />
biology and biochemistry of glycosylphosphatidylinositol lipids (review). Mol<br />
Membr Biol. 17: 1–16.<br />
McInerny, C. J., Partridge, J. F., Mikesell, G. E., Creemer, D. P., and Breeden,<br />
L. L. (1997). A novel Mcm1-dependent element in the SWI4, CLN3, CDC6, and<br />
CDC47 promoters activates M/G1-specific transcription. Genes Dev. 11: 1277–<br />
1288.<br />
Melo, A., de Almeida, L., Colombo, A., and Briones, M. (1998). Evolutionary<br />
distances and identification of Candida species in clinical isolates by randomly<br />
amplified polymorphic DNA (RAPD). Mycopathologia. 142: 57-66.<br />
Messenguy, F., and Dubois, E. (1993). Genetic evidence for a role for MCM1 in<br />
the regulation of arginine metabolism in Saccharomyces cerevisiae. Mol Cell<br />
Biol. 13: 2586–2592.<br />
Messenguy, F., and Dubois, E. (2000). Regulation of arginine metabolism in<br />
Saccharomyces cerevisiae: a network of specific and pleiotropic proteins in<br />
response to multiple environmental signals. Food Technol Biotechnol. 38: 277–<br />
285.<br />
Metzgar, D., van Belkum, A., Field, D., Haubrich, R., and Wills, C. (1998).<br />
Random amplification of polymorphic DNA and microsatellite genotyping of<br />
pre- and posttreatment isolates of Candida spp. from human immunodeficiency<br />
virus-infected patients on different fluconazole regimens. J Clin Microbiol. 36:<br />
2308-2313.<br />
Metzgar, D., Bytof, J., and Wills, C. (2000). Selection against frameshift<br />
mutations limits microsatellite expansion in coding DNA. Genome Res. 10: 72–<br />
80.<br />
Miadlikowska, J., and Lutzoni, F. (2004). Phylogenetic classification of<br />
peltigeralean fungi (Peltigerales, Ascomycota) based on ribosomal RNA small<br />
and large subunits. American Journal of Botany. 91: 449–464.<br />
Micales, J. A., Bonde, M. R., and Peterson, G. L. (1986). The use of isozyme<br />
analysis in fungal taxonomy and genetics. Mycotaxon. 27: 405-409.<br />
Moestrup, Ø. (2000). The flagellate cytoskeleton: introduction of a general<br />
terminology for microtubular roots in protists. In: Leadbeater B. S. and Green J.<br />
C. (eds): The flagellates: unity, diversity and evolution, pp. 69–94. Taylor and<br />
Francis, London.<br />
Moncalvo, J. M., Vilgalys, R., Redhead, S. A., Johnson, J. E., James, T. Y.,<br />
Aime, M. C., Hofstetter, V., Verduin, S. J. W., Larsson, E., Baroni, T. J., Thorn,<br />
R. G., Jacobsson, S., Clémençon, H., and Miller Jr., O. K. (2002). One hundred<br />
and seventeen clades of euagarics. Molecular Phylogenetics and Evolution. 23:<br />
357–400.<br />
Moss, M. O. (1987). Fungal biotechnology roundup. Mycologist. 21: 55-58.<br />
Moxon, E. R., Rainey, P. B., Nowak, M. A., and Lenski, R. E. (1994). Adaptive<br />
evolution of highly mutable loci in pathogenic bacteria. Curr Biol. 4: 24–33.<br />
Mueller, C. G., and Nordheim, A. (1991). A protein domain conserved between<br />
yeast MCM1 and human SRF directs ternary complex formation. EMBO J. 10:<br />
4219–4229.<br />
Musial, C. E., Cockerill, F. R., and Roberts, G. D. (1988). Fungal infections of<br />
the immunocompromised host: Clinical and laboratory aspects. Clin Microbiol<br />
Rev. 1: 349–364.<br />
Nagahama, T., Sato, H., Shimazu, M., and Sugiyama, J. (1995). Phylogenetic<br />
divergence of the entomophthoralean fungi: evidence from nuclear 18S<br />
ribosomal RNA gene sequences. Mycologia. 87: 203–209.<br />
Nagy, Á., Pesti, M., Galgóczy, M., and Vágvölyi, C. (2004). Electrophoretic<br />
karyotype of two Micromucor species. Journal of Basic Microbiology. 44(1):<br />
36-41.<br />
de Nadal, E., Casadome, L., and Posas, F. (2003). Targeting the MEF2-like<br />
transcription factor Smp1 by the stress-activated Hog1 mitogen-activated protein<br />
kinase. Mol Cell Biol. 23: 229–237.<br />
Nei, M. (1996). Phylogenetic analysis in molecular evolutionary genetics. Annu<br />
Rev Genet. 30: 371-403.<br />
Nielsen, R., and Yang, Z. (1998). Likelihood models for detecting positively<br />
selected amino acid sites and applications to the HIV-1 envelope gene. Genetics.<br />
148: 929–936.<br />
Nielsen, C. B., Friedman, B., Birren, B., Burge, C. B., and Galagan, J. E.<br />
(2004). Patterns of intron gain and loss in Fungi. PLoS Biology. 2(12): e422.
Nogueira, M. I., Silva, P., Galindo, I., Ramirez, J. L., Arruda, R., and Franco, J.<br />
(1998). Electrophoretic karyotypes and genome sizing of the pathogenic fungus<br />
Paracoccidioides brasiliensis. J Clin Microbiol. 36(3): 742-747.<br />
Norman, C., Runswick, M., Pollock, R., and Treisman, R. (1988). Isolation and<br />
properties of cDNA clones encoding SRF, a transcription factor that binds to the<br />
c-fos serum response element. Cell. 55: 989–1003.<br />
Nylander, J. A. A. (2004). MrModeltest v2. Program distributed by the author.<br />
Evolutionary Biology Centre, Uppsala University.<br />
Odds, F. C. (1988). Candida and Candidosis: A Review and Bibliography. 2nd ed.<br />
Baillière Tindall, London.<br />
Ohno, S. (1970). Evolution by Gene Duplication (Allen and Unwin, London).<br />
Ophuls, M. D. (1905). Further observations on a pathogenic mould formerly<br />
described as a protozoon (Coccidioides immitis, Coccidioides pyrogenes). J Exp<br />
Med. 6: 443-485.<br />
Pace, N. R. (1997). A molecular view of microbial diversity and the biosphere.<br />
Science. 276: 734–740.<br />
Pan, S., Sigler, L., and Cole, G. T. (1994). Evidence for a phylogenetic<br />
connection between Coccidioides immitis and Uncinocarpus reesii<br />
(Onygenaceae). Microbiology. 140 (Pt 6): 1481-1494.<br />
Passmore, S., Maine, G. T., Elble, R., Christ, C., and Tye, B. K. (1988).<br />
Saccharomyces cerevisiae protein involved in plasmid maintenance is necessary<br />
for mating of MAT alpha cells. J Mol Biol. 204: 593–606.<br />
Passmore, S., Elble, R., and Tye, B. K. (1989). A protein involved in<br />
minichromosome maintenance in yeast binds a transcriptional enhancer<br />
conserved in eukaryotes. Genes Dev. 3: 921–935.<br />
Peak, I. R. A., Jennings, M. P., Hood, D. W., Bisercic, M., and Moxon, E. R.<br />
(1996). Tetrameric repeat units associated with virulence factor phase variation<br />
in Haemophilus also occur in Neisseria spp. and Moraxella catarrhalis. FEMS<br />
Microbiol Lett. 137: 109–114.<br />
Pedrós, M. (2003). Clonación y caracterización de una hidrofobina de clase II<br />
(CaHPB) de Candida albicans. Doctoral thesis. Facultad de Farmacia,<br />
Universitat de Valencia, Valencia, Spain. 215 pp.<br />
Pellegrini, L., Tan, S., and Richmond, T. J. (1995). Structure of serum response<br />
factor core bound to DNA. Nature. 376: 490–498.<br />
Peretó, J., López-García, P., and Moreira, D. (2004). Ancestral lipid biosynthesis<br />
and early membrane evolution. Trends Biochem Sci. 29: 469-477.<br />
Persoh, D., and Rambold, G. (2003). Fungal diversity in the host lichen Letharia<br />
vulpina. In J. Stadler, I. Hensen, S. Klotz, and H. Feldmann (eds.),<br />
Biodiversity—from patterns to processes. Kurzfassungen der Beiträge zur 33.<br />
Jahrestagung der Gesellschaft für Ökologie, 161. Verhandlungen der<br />
Gesellschaft für Ökologie, Halle/Saale, Germany.<br />
Peterson, S. W. (2000). Phylogenetic relationships in Aspergillus based on<br />
rDNA sequences analysis. In R. A. Samson and J. I. Pitt (eds.), Integration of<br />
modern taxonomic methods for Penicillium and Aspergillus classification.<br />
Harwood Academic Publishers, Amsterdam, Netherlands.<br />
Peyretaillade, E., Broussolle, V., Peyret, P., Méténier, G., Gouy, M., and Vivarès, C.<br />
P. (1998). Microsporidia, amitochondrial protists, possess a 70-kDa heat shock<br />
protein gene of mitochondrial evolutionary origin. Molecular Biology and<br />
Evolution. 15: 683–689.<br />
Pfaller, M. A., Diekema, D. J., Jones, R. N., Sader, H. S., Fluit, A. C., Hollis, R.<br />
J., and Messer, S. A. (2001). International surveillance of bloodstream infections<br />
due to Candida species: frequency of occurrence and in vitro susceptibilities to<br />
fluconazole, ravuconazole, and voriconazole of isolates collected from 1997<br />
through 1999 in the SENTRY antimicrobial surveillance program. J Clin<br />
Microbiol. 39: 3254–3259.<br />
Pfaller, M. A., Diekema, D. J., Messer, S. A., Boyken, L., Hollis, R. J., and<br />
Jones, R. N. (2003). In vitro activities of voriconazole, posaconazole, and four<br />
licensed systemic antifungal agents against Candida species infrequently<br />
isolated from blood. J Clin Microbiol. 41: 78–83.<br />
Phillips, M. J., Delsuc, F., and Penny, D. (2004). Genome-scale phylogeny and<br />
the detection of systematic biases. Mol Biol Evol. 21(7): 1455-1458.<br />
Pirozynski, K. A., and Malloch, D. (1975). The origin of land plants: a matter of<br />
mycotropism. BioSystems. 6: 153–164.
Pollock, R., and Treisman, R. (1991). Human SRF-related proteins:<br />
DNA-binding properties and potential regulatory targets. Genes Dev. 5: 2327–<br />
2341.<br />
Polonelli, L., Conti, S., Magliani, W., and Morace, G. (1989). Biotyping of<br />
pathogenic fungi by the killer system and with monoclonal antibodies.<br />
Mycopathologia. 107(1): 17–23.<br />
Posada, D. (2008). jModelTest: Phylogeny model averaging. Mol Biol Evol.<br />
25(7): 1253-1256.<br />
Pujol, C., Daniels, K. J., Lockhart, S. R., Srikantha, T., Radke, J. B., Geiger, J.,<br />
and Soll, D. R. (2004). The closely related species Candida albicans and<br />
Candida dubliniensis can mate. Eukaryot Cell. 3(4): 1015-1027.<br />
Raes, J., and Van de Peer, Y. (2005). Functional divergence of proteins through<br />
frameshift mutations. Trends in Genetics. 21(8): 428-431.<br />
Ragan, M. A., Goggins, C. L., Cawthorn, R. J., Cerenius, L., Jamieson, A. V. C.,<br />
Plourde, S. M., Rand, T. G., Söderhäll, K., and Gutell, R. R. (1996). A novel<br />
clade of protistan parasites near the animal-fungal divergence. Proc Natl Acad<br />
Sci. 93: 11907–11912.<br />
Reeb, V., Lutzoni, F., and Roux, C. (2004). Contribution of RPB2 to multilocus<br />
phylogenetic studies of the euascomycetes (Pezizomycotina, Fungi) with special<br />
emphasis on the lichen-forming Acarosporaceae and evolution of polyspory.<br />
Molecular Phylogenetics and Evolution. 32: 1036-1060.<br />
Ribeiro, E., Pereira, C., Takaki, R., and Höfling, J. F. (2000). Grouping oral<br />
Candida species by multilocus enzyme electrophoresis. Int J Syst Evol<br />
Microbiol. 50: 1343-1349.<br />
Richard, G.-F., and Dujon, B. (1997). Trinucleotide repeats in yeast. Res<br />
Microbiol. 148: 731–744.<br />
Riechmann, J. L., and Meyerowitz, E. M. (1997). Determination of floral organ<br />
identity by Arabidopsis MADS domain homeotic proteins AP1, AP3, PI, and<br />
AG is independent of their DNA-binding specificity. Mol Biol Cell. 8: 1243–<br />
1259.<br />
Riechmann, J. L., Krizek, B. A., and Meyerowitz, E. M. (1996). Dimerization<br />
specificity of Arabidopsis MADS domain homeotic proteins APETALA1,<br />
APETALA3, PISTILLATA, and AGAMOUS. Proc Natl Acad Sci. 93: 4793–<br />
4798.<br />
Rixford, E., and Gilchrist, C. (1896). Two cases of protozoan (coccidioidal)<br />
infection of the skin and other organs. Johns Hopkins Hospital Report. 1: 209-<br />
268.<br />
Rensberger, B. (1992). The Iceman: Now the research is on ice. J NIH Res. 4:<br />
25–27.<br />
Rodrigues de Miranda, L. (1979). Clavispora, a new yeast genus of the<br />
Saccharomycetales. Antonie Van Leeuwenhoek. 45(3): 479-483.<br />
Roger, A. J., and Hug, L. A. (2006). The origin and diversification of eukaryotes:<br />
problems with molecular phylogenetics and molecular clock estimation. Philos<br />
Trans R Soc Lond B Biol Sci. 361: 1039-1054.<br />
Rokas, A., King, N., Finnerty, J., and Carroll, S. B. (2003). Conflicting<br />
phylogenetic signals at the base of the metazoan tree. Evol. Dev. 5: 346–359.<br />
Romero, A. J., and Minter, D. W. (1988). Fluorescence microscopy: an aid to the<br />
elucidation of ascomycete structures. Trans Br Mycol Soc. 90: 457–470.<br />
Rottmann, M., Dieter, S., Brunner, H., and Rupp, S. (2003). A screen in<br />
Saccharomyces cerevisiae identified CaMCM1, an essential gene in Candida<br />
albicans crucial for morphogenesis. Mol Microbiol. 47: 943–959.<br />
Saitou, N., and Nei, M. (1987). The neighbor-joining method: a new method for<br />
reconstructing phylogenetic trees. Mol Biol Evol. 4: 406–425.<br />
Sampaio, P., Gusmão, L., Alves, C., Pina-Vaz, C., Amorim, A., and Pais, C.<br />
(2003). Highly polymorphic microsatellite for identification of Candida albicans<br />
strains. J Clin Microbiol. 41(2): 552-557.<br />
Sampaio, P., Gusmão, L., Correia, A., Alves, C., Rodrigues, A., Pina-Vaz, C.,<br />
Amorim, A., and Pais, C. (2005). New microsatellite multiplex PCR for Candida<br />
albicans strain typing reveals microevolutionary changes. J Clin Microbiol.<br />
43(8): 3869-3876.<br />
Sampaio, P., Nogueira, E., Loureiro, A. S., Delgado-Silva, Y., Correia, A., and Pais,<br />
C. (2009). Increased number of glutamine repeats in the C-terminal of Candida<br />
albicans Rlm1p enhances the resistance to stress agents. Antonie van<br />
Leeuwenhoek. In press.
Sandven, P. (2000). Epidemiology of candidemia. Rev Iberoam Micol. 17: 73–<br />
81.<br />
San-Millan, R., Ribacoba, L., Ponton, J., and Quindos, G. (1996). Evaluation of<br />
a commercial medium for identification of Candida species. Eur J Clin Microbiol<br />
Infect Dis. 15: 153–158.<br />
Santos, M. A., and Tuite, M. F. (1995). The CUG codon is decoded in vivo as<br />
serine and not leucine in Candida albicans. Nucleic Acids Res. 23(9): 1481-<br />
1486.<br />
Santelli, E., and Richmond, T. J. (2000). Crystal structure of MEF2A core bound<br />
to DNA at 1.5 Å resolution. J Mol Biol. 297: 437–449.<br />
Savelkoul, P. H. M., Aarts, H. J. M., de Haas, J., Dijkshoorn, L., Duim, B.,<br />
Otsen, M., Rademaker, J. L. W., Schouls, L., and Lenstra, J. A. (1999). Amplified<br />
fragment length polymorphism analysis: the state of an art. J Clin Microbiol. 37:<br />
3083–3091.<br />
Scannell, D. R., Byrne, K. P., Gordon, J. L., Wong, S., and Wolfe, K. H. (2006).<br />
Multiple rounds of speciation associated with reciprocal gene loss in polyploid<br />
yeasts. Nature. 440(7082): 341-345.<br />
Shalchian-Tabrizi, K., Minge, M. A., Espelund, M., Orr, R., Ruden, T.,<br />
Jakobsen, K. S., and Cavalier-Smith, T. (2008). Multigene phylogeny of<br />
Choanozoa and the Origin of Animals. PLoS ONE. 3(5): e2098.<br />
Schoch, C. L., Shoemaker, R. A., Seifert, K. A., Hambleton, S., Spatafora, J. W.,<br />
and Crous, P. W. (2006). A multigene phylogeny of the Dothideomycetes using<br />
four nuclear loci. Mycologia. 98(6): 1041-1052.<br />
Schüßler, A., Schwarzott, D., and Walker, C. (2001). A new fungal phylum, the<br />
Glomeromycota: phylogeny and evolution. Mycological Research. 105: 1413-<br />
1421.<br />
Schwarz-Sommer, Z., Huijser, P., Nacken, W., Saedler, H., and Sommer, H.<br />
(1990). Genetic control of flower development by homeotic genes in<br />
Antirrhinum majus. Science. 250: 931–936.<br />
Seifert, K. A., Wingfield, B. D., and Wingfield, M. J. (1995). A critique of DNA<br />
sequence analysis in the taxonomy of filamentous Ascomycetes and<br />
ascomycetous anamorphs. Can J Bot. 73 (Suppl. 1): S760–S767.<br />
Seoighe, C., and Wolfe, K. H. (1999). Updated map of duplicated regions in the<br />
yeast genome. Gene. 238: 253–261.<br />
Sharrocks, A. D., Gille, H., and Shaw, P. E. (1993). Identification of amino acids<br />
essential for DNA binding and dimerization in p67SRF: implications for a novel<br />
DNA-binding motif. Mol Cell Biol. 13: 123–132.<br />
Shemer, R., Weissman, Z., Hashman, N., and Kornitzer, D. (2001). A highly<br />
polymorphic degenerate microsatellite for molecular strain typing of Candida<br />
krusei. Microbiology. 147: 2021-2028.<br />
Shin, J. H., Shin, D. H., Song, J. W., Kee, S. J., Suh, S. P., and Ryang, D. W.<br />
(2001). Electrophoretic karyotype analysis of sequential Candida parapsilosis<br />
isolates from patients with persistent or recurrent fungemia. J Clin Microbiol.<br />
39(4): 1258-1263.<br />
Sneath, P. H. A., and Sokal, R. R. (1973). Numerical Taxonomy. W.H. Freeman<br />
and Company, San Francisco, pp 230-234.<br />
Soll, D. R., Staebell, M., Langtimm, C. J., Pfaller, M., Hicks, J., and Rao, T. V. G.<br />
(1988). Multiple Candida strains in the course of a single systemic infection. J.<br />
Clin. Microbiol. 26: 1448-1459.<br />
Soll, D. (2000). The ins and outs of DNA fingerprinting the infectious fungi. Clin<br />
Microbiol Rev. 13(2): 332-370.<br />
Sommer, H., Beltran, J. P., Huijser, P., Pape, H., Lonnig, W. E., Saedler, H., and<br />
Schwarz-Sommer, Z. (1990). Deficiens, a homeotic gene involved in the control<br />
of flower morphogenesis in Antirrhinum majus: the protein shows homology to<br />
transcription factors. EMBO J. 9: 605–613.<br />
Spatafora, J. W., Sung, G-H., Johnson, D., Hesse, C., O'Rourke, B., Serdani, M.,<br />
Spotts, R., Lutzoni, F., Hofstetter, V., Miadlikowska, J., Reeb, V., Gueidan, C.,<br />
Fraker, E., Lumbsch, T., Lücking, R., Schmitt, I., Hosaka, K., Aptroot, A.,<br />
Roux, C., Miller, A. N., Geiser, D. M., Hafellner, J., Hestmark, G., Arnold, A.<br />
E., Büdel, B., Rauhut, A., Hewitt, D., Untereiner, W. A., Cole, M. S.,<br />
Scheidegger, C., Schultz, M., Sipman, H., and Schoch, C. L. (2006). A five-gene<br />
phylogeny of Pezizomycotina. Mycologia. 98(6): 1018-1028.<br />
Stechmann, A., and Cavalier-Smith, T. (2002). Rooting the eukaryote tree by<br />
using a derived gene fusion. Science. 297: 89–91.
Stechmann, A., and Cavalier-Smith, T. (2003). The root of the eukaryote tree<br />
pinpointed. Curr Biol. 13: R665–R666.<br />
Steenkamp, E. T., Wright, J., and Baldauf, S. (2006). The protistan origins of<br />
Animals and Fungi. Mol Biol Evol. 23(1): 93-106.<br />
Stenroos, S., Hyvönen, J., Myllys, L., Thell, A., and Ahti, T. (2002). Phylogeny<br />
of the genus Cladonia s. lat. (Cladoniaceae, Ascomycetes) inferred from<br />
molecular, morphological, and chemical data. Cladistics. 18: 237–278.<br />
Stern, A., Doron-Faigenboim, A., Erez, E., Martz, E., Bacharach, E., and Pupko,<br />
T. (2007). Selecton 2007: advanced models for detecting positive and purifying<br />
selection using a Bayesian inference approach. Nucleic Acids Research. 35:<br />
W506-W511.<br />
Sugita, T., Takashima, M., Poonwan, N., and Mekha, N. (2006). Candida<br />
pseudohaemulonii an amphotericin B- and azole-resistant yeast species, isolated<br />
from the blood of a patient from Thailand. Microbiol Immunol. 50(6): 469-473.<br />
Suh, S-O., Blackwell, M., Kurtzman, C., and Lachance, M-A. (2006).<br />
Phylogenetics of Saccharomycetales, the ascomycete yeasts. Mycologia. 98(6):<br />
1006-1017.<br />
Sullivan, D., and Coleman, D. (1997). Candida dubliniensis: an emerging<br />
opportunistic pathogen. Curr Top Med Mycol. 8: 15–25.<br />
Sullivan, D. J., Westerneng, T. J., Haynes, K. A., Bennett, D. E., and Coleman,<br />
D. C. (1995). Candida dubliniensis sp. nov.: Phenotypic and molecular<br />
characterization of a novel species associated with oral candidosis in HIV-<br />
infected individuals. Microbiology. 141: 1507-1521.<br />
Sundstrom, P. (2002). Adhesion in Candida spp. Cell Microbiol. 4: 461–469.<br />
Suzuki, Y., Glazko, G. V., and Nei, M. (2002). Overcredibility of molecular<br />
phylogenies obtained by Bayesian phylogenetics. Proc Natl Acad Sci. 99:<br />
16138–16143.<br />
Swanson, W. J., Yang, Z., Wolfner, M., and Aquadro, C. F. (2001). Positive<br />
Darwinian selection drives the evolution of several female reproductive proteins<br />
in mammals. Proc Natl Acad Sci. 98: 2509–2514.<br />
Swofford, D. L. (2002). PAUP: Phylogenetic analysis using parsimony (*and<br />
other methods). Sunderland, Massachusetts, Sinauer Associates.<br />
Tachida, H., and Iizuka, M. (1992). Persistence of repeated sequences that evolve<br />
by replication slippage. Genetics. 131: 471–478.<br />
Takezaki, N., and Nei, M. (1994). Inconsistency of the maximum parsimony<br />
method when the rate of nucleotide substitution is constant. J Mol Evol. 39:<br />
210–218.<br />
Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007). MEGA4: Molecular<br />
Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular<br />
Biology and Evolution. 24: 1596-1599.<br />
Tanabe, Y., O'Donnell, K., Saikawa, M., and Sugiyama, J. (2000). Molecular<br />
phylogeny of parasitic Zygomycota (Dimargaritales, Zoopagales) based on<br />
nuclear small subunit ribosomal DNA sequences. Molecular Phylogenetics and<br />
Evolution. 16: 253–262.<br />
Tanabe, Y., Saikawa, M., Watanabe, M. M., and Sugiyama, J. (2004). Molecular<br />
phylogeny of Zygomycota based on EF-1 and RPB1 sequences: limitations and<br />
utility of alternative markers to rDNA. Molecular Phylogenetics and Evolution.<br />
30: 438–449.<br />
Tanabe, Y., Watanabe, M. M., and Sugiyama, J. (2005). Evolutionary<br />
relationships among basal fungi (Chytridiomycota and Zygomycota): Insights<br />
from molecular phylogenetics. Journal of General and Applied Microbiology.<br />
51: 267–276.<br />
Tateishi, T., Murayama, S. Y., Otsuka, F., and Yamaguchi, H. (1996).<br />
Karyotyping by PFGE of clinical isolates of Sporothrix schenckii. FEMS<br />
Immunol Med Microbiol. 13: 147–154.<br />
Tavanti, A., Davidson, A., Johnson, E., Maiden, M., Shaw, D., Gow, N., and<br />
Odds, F. (2005a). Multilocus sequence typing for differentiation of strains of<br />
Candida tropicalis. J Clin Microbiol. 43(11): 5593-5600.<br />
Tavanti, A., Davidson, A., Gow, N., Maiden, M., and Odds, F. (2005b). Candida<br />
orthopsilosis and Candida metapsilosis spp. nov. to replace Candida<br />
parapsilosis groups II and III. J Clin Microbiol. 43(1): 284-292.<br />
Taylor, F. J. R. (1978). Problems in the development of an explicit hypothetical<br />
phylogeny of the lower eukaryotes. BioSystems. 10: 67–89.
REFERENCES<br />
Taylor, J. W., Geiser, D. M., Burt, A., and Koufopanou, V. (1999). The<br />
evolutionary biology and population genetics underlying fungal strain typing.<br />
Clin Microbiol Rev. 12(1): 126-146.<br />
Tehler, A. (1988). A cladistic outline of the Eumycota. Cladistics. 4: 227–277.<br />
Tehler, A., Farris, J. S., Lipscomb, D. L., Källerrsjö, M. (2000). Phylogenetic<br />
analyses of the fungi based on large rDNA data sets. Mycologia. 92: 459–474.<br />
Tehler, A., Little, D. P., and Farris, J. S. 2003. The full-length phylogenetic tree<br />
from 1551 ribosomal sequences of chitinous fungi, Fungi. Mycological<br />
Research. 107: 901–916.<br />
Theissen, G., Kim, J., and Saedler, H. 1996. Classification and phylogeny of the<br />
MADS-box multigene family suggest defined roles of MADS-box gene<br />
subfamilies in the morphological evolution of eukaryotes. J Mol Evol. 43: 484-<br />
516.<br />
Thomas, J. R., Dwek, R. A., and Rademacher, T. W. (1990). Structure,<br />
biosynthesis, and function of glycosylphosphatidylinositols. Biochemistry. 29:<br />
5413–5422.<br />
Thompson, J. D., Higgins, D. G., and Gibson T. J. (1994). CLUSTALW:<br />
improving the sensitivity of progressive multiple sequence alignment through<br />
sequence weighting, position-specific gap penalties and weight matrix choice.<br />
Nucleic Acids Res. 22: 4673–4680.<br />
Tiede, A., Bastisch, I., Schubert, J., Orlean, P., and Schmidt, R. E. (1999).<br />
Biosynthesis of glycosylphosphatidylinositols in mammals and unicellular<br />
microbes. Biol Chem. 380: 503–523.<br />
Tietz, H., Hopp, M., Schmalreck, A., Sterry, W., and Czaika, V. (2001).<br />
Candida africana sp. nov. a new human pathogen or a variant of Candida<br />
albicans?. Mycoses. 44: 437-445.<br />
Toth, G., Gaspari, Z., and Jurka, J. (2000). Microsatellites in different eukaryotic<br />
genomes: survey and analysis. Genome Res. 10: 967–981.<br />
Treisman, R. (1990). The SRE: a growth factor responsive transcriptional<br />
regulator. Semin. Cancer Biol. 1: 47– 58.<br />
Trifonov, E. N. (2003). Tuning function of tandemly repeating sequences: a<br />
molecular device for fast adaptation. Pp. 1–24 in S. P. Wasser, ed., Evolutionary<br />
125
126<br />
Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />
theory and processes: nodern horizons, papers in honor of Eviatar Nevo. Kluwer<br />
Academic Publishers. Amsterdam, The Netherlands.<br />
Tzung, K. W., Williams, R. M., Scherer, S., Federspiel, N., Jones, T., Hansen,<br />
N., Bivolarevic, V., Huizar, L., Komp, C., Surzycki, R., Tamse, R., Davis, R.<br />
W., and Agabian, N. (2001). Genomic evidence for a complete sexual cycle in<br />
Candida albicans. Proc Natl Acad Sci. 98(6): 3249-3253.<br />
Valenza, G., Valenza, R., Brederlau, J., Frosch, M., and Kurzai, O. (2006).<br />
Identification of Candida fabianii as a cause of letal septicaemia. Mycoses. 49:<br />
331-334.<br />
Van Belkum, A., Scherer, S., van Alphen, L., and Verbrugh, H. (1998). Short-<br />
sequence DNA repeats in prokaryotic genomes. Microbiol Mol Biol Rev. 62:<br />
275–293.<br />
Van de Peer, Y., Baldauf, S. L., Doolittle, W. F., and Meyer, A. (2000). An<br />
updated and comprehensive rRNA phylogeny of (crown) eukaryotes based on<br />
rate-calibrated evolutionary distances. J Mol Evol. 51: 565 – 576.<br />
Van der Walt, J. P. (1966). Lodderomyces, a new genus of the<br />
Saccharomycetaceae. Antonie Van Leeuwenhoek 1966. 32(1): 1-5.<br />
Voigt, K., and Wöstemeyer, J. (2001). Phylogeny and origin of 82 zygomycetes<br />
from all 54 genera of the Mucorales and Mortierellales based on combined<br />
analysis of actin and translation elongation factor EF-1α genes. Gene. 270: 113-<br />
120.<br />
Vos, P., Hogers, R., Bleeker, M., Reijans, M., Van de Lee, T., Hornes, M.,<br />
Frijters, A., Pot, J., Peleman, J., Kuiper, M., and Zabeau, M. (1995). AFLP: a<br />
new technique for DNA fingerprinting. Nucleic Acids Res. 23: 4407-4414.<br />
Wainright, P. O., Hinkle, G., Sogin, M. L., and Stickel, S. K. (1993).<br />
Monophyletic origins of the Metazoa: an evolutionary link with fungi. Science.<br />
260: 340 – 342.<br />
Wang, Z.; Binder, M.; Dai, Y.C.; and Hibbett, D.S. 2004. Phylogenetic<br />
relationships of Sparassis inferred from nuclear and mitochondrial ribosomal<br />
DNA and RNA polymerase sequences. Mycologia. 96: 1013-1027.<br />
Watanabe, Y., Irie, K., and Matsumoto, K. (1995). Yeast RLM1 encodes a<br />
serum response factor-like protein that may function <strong>do</strong>wnstream of the Mpk1<br />
(Slt2) mitogen-activated protein kinase pathway. Mol Cell Biol. 15: 5740–5749.
REFERENCES<br />
Weig, M., Jansch, L., Gross, U., De Koster, C. G., Klis, F. M., and De Groot, P.<br />
W. (2004). Systematic identification in silico of covalently bound cell wall<br />
proteins and analysis of protein-polysaccharide linkages of the human pathogen<br />
Candida glabrata. Microbiology. 150: 3129–3144.<br />
Wernersson, R. (2006). Virtual Ribosome-a comprehensive DNA translation<br />
tool with support for integration of sequence feature annotation. Nucleic Acids<br />
Research. 34: W385-W388.<br />
Whalley, A. J. S., and Edwards, R. L. (1995). Secondary metabolites and<br />
systematic arrangement within the Xylariaceae. Can J Bot. 73(Suppl. 1): S802-<br />
S810.<br />
Wickerham, L. J., and Burton, K. A. (1954). A clarification of the relationship of<br />
Candida guilliermondii to other yeasts by a study of their mating types. J<br />
Bacteriol. 68(5): 594-597.<br />
Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S.<br />
V. (1990). DNA polymorphisms amplified by arbitrary primers are useful as<br />
genetic markers. Nucleic Acids Research. 18: 6531-6535.<br />
Whittaker, R. (1969). New concepts of king<strong>do</strong>ms of organisms. Science. 163:<br />
150–160.<br />
Wingard, J. R., Mertz, W. G., Rinaldi, M. G., Johnson, T. R., Karp, J. E., and<br />
Saral, R. (1991). Increase in Candida krusei infection among patients with bone<br />
marrow transplant and neutropenia treated prophylactically with fluconazole. N<br />
Engl J Med. 325: 1274–1277.<br />
Wolfe, K. H., and Shields, D. C. (1997). Molecular evidence for an ancient<br />
duplication of the entire yeast genome. Nature. 387: 708–713.<br />
Woese, C., and Fox, G. (1977). Phylogenetic structure of the prokaryotic<br />
<strong>do</strong>main: The primary king<strong>do</strong>ms. Proc Natl Acad Sci. 74(11): 5088 – 5090.<br />
Woese, C.; Kandler, O.; and Wheelis, M. (1990). Towards a Natural System of<br />
Organisms: Proposal for the <strong>do</strong>mains Archaea, Bacteria, and Eucarya. Proc Natl<br />
Acad Sci. 87(12): 4576-4579.<br />
Wong, W. S. W., Yang, Z., Goldman, N., and Nielsen, R. (2004). Accurancy and<br />
power of statistical methods for detecting adaptive evolution in protein coding<br />
sequences and for identifying positive selective sites. Genetics. 168: 1041–1051.<br />
127
128<br />
Phylogenetic relationships of Candida species inferred by sequence analysis of nuclear genes<br />
Wren, J. D., Forgacs, E., Fon<strong>do</strong>n, J. W., 3rd, Pertsemlidis, A., Cheng, S. Y.,<br />
Gallar<strong>do</strong>, T., Williams, R. S., Shohet, R. V., Minna, J. D., and Garner, H. R.<br />
(2000). Repeat polymorphisms within gene regions: phenotypic and<br />
evolutionary implications. Am J Hum Genet. 67: 345–356.<br />
Yabana, N., and Yamamoto, M. (1996). Schizosaccharomyces pombe map1+<br />
encodes a MADS-box-family protein required for cell-type-specific gene<br />
expression. Mol Cell Biol. 16: 3420 – 3428.<br />
Yanofsky, M. F., Ma, H., Bowman, J. L., Drews, G. N., Feldmann, K. A., and<br />
Meyerowitz, E. M. (1990). The protein encoded by the Arabi<strong>do</strong>psis homeotic<br />
gene agamous resembles transcription factors. Nature. 346: 35– 39.<br />
Yang, Z. (1997). PAML: a program package for phylogenetic analysis by<br />
maximum likelihood. Comput Appl Biosci. 13: 555–556.<br />
Yang, Z., and Bielawski, J. P. (2000). Statistical methods for detecting<br />
molecular adaptation. Trends Ecol Evolut. 15: 496–503.<br />
Yang, Z., Wong, W. S., and Nielsen, R. (2005) Bayes empirical Bayes inference<br />
of amino acid sites under positive selection. Mol Biol Evol. 22: 1107–1118.<br />
Yang, Z., Nielsen, R., Goldman, N., and Pedersen, A. M. K, (2000). Co<strong>do</strong>n<br />
substitution models for heterogeneous selection pressure at amino acid sites.<br />
Genetics. 155: 431–449.<br />
Young, L. Y., Lorenz, M. C., and Heitman, J. (2000). A STE12 homolog is<br />
required for mating but dispensable for filamentation in Candida lusitaniae.<br />
Genetics. 155(1): 17-29.<br />
Young, E. T., Sloan, J. S., and van Riper, K. (2000). Trinucleotide repeats are<br />
clustered in regulatory genes in Saccharomyces cerevisiae. Genetics. 154: 1053–<br />
1068.<br />
Yu, G., and Fassler, J. S. (1993). SPT13 (GAL11) of Saccharomyces cerevisiae<br />
negatively regulates activity of the MCM1 transcription factor in Ty1 elements.<br />
Mol Cell Biol. 13: 63 – 71.<br />
Zhang, N., Clastebury, L. A., Miller, A. N., Huhn<strong>do</strong>rf, S. M., Schoch, C. L.,<br />
Seifert, K. A., Rossman, A. Y., Rogers, J. D., Kohlmeyer, J., Volkmann-<br />
Kohlmeyer, B., and Sung. G-H. (2006). An overview of the systematic of the<br />
Sordariomycetes based on a four-gene phylogeny. Mycologia. 98(6): 1076-<br />
1087.
7. ANNEXES
Annex I. MrModelblock
#NEXUS
[Johan Nylander 2004-07-01]
[! ***** MrModeltest block -- Modified from MODELTEST 3.0 *****]
[The following command will calculate a NJ tree using the JC69 model of evolution]
BEGIN PAUP;
Log file= mrmodelfit.log replace;
DSet distance=JC objective=ME base=equal rates=equal pinv=0
subst=all negbrlen=setzero;
NJ showtree=no breakties=random;
End;
[!
***** BEGIN TESTING 24 MODELS OF EVOLUTION ***** ]
BEGIN PAUP;
Default lscores longfmt=yes; [Workaround for the bug in PAUP 4b10]
Set criterion=like;
[!
** Model 1 of 24 * Calculating JC **]
lscores 1/ nst=1 base=equal rates=equal pinv=0
scorefile=mrmodel.scores replace;
[!
** Model 2 of 24 * Calculating JC+I **]
lscores 1/ nst=1 base=equal rates=equal pinv=est
scorefile=mrmodel.scores append;
[!
** Model 3 of 24 * Calculating JC+G **]
lscores 1/ nst=1 base=equal rates=gamma shape=est pinv=0
scorefile=mrmodel.scores append;
[!
** Model 4 of 24 * Calculating JC+I+G **]
lscores 1/ nst=1 base=equal rates=gamma shape=est pinv=est
scorefile=mrmodel.scores append;
[!
** Model 5 of 24 * Calculating F81 **]
lscores 1/ nst=1 base=est rates=equal pinv=0
scorefile=mrmodel.scores append;
[!
** Model 6 of 24 * Calculating F81+I **]
lscores 1/ nst=1 base=est rates=equal pinv=est
scorefile=mrmodel.scores append;
[!
** Model 7 of 24 * Calculating F81+G **]
lscores 1/ nst=1 base=est rates=gamma shape=est pinv=0
scorefile=mrmodel.scores append;
[!
** Model 8 of 24 * Calculating F81+I+G **]
lscores 1/ nst=1 base=est rates=gamma shape=est pinv=est
scorefile=mrmodel.scores append;
[!
** Model 9 of 24 * Calculating K80 **]
lscores 1/ nst=2 base=equal tratio=est rates=equal pinv=0
scorefile=mrmodel.scores append;
[!
** Model 10 of 24 * Calculating K80+I **]
lscores 1/ nst=2 base=equal tratio=est rates=equal pinv=est
scorefile=mrmodel.scores append;
[!
** Model 11 of 24 * Calculating K80+G **]
lscores 1/ nst=2 base=equal tratio=est rates=gamma shape=est pinv=0
scorefile=mrmodel.scores append;
[!
** Model 12 of 24 * Calculating K80+I+G **]
lscores 1/ nst=2 base=equal tratio=est rates=gamma shape=est pinv=est
scorefile=mrmodel.scores append;
[!
** Model 13 of 24 * Calculating HKY **]
lscores 1/ nst=2 base=est tratio=est rates=equal pinv=0
scorefile=mrmodel.scores append;
[!
** Model 14 of 24 * Calculating HKY+I **]
lscores 1/ nst=2 base=est tratio=est rates=equal pinv=est
scorefile=mrmodel.scores append;
[!
** Model 15 of 24 * Calculating HKY+G **]
lscores 1/ nst=2 base=est tratio=est rates=gamma shape=est pinv=0
scorefile=mrmodel.scores append;
[!
** Model 16 of 24 * Calculating HKY+I+G **]
lscores 1/ nst=2 base=est tratio=est rates=gamma shape=est pinv=est
scorefile=mrmodel.scores append;
[!
** Model 17 of 24 * Calculating SYM **]
lscores 1/ nst=6 base=equal rmat=est rclass=(a b c d e f) rates=equal pinv=0
scorefile=mrmodel.scores append;
[!
** Model 18 of 24 * Calculating SYM+I **]
lscores 1/ nst=6 base=equal rmat=est rates=equal pinv=est
scorefile=mrmodel.scores append;
[!
** Model 19 of 24 * Calculating SYM+G **]
lscores 1/ nst=6 base=equal rmat=est rates=gamma shape=est pinv=0
scorefile=mrmodel.scores append;
[!
** Model 20 of 24 * Calculating SYM+I+G **]
lscores 1/ nst=6 base=equal rmat=est rates=gamma shape=est pinv=est
scorefile=mrmodel.scores append;
[!
** Model 21 of 24 * Calculating GTR **]
lscores 1/ nst=6 base=est rmat=est rates=equal pinv=0
scorefile=mrmodel.scores append;
[!
** Model 22 of 24 * Calculating GTR+I **]
lscores 1/ nst=6 base=est rmat=est rates=equal pinv=est
scorefile=mrmodel.scores append;
[!
** Model 23 of 24 * Calculating GTR+G **]
lscores 1/ nst=6 base=est rmat=est rates=gamma shape=est pinv=0
scorefile=mrmodel.scores append;
[!
** Model 24 of 24 * Calculating GTR+I+G **]
lscores 1/ nst=6 base=est rmat=est rates=gamma shape=est pinv=est
scorefile=mrmodel.scores append;
Log stop=yes;
END;
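The 24 models tested in Annex I form a regular grid: six substitution-model families (JC, F81, K80, HKY, SYM, GTR), each combined with four treatments of among-site rate variation (none, +I, +G, +I+G). The following Python sketch, which is an illustration and not part of MrModeltest, regenerates the `lscores` option strings in the same order as the block and attaches the usual number of free substitution parameters to each model. The names `FAMILIES`, `RATE_VARIANTS`, `build_models` and `aic` are hypothetical helpers. The parameter counts exclude branch lengths, the `scorefile` and `rclass` options are omitted for brevity, and the AIC function assumes the score supplied is the negative log-likelihood (-lnL).

```python
from itertools import product

# Substitution-model families: (name, nst, base frequencies, extra option,
# number of free parameters, branch lengths not counted).
FAMILIES = [
    ("JC",  1, "equal", "",           0),
    ("F81", 1, "est",   "",           3),
    ("K80", 2, "equal", "tratio=est", 1),
    ("HKY", 2, "est",   "tratio=est", 4),
    ("SYM", 6, "equal", "rmat=est",   5),
    ("GTR", 6, "est",   "rmat=est",   8),
]

# Among-site rate variation: (suffix, options, additional free parameters).
RATE_VARIANTS = [
    ("",     "rates=equal pinv=0",             0),
    ("+I",   "rates=equal pinv=est",           1),
    ("+G",   "rates=gamma shape=est pinv=0",   1),
    ("+I+G", "rates=gamma shape=est pinv=est", 2),
]


def build_models():
    """Return the 24 models in the order used by the Annex I block."""
    models = []
    for family, variant in product(FAMILIES, RATE_VARIANTS):
        name, nst, base, extra, k = family
        suffix, rates, k_extra = variant
        parts = [f"nst={nst}", f"base={base}", extra, rates]
        options = " ".join(part for part in parts if part)
        models.append({
            "name": name + suffix,
            "command": f"lscores 1/ {options}",
            "free_params": k + k_extra,
        })
    return models


def aic(neg_log_likelihood, free_params):
    """Akaike information criterion: AIC = 2 * (-lnL) + 2 * k."""
    return 2.0 * neg_log_likelihood + 2.0 * free_params


if __name__ == "__main__":
    for index, model in enumerate(build_models(), start=1):
        print(f"Model {index:2d} {model['name']:8s} "
              f"k={model['free_params']:2d} {model['command']}")
```

Given the -lnL values written to `mrmodel.scores`, the model with the lowest `aic` value would be the one preferred under that criterion.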
Annex II. Fungal species in which orthologues of the RLM1 gene were identified. Taxonomy, number of introns, gene location and database.
Genus/species Taxonomy No. of introns Chromosome, Supercontig or Contig Sequencing Center
Phycomyces blakesleeanus Mucoromycotina 5 Scaffold 1: 2877572 – 2879654 (-) Joint Genome Institute
Mucor circinelloides Mucoromycotina 3 Scaffold 2: 32637 – 34277 (-) Broad Institute, Murcia University
Rhizopus oryzae Mucoromycotina 4 Supercontig 2: 4147507 – 4149079 (-) Broad Institute
Ustilago maydis Ustilaginomycetes - Contig 191: 196940 – 198718 (+) Broad Institute
Malassezia globosa Malasseziales - Sf 7.1: 204470 – 205939 (+) NCBI
Sporobolomyces roseus Microbotryomycetes 2 Scaffold 10: 85788 – 88407 (+) Joint Genome Institute
Cryptococcus neoformans Tremellomycetes 4 Chromosome 2: 1360064 – 1362029 (-) Broad Institute, Genome Sciences Center Canada, Duke Center for Genome Research, Stanford Genome Technology
Cryptococcus gattii Tremellomycetes 4 Supercontig 16: 244132 – 246064 (+) Broad Institute
Laccaria bicolor Agaricomycetes 4 Scaffold 20: 147515 – 149297 (+) Joint Genome Institute
Coprinus cinereus Agaricomycetes 7 Contig 317: 81588 – 83599 (-) Broad Institute
Phanerochaete chrysosporium Agaricomycetes 4 Scaffold 3: 1745299 – 1747266 (+) Joint Genome Institute
Postia placenta Agaricomycetes 8 Scaffold 174: 171333 – 174904 (+) Joint Genome Institute
Schizosaccharomyces japonicus Schizosaccharomycetes - Supercontig 5: 941790 – 942323 (+) Broad Institute
Schizosaccharomyces pombe Schizosaccharomycetes - Chromosome 2: 3626639 – 3627757 (+) Sanger Institute
Schizosaccharomyces octosporus Schizosaccharomycetes - Supercontig 2: 351924 – 352948 (+) Broad Institute
Yarrowia lipolytica Saccharomycetes 1 Chromosome F: 2507846 – 2509480 (-) Génolevures
Candida lusitaniae Saccharomycetes - Supercontig 5: 493096 – 494358 (-) Broad Institute
Candida guilliermondii Saccharomycetes - Supercontig 3: 493468 – 494775 (+) Broad Institute
Pichia stipitis Saccharomycetes - Chromosome 1: 1519668 – 1521421 (+) Joint Genome Institute
Debaryomyces hansenii Saccharomycetes - Chromosome 2: 237182 – 238732 (+) Génolevures
Lodderomyces elongisporus Saccharomycetes - Supercontig 6: 1060953 – 1063316 (+) Broad Institute
Candida parapsilosis Saccharomycetes - Contig 005806: 536977 – 539238 (-) Sanger Institute
Candida tropicalis Saccharomycetes - Supercontig 1: 49905 – 51623 (-) Broad Institute, Génolevures
Candida albicans Saccharomycetes - Chromosome 4: 253044 – 254879 (+) Stanford Genome Technology Center, Sanger Institute and Broad Institute
Candida dubliniensis Saccharomycetes - Chromosome 4: 265932 – 267662 (+) Sanger Institute
Kluyveromyces lactis Saccharomycetes - Chromosome E: 2138785 – 2140983 (+) Génolevures
Ashbya gossypii Saccharomycetes 1 Chromosome 7: 1125892 – 1127486 (-) Biozentrum and Syngenta
Saccharomyces kluyveri Saccharomycetes 1 Contig 4.8: 312314 – 314049 (-) Washington University Genome Sequencing Center, Génolevures
Kluyveromyces waltii Saccharomycetes 1 Contig 62: 28422 – 29929 (+) Broad Institute
Candida glabrata Saccharomycetes - Chromosome H: 556498 – 558378 (+) Génolevures
Vanderwaltozyma polyspora Saccharomycetes - Contig 1048b: 36130 – 38160 (-) Trinity College Dublin, Ireland
Saccharomyces castellii Saccharomycetes - Contig 664: 5363 – 7027 (-) Washington University Genome Sequencing Center
Saccharomyces bayanus Saccharomycetes - Contig 737: 12450 – 14405 (+) Washington University Genome Sequencing Center, Broad Institute, Génolevures
Saccharomyces kudriavzevii Saccharomycetes - Contig 02.2091: 16474 – 18432 (-) Washington University Genome Sequencing Center
Saccharomyces mikatae Saccharomycetes - Contig 34: 13635 – 15605 (-) Washington University Genome Sequencing Center, Broad Institute
Saccharomyces cerevisiae Saccharomycetes - Chromosome 16: 379117 – 379117 (-) Stanford Genome Technology Center, Sanger Institute, Broad Institute
Saccharomyces paradoxus Saccharomycetes - Contig 387: 39147 – 40522 (-) Broad Institute
Botrytis cinerea Leotiomycetes 2* Supercontig 4: 131839 – 133900 (+) Broad Institute and Syngenta AG, Genoscope
Sclerotinia sclerotiorum Leotiomycetes 2 Supercontig 7: 1397346 – 1399404 (-) Broad Institute
Magnaporthe grisea Sordariomycetes 2 Supercontig 27: 2198205 – 2200477 (+) Broad Institute
Verticillium dahliae Sordariomycetes 2 Supercontig 2: 411495 – 413529 (-) Broad Institute
Podospora anserina Sordariomycetes 1 Chromosome 1_SC1: 2337785 – 2339800 (+) Genoscope
Chaetomium globosum Sordariomycetes 3* Supercontig 1: 4237276 – 4239345 (+) Broad Institute
Neurospora crassa Sordariomycetes 2 Contig 6: 745962 – 748345 (-) Broad Institute
Nectria haematococca Sordariomycetes 2 Contig 1: 4240418 – 4242558 (+) Joint Genome Institute
Fusarium oxysporum Sordariomycetes 2 Supercontig 6: 882635 – 884756 (-) Broad Institute
Fusarium verticillioides Sordariomycetes 2 Supercontig 4: 851617 – 853730 (-) Broad Institute and Syngenta AG
Fusarium graminearum Sordariomycetes 2 Supercontig 6: 1217222 – 1219438 (+) Broad Institute
Epichloë festucae Sordariomycetes 2* Contig 09290: 7786 – 8664 (+); Contig 10180: 1 – 1318 (+) Oklahoma University
Trichoderma reesei Sordariomycetes 2 Contig 1: 484378 – 486398 (+) Joint Genome Institute
Trichoderma virens Sordariomycetes 2 Locus ABDF01000081: 60794 – 62787 (-) Joint Genome Institute
Trichoderma atroviride Sordariomycetes 2 Locus ABDG01000180: 45972 – 47983 (-) Joint Genome Institute
Alternaria brassicicola Dothideomycetes 2 Contig 2.99: 20908 – 22912 (+) Washington University
Mycosphaerella fijiensis Dothideomycetes 2 Scaffold 11: 1715901 – 1717838 (+) Joint Genome Institute
Mycosphaerella graminicola Dothideomycetes 2 Scaffold 5: 2832610 – 2834512 (-) Joint Genome Institute
Pyrenophora tritici-repentis Dothideomycetes 2 Supercontig 8: 1656744 – 1658773 (+) Broad Institute
Stagonospora nodorum Dothideomycetes 2 Supercontig 36: 142061 – 144139 (-) Broad Institute, International Stagonospora nodorum Genomics Consortium
Cochliobolus heterostrophus Dothideomycetes 2 Scaffold 3: 15429 – 17447 (-) Joint Genome Institute
Coccidioides immitis Eurotiomycetes 2 Chromosome 4: 837325 – 839343 (-) Broad Institute
Coccidioides posadasii Eurotiomycetes 2 Chromosome 5: 696037 – 698131 (+) TIGR and Broad Institute
Blastomyces dermatitidis Eurotiomycetes 2 Contig 5.48: 866 – 3141 (-) Washington University
Histoplasma capsulatum Eurotiomycetes 2 Contig 1161: 255583 – 257713 (+) Broad Institute, Washington University Genome Sequencing Center
Paracoccidioides brasiliensis Eurotiomycetes 2 Supercontig 2: 1922491 – 1924634 (-) Broad Institute
Uncinocarpus reesii Eurotiomycetes 2 Supercontig 5: 1771068 – 1772993 (+) Broad Institute
Ascosphaera apis Eurotiomycetes 1* Contig 6709: 1 – 1360 (-); Contig 28: 3916 – 5847 (-) Baylor College of Medicine
Microsporum gypseum Eurotiomycetes 1 Supercontig 8: 169115 – 170914 (-) Broad Institute
Talaromyces stipitatus Eurotiomycetes 2 Locus ABAS01000017: 261779 – 263672 (+) TIGR
Penicillium marneffei Eurotiomycetes 2 Locus ABAR01000011: 131838 – 133723 (-) TIGR
Neosartorya fischeri Eurotiomycetes 2 Contig 574: 1982352 – 1984281 (-) TIGR
Aspergillus clavatus Eurotiomycetes 2 Contig 83: 930562 – 932534 (+) TIGR
Aspergillus fumigatus Eurotiomycetes 2 Chromosome 3: 2185740 – 2187668 (+) Sanger Institute and TIGR
Aspergillus oryzae Eurotiomycetes 2 Supercontig 2: 3771013 – 3773030 (-) Broad Institute
Aspergillus terreus Eurotiomycetes 2 Supercontig 2: 1754327 – 1756250 (-) Broad Institute
Aspergillus flavus Eurotiomycetes 2 Contig 1: 3827471 – 3829494 (-) TIGR
Aspergillus nidulans Eurotiomycetes 2 Contig 51: 528863 – 530802 (+) Broad Institute and Monsanto
Aspergillus niger Eurotiomycetes 2 Contig 2: 3360920 – 3362951 (+) Joint Genome Institute
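The gene-location column above follows a regular pattern: a sequence unit (chromosome, supercontig, scaffold, contig or locus), a start and end coordinate, and a strand sign. Some genes are split over two contigs, separated by a semicolon. As a minimal sketch, assuming this pattern holds for every row, the Python below splits such a string into its parts and computes the genomic span, which includes any introns. The names `LOCATION`, `parse_location` and `parse_locations` are hypothetical and not part of any tool used in this thesis.

```python
import re

# One location fragment, e.g. "Scaffold 1: 2877572 – 2879654 (-)".
# The coordinate separator may be a hyphen or an en dash.
LOCATION = re.compile(
    r"^(?P<unit>.+?):\s*(?P<start>\d+)\s*[-–]\s*(?P<end>\d+)"
    r"\s*\(\s*(?P<strand>[+-])\s*\)$"
)


def parse_location(text):
    """Parse a single location fragment into its components."""
    match = LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location: {text!r}")
    start = int(match["start"])
    end = int(match["end"])
    return {
        "unit": match["unit"],
        "start": start,
        "end": end,
        "strand": match["strand"],
        "span": end - start + 1,  # inclusive length in base pairs
    }


def parse_locations(cell):
    """Parse a table cell that may hold several ';'-separated fragments."""
    return [parse_location(part) for part in cell.split(";") if part.strip()]


if __name__ == "__main__":
    for row in parse_locations(
        "Contig 6709: 1 – 1360 (-); Contig 28: 3916 – 5847 (-)"
    ):
        print(row)
```

A row such as the *Saccharomyces cerevisiae* entry, whose start and end coordinates are identical, would parse without error but yield a span of 1, which makes this kind of check a quick way to flag entries that need verification against the source database.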
Annex III. Fungal species in which orthologues of the MCM1 gene were identified. Taxonomy, number of introns, gene location and database.
Genus/species Taxonomy No. of introns Chromosome, Supercontig or Contig Sequencing Center
Phycomyces blakesleeanus Mucoromycotina 3 Scaffold 1: 2124764 – 2126294 (+) Joint Genome Institute
Mucor circinelloides Mucoromycotina 2 Scaffold 4: 3273504 – 3274607 (-) Broad Institute, Murcia University
Rhizopus oryzae Mucoromycotina 3 Supercontig 4: 1692147 – 1693115 (-) Broad Institute
Ustilago maydis Ustilaginomycetes - Contig 43: 64341 – 65705 (+) Broad Institute
Malassezia globosa Malasseziales 1 Sf 19.1: 27642 – 28805 (+) NCBI
Sporobolomyces roseus Microbotryomycetes 5 Scaffold 5: 1185071 – 1188512 (-) Joint Genome Institute
Cryptococcus neoformans Tremellomycetes 4 Chromosome 13: 39688 – 42628 (+) Broad Institute, Genome Sciences Center Canada, Duke Center for Genome Research, Stanford Genome Technology
Cryptococcus gattii Tremellomycetes 3 Supercontig 14: 629109 – 632178 (-) Broad Institute
Laccaria bicolor Agaricomycetes 2 Scaffold 3: 1957445 – 1958697 (+) Joint Genome Institute
Coprinus cinereus Agaricomycetes 2 Contig 175: 86341 – 87583 (+) Broad Institute
Phanerochaete chrysosporium Agaricomycetes 1 Scaffold 8: 889662 – 890764 (-) Joint Genome Institute
Postia placenta Agaricomycetes 1 Scaffold 93: 242447 – 243724 (+) Joint Genome Institute
Schizosaccharomyces japonicus Schizosaccharomycetes 2 Supercontig 1: 436734 – 438566 (+) Broad Institute
Schizosaccharomyces pombe Schizosaccharomycetes 2 Chromosome 1: 5292795 – 5294344 (+) Sanger Institute
Schizosaccharomyces octosporus Schizosaccharomycetes 2 Supercontig 1: 2428314 – 2429843 (+) Broad Institute
Yarrowia lipolytica Saccharomycetes 1 Chromosome C: 913297 – 914639 (-) Génolevures
Candida lusitaniae Saccharomycetes - Supercontig 5: 694152 – 694868 (-) Broad Institute
Candida guilliermondii Saccharomycetes - Supercontig 1: 710234 – 710938 (+) Broad Institute
Pichia stipitis Saccharomycetes - Chromosome 5: 1545724 – 1546455 (-) Joint Genome Institute
Debaryomyces hansenii Saccharomycetes - Chromosome F: 1872703 – 1873428 (+) Génolevures
Lodderomyces elongisporus Saccharomycetes - Supercontig 10: 288713 – 289579 (-) Broad Institute
Candida parapsilosis Saccharomycetes - Contig 005806: 325917 – 326762 (+) Sanger Institute
Candida tropicalis Saccharomycetes - Supercontig 7: 104018 – 104854 (+) Broad Institute, Génolevures
Candida albicans Saccharomycetes - Chromosome 7: 188023 – 188811 (-) Stanford Genome Technology Center, Sanger Institute and Broad Institute
Candida dubliniensis Saccharomycetes - Chromosome 7: 207792 – 208493 (-) Sanger Institute
Kluyveromyces lactis Saccharomycetes - Chromosome F: 2474742 – 2475452 (-) Génolevures
Ashbya gossypii Saccharomycetes - Chromosome 2: 561146 – 561766 (-) Biozentrum and Syngenta
Saccharomyces kluyveri Saccharomycetes - Contig 5.10: 2448 – 3053 (+) Washington University Genome Sequencing Center, Génolevures
Kluyveromyces waltii Saccharomycetes - Contig 206: 1 – 307 (-); Contig 205: 7934 – 8165 (-) Broad Institute
Candida glabrata Saccharomycetes - Chromosome F: 618976 – 619632 (-) Génolevures
Vanderwaltozyma polyspora Saccharomycetes - Contig 1073b: 43103 – 43882 (-) Trinity College Dublin, Ireland
Saccharomyces castellii Saccharomycetes - Contig 649: 39880 – 40560 (-) Washington University Genome Sequencing Center
Saccharomyces bayanus Saccharomycetes - Contig 579: 25293 – 26144 (-) Washington University Genome Sequencing Center, Broad Institute, Génolevures
Saccharomyces kudriavzevii Saccharomycetes - Contig 02.820: 794 – 1651 (-) Washington University Genome Sequencing Center
Saccharomyces mikatae Saccharomycetes - Contig 578: 4300 – 5130 (+) Washington University Genome Sequencing Center, Broad Institute
Saccharomyces cerevisiae Saccharomycetes - Chromosome 13: 353870 – 354730 (+) Stanford Genome Technology Center, Sanger Institute, Broad Institute
Saccharomyces paradoxus Saccharomycetes - Contig 178: 11877 – 12719 (-) Broad Institute
Botrytis cinerea Leotiomycetes 2* Supercontig 165: 67693 – 68463 (-) Broad Institute and Syngenta AG, Genoscope
Sclerotinia sclerotiorum Leotiomycetes 2 Supercontig 6: 1981916 – 1982762 (+) Broad Institute
Magnaporthe grisea Sordariomycetes 2 Supercontig 23: 1178204 – 1179169 (-) Broad Institute
Verticillium dahliae Sordariomycetes 2 Supercontig 3: 440666 – 441518 (+) Broad Institute
Podospora anserina Sordariomycetes 2 Chromosome 1_SC4: 1831500 – 1832359 (+) Genoscope
Chaetomium globosum Sordariomycetes 2 Supercontig 3: 3648631 – 3649505 (-) Broad Institute
Neurospora crassa Sordariomycetes 2 Contig 39: 300711 – 301579 (+) Broad Institute
Nectria haematococca Sordariomycetes 2 Contig 1: 1904784 – 1905808 (+) Joint Genome Institute
Fusarium oxysporum Sordariomycetes 2 Supercontig 3: 1396623 – 1397638 (-) Broad Institute
Fusarium verticillioides Sordariomycetes 2 Supercontig 2: 1298954 – 1299971 (-) Broad Institute and Syngenta AG
Fusarium graminearum Sordariomycetes 2 Supercontig 5: 973273 – 974303 (-) Broad Institute
Epichloë festucae Sordariomycetes 2 Contig 03502: 2589 – 3725 (-) Oklahoma University
Trichoderma reesei Sordariomycetes 2 Contig 1: 293329 – 294473 (+) Joint Genome Institute
Trichoderma virens Sordariomycetes 2 Locus ABDF01000238: 119788 – 120837 (-) Joint Genome Institute
Trichoderma atroviride Sordariomycetes 2 Locus ABDG01000087: 9802 – 10882 (-) Joint Genome Institute
Alternaria brassicicola Dothideomycetes 2 Contig 7.54: 1842 – 2668 (+) Washington University
Mycosphaerella fijiensis Dothideomycetes 2 Scaffold 5: 691696 – 692478 (+) Joint Genome Institute
Mycosphaerella graminicola Dothideomycetes 2 Scaffold 7: 131733 – 132524 (-) Joint Genome Institute
Pyrenophora tritici-repentis Dothideomycetes 2 Supercontig 5: 186836 – 187667 (+) Broad Institute
141<br />
ANNEXES<br />
Stagonospora no<strong>do</strong>rum Dothideomycetes 2 Supercontig 11: 1046577 – 1047421 (+) Broad Institute, International<br />
Stagonospora no<strong>do</strong>rum Genomics<br />
Consortium<br />
Cochliobolus heterostrophus Dothideomycetes 2 Scaffold 2: 392134 - 393018 (-) Joint Genome Insttitute<br />
Coccidioides immitis Eurotiomycetes 2 Chromosome 3: 322326 – 323200 (-) Broad Institute<br />
Coccidioides posadasii Eurotiomycetes 2 Chromosome 2: 790014 – 790884 (-) TIGR and Broad Institute<br />
Blastomyces dermatitidis Eurotiomycetes 2 Contig 5593.1: 398 – 885 (-) Washington University<br />
Histoplasma capsulatum Eurotiomycetes 2 Supercontig 1.3: 251153 – 252083 (+) Broad institute, Washington University<br />
Genome Sequencing Center<br />
Paracoccidioides brasiliensis Eurotiomycetes 2 Supercontig 12: 450142-451180 (+) Broad Institute<br />
Uncinocarpus reesii Eurotiomycetes 2 Supercontig 2: 4576943-4577833 (+) Broad Institute<br />
Ascosphaera apis Eurotiomycetes 1 Contig 2371: 17 – 538 (-); Contig 6891: 66 –<br />
1133 (+)<br />
Baylor College of Medicine<br />
Microsporum gypseum Eurotiomycetes 2 Supercontig 1: 3737424-3738317 (+) Broad Institute<br />
Talaromyces stipitatus Euriotiomycetes 2 Locus ABAS01000004: 938690 – 939519 (+) TIGR<br />
Penicillium marneffei Euriotiomycetes 2 Locus ABAR01000013 : 274356 – 275187 (+) TIGR<br />
Neosartorya fischeri Eurotiomycetes 2 Contig 582: 357403-358305 (-) TIGR<br />
Aspergillus clavatus Eurotiomycetes 2 Contig 85: 946418-947301 (+) TIGR<br />
Aspergillus fumigatus Eurotiomycetes 2 Chromosome 6: 357383-358283 (-) Sanger Institute and TIGR<br />
Aspergillus oryzae Eurotiomycetes 2 Supercontig 17: 221157-222072 (-) Broad Institute<br />
Aspergillus terreus Eurotiomycetes 2 Supercontig 10: 1111982-1112918 (+) Broad Institute<br />
Aspergillus flavus Eurotiomycetes 2 Contig 9: 471345-472255 (-) TIGR<br />
Aspergillus nidulans Eurotiomycetes 2 Contig 159: 33815-34722 (-) Broad Institute and Monsanto<br />
Aspergillus niger Eurotiomycetes 2 Contig 9: 222598 - 223521 (-) Joint Genome Institute<br />
Annex IV. Fungal species in which orthologous IFF8 genes were identified. Taxonomy, number of introns, gene location and database.<br />
Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />
Candida lusitaniae Saccharomycetes - Supercontig 7: 753415 - 755439 (+) Broad Institute<br />
Candida guilliermondii Saccharomycetes - Supercontig 2: 36309 - 39698 (+) Broad Institute<br />
Pichia stipitis Saccharomycetes - Chromosome 1: 307099 – 311559 (+) Joint Genome Institute<br />
Debaryomyces hansenii Saccharomycetes - Chromosome B: 107946 - 112736 (-) Génolevures<br />
Lodderomyces elongisporus Saccharomycetes - Contig 1.123: 280245 - 287281 (-) Broad Institute<br />
Candida parapsilosis Saccharomycetes - Contig 005807: 1470770 – 1475785 (+) Sanger Institute<br />
Candida tropicalis Saccharomycetes - Supercontig 10: 128215 - 134073 (+) Broad Institute, Génolevures<br />
Candida albicans Saccharomycetes - Chromosome 5: 164521 - 166665 (-) Stanford Genome Technology Center, Sanger Institute and Broad Institute<br />
Candida dubliniensis Saccharomycetes - Chromosome 5: 187922 - 189868 (+) Sanger Institute<br />
Annex V. Fungal species in which orthologous SMP1 genes were identified. Taxonomy, number of introns, gene location and database.<br />
Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />
Candida glabrata Saccharomycetes - Chromosome M: 657290 – 659269 (-) Génolevures<br />
Vanderwaltozyma polyspora Saccharomycetes - Contig 397: 8567 – 10588 (-) Trinity College Dublin, Ireland<br />
Saccharomyces castellii Saccharomycetes - Contig 721: 27914 - 29635 (-) Washington University Genome Sequencing Center<br />
Saccharomyces bayanus Saccharomycetes - Contig 3: 5420 – 6778 (+) Washington University Genome Sequencing Center, Broad Institute, Génolevures<br />
Saccharomyces kudriavzevii Saccharomycetes - Contig 02.1807: 8672 – 9925 (+) Washington University Genome Sequencing Center<br />
Saccharomyces mikatae Saccharomycetes - Contig 100: 1952 – 3295 (+) Washington University Genome Sequencing Center, Broad Institute<br />
Saccharomyces cerevisiae Saccharomycetes - Chromosome 2: 594859 – 593501 (+) Stanford Genome Technology Center, Sanger Institute, Broad Institute<br />
Saccharomyces paradoxus Saccharomycetes - Contig 210: 35931 – 37187 (-) Broad Institute<br />
Annex VI. Fungal species in which orthologous ARG80 genes were identified. Taxonomy, number of introns, gene location and database.<br />
Genus/species Taxonomy N° Intron Chromosome, Supercontig or Contig Sequencing Center<br />
Candida glabrata Saccharomycetes - Chromosome F: 620431 – 621027 (-) Génolevures<br />
Vanderwaltozyma polyspora Saccharomycetes - Contig 1073b: 45942 – 46289 (-) Trinity College Dublin, Ireland<br />
Saccharomyces castellii Saccharomycetes - Contig 709: 49895 - 50539 (-) Washington University Genome Sequencing Center<br />
Saccharomyces bayanus Saccharomycetes - Contig 579: 26917 – 27447 (-) Washington University Genome Sequencing Center, Broad Institute, Génolevures<br />
Saccharomyces kudriavzevii Saccharomycetes - Contig 02.1514: 585 – 1130 (-) Washington University Genome Sequencing Center<br />
Saccharomyces mikatae Saccharomycetes - Contig 2531: 1149 – 1703 (-) Washington University Genome Sequencing Center, Broad Institute<br />
Saccharomyces cerevisiae Saccharomycetes - Chromosome 13: 352602 – 353135 (+) Stanford Genome Technology Center, Sanger Institute, Broad Institute<br />
Saccharomyces paradoxus Saccharomycetes - Contig 178: 13466 – 14017 (-) Broad Institute<br />
ANNEX VII<br />
Likelihood value, parameter estimates, and sites under positive selection as inferred under the six models proposed to calculate ω over codons in MADS-box of orthologue RLM1 genes. (PAML 4.1)<br />
Model | l | Estimates of parameters | dN/dS | Positively selected sites<br />
One-ratio (M0) | -8907.095177 | ω = 0.0185 | 0.0185 | None<br />
Neutral (M1a) | -8881.920467 | p: 0.97143 0.02857; ω: 0.01692 1.00000 | 0.0450 | Not allowed<br />
Selection (M2a) | -8881.920467 | p: 0.97143 0.00817 0.02040; ω: 0.01692 1.00000 1.00000 | 0.0450 | None<br />
Free-ratio (M3) | -8595.963248 | p: 0.35665 0.42127 0.22208; ω: 0.00122 0.01527 0.06434 | 0.0212 | None<br />
Beta (M7) | -8575.525557 | p = 0.42029, q = 13.55916 | 0.0283 | Not allowed<br />
Beta & ω (M8) | -8575.526257 | p0 = 0.99999, p = 0.42029, q = 13.55916; (p1 = 0.00001) ω = 1.00000 | 0.0283 | None<br />
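For the models with discrete site classes (M1a, M2a and M3), the dN/dS column of Annex VII appears to be the proportion-weighted average of the per-class ω values. The following is a minimal sketch of that arithmetic, not the routine PAML itself runs; the function name `mean_omega` is hypothetical and the numbers are copied from the table.

```python
def mean_omega(proportions, omegas):
    """Proportion-weighted average of per-class omega (dN/dS) values."""
    if len(proportions) != len(omegas):
        raise ValueError("need exactly one omega per site class")
    if abs(sum(proportions) - 1.0) > 1e-3:
        raise ValueError("site-class proportions should sum to 1")
    return sum(p * w for p, w in zip(proportions, omegas))


# Site-class estimates for the MADS-box alignment (Annex VII).
m1a = mean_omega([0.97143, 0.02857], [0.01692, 1.00000])
m2a = mean_omega([0.97143, 0.00817, 0.02040], [0.01692, 1.00000, 1.00000])
m3 = mean_omega([0.35665, 0.42127, 0.22208], [0.00122, 0.01527, 0.06434])

for name, value in (("M1a", m1a), ("M2a", m2a), ("M3", m3)):
    print(f"{name}: mean omega = {value:.4f}")
```

Under these assumptions the three averages agree with the tabulated 0.0450, 0.0450 and 0.0212 to the reported precision.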
ANNEX VIII<br />
Likelihood value, parameter estimates, and sites under positive selection as inferred under the six models proposed to calculate ω over codons in Saccharomycotina RLM1 gene. (HYPHY 1.0)<br />
Model | l | Estimates of parameters | dN/dS | Positively selected sites<br />
One-ratio (M0) | -39464.90296 | ω = 0.13616 | 0.13616 | None<br />
Neutral (M1a) | -40768.67683 | p: 0.9999343 0.0000657; ω: 1.00000 1.00000 | 1.000 | Not allowed<br />
Selection (M2a) | -39216.75175 | p: 0.66539 0.33461 0.00000; ω: 0.13860 1.00000 2.98229 | 0.42684 | None<br />
Free-ratio (M3) | -38569.21552 | p: 0.11323 0.46676 0.42001; ω: 0.01243 0.10299 0.28431 | 0.16889 | None<br />
Beta (M7) | -38570.03576 | p = 0.96651, q = 4.56744 | 0.17465 | Not allowed<br />
Beta & ω (M8) | -38570.03383 | p0 = 0.99999, p = 0.96573, q = 4.56035; (p1 = 0.00001) ω = 1.05491 | 0.17476 | None<br />
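For the beta-based models, the tabulated dN/dS values in Annex VIII are close to the analytical mean of the fitted distribution: p/(p + q) for M7, and for M8 a mixture of that mean (weight p0) with the extra class ω (weight p1 = 1 - p0). The sketch below only illustrates this relationship; the helper names are hypothetical, and software that discretises the beta distribution into a few categories may report slightly different means.

```python
def beta_mean(p, q):
    """Analytical mean of a Beta(p, q) distribution."""
    return p / (p + q)


def m8_mean_omega(p0, p, q, omega_extra):
    """Mean omega for a beta class (weight p0) plus one extra omega class."""
    return p0 * beta_mean(p, q) + (1.0 - p0) * omega_extra


# Parameter estimates for the Saccharomycotina RLM1 alignment (Annex VIII).
m7 = beta_mean(0.96651, 4.56744)
m8 = m8_mean_omega(0.99999, 0.96573, 4.56035, 1.05491)

print(f"M7 mean omega = {m7:.5f}")
print(f"M8 mean omega = {m8:.5f}")
```

Both results fall within about 1e-4 of the values listed for M7 and M8 in the table.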
ANNEX IX<br />
Likelihood ratio test statistics for comparing site specific models for MADS-box and Saccharomycotina RLM1 gene.<br />
Analysis (DF) | M0 vs M3 (4) | M1a vs M2a (2) | M7 vs M8 (2)<br />
MADS-box | 622.26* | 0 | 0.0014<br />
Saccharomycotina Rlm1 gene | 1791.37* | 3103.85* | 0.0039<br />
p value | p > 0.05 | p > 0.05 | p > 0.05<br />
Degrees of freedom (DF), corresponding to the difference of the number of parameters in the models, are reported in brackets, significant values are marked by *.<br />
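The statistics in Annex IX can be reproduced from the likelihood values of Annexes VII and VIII as twice the log-likelihood difference between the nested models, compared against a chi-square distribution with the stated degrees of freedom. The sketch below assumes exactly that procedure; the function names are hypothetical, and the p-value uses a closed form that holds only for even degrees of freedom (2 and 4 here) so that no external library is needed.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic: twice the log-likelihood gain of the alternative."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form only holds for positive even df")
    if x <= 0:
        return 1.0  # no improvement in likelihood, so no evidence against the null
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# (null lnL, alternative lnL, DF), copied from Annexes VII and VIII.
comparisons = {
    ("MADS-box", "M0 vs M3"): (-8907.095177, -8595.963248, 4),
    ("MADS-box", "M1a vs M2a"): (-8881.920467, -8881.920467, 2),
    ("MADS-box", "M7 vs M8"): (-8575.525557, -8575.526257, 2),
    ("Saccharomycotina", "M0 vs M3"): (-39464.90296, -38569.21552, 4),
    ("Saccharomycotina", "M1a vs M2a"): (-40768.67683, -39216.75175, 2),
    ("Saccharomycotina", "M7 vs M8"): (-38570.03576, -38570.03383, 2),
}

results = {}
for key, (lnl_null, lnl_alt, df) in comparisons.items():
    stat = lrt_statistic(lnl_null, lnl_alt)
    results[key] = (stat, chi2_sf_even_df(stat, df))
    print(f"{key[0]:<17} {key[1]:<11} 2*dlnL = {stat:10.4f}  p = {results[key][1]:.3g}")
```

For the MADS-box M7 vs M8 comparison the raw difference is marginally negative (M8 has a slightly lower likelihood than M7), which is most likely a convergence artefact; the sketch treats it as no evidence against the null model.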