Broad Street Scientific 2018-2019

BROAD 

STREET 

SCIENTIFIC 

VOLUME 8 | 2018-2019 

The North Carolina School of Science and Mathematics Journal of 

Student STEM Research

Front Cover 

This map segment of Durham features 

the North Carolina School of Science and 

Mathematics campus (upper-center) as well as 

the nearby Duke University campus (lower-left)

and several historic neighborhoods. Map 

data © 2019 Google; Image created by Kathleen 

Hablutzel. 

Approximate Scale: 6.5 kilometers 

Biology Section 

This image of the dorsal raphe nucleus labels 

dopamine neurons in green, red, and yellow. 

This region of the brain is critical in generating 

the increased sociability that typically occurs 

after a period of social isolation. Image credit 

Gillian Matthews, Ungless Lab, Imperial 

College London. 

Approximate Scale: 450 micrometers 

Chemistry Section 

This is a transmission electron microscope image 

of a graphene lattice. Graphene is a periodic 

structure entirely composed of carbon atoms. At 

this scale, individual atoms can be observed at 

the corners of the hexagons. Image credit Ethan 

Minot, Department of Physics, Oregon State 

University (Original grayscale image colorized). 

Approximate Scale: 4 nanometers

Engineering Section 

Visualizing patterns of air traffic over the 

contiguous United States reveals major airports 

and commonly flown-over regions. The darkest 

areas receive few or no flyovers. Image credit

Aaron Koblin, Scott Hessels, and Gabriel 

Dunne, UCLA. 

Approximate Scale: 4500 kilometers 

Mathematics and Computer Science Section 

The Opte Project aims to visualize the internet 

by mapping routing paths from all over the 

world. Each color represents computers from a 

different region of the world. This visualization 

is from 2015. Image credit Barrett Lyon/The 

Opte Project. 

Approximate Scale: One zettabyte for the 

year 2015 

Physics Section 

The Baryon Oscillation Spectroscopic Survey 

(BOSS) Great Wall, a galaxy supercluster, is 

one of the largest structures in the observable 

universe. This image shows a simulation of how 

galaxy clusters form. The filaments are regions 

where galaxies are more likely to be found. 

Image credit Volker Springel, Max Planck 

Institute for Astrophysics. 

Approximate Scale: 1 billion light years 

Back Cover 

Scientific collaborations span the globe. 

This map depicts collaboration networks 

between researchers in different locations. 

Collaborations often - but not always - seem 

to follow linguistic and cultural connections.

Image computed by Oliver H. Beauchesne and 

SCImago Lab, data by Elsevier Scopus. 

Approximate Scale: 39,000 kilometers

TABLE of CONTENTS 

Letter from the Chancellor

Words from the Editors

Broad Street Scientific Staff

Essay: The AI We Haven't Considered

JACKSON MEADE, 2020

Biology

Overexpression of a Heat Shock Protein in Cyanobacteria to Increase Growth Rate

ROBERT LANDRY, 2019

Hypoglycemic Effect of Momordica charantia Against Type 2 Diabetes Modeled in Bombyx mori

AARUSHI VENKATAKRISHNAN, 2019

Chemistry

Tetraethyl Orthosilicate-Polyacrylonitrile Hybrid Membranes and their Application in Redox Flow Batteries

ETHAN FREY, 2019

Novel Synergistic Antioxidative Interactions Between Soy Lecithin and Cyclodextrin-Encapsulated Quercetin in a Lipid Matrix

ANIRUDH HARI, 2019

Utilization of Atomic Layer Deposition to Create Novel Metal Oxide Photoanodes for Solar-Driven Water Splitting

ANNIE WANG, 2019

Engineering

Using a Hybrid Machine Learning Approach for Test Cost Optimization in Scan Chain Testing

LUKE DUAN, 2019

Novel Water Desalination Filter Utilizing Granular Activated Carbon

GEOFFREY FYLAK, 2019

Mathematics and Computer Science

Long Prime Juggling Patterns

DANIEL CARTER AND ZACH HUNTER, 2019

An Analysis of a Novel Neural Network Architecture

VATSAL VARMA, 2019 ONLINE

Physics

Effects of Relativity on Quadrupole Oscillations of Compact Stars

ABHIJIT GUPTA, 2019

Effect of Elliptic Flow Fluctuations on the Two- and Four-Particle Azimuthal Cumulant

BRIAN LIN, 2019

Featured Article

An Interview with Dr. Valerie Ashby

LETTER from the CHANCELLOR 

"Science is a cooperative enterprise, spanning the generations. It's the passing of a torch from teacher, to student, to 

teacher. A community of minds reaching back to antiquity and forward to the stars." 

~ Dr. Neil deGrasse Tyson 

I am proud to introduce the eighth edition of the 

North Carolina School of Science and Mathematics’ 

(NCSSM) scientific journal, Broad Street Scientific. Each 

year students at NCSSM conduct significant scientific 

research, and Broad Street Scientific is a student-led and 

student-produced showcase of some of the impressive 

research being done by students. 

Excellence in scientific research has a deep and 

far-reaching impact on nearly every aspect of daily life, 

including (among other areas) health care, food safety, 

space travel, national security, and the environment. 

When NCSSM students are given opportunities to apply 

their learning through research, they are doing more than 

increasing their individual knowledge; their valuable 

work is increasing our collective body of knowledge 

and strengthening our ability to address current global 

challenges and prepare for those to come. 

Opened in 1980, NCSSM was the nation’s first public 

residential high school where students study a specialized 

curriculum emphasizing science and mathematics. 

Teaching students to do research and providing them with 

opportunities to conduct high-level research in biology, 

chemistry, physics, computational science, engineering 

and computer science, math, humanities, and the social 

sciences are critical components of NCSSM’s mission 

to educate academically talented students to become 

state, national, and global leaders in science, technology, 

engineering, and mathematics. I am thrilled that each year 

we continue to increase the outstanding opportunities 

NCSSM students have to participate in research. 

This publication serves to highlight some of the high-quality research students conduct each year at NCSSM

under the direction of our outstanding faculty and in 

collaboration with researchers at major universities. For 

thirty-four years, NCSSM has showcased student research 

through our annual Research Symposium each spring and 

at major research competitions such as the Regeneron 

Science Talent Search and the International Science and 

Engineering Fair. The publication of Broad Street Scientific 

provides another opportunity to share with the broader 

community the outstanding research being conducted by 

NCSSM students each year. 

I would like to thank all of the students and faculty 

involved in producing Broad Street Scientific, particularly 

faculty sponsor Dr. Jonathan Bennett, and senior editors 

Emily Wang, Navami Jain, and Kathleen Hablutzel. 

Explore and enjoy! 

Dr. Todd Roberts, Chancellor 


WORDS from the EDITORS 

Welcome to the Broad Street Scientific, NCSSM’s journal 

of student research in science, technology, engineering 

and mathematics. In this eighth edition of Broad Street 

Scientific, we hope to inspire readers to get involved in the 

scientific community by sharing the innovative research 

conducted by our students. We hope you enjoy this year’s 

edition! 

This year’s theme is networks: the connections we 

find within and between groups throughout our world. 

Connectivity is an integral component of modern life, 

and studying people or objects interacting in networks 

allows us to describe collective behavior of groups. 

Billions of interconnected neurons comprise the human 

brain, yet a brain is more than a bunch of cells. Brains can 

think, feel, and act both consciously and unconsciously. 

Thus, networks do not simply behave as the sum of their 

parts. Networks are powerful in predicting the complex 

behavior of a dynamic group without needing complex 

information on each individual in a network. For example, 

networks can predict the spread of an infectious disease 

without needing information on each individual in the 

network. Networks are powerful tools in describing our 

interconnected world. 

In the featured images of this journal, we explore the 

scales of networks, from the atomic to astronomical levels. 

The featured image for the Chemistry section displays 

a network of carbon atoms on the scale of fractions of 

a nanometer, while the featured image for the Physics 

section displays a network of superclusters of galaxies on 

the scale of approximately one billion light years – one of 

the largest known structures in the universe. On any scale, 

our world is built on interactions, and these interactions 

organize our world into networks. 

We would like to thank the faculty, staff and 

administration of NCSSM for their continued support 

of our student researchers. It is this unmatched

encouragement that prepares us to use our interests and 

skills in STEM to address problems in our community, 

both locally and beyond. For 39 years, NCSSM has 

fostered an environment conducive to learning through 

encouraging students to take risks and take ownership of 

their academic path. We would especially like to thank our 

faculty advisor, Dr. Jonathan Bennett, for his support and 

guidance throughout the publication process. We would 

also like to thank Chancellor Dr. Todd Roberts, Dean of 

Science Dr. Amy Sheck, and Director of Mentorship and 

Research Dr. Sarah Shoemaker. Lastly, the Broad Street 

Scientific would like to acknowledge Dr. Valerie Ashby, 

chemistry professor and Dean of Trinity College of Arts 

and Sciences at Duke University, for speaking with us 

about her inspiring journey in STEM and offering advice 

to young prospective scientists. 

Kathleen Hablutzel, Navami Jain, and Emily Wang 

Editors-in-Chief 


BROAD STREET SCIENTIFIC STAFF 

Editors-in-Chief 

Kathleen Hablutzel, 2019 

Navami Jain, 2019 

Emily Wang, 2019 

Publication Editors 

Rohit Jagga, 2020 

Grishma Patel, 2019 

Sanjana Pothugunta, 2020 

Eleanor Xiao, 2020 

Biology Editors 

Megan Wu, 2019 

Ishaan Maitra, 2020 

Joseph Wang, 2020 

Chemistry Editors 

Melody Wen, 2020 

Varun Varanasi, 2020 

Engineering Editors 

Aakash Kothapally, 2020 

Jason Li, 2020 

Mathematics and 

Computer Science Editors 

Hahn Lheem, 2019 

Olivia Fugikawa, 2020 

Physics Editors 

Will Staples, 2020 

Ben Wu, 2020 

Faculty Advisor 

Dr. Jonathan Bennett 


THE AI WE HAVEN'T CONSIDERED 

Jackson Meade 

Jackson Meade was selected as the winner of the 2018-2019 Broad Street Scientific Essay Contest. His award included the 

opportunity to interview Dr. Valerie Ashby, distinguished chemist and professor and Dean of Trinity College of Arts and 

Sciences at Duke University. This interview can be found in the Featured Article section of the journal. 

“People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve 

already taken over the world.” 

~ Pedro Domingos 

When we bring up artificial intelligence in 

conversation, the rhetoric is relatively future-oriented. 

Discussions about the “possibilities” AI possesses – and 

the dangers it poses – abound, all in the context of what 

our technological future holds. But peel back the layer of 

speculation, and you may find something surprising. It 

might not be obvious, but artificial intelligence is already 

here – in fact, it’s everywhere. 

Though that statement sounds concerning, there isn’t 

a conspiracy of shadowy Artificial Intelligences operating 

behind the backs of the public. We’ve simply grown accustomed to their presence in our systems. Artificial

Intelligence, through “Machine Learning,” started 

accelerating in 1957, when Frank Rosenblatt designed 

the first Neural Network, called a perceptron (Lewis & 

Denning), to model the structure of the human brain 

(Marr). By 1985, Professor Terry Sejnowski had created 

NetTalk, which could pronounce 20,000 English words 

with just a week of training (New York Times). 

When you flip through a stuffed email inbox, machine 

learning keeps it from exploding by marking most of the 

spam and trashing it, arguably with impressive precision 

(Aski & Sourati). Go to your search engine and type your 

query, and the “suggested search” bar that appears at the 

bottom, as well as the results your query generates, are the 

product of a well-trained, personalized machine learning 

algorithm (Schachinger). When you purchase something 

on Amazon or scroll through your recommended videos 

page on YouTube, a machine learning system makes sure 

you see the kinds of things you might want to watch or 

buy, even if you couldn’t articulate it yourself. If you are 

looking at a screen, it is likely that machine learning had 

its hands (for lack of a more computerized term) in it. 

Since the early days of computing, computers have 

required painstaking algorithms – increasing by orders of 

magnitude in complexity – to do anything from displaying 

text to managing Google’s 40,000 search queries per 

second (Alphabet, Inc). This creates a ceiling of capabilities 

for our systems that grows infinitely harder to raise. But 

computer systems are, after all, human systems, and we 

should model them that way. This is exactly what machine 

learning algorithms do. Based on inferences from the data 

we give them, they teach themselves how to analyze and 

manipulate it, and the more data we give them, the better 

they get at doing their jobs (Faggella; Lewis & Denning).

This is significantly “human” – barring willful ignorance, 

we get better at analyzing and understanding our world 

given new information. 
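The idea of a system that adjusts itself from examples, rather than following hand-written rules, can be illustrated with the perceptron mentioned earlier. What follows is only a minimal sketch in Python, not Rosenblatt's original design: the function names, the learning rate, the number of passes, and the tiny logical-AND data set are all assumptions chosen for illustration.

```python
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Learn weights and a bias with the classic perceptron update rule."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = bias + sum(w * x for w, x in zip(weights, inputs))
            prediction = 1 if activation > 0 else 0
            error = target - prediction  # 0 when the guess was right
            # Nudge the weights toward the correct answer only after a mistake.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Classify one input with the learned weights and bias."""
    activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation > 0 else 0


# Hypothetical training data: the logical AND of two inputs.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_DATA)
predictions = [predict(weights, bias, inputs) for inputs, _ in AND_DATA]
print(predictions)
```

Nothing in the code states the rule for AND; the weights settle on it only because the examples keep correcting the model's mistakes.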

Despite possible concerns, you are kept safe because of 

machine learning. In 2014, Kaspersky Lab's Anti-Malware 

Research Team processed between 200,000 and 315,000 

malicious files per day (Kaspersky Lab). But malicious files 

aren’t so different from each other, so machine learning 

algorithms can very easily identify the code for files with 

malicious intent far faster than any human actors could. In 

a country and world growing ever more concerned with 

data security, these algorithms provide a necessary wall 

between us and the actions of evil people. 

In our finances, we’re relinquishing control to the 

machines as well. Micromanagement of our funds is a 

multibillion-dollar business, and artificial intelligence 

completely disrupts it. While humans are good at predicting 

what the stock market can do over large spans of time 

because of noticeable trends, on smaller and smaller time 

scales and in more volatile markets, our grand spending 

schemes are fundamentally nothing short of guesswork. 

And while machine learning algorithms are admittedly 

built on guesswork, they can achieve super-human levels 

of accuracy during training on multitudes of data that are 

simply unattainable for even the most dedicated human. 

Predicting stocks is not the only artificial-intelligence-guided moneymaker around. Advertising is one of the most

lucrative businesses of the modern world, having generated 

about $32.66 billion in revenue for Google’s parent

company, Alphabet, in Quarter 2 of 2018 (D’Onfro). This 

comes from thousands of paying customers, all of them 

companies hoping their product appeals to the right niche, 

and only works because of machine learning. 


In this realm, one cannot avoid the topic of driverless 

cars. Artificial intelligences are crucial to computer vision 

algorithms (Khirodkar, Yoo, & Kitani), though other 

hard-coded solutions can aid them. 74% of automotive 

company executives expect that these smart cars will be on 

the road by 2025, according to a report from IBM (IBM). 

The menial tasks of our lives – our driving, our purchases 

– will be automated if they can be. 

We have been exploring the risks of developing artificial intelligences since before we could make them. In 1942, science fiction author Isaac Asimov published his now-famous laws of robotics in a short story, “Runaround.”

They stated: 

First, “A robot may not injure a human being or, 

through inaction, allow a human being to come to harm.” 

Second, “A robot must obey the orders given to it by 

human beings, except where such orders would conflict 

with the First Law.”

Third, “A robot must protect its own existence as long 

as such protection does not conflict with the First or 

Second Law.” 

But restricting our concerns about Artificial Intelligence 

to this view is too narrow. It comes from an assumption 

about the types of intelligences we intend to create. It 

assumes that we will “build ourselves” – that we will build 

copies of humans, in humanoid robot bodies with human 

emotions and human capabilities. 

We are a species that changes its environment to fit its 

needs instead of adapting to its surroundings. Machine 

learning and artificial intelligence are the newest evolution 

of this pattern – just another way that the world and the 

patterns within it can be adjusted according to our wishes. 

The patterns of our world once influenced us to a degree 

we could not control, but artificial intelligence will allow 

us to take full control and then completely relinquish it. All 

this works because machine learning is based in prediction 

– on understanding the once-unintelligible patterns that 

comprise the fabric of our world. That is a flaw that could 

spell the end of our humanity. 

It seems unlikely a malicious AI will attempt to literally 

end life on Earth. At the least, we have Asimov’s three laws 

to thank for that. But in a world where everything can be 

predicted, where everything we want to see can be shown 

to us, and where things that are “unpopular” or “troubling” 

never reach our eyes, it feels like a part of our humanity 

is lost. An artificial intelligence could operate in plain 

sight, tailoring our world to the patterns that dictate us. 

As mentioned, artificial intelligences are human systems, 

so they will follow the human model of changing the world 

to fit their needs. It is reasonable that if our needs rely on a 

series of predictable patterns, then an artificial intelligence 

with benevolent intentions could inadvertently neutralize 

the world’s ideological diversity and the differences that 

give us the human condition. 

This isn’t to say that we shouldn’t create artificial 

intelligences – in fact, it seems clear that our modern 

world couldn’t operate without them. There are proactive 

steps we must take to be stewards of our humanity. We 

must make an active choice to diversify our interests 

and the viewpoints to which we expose ourselves, even 

when they aren’t completely satisfying. We should model 

another fundamental element of our humanity into our 

artificial intelligences: variation. Our machine learning 

algorithms cannot rely on optimizing patterns alone – they 

must contain anomalies in their paradoxically predictive, 

average-based algorithms. If we do this, we can ensure 

that our artificial intelligences will enhance us instead of 

dictating conformity. 
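One way to picture "anomalies" inside a predictive system is a recommender that usually shows the item it predicts we want but occasionally shows something else. The sketch below uses an epsilon-greedy rule, a standard exploration technique that is swapped in here purely for illustration; the essay does not name a method. The topic names, interest scores, and the value of epsilon are hypothetical.

```python
import random


def recommend(scores, epsilon, rng):
    """Return the top-scoring item most of the time, a random item otherwise."""
    if rng.random() < epsilon:
        # Deliberate variation: ignore the prediction and pick any item.
        return rng.choice(sorted(scores))
    return max(scores, key=scores.get)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical predicted interest in three topics for one reader.
predicted_interest = {"cooking": 0.9, "politics": 0.4, "astronomy": 0.2}

# With epsilon = 0 the reader only ever sees the predicted favorite.
pure = {recommend(predicted_interest, 0.0, rng) for _ in range(200)}

# With epsilon = 0.3 other topics also reach the reader.
varied = {recommend(predicted_interest, 0.3, rng) for _ in range(200)}

print(sorted(pure))
print(sorted(varied))
```

The single parameter epsilon controls how often prediction is set aside, which makes the trade-off between satisfaction and variety an explicit design choice.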

References 

A Q & A with Pedro Domingos: Author of ‘The Master Algorithm’ [Interview by J. Langston]. (2015, September 17). Retrieved January 4, 2019, from https://www.washington.edu/news/2015/09/17/a-q-a-with-pedro-domingos-author-of-the-master-algorithm/

Aski, A. S., & Sourati, N. K. (2016). Proposed efficient algorithm to filter spam using machine learning techniques. Pacific Science Review A: Natural Science and Engineering, 18(2), 145-149. doi:10.1016/j.psra.2016.09.017

D’Onfro, J. (2018, July 23). Alphabet jumps after big earnings beat. Retrieved January 7, 2019, from https://www.cnbc.com/2018/07/23/alphabet-earnings-q2-2018.html

Faggella, D. (2018, December 21). What is Machine Learning? | Emerj - Artificial Intelligence Research and Insight. Retrieved January 9, 2019, from https://emerj.com/ai-glossary-terms/what-is-machine-learning/

Google Search Trends, Search Per Second. (n.d.). Retrieved January 12, 2019, from https://trends.google.com/trends/?geo=US

IBM (2015). Automotive 2025: Industry without Borders. IBM Institute for Business Value. Retrieved January 9, 2019, from http://www-935.ibm.com/services/multimedia/GBE03640USEN.pdf

Kaspersky Lab is Detecting 325,000 New Malicious Files Every Day. (n.d.). Retrieved January 5, 2019, from https://www.kaspersky.com/about/press-releases/2014_kaspersky-lab-is-detecting-325000-new-malicious-files-every-day

Khirodkar, R., Yoo, D., & Kitani, K. M. (2018). VADRA: Visual Adversarial Domain Randomization and Augmentation. Carnegie Mellon University. Retrieved December 10, 2018, from https://arxiv.org/pdf/1812.00491.pdf

Learning, Then Talking. (1988, August 16). Retrieved January 6, 2019, from https://www.nytimes.com/1988/08/16/science/learning-then-talking.html

Lewis, T. G., & Denning, P. J. (2018). The Profession of IT: Learning Machine Learning. Communications of the ACM, 61(12), 24-27. Retrieved December 26, 2018, from https://calhoun.nps.edu/bitstream/handle/10945/60898/Denning_Learning_Machine_Learning_ACM_2018-12.pdf?sequence=1&isAllowed=y

Levenson, E. (2014, January 31). The TSA is in the Business of ‘Security Theater,’ Not Security. Retrieved January 7, 2019, from https://www.theatlantic.com/national/archive/2014/01/tsa-business-security-theater-not-security/357599/

Marr, B. (2016, March 08). A Short History of Machine Learning -- Every Manager Should Read. Retrieved January 5, 2019, from https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#2493077c15e7

Matney, L. (2017, May 17). Google has 2 billion users on Android, 500M on Google Photos. Retrieved January 5, 2019, from https://techcrunch.com/2017/05/17/google-has-2-billion-users-on-android-500m-on-google-photos/

Schachinger, K. (2018, December 06). A Complete Guide to the Google RankBrain Algorithm. Retrieved January 4, 2019, from https://www.searchenginejournal.com/google-algorithm-history/rankbrain/

Scott, T. (2018, December 06). Retrieved January 13, 2019, from https://www.youtube.com/watch?v=-JlxuQ7tPgQ

Wu, J., Zhang, C., Xue, T., Freeman, W. T., & Tenenbaum, J. B. (2016). Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling. Advances In Neural Information Processing Systems, 29. Retrieved November 11, 2018, from https://arxiv.org/abs/1610.07584

Yoganarasimhan, H. (2017). Search Personalization Using Machine Learning. SSRN Electronic Journal. doi:10.2139/ssrn.2590020


OVEREXPRESSION OF A HEAT SHOCK PROTEIN IN 

CYANOBACTERIA TO INCREASE GROWTH RATE 

Robert Landry 

Abstract 

To increase Earth’s capacity to support human population growth, methods of growing food more efficiently, especially

in warmer environments as climate change progresses, must be developed. This project sought to increase the growth 

rate of one population of photosynthetic organisms, cyanobacteria, through genetic engineering. Synechococcus elongatus 

UTEX 2973 cultures were transformed to overexpress dnaJ, a heat shock protein, in normal and heat-stressed conditions 

to determine the gene’s effects on growth rates. The growth rates of the dnaJ-overexpressing strain were compared with those of the control (wild-type Synechococcus elongatus UTEX 2973 transformed with a plasmid without dnaJ) using optical density measurements at 745 nanometers (OD745), which can accurately quantify growth rates. Under normal conditions, the change in OD745 in the dnaJ-overexpressing strain was significantly greater than that in the control. When the temperature was increased to 42°C, the dnaJ-overexpressing strain continued to grow, while the control strain’s OD745 measurements decreased. From these data, it appeared that the overexpression of a heat

shock protein in the genome of cyanobacteria significantly increased their growth rates and provided heat resistance. 

Researching the effects of overexpressing a heat shock protein could be furthered in organisms such as corn, rice, soybeans, 

and other photosynthetic species. 

1. Introduction 

Cyanobacteria, bacteria that conduct photosynthesis, 

have the potential to revolutionize both agricultural 

practices and the food industry, if higher yields of target 

materials are attained (Chow et al., 2015). Cyanobacteria, 

capable of utilizing 10% of the sun’s energy, are nearly 10 

times more efficient at fixing carbon found in CO2 than 

energy crops such as sugar cane or corn, which

harness only 1% of the sun’s energy (Hunt, 2003). This 

efficiency drives cyanobacteria into the energy industry’s 

spotlight as a possible, influential source of energy for 

humanity. Moreover, their increased photosynthetic rates 

decrease the amount of CO2 in the atmosphere, which 

benefits the global environment. Five other aspects of 

these photosynthetic bacteria that interest scientists are 

that they: grow in high densities, use water as an electron 

donor, utilize infertile land, require non-food-based 

feedstock, and thrive in many different water conditions 

(brackish, fresh, or saltwater) (Parmar et al., 2015). 

Despite these benefits, it is still expensive to culture and grow cyanobacteria and to use their products efficiently. For cyanobacteria to be widely used, target yields must increase sharply and costs must decrease so that they can compete with the simplicity and economic benefits of plants.

In addition to being more cost-effective at producing target materials, plants have been genetically modified with genes originating from cyanobacteria to increase efficiency. For example, carbon fixation rates in transgenic tobacco increased significantly after cyanobacterial Rubisco was introduced into the tobacco genome. This gain in photosynthetic efficiency, derived from the efficiency of cyanobacteria, serves as a precedent for future research (Occhialini et al., 2015).

This transgenic tobacco demonstrates the viability of 

increasing the efficiency of plant growth with cyanobacteria 

research. This pursuit is important because scientists of 

the Global Harvest Institute estimate that the world could 

face a food crisis by 2030 (Martin, 2017). Developing new 

methods of growing crops is paramount to mitigating this 

impending humanitarian need. 

In recent decades, knowledge regarding cyanobacteria 

has increased exponentially, stemming first from the 

genome-mapping of Synechocystis sp. 6803, one species 

of cyanobacteria. Now there are more than 128 different 

strains of cyanobacteria fully sequenced, which provides 

many opportunities in genetic engineering to study 

the properties of the bacteria. This developing field 

of genetic engineering allows researchers to utilize 

various transformation techniques in order to optimize 

photosynthetic rates within cyanobacteria, and ultimately 

in other organisms as well (Al-Haj et al., 2016). 

Synechococcus elongatus PCC7942 is one species of cyanobacteria that has had its entire genome sequenced and is therefore a candidate for many genetic engineering projects that study photosynthetic processes, regulation of nitrogen-containing compounds, and acclimation to stressed conditions (Home - Synechococcus elongatus PCC 7942). Synechococcus elongatus PCC7942, previously known as Anacystis nidulans R2, is a freshwater cyanobacterium and was the first cyanobacterium to be successfully transformed using exogenous DNA (Shestakov & Khyen, 1970). Synechococcus elongatus PCC7942 is an obligate photoautotroph, which means that it relies only on its photosynthetic ability to produce nutrients instead of being able to break down and use nutrients found in its environment (Minda et al., 2008). Because of this attribute, the photosynthetic efficiency of Synechococcus elongatus PCC7942 must be optimized for any condition, including stress, for it to outlast its natural competition. One way that Synechococcus elongatus PCC7942 has been shown to adapt to extreme heat and high light conditions is the induction of the dnaK and dnaJ genes (Hihara et al., 2001).

Table 1. The list of forward and reverse primers used for isolating dnaJ.

Species | Gene | Direction | Melting Temperature (°C) | Sequence (5’-3’) | Purpose
Synechococcus elongatus UTEX L 2973 | dnaJ | Forward | 69.2 | GAGAATTCATGGGTCGTCGCTGGA | Transformation
Synechococcus elongatus UTEX L 2973 | dnaJ | Reverse | 68.19 | GAGGATCCCTAGCATGCAAGCTCTCCTG | Transformation
Synechococcus elongatus UTEX L 2973 | dnaJ | Forward | 68.16 | ATGCAAAATTTTCGCGACTACTATGCC | RT-PCR
Synechococcus elongatus UTEX L 2973 | dnaJ | Reverse | 67.47 | TCAACGCGATTGTTCGAGCGAT | RT-PCR

The gene dnaK has three different homologues found 

in the genome of Synechococcus elongatus PCC7942, 

designated dnaK1, dnaK2, and dnaK3. The function of DnaK1 in Synechococcus elongatus PCC7942 is unknown, although it is found in the cytosol of the cyanobacteria. Both dnaK2 and dnaK3 are essential for the growth of Synechococcus elongatus PCC7942 (Watanabe, 2007). Similar to dnaK, dnaJ has four homologues within the Synechococcus elongatus PCC7942 genome, referred to as dnaJ1, dnaJ2, dnaJ3, and dnaJ4. DnaJ3 is located in the membrane of the cyanobacteria, and dnaJ2 is induced in extreme heat and high light conditions. Apart from these two homologues, most of the roles of dnaJ in the cell have not been discovered (Shestakov & Khyen, 1970). Because of budget constraints, Synechococcus elongatus UTEX L 2973 was used in this experiment as a substitute for Synechococcus elongatus PCC7942.

Within Synechococcus elongatus UTEX L 2973, there are 

10 homologues of dnaJ (Genome). Their respective roles 

within the cell beyond molecular chaperones are largely 

unknown, apart from dnaJ3, which is a known heat 

shock protein (Genome). The third homologue of dnaJ 

was isolated and overexpressed in a transformed strain of 

cyanobacteria in this research project. 

The goal of this project is to determine the effects of 

dnaJ on the photosynthetic rate of Synechococcus elongatus 

UTEX L 2973 and explore the correlation between the 

gene’s overexpression and growth rates in various heat

conditions. This research could lead to new advancements 

in industry and agriculture through the higher production 

rates of glucose and target materials. 

2. Methods 

2.1 – Culturing Synechococcus elongatus UTEX L 2973 

Synechococcus elongatus UTEX L 2973 was grown in BG-11 liquid medium at 30°C under 12-hour light cycles in a Percival incubator. Once the cyanobacteria showed initial growth in the medium, the culture was aliquoted into additional containers so that contamination of one container could not ruin the whole strain (Kufryk et al., 2002).

2.2 – DNA extraction, PCR and RT-PCR 

The DNA from Synechococcus elongatus UTEX L 2973 

was extracted using the QIAamp DNA Mini Kit and its 

corresponding protocol (QIAGEN). Using the primers listed in Table 1, dnaJ was amplified together with the restriction enzyme cut sites necessary for ligation. The PCR was run

according to the OneTaq Hot Start protocol (Biolabs). The 

extension phase lasted for 2 minutes and the annealing 

temperature was 61°C. 

2.3 – Cloning 

Using BamHI and EcoRI restriction enzymes, dnaJ 

was ligated into the plasmid pSyn_6 from a GeneArt 

Synechococcus Engineering Kit. 

2.4 – Transformation of E. coli 

A 5-alpha E. coli strain was transformed using the 

heat shock method to replicate the desired plasmid. Two 

different plasmids were used to transform the E. coli, one 

vector without dnaJ and one plasmid including dnaJ. After 

transformation, the E. coli grew in SOC medium, which 

was then spread on LB plates with spectinomycin at 50 



μg/mL concentration. After growing overnight, colonies 

were labeled and were inoculated into tubes corresponding 

to their label to grow overnight. 

2.5 – Transformation of Synechococcus elongatus UTEX L 2973 

The plasmid DNA from the E. coli was extracted using 

a Spin Miniprep Kit and its corresponding protocol. This 

plasmid DNA was then used to transform Synechococcus 

elongatus UTEX L 2973 following the protocol provided 

by the GeneArt Synechococcus Engineering Kit. This vector, 

pSyn_6, had not previously been used to transform this strain. 

2.6 – Statistical Analysis 

To analyze the OD745 data, error bars were calculated 

by multiplying the standard error of the mean by two. To 

test significance, a t-test calculator for the comparison of 

means was used to determine a p-value. One asterisk (*) 

represents significance at a p-value of < .05; two asterisks 

(**) represent significance at a p-value of < .01; three 

asterisks (***) represent significance at a p-value of 

< .005. 
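The error-bar and asterisk conventions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the replicate readings are invented, and the p-value would come from whichever t-test calculator was actually used.

```python
import math
from statistics import mean, stdev


def error_bar(readings):
    """Return (mean, 2 * SEM) for a list of replicate OD745 readings."""
    sem = stdev(readings) / math.sqrt(len(readings))  # standard error of the mean
    return mean(readings), 2 * sem


def significance_stars(p_value):
    """Map a p-value to the asterisk notation used in this paper."""
    if p_value < 0.005:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical replicate readings, for illustration only
avg, bar = error_bar([0.36, 0.40, 0.41])
print(f"mean OD745 = {avg:.3f} +/- {bar:.3f}")
print(significance_stars(0.007))
```

Because the thresholds are checked from strictest to loosest, each p-value receives only the highest level of significance it satisfies.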

Figure 1. dnaJ will be inserted in between EcoRI and BamHI 

3. Results 

3.1 – Cloning Strategy 

A cloning strategy was used (Fig. 1). DnaJ was isolated including the restriction enzyme cut sites necessary for ligation using the aforementioned primers (Table 1). The enzymes cut the target gene at the lines on either side of dnaJ (Fig. 1). The vector for GeneArt also had the same two restriction enzyme cut sites, BamHI and EcoRI, as the isolated gene. Utilizing DNA Ligase, the dnaJ was inserted into the plasmid in the 5’-3’ direction with a constitutively active promoter, PpsaA (NEB). 

The bands around 1.8kb surrounded by the red boxes show the successful isolation of dnaJ, a gene of length 1.8kb (Fig. 2a). 

Figure 2a. Successful PCR amplification of dnaJ. The band at 1.8kb is dnaJ. 

Figure 2b. The cutout portion of the gel isolated the vector that was used in gel extraction and ligation. 

Figure 2c. Cutouts from the gel isolated dnaJ. These bands were ligated into the plasmid after gel extraction. 

Figure 2d. Gel electrophoresis of restriction enzyme 

digested plasmid from transformed E. coli. The bands 

at 1.8kb and 4.5 kb in lane 7 demonstrate correct 

ligation and transformation of the E. coli colony. 

Plasmid from this colony was used to transform 

Synechococcus elongatus UTEX L 2973. 


Following the preliminary PCR, another PCR reaction 

was run, and its products were cut with restriction enzymes 

before being exposed to ultraviolet light. Ultraviolet is a 

known DNA mutagen and hence, exposing dnaJ to this 

light before transformation in the cyanobacteria could 

alter its natural use in the cell. The vector was also cut 

with the restriction enzymes, BamHI and EcoRI. These 

two cut products were run through gels to purify the 

digested DNA (Fig 2b & 2c). Once the products were 

cut out, gel extraction was run to purify the DNA from 

the gel, so that ligation could be run (QIAquick). After 

the ligated plasmid was formed using DNA Ligase and a 

ligation buffer, the E. coli strain 5-alpha from New England 

Biolabs was transformed using the heat-shock method (Fig. 

1). Volumes of 1, 3, and 6 μL of extracted DNA solution were added 

into separate vials of transformation-competent E. coli 

cells and were mixed gently. This mixture was put on ice 

for 30 minutes then heat-shocked at 42°C for 30 seconds 

without shaking. The transformed E. coli was put on ice for 

2 minutes. 250 μL of room temperature SOC medium was 

added to the vial of E. coli. This vial was incubated shaking 

horizontally at 55 rpm at 37°C for 1 hour. Following 

the incubation, the various tubes of transformed E. coli 

were plated on separate solid LB medium plates with 

spectinomycin at a concentration of 50 μg/mL. The plates 

were incubated overnight at 37°C. Because the plates 

contained spectinomycin and the plasmid confers resistance 

to spectinomycin, the colonies that grew on the plate 

overnight were expected to carry the plasmid. 

Once the colonies formed, 12 colonies were isolated 

and grown individually in 3.0 mL of LB medium with 

spectinomycin at a concentration of 50 μg/mL overnight. 

The plasmid was isolated from these vials of transformed 

E. coli using the Spin Miniprep Kit (QIAprep). The plasmid 

was digested by EcoRI and BamHI. Gel electrophoresis 

was conducted to determine whether or not the plasmid 

incorporated the target gene properly (Fig. 2d). Culture 

6 replicated the desired plasmid as seen by the bands at 

4.5kb and 1.8kb, so the remaining plasmid that was not 

run through the gel was used to transform Synechococcus 

elongatus UTEX L 2973. 

Figure 3a. Two colonies transformed with only 

vector in the presence of 10 μg/mL spectinomycin. 

Figure 3b. Six colonies overexpressing dnaJ in the 

selective presence of spectinomycin. 

3.2 – Transformation 

The cyanobacteria were transformed using the protocol 

corresponding to the GeneArt Synechococcus Engineering 

Kit. Following transformation, the cyanobacteria 

were plated on solid BG-11 media with 10 μg/mL 

spectinomycin under normal conditions. Colonies formed 

from both the cells transformed with the dnaJ plasmid and 

those transformed with only the pSyn_6 plasmid (Fig. 3a & 3b). All of 

these colonies were numbered and then inoculated into 

flasks containing liquid BG-11 media with 10 µg/mL 

spectinomycin. 



3.3 – Growth Assays 

In order to determine the effects of dnaJ’s overexpression on growth rates within cyanobacteria, optical density measurements were taken from different cultures at 745 nanometers (nm) at varying temperatures. This is an appropriate wavelength because optical density is meant to measure turbidity rather than pigment absorbance. The absorbance at the selected wavelength should be negligible so that the measurements strictly account for the scattering of light by the cells in the solution (Martin, 2014). 

There were 7 flasks of cyanobacteria before transformation with varying optical densities (Fig. 4a). To show how the color and darkness of the flasks correlate with OD745 values, the optical density measurements of these cultures were graphed (Fig. 4b). From left to right, the optical density values were .250, .133, .292, .119, .144, .022, and .014. The darkest flasks evidently have the highest optical density measurements. 

This growth assay measures the total increase in optical density over time. The greater the change in optical density, the higher the rate of growth of the culture. Thus, when testing the two different transformed strains of Synechococcus elongatus UTEX L 2973, the transformed strain overexpressing dnaJ should have the greater change in optical density if the heat shock protein encoded by dnaJ truly increases photosynthetic and growth rates. 

Because dnaJ codes for a heat shock protein, it was suspected that the growth rate of the strain overexpressing this gene would be significantly greater than that of the cyanobacteria transformed with only the vector, but only in heat stressed conditions. It was believed that the growth rate in normal conditions would not be affected greatly by the heat shock protein because the overexpression would not be necessary to withstand high temperatures. However, once the growth rates were tested, it appeared as if the overexpression resulted in increased rates in both conditions (Fig. 5a & 5b). 

Figure 4a. Flasks with varying optical densities. The three flasks on the left were cultured for the longest time and thus had the highest optical densities. The flasks on the right had grown more recently and were not as dark. 

Figure 4b. Optical density values of the corresponding flasks. Higher optical densities correspond to darker flasks. 

Figure 5a. Graph of optical density (745nm) of the control strain and the dnaJ overexpressing strain at 30° Celsius. The data suggest dnaJ significantly increases growth rates. 

Figure 5b. The colonies were grown at 40°C for 9 days. There was a trend in the data that indicates dnaJ promotes faster growth, but after the colony was exposed to a higher temperature, 42°C, the control strain did not grow whereas the dnaJ strain continued normal growth. Data became significant two days after increased temperature. 

The overexpression of dnaJ in normal conditions of 

30°C and 12-hour light cycles increased the growth rate 

of the cyanobacteria significantly in comparison to the 

control. This significance was seen as early as day 8. The 

average OD745 of the dnaJ strain after 12 days was .39 in 

comparison to the vector strain that had an average OD745 

of .25. This increase in optical density is attributed to the 

overexpression of dnaJ. 

DnaJ’s overexpression within cyanobacteria in heat 

stressed conditions of 40°C and 12-hour light cycles also 

tended to increase growth rates. After 9 days, the average 


OD745 of the overexpressing strain was .58 while the other 

strain had an average of only .52; however, the standard 

deviation within each of the sample groups was too high 

to conclude significance at 40°. When the temperature 

in the Percival Incubator was increased to 42°, the dnaJ 

overexpressing strain grew normally, whereas the control 

strain’s average optical density decreased. The average 

optical density of the dnaJ overexpressing strain after 

19 days was 1.35, and the control strain had an average 

optical density of .35. After just two days being exposed 

to the higher temperature, the difference between the two 

strains was significant, suggesting that the overexpression 

of dnaJ provided heat resistance to the transformed 

strain of cyanobacteria. There is a visual difference in 

optical density at day 19 in comparison to day 5, which 

demonstrates dnaJ’s potential to increase growth rates and 

provide heat resistance (Fig. 6). 
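As a rough way to compare the three conditions, the reported mean OD745 values can be turned into fold differences between the dnaJ strain and the vector control. The short Python sketch below uses only the means quoted in the text. The function name and dictionary labels are illustrative, and a ratio of means says nothing about significance on its own.

```python
def fold_difference(dnaj_od, control_od):
    """Ratio of the dnaJ strain's mean OD745 to the control's mean OD745."""
    return dnaj_od / control_od


# Mean OD745 values as reported in the text: (dnaJ strain, vector control)
reported_means = {
    "30 C, day 12": (0.39, 0.25),
    "40 C, day 9": (0.58, 0.52),
    "after shift to 42 C, day 19": (1.35, 0.35),
}

for condition, (dnaj_od, control_od) in reported_means.items():
    ratio = fold_difference(dnaj_od, control_od)
    print(f"{condition}: dnaJ/control = {ratio:.2f}")
```

The ratio is smallest at 40°C, where the difference was not significant, and largest after the shift to 42°C, consistent with the heat-resistance interpretation given above.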

Figure 6. The photo of the flasks in the top panel was 

taken on day 5. The flasks have similar tints of green. 

The last photo was taken on day 19. In the eight flasks 

on the left, the dnaJ overexpressing cultures have a 

much darker color than the control flasks. 

4. Discussion 

Essentially, this research sought to create a unique 

strain of cyanobacteria through genetic transformation. 

The specific plasmid utilized in the experimentation 

had not been used to transform Synechococcus elongatus 

UTEX L 2973 previously. The successful transformation 

as seen from the colony growth in the selective presence 

of spectinomycin demonstrates the competence of the 

plasmid pSyn_6 in transforming the experiment’s specific 

strain. 

Despite the successful outcome of the research, 

there were several limitations in the experiment due to 

equipment and budget restrictions. One such limitation 

was the inability to determine the difference between 

the rates of oxygen evolution in the two strains. This 

would have led to a more precise measurement of the 

photosynthetic rate because oxygen is directly produced in 

photosynthesis. Optical density is a less direct measurement 

of this rate but remains a reasonable proxy. Without the 

generation of sugars through photosynthesis, the strains 


could not grow. Because of this, higher photosynthetic 

rates should correspond to higher growth rates. Another 

limitation of this experiment was the inability to confirm 

gene expression in the transformed strains. However, 

confirming the correct plasmid makes it reasonable to 

assume that the growth rates increased on account of dnaJ 

overexpression. 

This beneficial genetic overexpression has many 

potential applications in both the agriculture and energy 

industries. Because cyanobacteria are currently the most 

photosynthetically efficient organisms on the planet, 

this modification could lead to future applications in 

agriculture or more economic biofuel production that will 

capitalize on their efficiency (Hunt, 2003). One possible 

application could be the production and secretion of 

sugars for consumption. Because cyanobacteria are not 

seasonal like sugar cane, they could produce sugars more 

consistently and more efficiently, especially following 

genetic engineering. Clearly, isolation of sugar from a 

cyanobacteria solution would have to be much cheaper 

for this to be a viable contender with sugar cane, but 

nonetheless, this could be a potential application of 

genetically engineered cyanobacteria. Beyond sugars, 

cyanobacteria’s products have been manipulated to 

produce ethanol (Chow et al., 2015). Producing ethanol 

could prove to be a disruptive application of cyanobacteria 

in the energy industry, especially when paired with dnaJ 

overexpression. 

Another possible application could be overexpressing 

heat shock proteins in other photosynthetic organisms to 

determine their effect on growth and photosynthetic rates. 

Overexpressing either dnaJ or corresponding proteins specific 

to certain species within corn, rice, or soybeans could 

lead to increased production of these crops both in fertile 

geographies and in regions that are currently considered 

arid. Because heat shock proteins increased growth in 

Synechococcus elongatus UTEX L 2973 even in heat stressed 

conditions, it could be possible to genetically engineer 

cash crops to make them resistant to higher temperatures. 

This resistance could lead to the cultivation of previously 

infertile land, feeding millions more people worldwide. 

Further experimentation must be done to conclude the 

viability of any of these applications. 

5. Acknowledgements 

I would like to thank Dr. Monahan for teaching 

me the research process and guiding me through the 

fickle experimentation that is molecular biology. Thanks 

to her instruction and her patience with my stubborn 

commitment to this project, I was able to persevere 

through obstacles and accomplish my dream of genetically 

engineering cyanobacteria. I would like to thank Dr. 

Sheck for supervising me while I spent hours in the sterile 


hood working with my cyanobacteria. I would also like to 

thank the rest of my Research in Biology colleagues for 

encouraging me throughout my time researching. I would 

like to thank Kevin Zhang and Tyler Edwards who were 

lab assistants during the Glaxo Summer Research Fellows 

Program. Finally, I want to thank the North Carolina School 

of Science and Mathematics and the Glaxo Endowment for 

blessing me with the opportunity to experience research in 

high school. I have learned many lessons that I will carry 

with me through the rest of my career in both research and 

other fields. 

6. References 

Al-Haj, L., Lui, Y. T., Abed, R. M. M., Gomaa, M. A. 

& Purton, S. Cyanobacteria as Chassis for Industrial 

Biotechnology: Progress and Prospects. Life (Basel) 6, 

(2016). 

Algae, U. C. C. of. UTEX L 2973 Synechococcus elongatus. 

UTEX Culture Collection of Algae. Available at: 

https://utex.org/products/utex-l-2973. (Accessed: 25th January 

2018) 

Biolabs, N. E. Protocol for OneTaq Hot Start DNA 

Polymerase (M0481). New England Biolabs: Reagents 

for the Life Sciences Industry. Available at: 

https://www.neb.com/protocols/2012/09/05/one-taq-hot-start-dnapolymerase-m0481. 

(Accessed: 27th October 2018) 

Biolabs, N. E. Taq 2X Master Mix. New England Biolabs: 

Reagents for the Life Sciences Industry Available at: 

https://www.neb.com/products/m0270-taq-2x-mastermix#Product 

Information. (Accessed: 7th October 2018) 

Chow, T.J. et al. Using recombinant cyanobacterium 

(Synechococcus elongatus) with increased carbohydrate 

productivity as feedstock for bioethanol production via 

separate hydrolysis and fermentation process. Bioresource 

Technology 184, 33–41 (2015). 

GeneArt Synechococcus Protein Expression Vector. 

Thermo Fisher Scientific. Available at: 

https://www.thermofisher.com/order/catalog/product/A24230. 

(Accessed: 7th October 2018) 

Hihara, Y., Kamei, A., Kanehisa, M., Kaplan, A. & Ikeuchi, 

M. DNA Microarray Analysis of Cyanobacterial Gene 

Expression during Acclimation to High Light. Plant Cell 

13, 793–806 (2001). 

Home - Synechococcus elongatus PCC 7942. Available 

at: https://genome.jgi.doe.gov/portal/synel/synel.home.html. 

(Accessed: 21st January 2018) 

Hunt, S. Measurements of photosynthesis and respiration 

in plants. Physiol Plant 117, 314–325 (2003). 

Kufryk, G. I., Sachet, M., Schmetterer, G. & Vermaas, W. 

F. J. Transformation of the cyanobacterium Synechocystis 

sp. PCC 6803 as a tool for genetic mapping: optimization 

of efficiency. FEMS Microbiology Letters 206, 215–219 

(2002). 

Martin, A., Researchgate. Available at: 

https://www.researchgate.net/post/When_measuring_cyanobacterial_growth_when_do_I_use_which_wavelength. 

Martin, S. World will run out of food by 2050 thanks 

to population boom. Express.co.uk (2017). Available 

at: https://www.express.co.uk/news/science/803791/World-will-run-out-of-food-by-2050-population-boom. 

Minda, Renu, et al. “The Evolutionary Significance of 

‘Obligate’ Photoautotrophy of Cyanobacteria.” Current 

Science, vol. 94, no. 7, 10 April 2008, pp. 850-852. 

Occhialini, A., Lin, M. T., Andralojc, P. J., Hanson, M. R. 

& Parry, M. A. J. Transgenic tobacco plants with improved 

cyanobacterial Rubisco expression but no extra assembly 

factors grow at near wild-type rates if provided with 

elevated CO2. The Plant Journal 85, 148–160 (2015). 

Parmar, A., Singh, N. K., Pandey, A., Gnansounou, E. & 

Madamwar, D. Cyanobacteria and microalgae: A positive 

prospect for biofuels. Bioresource Technology 102, 10163– 

10172 (2011). 

QIAGEN. Quick-Start Protocol: QIAamp DNA Mini 

Kit. Confidence in Your PCR Results - The Certainty of 

Internal Controls - QIAGEN. Available at: 

https://www.qiagen.com/us/resources/resourcedetail?id=566f1cb1-4ffe-4225-a6de-6bd3261dc920&lang=en. 

QIAprep Spin Miniprep Kit. Confidence in Your PCR 

Results - The Certainty of Internal Controls - QIAGEN 

Available at: https://www.qiagen.com/us/shop/sampletechnologies/dna/plasmid-dna/qiaprep-spin-miniprepkit/#orderinginformation. 

QIAquick Gel Extraction Kit. Confidence in Your PCR 

Results - The Certainty of Internal Controls - QIAGEN 

Available at: https://www.qiagen.com/us/shop/sampletechnologies/dna/dna-clean-up/qiaquick-gel-extractionkit/#orderinginformation. 

Restriction Endonuclease Products | NEB. Available 

at: https://www.neb.com/products/restrictionendonucleases. 

(Accessed: 2nd February 2018) 


Shestakov, S. V. & Khyen, N. T. Evidence for genetic 

transformation in blue-green alga Anacystis nidulans. 

Molec. Gen. Genet. 107, 372–375 (1970). 

Synechococcus sp. UTEX 2973, complete genome. (2015). 

Watanabe, S., Sato, M., Nimura-Matsune, K., Chibazakura, 

T. & Yoshikawa, H. Protection of psbAII transcript from 

ribonuclease degradation in vitro by DnaK2 and DnaJ2 

chaperones of the cyanobacterium Synechococcus elongatus 

PCC 7942. Biosci. Biotechnol. Biochem. 71, 279–282 

(2007). 

BIOLOGY 


HYPOGLYCEMIC EFFECT OF Momordica charantia 

AGAINST TYPE 2 DIABETES MODELED IN Bombyx mori 

Aarushi Venkatakrishnan 

Abstract 

Diabetes is a disease that affects millions across the world, occurring when there are high levels of glucose in the blood. 

Currently, treatments for Type 2 Diabetes include lifestyle and diet changes, medication, and insulin injections; however, 

natural treatments, such as the vegetable bitter melon, have become more popular in recent years. Because bitter melon is abundantly 

grown in Asia, which houses 60% of the world’s diabetics, such a treatment could be widely accessible. Using a silkworm model, the 

hypoglycemic effect of bitter melon was quantified by measuring the silkworm’s hemolymph glucose concentration with 

the phenol sulfuric acid method. Injections of saline, insulin, and bitter melon solutions were made at the first proleg of 

the silkworms. Hyperglycemia was induced after two days of a 10% high glucose diet, and human insulin significantly 

counteracted the effect. There were no changes to mass or length between the hyperglycemic and normal silkworms. 

After comparing its hypoglycemic effect to insulin, a known hypoglycemic agent, the most effective tested dose of bitter 

melon was found to be 175 µg/mL, 5 times greater than the corresponding insulin dose, 35 µg/mL. With further trials to 

determine the symptoms and overall effects to human health, bitter melon can potentially be recommended as an addition 

to the diet for diabetes treatment. 

1. Background 

1.1 Introduction 

Diabetes mellitus is a disease which is characterized by 

high levels of sugar in the blood, hyperglycemia, resulting 

from the body being unable to use blood glucose for energy 

(Drive, n.d.). Typical symptoms include increased thirst, 

unexplained weight loss, and frequent infections (Drive, 

n.d.). This disease occurs when the body is unable to 

effectively use insulin, a hormone made by the pancreas, 

to process glucose, thus causing an increase in blood 

sugar (“Insulin, Medicines, & Other Diabetes Treatments,” 

2016). There are two types ranging in severity: Type 1 and 

Type 2. Type 1 diabetes is an autoimmune disease in which 

the immune system destroys islet cells resulting in the body 

being unable to make insulin (Jin Yang & Mook Choi, 

2015). Type 2 diabetes is a chronic condition that changes 

how the body is able to metabolize glucose, caused by the 

pancreas either not producing enough insulin or the body 

becoming resistant to insulin (Matsumoto et al., 2011). 

The current treatment for Type 1 is administering insulin 

exogenously with numerous insulin treatments available, 

such as an insulin pump, pen, or an inhaler. These vary in 

terms of how fast they act, the quickest being 15 minutes 

after injection and the longest being several hours, but 

correspondingly the duration of the effect differs (“Insulin, 

Medicines, & Other Diabetes Treatments,” 2016). Type 2 

diabetes treatment includes maintaining a healthy lifestyle 

and monitoring blood glucose levels (Matsumoto et al., 

2011). However, Type 2 diabetic patients can also take 

insulin treatments to make up for that not produced in the 

body; metformin is a commonly prescribed medicine first 

given to diabetic patients to lower the amount of glucose 

produced by the liver and help the body process insulin 

better (“Insulin, Medicines, & Other Diabetes Treatments,” 

2016). 

The number of people being affected by this condition 

has been increasing rapidly. In 2015, 1.5 million new cases 

were diagnosed, and in the United States alone, there were 

30.2 million Americans with some form of diabetes in 2017 

(“CDC Press Releases,” 2016). With this rise, the demand 

for diabetes treatments has increased. The development 

of new ways to introduce or stimulate insulin secretion 

is necessary as it has the potential to help the millions of 

people afflicted with diabetes live healthier lives. 

Although there are a variety of drugs in the market 

for Type 2 Diabetes, the desire for herbal medicines 

has increased as they can often be more accessible than 

traditional forms. In addition, these natural remedies are 

often more available and accepted than Western Medicine. 

There has always been a sector of herbal medicines called 

Complementary and Alternative Medicine (CAM); one 

of the oldest and most-well known practices is Ayurvedic 

medicine, which originates in India. Many herbs, fruits, 

and vegetables used in Ayurvedic medicine have shown 

promising results against diseases involving high blood 

pressure, anxiety, cancer, and more (Axe, 2015). One 

notable vegetable used is Momordica charantia, otherwise 

known as bitter melon. Numerous studies have been 

conducted that suggest bitter melon has hypoglycemic 

effects (Fuangchan, 2011; Jin Yang & Mook Choi, 2015). 

Jin Yang et al. treated diabetic rats with bitter melon 

and found three functional components of bitter melon 

that were likely causing a hypoglycemic effect: charantin, 

vicine, and polypeptide-p (2015). Across three groups (a 

high fat control, high fat with 1% bitter melon, and high 

fat with 3% bitter melon), bitter melon significantly 

improved glucose tolerance and insulin sensitivity. In the 

3% bitter melon group, they found increased levels 

of two insulin signaling proteins, phospho-insulin receptor 

substrate-1 (Tyr612) and phospho-Akt (Ser473), likely 

stimulating the hypoglycemic effect (Jin Yang & Mook 

Choi, 2015). 

Furthermore, bitter melon has been used in clinical 

studies using human patients with Type 2 Diabetes. 

Fuangchan et al. investigated the effects of varying doses 

of bitter melon (500 mg/day, 1000 mg/day, 2000 mg/day) 

when comparing it to metformin (1000 mg/day), a current 

diabetes medication (2011). By measuring the fructosamine 

concentrations over a 4-week time period from baseline to 

endpoint, they found that the 500 mg/day and 1000 mg/ 

day doses did not significantly impact glucose levels, while 

the 2000 mg/day dose did. When compared to metformin, 

the effects of bitter melon were still less (Fuangchan, 2011). 

Neither group experienced extreme adverse effects; 

only mild headaches, dizziness, and increased hunger 

were experienced in the 2000 mg/day bitter melon group 

(Fuangchan, 2011). The drawbacks of this study include 

the limited time as it was only conducted for 4 weeks, and 

because effects were only seen for the 2000 mg/day dose, 

higher dose levels would need to be tested. 

1.2 – Silkworm Model 

Although bitter melon is known to have hypoglycemic 

effects, research surrounding the topic is not standardized 

and is 

and hard to compare. With very minimal clinical trials, 

the side effects of bitter melon are difficult to determine 

as well. Matsumoto et al. established the 

silkworm as a reliable model of diabetes (Matsumoto et al., 

2011). While silkworms do not have blood like humans, 

they have hemolymph which is a fluid “equivalent” to 

blood. In this study, glucose levels after treatment with a 

high glucose diet were higher than those of the silkworms 

fed a normal diet. By treating the hyperglycemic silkworms 

with insulin, glucose levels returned to normal. Moreover, 

they also tested an herbal extract, jiou, and found that it 

could mimic the effects of insulin by reducing glucose 

concentrations. 

Based on this research, the silkworm model could be 

used to determine the hypoglycemic effects of bitter melon. 

Here, hyperglycemic silkworms were treated 

with bitter melon extract to test whether their hemolymph 

sugar levels decrease without adverse reactions in 

terms of body size, body mass, or lifespan. A hypoglycemic 

effect is likely because bitter melon contains the identified 

components charantin, vicine, and polypeptide-p and has long 

been used as a remedy in Ayurvedic medicine. 


Figure 1. Anatomy of a silkworm. Length 

measurements were made from the thorax to the 

caudal leg. Injections were made in between the first 

proleg and the second proleg from the head capsule. 

2. Methods 

This study consisted of two preliminary experiments and 

two main experiments. The two preliminary experiments 

determined the equation for the Beer’s Law Plot to be 

used when calculating the D-glucose concentration of 

silkworms and established that a high glucose diet raised 

the D-glucose levels of silkworms. The main experiments 

tested the effect of insulin when compared to the same dose 

of bitter melon and evaluated the optimal concentration 

of bitter melon. The independent variable was the presence 

and kind of hypoglycemic agent, and the measured response was the change in 

average D-glucose levels. For the main experiments, the 

positive control was the hyperglycemic silkworm treated 

with insulin, a known hypoglycemic agent. The negative 

control was the silkworm fed a normal diet and injected 

with saline, to mimic the effect of an injection without the 

addition of a chemical agent. 

2.1 – Silkworm Diet 

The essentials of the silkworm diet consist of mulberry 

leaves. To create the silkworm diet, Carolina® Silkworm 

Diet was purchased from Carolina Biological. In a 2000 

mL glass beaker, ½ pound of the mulberry powdered diet 

was added to 720 mL (roughly 3 cups) of tap water. Using 

a stirring rod, the substances were thoroughly mixed to a 

uniform consistency. It was then covered with plastic wrap 

and secured with a rubber band. The beaker was placed in 

the microwave at high heat until the mixture came to a 

boil, usually after 1-2 minutes. This caused the mixture to 

rise and bubbles appeared on the surface. Once the mixture 

boiled, the beaker was removed from the microwave and 

stirred to again ensure uniform consistency. It was then 

placed back in the microwave to repeat the process. After 

the second boil and mixture, plastic wrap was tightly placed 

against the surface of the mixture to ensure no moisture 

escaped. Time was allowed for the beaker and substances 

to cool down. Then, the top of the beaker was secured 

with plastic wrap and a rubber band. It was placed in the 


refrigerator to store. 

2.2 – Silkworm Maintenance 

Silkworm eggs were purchased from Carolina 

Biological. They were placed in petri dishes and incubated 

at 29 °C. After roughly a week, the eggs hatched, and the 

empty eggshells were discarded. The larvae were transferred to a fresh plate 

with mulberry powdered diet placed on a paper towel. 

Fresh food was provided every other day, at which time feces and 

dried food were removed. The experiment was performed during 

the fifth instar (around 4 weeks after hatching). Raising 

the temperature increased growth, whereas lowering the 

temperature delayed growth. 

2.3 – High Glucose Diet 

To induce hyperglycemic conditions, a high glucose diet 

of the Mulberry Chow was created by mixing appropriate 

amounts of D-Glucose and the Mulberry Chow. D-Glucose 

was added to the Mulberry Chow in a beaker, and then 

mixed until all contents were dissolved. Diets containing 10% 

and 15% D-glucose were created. 

2.4 – Injection 

50 µl of each solution was injected into the hemolymph 

at the second abdominal segment of the larva after the 

first proleg using 1 mL syringes (Fig. 2). Injections were 

done on 12-hour cycles for 2 days after 2 days of the high 

glucose diet for the preliminary experiments. Injections 

were performed once 24 hours before extraction for the 

main experiments. The total treatment lasted 4 days with 

measurements taken on Day 5. 

was made with 0.9% NaCl and 0.1% acetic acid. A stock 

solution of bitter melon was created by combining 0.50 

grams of powdered bitter melon with 50 mL of distilled 

water to make a 0.01 g/mL solution. It was heated at 40 

°C for 15 minutes and left to stand for 2 days. Then, 

vacuum filtration was performed 3 times to filter out 

particulates. It was then appropriately diluted to the 

varying concentrations used in the experiment. 
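The dilutions described here follow the relation C1 × V1 = C2 × V2. The sketch below works through the insulin case using the values given in the text (20 mg/mL stock, 35 µg/mL target, 15 mL final volume). It assumes ideal mixing, and the function name `stock_volume_ml` is hypothetical, introduced only for illustration.

```python
def stock_volume_ml(stock_conc, target_conc, final_volume_ml):
    """Volume of stock needed so that C1 * V1 = C2 * V2.

    Both concentrations must be given in the same unit (here µg/mL).
    """
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    return target_conc * final_volume_ml / stock_conc


# Insulin: 20 mg/mL stock = 20,000 µg/mL, diluted to 35 µg/mL in 15 mL.
insulin_stock_ml = stock_volume_ml(20_000, 35, 15)
print(f"{insulin_stock_ml * 1000:.2f} µL of stock, brought to 15 mL")
```

The same function could be reused for each bitter melon concentration once the stock concentration is expressed in µg/mL.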

2.6 – Glucose Quantification 

Hemolymph was collected from the larvae through a cut 

on the first proleg after they had developed to the fifth instar. 

Precipitated proteins were removed by centrifugation at 

3000 rpm for 10 min. 175 µl of the supernatant was diluted 

with 175 µl distilled water for sugar quantification (350 µl 

total). The total sugar in the hemolymph was determined 

using the 0.05 % phenol-sulfuric acid (PSA) method. 

Hemolymph extract (350 µl) was mixed vigorously with 

1050 µl 70% sulfuric acid. Immediately, 210 µl of 5% phenol 

aqueous solution were added and mixed. The test tubes 

were held in a water bath at 90 °C. The samples were then 

cooled to room temperature. The absorbance at 490 nm 

was measured using a spectrophotometer. Serially diluted 

glucose solution was used as a standard. 

2.7 – Statistical Measurements 

Measurements of the mass (g) and length (cm) of 

the silkworms were taken prior to experimentation. 

They were then monitored throughout the trial days 

and recorded until extraction. Data were analyzed using 

unpaired Student's t-tests with unequal variance (Welch's t-test), as sample 

sizes differed across the trials. Error bars were calculated 

using standard error of the mean (SEM). 
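The two statistics named above can be sketched with the Python standard library alone. The example computes the SEM and the unequal-variance (Welch) t statistic with its Welch-Satterthwaite degrees of freedom. The sample readings are hypothetical and serve only to exercise the functions; converting t to a p-value additionally requires the t distribution, which is not shown here.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    var_a = stdev(a) ** 2 / len(a)  # squared standard error of group a
    var_b = stdev(b) ** 2 / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical glucose readings (mg/mL), for illustration only.
high_saline = [60.1, 65.3, 58.7, 66.0, 63.2]
high_insulin = [31.0, 35.4, 33.9, 30.2, 36.1, 32.8]
t, df = welch_t(high_saline, high_insulin)
```

Where SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns the p-value.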

2.8 – Comparing the Effects of Insulin and Bitter Melon 

Figure 2. Injection site after the first proleg. 

Hemolymph was extracted from this same site as 

well. 

To determine whether bitter melon has hypoglycemic 

effects, a silkworm model was used as it has been previously 

identified as a workable model for diabetes research 

(Fuangchan, 2011). Three characteristics were measured to 

determine the effectiveness of the bitter melon treatment: 

body mass, body length, and hemolymph sugar levels. Each 

trial consisted of 6 treatments, separated into high-glucose 

and normal diet models. In addition to these treatments, 

the effect of insulin on both the hyperglycemic and normal 

silkworm was used to provide a standard of comparison to 

see the success of the bitter melon extract (Table 1.1). 

2.5 – Injection Solutions 

A 35 µg/mL solution of insulin was created by diluting 

a 20 mg/mL solution to 15 mL. 


Table 1.1. Experimental design comparing the effect of an equal dose of bitter melon 

Treatment       Normal Diet                        High Glucose Diet
Saline          “Normal” with Saline (NS)          “High” with Saline (HS)
Insulin         “Normal” with Insulin (NI)         “High” with Insulin (HI)
Bitter Melon    “Normal” with Bitter Melon (NB)    “High” with Bitter Melon (HB)


2.9 – Determining the Ideal Concentration of Bitter Melon 

Following the experimental model used for the injections, 

varying concentrations of bitter melon were tested in the 

silkworms to determine which concentration of bitter 

melon has the largest effect on the sugar concentration in 

silkworms (Table 1.2). 

Table 1.2. Experimental design comparing the effect of varying doses of bitter melon 

Treatment         Normal Diet                  High Glucose Diet
Saline            “Normal” with Saline (NS)    “High” with Saline (HS)
Insulin           -                            “High” with Insulin (HI)
Bitter Melon 1    -                            “High” with Bitter Melon 1 (HB1)
Bitter Melon 2    -                            “High” with Bitter Melon 2 (HB2)
Bitter Melon 3    -                            “High” with Bitter Melon 3 (HB3)

Figure 3. (A, B) Glucose standards after the phenol aqueous 

protocol, depicted visually with a 

gradient of yellow colors, shown from left to right as 

(1) Blank, (2) 1.00 M D-Glucose, (3) 0.055 M D-Glucose, 

(4) 0.0275 M D-Glucose 

3. Results 

3.1 – Glucose Quantification 

To understand and best utilize the phenol aqueous 
method, a series of D-glucose standards was used to 
create a Beer’s Law plot (Fig. 3A,B). Different D-Glucose 
concentrations were used to generate a standard. These 
included 0.0275 M, 0.055 M, and 1.000 M. 

3.2 – High Glucose Diet 

Before proceeding with the experiment, a baseline for a 
high glucose diet needed to be established. In this preliminary 
experiment, three treatments were tested: a normal diet 
of mulberry chow, a 10% glucose diet after 24 hours, and 
the same diet after 48 hours. A sample size of 9 silkworms 
was used for the normal diet. Five silkworms were used for 
each of the 10% glucose diet treatments. Average glucose 
levels in the hemolymph of the silkworms are shown 
(Fig. 4). 

BIOLOGY 



3.3 – Comparing the Effects of Insulin and Bitter Melon 

After establishing that a hyperglycemic diet could 

induce high glucose concentrations in silkworms, the 

effect of insulin was tested in silkworms on a normal or a high glucose 

diet. The same concentration was then used for the bitter melon 

extract so that both treatments were given at an equal dose. Based on the 

results of the previous experiments, silkworms were fed a 

high glucose diet for 2 days to induce hyperglycemia (Fig. 

4). On Day 3 of the diet, injections were performed with 

the various treatments. Then, on Day 4, their hemolymph 

was extracted and its glucose concentration quantified. 


Figure 4. (A) Average D-glucose concentration. (B) 

Average mass. (C) Average length. Three different 

treatment groups were measured: Normal (mulberry 

diet); High Day 1 (mulberry + 10% glucose for 24 

hours); High Day 2 (mulberry + 10% glucose for 48 

hours). (p < 0.05= *, p < 0.01= **, p < 0.001 = ***, p > 0.05= 

ns). Error bars ± SEM. 

The average glucose concentration of a normal 

silkworm is about 30.0 mg/mL (Fig. 4A). With the addition 

of a high glucose diet, the average glucose concentration is 

52.4 mg/mL after one day and 79.1 mg/mL after two days. 

This means that after 48 hours, there is roughly a 2.6-fold 

increase. P-values were calculated with t-tests for further 

statistical analysis. The only statistically 

significant difference is between the normal diet and two 

days of the high glucose diet, indicating that 48 hours on 

a diet of at least 10% D-glucose is required to increase 

hemolymph glucose levels significantly. There was no 

statistically significant difference between the mass and 

length of the normal silkworms and the two tested trials 

(Fig. 4B and 4C). 

Figure 5. Average glucose concentrations across 

various treatments. Six different treatment groups 

were measured: Normal with Saline (mulberry diet 

+ insect saline, n=8), Normal with Insulin (mulberry 

diet + 35 µg/mL insulin, n=7), Normal with Bitter 

Melon (mulberry diet + 35 µg/mL bitter melon 

extract, n=9), High with Saline (mulberry diet + 15% 

glucose for 84 hours + insect saline, n=9), High with 

Insulin (mulberry diet + 15% glucose for 84 hours + 

35 µg/mL insulin, n=10), and High with Bitter Melon 

(mulberry diet + 15% glucose for 84 hours + 35 µg/mL 

bitter melon extract, n=9). (p < 0.05= *, p < 0.01= **, p < 

0.001 = ***, p > 0.05 = ns). Error bars ± SEM. 

The average glucose concentrations for the 

treatments are as follows (Fig. 5): 40.595 mg/mL (Normal 

diet with Saline), 26.929 mg/mL (Normal diet with 35 

µg/mL Insulin), 34.366 mg/mL (Normal diet with 35 µg/ 

mL Bitter Melon), 62.905 mg/mL (15% High glucose diet 

with Saline), 33.287 mg/mL (15% High glucose diet with 

Insulin), and 50.579 mg/mL (15% High glucose diet with 

bitter melon). As can be seen from the p-values, there was 

a statistically significant difference between the normal 

with saline and high with saline, and one between the 

high with saline and high with insulin. This corresponds 

to the predicted control values. There was no statistically 

significant difference between the high with saline and 

high with bitter melon. 


3.4 – Revised Standard Curve 

Using a different series of D-glucose solutions and a 

new spectrophotometer, a new standard curve was created 

(Fig. 6). 


Figure 6. New Glucose Standard Curve. Using 

concentrations of 12.5 mg/mL and 25 mg/mL, the 

standard curve for glucose was created. 

The linear fit was used to evaluate the following 

experiment in terms of finding the ideal dose of bitter 

melon for hyperglycemic silkworms. 

3.5 – Determining the Ideal Concentration of Bitter Melon 

At 35 µg/mL, bitter melon did not produce a statistically 

significant effect, but it did lower glucose relative to the 

saline-treated hyperglycemic silkworms (Fig. 5). By adjusting the 

concentration of bitter melon, the downward trend could 

be quantified further. 

The average values for each treatment were: 26.046 

mg/mL (Normal with Saline), 37.111 mg/mL (15% High 

glucose diet with Saline), 25.043 mg/mL (15% High glucose 

diet with 35 µg/mL Insulin), 24.029 mg/mL (15% High 

glucose diet with 35 µg/mL Bitter melon), 33.459 mg/mL 

(15% High glucose diet with 87.5 µg/mL Bitter melon), 

20.006 mg/mL (15% High glucose diet with 175 µg/mL 

Bitter melon), and 18.265 mg/mL (15% High glucose diet 

with 350 µg/mL bitter melon) (Fig. 7A). 

Insulin significantly reduced the glucose concentrations 

of the high glucose diet to the levels of the normal diet. Two 

bitter melon treatments yielded statistically significant 

results, with the 175 µg/mL bitter melon extract having a 

p-value closest to that of insulin (p = 0.00303 and 

p = 0.002954, respectively). 
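To make the dose comparison easier to follow, the reported means can be expressed as a percent reduction relative to the High with Saline group. The sketch below uses only the mean values listed above for Fig. 7A; the helper name `percent_reduction` is hypothetical.

```python
# Mean hemolymph glucose (mg/mL) per treatment, as reported for Fig. 7A.
high_saline = 37.111
treated = {
    "insulin 35 µg/mL": 25.043,
    "bitter melon 35 µg/mL": 24.029,
    "bitter melon 87.5 µg/mL": 33.459,
    "bitter melon 175 µg/mL": 20.006,
    "bitter melon 350 µg/mL": 18.265,
}


def percent_reduction(control, value):
    """Percent decrease of value relative to the control mean."""
    return 100 * (control - value) / control


reductions = {
    name: percent_reduction(high_saline, value) for name, value in treated.items()
}
for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower than High with Saline")
```

On these means, the 87.5 µg/mL group shows a much smaller reduction than the doses on either side of it, which is why the dose response does not appear monotonic.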

Figure 7. (A) Determining a Statistically Significant 

Bitter Melon dose. Seven different treatment groups 

were measured: Normal with Saline (mulberry diet 

+ insect saline, n=8), High with Saline (mulberry diet 

+ 15% glucose for 84 hours + insect saline, n=9), High 

with Insulin (mulberry diet + 15% glucose for 84 hours 

+ 35 µg/mL insulin, n=7), High with 35 µg/mL Bitter 

Melon (mulberry diet + 15% glucose for 84 hours + 

35 µg/mL bitter melon extract, n=5), High with 87.5 

µg/mL Bitter Melon (mulberry diet + 15% glucose 

for 84 hours + 87.5 µg/mL bitter melon extract, n=8), 

High with 175 µg/mL Bitter Melon (mulberry diet 

+ 15% glucose for 84 hours + 175 µg/mL bitter melon 

extract, n=6), and High with 350 µg/mL Bitter Melon 

(mulberry diet + 15% glucose for 84 hours + 350 µg/ 

mL bitter melon extract, n=5). (p < 0.05= *, p < 0.01= 

**, p > 0.05 = ns). Error bars ± SEM. (B) (Left) Silkworm treated 

with 350 µg/mL bitter melon, exhibiting a 

yellowish color and less rigidity than (Right) a normal 

diet fed silkworm. 

4. Discussion 

4.1 – Bitter Melon’s Effect on Hyperglycemia 

The results confirm Matsumoto et al.’s conclusion that 

a silkworm model can exhibit hyperglycemia (Fig. 4). By 

feeding a 10% glucose mulberry diet, the hemolymph sugar 

levels increased significantly. For the convenience of the 

model, the addition of glucose was increased to 15% to 

ensure that the intake across samples would be the same 

and that the effect was the greatest. Figure 4 

also compares mass and length across treatments. 

Matsumoto et al. (2011) reported that mass and length differed with 

the addition of glucose to the diet; however, no significant 

variation in these metrics was found here, indicating that they 

were not good indicators of health. 

From there, insulin was added as a control treatment 

to compare the effect of bitter melon. The 35 µg/mL dose was 

scaled from the average insulin treatment for 

humans, with guidance from Matsumoto et al. (2011). 

There was a significant decrease in glucose levels, again 

confirming the silkworm model. New treatments were 

tested with a 15% glucose mulberry diet instead of the 

10% glucose mulberry diet (Fig. 4 & Fig. 5). With an equal 

concentration of bitter melon, 35 µg/mL, the difference was not 

statistically significant (p = 0.12). However, bitter melon 

did lower glucose levels relative to the high glucose and 

saline sample. From this, it was predicted that bitter 

melon requires a larger dose than insulin does to produce 

a hypoglycemic effect. 

This prediction was tested (Fig. 7). With four varying 

concentrations of bitter melon – 35 µg/mL, 87.5 µg/ 

mL, 175 µg/mL, and 350 µg/mL – the relationship between 

bitter melon dose and the hypoglycemic effect was 

examined. The most effective dose out of the trials was 

175 µg/mL bitter melon, as it produced a similar p-value 

to the insulin treatment. However, the response to 

increasing doses did not follow the predicted linear 

trend. Instead, it followed a curved shape, which 

is likely due to the small sample sizes. The sample sizes of 

the various treatments ranged from 5 to 9, indicating that 

more samples would be necessary to determine if a linear 

trend exists. 

4.2 – Limitations 

Although the metrics of mass and length were not 

appropriate in quantifying the effect of bitter melon, 

qualitative observations of behavior helped define the 

health when compared to normal silkworms. With a high 

glucose mulberry diet, silkworms appeared lethargic and 

did not move as fast or eat as much as the normal diet 

silkworms. This could speak to food aversion or a toxicity 

of glucose in the diet. Silkworms have been frequently 

studied as a model of toxicity as they lack an adaptive 

immune system (Chen & Lu, 2018). Their innate immune system responds to 

peptidoglycans (PGs) and lipopolysaccharide (LPS), immune stimulators found 

in bacterial cell walls (Chen & Lu, 2018). This allows 

silkworms to defend against pathogens and infections. If 

they mounted a similar response to the insulin or glucose diet, 

the model would not be ideal to study. Even with the addition of 

insulin, the worms still appeared slightly lethargic. 

In addition, silkworms in the bitter melon trials exhibited 

a yellowish tinge (Fig. 7B). The higher the dose, the higher 

the mortality rate. Each trial began with 10 

worms; however, the sample sizes are considerably 

lower for the 35 µg/mL trial and the 350 µg/mL trial, as 

only 50% of the silkworms in those trials survived (Fig. 

7A). 

In addition, the administration of treatment also 

impacted the survival rate of the silkworms. As they 

have a limited hemolymph volume, injections caused 

severe hemolymph loss. Bruises and strain along the head 

and thorax were very visible. With extra pressure from 

the injections, silkworms often refrained from eating 

as they could not move. This method of 

administration was rather ineffective, as many samples 

could not be used for analysis. 

Lastly, the spectrophotometer used for the first half 

of the experiment was heavily used and therefore yielded 

unpredictable results. As a result, values from the first 

part (Fig. 4 and 5) cannot be directly compared to values from the 

second part (Fig. 7), as each part used its own standard curve. 

Overall, similar trends were observed throughout 

the trials, which allows the general effect of the treatments 

to be compared. 

5. Conclusion and Future Work 

By using a silkworm model, diabetes could be effectively 

modeled. The most effective dose of bitter melon was 

determined to be 175 µg/mL, which had effects that closely 

resembled those of the insulin treatment. The data do not 

confirm a linear relationship between dose of bitter melon 

and hypoglycemic effect, as there was variation in the data. 

With this knowledge, future work is necessary before 

bitter melon can be marketed as a hypoglycemic agent. At 

this dose, side effects and symptoms should be characterized to 

understand how it would impact human health. This can be 

done through studies in mice and human clinical trials. Additionally, 

different methods of preparing the extract should be 

tested. A liquid extract was prepared with distilled 

water and a powdered form of bitter melon to make the 

injection solutions in the previously mentioned trials. 

The difference between powdered and fresh bitter melon 

should be studied, with the different parts of bitter melon 

as well (the core and the exterior). With the combination 

of all of these factors, the doses may vary depending on 

what is optimal. 

Bitter melon shows potential not only as a 

hypoglycemic agent but also as a cancer 

and osteoarthritis therapy (Raina et al., 2016; Soo May et al., 

2018). Guided by bitter melon’s targeted effect against 

Type 2 Diabetes, Raina et al. attempted to provide a 

comprehensive view of the bioactivity of bitter melon’s 

different components and determine if they are applicable 

to cancer treatment (Raina et al., 2016). Specifically, they 

focus on how bitter melon interacts with other drugs. 

This is an important aspect to consider concerning 


diabetes as well, to see how bitter melon interacts with 

the mechanisms of insulin (Raina et al., 2016). Soo May 

et al. focus on bitter melon’s anti-inflammatory effects and 

how they can potentially reduce knee pain in osteoarthritis 

patients (Soo May et al., 2018). They concluded that with 

3 months of supplementation, bitter melon can reduce 

the need for analgesia consumption, while also showing 

reductions in body weight, body mass index, and fasting 

blood glucose (Soo May et al., 2018). Overall, bitter melon 

has a variety of beneficial effects that are not well studied, 

so it is important to understand how it affects the body to 

better recommend this natural remedy. 

Once this has been completed, health care providers 

can use this information, especially in Asian countries, to 

inform their patients about additional foods to include into 

their diet, with the appropriate intake. As there are many 

different vegetables and roots that are said to exhibit this 

hypoglycemic effect, they can be tested with the approach 

used in this paper to examine their effects before 

formally recommending their inclusion in the diet. 

6. Acknowledgments 

I would like to thank Dr. Kimberly Monahan for being 

an encouraging mentor and guiding me through the 

research process. Thank you to the Research in Biology 

class of 2019 for providing support throughout this project. 

Thank you to Kevin Zhang and Tyler Edwards for being 

my lab assistants over the summer. Finally, I would like to 

thank Dr. Sheck, the North Carolina School of Science and 

Mathematics, and the Glaxo Endowment for allowing me 

the opportunity to experience research. 

7. References 

American Diabetes Association. (n.d.). Diabetes Symptoms. Retrieved 
October 26, 2018, from http://www.diabetes.org/diabetes-basics/symptoms/ 

Axe, J. (2015, August 29). 7 Benefits of Ayurvedic Medicine: 
Lower Stress, Blood Pressure & More. Retrieved September 22, 2018, 
from https://draxe.com/ayurvedic-medicine/ 

CDC Press Releases. (2016, January 1). Retrieved 
January 27, 2018, from 
https://www.cdc.gov/media/releases/2017/p0718-diabetes-report.html 

Chen, K., & Lu, Z. (2018). Immune responses to bacterial 
and fungal infections in the silkworm, Bombyx mori. 
Developmental & Comparative Immunology, 83, 3–11. 
https://doi.org/10.1016/j.dci.2017.12.024 

Fuangchan, A. (2011). Hypoglycemic effect of bitter melon 
compared with metformin in newly diagnosed type 2 
diabetes patients. Journal of Ethnopharmacology, 134(2), 
422–428. https://doi.org/10.1016/j.jep.2010.12.045 

Insulin, Medicines, & Other Diabetes Treatments. (2016, 
November 01). Retrieved January 27, 2018, from 
https://www.niddk.nih.gov/health-information/diabetes/overview/insulin-medicines-treatments 

Jin Yang, S., Mook Choi, Jung. (2015). Preventive effects 
of bitter melon (Momordica charantia) against insulin 
resistance and diabetes are associated with the inhibition 
of NF-κB and JNK pathways in high-fat-fed OLETF rats. 
The Journal of Nutritional Biochemistry, 26(3), 234–240. 
https://doi.org/10.1016/j.jnutbio.2014.10.010 

Matsumoto, Y., Sumiya, E., Sugita, T., & Sekimizu, K. 
(2011). An Invertebrate Hyperglycemic Model for the 
Identification of Anti-Diabetic Drugs. PLOS ONE, 6(3), 
e18292. https://doi.org/10.1371/journal.pone.0018292 

Raina, K., Kumar, D., & Agarwal, R. (2016). Promise 
of bitter melon (Momordica charantia) bioactives in 
cancer prevention and therapy. Seminars in Cancer 
Biology, 40–41, 116–129. 
https://doi.org/10.1016/j.semcancer.2016.07.002 

Soo May, L., Sanip, Z., Ahmed Shokri, A., Abdul Kadir, 
A., & Md Lazin, M. R. (2018). The effects of Momordica 
charantia (bitter melon) supplementation in patients with 
primary knee osteoarthritis: A single-blinded, randomized 
controlled trial. Complementary Therapies in Clinical Practice, 
32, 181–186. https://doi.org/10.1016/j.ctcp.2018.06.012 



TETRAETHYL ORTHOSILICATE-POLYACRYLONITRILE 

HYBRID MEMBRANES AND THEIR APPLICATION IN 

REDOX FLOW BATTERIES 

Ethan Frey 

Abstract 

Redox flow batteries (RFBs) are a reliable solution to long term energy storage, but lack an inexpensive and effective 

proton exchange membrane. Polyacrylonitrile (PAN) nanoporous membranes have a high chemical stability but 

low hydrophilicity when compared to Nafion, the standard membrane. The addition of tetraethyl orthosilicate (TEOS) 

increases the mechanical and thermal properties of membranes and may also increase their hydrophilicity due to the 

presence of hydrophilic silicon hydroxide bonds. Therefore, doping a nanoporous hydrophobic PAN membrane with 

TEOS is hypothesized to increase the hydrophilicity of the membrane, while still maintaining a high chemical stability 

and low vanadium crossover. Membranes of Nafion 212, nanoporous PAN, and a nanoporous hybrid TEOS/PAN were 

prepared through a phase inversion method and tested for chemical stability, proton and vanadium crossover in a model 

RFB, and water contact angle. The TEOS/PAN hybrid membrane had a higher hydrophilicity than both PAN and Nafion. 

The addition of TEOS had no impact on chemical stability. However, the TEOS/PAN hybrid membrane did have a higher 

vanadium crossover and lower proton/vanadium selectivity. It was concluded that TEOS can increase hydrophilicity, 

but more research needs to be done to improve proton/vanadium selectivity, potentially by optimizing pore size. Since 

TEOS was proven as an effective additive to membranes, progress was made towards the development of an ideal proton 

exchange membrane and a solution to long-term energy storage. 


Recently, polyacrylonitrile (PAN) nanofiltration 

membranes have proven to be a promising alternative to 

Nafion membranes due to their high chemical stability 

and low cost. However, PAN membranes have 

been found to lack the proton conductivity of Nafion, a 

property that could be increased through the addition of 

additives to a PAN membrane. The addition of tetraethyl 

orthosilicate to a PAN membrane may not only increase 

the hydrophilicity and proton conductivity of PAN, but 

also improve the membrane’s mechanical strength and 

thermal properties, while still maintaining the chemical 

stability of PAN. The creation of a hybrid membrane of 

PAN and TEOS could make progress towards the creation 

of a cheaper membrane with properties comparable to those 

of Nafion. 

As renewable energy has become increasingly popular, 

demand for long term energy storage has increased as well. As 

a result, considerable attention has been given to redox flow 

batteries (RFBs) due to their ability to store energy for 

an indefinite period of time. What differentiates an RFB 

from other battery types is that it behaves essentially as a 

reversible fuel cell. 

Figure 1. Redox Flow Battery Design: Two different 

oxidation states of vanadium are stored in the tanks 

on either side of the battery and pumped into two 

adjacent half cells where the vanadium is reduced or 

oxidized and a flow of electrons is created. However, 

a proton exchange membrane is essential to allow 

this reaction to occur. 

The fuel is stored in tanks separate from where the 

oxidation and reduction occurs (Fig. 1). This fuel is 

pumped into two adjacent half cells separated by a proton 

exchange membrane. The vanadium in one half cell 

is reduced and that in the other is oxidized before 

being pumped back into the fuel tank. The battery is 

finished charging or discharging when all of the fuel has 

been reduced or oxidized. Commonly used batteries, such 

as lithium-ion batteries, can slowly discharge while not in 

use, resulting in a loss of charge over time. This is due to 

the fuel being stored where the reduction and oxidation 

occurs, allowing spontaneous reactions to take place even 

26 | 2018-2019 | Broad Street Scientific CHEMISTRY

when the battery is not in use. Since the fuel in RFBs is 

stored externally, the battery cannot slowly discharge over 

time while not in use, and energy can be stored for an 

indefinite period of time. Vanadium is most typically used 

in RFBs due to its multiple oxidation states and large ion 

size (Alotto et al., 2014). 

The expensive proton exchange membrane prevents 

the use of RFBs on a commercial scale. The most widely 

used membrane is Nafion. This membrane is expensive and 

its properties could still be improved upon. It still exhibits 

vanadium ion crossover, and a higher hydrophilicity and 

proton conductivity could increase its efficiency. However, 

it is challenging to manipulate these properties while still 

maintaining a high level of chemical stability. Vanadium 

ion crossover is difficult to decrease while still maintaining 

proton conductivity. Similarly, proton conductivity is 

difficult to increase without decreasing chemical stability 

or increasing vanadium crossover. A membrane must 

be both hydrophobic to maintain chemical stability and 

hydrophilic to conduct protons. An alternative is to create 

a membrane that is simply very hydrophobic and has 

nanopores to allow protons to pass through. However, an 

extremely hydrophobic membrane struggles to keep the 

nanopores big enough to allow protons to pass through 

but small enough to prevent vanadium crossover. Nafion is 

designed (Fig. 2) such that it has a fluorinated carbon chain 

that allows for high chemical stability and hydrophobicity. 

Yet, Nafion still has an S-OH bond that allows for some 

hydrophilicity. 

Figure 2. Nafion Structure: Nafion contains 
a hydrophobic fluorinated backbone with a 
hydrophilic sulfur hydroxide bond. 

As a result of its structure, Nafion maintains a high 
chemical stability while still allowing protons to cross over 
the membrane. The cost of Nafion is mainly due to the 
manufacturing cost of making fluorinated membranes. 
Therefore, non-fluorinated membranes have been 
investigated as a cheaper alternative. However, many 
lack the chemical stability of fluorinated membranes, 
which poses a challenge for their application in vanadium 
RFBs. Polyacrylonitrile has recently been recognized 
as a promising option for non-fluorinated membranes 
due to its high chemical stability despite the absence of 
fluorine. In fact, PAN has been explored in applications as 
a superhydrophobic polymer with a water contact angle 
exceeding 170°, indicating its high chemical stability (Feng 
et al., 2002). However, high hydrophobicity can become 
problematic when it prevents the membrane from 
conducting protons. Polyacrylonitrile membranes with 
nanopores from a phase inversion method (Zhang et al., 
2011) and conditioning in an alkali solution (Karpushkin 
et al., 2017) have been investigated. These investigations 
found that, as expected, polyacrylonitrile lacks the proton 
conductivity of Nafion. Therefore, doping PAN to 
increase its hydrophilicity could create a membrane that 
is comparable to Nafion. 

Doping Nafion with metal oxides to improve its 
hydrophilicity has been explored repeatedly and proven 
successful (Noto et al., 2007). Specifically, silicon dioxide 
has been proven effective due to its ability to significantly 
increase the thermal properties and hydrophilicity 
of Nafion (Yu et al., 2007). Doping PAN with metal 
oxides should also increase its hydrophilicity. Tetraethyl 
orthosilicate (TEOS) has been explored as an additive to 
a polyvinylidene fluoride membrane (Liu et al., 2008). 
However, only the enhanced mechanical properties and 
effect on pore size were explored and the doped PVDF 
membrane was not tested in application for RFBs. TEOS 
has also been tested and used for the creation of a super 
hydrophilic surface in photovoltaic cells, proving its use as 
a hydrophilic material (Yan et al., 2015). Its hydrophilicity 
is due to the presence of hydroxide bonds after being 
polymerized and hydrolyzed, like the ones in Nafion (Fig. 
3). 

Figure 3. Hydrolyzed and Polymerized TEOS: Just 

like Nafion’s hydrophilic sulfur hydroxide bonds, 

TEOS contains hydrophilic silicon hydroxide bonds. 

Doping PAN with TEOS should have the same effect 

that adding a metal oxide, like silicon oxide, would have. 

TEOS has also demonstrated an increase in the thermal 

stability of membranes (Liu et al., 2008). Therefore, 

doping PAN with TEOS should combine the superhydrophobicity 

and chemical stability of PAN with the 

super-hydrophilicity, thermal stability, and high tensile 

strength of TEOS without compromising the high 

chemical stability or proton/vanadium selectivity of PAN. 

Engineering a more efficient and less expensive proton 

exchange membrane will allow energy to be generated 

and stored for an indefinite period of time, making 


renewable energy much more reliable and allowing entire 

cities to depend on it. The future of RFB membranes lies 

in the development of a non-fluorinated membrane, due 

to the cheaper manufacturing cost. Through testing how 

to improve the properties of a promising non-fluorinated 

membrane, progress is made towards engineering an ideal 

proton-exchange membrane. 

2. Methods 

2.1 – Preparing the Membranes 

The following procedure was adapted from Liu et al. (2008). 

The nanoporous PAN membrane was prepared by casting 

a 15 wt% PAN (Mw = 150,000), 3 wt% LiCl, and 4 wt% 

polyvinylpyrrolidone (PVP) solution in DMSO on a glass 

plate and leveling it with an RDS40 wire round rod (RD 

Specialties, USA). A normal phase inversion was then 

conducted using a water bath at room temperature. The 

membrane was left in the water bath for 1-2 days to 

remove any remaining solvents. The hybrid membrane 

was prepared by lowering the weight percentages to 12.5 

wt% PAN, 2.5 wt% LiCl, 3.25 wt% PVP, and adding 6.7 

wt% TEOS. A normal phase inversion was then conducted 

in an acid bath (pH = 1), to allow the polymerization of 

TEOS, and then transferred to a water bath for 1-2 days. 
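The weight percentages above can be converted into component masses for a batch of casting solution. The sketch below assumes that DMSO makes up the balance of each solution, which the text implies but does not state outright; the 100 g batch size and the function name `batch_masses` are hypothetical.

```python
def batch_masses(total_g, weight_percent):
    """Mass (g) of each component in a casting solution of total_g grams.

    Whatever is not listed in weight_percent is assigned to the solvent,
    assumed here to be DMSO.
    """
    solute_pct = sum(weight_percent.values())
    if solute_pct >= 100:
        raise ValueError("weight percentages must sum to less than 100")
    masses = {name: total_g * pct / 100 for name, pct in weight_percent.items()}
    masses["DMSO"] = total_g * (100 - solute_pct) / 100
    return masses


# Weight percentages as given in the text.
pan_recipe = {"PAN": 15, "LiCl": 3, "PVP": 4}
hybrid_recipe = {"PAN": 12.5, "LiCl": 2.5, "PVP": 3.25, "TEOS": 6.7}

pan_batch = batch_masses(100, pan_recipe)        # hypothetical 100 g batch
hybrid_batch = batch_masses(100, hybrid_recipe)  # hypothetical 100 g batch
```

Under this assumption, the solvent fraction is 78 wt% for the PAN solution and about 75 wt% for the hybrid solution.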

2.2 – Model Redox Flow Battery 

The model redox flow battery was designed using 

two mini-variable flow peristaltic pumps (Fisher) and 

a modified fuel cell (Heliocentris). The fuel cell was 

composed of graphite, two carbon felt electrodes, and a 

membrane. The design is shown below (Fig. 4). 

Figure 4. Model Redox Flow Battery Design: The 
design consists of two peristaltic pumps, two fuel 
tanks, and two adjacent half cells with electrodes 
separated by a membrane. 

2.3 – Vanadium 4+ Preparation 

All fuel was prepared through the reduction of V₂O₅ 
to VO²⁺ using glycerol in the presence of HCl (Small et 
al., 2017). 38.9 mL of deionized water, 50.0 mL of 12.1 M 
HCl, and 5.0 g of V₂O₅ were added to a beaker and stirred at 
60-90 °C. After the dissolution of V₂O₅, temperature was 
maintained while 0.7 mL of glycerol was stirred in. Once a 
uniform blue color was obtained, indicating the formation 
of V⁴⁺, the reaction was completed. 

2.4 – Proton/Vanadium Selectivity Test 

Proton/vanadium selectivity was tested by filling one 
side of the fuel cell with water and the other side with 
2 M VO²⁺ and 7 M HCl. The pumps were run for 45 
minutes with samples being taken every 5 minutes. The 
concentration of vanadium in these samples was measured 
using a spectrophotometer. The absorption at 765 nm 
was measured and Beer’s law was used to calculate the 
concentration using a molar absorptivity of 13.40 (Choi 
et al., 2013). The concentration of protons was measured 
through pH measurement using a pH meter (Vernier). All 
data were recorded in Logger Pro (Vernier). 

2.5 – Chemical Stability 

The chemical stability of the membranes was estimated 
by placing the membranes in a solution that consisted of 
1 M V₂O₅ and 5 M HCl at 50 °C for 30 days. The presence of 
VO²⁺ indicated that the membrane had been oxidized. The 
stability of the membrane was determined by calculating 
the percentage of VO²⁺ in the sample in comparison to a 
control solution without a membrane. 

3. Results 

3.1 – Water Contact Angle 

Figure 5. Water contact angle of a drop of water on 

the membranes. The photo was taken on an IPhone 

6s and the angles were measured using Logger Pro. 

(A) Nafion (B) PAN (C) Hybrid 


Hydrophilicity of Nafion, PAN, and the hybrid 

membrane was demonstrated by measuring the contact 

angle of a water droplet (Fig. 5). It was found that Nafion 

had the largest water contact angle of 87.09° followed by 

PAN at 55.01° and the hybrid membrane at 42.27°. 

3.2 – Proton and Vanadium Crossover

Nafion was found to have the lowest vanadium crossover, PAN the second highest, and the hybrid membrane the highest (Fig. 6A). The TEOS/PAN hybrid membrane also had the highest proton conductivity (Fig. 6B). The overall proton/vanadium selectivity was similar for the Nafion and PAN membranes. However, the hybrid membrane had a much lower selectivity (Fig. 6C).

Figure 6. (A) Vanadium Crossover (B) Proton Crossover (C) Proton/Vanadium Selectivity. The vanadium and proton crossovers were measured over a 45 minute period. A model redox flow battery was used. An acidic V⁴⁺ solution was placed on one side of the battery and deionized water on the other side. The concentration of vanadium was measured over time using a spectrophotometer and the proton concentration was measured using a Vernier pH probe. The proton/vanadium selectivity was determined as the ratio of the proton crossover to the vanadium crossover.

3.3 – Chemical Stability

Table 1. The percent of V⁵⁺ reduced to V⁴⁺. This indicated ongoing oxidation of the membrane and therefore can be used to analyze the chemical stability of the membrane. Concentrations were measured using a spectrophotometer.

                                         Nafion    PAN        Hybrid     Reference
Percent Vanadium Reduced (%)             4.72%     1.20%      1.67%      2.41%
Percent Reduced Compared to Reference    96.06%    -49.94%    -30.54%    n/a

The chemical stability of the prepared membranes was measured as the percent of the original vanadium that was reduced (Table 1), indicating ongoing oxidation of the membrane. The percent of vanadium reduced was highest for Nafion with 4.72%, followed by the hybrid membrane with 1.67% and the PAN membrane with 1.20%. However, 2.41% of the vanadium in the reference sample (the sample without a membrane) was also reduced. Comparing the measured percentages to the reference percentage shows that the percent reduced for Nafion was 96% higher than that of the reference sample, while PAN was 49.9% lower and the hybrid was 30.5% lower.

4. Discussion and Conclusion

The goal of this project was to demonstrate that 

hydrolyzed and polymerized TEOS could effectively 

increase hydrophilicity and provide a suitable substitute 

for Nafion in a vanadium redox flow battery. The results 

of the experiments are summarized in Table 2. Introducing 

TEOS to the PAN membrane improved its hydrophilicity 

as demonstrated by the water contact angle test (Fig. 5). 

The smaller the water contact angle, the more hydrophilic 

the material, because the water is less strongly repelled by the 

polymer. The hybrid membrane had a smaller water 

contact angle than both PAN and Nafion, indicating a 

high hydrophilicity. Its high hydrophilicity was further 

demonstrated when tested in a model redox flow battery. 

The hybrid membrane showed a higher proton crossover 

than both Nafion and PAN (Fig. 6). These tests show that 

the proton conductivity of the PAN membrane was most 

likely successfully increased. The chemical stability of the membrane was also maintained with the addition of TEOS, as none of the membranes showed a significant amount of oxidation in the presence of a strong oxidizer.

Table 2. Data Summary: The water contact angle, vanadium crossover, proton crossover, proton/vanadium selectivity, and chemical stability of Nafion, PAN, and the hybrid TEOS/PAN membrane.

Membrane   Water Contact   Permeability to     Permeability to     Proton/Vanadium   Chemical Stability (% Reduced
           Angle (°)       V⁴⁺ (cm² min⁻¹)     H⁺ (cm² min⁻¹)      Selectivity       Compared to Reference)
Nafion     87.09           7.54×10⁻⁵           3.77×10⁻⁴           18.19             96.06%
PAN        55.01           1.57×10⁻⁴           7.83×10⁻⁴           15.74             -49.94%
Hybrid     42.27           2.37×10⁻⁴           1.18×10⁻⁵           8.75              -30.54%


However, the TEOS-PAN hybrid did show increased 

vanadium crossover. Prevention of vanadium crossover 

is an essential function of a proton-exchange membrane. 

The different oxidation states of vanadium on either side of 

the membrane need to remain unmixed while still allowing 

protons to cross over. Therefore, proton/vanadium 

selectivity is measured as the ability of the membrane to 

allow protons to cross over but prevent vanadium ions 

from crossing over. A higher proton/vanadium selectivity 

is ideal. However, the hybrid membrane displayed a 

lower proton/vanadium selectivity than both Nafion 

and the PAN membrane. Further optimization will need 

to be performed in order to improve proton/vanadium 

selectivity. 

There are several areas that can be explored to improve 

upon this research. The PAN and TEOS-PAN membranes 

were cast through a phase inversion method and developed 

nanopores, which allow these ions to cross over. The size 

of the nanopores has a significant effect on the selectivity 

of the membrane. To aid in the casting process, a lower 

polymer concentration was used for the hybrid membrane. 

However, this may have resulted in an increased pore size, 

causing the decreased selectivity and increased vanadium 

crossover. This could be examined with scanning electron 

microscopy to verify the pore sizes. It could be inferred 

that if the increased vanadium crossover is only due to an 

increased pore size, then the increased proton crossover 

is also only due to an increased pore size. However, it was 

demonstrated that the membrane was more hydrophilic 

in the water contact angle test. Therefore, the hybrid 

membrane should easily allow protons to cross over even 

with a reduced pore size. 

This study successfully demonstrated that the addition 

of hydrolyzed and polymerized TEOS to a PAN membrane 

was effective in increasing membrane hydrophilicity. 

Further research needs to be done to investigate if the 

addition of TEOS results in an increased vanadium crossover 

or if this could be overcome through the optimization of 

the pore size of the hybrid membrane. The membranes’ 

properties should also be tested in a functional redox flow battery to determine the effects of the improved properties on the efficiency of the battery. TEOS has been shown to be an effective additive for increasing the mechanical and thermal properties and the hydrophilicity of membranes. A hybrid TEOS-PAN membrane with an optimized pore size may offer properties comparable to those of Nafion at a lower cost.


5. Acknowledgments

I would like to thank Dr. Michael Bruno for his help 

and mentorship throughout the research project as 

well as the help and support of my fellow Research in 

Chemistry peers. Finally, I would like to thank the NCSSM 

Foundation for funding my research project as it has been 

an invaluable experience. 

6. References 

Alotto, P., Guarnieri, M., Moro, F. (2014). Redox flow 

batteries for the storage of renewable energy: A review. 

Renewable and Sustainable Energy Reviews, 29, 325-335. 

doi:10.1016/j.rser.2013.08.001 

Choi, N. H., Kwon, S., Kim, H. (2013). Analysis of the 

Oxidation of the V(II) by Dissolved Oxygen Using UV- 

Visible Spectrophotometry in a Vanadium Redox Flow 

Battery. Journal of The Electrochemical Society, 160(6). 

doi:10.1149/2.145306jes 

Feng, L., Li, S., Li, H., Zhai, J., Song, Y., Jiang, L., Zhu, D. (2002). Super-Hydrophobic Surface of Aligned Polyacrylonitrile Nanofibers. Angewandte Chemie International Edition, 41(7), 1221-1223. doi:10.1002/1521-3773(20020402)41:7<1221::AID-ANIE1221>3.0.CO;2-G

Karpushkin, E. A., Gvozdik, N. A., Stevenson, K. J., Sergeyev, V. G. (2017). Membranes based on carboxyl-containing polyacrylonitrile for applications in vanadium redox-flow batteries. Mendeleev Communications, 27(4), 390-391. doi:10.1016/j.mencom.2017.07.024

Liu, X., Peng, Y., Ji, S. (2008). A new method to prepare 

organic–inorganic hybrid membranes. Desalination, 

221(1-3), 376-382. doi:10.1016/j.desal.2007.02.056 


Noto, V. D., Gliubizzi, R., Negro, E., Vittadello, M., Pace, G. (2007). Hybrid inorganic–organic proton conducting membranes based on Nafion and 5wt.% of MₓOᵧ (M=Ti, Zr, Hf, Ta and W). Electrochimica Acta, 53(4), 1618-1627. doi:10.1016/j.electacta.2007.05.00

Small, L. J., Pratt, H., Staiger, C., Martin, R. I., Anderson, T. M., Chalamala, B., Subarmanian, V. R. (2017). Vanadium Flow Battery Electrolyte Synthesis via Chemical Reduction of V₂O₅ in Aqueous HCl and H₂SO₄. doi:10.2172/1342368

Yan, H., Yuanhao, W., Hongxing, Y. (2015). TEOS/Silane- 

Coupling Agent Composed Double Layers Structure: A 

Novel Super-hydrophilic Surface. Energy Procedia, 75, 

349-354. doi:10.1016/j.egypro.2015.07.384 

Yu, J., Pan, M., Yuan, R. (2007). Nafion/Silicon oxide 

composite membrane for high temperature proton 

exchange membrane fuel cell. Journal of Wuhan 

University of Technology- Mater. Sci. Ed., 22(3), 478-481. 

doi:10.1007/s11595-006-3478-3 

Zhang, H., Zhang, H., Li, X., Mai, Z., Zhang, J. (2011). Nanofiltration (NF) membranes: The next generation separators for all vanadium redox flow batteries (VRBs)? Energy & Environmental Science, 4(5), 1676. doi:10.1039/c1ee.



NOVEL SYNERGISTIC ANTIOXIDATIVE 

INTERACTIONS BETWEEN SOY LECITHIN AND 

CYCLODEXTRIN-ENCAPSULATED QUERCETIN IN A 

LIPID MATRIX 

Anirudh Hari 

Abstract 

Food oils stale via multiple mechanisms, the most damaging being oxidation by free radicals through reaction with oxygen 

in the air. Antioxidants are used to combat this oxidation, but many that are commonly used have carcinogenic properties. 

Quercetin is a safer polyphenolic phytochemical known to possess antioxidative properties in lipid matrices. Soy lecithin, 

a common food emulsifier primarily composed of phospholipids, also possesses antioxidative properties in lipid matrices, 

one of its primary mechanisms being the dispersion of less lipid-soluble antioxidants in the matrix. Phosphatidylcholine, 

the primary component of soy lecithin, is capable of forming a hydrogen bond from its polar head to a hydroxyl group of 

quercetin to create a complex known as a phenolipid. This phenolipid has a greater antioxidative effect than soy lecithin 

or quercetin do alone. However, one issue that remains prevalent is rapid degradation of quercetin in the lipid matrix. 

Beta-cyclodextrin is a ring-shaped molecule which can encapsulate quercetin, but it has not been tested for its ability to 

protect quercetin against degradation in oils. 

A novel phenolipid was formulated between a quercetin-cyclodextrin complex and soy lecithin, thus doubly encapsulating 

quercetin in order to potentially increase antioxidative lifetime by protecting the molecule while still maintaining the 

dispersive effect of lecithin. An accelerated oxidation test was conducted and time points were analyzed for radical 

scavenging activity. Results revealed that the novel phenolipid scavenged radicals more effectively than quercetin or 

lecithin by themselves, and also had a greater antioxidative lifetime, showing much higher radical scavenging activity than 

the quercetin-lecithin phenolipid and quercetin or lecithin alone after 12 days of the oxidation test. This implies novel 

applications for beta-cyclodextrins in the protection of polyphenolic antioxidants in lipid matrices. 


1. Introduction

The oxidation of lipids is a major concern in the food 

industry, especially with unsaturated and polyunsaturated 

fats which are very sensitive to degradation (Ramadan et 

al., 2012). When excited by light, molecular oxygen in the 

air forms the superoxide anion, a free radical that oxidizes 

molecules in the oil, causing a radical chain reaction that 

results in cleavage of the double bonds in unsaturated fatty 

acids, resulting in their degradation into aldehydes and 

ketones. This process, known as oxidative rancidification, 

is responsible for the characteristic stale smell of old oils. 

While there are a number of ways to prevent oxidative 

rancidification, including wrapping containers with foil 

to prevent reactions catalyzed by sunlight and vacuum-sealing 

containers to prevent interactions with oxygen, 

the most effective way to protect oils is the addition of 

antioxidants to scavenge free radicals (Judde et al., 2003). 

Antioxidants are used in the food industry to protect 

oils by reducing reactive free radicals. The addition of 

antioxidants to oils inhibits oxidative rancidification, 

slowing the rate of decline in oil quality (Ramadan et 

al., 2012). However, many of the most commonly used 

synthetic antioxidants, including butylated hydroxyanisole, 

butylated hydroxytoluene, propyl gallate, and tert-butyl 

hydroquinone, have been shown to promote carcinogenesis 

(National Toxicology Program, 2001). 

Flavonoids are a group of natural polyphenolic 

phytochemicals consisting of more than 4000 

molecules that vary in structure and properties. 

3,5,7,3′,4′-pentahydroxyflavone, also known as quercetin, 

is a yellow-colored flavonoid that possesses antioxidative 

properties in lipid matrices and is considered safe at much 

higher doses than most common synthetic antioxidants. 

Quercetin is used in the food industry as an alternative to 

synthetic antioxidants, but the degradation of quercetin 

via glycosylation at its hydroxyl groups is a major limit 

to its application in foods. The structural feature of 

quercetin most involved in its antioxidative mechanism is 

the hydroxyl group on the 4′ carbon, which can donate a 

hydrogen atom to a free radical to reduce it (Ozgen et al., 

2016) (Fig. 1). 

Figure 1. Quercetin reduces a free radical, labelled R, 

by donating a hydrogen atom from its 4′ hydroxyl 

group to form a stable radical 4′-quercetin. 


Phospholipids are another class of antioxidants that 

have different antioxidative mechanisms than flavonoids. 

Soy lecithin is a mixture of amphipathic phospholipids, 

primarily phosphatidylcholine, and is a common 

emulsifying agent that also possesses antioxidative 

properties in lipid matrices. Soy lecithin has multiple 

antioxidative mechanisms by itself. The choline group on 

the phospholipid head of phosphatidylcholine is capable 

of accepting a free electron from radical molecules. 

Phospholipids also form an oxygen barrier at the 

atmospheric interface of oils to prevent oxidation (Judde 

et al., 2003). 

It has been shown that soy lecithin also helps to 

disperse other antioxidants present within the oil to allow 

them to scavenge radicals more efficiently. Quercetin 

and soy lecithin exhibit higher radical scavenging activity 

when mixed together than when tested individually. The 

catechol group on the 3′ and 4′ carbons of quercetin allows 

for intramolecular and intermolecular hydrogen bonding 

with phosphatidylcholine to create a “phenolipid” between 

quercetin and the phospholipid (Fig. 2). This phenolipid is 

fat soluble, increasing the accessibility of quercetin in oil. 

However, although soy lecithin fully surrounds quercetin 

in the phenolipid formation, it does not inhibit the 

breakdown of quercetin, which remains a limiting factor 

in its application (Ramadan et al., 2012). 

Figure 2. Quercetin-phosphatidylcholine phenolipid complex.

Cyclodextrins are ring-shaped molecules that can encapsulate certain molecules through hydrogen bonding. Such encapsulation has been performed with quercetin and has shown an increase in solubility (Zheng et al., 2005). However, an important potential application of the cyclodextrin-quercetin complex that has not been previously investigated is the possible protection of quercetin from degradation.

Since beta-cyclodextrin increases the water solubility of quercetin, it would decrease the lipid solubility, making the cyclodextrin-quercetin complex unsuitable for use in a lipid matrix by itself. However, the outer hydroxyl groups of beta-cyclodextrin may be able to bond with soy lecithin through hydrogen bonds to form a novel phenolipid, increasing the solubility of a complex of quercetin and beta-cyclodextrin in oil. This would form a double encapsulation of quercetin (Fig. 3). Beta-cyclodextrin could prevent the degradation of quercetin with its encapsulation of the molecule, while soy lecithin facilitates its dispersion in oil, allowing this novel phenolipid to have a higher antioxidative effect compared to quercetin or lecithin by themselves as well as an increased antioxidative lifetime. This could be important in controlling the rate of oxidation of quercetin both in food protection and medical applications.

Figure 3. Potential double encapsulation of quercetin by beta-cyclodextrin and phosphatidylcholine.


It was hypothesized that a phenolipid between lecithin 

and the quercetin-cyclodextrin complex would increase 

availability of quercetin in sunflower oil, and that this 

phenolipid would have a greater antioxidative lifetime 

than the phenolipid made of quercetin and lecithin. 

Alternatively, the double encapsulation may prevent 

degradation of quercetin without increasing the availability 

of quercetin in sunflower oil. 

In the present study, a molecular docking model of the 

encapsulation of quercetin by beta-cyclodextrin indicated 

that in the most stable conformation, the 4′ hydroxyl group 

important to the antioxidative mechanism of quercetin is 

not encompassed by the cyclodextrin, while the rest of 

the quercetin molecule is, suggesting that quercetin could 

retain its antioxidative ability in the beta-cyclodextrin 

complex. 

The novel phenolipid was prepared along with 

the quercetin-lecithin phenolipid and the quercetin-cyclodextrin 

complex. Each antioxidant sample was mixed 

in sunflower oil and incubated in an oven to accelerate 

oxidation. Radical scavenging activity assay was conducted 



periodically to measure reduction in antioxidant 

effectiveness over time. Results from radical scavenging 

activity assay indicate that the doubly encapsulated 

quercetin does have a greater antioxidative lifetime than 

the quercetin-lecithin phenolipid, and it also has a greater 

initial ability to scavenge radicals than quercetin or soy 

lecithin alone. This suggests a new potential application of 

beta-cyclodextrins to allow antioxidants to last longer in 

oils, which would also increase the lifetime of the oils due 

to longer term protection from oxidative rancidification. 

2. Materials and Methods 

2.1 – Molecular Modeling 

Before experimentation, the encapsulation of quercetin 

by beta-cyclodextrin was computationally modelled 

by molecular docking using PatchDock. PDB files for 

quercetin and beta-cyclodextrin were obtained and fed 

into the server, which found the most stable conformation 

of quercetin inside beta-cyclodextrin using shape 

complementarity and electrostatic interactions. 

2.2 – Encapsulation of Quercetin in Beta-Cyclodextrin 

Quercetin dihydrate, 97% (Alfa Aesar) was encapsulated 

in beta-cyclodextrin hydrate, 99% (Acros Organics) using 

physical kneading. An equimolar ratio of quercetin and 

beta-cyclodextrin powder was mixed in a mortar using a 

pestle for 10 minutes. Then, a small amount of ethanol 

(Fisher Scientific) was added and the mixture was kneaded 

for 40 more minutes. After kneading, the mixture was 

dried in a vacuum desiccator for 24 hours. 
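The equimolar weighing described above can be sketched as a short calculation. This is an illustration under stated assumptions, not the author's procedure: the molar masses are standard values for quercetin dihydrate and anhydrous beta-cyclodextrin (the water content of the beta-cyclodextrin hydrate actually used is not given in the text), and the 0.100 g starting mass and the function name are hypothetical.

```python
# Assumed molar masses in g/mol (standard values, not stated in the text).
M_QUERCETIN_DIHYDRATE = 338.27   # C15H10O7 . 2 H2O
M_BETA_CD_ANHYDROUS = 1134.98    # C42H70O35, hydrate water ignored


def equimolar_partner_mass(mass_a_g, molar_mass_a, molar_mass_b):
    """Mass of compound B containing the same number of moles as mass_a_g of A."""
    moles_a = mass_a_g / molar_mass_a
    return moles_a * molar_mass_b


# Hypothetical starting amount of quercetin dihydrate.
quercetin_g = 0.100
beta_cd_g = equimolar_partner_mass(
    quercetin_g, M_QUERCETIN_DIHYDRATE, M_BETA_CD_ANHYDROUS
)
print(f"{quercetin_g:.3f} g quercetin dihydrate pairs with {beta_cd_g:.3f} g beta-cyclodextrin")
```

If the hydrate water of the beta-cyclodextrin were included, its effective molar mass and therefore the required mass would be somewhat larger.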

50 mg of the dried mixture was dissolved in 50 mL 

of acetonitrile (Fisher Scientific), causing the beta-cyclodextrin 

and quercetin+beta-cyclodextrin complexes 

to precipitate, while the free quercetin that did not 

get complexed remained in solution. The absorbance 

spectrum of quercetin was taken using a Vernier UV- 

Vis spectrophotometer in a Hellma QS 282 1.000 quartz 

cuvette, showing 2 UV peaks: one at 260 nm and one at 

370 nm. A standard curve was made with absorption at 

370 nm as a function of quercetin concentration (Santos et 

al., 2015) (Fig. 4). 

Figure 4. Standard curve of quercetin in acetonitrile 

at 370 nm. 

The solution of complex in acetonitrile was allowed 

to settle for 3 days. The concentration of free quercetin 

in solution was determined using the standard curve. 

This concentration was compared to the total quercetin 

concentration in the solution, and the entrapment 

efficiency (EE) was determined using the following 

equation: 

$$\mathrm{EE} = 1 - \frac{\text{free quercetin concentration}}{\text{total quercetin concentration}}$$
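The path from a standard curve to the entrapment efficiency can be sketched as follows. This is a minimal illustration, assuming a linear standard curve fitted by ordinary least squares; all absorbance and concentration values, and the function names, are hypothetical and are not data from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to get a concentration."""
    return (absorbance - intercept) / slope


def entrapment_efficiency(free_conc, total_conc):
    """EE = 1 - free / total, as defined in the text."""
    return 1.0 - free_conc / total_conc


# Hypothetical standards: concentration (mg/L) versus absorbance at 370 nm.
standard_conc = [0.0, 5.0, 10.0, 20.0, 40.0]
standard_abs = [0.00, 0.10, 0.20, 0.40, 0.80]
slope, intercept = fit_line(standard_conc, standard_abs)

# Hypothetical supernatant reading and total quercetin concentration.
free_conc = concentration_from_absorbance(0.30, slope, intercept)
ee = entrapment_efficiency(free_conc, total_conc=30.0)
print(f"free = {free_conc:.1f} mg/L, EE = {ee:.2f}")
```

Both concentrations must be expressed in the same units for the ratio to be meaningful.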

2.3 – Formation of Phenolipid Complexes 

The complex was removed from acetonitrile solution 

by vacuum filtration and mixed with soy lecithin (Alfa 

Aesar) at a 3:97 ratio complex to lecithin by mass. The 

complex was then dissolved in 10 mL ethyl acetate (Fisher 

Scientific). Several control groups were also dissolved 

in ethyl acetate: quercetin, quercetin with lecithin 3:97 

(phenolipid), and quercetin encapsulated in cyclodextrin 

without lecithin. 

Each sample was incubated at 40°C for 24 hours to 

facilitate dissolution. The samples were then dried by 

creating a vacuum within a chamber using a Chemglass 

Scientific Apparatus Vacuum for 2 hours. 

2.4 – Accelerated Oxidation 

Each sample was added to 100% sunflower oil (Loriva, 

cold pressed) at a concentration of 500 parts per million. 

The Schaal oven accelerated oxidation test was run on the 

4 samples as well as a sample with only sunflower oil as 

a negative control. Each mixture was placed in a 20 mL 

clear glass bottle. Each bottle was completely sealed and 

incubated in an oven at 60°C (Ramadan et al., 2012). 
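The dosing arithmetic can be illustrated with a short sketch. It assumes that parts per million is taken on a mass basis (milligrams of additive per kilogram of oil), which the text does not state explicitly; the 20 g oil mass and the function names are hypothetical. The 3:97 split is the complex-to-lecithin mass ratio given in Section 2.3.

```python
def additive_mass_mg(oil_mass_g, ppm):
    """Additive mass in mg, treating ppm as mg of additive per kg of oil."""
    return oil_mass_g / 1000.0 * ppm


def split_by_ratio(total_mg, parts_a, parts_b):
    """Split a total mass into two components with mass ratio parts_a:parts_b."""
    whole = parts_a + parts_b
    return total_mg * parts_a / whole, total_mg * parts_b / whole


# Hypothetical oil mass; 500 ppm is the dosing level stated in the text.
total_mg = additive_mass_mg(oil_mass_g=20.0, ppm=500)
complex_mg, lecithin_mg = split_by_ratio(total_mg, 3, 97)
print(f"total additive: {total_mg:.1f} mg")
print(f"complex: {complex_mg:.2f} mg, lecithin: {lecithin_mg:.2f} mg")
```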

Samples were withdrawn at 0, 3, 9, and 12 days and 

analyzed by Radical Scavenging Activity (RSA) assay. 

1,1-Diphenyl-2-picrylhydrazyl (DPPH) radical (Alfa Aesar) 

was dissolved in reagent grade toluene (Fisher Scientific) 

at a concentration of 10⁻⁴ M. 10 mg of each experimental 

sample was dissolved in 100 µL of toluene. This solution 

was mixed with 390 µL of the DPPH solution, and the 

mixture was vortexed at maximum speed for 20 seconds 

at ambient temperature. The decrease in absorbance at 

515 nm between the time of making the mixture and 1 

hour later was measured in a quartz cuvette using a UV- 

Vis spectrophotometer. As a control, radical scavenging 

activity towards the toluenic DPPH solution was measured 

without addition of sample. Percent inhibition was 

calculated by comparing the absorbance after 1 hour of the 

control to each of the test samples: 

$$\%\ \text{inhibition} = \frac{\text{abs of control} - \text{abs of test sample}}{\text{abs of control}}$$

RSA was measured as the difference in 515 nm 

absorption between the beginning and end of the assay. 

RSA was compared between each time-point taken for 

each sample (Ramadan et al., 2012). 
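The percent inhibition calculation can be sketched in a few lines. This is an illustration only: the absorbance readings are hypothetical, and the factor of 100 is an assumption made here to express the ratio defined above as a percentage.

```python
def percent_inhibition(abs_control, abs_sample):
    """Percent inhibition from 515 nm absorbances read after 1 hour.

    The ratio follows the definition in the text; multiplying by 100
    to report a percentage is an assumption.
    """
    if abs_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (abs_control - abs_sample) / abs_control * 100.0


# Hypothetical absorbances for one formulation at two time points.
control_abs = 0.80
sample_abs = {"day 0": 0.20, "day 12": 0.50}
results = {day: percent_inhibition(control_abs, a) for day, a in sample_abs.items()}
for day, value in results.items():
    print(f"{day}: {value:.1f}% inhibition")
```

A lower sample absorbance means more DPPH radical has been scavenged, so a drop in percent inhibition between time points corresponds to a loss of antioxidant activity.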


3. Results and Discussion 

In order to determine if quercetin would likely maintain 

its antioxidative properties while encapsulated by beta-cyclodextrin, 

molecular docking was computationally 

modelled. In the lowest energy conformation, the 4′ 

hydroxyl group of quercetin, shown in light blue, extends 

out of the cyclodextrin ring, while the rest of the molecule 

sits inside the cyclodextrin (Fig. 5). This suggests that 

beta-cyclodextrin can protect quercetin from degradation 

without compromising its effectiveness as an antioxidant. 

Figure 5. Molecular docking model of quercetin in beta-cyclodextrin. Beta-cyclodextrin is shown in pink, quercetin is shown in yellow, and the 4′ hydroxyl group of quercetin is shown in light blue. The 4′ hydroxyl group is key to quercetin’s antioxidative effect.

According to the hypothesis, the doubly encapsulated quercetin formulation would scavenge radicals more effectively than quercetin or lecithin alone before the acceleration test, and have a smaller decrease in radical scavenging activity over time than the quercetin-lecithin phenolipid formulation. The entrapment of quercetin in beta-cyclodextrin was successful, and the entrapment efficiency was determined by UV absorbance to be 45%.

Radical scavenging activity assay conducted on the day the complexes were mixed in sunflower oil revealed that the quercetin-lecithin phenolipid formulation had the highest radical scavenging activity, followed by the novel double encapsulation formulation. Quercetin and soy lecithin alone had similar radical scavenging activity results (Fig. 6).

Figure 6. Radical scavenging activity assay was conducted immediately after mixing the antioxidant formulations in sunflower oil.

Samples in the Schaal oven accelerated oxidation test were withdrawn at 3, 6, and 12 days and assayed for radical scavenging activity. After incubation for 12 days, the doubly encapsulated quercetin sample had the highest radical scavenging activity with the least decrease in activity over time, while the samples that did not include cyclodextrin scavenged radicals less effectively after 12 days, degrading more quickly. The quercetin encapsulated with cyclodextrin without lecithin also showed increased antioxidative lifetime compared to quercetin alone (Fig. 7, 8).

Figure 7. Radical scavenging activity was assayed at 3, 6, and 12 days after initiating the accelerated oxidation test.

Figure 8. Decrease of RSA after 12 days of oxidation test compared to RSA before oxidation test was begun.



The higher initial RSA of the doubly-encapsulated 

quercetin compared to quercetin and lecithin alone 

suggests that the polar head of phosphatidylcholine did 

hydrogen bond to the exterior hydroxyl groups of the 

quercetin-cyclodextrin complex, dispersing the quercetin 

in the sunflower oil as hypothesized (Fig. 6). The initial 

RSA of the quercetin-cyclodextrin complex without 

lecithin was lowest of the groups tested, which was 

expected since cyclodextrin increases the water-solubility 

of quercetin, decreasing its availability in sunflower oil. 

The higher antioxidative lifetimes of both the 

doubly encapsulated quercetin and the single quercetin-cyclodextrin 

complex suggest that beta-cyclodextrin 

provides protection to quercetin from degradation in 

sunflower oil, consistent with the hypothesis (Fig. 7, 8). 

4. Conclusion 

The use of cyclodextrins in the protection of flavonoid-class 

antioxidants from degradation in lipid matrices has 

been unexplored, as have phenolipid bonds between 

cyclodextrins and phospholipids. The novel phenolipid 

formulation constructed in this experiment, consisting of 

quercetin doubly encapsulated in beta-cyclodextrin and 

soy lecithin, had a higher radical scavenging activity than 

quercetin or soy lecithin alone and a higher antioxidative 

lifetime than known phenolipid formulations of quercetin 

and lecithin. These results indicate that cyclodextrins 

can increase the antioxidative lifetime of flavonoids 

without compromising antioxidative ability if paired 

with a phospholipid to disperse the complex in the 

lipid matrix, opening up new avenues of lipid oxidation 

research with applications in food oils. Future work 

would include repeating the accelerated oxidation test 

and radical scavenging activity assays for improved 

statistical significance, as well as testing different types of 

polyphenols, phospholipids, and oils to determine whether 

the same effects are observed. 


5. Acknowledgments

I would like to thank Dr. Michael Bruno for selecting 

me for the Research in Chemistry program and providing 

guidance throughout the development and execution of my 

project. I would also like to thank the NCSSM Foundation 

for providing funding for the purchase of materials and 

equipment used in my experimentation. 

6. References 

Di Donato, C., et al. (2016). Alpha- and Beta-Cyclodextrin 

Inclusion Complexes with 5-Fluorouracil: Characterization 

and Cytotoxic Activity Evaluation. Molecules, 21(12), 

1644. doi:10.3390/molecules21121644 

Judde, A., Villeneuve, P., Rossignol-Castera, A., & Guillou, 

A. L. (2003). Antioxidant effect of soy lecithins on vegetable 

oil stability and their synergism with tocopherols. Journal 

of the American Oil Chemists Society, 80(12), 1209-1215. 

doi:10.1007/s11746-003-0844-4 

Kahveci, D., Laguerre, M., & Villeneuve, P. (2015). 

Phenolipids as New Antioxidants: Production, Activity, 

and Potential Applications. Polar Lipids, 185-214. 

doi:10.1016/b978-1-63067-044-3.50011-x 

National Toxicology Program (2001). Carcinogens 

Nominated for 11th Report on Carcinogens. JNCI Journal 

of the National Cancer Institute, 93(18), 1372-1372. 

doi:10.1093/jnci/93.18.1372-a 

Ozgen, S., Kilinc, O. K., & Selamoğlu, Z. (2016). Antioxidant 

Activity of Quercetin: A Mechanistic Review. Turkish 

Journal of Agriculture - Food Science and Technology, 

4(12), 1134. doi:10.24925/turjaf.v4i12.1134-1138.1069 

Panya, A., Laguerre, M., Bayrasy, C., Lecomte, J., Villeneuve, P., McClements, D. J., & Decker, E. A. (2012). An Investigation of the Versatile Antioxidant Mechanisms of Action of Rosmarinate Alkyl Esters in Oil-in-Water Emulsions. Journal of Agricultural and Food Chemistry, 60(10), 2692-2700. doi:10.1021/jf204848b

Ramadan, M. F. (2012). Antioxidant characteristics 

of phenolipids (quercetin-enriched lecithin) in lipid 

matrices. Industrial Crops and Products, 36(1), 363-369. 

doi:10.1016/j.indcrop.2011.10.008 

Santos, E. H., Kamimura, J. A., Hill, L. E., & Gomes, C. 

L. (2015). Characterization of carvacrol beta-cyclodextrin 

inclusion complexes as delivery systems for antibacterial 

and antioxidant applications. LWT - Food Science and 

Technology, 60(1), 583-592. doi:10.1016/j.lwt.2014.08.046 

Tanhuanpää, K., Cheng, K. H., Anttonen, K., Virtanen, J. A., & Somerharju, P. (2001). Characteristics of Pyrene Phospholipid/γ-Cyclodextrin Complex. Biophysical Journal, 81(3), 1501-1510. doi:10.1016/s0006-3495(01)75804-3

Zheng, Y., Haworth, I. S., Zuo, Z., Chow, M. S., & 

Chow, A. H. (2005). Physicochemical and Structural 

Characterization of Quercetin-β-Cyclodextrin Complexes. 

Journal of Pharmaceutical Sciences, 94(5), 1079-1089. 

doi:10.1002/jps.20325 


UTILIZATION OF ATOMIC LAYER DEPOSITION TO 

CREATE NOVEL METAL OXIDE PHOTOANODES FOR 

SOLAR-DRIVEN WATER SPLITTING 

Annie Wang 

Abstract 

A major obstacle of dye-sensitized photoelectrosynthesis cells is the recombination of 60% of the injected electrons from the dye into the photoanode. Creating core/shell structures is one technique of slowing down electron recombination. There has been no work done on TiO₂/SnO₂ structures or on TiO₂/TiO₂ structures using atomic layer deposition, so the aim of the project was to successfully deposit these materials, optimize the deposition, and compare the behavior of the structures to the standard SnO₂/TiO₂ core/shell. Novel deposition of TiO₂ and SnO₂ onto mesoporous TiO₂ thin films was achieved using atomic layer deposition with the TDMAT and TDMASn precursors. Subsequently, the dye loading capabilities of the core/shell structures were measured after being loaded with the RuP chromophore. The samples were characterized through XPS after varying deposition parameters to optimize deposition conditions in order to create TiO₂ and SnO₂ shells of comparable thicknesses. Dye loading onto TiO₂/TiO₂ was found to be affected by parameters other than pore size, including type of TiO₂ used and processing conditions. Deposition of SnO₂ initially resulted in SnO, but TiO₂/SnO₂ structures were able to be synthesized by using dyesol TiO₂ instead of mixed-phase TiO₂. The successfully created TiO₂/SnO₂ and TiO₂/TiO₂ core/shells can be studied to differentiate competing electron recombination theories.


1. Introduction

As the world becomes increasingly dependent on a dwindling supply of nonrenewable energy sources, clean energy is the only viable long-term option. A promising method for solar energy conversion is the use of dye-sensitized photoelectrosynthesis cells (DSPECs) (Brennaman et al., 2016). The DSPEC shares design features and principles with the dye-sensitized solar cell (DSSC) and, although less developed, holds much promise for the future of solar energy conversion. Photoelectrosynthesis cells convert light to chemical energy in the form of stored hydrogen fuel. Rather than producing electrical energy as in solar cells, DSPECs use photons from sunlight to split water into hydrogen and oxygen gases (Fujishima & Honda, 1972). The oxidation of water occurs at the anode and the reduction of protons to hydrogen occurs at the cathode. The key advantage of this model is that hydrogen can be stored as chemical fuel for future use. Photoanodes used in these cells are often made of metal oxide semiconductors due to their ability to form high surface area films, their ability to accept photoinjected electrons from dye molecules, and their transparency in the visible spectrum, which results from their wide band gap energies (Ashford et al., 2015).

In addition, a crucial component of the DSPEC is the electron injection from chromophores (dye molecules) attached to the surface of the mesoporous (containing pores with diameters between 2 and 50 nm) film into the semiconductor. It is therefore essential to minimize undesired back electron transfer (BET) in these devices. Back electron transfer, or electron recombination, occurs when electrons injected into the semiconductor conduction band recombine with the oxidized dye, which ultimately lowers DSPEC performance because the electrons are not able to travel to the cathode to reduce protons to hydrogen.

One technique used to slow BET rates in DSPECs is the application of SnO₂/TiO₂ core/shell photoanode structures (Bakke et al., 2011). Core/shell structures allow for electron injection without interference while maintaining a barrier against electron recombination. Many past studies have shown that these structures greatly reduce back electron transfer and enhance DSPEC efficiencies (Gish et al., 2016). There is still much debate over the underlying theory of how electron recombination is reduced in core/shell structures. Two competing theories, shown in Figures 1a and 1b, are the band edge offset model (proposing an energy barrier created by the difference in band edge between the core and shell) and a model proposing the existence of a unique electronic state at the core/shell interface (James et al., 2018).

To study this more closely, it is therefore necessary to 

create samples with different band edges for the core and 

shell as well as structures with the core and shell made 

of the same material in order to compare their electron 

kinetics. In addition, the different samples must have 

comparable shell thicknesses. 


Figure 1a. In the band edge model, an energy barrier 

created by conduction band (CB) edge differences 

prevents electrons from traveling back and 

recombining from the fluorine doped tin oxide 

(FTO). 

Figure 1b. In this model, a unique electronic state 

between the core and shell (left) exhibits special 

properties that cause a change in electron transfer 

behavior, in contrast to electronic states within the 

core and shell (right). 

Atomic layer deposition (ALD) is one method of creating core/shell structures (George, 2010). The technique involves depositing the shell layer onto nanoparticles through successive self-limiting reactions on the surface of the material. ALD consists of multiple cycles of precursor pulsing and purging to obtain extremely precise monolayers on the Angstrom scale. Due to its self-limiting nature, ALD produces very smooth, conformal films because all parts of the surface react completely with the precursor to grow the film (Wang et al., 2017). ALD has been used widely to create metal oxide films such as Al₂O₃, TiO₂, ZnO, ZrO₂, SiO₂, and VO₂ (George, 2010). This study will mainly focus on deposition of SnO₂ and TiO₂ on TiO₂. TiO₂ has thus far produced the highest light conversion efficiencies out of all the metal oxides, and is widely used in DSSCs as a photoanode (Jafari et al., 2016). SnO₂ also has favorable characteristics for its anodic abilities, such as its stability, high reversible capacity, nontoxicity, and low cost (Knauf et al., 2015).

The goal of this study was to successfully synthesize and characterize TiO₂/SnO₂ and TiO₂/TiO₂ core/shell nanostructures using tetrakis(dimethylamido)titanium (TDMAT) and tetrakis(dimethylamido)tin(IV) (TDMASn) precursors. Since TiO₂/SnO₂ has not been created before, the hypothesis was that TiO₂/SnO₂ would behave similarly to the more common SnO₂/TiO₂ core/shells and would help differentiate the mechanism actually in use by core/shell structures to inhibit recombination. Previously, it had been common practice to deposit TiO₂ onto TiO₂ by treating the TiO₂ thin film with a TiCl₄ chemical bath deposition, which was demonstrated to reduce back electron transfer (Lee et al., 2012). However, this method is very unreliable and difficult to control. According to the hypothesis, it would be possible to create TiO₂/TiO₂ core/shell structures using ALD for the first time, which would allow a much more controlled deposition while still reducing electron recombination. The second aim of this project was therefore to deposit TiO₂ on TiO₂ using solely atomic layer deposition, a much more controllable and reproducible method.

The TiO₂/TiO₂ deposition was in fact found to be successful without necessitating the TiCl₄ treatment which was previously utilized to create TiO₂/TiO₂ structures. It was also found that using the TDMASn precursor to deposit tin resulted in stannous oxide (SnO) rather than the expected SnO₂. After thorough studies, the stannous oxide was successfully removed by using pure anatase dyesol TiO₂ paste, a commercial paste, for the thin films instead of mixed-phase TiO₂. In addition, dye loading was measured for each of the samples. It was found that dye loading in TiO₂/TiO₂ slides does not decrease consistently as in SnO₂/TiO₂ slides, so there are other factors besides pore size that have an effect on dye loading.

2. Materials and Methods 

2.1 – Thin Film Preparation 

Fluorine-doped tin oxide (FTO) glass plates were cleaned in an ultrasonic bath, immersed first in ethanol and then in acetone, for 20 minutes each. Previously prepared TiO₂ paste was coated on the slides through doctor blading and tape-casting. The thin films were stored in a 125°C oven to prevent water adsorption on the TiO₂. They were then sintered at 450°C for 60 minutes with a 120 minute ramp-up time. Selected films were annealed at 450°C for 30 minutes with a 120 minute ramp-up time.


2.2 – Atomic Layer Deposition 

Atomic layer deposition was conducted using 

an Ultratech/Cambridge Nanotech Savannah S200. 

TDMASn and TDMAT precursor reactant gases were 

transported to the reactor chamber through heated gas 

lines using nitrogen carrier flow. Nitrogen gas was used 

to purge the reactant chamber after each precursor step. 

Deposition was performed at 150°C while TDMAT and 

TDMASn were held at 75°C and 60°C, respectively. Gas

flow and purge times were controlled electronically by a 

LabVIEW sequencer. 
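As a rough illustration of the pulse-and-purge cycle described above, the sketch below models one ALD recipe and estimates run time and nominal shell thickness. The `AldRecipe` class and its field names are hypothetical, and the numbers are those of the slowest-growth SnO₂ recipe reported in Section 3.2. It assumes that a hold and a purge follow each of the two precursor pulses and that growth is linear in the number of cycles; neither is stated in the sequencer description here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AldRecipe:
    """Hypothetical container for the timing of one ALD cycle."""
    metal_pulse_s: float        # e.g. TDMASn or TDMAT pulse
    oxidant_pulse_s: float      # e.g. water pulse
    hold_s: float               # exposure (hold) time after a pulse
    purge_s: float              # nitrogen purge time after a hold
    growth_per_cycle_nm: float  # measured growth per cycle

    def cycle_time_s(self) -> float:
        # Assumption: each of the two pulses is followed by a hold and a purge.
        return (self.metal_pulse_s + self.oxidant_pulse_s
                + 2 * (self.hold_s + self.purge_s))

    def thickness_nm(self, cycles: int) -> float:
        # Assumption: ideal self-limiting growth, linear in cycle count.
        return cycles * self.growth_per_cycle_nm


# Values reported in Section 3.2 for the lowest-growth SnO2 recipe.
slow_sno2 = AldRecipe(0.1, 0.02, 20.0, 60.0, 0.07)
cycles = 15
print(f"nominal thickness: {slow_sno2.thickness_nm(cycles):.2f} nm")
print(f"run time: {slow_sno2.cycle_time_s() * cycles / 60:.1f} min")
```

A linear model of this kind would not describe the island-like growth reported at 250°C, so it is only meaningful for well-behaved recipes.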

2.3 – Dye Loading

The RuP chromophore was loaded onto the films by soaking the slides in anhydrous methanol solutions containing 0.0003 M RuP for several days. The slides were removed and subsequently soaked in methanol to remove unadsorbed dye. UV-vis absorbances of the dye-loaded thin films were taken in 0.1 M HClO₄ using a Cary 60 UV-vis absorbance spectrophotometer.

2.4 – Characterization

Profilometry measurements were done with a Bruker Optics DektakXT® stylus profiler. All films were 4-6 μm thick. Characterization of the deposited thin films was done through infrared spectroscopy using a Bruker Optics Alpha FTIR Spectrometer, transmission electron microscopy using a TEM JEOL 2010F-FasTEM, X-ray photoelectron spectroscopy (XPS) using a Kratos Axis Ultra DLD X-ray Photoelectron Spectrometer, and Raman spectroscopy using a Renishaw inVia Raman microscope. Ellipsometry to measure ALD-deposited shell thickness was also conducted using a JA Woollam ellipsometer. All data were analyzed using Igor Pro (WaveMetrics Inc.).

Figure 2a. Infrared spectrum of TiO₂/TiO₂ core/shells of various numbers of ALD cycles confirming successful deposition.

3. Results

The goal of this project was to deposit both SnO₂ and TiO₂ onto mesoporous TiO₂ thin films using atomic layer deposition (ALD).

3.1 – TiO₂/TiO₂ Deposition

The TiO₂/TiO₂ deposition was successfully achieved using ALD with the TDMAT and water precursors. The slides were characterized using FTIR and TEM and confirmed to have shells made of the correct material (Fig. 2). Following this, the dye loading of the samples was measured (Fig. 3). The results reveal different trends from those of the more commonly studied SnO₂/TiO₂ core/shell structures. While the data show a clear continuous decrease in dye loading of SnO₂/TiO₂ with increasing ALD cycles, the dye loading of TiO₂/TiO₂ increases from 0 to 25 cycles and then decreases. This is inconsistent with previous theories that suggested dye loading always decreases with increasing ALD cycles because pore sizes are minimized with additional shell coatings. After examining the results, it can be concluded that the change in pore size is not the only factor affecting dye loading levels. Rather, it is hypothesized that there is an uneven preferential deposition of TiO₂ onto the TiO₂ causing increased dye loading, which does not occur on the SnO₂ thin films. The decrease in dye loading from 25 to 45 cycles can be attributed to the decreased pore size, the effect of which eventually outweighs that of the preferential deposition and leads to an overall decrease in dye loading.
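The paper does not state how film absorbance was converted into a dye loading value. One common convention, assumed here, treats the film as a Beer-Lambert absorber so that surface coverage is Γ = A / (1000 · ε) in mol/cm². The function name and the extinction coefficient below are placeholders, not measured values for RuP.

```python
def surface_coverage_mol_per_cm2(absorbance: float,
                                 epsilon_per_m_per_cm: float) -> float:
    """Return dye surface coverage (mol/cm^2) from a film absorbance.

    Assumes Beer-Lambert behavior, A = epsilon * c * l. Because
    1 L = 1000 cm^3, the product c * l expressed in mol/cm^2 equals
    A / (1000 * epsilon).
    """
    if epsilon_per_m_per_cm <= 0:
        raise ValueError("extinction coefficient must be positive")
    return absorbance / (1000.0 * epsilon_per_m_per_cm)


# Placeholder extinction coefficient (M^-1 cm^-1); substitute the value for
# the chromophore and wavelength actually measured.
EPSILON_PLACEHOLDER = 1.0e4

coverage = surface_coverage_mol_per_cm2(1.0, EPSILON_PLACEHOLDER)
print(f"{coverage:.2e} mol/cm^2")
```

Under this convention, comparing samples at a fixed wavelength reduces to comparing their absorbances, which is why relative trends across ALD cycle counts do not depend on the exact coefficient chosen.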


Figure 2b. TEM image of an anatase TiO₂ nanoparticle with an amorphous TiO₂ shell created using 20 ALD cycles.

Figure 3. Dye loading of mixed phase TiO₂/TiO₂ samples and SnO₂/TiO₂ samples.

This phenomenon of an initial increase then decrease in dye loading was further studied with dyesol (pure-phase) TiO₂/TiO₂ samples, annealed and unannealed, as well as annealed mixed phase TiO₂/TiO₂ samples (Fig. 4). Dyesol TiO₂ contains larger pores and anneals more easily than mixed-phase TiO₂. In addition, mixed-phase TiO₂ creates a less well-connected film. All of the samples exhibited higher dye loading than the unannealed mixed phase TiO₂/TiO₂ samples. The data clearly display an overall trend for each sample type. The unannealed dyesol slides increase initially in dye loading but decrease starting at 35 cycles, while the annealed dyesol slides demonstrate the same behavior but do not decrease in dye loading until 40 cycles. These results are consistent with the trends observed for the unannealed mixed phase TiO₂/TiO₂ structures. The annealed mixed phase TiO₂/TiO₂ samples, however, continuously decrease in dye loading from 0 all the way to 50 cycles, suggesting that annealing the samples affects the dye loading behavior of mixed phase TiO₂. Based on the results, it can be concluded that dye loading is not solely determined by pore size and can be affected by different processing conditions as well as the type of TiO₂ used.

Figure 4a. Dye loading on dyesol TiO₂/TiO₂ slides created with dyesol TiO₂ paste, unannealed.

Figure 4b. Dye loading on dyesol TiO₂/TiO₂ slides created with dyesol TiO₂ paste, annealed.

Figure 4c. Dye loading on TiO₂/TiO₂ slides created with mixed phase TiO₂ paste, annealed.

3.2 – TiO₂/SnO₂ Deposition

The TDMASn precursor deposition was initially performed with the standard recipe used for the mixed phase TiO₂/TiO₂ structures and resulted in a brown layer on the slides, which is not the normal appearance of SnO₂ shells. Upon further characterization, the layer was found to be SnO. The SnO formation can be attributed to the poor oxidative properties of water. Furthermore, the ALD growth rate of SnO₂ is naturally higher than that of TiO₂. When the same recipe is used for depositing both SnO₂ and TiO₂, there is a larger growth per cycle for SnO₂. Thus, it was necessary to vary the ALD conditions in order to gain a better understanding of how each parameter affects growth so the SnO₂ growth per cycle could be equalized to the TiO₂ growth per cycle.

Numerous attempts were made to first convert the SnO to SnO₂ by changing the deposition parameters. The TDMASn deposition was initially attempted using an ozone precursor and using a combination of water and ozone precursors, but both approaches still resulted in SnO shells instead of SnO₂. In addition, an increase of the ALD reactor chamber temperature from 150°C to 250°C resulted in an uncontrolled island-like growth of the SnO, as calculated from ellipsometry measurements of the shell thicknesses. The exponential growth of the SnO thicknesses for the 250°C samples (Fig. 5) indicates the uncontrolled nature of the deposition, which often results in island-like growth rather than a smooth conformal coating as desired.

After testing the effects of increasing the reactor temperature and using the ozone precursor, a post-deposition heat treatment was administered at 210°C in an attempt to remove the SnO, but this was unsuccessful and resulted in increased SnO peaks in the Raman spectra of the sample (Fig. 6). Next, the films were annealed at 450°C in an effort to convert the existing SnO to SnO₂. Although the SnO was successfully converted, the annealing process led to delamination of the TiO₂. This occurred because the extreme heat induced expansion of the TiO₂, but the rigidity of the crystal structure forced the TiO₂ to eventually crack and delaminate from the slide due to internal pressure.

Figure 5. Comparison of growth rates of SnO on 

planar silicon at 150°C and 250°C based on ellipsometry 

of shell thickness. 

Figure 6. Raman spectra characterizing TiO₂/SnO₂ samples before and after heat treatments.

The SnO₂ deposition was then attempted using a H₂O₂ precursor instead of the water precursor with other varying parameters. H₂O₂ is a stronger oxidant than water and does not degrade as easily as ozone, so it offered a possible option to convert the SnO to SnO₂ during the ALD process. In order to study the effects of varying each parameter, samples with varying precursor pulse, hold, and purge times were created and analyzed through XPS and ellipsometry on planar silicon. X-ray photoelectron spectroscopy (XPS) is a spectroscopic technique that is used to analyze the elemental composition of the surface of a material by measuring the kinetic energy of escaped electrons after focusing a beam of X-rays onto the material, while ellipsometry is an optical technique used to measure thin film thickness.

Table 1. XPS atomic concentrations obtained for each sample created with different ALD deposition parameters using the H₂O₂ precursor.

TDMASn Pulse | H₂O₂ Pulse            | Ti Atomic Concentration (%) | Sn Atomic Concentration (%) | Sn/Ti Atomic Ratio
0.5 sec      | 0.02 sec              | 10.84                       | 18.89                       | 1.74
0.5 sec      | 0.1 sec               | 13.6                        | 16.98                       | 1.25
0.5 sec      | 1.0 sec               | 6.1                         | 24.54                       | 4.02
0.1 sec      | 1.0 sec               | 9.69                        | 21.01                       | 2.17
0.5 sec      | 0.02 sec, 40 sec hold | 15.75                       | 15.28                       | 0.97
0.5 sec      | 0.5 sec, 60 sec hold  | 15.66                       | 15.13                       | 0.97



Table 1 shows the atomic concentrations and Sn/Ti ratio obtained through the XPS analysis. The atomic concentrations for Sn using the H₂O₂ precursor were much higher than the typical values encountered during ALD deposition, indicating that the precursor is extremely reactive and causes highly uncontrolled growth onto the films. In this case, increasing the TDMASn pulse increased growth. Increasing the H₂O₂ precursor pulse from 0.02 seconds to 0.1 seconds did not have much effect on growth, but increasing its pulse time from 0.1 seconds to 1.0 seconds increased SnO₂ deposition significantly. Based on the data, it is clear that H₂O₂ results in heavy growth of the shells, but the growth is most likely uneven. Another weakness of H₂O₂ is its inconsistency as a precursor because of its tendency to disproportionate in the precursor cylinder to water and O₂. H₂O₂ is overall not an optimal precursor to use in SnO₂ deposition on TiO₂ for the purposes of this study, but holds promise for future research.
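The Sn/Ti column of Table 1 follows directly from the two atomic concentration columns. The short sketch below recomputes it from the tabulated values; the variable and function names are illustrative only.

```python
# (TDMASn pulse, H2O2 pulse, Ti at.%, Sn at.%) transcribed from Table 1.
TABLE_1 = [
    ("0.5 s", "0.02 s", 10.84, 18.89),
    ("0.5 s", "0.1 s", 13.60, 16.98),
    ("0.5 s", "1.0 s", 6.10, 24.54),
    ("0.1 s", "1.0 s", 9.69, 21.01),
    ("0.5 s", "0.02 s, 40 s hold", 15.75, 15.28),
    ("0.5 s", "0.5 s, 60 s hold", 15.66, 15.13),
]


def sn_ti_ratio(ti_percent: float, sn_percent: float) -> float:
    """Atomic ratio of Sn to Ti from XPS atomic concentrations."""
    return sn_percent / ti_percent


ratios = [round(sn_ti_ratio(ti, sn), 2) for _, _, ti, sn in TABLE_1]

for (tdmasn, h2o2, _, _), ratio in zip(TABLE_1, ratios):
    print(f"TDMASn {tdmasn:>5} | H2O2 {h2o2:<18} | Sn/Ti = {ratio:.2f}")
```

A higher ratio means more of the XPS signal comes from the Sn-containing shell relative to the TiO₂ core, so the ratio serves as a proxy for how much shell material was deposited.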

The deposition was further optimized by utilizing dyesol TiO₂ paste to doctor blade the slides instead of mixed-phase TiO₂, because of the characteristics of dyesol TiO₂ as a pure phase substance. This left a slight amount of SnO on the films immediately after deposition, but the SnO was completely removed after heating slightly at 200°C. Unlike the mixed phase TiO₂/SnO₂ structures, the dyesol TiO₂/SnO₂ did not require annealing to convert the SnO to SnO₂, which would have been impractical for real-world purposes.

The dyesol TiO₂/SnO₂ was then characterized using XPS to confirm that the correct form of the material was deposited. The correct peak for Sn⁴⁺ was observed at 486.3 eV (Fig. 7), which was extremely close to the recorded value of 486.6 eV (Stranick & Moskwa, 1993).

Figure 7. XPS spectra of Sn 3d region, displaying peak at 486.3 eV.

In addition, the atomic concentrations of Ti, Sn, and O were collected (Table 2). If the deposited material was all SnO₂, the ratio of (Ti% + Sn%):O% should be 1:2. The ratios calculated for the samples were all very close to 0.5, confirming that the correct form of Sn⁴⁺ was formed and not Sn²⁺.

Table 2. Atomic concentrations of Ti, Sn, O obtained through XPS of dyesol TiO₂/SnO₂ samples.

ALD Cycles | Ti Atomic Conc. (%) | Sn Atomic Conc. (%) | O Atomic Conc. (%) | (Ti% + Sn%) / O%
30         | 8.17                | 24.27               | 61.96              | 0.52
40         | 3.48                | 30.78               | 59.63              | 0.57
50         | 3.91                | 30.47               | 60.18              | 0.57

Samples created using varied parameters were analyzed again using XPS and ellipsometry to determine the effect of changing each condition on deposition of SnO₂ using dyesol TiO₂. Figure 8 shows the effect of changing each ALD parameter other than temperature on both the thickness of SnO₂ deposited on planar silicon, obtained through ellipsometry, and the ratio of Sn to Ti atomic concentrations from TiO₂/SnO₂ samples, determined by XPS. A lower growth rate is desired for this deposition because the SnO₂ shell naturally is thicker than the TiO₂ shell, but they should be similar thicknesses in order to compare their electron transfer kinetics. The optimal hold time is around 60 seconds for decreasing SnO₂ thickness. The decreased growth caused by both increased hold and purge time is likely due to removal of moisture and impurities introduced into the chamber during the pulse and hold times. The lowest growth rate occurred on the sample with a 0.1 second TDMASn pulse, 0.02 second H₂O pulse, 20 second hold time, and 60 second purge time. This recipe resulted in a growth rate of 0.07 nm per cycle, down from the 0.09 nm per cycle growth rate achieved with the standard recipe used for TiO₂ deposition.

Figure 8. SnO₂ shell thickness determined by ellipsometry (left axis) and atomic ratio of Sn to Ti determined by XPS (right axis) with varying ALD conditions, at 15 cycles.
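The stoichiometry check applied to Table 2 can be written out explicitly. The sketch below recomputes the (Ti% + Sn%)/O% column and its distance from the ideal value of 0.5 expected for a pure MO₂ oxide; the names are illustrative, and what counts as "close enough" to 0.5 is left to the reader because the paper gives no numerical tolerance.

```python
# ALD cycles -> (Ti at.%, Sn at.%, O at.%) transcribed from Table 2.
TABLE_2 = {
    30: (8.17, 24.27, 61.96),
    40: (3.48, 30.78, 59.63),
    50: (3.91, 30.47, 60.18),
}

# One metal atom per two oxygen atoms in TiO2 and SnO2.
IDEAL_MO2_RATIO = 0.5


def metal_to_oxygen_ratio(ti: float, sn: float, o: float) -> float:
    """Ratio of total metal to oxygen atomic concentration."""
    return (ti + sn) / o


ratios = {cycles: round(metal_to_oxygen_ratio(*conc), 2)
          for cycles, conc in TABLE_2.items()}

for cycles, ratio in ratios.items():
    deviation = ratio - IDEAL_MO2_RATIO
    print(f"{cycles} cycles: ratio = {ratio:.2f}, deviation = {deviation:+.2f}")
```

For comparison, a shell made entirely of SnO on a TiO₂ core would push the ratio above 0.5, toward the 1:1 metal-to-oxygen ratio of SnO, which is the direction of the small deviations seen here.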


4. Conclusion 

Atomic layer deposition was conducted to create novel TiO₂/TiO₂ and TiO₂/SnO₂ core/shell structures. Dye loading studies conducted on TiO₂/TiO₂ with the RuP chromophore revealed that dye loading in TiO₂/TiO₂ increases to a certain point and then decreases, in contrast to SnO₂/TiO₂, which shows continuously decreasing dye loading that was attributed to decreasing pore size. This inconsistency suggests the importance of multiple other factors such as processing conditions and the type of TiO₂ used to synthesize the core. Moreover, initial attempts to create TiO₂/SnO₂ resulted in the formation of SnO, but this was removed by using dyesol TiO₂ to create the thin films rather than mixed phase TiO₂. The effects of each ALD parameter were studied to create films of similar thicknesses for both TiO₂/SnO₂ and TiO₂/TiO₂, and the growth rate of the SnO₂ was decreased relative to the standard recipe. Future work will include transient absorption spectroscopy in order to understand the differences in the dynamics of interfacial electron kinetics between the TiO₂/TiO₂ and TiO₂/SnO₂ structures. In addition, the electron kinetics should be studied in core/shells of various other oxide materials.


5. Acknowledgments

We would like to thank Dr. Jillian Dempsey as well as

Michael Mortelliti for their incredible mentorship over 

this project. This work was performed in part at the 

Chapel Hill Analytical and Nanofabrication Laboratory, 

CHANL, a member of the North Carolina Research 

Triangle Nanotechnology Network, RTNN, which is 

supported by the National Science Foundation, Grant 

ECCS-1542015, as part of the National Nanotechnology 

Coordinated Infrastructure, NNCI. In addition, the 

project was funded by a grant from the RTNN Kickstarter 

Program for fabrication & analytical costs. 

6. References 

Ashford, D. L., Gish, M. K., Vannucci, A. K., Brennaman, M. K., Templeton, J. L., Papanikolas, J. M., & Meyer, T. J. (2015). Molecular Chromophore-Catalyst Assemblies for Solar Fuel Applications. Chemical Reviews, 115(23), 13006–13049. https://doi.org/10.1021/acs.chemrev.5b00229

Brennaman, M. K., et al. (2016). Finding the Way to Solar Fuels with Dye-Sensitized Photoelectrosynthesis Cells. Journal of the American Chemical Society, 138(40), 13085–13102. https://doi.org/10.1021/jacs.6b06466

Fujishima, A., & Honda, K. (1972). Electrochemical Photolysis of Water at a Semiconductor Electrode. Nature, 238(5358), 37–38. https://doi.org/10.1038/238037a0

George, S. M. (2010). Atomic Layer Deposition: An Overview. Chemical Reviews, 110(1), 111–131. https://doi.org/10.1021/cr900056b

Gish, M. K., Lapides, A. M., Brennaman, M. K., Templeton, J. L., Meyer, T. J., & Papanikolas, J. M. (2016). Ultrafast Recombination Dynamics in Dye-Sensitized SnO₂/TiO₂ Core/Shell Films. The Journal of Physical Chemistry Letters, 7(24), 5297–5301. https://doi.org/10.1021/acs.jpclett.6b02388

Jafari, T., Moharreri, E., Amin, A. S., Miao, R., Song, W., & Suib, S. L. (2016). Photocatalytic water splitting - The untamed dream: A review of recent advances. Molecules, 21(7). https://doi.org/10.3390/molecules21070900

James, E. M., Barr, T. J., & Meyer, G. J. (2018). Evidence for an Electronic State at the Interface between the SnO₂ Core and the TiO₂ Shell in Mesoporous SnO₂/TiO₂ Thin Films. ACS Applied Energy Materials, acsaem.7b00274. https://doi.org/10.1021/acsaem.7b00274

Knauf, R. R., Kalanyan, B., Parsons, G. N., & Dempsey, J. L. (2015). Charge Recombination Dynamics in Sensitized SnO₂/TiO₂ Core/Shell Photoanodes. Journal of Physical Chemistry C, 119(51), 28353–28360. https://doi.org/10.1021/acs.jpcc.5b10574

Stranick, M. A., & Moskwa, A. (1993). SnO₂ by XPS. Surface Science Spectra, 2(1), 50–54. https://doi.org/10.1116/1.1247724

Wang, D., et al. (2017). Layer-by-Layer Molecular Assemblies for Dye-Sensitized Photoelectrosynthesis Cells Prepared by Atomic Layer Deposition. Journal of the American Chemical Society, 139(41), 14518–14525. https://doi.org/10.1021/jacs.7b07216



USING A HYBRID MACHINE LEARNING APPROACH 

FOR TEST COST OPTIMIZATION IN SCAN CHAIN 

TESTING 

Luke Duan 

Abstract 

Continual technological advances have led to more complex microchip designs, which, in turn, have led to the need for more complex fault testing. As a result, higher testing costs (increased test time and data volume) have emerged as well. This work examines one application of hybrid machine learning (ML) to optimize the costs of scan chain testing. We used fifty-one benchmark circuits to train the models and analyze their performance. We generated training data by performing scan chain test simulations on each of these circuits using the Mentor Graphics tools DFTAdvisor and FastScan and compiled the results into files readable by the ML framework Weka. We then trained three individual ML models and evaluated their accuracies by comparing their predictions against a test set. Finally, we created a hybrid model by combining these individual models, with different weights allotted to each model based on its individual accuracy. Findings showed that there was a slight increase in performance by using a hybrid approach. We concluded that this method can be improved by using larger training sets and better heuristic algorithms when assigning weights. This research could be useful for the microchip industry by reducing time-to-market.


1. Introduction

Technological advances in the field of engineering have allowed integrated circuit/microchip design companies to continuously add more and more transistors (along with gates) onto smaller and smaller devices. In order to completely test for all the possible faults in a microchip, more complex and costly testing is needed on these denser designs (Bushnell & Agrawal, 2005). One procedure for fault testing, scan chain testing, occurs during the design phase of chips. In this type of testing, a certain number of scan chains are chosen for insertion into a circuit, with different numbers of scan chains incurring different test costs. It can become extremely tedious to test all possible scan chain numbers and manually pick out the most cost-efficient number to use. To make that decision, machine learning models can be trained with circuit data, along with the number of scan chains inserted. Then, when provided with a new circuit, they would be able to predict the best number of scan chains to use. Zipeng and Chakrabarty (2016) proposed a method to optimize test cost by choosing parameters, such as scan chain length, using a support vector regression (SVR) machine learning model. In this work, we examine the parameter optimization of the number of scan chains. The primary focus is to explore how well a hybrid machine learning model performs in predicting the optimal number of scan chains to use in scan chain testing.
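To make the idea of a hybrid model concrete, the sketch below combines the predictions of several models using weights proportional to each model's accuracy. This is only one plausible weighting scheme and is an assumption, not necessarily the scheme used in this work; the function names, accuracies, and predictions are all hypothetical.

```python
from typing import Dict


def accuracy_weights(accuracies: Dict[str, float]) -> Dict[str, float]:
    """Normalize per-model accuracies so the weights sum to 1."""
    total = sum(accuracies.values())
    if total <= 0:
        raise ValueError("at least one model needs a positive accuracy")
    return {name: acc / total for name, acc in accuracies.items()}


def hybrid_prediction(predictions: Dict[str, float],
                      weights: Dict[str, float]) -> float:
    """Weighted average of the individual model predictions."""
    return sum(weights[name] * value for name, value in predictions.items())


# Hypothetical accuracies and predicted scan chain counts for one circuit.
accuracies = {"neural_network": 0.80, "random_forest": 0.90, "svr": 0.70}
predictions = {"neural_network": 4.0, "random_forest": 6.0, "svr": 5.0}

weights = accuracy_weights(accuracies)
combined = hybrid_prediction(predictions, weights)
print(f"weights: {weights}")
print(f"hybrid prediction: {combined:.2f} (rounded: {round(combined)})")
```

Because the number of scan chains is an integer, the weighted average would need to be rounded, or snapped to an allowed value, before being used.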

1.1 – Design for Testability (DFT) 

Design for Testability, or DFT, can be described as the 

set of methods that make testing for faults in microchips 

easier. In the next section, we break down DFT and explain 

the connections between digital logic, data flip-flops, shift 

registers, and scan chain testing. 

1.1.1 – Context 

There exist two types of digital logic: combinational and 

sequential, with the latter involving a memory component 

as well as a clock signal for regulation. The physical 

manifestation of digital logic can be found in digital circuits. 

A flip-flop (FF) is a prime example of a component in a 

sequential digital circuit. It is not uncommon for instances 

of sequential logic/circuits to incorporate combinational 

logic. 

The Data FF (Fig. 1), or DFF, is the simplest type of 

FF, and consists of an input (D), a clock signal (CLK) and 

an output (Q). The “scan-enabled” DFF comes with an 

additional scan-in and scan-out port (scan-out port not 

pictured). 

Figure 1. A typical scan-enabled flip-flop (Gupta, 2014) 

The DFF is a basic storage element in sequential logic, able to hold a stable state of either 0 or 1. The DFF may receive an input, but unless the clock signal is turned “on,” the output will not change. This reduces the occurrence of any unnecessary output changes, thus saving power. A shift register is essentially a linear chain of these DFFs, all connected to and regulated by the same clock signal. The output of one DFF is directly connected to the input of the next. The input can be controlled, and the output of the register can be observed. For our purposes, we do not worry about what happens within the register.
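The behavior of a shift register can be sketched in a few lines. The `ShiftRegister` class below is a hypothetical software model, not a description of any particular hardware: on each clock tick, every flip-flop takes the value of its predecessor, a new bit enters at the input, and the last bit appears at the output.

```python
from typing import List


class ShiftRegister:
    """Software model of a chain of D flip-flops sharing one clock."""

    def __init__(self, length: int) -> None:
        if length < 1:
            raise ValueError("a shift register needs at least one flip-flop")
        self.bits: List[int] = [0] * length

    def clock(self, scan_in: int) -> int:
        """Advance one clock tick and return the bit shifted out."""
        scan_out = self.bits[-1]
        # Each flip-flop takes the value held by the one before it.
        self.bits = [scan_in] + self.bits[:-1]
        return scan_out


def shift_sequence(register: ShiftRegister, pattern: List[int]) -> List[int]:
    """Clock a sequence of bits in and collect the bits shifted out."""
    return [register.clock(bit) for bit in pattern]


register = ShiftRegister(3)
shift_sequence(register, [1, 0, 1])             # load a pattern
unloaded = shift_sequence(register, [0, 0, 0])  # shift it back out
print(unloaded)
```

The bits leave in the same order they entered, which is what allows a known pattern to be compared against the observed output.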

1.1.2 – Scan Chain Testing 

Scan chain testing (Fig. 2) is a common method for testing for faults in silicon when manufacturing circuits. Either one or multiple scan-enabled shift registers are formed, with each DFF being replaced with its “scan-enabled” version, which simply means it comes equipped with scan-in and scan-out ports. The total number of flip-flops is divided as equally as possible over the number of scan chains in a circuit. A clock signal is established, and testing begins. An input test pattern generated by pseudo-random methods is scanned in by each register, and the scanned-out output is compared to the expected output. The expected output is the output that would have been reached if all gates in the combinational logic had been working correctly. If the two outputs do not match, then a fault is detected. Scan chain testing can be characterized by its test application time (time for the test to occur) and test data volume (number of test patterns inserted to test for all faults) (Gupta, 2014). These costs can change depending on the number of scan chains used.
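Two steps of the procedure above lend themselves to a short sketch: dividing the flip-flops as evenly as possible among the scan chains, and flagging a fault when the scanned-out response differs from the expected one. The function names and example values are hypothetical, and real DFT tools balance chains with additional constraints that are not modeled here.

```python
from typing import List


def partition_flip_flops(total_flip_flops: int, num_chains: int) -> List[int]:
    """Split flip-flops over chains so lengths differ by at most one."""
    if num_chains < 1:
        raise ValueError("need at least one scan chain")
    base, extra = divmod(total_flip_flops, num_chains)
    # The first `extra` chains each take one additional flip-flop.
    return [base + 1 if i < extra else base for i in range(num_chains)]


def fault_detected(observed: List[int], expected: List[int]) -> bool:
    """A fault is flagged when the scanned-out bits differ from expected."""
    return observed != expected


chain_lengths = partition_flip_flops(total_flip_flops=10, num_chains=3)
print(chain_lengths)
print(fault_detected(observed=[1, 0, 1], expected=[1, 0, 1]))
print(fault_detected(observed=[1, 1, 1], expected=[1, 0, 1]))
```

The longest chain determines how many clock ticks are needed to shift a pattern in or out, which is one reason the number of chains affects test application time.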

Figure 2. A typical scan chain (Gupta, 2014)

1.2 – Machine Learning (ML)

1.2.1 – Basic Principles

Machine learning (ML) is a subset of artificial intelligence, which is built around the idea of self-learning and self-improvement. To begin, an ML model is trained with a set of training data. In supervised learning, both the input and expected output are fed into the model. After becoming sufficiently trained, the model can be tested against a test set. Accuracies for the model can be found by comparing the predicted outputs from the model to the actual outputs of the test set (Mitchell, 1997).

1.2.2 – Machine Learning Model Descriptions

The artificial neural network (NN) (Fig. 3) consists of an input layer, one or several hidden layers, and an output layer. Each layer consists of several neurons, which are connected to every single neuron in the next layer. The input values, each multiplied by a unique weight, are summed up and passed through an activation function. If the result is above a certain value, the neuron “fires” (information is passed on to the next layer). A neural network uses feedback (comparison to actual value) to learn and slowly correct itself to become the best predictor it can be (Mitchell, 1997).

Figure 3. A visual representation of an artificial neural network; two hidden layers.

Random forests (Fig. 4) essentially take a collection of decision trees, and output either the mode or mean predictions of the individual trees. Decision trees work by breaking a dataset into smaller pieces and formulating a set of rules for decision-making based on previous data. They have the ability to decide which features are important and which features can be dropped (as they contribute little to the prediction process) (Donges, 2018).

Figure 4. A visual representation of a random forest; 

two separate decision trees - red nodes represent 

the individual output of each tree, which are then 

combined in some way to form the output of the 

random forest (Donges, 2018). 
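The way a random forest combines its trees (the mean for regression, the mode for classification) can be illustrated with a short sketch. The function name and the `task` parameter are hypothetical; this is not how Weka exposes its random forest.

```python
from statistics import mean, mode


def forest_predict(tree_predictions, task="regression"):
    """Combine the outputs of individual trees into one forest output."""
    if task == "regression":
        # Regression: average the numeric predictions of the trees.
        return mean(tree_predictions)
    # Classification: take the most common predicted label.
    return mode(tree_predictions)
```

In this study the models predict a numeric test cost, so the regression (mean) branch is the relevant one.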

Support vector regression (SVR) (Fig. 5) works by 

optimizing a line between two sets or classes of data. In 

other words, while learning, it attempts to minimize 


error by adjusting a hyperplane. The accuracy is generally 

dependent on setting good parameters (Cortes and 

Vapnik, 1995). 

A formula for a Test Cost (TC) may be obtained from Test 

Application Time (TT) and Test Data Volume (TV): 

TT_max and TV_max represent the maximum test application time and the maximum test data volume for a circuit, respectively. They are used to normalize the value of TC (Zipeng and Chakrabarty, 2016). 

Figure 5. A visual representation of SVR applied to 

two classes of data (black circles and blue squares); 

hyperplane represented by the green line. 

1.2.3 – Weka Software 

Weka is a software tool that provides a collection of 

many developed ML models, including neural networks, 

random forests, and support vector regression. This 

application contains a user interface, which simplifies the 

experience when working with and applying ML to data 

(Weka Machine Learning Group, n.d.). 

2. Methodology 

This project was divided into four phases: Training Data 

Generation, Individual ML Model Training, Allotment of 

Weights, and Hybrid ML Model Performance. 

2.1 – Training Data Generation 

In this phase, the tools DFTAdvisor (MentorGraphics, n.d.) (to insert scan chains) and FastScan (MentorGraphics, n.d.) (to generate and compare test patterns) were applied to a collection of 51 pre-constructed benchmark digital circuits from the ISCAS89 library. For each circuit, we 

recorded several features: the number of primary inputs, 

the number of primary outputs, the number of gates, 

the number of flip-flops, and the number of scan chains 

inserted. Five variations of each circuit were tested, from 

one scan chain inserted to five scan chains inserted. 

For context, the features had the following ranges 

(Table 1): 

Table 1. Range of values for features. 

Feature                      Range
# of Primary Inputs          6 - 80
# of Primary Outputs         1 - 320
# of Gates                   26 - 26115
# of Flip-Flops              3 - 1728
# of Scan Chains Inserted    1 - 5 for each circuit

We also took note of the test application time and test 

data volume in performing each scan chain test. 

2.2 – Individual ML Model Training 

In this phase, Weka was used to individually train three 

types of regression ML models: artificial neural networks, 

random forest, and SVR. Out of the 51 total circuits that 

were given, 42 circuits were used for training the models, 

while the remaining 9 circuits were used for testing. A true 

random number generator was used to select the circuits 

in each set. The ML model was trained and run against the 

testing set. The outputted TC was compared to a manually 

calculated TC from the actual FastScan data. 
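A 42/9 split of the 51 circuits can be sketched as below. Note one deliberate difference from the study: the text reports using a true random number generator, whereas this sketch uses Python's seeded pseudo-random generator so that the split is reproducible. The circuit names are placeholders.

```python
import random


def split_circuits(circuits, num_test=9, seed=None):
    """Randomly split circuits into (training, testing) lists."""
    rng = random.Random(seed)  # pseudo-random; the study used a true RNG
    shuffled = list(circuits)
    rng.shuffle(shuffled)
    # The first `num_test` shuffled circuits form the testing set.
    return shuffled[num_test:], shuffled[:num_test]


# Placeholder names standing in for the 51 benchmark circuits.
circuits = [f"circuit_{i}" for i in range(51)]
train, test = split_circuits(circuits, seed=0)
```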

2.3 – Allotment of Weights 

In this phase, the weight of each individual ML model in the hybrid model was selected empirically, on the following basis: the higher the accuracy of a model, the more weight it received. Many different methods of weight selection are possible, which left this phase open to a lot of trial and error. 
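One simple way to turn a weight ratio into model weights is to normalize it so the weights sum to one, then combine the three predictions as a weighted sum. The sketch below assumes that the hybrid output is such a weighted sum; the text does not spell out the combination rule, so this is an assumption, and the function names are hypothetical.

```python
def normalize_ratio(ratio):
    """Convert a ratio such as 2:3:1 into weights that sum to 1."""
    total = sum(ratio)
    return [part / total for part in ratio]


def hybrid_prediction(predictions, weights):
    """Weighted sum of the individual model predictions (assumed rule)."""
    return sum(w * p for w, p in zip(weights, predictions))
```

A 2:3:1 ratio (neural network : random forest : SVR) normalizes to 0.3333, 0.5000, and 0.1667, which matches the weights reported in Section 3.1.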

2.4 – Hybrid ML Model Performance 

In this phase, the hybrid model was fed a different 

set of training data and tested against a different testing 

set (though still chosen out of the same collection of 

benchmark circuits). The minimum predicted TC was chosen, and the scan chain number corresponding to that TC was compared to the scan chain number with the lowest TC in the actual FastScan output. The accuracy of the hybrid model was then evaluated. 

3. Data Analysis 

We performed tests on the 9 circuits not used for 

training. 

3.1 – Weighting (Individual Models) 

For each individual model, results were labeled with the 

following: 

• Off if the scan chain number correlating with the 

lowest ML test cost prediction didn’t match the scan 

chain number correlating with the lowest actual 

FastScan output (Table 2) 

• Success if the scan chain number correlating with 

the lowest ML test cost prediction did match the 

scan chain number correlating with the lowest actual 

FastScan output (Table 3) 
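The labeling rule above, together with the difference measure used later in Section 3.2, can be sketched as follows. Each list holds the test cost for 1 to 5 scan chains, in order. The function names are hypothetical, and the example values in the test are made up for illustration; they are not the study's data.

```python
def best_scan_chain(costs):
    """Scan chain number (1-based) with the lowest test cost."""
    return min(range(len(costs)), key=costs.__getitem__) + 1


def label_result(predicted, actual):
    """Success if predicted and actual minima fall on the same scan chain number."""
    same = best_scan_chain(predicted) == best_scan_chain(actual)
    return "Success" if same else "Off"


def cost_difference(predicted, actual):
    """Actual cost at the predicted scan chain number minus the lowest actual cost."""
    chosen = best_scan_chain(predicted)
    return actual[chosen - 1] - min(actual)
```

A difference of zero corresponds to a Success; summing `cost_difference` over the testing circuits gives the total difference used to compare models.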


Example of Off for a test circuit:

Table 2. Comparison between actual and predicted test cost example 1 - from NN.

The scan chain number correlating with the lowest ML test cost prediction (0.9474) is 1, while the scan chain number correlating with the lowest actual FastScan output (0.8733) is 4.

Example of Success for a test circuit:

Table 3. Comparison between actual and predicted test cost example 2 - from NN.

The scan chain number correlating with the lowest ML test cost prediction (0.9951) is 5, matching the scan chain number correlating with the lowest actual FastScan output (0.9504).

The lowest values for Test Cost are highlighted in boldface. If the lowest Predicted Test Cost and the lowest Actual Test Cost occur at different scan chain numbers, the result is Off. If they occur at the same scan chain number, the result is Success. We focus only on the lowest values of cost, because minimizing cost is the main objective of our optimization.

Weights for the hybrid model were assigned based on the number of Successes (Table 4).

Table 4. Number of Successes for each model.

Thus, our initial weighting of the hybrid model was in a 2:3:1 ratio. The artificial neural network had a weight of 0.3333, the random forest had a weight of 0.5000, and the support vector regression had a weight of 0.1667 in the hybrid model. We also decided to investigate heavily weighting the best-performing individual model as compared to the other two models. In this weighting of the hybrid model, the artificial neural network had a weight of 0.1000, the random forest had a weight of 0.8000, and the support vector regression had a weight of 0.1000 (a 1:8:1 ratio).

3.2 – Hybrid Model

Both hybrid models had 3 Successes, so further evaluation had to be completed. Specifically, the total difference between the actual test cost corresponding with the predicted SC number and the actual lowest test cost was found for the 9 testing circuits. A lower total difference is indicative of a more accurate model.

The differences with the first weighting combination (2:3:1) are shown in Table 5.

Table 5. Hybrid model (Weights 2:3:1) total differences.

Note: A difference of 0.0000 means Success.

The total difference for the hybrid model with weights 2:3:1 is 0.3502.

The differences with the second weighting combination (1:8:1) are shown in Table 6.

Table 6. Hybrid model (Weights 1:8:1) total differences.

The total difference for the hybrid model with weights 1:8:1 is 0.3560.

We next compare these differences to those of the individual models. The artificial neural network is not considered in these comparisons because it predicted invalid test costs.

The differences for the random forest model are shown in Table 7.

Table 7. Random forest model total differences.

The total difference for the random forest model is 0.3560.

The differences for the support vector regression model are shown in Table 8.

Table 8. Support vector regression model total differences.

The total difference for the support vector regression model is 0.4894.

The hybrid model with weights 2:3:1 had a lower total difference than the individual models, as well as the hybrid model with nonoptimal weighting, showing evidence of slightly better performance. This provides basic evidence that there is, in fact, an improvement in accuracy from using a hybrid ML method.

4. Conclusion and Future Work

4.1 – Conclusion

This work offered the possibility of using a hybrid machine learning model to predict the best number of scan chains to use for cost optimization. Though individual ML models, such as the artificial neural network, random forest, and support vector regression, work well on their own, a hybrid model with correct weighting appears to offer slightly better performance. With this in mind, microchip testers could potentially use this new method to further decrease test costs and improve time-to-market.

4.2 – Future Work

Running a program or algorithm may offer further optimized weights. There could very well exist a weight set for the hybrid model that provides an even better performance. Moreover, we hope that with promising results, this methodology may be applied to industrial-level circuits for real-world use.

5. Acknowledgments

I would like to express my sincerest thanks towards Dr. Jonathan Bennett for his constant encouragement and for accepting me into the Research in Physics program at NCSSM. I would also like to acknowledge Dr. Sarah Shoemaker for organizing and directing the Summer Research Internship Program.

I am very thankful to Dr. Krishnendu Chakrabarty for granting me permission to work with his research group at Duke University.

I would like to thank Zhanwei Zhong, Shi Jin, Thomas Napoles, and the Duke Office of Information Technology for their assistance with local issues.

Last but not least, I would like to express my gratitude towards my mentor, Arjun Chaudhuri, for his patience and dedication in guiding and challenging me.

6. References

Bushnell, M., Agrawal, V. (2005). Essentials of Electronic Testing, Springer.

Zipeng, L., Chakrabarty, K. (2016). Test Cost Optimization in a Scan-Compression Architecture using Support-Vector Regression. Proc. IEEE Test Symposium (VTS).

Gupta, N. (2014). Overview and Dynamics of Scan Chain Testing. Retrieved from https://anysilicon.com/overview-and-dynamics-of-scan-testing/

Mitchell, T.M. (1997). Machine Learning, McGraw-Hill.

Donges, S. (2018). The Random Forest Algorithm. Retrieved from https://towardsdatascience.com/the-random-forest-algorithm-d457d499ffcd

Cortes, C., Vapnik, V. (1995). Support-vector networks. Machine Learning (vol. 20, pp. 273-297).

Machine Learning Group. (n.d.). Weka 3: Data Mining Software in Java, University of Waikato.

DFTAdvisor Reference Manual. (n.d.). MentorGraphics.

FastScan and FlexTest Reference Manual. (n.d.). MentorGraphics.


NOVEL WATER DESALINATION FILTER UTILIZING 

GRANULAR ACTIVATED CARBON 

Geoffrey Fylak 

Abstract 

As the human population continues increasing, so does the demand for freshwater resources. The scarcity of freshwater 

will likely impact one-third of the world’s population within the next decade. While there are many proven methods of 

water desalination, most are cost- and energy-intensive. Our research seeks to improve upon capacitive deionization: 

an emerging, yet proven, scalable method of desalination that removes charged species from water using low levels of 

electricity. The filter utilizes granular activated carbon (GAC), an affordable, naturally abundant material commonly used 

in industrial Brita® water filters to remove uncharged contaminants. We anticipate that GAC’s electrically conductive 

properties will enable the material to adsorb sodium chloride. Our goal is to determine and enhance the performance 

capabilities of GAC by altering operational parameters and system design. Initial tests demonstrated low performance 

due to inadequate operational parameters and design flaws. Through systematic improvements, researchers have greatly 

increased system performance. The filter’s charge efficiency has increased from 13% to 63% while the adsorption capacity 

has increased from 10.3 µg/g to 452.0 µg/g. Based upon success in removing sodium chloride, our filter’s application could 

be extended to remove more harmful, charged water contaminants in the future. 


1.1 – Significance 

As the human population continues increasing, so 

does the demand for freshwater resources. The scarcity 

of freshwater will likely impact one-third of the world’s 

population within the next decade. While there are many 

proven methods of water desalination, most are cost- 

and energy-intensive. Our research seeks to improve 

upon a novel desalination technique, which would 

expand available drinking water sources on a global 

scale. The technology investigated is based on capacitive 

deionization (CDI), an emerging, yet proven, scalable 

method of desalination that removes charged species from 

water using low levels of electricity. The filter will utilize 

granular activated carbon (GAC), an affordable, naturally 

abundant material commonly used in industrial Brita® 

water filters to remove uncharged contaminants. We 

anticipate that GAC’s electrically conductive properties 

will enable the material to adsorb sodium chloride. 

Our goal is to determine and enhance the performance 

capabilities of GAC by altering operational parameters 

and system design. Emerging contaminants widely exist 

in raw and treated drinking water and present an ongoing 

threat to human health and the planet. Certain substances, 

such as PFAS, are suspected carcinogens and pose a risk to 

humans even at trace levels (ng/L to µg/L). Thus, there 

exists a need to develop viable methods and technologies 

to remove charged contaminants from water resources. 

Ultimately, our filter’s application can be extended to 

remove more harmful charged contaminants in the future. 

1.2 – Background Literature Review 

Water treatment is a broad field consisting of many 

different methods and focuses. Water desalination is a 

sub-field which focuses on removing salt from water. 

Many industrial scale water desalination techniques 

exist, such as reverse osmosis and thermal distillation; 

however, these techniques are highly energy intensive. 

CDI technology improves upon these other techniques 

through its low energy requirement. 

CDI cells operate on the electrochemical principles of charge. Saltwater is a solution of salt dissolved in water. In solution, sodium chloride dissociates into two types of ions: positively charged sodium ions and negatively charged chloride ions. When opposite electrical charges are applied to two parallel plates, an electric field is created. This field separates the ions according to their charge, drawing the positively charged ions to the negatively charged plate and the negatively charged ions to the positively charged plate, where they are held. The most crucial component of a CDI system is therefore the electrode, the part that captures the charged salt ions and thus removes them from the water, leaving desalinated water (Suss et al., 2015). 

Previous research has shown that CDI technology successfully removes salt on the lab scale (Porada et al., 2013) and the industrial scale (Welgemoed & Schutte, 2005). 

These experiments describe the salt removal process, as 

well as detail the various essentials of a successful CDI 

system. The most important physical component is the 

electrode material, as the resistivity and specific surface 

area of the material determine the amount of salt that can 

be adsorbed. Materials with high specific surface areas and 


porosity are most efficient at removing salt. 

As researchers attempt to expand the applicability of 

CDI technology, they are experimenting with a variety 

of electrode materials. One particular electrode material, 

granular activated carbon, is contained within Brita® 

water filters, removing uncharged contaminants with its 

desirable properties. Researchers determined granular 

activated carbon (GAC) to have a promising surface 

conductivity and adsorption capacity (Jia & Zhang, 2016). 

Another set of researchers packed an electrode chamber 

with granular activated carbon and discovered up to two 

and a half times more salt removal (Bian et al., 2015). 

However, their research did not assess the potential of 

GAC as a primary electrode material. Our study seeks 

to determine performance metrics, as well as compare 

our findings with pre-existing data. In doing so, we will 

be able to gain a holistic view of the efficiency of GAC 

as an electrode material. Since many industrial water 

filters, such as Brita®’s, utilize GAC, the transition to an 

industrial-scale desalination system will be feasible if GAC 

is proven to be efficient. 

However, to accurately assess the efficiency of GAC 

as an electrode material, we must first ensure that the 

CDI system’s design is sufficient. Charge efficiency 

is an important, quantifiable indication of a system’s 

effectiveness. Charge efficiency is expressed as a percentage: the moles of salt removed per mole of electrical charge delivered to the electrodes. A system with a charge efficiency of 100% removes one mole of salt per mole of electrical charge. One set of researchers discovered that CDI cells must be charged at a positive voltage to achieve the highest charge efficiency (Avraham et al., 2009). Therefore, our project will draw on these findings to ensure that the operating parameters enable the maximum performance of GAC. 
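The definition of charge efficiency can be made concrete with a short calculation. The sketch below converts delivered charge in coulombs to moles of charge using the Faraday constant and reports the ratio as a percentage. The function name and inputs are hypothetical; the authors' own formula is not reproduced in the text.

```python
FARADAY = 96485.332  # coulombs per mole of elementary charge


def charge_efficiency(moles_salt_removed, charge_coulombs):
    """Moles of salt removed per mole of charge delivered, as a percentage."""
    moles_charge = charge_coulombs / FARADAY
    return 100.0 * moles_salt_removed / moles_charge
```

Removing one mole of salt with one mole of charge (about 96,485 C) gives 100%; removing 0.63 mol with the same charge gives 63%.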

Though there is substantial research surrounding 

the CDI process, there is no significant information 

concerning the efficiency of GAC as an electrode material. 

This research could show GAC to be a useful electrode material, making industrial filter production feasible. Conversely, it could show GAC to be inefficient, allowing researchers to focus on other potential modifications. The purpose of 

this study is to determine the efficiency of the electrode 

material granular activated carbon in comparison with 

pre-existing materials. 

2. Materials 

2.1 – Novel CDI System Design 

The novel filter was designed, modeled, and assembled 

using materials funded by the Call Lab at NC State 

University. As a novel design, each material and component 

must be considered to achieve optimal functionality. 

The assembled and disassembled GAC filter design is 

illustrated below (Fig. 1, Fig. 2). Certain materials such 

as the hex nuts, screws, and barbed tube fittings did not 

require modification; however, the polycarbonate plate, 

graphite plates, rubber gaskets, and glass fibre prefilters 

needed to be cut. Each part plays an instrumental role 

in adapting GAC to carry electrical charge and remove 

sodium chloride. 

Figure 1. A rendered model of the assembled filter. 

Figure 2. A disassembled model of the GAC filter. 

Numbers coincide with different parts and materials: 

1. Rubber gaskets and glass fibre prefilter; 2. Nylon 

screws; 3. Barbed Tube Fittings; 4. Polycarbonate 

plates; 5. Graphite plates 1/8” thick; 6. Graphite plates 

1” thick. 

Water will enter through the top barbed tube fitting 

and exit through the bottom, passing through the 

cylindrical chambers that contain the electrode material. 

An electrical charge must be given to the system through 

an anode and a cathode. Hence, the 1/8” thick graphite 

plates have an extended area designated for anode and 

cathode attachment. Graphite was chosen as the material 

to house the granular activated carbon (GAC) because it 

is electrically conductive. However, since two oppositely 


charged chambers are created, they must be separated 

to ensure the system does not short-circuit. A series of 

glass fibre prefilters (spacers) accomplishes this goal. The 

middle spacer separates the anode and cathode chambers, 

ensuring the GACs in either chamber do not touch and 

cause system failure. 

Gaskets are used in combination with the spacers to 

prevent leakage from occurring. Each of these components 

is held together by two nylon screws. It is essential to use 

nylon, plastic, or any other non-conductive material so 

that the system does not short-circuit when an object is in 

contact with both the anode and cathode chambers at the 

same time. The nylon hex nuts allow researchers to tighten 

the system, preventing leakages and pressure build-ups. 

With computer-aided design, the 3D model was 

converted into 2D sketches and each individual part 

was able to be cut in NC State’s Machine Shop. Lastly, a 

3D-printer in the NCSSM Fabrication Lab was used to 

create a stand to hold the filter upright and prevent the 

filter from lying horizontally (Fig. 3). 

Figure 3. A red, 3D-printed stand supports the GAC filter and enhances the system’s vertical flow path.

Aside from flow path, system resistance was a challenge that the design needed to overcome. Thus, researchers filled the chambers with GAC and measured the resistance from the anode or cathode connection points to various locations within the chamber (Fig. 4). These data demonstrate that graphite sufficiently emits charge to all of the electrode material. Although resistance increases in areas furthest away from the graphite, electrical charge can still travel to those areas and facilitate salt removal (Fig. 4).

Figure 4. A top view of the resistance experienced from the anode/cathode connection sites to various locations within the electrode chamber.

Figure 5 shows a few photos of the GAC filter completely assembled.

Figure 5. The GAC filter completely assembled, from a variety of angles.

3. Specific Aims and Research Design

We seek to address the following research questions:

3.1 – Specific Aim 1 

Determine the relationship between flow rate and CDI 

system performance by running tests with different flow 

rates and comparing the respective performances. 

3.2 – Rationale and Hypothesis 

The flow rate of water through a CDI system determines the amount of salt entering the system per unit time. Exposure to more salt should enable electrodes to capture more salt. However, increased flow rates facilitate pressure 

build-ups and leakage issues that may negatively impact 

system performance. By analyzing the impact of flow 

rate on system efficiency, researchers can discover the 

operational parameters necessary to yield maximum salt 

removal. 

Typical lab-scale, flow-by CDI cells utilize 0.200 g of 

electrode material; however, this novel design incorporates 

20.0409 g of electrode materials. Due to the much higher 

system volume, researchers expect higher flow rates to 

increase CDI cell performance. Moreover, incremental 


changes in flow rate will likely impact performance less 

because of the large volume. Thus, researchers may need 

to greatly increase flow rate to produce significant changes 

in performance. 

3.3 – Supporting Preliminary Data 

We previously analyzed the relationship between flow 

rate and cell performance using flow-by CDI cells. We 

concluded that increasing flow rate negatively impacted 

CDI performance across all performance metrics (Table 

1). Our current experiment utilizes flow-through CDI 

cells; thus, our design differs from the one featured in this 

study. 

Nevertheless, it is important to observe the implications 

of these findings on the electrochemical level, as this study 

indicates that higher flow rates induce pressure build-ups 

and consequently, Faradaic Reactions. Faradaic Reactions 

contribute to pH fluctuations and electronic charge storage 

without salt ion adsorption (Na⁺ or Cl⁻). 

Table 1. Impact of increasing flow rate on flow-by CDI cell performance. Both performance metrics are markedly lower at 6 mL/min and 8 mL/min than at 4 mL/min.

Flow Rate              4 mL/min      6 mL/min      8 mL/min
Adsorption Capacity    2.507 mg/g    1.212 mg/g    1.075 mg/g
Charge Efficiency      18.87%        9.44%         10.07%

3.4 – Methods 

After calibrating the pump, tubing was attached from 

the pump through the CDI system, then directed into a 

properly labeled waste container. Next, distilled water 

was pumped through the system to ensure that no leakage 

occurred. 

Finally, flow cells were attached outside of the system 

to allow researchers to measure the conductivity of water 

exiting the system (Fig. 6). 

Figure 6. A visualization of the research setup, 

including the pump, salt solution, CDI cell, 

conductivity flow cell, pH flow cell, waste bucket, 

and tubing. 

With the system assembled, we created one liter of 100 mM salt solution. The 100 mM solution was then diluted to a 10 mM salt solution and pumped through the CDI cell. Preparing a concentrated stock saves time in the future, as it is much easier to dilute a solution than to create one. 
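The arithmetic behind the stock solution and its dilution can be sketched as follows, assuming the salt is sodium chloride (molar mass about 58.44 g/mol) and using the standard dilution relation C1 V1 = C2 V2. The function names are hypothetical.

```python
NACL_MOLAR_MASS = 58.44  # g/mol, sodium chloride


def salt_mass_g(concentration_mM, volume_L):
    """Mass of NaCl needed for a solution of the given concentration and volume."""
    return concentration_mM / 1000.0 * volume_L * NACL_MOLAR_MASS


def stock_volume_mL(stock_mM, target_mM, target_volume_mL):
    """Volume of stock to dilute, from C1 * V1 = C2 * V2."""
    return target_mM * target_volume_mL / stock_mM
```

Under these assumptions, one liter of 100 mM stock requires about 5.84 g of NaCl, and each liter of 10 mM working solution uses 100 mL of stock topped up to volume with distilled water.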

For this specific project, we chose to test flow rates of 

5 mL/min and 10 mL/min. Using the calibration which 

we previously conducted, we programmed the pump to 

each of these flow rates in different tests. All of our other 

system parameters were kept constant during this test: 

voltage during charge was 1.2 V, charge cycle time was 5 

minutes, the alligator clips were positioned from anode to 

cathode, and the system ran for three cycles. 

We first measured the conductivity and pH of the water before it entered the system. Flow cells containing conductivity and pH probes were used to measure the conductivity and pH of the water exiting the system. Conductivity is directly related to salt concentration, so the combination of these measurements enabled researchers to analyze salt removal over time. Each probe captured data points one minute apart, allowing researchers to observe the behavior of the cell minute by minute. 

3.5 – Data Analysis 

A charge cycle occurs under an applied voltage while the 

system is removing salt. However, the electrodes will reach 

an adsorption capacity and cannot remove salt forever. A 


discharge cycle occurs when the voltage is removed or 

reversed, allowing electrodes to flush captured salt ions 

into a brine stream. During each cycle there are various 

performance metrics that researchers observe to assess 

system efficiency. These metrics are adsorption capacity, 

adsorption rate, and charge efficiency. Adsorption capacity 

refers to the mass of salt collected per mass of electrode 

material. Adsorption rate is the rate of salt adsorption per mass of electrode material. Charge efficiency is expressed as a percentage: the ratio of moles of salt removed to moles of electric charge. 

Since conductivity is directly proportional to salt 

concentration, we were able to derive each performance 

metric by finding the area under the effluent conductivity 

curve (Fig. 7). 

Figure 7. Conductivity versus time graph that 

graphically illustrates the importance of the 

integral of effluent conductivity in determining salt 

removed. 

The following demonstrates the mathematical analysis 

performed to derive each performance metric. 

Adsorption Capacity: 

Ultimately, these mathematical formulas are the key 

to transform raw data into meaningful analysis. These 

performance metrics are accepted throughout the larger 

CDI community. 
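Since the original equations are not reproduced here, the following is only a sketch of how such an analysis is commonly done, not the authors' exact formulas. It assumes conductivity has already been converted to concentration through a linear calibration (a hypothetical step), integrates the gap between influent and effluent concentration with the trapezoid rule, and scales by flow rate to obtain the moles of salt removed.

```python
NACL_MOLAR_MASS = 58.44  # g/mol, sodium chloride


def trapezoid(values, dt):
    """Area under evenly spaced samples by the trapezoid rule."""
    return sum((a + b) / 2.0 * dt for a, b in zip(values, values[1:]))


def salt_removed_mol(influent_mM, effluent_mM, flow_mL_min, dt_min=1.0):
    """Moles of salt removed during a charge cycle."""
    deficit = [influent_mM - c for c in effluent_mM]  # mmol/L below influent
    area = trapezoid(deficit, dt_min)                 # (mmol/L) * min
    litres_per_min = flow_mL_min / 1000.0
    return area * litres_per_min / 1000.0             # mmol -> mol


def adsorption_capacity_ug_per_g(salt_mol, electrode_mass_g):
    """Mass of salt collected per mass of electrode material, in ug/g."""
    return salt_mol * NACL_MOLAR_MASS * 1e6 / electrode_mass_g
```

The one-minute default for `dt_min` mirrors the sampling interval described in the Methods; the moles returned by `salt_removed_mol` are also the numerator of the charge efficiency.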


3.6 – Specific Aim 2

Determine the relationship between charge and 

discharge cycle length and CDI cell performance by 

increasing the time during which voltage is applied to the 

system. 


3.7 – Rationale and Hypothesis

The charge and discharge cycle length determine the time during which salt removal will occur. However, considering the adsorption capacity of electrodes, we expect salt removal rates to vary as the pores become 

more filled with salt. Accordingly, proper cycle lengths 

are essential for an accurate measurement of electrode 

material performance. By analyzing the impact of cycle 

length on system efficiency, researchers can maximize 

the effectiveness of the electrode and determine the true 

potential of the material. 

Researchers expect longer cycle times to coincide 

with increased system performance. The large volume of 

electrode material should theoretically require more time 

to reach maximum adsorption. However, exceedingly long 

cycle times will decrease charge efficiency, as charge enters 

into electrodes that are unable to hold more salt ions. Thus, 

it is imperative that researchers systematically determine 

the proper charge cycle to enhance system performance. 

Ultimately, researchers expect cycle time to be 

significantly longer than the five-minute period that is 

adequate for smaller cells. 

3.8 – Supporting Preliminary Data 

Figure 8 illustrates the salt concentration over time 

for the first test run on the CDI cell. The test run below 

consisted of three five-minute charging and discharging 

cycles. 

Figure 8. Conductivity versus time graph for a flow rate of 5 mL/min, at 1200 mV, for 3 complete charge and discharge cycles, each 5 minutes long. 


This length was not adequate since the system was still 

removing salt at the end of the charging cycle. At the end 

of a charging period, effluent conductivity should return to 

influent conductivity so that the system reaches maximum 

adsorption and returns to a state of equilibrium. These 

findings demonstrate that the CDI system needs a longer 

charging cycle period, likely because of the large relative 

volume of the cell. This result is limited because it does not indicate what an adequate length would be; it merely demonstrates that the cycle needs to be longer than 5 minutes. 

Thus, the researchers will be conducting systematic testing 

to determine the appropriate charge cycle time. 

3.9 – Methods 

For this specific test, researchers knew that the charge and discharge cycle needed to be longer than 5 minutes; however, they did not know how long it needed to be. First, researchers decided to increase cycle time gradually in order to analyze the system behavior. This enabled researchers to analyze the system’s consistency as well as GAC performance under different operational parameters. However, this trial-and-error approach continually yielded an inadequate cycle time. Hence, researchers 

decided to systematically determine the charge cycle by 

conducting a ‘single-cycle test’. In this test, researchers set 

the charge length to 300 minutes, and observed the data 

to determine the time at which the electrodes had reached 

their maximum adsorption and returned to equilibrium. 


Researchers used the same mathematical and graphical 

approach to derive the performance metrics for the CDI 

cell as in Specific Aim 1. 

In addition to this quantitative analysis, the data required qualitative analysis of the shape of each graph.

Researchers focused on observing the effluent versus 

influent conductivity at the end of each cycle time to 

observe whether the system was at equilibrium at the end 

of the cycle. 
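The end-of-cycle comparison described above can be expressed as a simple tolerance check. The following is a minimal sketch only: the function names, the 2 % relative tolerance, and the sample readings are assumptions made for illustration and are not values from this study.

```python
def at_equilibrium(effluent, influent, rel_tol=0.02):
    """Return True when effluent conductivity is within rel_tol of influent.

    rel_tol is a hypothetical tolerance chosen for illustration.
    """
    return abs(effluent - influent) <= rel_tol * influent


def equilibrium_time(times_min, effluent_series, influent, rel_tol=0.02):
    """Earliest time from which every later reading stays at equilibrium.

    Returns None if the effluent never settles back to the influent value.
    """
    for i, t in enumerate(times_min):
        if all(at_equilibrium(e, influent, rel_tol) for e in effluent_series[i:]):
            return t
    return None


# Made-up conductivity readings (arbitrary units) for one long charge period.
times = [0, 10, 20, 30, 40, 50, 60]
readings = [1000, 700, 750, 850, 950, 990, 995]
print(equilibrium_time(times, readings, influent=1000))
```

A check of this kind only formalizes the visual comparison of effluent and influent conductivity; the tolerance would need to be chosen from the noise level of the actual conductivity probe.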


Determine the impact of design modifications on CDI 

cell performance by decreasing the total volume of the 

system. 


Although the filter was experiencing great increases 

in adsorption capacity, the charge efficiency was still very 

low. Charge efficiency is a measure of the percentage of 

electrical charge allotted to salt removal. A low charge 

efficiency indicates that much of the GAC is not removing 

salt and not receiving electrical charge. Researchers 

hypothesized that the large volume of the system was 

contributing to a poor distribution of electrical charge. 

Thus, system performance is expected to increase as the 

filter’s volume decreases. 


For this specific test, researchers decided to decrease the system’s volume by half. Researchers hypothesized that the large volume of GAC in the filter was contributing to the low charge efficiency; thus, researchers anticipated that this modification would improve adsorption capacity and charge efficiency. The following image demonstrates the design modification that occurred.

Figure 9. Graphic illustration of the design 

modification that decreased the filter volume from 

45.23 mL to 25.24 mL. 

Researchers decided to test the filter using the 5 mL/ 

min flow rate because higher flow rates caused too many 

leakage issues. Moreover, the applied voltage of 1.2 V 

remained constant. A charge cycle time of 20 minutes was 

deemed appropriate after qualitative graph analysis. 




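Researchers used the same mathematical and graphical approach to derive the performance metrics for the CDI cell as in the previous specific aims.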


Determine the impact of design modifications on 

CDI cell performance by rearranging anode and cathode 

attachment locations. 

3.16 – Rationale and Hypothesis: 

Although the filter once again experienced an increase 

in adsorption capacity, the charge efficiency decreased. 

Researchers hypothesized that the arrangement of the 

anode and cathode connection was not facilitating the ideal 

electron flow. Thus, researchers decided that increasing 

the distance between the applied voltages was necessary 

for the electrical field to encompass all of the GAC within 

the filter. Researchers hypothesize that this change may 

increase charge efficiency and overall system performance. 


For this specific test, researchers decided to change 

the location of the anode and cathode connection points. 

Researchers hypothesized that this modification would 

increase the reach of the electrical field, enable more GAC 


to be charged, and increase the filter’s charge efficiency. 

Figure 10 demonstrates the design modification that 

occurred. 

Figure 10. Graphic illustration of the design 

modification that increased the reach of the electric 

field by moving the anode and cathode connection 

plates further away from one another. 

Researchers decided to keep the operational parameters 

constant to ensure that the design modification was 

the only factor that could contribute to differences in 

performance. Thus, the flow rate remained 5 mL/min, 

the voltage applied remained 1.2 V, and the cycle time 

remained 20 minutes long throughout testing. 




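Researchers used the same mathematical and graphical approach to derive the performance metrics for the CDI cell as in the previous specific aims.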

4. Results 

4.1 – Impact of Flow Rate on Filter Performance

After testing, researchers observed that a higher flow rate yielded more efficient filter performance. Flow rate directly impacts the performance metrics of the flow-through CDI cell (Table 2). The lower flow rate was significantly less efficient than the higher flow rate.

Table 2. The performance metrics of the flow-through CDI cell at two different flow rates: 5 mL/min and 10 mL/min. Each performance metric rises with flow rate, demonstrating that higher flow rates increase performance.

| | 5 mL/min | 10 mL/min |
|---|---|---|
| Adsorption Capacity | 10.7 µg/g | 20.2 µg/g |
| Charge Efficiency | 13.125 % | 22.32 % |

The novel system has a relatively large electrode volume, causing alterations in operation parameters to impact system performance less than expected. Accordingly, additional testing with a higher range of flow rate values may be necessary to cause greater variations in performance. Moreover, the larger flow rate introduced many leakage issues and pressure build-ups, which increased internal system resistance. Researchers chose to use a flow rate of 5 mL/min in future testing to avoid these issues. Nevertheless, these tests were successful in establishing baseline performance capabilities of GAC.

4.2 – Impact of Cycle Time on Filter Performance

The following table displays the performance of the CDI system as the cycle time increases. As expected, system performance increased as cycle time increased, since the cell spent more time at peak adsorption (Table 3). Additionally, the cell spent more time expelling salt during discharge cycles, so the GAC was able to adsorb even more salt for a longer period of time.

Table 3. Performance metrics comparison between elongated cycle periods demonstrates that the longer cycle time increased performance efficiency.

| | 5 min | 10 min | 20 min | 50 min |
|---|---|---|---|---|
| Adsorption Capacity | 20.2 µg/g | 31.9 µg/g | 96.0 µg/g | 155.4 µg/g |
| Charge Efficiency | 22.32 % | 30.65 % | 35.04 % | 35.64 % |

Figure 11 displays the salt concentration over time for the lowest charge time tested (five minutes).

Figure 11. Salt concentration over time for a cycle 

time of 5 minutes. Operational parameters: applied 

voltage of 1.2 V, cycle time of 5 minutes, and flow rate 

of 5 mL/min. 

During this test, the filter was not at equilibrium at the 

end of the charge and discharge cycle periods. Here, very 

brief, ineffective discharge periods inhibited the amount 

of salt that the electrodes were able to adsorb. From this 

qualitative analysis, it was evident that cycle time must be 

increased. Figure 12 displays the salt concentration over 

time for an increased charge and discharge cycle length of 

10 minutes. 



Figure 12. Salt concentration over time for a cycle time of 10 minutes. Operational parameters: applied voltage of 1.2 V, cycle time of 10 minutes, and flow rate of 5 mL/min.

Noticeably, the discharge cycles were more effective, as the area under the curve during discharge cycles appears much larger, which was confirmed through quantitative analysis. However, the effluent and influent conductivities were still not equal at the end of the respective cycle time lengths (Fig. 12). After increasing the cycle time again to 20 minutes, graphical analysis once again demonstrated a need for increased cycle time. However, these results were limited because they did not indicate the ideal cycle time. Researchers conducted a ‘single-charge test’ to finally determine the optimal cycle time. In doing so, 50 minutes was found to be ideal. The system performance was considerably higher under the 50-minute charge and discharge cycle time (Table 3). Figure 13 illustrates the salt removal over time under this elongated cycle time.

Figure 13. Salt concentration over time for a cycle time of 50 minutes. Operational parameters: applied voltage of 1.2 V, cycle time of 50 minutes, and flow rate of 5 mL/min.

In this test, GAC demonstrated the impressive ability to remove salt at maximum adsorption for an extended period of time (~35 min.), which is a positive indication of GAC capability and system performance (Fig. 13). In conclusion, the results of this experiment were a positive indication regarding the potential of GAC to adsorb salt when operating under ideal conditions.

4.3 – Impact of System Volume on Filter Performance

Due to the exceptional volume of electrode material contained within the original design, researchers decided to decrease system size and analyze the impact on performance. The system design was maintained; researchers merely decreased the volume of each large graphite chamber to half of its original size. This change decreased the amount of electrode material from 20.04 g to 8.433 g. Figure 14 illustrates the salt removal over time using the smaller system.

Figure 14. Salt concentration over time for the system after the design modification. Operational parameters: applied voltage of 1.2 V, flow rate of 5 mL/min, and a cycle time of 20 minutes.

Qualitative analysis demonstrates that the conductivity was nearing equilibrium at the end of the charge and discharge cycles, so a cycle time of 20 minutes was adequate for the smaller system. The adsorption capacity of the smaller system was considerably higher than that of the larger system, indicating improved salt removal with the design modification, although charge efficiency decreased (Table 4).

Table 4. Performance metrics and size comparisons between the two filters of different sizes illustrate that a decrease in filter size coincides with an increase in adsorption capacity but a decrease in charge efficiency. Researchers attribute the decrease in charge efficiency to an inadvertent decrease in GAC density.

| | Large Filter | Small Filter |
|---|---|---|
| Volume | 45.23 cm³ | 25.24 cm³ |
| Mass of GAC | 20.04 g | 8.433 g |
| Density of GAC | 0.433 g/cm³ | 0.334 g/cm³ |
| Adsorption Capacity | 155.4 µg/g | 287.7 µg/g |
| Charge Efficiency | 35.6 % | 27.3 % |


The adsorption capacity increased, indicating that the GAC in the filter adsorbed more salt than in previous tests. However, the charge efficiency decreased, which meant that less charge was directed towards salt removal. Researchers hypothesize that the difference in GAC densities between the chambers caused this decrease in charge efficiency. As the chamber becomes less dense, it is more difficult for charge to be administered across the GAC; thus, a lower charge efficiency should coincide with a lower electrode density. Nevertheless, the broader goal of this research project was to study the adsorption capabilities of GAC, using our filter as the avenue to do so. Thus, this increase in adsorption capacity was another promising sign.

4.4 – Impact of Anode/Cathode Arrangement on Filter 

Performance 

In this design modification, researchers changed the 

location of the anode and cathode attachments to expand 

the amount of GAC impacted by the applied voltage (Fig. 

10). The results from this design modification are shown 

in Table 5. 

Table 5. Performance metrics before and after 

increasing the distance between anode and cathode 

attachment plates demonstrate that a wider 

electrical field significantly increases the charge 

efficiency and adsorption capacity of the system. 

| | Previous Design | New Design |
|---|---|---|
| Adsorption Capacity | 287.7 µg/g | 452.0 µg/g |
| Charge Efficiency | 27.3 % | 63.1 % |

This modification caused the most significant increase in charge efficiency experienced by the filter. Additionally, there was a large increase in adsorption capacity, which was likely due to the amount of charge contributing towards salt removal. The distance between the applied voltages was much larger than before, likely causing the increase in charge efficiency. Moreover, charge efficiency reflects the performance of the filter, while adsorption capacity reflects the performance of the GAC. Thus, the correlation between increases in filter performance and increases in GAC performance indicates that GAC has even more potential to serve as an electrode material as the system design continues to improve.

5. Discussion and Conclusions 

The aforementioned study established the efficiency 

of a novel electrode material, granular activated carbon, 

commonly used in portable water filters. Many industrial 

water filtration companies leverage GAC’s adsorptive 

capabilities to remove uncharged contaminants. Without a 

preexisting design, researchers leveraged their innovation 

and created a system that dispersed electrical charge 

across a chamber of GAC. Throughout experimentation, 

researchers have improved GAC’s adsorption capacity 

from ~10 µg/g to ~450 µg/g. The filter was initially 

invented as a lab-scale device aimed to determine the 

potential of GAC. This profound increase in adsorption 

capacity proves GAC to be a promising potential electrode 

material for use on the industrial scale. Moreover, the 

charge efficiency of the system was increased from ~13% to 

~63% over the course of various design modifications. It is 

important to consider that operational conditions are not 

yet ideal, and are evidently limiting system performance. 

Nevertheless, researchers proved that the electrochemical 

technique capacitive deionization is compatible for use 

with granular activated carbon. Thus, researchers have 

created a portable device that feasibly adapts GAC for salt 

removal. The low cost and energy requirements of this desalination technique could make it a valuable resource for those impacted by the growing demand for freshwater resources. Furthermore, though currently untested, the device may have the potential to adapt GAC for the removal of other, more harmful, charged contaminants.

6. Acknowledgement 

My work for this project was completed at North 

Carolina State University’s Environmental Engineering 

Lab from June 2018 to January 2019 under the mentorship 

of Dr. Douglas Call and Dr. Shan Zhu. Both of my mentors 

played fundamental roles in building my competency with 

capacitive deionization technology. Moreover, these 

mentors initially proposed the idea to design a filter that 

could utilize granular activated carbon (GAC) based upon 

their knowledge of the advantages of GAC. While the filter 

was designed, modeled, and assembled entirely by myself, 

I have sought their guidance throughout the development 

of my specific research aims to ensure continual system 

enhancement. 

7. References 

Jia, B., Zheng, W. (2016). Preparation and Application of Electrodes in Capacitive Deionization (CDI): a State-of-Art Review. Nanoscale Research Letters, 11.

Schutte, C. F., Welgemoed, T. J. (2005). Capacitive Deionization Technology™: An Alternative Desalination Solution. Desalination, 183, 327-340.

Avraham, E., Noked, M., Bouhadana, Y., Soffer, A., Aurbach, D. (2009). Limitations of Charge Efficiency in Capacitive Deionization. Journal of the Electrochemical Society, 156, 157-162.

Suss, M. E., Porada, S., Sun, X., Biesheuvel, P. M., Yoon, J., Presser, V. (2015). Water desalination via capacitive deionization: what is it and what can we expect from it? Energy & Environmental Science, 8, 2296-2319.

Porada, S., Zhao, R., Van der Wal, A., Presser, V., Biesheuvel, P. M. (2013). Review on the science and technology of water desalination by capacitive deionization. Progress in Materials Science, 58, 1388-1442.

Bian, Y., Huang, X., Jiang, Y., Liang, P., Yang, X., Zhang, C. (2015). Enhanced desalination performance of membrane capacitive deionization cells by packing the flow chamber with granular activated carbon. Water Research, 85, 371-376.


LONG PRIME JUGGLING PATTERNS 

Daniel Carter and Zach Hunter 

Abstract 

There are a large variety of ways to juggle balls. Different juggling patterns can be modeled by a sequence of states that 

describe the positions of the balls in regular time intervals. A pattern is said to be prime if it does not repeat states more 

than once per cycle. We investigate the problem of finding the longest prime pattern for a given number of balls and 

maximum throw height. Solutions up to a maximum throw height of 9 were found by computer search. We completely 

solve the 2-ball case and provide a very strong upper bound for all other cases. This upper bound differs by no more than 

1 from every computed case. 


Juggling and mathematics are intricately connected. 

The math YouTube channels Mathologer (Polster & 

Geracitano, 2015) and Numberphile (Wright & Haran, 

2017) have both released videos on juggling. This introduction 

reiterates the information in those videos and introduces 

the main problem of this paper. 

There are many ways to juggle balls. For example, two 

basic 3-ball patterns are cascade, where the balls travel in a 

figure eight, and shower, where they travel in a circle. We 

can represent these patterns by following which hand the 

balls are in or traveling to over time in a ladder diagram, 

such as the ones shown in Figure 1.1. 

The left and right columns of dots represent the left 

and right hands, and the lines represent the paths of the 

balls. For the cascade, every ball is thrown so that it lands 

in the opposite hand 3 steps later. In other words, the ball 

is thrown to height 3. However, for the shower, the right 

hand throws balls to height 5 and the left hand throws balls 

to height 1. 

Jugglers assign siteswap notation to these patterns. This 

notation lists the sequence of throw heights in a pattern. 

For example, cascade has a siteswap of “3” and shower has 

a siteswap of “51.” It is worth noting that siteswap notation 

does not distinguish the left hand from the right. In fact, 

these patterns could be juggled using just one hand. Also 

worth noting is that there may be multiple siteswaps that 

refer to one pattern; for example, 51 and 15 represent the 

same pattern. Finally, a 0 in siteswap means all balls are in 

the air and there is no ball ready to be thrown. 

We can also describe the states reached by a pattern. 

Each state is a sequence of 1’s and 0’s representing the 

positions of the balls in the air. A 1 in the kth position 

indicates a ball in the air will land k steps later. At most 

one ball may be in each position, because two balls in the 

same position will fall into the same hand at the same time, 

which isn’t allowed in basic juggling. In the cascade, the 

only state is (111): one ball is always just about to land, 

one will land in two steps, and one will land in three steps. 

Jugglers call this state ground state, as it is the state with all 

balls in the lowest position. For the shower, the two states 

are (11010) before a throw of height 5 and (10101) before 

a throw of height 1. Two throws of a shower are shown 

diagrammatically in Figure 1.2. 
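The state description above can be checked by simulating a siteswap beat by beat. The following is a minimal sketch, assuming states are written as strings of 0's and 1's with the leftmost character being the ball about to be thrown; the function name `siteswap_states` is introduced here for illustration only.

```python
def siteswap_states(siteswap, max_height):
    """Return the states visited by a periodic siteswap, one per throw."""
    period = len(siteswap)
    # A position is occupied at beat 0 if some earlier throw lands on that beat.
    landing = set()
    for beat in range(-max_height, 0):
        height = siteswap[beat % period]
        if height > 0 and beat + height >= 0:
            landing.add(beat + height)
    state = [1 if k in landing else 0 for k in range(max_height)]

    visited = []
    for beat in range(period):
        visited.append("".join(map(str, state)))
        height = siteswap[beat]
        has_ball = state[0] == 1
        # Every ball drops one position; the leftmost entry leaves the state.
        state = state[1:] + [0]
        if height > 0:
            if not has_ball or state[height - 1] == 1:
                raise ValueError("invalid throw")
            state[height - 1] = 1  # the thrown ball lands `height` beats later
        elif has_ball:
            raise ValueError("a ball is present, so height 0 is not allowed")
    return visited


print(siteswap_states([3], 3))      # cascade
print(siteswap_states([5, 1], 5))   # shower
```

Running the sketch on the cascade and the shower reproduces the states (111) and (11010), (10101) named in the text.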

Figure 1.1. Ladder diagrams of the 3-ball cascade and 

3-ball shower. 

MATHEMATICS AND COMPUTER SCIENCE 


Figure 1.2. Converting location of balls to states. Bolded arrows indicate throws and are labeled with the throw height. Dotted arrows show balls dropping due to gravity.

Reading from the bottom to the top, marking a 1 for every ball and a 0 for every gap gives the states (11010) and (10101). In this diagram, the balls are colored differently for clarity. However, we will consider each ball indistinguishable for our analysis.

As seen, throws can change the state of the balls. In general, with every throw each “1” moves left one place (i.e. the corresponding ball falls slightly) except for a ball in the leftmost position, which is thrown to some currently empty spot. We can make a directed graph describing every possible state and throw. A closed walk in this graph is a repeating pattern of throws, that is, a juggling pattern. The graph for 3 balls with a maximum throw of 5 is shown in Figure 1.3.

Figure 1.3. Juggling graph for 3 balls and max height 5. Vertices represent states and edges represent throws. Vertices are labeled with the state they represent, and edges are labeled by throw height.

We will denote the graph for b balls with max throw height n as J(n, b). The diagram above represents J(5,3). If a pattern visits each state no more than once, jugglers call it a prime pattern. This is because if a state is visited multiple times, the pattern can be decomposed into two or more prime patterns. For example, the pattern with a siteswap of 423 visits the state (11100) twice and can be decomposed into the prime patterns 42 and 3. Prime patterns correspond to cycles on the graph, which are closed walks that do not repeat vertices.

The number of (not necessarily prime) patterns is well-established (Takahashi, 2015). The more difficult question of the number of prime patterns has a partial answer (Banaian et al., 2015). We attempt to find the longest prime pattern for each combination of balls and maximum throw height.

2. Empirical Results and Symmetry

There are finitely many prime patterns, because for any finite graph, there are finitely many cycles. Therefore, a computer can search and find the longest prime pattern. Call L(n, b) the length of the longest prime pattern for b balls and maximum height n. The values of L(n, b) for 0 ≤ n ≤ 9 are given in Table 2.1.

Table 2.1. Lengths of longest prime patterns for max height 9 or less.

For example, the value at b = 3, n = 5 is 8 because the longest prime pattern has siteswap 55150530, which is length 8. In Figure 1.3, this corresponds to the 8 states in the center of the diagram that are arranged in an octagon. Interestingly, the table appears symmetrical, with L(n, b) = L(n, n−b). We will now prove this. In fact, we will prove a somewhat stronger result.

Theorem 2.1. There exists a bijection between patterns with b 

balls and n − b balls. 

Proof. Consider a valid juggling pattern for b balls with 

maximum height n. List the states of this pattern in order. 

Now, switch 0’s and 1’s, mirror each state left-to-right, 

and reverse the order of the list. This new list is a valid 

pattern for n − b balls. For example, there is a 3-ball pattern 

of height 5 with siteswap 5511. The states reached are, in 

order: 

(11100) 

(11001) 

(10011) 

(10110) 


The new list in this case is 

(10010) 

(00110) 

(01100) 

(11000) 

Which are, in fact, the states reached by the 2-ball 

pattern with siteswap 4004. 
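The example above can be reproduced mechanically. The following is a small sketch of the transformation (swap 0's and 1's, mirror each state, reverse the list), together with a helper that recovers the throw heights from consecutive states; all function names are introduced here for illustration.

```python
def bizarro(state):
    """Swap 0s and 1s, then mirror the state left-to-right."""
    swapped = "".join("1" if c == "0" else "0" for c in state)
    return swapped[::-1]


def bizarro_pattern(states):
    """Transform every state and reverse the order of the list."""
    return [bizarro(s) for s in reversed(states)]


def throw_between(state, nxt):
    """Height of the single throw taking `state` to `nxt`, or None."""
    shifted = state[1:] + "0"          # every ball drops one position
    if state[0] == "0":                # no ball to throw: height 0 only
        return 0 if shifted == nxt else None
    for h in range(1, len(state) + 1):
        if shifted[h - 1] == "0" and shifted[:h - 1] + "1" + shifted[h:] == nxt:
            return h
    return None


def siteswap_of(states):
    """Throw heights of the closed walk through `states`, in order."""
    return [throw_between(s, states[(i + 1) % len(states)])
            for i, s in enumerate(states)]


original = ["11100", "11001", "10011", "10110"]   # states of 5511
mirrored = bizarro_pattern(original)
print(mirrored)
print(siteswap_of(mirrored))
```

The printed states match the new list in the text, and the recovered throws are 4, 0, 0, 4.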

To see why this bijection works, consider two seemingly 

unrelated questions: “What happens to the 0’s in the state 

after each throw?” and “What states could have led into 

some particular state?” 

For the first problem, there are three cases. The first 

is the case where there is a 0 in the leftmost position, so 

a throw of height 0 is the only option. In this case, all 0’s 

except the leftmost move left one position (i.e. fall) and 

a 0 appears in the rightmost position. Next, if a throw 

of maximum height is made, all 0’s simply move left one 

position. Finally, for any other throw, all 0’s move left one 

position, a 0 appears in the rightmost position, and one 

of the 0’s disappears because it was filled by the ball just 

thrown. 

For the second problem, there are also three cases. 

The first is if there is a 1 in the rightmost position, so the 

previous throw must have been maximum height. In this 

case, the previous state had the 1’s (except the rightmost 1) 

moved right one step, and there was a 1 in the leftmost 

position. Next, there is the case where the previous throw 

was height 0, and the previous state had all 1’s simply 

moved right one position. Finally, for any other throw, all 

1’s were moved right one position, a 1 was in the leftmost 

slot, and one of the 1’s disappears because it has not been 

thrown yet. 

Clearly, these problems are equivalent! Simply swap 0 

and 1 and left and right. This accounts for the swapping of 

0’s and 1’s and the left-to-right mirroring in the bijection. 

The reversal of the order of states is a reversal of time, 

which comes from the statement of the second question. 

Due to this bijection, any pattern for b balls with max 

height n corresponds to a pattern for b gaps — that is, n − b 

balls. 

Corollary 2.2. L(n, b) = L(n, n − b). 

Corollary 2.3. To construct J(n, n − b) given J(n, b), reverse 

all arrows and relabel each vertex by switching 0 and 1 and 

mirroring left-to-right. 
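Corollary 2.3 can be verified by brute force for small cases. The sketch below builds the edge set of J(n, b) directly from the throwing rule (edge labels are ignored) and compares the reversed, relabeled graph with J(n, n − b). The helper names are assumptions made for this illustration.

```python
from itertools import combinations


def all_states(n, b):
    """All length-n strings with exactly b ones."""
    return ["".join("1" if k in ones else "0" for k in range(n))
            for ones in combinations(range(n), b)]


def next_states(state):
    """States reachable from `state` with one throw."""
    shifted = state[1:] + "0"          # every ball drops one position
    if state[0] == "0":                # no ball to throw: height 0
        return {shifted}
    # The thrown ball may land in any currently empty position.
    return {shifted[:k] + "1" + shifted[k + 1:]
            for k in range(len(shifted)) if shifted[k] == "0"}


def edges(n, b):
    """Unlabeled edge set of the juggling graph J(n, b)."""
    return {(s, t) for s in all_states(n, b) for t in next_states(s)}


def bizarro(state):
    """Swap 0s and 1s, then mirror left-to-right."""
    swapped = "".join("1" if c == "0" else "0" for c in state)
    return swapped[::-1]


def corollary_holds(n, b):
    """Reverse every arrow of J(n, b), relabel, and compare with J(n, n - b)."""
    transformed = {(bizarro(t), bizarro(s)) for s, t in edges(n, b)}
    return transformed == edges(n, n - b)


print(corollary_holds(5, 3))
```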

Borrowing terminology from graphical linear algebra, we call the state formed after doing the bijection the bizarro of the initial state, denoted $S^*$. We introduce the functions next and prev of a state S which return the set of possible states that could follow or precede S, respectively. From this theorem, if $S_2 \in \mathrm{prev}(S_1)$, then $S_2^* \in \mathrm{next}(S_1^*)$.

We will derive some basic upper bounds on the lengths 

of the longest prime patterns. 

3. Basic Upper Bounds 

Obviously, we cannot have a prime pattern with more 

states than the number of possible states. 

Lemma 3.1. The number of possible states is $\binom{n}{b}$. Equivalently, J(n, b) has $\binom{n}{b}$ vertices.

Proof. Each state is a permutation of b copies of 1 and n − b copies of 0. Therefore, the number of distinct states is $\binom{n}{b}$.

Corollary 3.2. $L(n, b) \le \binom{n}{b}$.

In fact, if b > 1 and n − b > 1, this inequality is strict because it is impossible to reach all states without repetition. This is proven below.

Lemma 3.3. If b > 1 and n − b > 1, $L(n, b) < \binom{n}{b}$.

Proof. Consider the state with all balls in the highest possible position,

$$S = (\underbrace{0 \cdots 0}_{n-b}\,\underbrace{1 \cdots 1}_{b}).$$

Because this state ends in 1, the previous throw must have been max height and the previous state was

$$(1\,\underbrace{0 \cdots 0}_{n-b}\,\underbrace{1 \cdots 1}_{b-1}).$$

However, this new state also ends in 1, so the previous throw must have been max height, and so on until all b copies of 1 are exhausted and the state is ground state,

$$(\underbrace{1 \cdots 1}_{b}\,\underbrace{0 \cdots 0}_{n-b}).$$

In other words, the only way to get to the original state S is to do b max height throws from ground state. However, because S begins with n − b copies of 0, the next n − b throws must all be height 0. After those throws, we return to ground state, closing the walk.

This means that S is only reached by a single prime pattern. This pattern has length n, and for b > 1 and n − b > 1, $n < \binom{n}{b}$. In other words, the longest prime pattern will either reach S and be length n or not reach S at all. Therefore, if b > 1 and n − b > 1, $L(n, b) < \binom{n}{b}$.

Now, consider the simple case where b = 2. Using a more 

complex argument, stronger bounds can be constructed. 
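For small cases, the bounds of this section can be compared against an exhaustive search for the longest cycle in J(n, b). The following is a minimal sketch of such a search, not the authors' program; the helper names are illustrative and the search is only practical for small n.

```python
from itertools import combinations
from math import comb


def all_states(n, b):
    """All length-n strings with exactly b ones."""
    return ["".join("1" if k in ones else "0" for k in range(n))
            for ones in combinations(range(n), b)]


def next_states(state):
    """States reachable from `state` with one throw."""
    shifted = state[1:] + "0"
    if state[0] == "0":
        return {shifted}
    return {shifted[:k] + "1" + shifted[k + 1:]
            for k in range(len(shifted)) if shifted[k] == "0"}


def longest_prime_pattern(n, b):
    """Length of the longest cycle in J(n, b), found by exhaustive search."""
    states = all_states(n, b)
    rank = {s: i for i, s in enumerate(states)}
    best = 0

    def extend(start, current, visited):
        nonlocal best
        for nxt in next_states(current):
            if nxt == start:
                # Closing the walk: the cycle contains len(visited) states.
                best = max(best, len(visited))
            elif rank[nxt] > rank[start] and nxt not in visited:
                visited.add(nxt)
                extend(start, nxt, visited)
                visited.remove(nxt)

    # Each cycle is explored from its lowest-ranked state to avoid repeats.
    for s in states:
        extend(s, s, {s})
    return best


print(longest_prime_pattern(5, 3), comb(5, 3))
```

For b = 3 and n = 5 the search returns 8, matching the value quoted from Table 2.1 and falling strictly below the 10 states counted by Lemma 3.1.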



4. The Case b = 2 

The argument hinges on simplifying the problem by 

considering the distance between the two balls, rather than 

their exact positions in the states. The distance between 

two balls in a state is the difference in the position of their 

corresponding 1’s. For example, the distance between the 

balls in the state (01001) is 5 − 2=3, because the first 1 is in 

position 2 and the second 1 is in position 5. 

With only two balls, the distance between the balls and 

the position of the first ball completely describe a state. 

However, notice that if the first ball is not in the leftmost 

position, the only possible throw is 0 until that ball falls 

into the leftmost position. Therefore, any pattern that 

reaches a state with some distance d necessarily reaches 

the state with distance d and a 1 in the leftmost position. 

This implies that each throw where height ≠ 0 in a prime 

pattern must lead to a unique distance. 

By considering only the states with a 1 in the leftmost 

position, we can construct a weighted directed graph with 

each vertex representing a unique distance and the weights 

on the edges indicating the maximum number of throws 

from one distance to another, using only one throw 

of height > 0. For example, the graph for 2 balls with 

maximum height 5 (or maximum distance 4), is shown in 

Figure 4.1. 

Figure 4.1. Condensed 2-ball juggling graph with max height 5. Vertices represent states with a 1 in the leftmost position and edges represent throws. Vertices are labeled with distance, and edges are labeled with the number of states reached in the transition from one state to another.

The edge from distance 3 to distance 2 has weight 3 because the longest path from (10010) to (10100) is length 3, given by the throws 5, 0, 0.

The edge weights can be calculated easily. Every time a ball is thrown, it can either be thrown to a higher position than the other ball or to a lower position. If it is thrown higher, the next ball to land will be the second ball, which happens in d steps. If the first ball is thrown lower, say height h, it will be the next to land, h steps later. Let d′ be the target distance. Then h = d − d′. Finally, this transition is only possible if d′ < d (so we can throw lower) or d + d′ ≤ n (so we don’t throw above max height).

Then, an edge exists between states d and d′ if d′ < d or d + d′ ≤ n. Its weight is

$$W(d, d') = \begin{cases} d & \text{if } d + d' \le n, \\ d - d' & \text{otherwise.} \end{cases}$$

Rather than drawing the graph, it is simpler to consider a modified adjacency matrix where the entry in row x and column y is W(x, y), if that edge exists. The example n = 5 is below.

$$\begin{array}{c|cccc} & 1 & 2 & 3 & 4 \\ \hline 1 & 1 & 1 & 1 & 1 \\ 2 & 2 & 2 & 2 & \cdot \\ 3 & 3 & 3 & \cdot & \cdot \\ 4 & 4 & 2 & 1 & \cdot \end{array}$$

A cycle on the weighted graph does not repeat states, so it is also a prime pattern. Its length is the sum of the weights of the edges that it traverses. In the n = 5 case, the longest prime pattern created using this strategy is length 8. The edges it traverses are boxed in the matrix below.

$$\begin{array}{c|cccc} & 1 & 2 & 3 & 4 \\ \hline 1 & 1 & 1 & 1 & \boxed{1} \\ 2 & 2 & 2 & \boxed{2} & \cdot \\ 3 & \boxed{3} & 3 & \cdot & \cdot \\ 4 & 4 & \boxed{2} & 1 & \cdot \end{array}$$

In fact, there is a general pattern that gives very long prime patterns and a lower bound on L(n,2).

Lemma 4.1. $L(n, 2) \ge \binom{n}{2} - \lfloor n/2 \rfloor$.

Proof. Construct a sequence of distances as follows: 

• Begin with distance n − 1. 

• Go to $\lfloor n/2 \rfloor$. 

• Alternate across n/2, each time going to the distance 

closest to n/2 not yet reached. For odd n, begin by 

increasing the distance, and for even n, begin by 

decreasing the distance. 

• When distance 1 is reached, go to distance n − 1. 
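The steps above can be sketched in code to see the resulting pattern length for several n. Two points are assumptions of this sketch rather than statements from the text: the second step is taken to go to floor(n/2), and the edge weight is computed from the throwing rules described earlier (weight d when d + d′ ≤ n, otherwise d − d′ when d′ < d). Function names are illustrative.

```python
from math import comb


def weight(d, d_next, n):
    """Edge weight from distance d to d_next, or None if no edge exists."""
    if d + d_next <= n:
        return d               # throw above the other ball: d throws in total
    if d_next < d:
        return d - d_next      # throw below the other ball
    return None


def lemma_sequence(n):
    """Cyclic sequence of distances built by the four steps above."""
    middle = n // 2            # assumption: the second step goes to floor(n/2)
    sequence = [n - 1, middle]
    low, high = middle - 1, middle + 1
    go_up = (n % 2 == 1)       # odd n starts by increasing the distance
    while len(sequence) < n - 1:
        if go_up:
            sequence.append(high)
            high += 1
        else:
            sequence.append(low)
            low -= 1
        go_up = not go_up
    return sequence


def pattern_length(n):
    """Sum of the edge weights around the cyclic sequence."""
    seq = lemma_sequence(n)
    total = 0
    for i, d in enumerate(seq):
        w = weight(d, seq[(i + 1) % len(seq)], n)
        if w is None:
            raise ValueError("sequence uses a missing edge")
        total += w
    return total


print(lemma_sequence(10), pattern_length(10))
```

For n = 10 the sketch reproduces the sequence 9, 5, 4, 6, 3, 7, 2, 8, 1, and for n = 5 it gives the length-8 pattern discussed earlier.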

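For example, take n = 10. The sequence of distances formed by this procedure is 9, 5, 4, 6, 3, 7, 2, 8, 1. Looking at the matrix representation makes it much more obvious what this process does. For n = 10, the edges traversed, with their weights in parentheses, are 9→5 (4), 5→4 (5), 4→6 (4), 6→3 (6), 3→7 (3), 7→2 (7), 2→8 (2), 8→1 (8), and 1→9 (1).

The sum of the weights of the edges traversed is the length of the pattern. In general, every distance from 1 to n − 2 contributes its own value as a weight, and distance n − 1 contributes $n - 1 - \lfloor n/2 \rfloor$, so this sum is

$$\sum_{k=1}^{n-2} k + \left(n - 1 - \lfloor n/2 \rfloor\right).$$

For integer n, this is equal to $\binom{n}{2} - \lfloor n/2 \rfloor$. Writing it in this way shows a difference of just $\lfloor n/2 \rfloor$ from the upper bound $\binom{n}{2}$.

In fact, we will later see that $\binom{n}{2} - \lfloor n/2 \rfloor$ is exactly L(n,2). This is due to a stronger upper bound for L that is derived by extending the notion of distance to cases with b > 2.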

5. Extension to b > 2 

For some state S with a ball in the lowest position, write 

the sequence of distances between each ball and the next-highest 

ball, starting with the lowest ball. Call the sum 

of this sequence m and append n − m to the sequence to 

construct the distance notation of a state. We write distance 

notation in brackets and without commas or space 

between entries. For example, for the state (100101), the 

distance notation is [321]. 

Distance notation is useful because after a max height 

throw and all subsequent height 0 throws, the distance 

notation rotates one place. Again taking the state (100101), 

after the siteswap 600, the state is (101100) and the 

distance notation is [213]. The states corresponding to all 

unique rotations of a distance notation and all “in-between” 

states that have a 0 in the leftmost position form a subcycle: 

a set of states formed when doing only max height and 

height 0 throws. Each state is part of exactly one subcycle. 
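The rotation property can be illustrated with a short sketch that computes distance notation and then performs one max height throw followed by the forced height 0 throws. The function names are introduced here for illustration, and states are written as strings.

```python
def distance_notation(state):
    """Distance notation of a state with a ball in the lowest position."""
    n = len(state)
    positions = [k for k, c in enumerate(state, start=1) if c == "1"]
    if not positions or positions[0] != 1:
        raise ValueError("state must have a 1 in the leftmost position")
    gaps = [hi - lo for lo, hi in zip(positions, positions[1:])]
    return gaps + [n - sum(gaps)]   # append n - m, where m is the sum of gaps


def max_throw_then_zeros(state):
    """One max height throw, then height 0 throws until a ball is lowest."""
    state = state[1:] + "1"         # max height: the ball goes to the far right
    throws = 1
    while state[0] == "0":
        state = state[1:] + "0"     # height 0: everything just drops
        throws += 1
    return state, throws


start = "100101"
after, throws = max_throw_then_zeros(start)
print(distance_notation(start), after, throws, distance_notation(after))
```

Starting from (100101) with notation [321], three throws (6, 0, 0) lead to a state whose notation is the rotation [213].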

Subcycles are very useful for finding long prime 

patterns, because a particular subcycle contains many 

states that cannot be reached by and cannot reach any state 

outside the subcycle. All states that end in 1 must have had 

the previous throw be max height, so the previous state 

was in the subcycle. Furthermore, all states that begin in 0 

must have the next throw be height 0, so the next state will 

be in the subcycle. 

Not all subcycles have the same number of states. For 

example, (1010) and (0101) form a subcycle with b = 2 

and n = 4, but (1100), (1001), (0011), and (0110) are also a 

subcycle with b = 2 and n = 4. If the number of states in a 

subcycle is m, the ratio n/m is the multiplicity of the subcycle, 

denoted with the letter x. Multiplicity can also be seen as a 

property of a state and is the number of times a string of 1’s 

and 0’s is repeated to form that state. For example, (1010) 

has multiplicity 2 because it is (10) repeated 2 times. States 

have the same multiplicity as the subcycle of which they 

are part. Clearly, x must be a divisor of n. It must also be a 

divisor of b, because each repetition must include the same 

number of balls. Therefore, x must be a divisor of gcd(n, b). 
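A minimal sketch of the state-level definition of multiplicity follows. The function name `multiplicity` is an assumption; the code simply finds the shortest repeating unit of the string.

```python
from math import gcd


def multiplicity(state: str) -> int:
    """Number of times the shortest repeating unit is repeated to form `state`."""
    n = len(state)
    for period in range(1, n + 1):
        if n % period == 0 and state == state[:period] * (n // period):
            return n // period
    return 1  # not reached for a non-empty string; kept for clarity


state = "101010"
x = multiplicity(state)
# x divides gcd(n, b), as argued in the text.
print(x, gcd(len(state), state.count("1")) % x == 0)
```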

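Let C_x(n, b) be the number of subcycles of multiplicity x with max throw n and b balls. We obtain the following bound.

Theorem 5.1. If b > 1 and n − b > 1,

\[ L(n, b) \le \sum_{x \mid \gcd(n, b)} \left( \frac{n}{x} - 1 \right) C_x(n, b), \]

or equivalently

\[ L(n, b) \le \binom{n}{b} - \sum_{x \mid \gcd(n, b)} C_x(n, b). \]

The notation α|β means α is a divisor of β.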

Proof. This bound essentially states that in each subcycle, 

we can hit at most one fewer state than the number of 

states in that subcycle. 

To see why this is true, consider all states with a 1 in 

the leftmost position. For brevity, we call these states 

grounded. These are the only states that can reach any state 

outside the subcycle. Consider a particular grounded state 

S. Then there is the next state in the subcycle S′ formed 

after doing a max height throw from S. The only state that 

can reach S′ is S. 

Now consider a prime pattern that includes all grounded states in a subcycle S_1, S_2, ..., S_{n/x}. Unless the prime pattern has no states outside this subcycle, at some point a throw lower than max height must be made from one of the grounded states S_i. However, the state after S_i in this subcycle could not be reached without repeating S_i.

Therefore, if a prime pattern includes states from multiple subcycles, it can hit at most one fewer than the number of states in each subcycle. The number of states in a subcycle of multiplicity x is n/x, so multiplying n/x − 1 by the number of subcycles with multiplicity x, then summing across all possible multiplicities gives the upper bound

\[ \sum_{x \mid \gcd(n, b)} \left( \frac{n}{x} - 1 \right) C_x(n, b). \]

Equivalently, we can start with the total number of states and subtract 1 for each subcycle to get

\[ \binom{n}{b} - \sum_{x \mid \gcd(n, b)} C_x(n, b). \]

The exceptions are when the longest prime pattern is 

actually just one subcycle, and the length of that subcycle 

is greater than the bound above. This only occurs when 

there is only one subcycle, which happens when b = 1 or 

b = 0. 

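We will define

\[ L_{\le}(n, b) = \binom{n}{b} - \sum_{x \mid \gcd(n, b)} C_x(n, b), \]

the upper bound from Theorem 5.1, for simplicity.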

How many subcycles of a particular multiplicity are there? We can construct several recurrence relations that uniquely define C_x.

Lemma 5.2. C_x(n, b) = C_1(n/x, b/x).

Proof. Recall that x counts the number of repetitions of a 

string needed to form a state with multiplicity x. Each of 

these strings is also a state with b/x balls and max height 

n/x. For example, (101010) is a state with 3 balls and max 


height 6, and the repeating unit (10) is a state with 1 ball 

and max height 2. 

Every subcycle of multiplicity 1 with b/x balls and max height n/x uniquely determines a subcycle of multiplicity x with b balls and max height n. Therefore, C_x(n, b) = C_1(n/x, b/x).

Table 5.2. Upper bound on length of longest prime 

pattern given by Theorem 5.1. 

Let s_x(n, b) be the number of states with multiplicity x, max height n, and b balls. Clearly, C_x(n, b) = x · s_x(n, b) / n, because each subcycle of multiplicity x has n/x states by definition. From the previous lemma, we have s_x(n, b) = s_1(n/x, b/x). We also have the following relation that involves s.

Lemma 5.3. \[ \sum_{x \mid \gcd(n, b)} s_x(n, b) = \binom{n}{b}. \]

Proof. Each state has a unique multiplicity, so summing 

across all possible multiplicities yields all states. 

Table 5.3. Difference between the upper bound and 

actual value of longest prime pattern. 

This is enough information to calculate any value of C 

and s, and therefore the upper bound on L. As an example, 

we will find L≤(6,3): 

In fact, L(6,3) = 15. 
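The calculation for this example can be sketched in Python from the relations above. The function names are hypothetical, and the code assumes the readings C_x(n, b) = x · s_x(n, b) / n and s_x(n, b) = s_1(n/x, b/x) that follow from Lemma 5.2, together with Lemma 5.3.

```python
from functools import lru_cache
from math import comb, gcd


def divisors(k: int) -> list[int]:
    return [d for d in range(1, k + 1) if k % d == 0]


@lru_cache(maxsize=None)
def states_mult_1(n: int, b: int) -> int:
    """s_1(n, b): all states minus those with multiplicity x > 1 (Lemma 5.3)."""
    repeated = sum(states_mult_1(n // x, b // x)
                   for x in divisors(gcd(n, b)) if x > 1)
    return comb(n, b) - repeated


def subcycle_count(x: int, n: int, b: int) -> int:
    """C_x(n, b): each subcycle of multiplicity x holds n/x states."""
    return x * states_mult_1(n // x, b // x) // n


def upper_bound(n: int, b: int) -> int:
    """Total number of states minus one state per subcycle (Theorem 5.1)."""
    if not 0 < b < n:
        raise ValueError("the bound is not defined for b = 0 or n - b = 0")
    return comb(n, b) - sum(subcycle_count(x, n, b)
                            for x in divisors(gcd(n, b)))


print(subcycle_count(1, 6, 3), subcycle_count(3, 6, 3), upper_bound(6, 3))
```

For n = 6 and b = 3 this gives 3 subcycles of multiplicity 1 and 1 subcycle of multiplicity 3 among the 20 states, so the bound is 20 − 4 = 16, one more than the actual value 15.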

Below are tables for values of C_1, L≤, and L≤ − L. C_x, and therefore L≤, is not defined for b = 0 or n − b = 0, so those entries are omitted.

Table 5.1. Number of subcycles with multiplicity 1. 

Table 5.3 shows that in many cases, L≤ = L. However, for 

cases where n = 2b and b > 2 (the central values in every 

other row), L(2b, b) < L≤(2b, b). Before this is proven, we 

will establish some necessary conditions to lose only 1 

state in each subcycle, instead of more. 

We define the first grounded state in a subcycle. As the 

name implies, this is the grounded state in a subcycle that 

first appears in a prime pattern. Not every grounded state 

can be a first grounded state. 

Lemma 5.4. If a grounded state has 1 as its last distance, it 

cannot be a first grounded state. 

Proof. If a state has 1 as its last distance, it must have a 1 in 

the rightmost position. However, this means the previous 

throw must have been a max height throw, so the previous 

state was a grounded state in the same subcycle. Thus, a 

state with 1 as its last distance cannot be the first grounded 

state reached in a subcycle. 

Now consider, for example, the state (1101000), or in 

distance notation, [124]. If this is the first grounded state 

and the prime pattern only misses 1 state from its subcycle, 

then [241] and [412] will also be reached. The states 

missed are the non-grounded states “in-between” [412] 

and [124]. These are S_1 = (0001101), S_2 = (0011010), and S_3 = (0110100). Out of these, only S_2 and S_3 can be reached from a state outside this subcycle, because S_1 has a 1 in the rightmost position. If S_3 were reached first, then both S_1 and S_2 would be missed. Then to miss exactly one state, assuming [124] is the first grounded state reached in the subcycle, S_2 must be the first state reached in the subcycle.

In general, the first state reached in a subcycle must be two states after some grounded state S_G in a subcycle, if only 1 state is to be missed in that subcycle. We call this state an entry state for a subcycle. The single state missed in that case is the state from throwing maximum height from S_G.

Which states can reach this particular entry state? From 

Theorem 2.1, this question is equivalent to asking for the 

bizarro of the states that can be reached by the bizarro of 

the entry state. In our example, the entry state is (0011010), 

which has bizarro (1010011). There are 4 states that can 

immediately follow this state: (1100110), (0110110), 

(0101110), and (0100111), which have bizarros (1001100), 

(1001001), (1000101), and (0001101). Obviously, we 

discard the last of these, because it is in the same subcycle 

as the entry state. 
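The example above can be reproduced with a short sketch. Theorem 2.1 and the definition of the bizarro lie outside this section, so the code assumes, based on the listed examples, that the bizarro of a state is the string reversed with 0s and 1s swapped. All function names are hypothetical.

```python
def bizarro(state: str) -> str:
    """Reverse the state and swap 0s and 1s (assumed from the examples)."""
    return "".join("1" if c == "0" else "0" for c in reversed(state))


def next_states(state: str) -> list[str]:
    """States reachable in one throw, ordered by increasing throw height."""
    shifted = state[1:] + "0"
    if state[0] == "0":
        return [shifted]  # no ball to throw: height 0
    # The thrown ball may land in any empty slot of the shifted state.
    return [shifted[:i] + "1" + shifted[i + 1:]
            for i, c in enumerate(shifted) if c == "0"]


def prev_states(state: str) -> list[str]:
    """States that can reach `state`, using the bizarro correspondence."""
    return [bizarro(s) for s in next_states(bizarro(state))]


entry = "0011010"
print(bizarro(entry))
print(next_states(bizarro(entry)))
print(prev_states(entry))
```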

The distance notation for the three states that work 

are [313], [331], and [421]. These would be the last states 

reached in their subcycle, or the leaving states, and the 

corresponding first grounded states reached would be [133], 

[313], and [214]. Recall our original grounded state [124]. 

Notice the state [133] is just [124] with the second-to-last 

distance incremented and the last distance decremented. 

Notice as well that the other two states both have 1 as their 

second-to-last distance. These are in fact the only two 

possibilities, a fact that we will prove. 

Before the proof, we introduce the function entry of a grounded state S_G, which returns the unique entry state if the first grounded state reached in S_G’s subcycle is S_G. From the above example, entry((1101000)) = (0011010). We also introduce the function fg of a grounded state S_H, which returns the unique first grounded state of S_H’s subcycle if S_H is the leaving state. fg(S_H) is also the next grounded state after S_H in S_H’s subcycle. If fg(S_H) = S_G, then entry(S_G) is the state formed after a max height and height 0 throw from S_H.

Lemma 5.5. Let S_G with distance notation [d_1 d_2 ... d_{b−1} d_b] be the first grounded state of a subcycle. Then let {S_{p1}, S_{p2}, ...} be all states in prev(entry(S_G)) but not in the same subcycle as S_G. For each S_{pi}, let S_{qi} = fg(S_{pi}). Then the distance notation of each S_{qi} is either [d_1 ... d_{b−2} (d_{b−1} + 1)(d_b − 1)], or S_{qi} has 1 as the second-to-last distance and either d_b − 1 or d_b − 1 + d_1 as the last distance.

Proof. We will begin by constructing entry(S_G). We have

Let fg(S_H) = S_G. Then

We know the entry state is the state after a max height throw and one throw of height 0 from S_H, so

The bizarro is

We know

. In fact, only

entry(S_G)* is omitted in

There are three cases for possible throws from entry(S_G)*: the aforementioned max height throw, a throw of height 1, and every other throw. We must consider only the latter two.

Case 1: After a throw of height 1, the state is

which has bizarro

Then

, which is

S_{qi} has distance notation [d_1 ... d_{b−2} (d_{b−1} + 1)(d_b − 1)]. This is the first possibility described by the lemma.

Case 2: After a throw of height < n, the state is either Case 2a, where the ball is thrown somewhere in the middle:

or Case 2b, where the ball is thrown close to the end:

with the circled 1 representing the thrown ball. These two subcases are essentially the same and correspond to the two possibilities for the last distance, d_b − 1 and d_b − 1 + d_1. We will only show the rest of Case 2a, but Case 2b follows similarly.

After absorbing the circled 1 into the adjacent groups, we have

which has bizarro

We have S_{qi} = fg(S_{pi}), so

Then S_{qi} has a 1 as its second-to-last distance and d_b − 1 as its last distance, which is the second possibility described in the lemma. As mentioned before, Case 2b corresponds to the final possibility described in the lemma, with 1 as the second-to-last distance and d_b − 1 + d_1 as the last distance.

As these are the only two possibilities, the proof is complete.

We will denote pfg as the set of these previous first grounded states. That is, pfg(S_G) is the set of fg(S_{pi}) for each S_{pi} in prev(entry(S_G)) but not in the same subcycle as S_G. There is the additional constraint that any S′ ∈ pfg(S_G) must not have 1 as its last distance, because then S′ could not be a first grounded state from Lemma 5.4.

We have the following useful corollary.

Corollary 5.6. For any grounded state S that does not have 1 as its second-to-last distance, there is exactly one S′ where S ∈ pfg(S′). This S′ has the same distance notation as S but with the second-to-last distance decremented and the last distance incremented.

We now have the groundwork to tighten the bound for L(2b, b).

Theorem 5.7. For b > 2, L(2b, b) < L≤(2b, b).

Proof. This proof relies on the unique subcycle of multiplicity n/2. There are 2 states in this subcycle: S_1 = (1010...10) and S_2 = (0101...01).

From Theorem 5.1, we know that only S_1 could ever be reached in a prime pattern, except if that pattern consists of just S_1 and S_2. Consider pfg(S_1). From Lemma 5.5, each S_{qi} ∈ pfg(S_1) satisfies at least one of the following criteria:

• The distance notation is

• The second-to-last distance is 1 and the last distance is 2 − 1 = 1.

• The second-to-last distance is 1 and the last distance is 2 − 1 + 2 = 3.

The third possibility is actually the same as the first in this case. From Lemma 5.4, the second possibility does not work because its last distance is 1. Therefore, pfg(S_1) consists of exactly one state S′_1, which has distance notation

S_1 is also the leaving state, so consider the possible states S_i where S_1 ∈ pfg(S_i). Because S_1 does not have 1 as its second-to-last distance, Corollary 5.6 applies and the only state S where S_1 ∈ pfg(S) has distance notation

This is actually S′_1.

Therefore, if only one state is to be missed in each subcycle, the subcycle containing S′_1 must both immediately precede and immediately succeed S_1. The only prime pattern that satisfies this consists of only that subcycle minus one state and S_1, so it has length n. For any b > 2, this is not as long as the longest possible prime pattern, so we miss out on S_1. Therefore, for b > 2, L(2b, b) < L≤(2b, b).

6. Concluding Remarks 

The cases where n = 2b are not the only cases where 

L

AN ANALYSIS OF A NOVEL NEURAL NETWORK 

ARCHITECTURE 

Vatsal Varma 

Abstract 

Artificial Intelligence is a rapidly growing field in computer science, and the pinnacle of this field is the Artificial Neural 

Network (ANN). Modeled after neuronal connections in the brain, neural networks have proved exceptional in locating 

and discriminating amongst patterns in vast datasets. Each neural network contains a multivariate function, which is 

known as the error function. Using a different optimization function, the neural network attempts to reach a minimum 

of its error function by reaching the respective minima of its weights and biases. This study aims to determine the effects 

of four different neural network architectures (NNA) on their overall convergence rates holding all other variables 

constant. The architectures are based on different types of neural networks: The Deep Residual Network (DRN), the 

Multilayer Perceptron Network (MLP), the Extreme Learning Machine (ELM), and one novel design dubbed the 

Encoded Learning Machine (EncLM). A previous study used Boolean functions to determine the rate of optimization, 

and the novel design outperformed the other tested networks. However, this study utilizes the Modified National Institute of 

Standards and Technology (MNIST) dataset, a dataset of images of handwritten digits. Each of the networks was run 

over the 60,000 images for one epoch, and within that epoch, was optimized every 100 images using backpropagation. 

It was determined that the MLP and DRN were the weakest networks for fast optimization as they took the longest to 

converge. The EncLM was once again the fastest architecture to converge upon a satisfactory result. 


1.1 – Neural Networks 

An artificial neural network (ANN) is an abstraction of 

the biological nervous system, using artificial neurons and 

axons to form a web that can find solutions not known in advance. 

The popularity of such networks stems from their ability 

to adapt, learn and generalize. Due to these abilities, 

artificial neural networks can solve many computational, 

classification and pattern-recognition problems via a 

learning-based algorithm. 

In this study, every neural network is constructed and 

implemented with three factors remaining constant: the 

optimization method, the framework of the network, and 

the data used by each network. The data that each of the 

neural networks being tested will use are derived from the 

Modified National Institute of Standards and Technology 

(MNIST) dataset. The dataset in question contains 60,000 training images of handwritten digits, each of which has intrinsic properties that the network must derive to attain a successful output. Before specifying the steps taken to process the image, some notation needs to be defined. Let a dimension be d_i = (w_i, h_i, l_i), where w represents width, h represents height and l represents length. Each of the handwritten digits comes in an uncompressed format of d_u = (28, 28, 1). Using a program, each one of those images was compressed to a size of d_c = (24, 24, 1) to make computation and pooling easier for the neural networks.

There were two parts to each of the networks tested 

in this study: the convolutional neural network, and the 

feed forward network. The convolutional neural network 

formed the back end of each of the networks, as it allowed 

further compression of the handwritten digits into a one-dimensional 

input vector with meaningful data readable 

by the feed forward layer. The feed forward layer forms 

the front end of the network. The one-dimensional output 

vector determined by the convolutional neural network is 

then used as the input vector for the feed forward network. 

The feed forward architecture is what is being tested in 

this study. Thus, before describing the intricacies of each 

network, it is important to know how each network works 

mathematically. 

Before delving into the mathematics of neural networks, a few notation issues must be sorted out. Let n^L_j define each neuron in the feed forward layer. Let o^L_j define the activation of the neuron in layer L at position j and β^L_j define the bias of the neuron at layer L and position j. Similarly, let ī^L_j define the net input a neuron receives. Next, let w^{L_1,L_2}_{j,k} define the weight of the link between neuron j of layer L_1 and neuron k of layer L_2. Finally, let σ define the activation function of the neuron.

Figure 1. An artificial neuron model. 



Each neuron in the feed forward network is derived 

from the McCulloch & Pitts neuron model (Fig. 1). 

The model describes neurons as synaptically linked to 

each other, and each neuron may have multiple links 

to multiple other neurons. Each link to a neuron holds 

a specific value, called its weight w, as described earlier. 

That value represents the importance of the link to the 

neuron the information is going to. To exemplify, say there existed a neural network with two layers L_1 and L_2. Layer L_1 has two neurons, and L_2 has one neuron. In this network, there would only be two links, with weights w_1 = w^{1,2}_{1,1} and w_2 = w^{1,2}_{2,1}. If w_1 = 0, it would mean that the input ī^2_1 of the neuron would remain unaffected by the output o^1_1. This is also reflected in the way the net input of each neuron is calculated.

The input of each neuron in successive layers is 

calculated based on the sum of the product of the output 

of each neuron and the respective weight of the link 

propagating that output. 

A neuron not only holds its net input, but also is 

responsible for calculating its net output, which is a 

function of its net input ī and its bias β. Each neuron’s 

output is calculated as follows where σ is representative 

of the sigmoid function, and this is true for both the 

convolutional neurons and the feed forward neurons. 
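The two calculations described above can be sketched as follows. The exact equations are not reproduced here, so this sketch assumes that the net input is the weighted sum of the previous layer's outputs and that the bias is added to the net input before the sigmoid is applied. The function names are illustrative only.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def net_input(prev_outputs: list[float], weights: list[float]) -> float:
    """Sum of each incoming output times the weight of its link."""
    return sum(o * w for o, w in zip(prev_outputs, weights))


def neuron_output(prev_outputs: list[float], weights: list[float],
                  bias: float) -> float:
    """Assumed form: sigmoid of the net input plus the neuron's bias."""
    return sigmoid(net_input(prev_outputs, weights) + bias)


# A link of weight 0 leaves the net input unaffected by that neuron's output.
print(net_input([3.0, 1.0], [0.0, 2.0]), net_input([-7.0, 1.0], [0.0, 2.0]))
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))
```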

In this model, three activation functions are used: 

sigmoid σ(n), hyperbolic-tangent h(n) = tanh(n), and 

exponential linear units ε(n), a modification of the rectified 

linear units function. 

The sigmoid activation function suppresses each of the 

outputs within a range of (0,1). The hyperbolic tangent 

serves a similar purpose and suppresses the outputs within 

a range of (−1,1). The Exponential Linear Units serves a 

different purpose. It is used on the convolutional neurons 

within the convolutional neural network part of the 

entire network. Since the convolutional neural network 

(CNN) is tasked with compressing an image, its inherent 

purpose is to process each pixel of an image. The value 

of each pixel is its grayscale intensity, 

which is zero for blank background pixels. This is not 

adequately processed by the sigmoid or hyperbolic tangent 

functions, as when the pixel values become larger and 

larger, the hyperbolic tangent and sigmoid functions will 

become more and more saturated. Furthermore, negative 

pixel values are typically regarded as zero. That is why the 

Convolutional Neural Network utilizes the exponential 

linear units activation function instead of another. 
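For reference, the three activation functions can be sketched with their standard definitions. The scale parameter `alpha` of the exponential linear unit and its default value of 1.0 are assumptions, since the source does not state the value used.

```python
import math


def sigmoid(n: float) -> float:
    """Squashes the input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def hyperbolic_tangent(n: float) -> float:
    """Squashes the input into the range (-1, 1)."""
    return math.tanh(n)


def elu(n: float, alpha: float = 1.0) -> float:
    """Identity for positive inputs, smooth exponential curve otherwise."""
    return n if n > 0 else alpha * (math.exp(n) - 1.0)


for value in (-2.0, 0.0, 2.0, 10.0):
    print(value, sigmoid(value), hyperbolic_tangent(value), elu(value))
```

Printing a few values shows the saturation described above: the sigmoid and hyperbolic tangent flatten out for large inputs, while the exponential linear unit keeps growing linearly.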

There are two types of neurons in the CNN, the convolutional neuron n_C and the pooling neuron n_P. The n_C neurons operate in a similar fashion to the feed forward neurons, but the n_P neurons have a different purpose. Each of the n_P neurons takes a (2, 2) section of the image, finds the largest value within its section, and sets that value as its output. This essentially carries the most important 

pixel value for the next layer of processing. As the network 

progresses layer by layer, the image is compressed further 

and further until it becomes a one-dimensional input 

vector for the fully connected layer. 
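The pooling step can be sketched as below. The sketch assumes non-overlapping (2, 2) sections, which halves each spatial dimension, and represents one image slice as a list of rows. The function name is hypothetical.

```python
def max_pool_2x2(image: list[list[float]]) -> list[list[float]]:
    """Keep the largest value of each non-overlapping 2 x 2 section."""
    rows, cols = len(image), len(image[0])
    return [[max(image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1])
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]


sample = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 1, 2, 3],
          [4, 5, 6, 7]]
print(max_pool_2x2(sample))  # [[6, 8], [9, 7]]
```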

The CNN, like the feed forward network, is built in 

layers; however, the way those layers are designated is 

completely different from the feed forward network. 

The CNN operates through filters and convolutions. 

Essentially, a filter is a set of weights which are applied 

to sections of the image to create a net input for the 

convolutional neuron that filter is sending its data to. A convolutional layer can be described with a dimension d_i where its length represents the number of filters that layer has. For example, the image of dimensions d_c = (24, 24, 1) is convolved to create a layer of size d_L = (24, 24, 3). This means that there are three filters in that convolutional layer, each responsible for one (24, 24) slice of that layer. To further explain how filters work, let there be three filters f_0, f_1 and f_2 for the layer discussed above. Each filter acts upon the entire depth of the input image, thus the length of the filter must be the length of the previous image. Say the dimension of the filter was d_f = (3, 3, 1). Filter f_0 would operate on consecutive three by three by 

one sections of the input image. In the next convolutional 

layer, each filter would operate on consecutive three by 

three by three sections of the previous layer, and so on. 

Between convolutional layers exists a pooling layer, which 

further compresses the layer. For example, if the layer 

was of size (24, 24, 13) the max pooling layer would be of 

size (12, 12, 13). That is essentially how a CNN operates. 

Successive layers will convolve the previous layer’s 

output image, and slowly pool the image down to a 

manageable size. That output vector will then be used as 

an input vector for the feed forward network. 
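A single filter sliding over a single-depth image can be sketched as follows. The source does not say how image borders are handled, so this sketch assumes zero padding, which keeps the output the same size as the input (as in the (24, 24) example above). As in most neural network code, the filter is applied without flipping. The function name is hypothetical.

```python
def apply_filter(image: list[list[float]],
                 kernel: list[list[float]]) -> list[list[float]]:
    """Weighted sum of each neighbourhood, with zeros outside the image."""
    height, width = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    pad_h, pad_w = k_h // 2, k_w // 2

    def pixel(r: int, c: int) -> float:
        inside = 0 <= r < height and 0 <= c < width
        return image[r][c] if inside else 0.0

    return [[sum(pixel(r + i - pad_h, c + j - pad_w) * kernel[i][j]
                 for i in range(k_h) for j in range(k_w))
             for c in range(width)]
            for r in range(height)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ones = [[1, 1, 1]] * 3
print(apply_filter(image, ones))
```

Each output value would then become the net input of one convolutional neuron, to which the activation function is applied.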

All of the above tools, when put together, can be 


used to deliver an output of the network that classifies 

the handwritten digit with its respective value. This is 

known as the feed-forward stage. Initially that output will 

be meaningless, and it will remain so until the network 

is trained. The training vector \vec{t} holds the known correct output for each image and is used to train the network. \vec{t} has the same dimension as the output of the network. To then train the network, the error of the network is calculated using \vec{t} and the Mean Squared Error (MSE) formula, and then that error is backpropagated throughout the network.

If \vec{t} defines the correct/preferred output of the network, and \vec{ρ} is the actual output of the network, MSE can be calculated as follows.

(3) 
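As an illustration, the standard definition of the mean squared error is sketched below. Whether the source formula includes an additional constant factor (such as one half) is not known, so the plain average of squared differences is assumed.

```python
def mean_squared_error(target: list[float], output: list[float]) -> float:
    """Average of the squared differences between target and actual output."""
    if len(target) != len(output):
        raise ValueError("target and output must have the same length")
    return sum((t - p) ** 2 for t, p in zip(target, output)) / len(target)


print(mean_squared_error([1.0, 0.0], [0.5, 0.5]))  # 0.25
```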

Backpropagation has three main procedures: first, to 

determine the effect that a neuron’s output has on the 

error; then to determine the effect that the neuron’s bias 

and net input has on that same error; and finally, using 

the value calculated by the net input, to determine the 

effect that each link’s weight has in that error. Through 

backpropagation, the network is attempting to minimize 

the error function MSE(\vec{t}, \vec{ρ}) by calculating its negative 

gradient. 

To do this, the network calculates the partial derivative of the error function with respect to each weight and bias. Symbolically, this can be represented as ∂MSE/∂β^L_j for bias and ∂MSE/∂w^{L_1,L_2}_{j,k} for weights. By applying the chain rule we can 

expand these basic equations to finish the implementation 

of the entire backpropagation rule. The first step is to 

calculate a δ value for each neuron which indicates the 

direction the neuron’s output needs to step for it to reach 

a minimum. The δ is calculated differently depending on 

whether the neuron in question is an output neuron or 

not. 

(4) 

In these functions the sigmoid activation function can 

be replaced by any other activation function discussed 

earlier in the Introduction. 

Using this δ the network is able to calculate the 

effects of the weights and biases on the total error of the 

network and take a step down the gradient of the error 

function. The equations below define the backpropagation algorithm where η_β is the constant that determines the size of the step the bias β must take, and η_w is the constant that determines the size of the step the weight w of each individual link must take.

(5) 

(6) 
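The equations themselves are not reproduced here, so the following is only a sketch of the conventional delta rule for sigmoid neurons, under the assumption that the error for one output is half the squared difference between output and target. It is not necessarily the exact form used in this study, and all function names are illustrative.

```python
def output_delta(output: float, target: float) -> float:
    """Delta of an output neuron: error slope times the sigmoid derivative."""
    return (output - target) * output * (1.0 - output)


def hidden_delta(output: float, downstream_weights: list[float],
                 downstream_deltas: list[float]) -> float:
    """Delta of a non-output neuron, gathered from the neurons it feeds."""
    back = sum(w * d for w, d in zip(downstream_weights, downstream_deltas))
    return back * output * (1.0 - output)


def updated_bias(bias: float, delta: float, eta_beta: float) -> float:
    """Step the bias against the gradient with step size eta_beta."""
    return bias - eta_beta * delta


def updated_weight(weight: float, upstream_output: float, delta: float,
                   eta_w: float) -> float:
    """Step the weight against the gradient with step size eta_w."""
    return weight - eta_w * delta * upstream_output


d_out = output_delta(0.5, 1.0)
d_hid = hidden_delta(0.5, [2.0], [d_out])
print(d_out, d_hid, updated_bias(0.0, d_out, 0.5),
      updated_weight(1.0, 2.0, d_out, 0.5))
```

An output below its target yields a negative delta, so the update increases the bias and the weights of links carrying positive outputs.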

All of the above will remain constant in this study. 

The only variable will be the neural network architecture, 

or, in other words, the number of connections and how 

each network is linked together. What changes, then, is 

how the gradients δ are calculated. If different neurons 

are connected, then their outputs will differ based on the 

outputs of the neurons they are connected to. Thus, what 

happens if the neurons are connected in specific ways? 

How does that architecture change the performance of 

the network? Most importantly, is it possible to hybridize 

two architectures and obtain properties of both? This 

study is designed to test the differences between various 

neural networks based purely on their architecture. 

Using the image dataset, each neural network will be 

run for the entirety of 60,000 iterations, during which 

data corresponding to the neuron will be collected every 

iteration, and data corresponding to the network will 

be collected only when the total error is backpropagated 

(every 100 iterations). Data will include the neuron biases 

and activations, link weights and the overall network 

error. The aim of this study is to determine which NNA converges most quickly and efficiently based on its architecture alone. Furthermore, using the knowledge gained from the three initial networks, a novel NNA named the Encoded Learning Machine, which employs principles of various networks in an attempt to obtain faster and more efficient convergence, can be implemented.

2. Computational Approach 

The designing and visualization of these NNAs took 

place in two parts. First, each neural network was designed, 

implemented and tested in Java. Then, using Mathematica 

and Microsoft Excel, the data were visualized and each 

neural network was heuristically evaluated and given a 

score relative to the other networks. A high score meant 

the network converged faster than the other networks; 

a lower score meant that either the network’s error was 

high, or the network’s accuracy was low. However, before 

delving into the object-oriented implementation of these 

NNAs, an overview of the structures must be given. 



2.1 – Structure Overview 

There are four different NNAs used in this study: the Multilayer Perceptron, the Deep Residual Network, the Extreme Learning Machine, and finally the Encoded Learning Machine. The MLP is one of the most basic implementations of a neural network. Generally, an MLP consists of an input layer, usually represented as a vector, an output layer, and n hidden layers (Fig. 2). To clarify, layers are simply objects that hold an array of neurons. Each neuron is then connected with every neuron in the layer in front of it with links starting from the input layer and ending at the output layer. The MLP architecture is a simple and efficient structure that has been proven to be able to organize and classify data. It is often used as a part of a larger neural network.

Figure 2. A Multilayer Perceptron (MLP) model visualized in STELLA.

The DRN is quite like the MLP, but instead of only connecting consecutive layers, it can include connections that span more than one layer at regular intervals throughout the network (Fig. 3). The theory behind this network is that previous input is being forward propagated many layers to prevent loss of data and enhance generalization capabilities. DRN architectures have proven adept at generalizing images and other complicated data sets. They have further been used to obtain “higher-quality contact prediction” data for proteins (Wang, 2017). The DRN most likely will not show its full potential during this study due to all networks only being 4 layers deep, when DRN architectures can be upwards of 100 layers.

Figure 3. A Deep Residual Network (DRN) model visualized in STELLA.

Figure 4. An Extreme Learning Machine (ELM) 

model visualized in STELLA. 

The ELM is basically a network with an input layer, 

a hidden layer and an output layer. The only catch is 

that there exist random forward links from each neuron 

which can connect to any other neuron in the network 

if the prospective neuron is not in a layer behind the 

neuron requesting connection (Fig. 4). This network was 

designed to reduce the slow training speed of other types 

of neural networks (Ding, 2015). It has also proven to be 

better at generalization and have a faster learning rate 

(Ding, 2015). Furthermore, due to the stochastic nature 

of the connections, the number of hidden layers becomes 

arbitrary, therefore making it pointless to initialize this 

network with multiple hidden layers like the others. This 

network is actually trained differently from the rest of the 

networks when applied in other cases; however, in this 

study it remains like a network with random connections. 

The design was simply an experiment to analyze how a 

neural network with such connections would behave 

when trained by backpropagation. 

Finally, the EncLM is a hybridization of two different 

types of architectures: an autoencoder and the ELM. An 

autoencoder takes an input vector and transposes it to a 

layer identical to it (Fig. 5). This essentially means that 

it encodes the same information in a different pattern, 

meaning that more intricate aspects of the data can be 

seen by the network. Furthermore, due to the speedy 

performance, but higher average error, of the ELM, 

it became the second part of this network. Due to the 

autoencoder, it was theorized that if the neural network 

was able to process the same information coming from 

more neurons, it would be able to step down the gradient 

faster and more efficiently. 


Figure 5. An Encoded Learning Machine (EncLM) 

model visualized in STELLA. 

A convolutional neural network’s job is to compress a 

two- or three-dimensional input tensor, such as an image, 

into a one-dimensional vector that can be processed by the 

front end neural network. Each convolutional network 

is based on the idea of filters, where each filter “learns” 

to differentiate certain aspects of that tensor from other 

aspects. For handwritten digits, filters might be used to 

recognize loops or edges within the numbers. Each layer 

in the convolutional neural network depends on the filter 

assigned to it. Each convolutional neural network in this 

study had 10 filters of size five pixels by five pixels. At the 

end of the convolutional network, a pooling layer was 

constructed that normalized and compressed the previous 

outputs so that the front end networks had less data to 

analyze, and therefore saved memory on the computer. 

For this network, one convolutional layer of dimension 

d_L = (24, 24, 10) was used. Ten filters of size (5, 5) pixels 

were applied to this layer. After this layer determined its 

output, the next layer was the max pooling layer, which 

condensed the output of the convolutions into a smaller 

space and allowed faster forward propagation of the 

network. Each neuron in the CNN was connected to the 

neurons in the previous layer based on the size of its filter. 

The feed forward network had 4 layers: 32 neurons in the 

initial layer, followed by 20 neurons, 16 neurons and 

finally 10 neurons in the output layer. The input neurons 

of the feed forward layers were connected to the output 

neurons in the max pooling layer of the convolutional 

neural network. 
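As a quick check on the layer sizes quoted above, the following minimal sketch computes the output shapes of the convolution and pooling stages. It assumes 28 × 28 MNIST inputs, a stride of one, no padding, and a 2 × 2 pooling window; the stride, padding, and pooling window are not stated in this study, and the function names are illustrative only.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Side length of a convolution output along one spatial axis."""
    return (input_size + 2 * padding - filter_size) // stride + 1


def pool_output_size(input_size, pool_size):
    """Side length after non-overlapping pooling along one spatial axis."""
    return input_size // pool_size


IMAGE_SIDE = 28    # MNIST images are 28 x 28 pixels
FILTER_SIDE = 5    # filter size used in this study
NUM_FILTERS = 10   # number of filters used in this study
POOL_SIDE = 2      # assumed pooling window (not stated in the text)

conv_side = conv_output_size(IMAGE_SIDE, FILTER_SIDE)
conv_shape = (conv_side, conv_side, NUM_FILTERS)

pool_side = pool_output_size(conv_side, POOL_SIDE)
pool_shape = (pool_side, pool_side, NUM_FILTERS)

# Number of values handed to the feed forward network after flattening.
flattened_length = pool_side * pool_side * NUM_FILTERS

print(conv_shape, pool_shape, flattened_length)
```

Under these assumptions a 5 × 5 filter sliding over a 28 × 28 image yields the 24 × 24 feature maps quoted above, one per filter.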

This is a brief overview of how the architectures were 

set up in this study. 

2.2 – Data Generation 

The initial objective of the NNA implementation was 

to write the algorithm that converted the MNIST data into 

a readable form. Using a small Python script, each image 

of the 60,000 images in the dataset was converted into a 


one-dimensional input vector and put into a file readable 

by the Java ParseCSV class. Once this was complete, each 

one of those images needed to be assigned a classification 

vector $\vec{t}$. By obtaining the correct value for each image,

the ParseCSV class was able to create classification vectors 

for each image. Each vector would be used to train the 

network after it determined the output on the input image. 
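The conversion described above can be sketched as follows. This is a minimal illustration rather than the script used in the study: the column layout (label first, then pixels), the one-hot form of the classification vector, and all function names are assumptions, and a tiny 2 × 2 grid stands in for a 28 × 28 MNIST image.

```python
def flatten_image(image):
    """Flatten a 2-D grid of pixel intensities into a 1-D list, row by row."""
    return [pixel for row in image for pixel in row]


def classification_vector(label, num_classes=10):
    """One-hot target vector: 1.0 at the index of the correct digit, 0.0 elsewhere."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


def to_csv_line(image, label):
    """Serialize one labelled image as a CSV line (assumed layout: label, then pixels)."""
    return ",".join(str(value) for value in [label] + flatten_image(image))


# Hypothetical 2 x 2 stand-in for a 28 x 28 MNIST image of the digit 3.
tiny_image = [[0, 255], [128, 64]]

print(to_csv_line(tiny_image, 3))
print(classification_vector(3))
```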

After generating the datasets, the NNAs had to be 

implemented as well. To keep implementation easy 

and understandable, each neural network is based on 

a superclass, NeuralNetwork.java, which allowed each 

subclass, corresponding to the neural networks in this 

study, to be initialized by defining how they are run, 

trained, and connected. All neural networks were run by 

forward propagation, which involves taking all the layers 

of a network, starting from the input layer, and calculating 

the output or activation $o_j^L$ of each neuron within that layer via equation 2 until it reached the final neuron, which, when its output was calculated, would represent the network’s integer answer to $\vec{b}$.
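Forward propagation can be sketched in a few lines. The example below is illustrative only: it assumes a sigmoid activation and fully connected layers, and it reads the integer answer as the index of the most active output neuron. None of these choices is confirmed by the text shown here, and the names `forward` and `predict` are hypothetical rather than taken from NeuralNetwork.java.

```python
import math


def sigmoid(z):
    """Assumed activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(layers, inputs):
    """Propagate inputs through the layers in order.

    Each layer is a (weights, biases) pair, where weights[j][k] links
    neuron k of the previous layer to neuron j of the current layer.
    """
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + bias)
            for row, bias in zip(weights, biases)
        ]
    return activations


def predict(layers, inputs):
    """Integer answer: index of the output neuron with the largest activation."""
    outputs = forward(layers, inputs)
    return max(range(len(outputs)), key=outputs.__getitem__)


# Hypothetical two-input, two-output network used only to exercise the code.
toy_layers = [([[5.0, -5.0], [-5.0, 5.0]], [0.0, 0.0])]
print(predict(toy_layers, [1.0, 0.0]), predict(toy_layers, [0.0, 1.0]))
```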

While a run operates, the neural network writes 

data to its respective csv file. Data are written either 

every activation cycle, or every optimization cycle. 

Every activation cycle, each neuron’s activation $o_j^L$ and its net input $\bar{i}_j^L$ are written to a file by the name of “NetworkNameNeuronData.csv” where NetworkName

would be replaced by the acronyms given to each network 

in this study. Furthermore, since the weights and biases 

are updated every optimization cycle, those are written in 

a smaller file by the name of “NetworkNameNetworkData.csv”. This file also contains the network error data that will be crucial to the analysis of these NNAs. Every 100 iterations (one optimization cycle), the network is trained through backpropagation. This is done in two steps: first, recursively going backwards along the various links and neurons and creating a δ value for each neuron according to equation 4; then, again recursively back propagating along the links and neurons to update the biases $\beta_j^L$ and weights $w_{j,k}^{L_1,L_2}$ according to the $\delta_j^L$ and $o_j^L$ of the neuron (see equations 5 and 6).

This summarizes, in brief, the constants and variables 

to which each neural network will be subject. 

3. Results 

Each network has a vast amount of data to analyze and 

visualize. Therefore, to make this section more organized, 

it will be split into two subsections: error & accuracy, 

which will discuss the evaluation, speed and efficiency of 

the networks, and visualization, which will discuss what 

is being visualized and why it was chosen to represent the 

neural network. 


3.1 – Error and Accuracy 

The objective of this study was to determine the highest 

performing network based on two parameters, accuracy a 

and error e. By those two metrics, a heuristic score can be 

assigned to each of the networks, where the higher score is 

the better score, based on the function H as shown below. 

As accuracy increases, the score increases, and as error 

decreases the score increases. In other words H(a, e) ∝ a 

and H(a, e) ∝ 1/e. 

Before delving into the meaning of each one of those 

numbers, definitions of error and accuracy are required. 

Error is the average difference between the perfect output, 

and the actual output of the network. It can range from 

−∞ to ∞. Accuracy is the percentage of the images the 

network classified correctly while training. It can only 

range from 0 to 100. Finally, the prediction percentage of 

the network is how sure the network is of its answer. 

Table 1. Error and Accuracy Averages for the Two 

Trials. 

TRIAL            DRN        ELM        MLP        ENCLM
1-Error          0.030128   0.005262   0.034142   0.006175
2-Error          0.038433   0.016085   0.039846   0.03341
1-Accuracy       0.1374     0.118417   0.113433   0.125283
2-Accuracy       0.153573   0.095943   0.124493   0.217846
Avg. Accuracy    0.145487   0.10718    0.118963   0.171565
Avg. Error       0.034281   0.010674   0.036994   0.019793
Scores           424.395    1004.12    321.574    866.796
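The scores in Table 1 are consistent with a heuristic of the form H(a, e) = 100·a/e applied to the average accuracy and average error. That form is inferred here from the tabulated values rather than quoted from the text, so the sketch below should be read as a consistency check under that assumption.

```python
def heuristic_score(accuracy, error):
    """Assumed form of H: proportional to accuracy, inversely proportional to error."""
    return 100.0 * accuracy / error


# (average accuracy, average error) pairs copied from Table 1.
averages = {
    "DRN": (0.145487, 0.034281),
    "ELM": (0.10718, 0.010674),
    "MLP": (0.118963, 0.036994),
    "ENCLM": (0.171565, 0.019793),
}

scores = {name: heuristic_score(a, e) for name, (a, e) in averages.items()}

# Networks ordered from highest score to lowest.
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

Because the score divides by the error, a network with very low error can outrank a more accurate one, which is why the ELM leads the ranking despite having the lowest accuracy.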

For each of these networks, interesting observations 

can be gathered. Of course, if more trials were performed, 

the data would reflect the actual performance of the 

network more accurately. According to Wang et al.’s paper, the DRN was particularly adept at accurately classifying various things, especially images (Wang et al., 2017).

Seeing the scores, this seems to be true, as the DRN has the 

second highest accuracy of all of the network architectures 

tested in this study. This means that, out of every 100 

predictions, about 15 predictions were correct, even if 

they were predicted by a low margin. That low margin is 

indicated by the relatively high error value that the DRN 

has. In other words, the DRN can guess correctly, but it is 

quite unsure about its guesses. For example, if an image of 

a three was input into the DRN, it may guess it correctly, 

but its prediction percentage for an eight would also be 

relatively high. 

Next, the ELM architecture was the highest scoring 

of all of the networks. The random connections of the 

architecture seemed to have helped it achieve the low 

error. However, its high score is deceiving. The accuracy 

of the ELM is the lowest of all of the network architectures 

tested in this study, a mere 10.718% across 60,000 images. 

Its very low error, coupled with its low accuracy, can only be attributed to one occurrence: the network was trained

in a manner that associated several features with the wrong 

values. For example, if a three was input into the network, 

it would guess that eight was the answer because of the 

similar features, with a very high prediction percentage. It 

was sure of its answer, even if that answer was incorrect. 

Third, the MLP architecture was the lowest scoring 

of all of the networks. The orderly connections of the 

architecture as seen in Fig. 2 seem to have inhibited the 

network from learning the nuances present within the 

structures of each of the handwritten digits. The MLP had the second lowest accuracy as well as the highest error.

For example, if a three was put into the network, it would 

be unsure of what the output should be, and would guess 

based on whatever it found familiar, leading to a guess of 

eight, nine, and sometimes, three. 

Fourth and finally, the new hybrid network 

architecture, EncLM, was the second highest scorer 

of the four tested networks across the two trials. The 

orderly connections that form its first two layers, instead 

of inhibiting the network’s performance, seem to have 

enhanced it. The worst performer combined with the best 

performer created a hybrid network with new capabilities. 

It had the highest accuracy, with the second lowest error. 

Comparing this with the other architectures, if a three was 

put into the EncLM architecture, the architecture would 

guess three and be fairly sure of its guess, meaning that it 

would have a high prediction percentage. 

From these numbers, this is a summary of the 

observations that can be made. The performance of each 

of the networks can further be broken down when looking 

at their performance over time. 

3.2 – Visualization 

For each one of the networks, the accuracy over time 

a(t) and the error over time e(t) were plotted (Fig. 6, 7, 8, 

9). The trends visible in each of the graphs indicate the 

aspects of the networks discussed above, but they show the 

networks’ training speed as well. 

A more accurate network will have an a ′(t) slope larger 

than most networks. In this case, by observation, the DRN 

has the fastest increasing accuracy. Perhaps, after several 

iterations over the same data, the DRN will have the 


highest accuracy out of all of the networks. One thing that 

the a(t) graph indicates about the network is its potential 

to learn within the limited vision it was given. Obtaining 

a higher average accuracy is often indicative of the faster 

training of the network. It is true in all cases that the 

accuracy value goes up as the number of iterations goes 

on. Where the real differences in the networks can be seen 

is in the e(t) graphs. 

The faster training network will have an e′(t) slope greater in magnitude than those of all the other networks.

That network will reach the error threshold, where the 

error begins to flatten, much more quickly than the other 

networks. During backpropagation, such a network is 

more likely to take the correct step down the gradient 

of its error function $-\nabla \mathrm{MSE}(\vec{t}, \vec{\rho})$. Furthermore, another

property of the e(t) graphs is the stochastic nature of the 

values, or how far they deviate from a proper curve. It is 

here where the differences are quite noticeable between 

networks. 

As the networks trained, the data corresponding 

to each error and accuracy were collected every 100 

iterations. Using Wolfram Mathematica, those data were 

then visualized. 

Figure 6. Trial one accuracy data for each network.

Figure 7. Trial two accuracy data for each network.

Figure 8. Trial one error data for each network.

Figure 9. Trial two error data for each network.

4. Discussion


Ultimately, the goal of the study is to prove that the 

new hybrid network architecture is viable for use in 

various situations. Furthermore, this study can open up a 

new area of neural network research, where properties of 

two different architectures, whether they be mathematical 

or structural, can be hybridized to obtain a hybrid network 

that reflects the desired properties of both networks. In 

the case of the EncLM, the high accuracy of the DRN 

architecture and the fast training of the ELM architecture 

were the desirable properties. Over both trials, the

EncLM expressed both of these properties, becoming a 

fast training and highly accurate network. 

As with any hybridization, unexpected results come up. 

Those results manifested themselves in the form of the 

error graphs e(t) of each network. 

Each error graph looks similar, but varies in one aspect, 

the deviation of the error in between each iteration. The 

MLP has the least deviation, and the EncLM has the most 

deviation. This deviation is akin to exploration. The more 

the error deviates, the more the network is exploring its 

individual error function to locate its minima. Half of this 

is luck. The network might reach optimized values that 

could drop the error to a very low value. The other half is 


exploration, or how much the neural network is willing 

to deviate from certain relative minima to find the next 

lowest possible error. This feature is reflected in the lowest error values observed within each network. The stochastic connections within the EncLM and ELM gave the two networks error values that were nearly a sixth of those of the more orderly DRN and MLP architectures. A random

set of connections, it seems, enables a network to see the 

input data as a whole, rather than seeing it in layers. This 

allows the network to traverse its respective error function 

rapidly. However, the one drawback is that the network 

cannot determine with certainty whether the output it 

creates is the correct one. For the orderly networks, this 

was their strength, especially in the DRN. Even with its 

high error, it was able to accurately classify each digit. 

The two properties of the DRN and ELM, when 

combined, seem to have amplified each of their individual 

effects. The exploratory nature of the EncLM is enhanced 

by the DRN’s orderly connections, and the accuracy of the 

network overall is higher than all other networks in the 

study. 

5. Conclusion 

The EncLM is, ultimately, a hybrid network architecture, 

employing tools from both orderly connected networks as 

well as stochastic connected networks. The end result is 

very satisfactory. The error it achieves is comparable to 

the error the ELM achieved, and the accuracy is higher 

than all other networks. 

Overall, the novel architecture proved to be an 

intriguing development in neural network architectures, 

as it furthered the idea of speedy and efficient convergence 

to a global minimum. It is safe to say that the novel 

architecture is viable in all aspects when compared to the 

architectures tested in this study. To further this study, 

more research will be needed to determine whether the 

properties of the EncLM can be further generalized to 

more complex and larger datasets, maybe involving larger 

and more intricate images than handwritten digits. If this 

is possible, research also needs to be done to determine 

the convergence rates of the other architectures in this 

study on that same dataset to determine whether a more 

organized structure like that of the MLP, or DRN will 

be able to notice the complicated patterns present in the 

new dataset, or whether a similar pattern of stochastic 

dominance in this study will extrapolate onto that dataset. 

Furthermore, the possibilities of architecture mixing 

could potentially have uses in business, industry and other 

fields that require the management of more than one task 

at the same time. Another study could be carried out to 

determine the effects of mixing more architectures on a 

similar dataset. This could be used to determine the hybrid 

architecture that provides the best possible results for any 

problem. 


6. Acknowledgments

The author would like to thank Mr. Robert Gotwals for his sincere and expert management and his fascinating

insights into various tools and means that were used in 

this paper, including Excel, Mathematica and LaTeX. The 

author would also like to thank his mentor Mr. Keethan 

Kleiner for interesting insights and guidance throughout 

this project. Appreciation is also extended towards 

the North Carolina School of Science and Mathematics for its

investment into every one of its students. 

7. References 

Wang, S., Sun, S., Li, Z., Zhang, R., & Xu, J. (2017). 

Accurate de novo prediction of protein contact map by 

ultra-deep learning model. PLoS Computational Biology, 

13(1). doi:http://dx.doi.org/10.1371/journal.pcbi.1005324

Ding, S., Zhao, H., Zhang, Y., Xu, X., & Nie, R. (2015). 

Extreme learning machine: Algorithm, theory and 

applications. The Artificial Intelligence Review, 44(1), 

103-115. doi:http://dx.doi.org/10.1007/s10462-013-9405-z


EFFECTS OF RELATIVITY ON QUADRUPOLE 

OSCILLATIONS OF COMPACT STARS 

Abhijit Gupta 

Abstract 

In the present age of space-based photometry, telescopes such as K2 and TESS are providing pulsation frequencies of 

stellar objects to unprecedented accuracy, requiring equally precise theoretical models correlating these observations to 

mass- and composition-dependent characteristics of stars. At this precision, relativistic models are required for compact 

objects such as white dwarfs and neutron stars. We model these stars as polytropes using the Tolman-Oppenheimer- 

Volkoff equation, and compute relativistic nonradial stellar pulsations around this equilibrium state. Outside the stellar 

surface, we integrate the Zerilli equation to locate resonant quasinormal modes, where ingoing gravitational radiation 

vanishes. We compare the frequencies of a subset of these modes to their corresponding pressure-modes in the Newtonian 

limit, as a function of the strength of relativity inside the star. Our results contribute to our understanding of the impact 

of general relativity on stellar oscillations, and can be used to determine the conditions under which the Newtonian 

approximation is justified. 

1. Motivation 

1.1 – Asteroseismology 

Although stars generally evolve on extremely long 

timescales, they are not static but pulsate periodically 

around an equilibrium. The frequencies of these 

oscillations inform us about internal characteristics of the 

star, such as mass, radius, pressure, and density. While

these variables cannot be directly measured, telescopes can 

detect the luminosity deviations that stellar pulsations cause. The frequencies of these luminosity oscillations are the frequencies of the stellar pulsations.

Asteroseismology, the study of these stellar pulsations, 

involves two components: theoretical calculations 

and experimental observations. Theoretical programs 

assume a particular equilibrium state, and then model 

perturbations on this system. Only pulsations that satisfy 

boundary conditions at both the interior and stellar surface 

can possibly occur. Each pulsation can be described by a 

frequency and spherical harmonic degree and mode. The 

experimental observations measure periodic luminosity 

oscillations of stars over long periods of time. A Fourier 

transform is performed, and after filtering, spikes in 

the frequency curve are used to determine potential 

eigenfrequencies (Fig. 1). 
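The observational step can be illustrated with a small discrete Fourier transform. The sketch below is not the K2 pipeline: it uses a synthetic, evenly sampled light curve with one injected frequency, omits the filtering step, and uses a direct O(N²) transform for clarity. The sampling interval, the amplitude, and the name `amplitude_spectrum` are all assumptions made for illustration.

```python
import cmath
import math


def amplitude_spectrum(samples, dt):
    """Amplitude spectrum of an evenly sampled signal via a direct DFT.

    Returns (frequencies, amplitudes) for the positive-frequency bins,
    after subtracting the mean brightness.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    freqs, amps = [], []
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        freqs.append(k / (n * dt))
        amps.append(2.0 * abs(coeff) / n)
    return freqs, amps


# Synthetic light curve: constant brightness plus one small sinusoidal pulsation.
DT = 0.05          # assumed sampling interval, in days
N_SAMPLES = 200    # 10 days of observations
TRUE_FREQ = 2.0    # injected pulsation frequency, in cycles per day
TRUE_AMP = 0.01    # injected fractional amplitude

light_curve = [
    1.0 + TRUE_AMP * math.sin(2.0 * math.pi * TRUE_FREQ * i * DT)
    for i in range(N_SAMPLES)
]

freqs, amps = amplitude_spectrum(light_curve, DT)
peak_index = max(range(len(amps)), key=amps.__getitem__)
peak_frequency = freqs[peak_index]
peak_amplitude = amps[peak_index]
print(peak_frequency, peak_amplitude)
```

The tallest spike in the spectrum recovers the injected frequency; with real data, each such spike is only a candidate eigenfrequency until noise and aliasing have been ruled out.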

Figure 1. Experimental results from the K2 mission. 

The top panels show the standard and phase-folded

light curves. The bottom panel shows the 

amplitude and residual spectrum after the pulsation 

frequencies are removed. Red vertical lines indicate 

observed pulsation frequencies (Bowman, D. M. et 

al., 2018) 

Given these experimentally determined frequencies, 

programs can be run to determine the predicted central 

density, central pressure, total mass, radius, and many 

additional stellar variables. Asteroseismology presents an additional method for calculating these variables, alongside

existing procedures. The combination yields stronger 

approximations than any method individually. 

1.2 – Compact Objects 

In recent years, space-based telescopes such as NASA’s 

Kepler spacecraft and Transiting Exoplanet Survey 

Satellite (TESS) are providing pulsation frequencies of 

stellar objects with unprecedented accuracy. Equally precise theoretical models correlating these observations to mass- and composition-dependent characteristics of stars are required to make full use of these satellites. Present theoretical models have reduced error to less than 1 part in $10^7$, roughly equivalent to the observational accuracy of these telescopes (Christensen-Dalsgaard & Mullan, 1993).

However, when studying highly dense compact objects, 

general relativity can have a noticeable impact on the 

stellar structure and pulsations, requiring more rigorous 

models. 

While most stars are not substantially affected by general 

relativity, a class of compact objects require general 

relativistic corrections to accurately model the pulsations 

to the desired accuracy due to their extreme densities. 

Among compact objects, there are two main classes of 

stars: white dwarfs and neutron stars. White dwarfs are 

the remnants of low-mass to medium-mass stars that have 

exhausted their hydrogen and helium supplies. These 

stars are composed of heavier elements such as carbon 

and oxygen, and support themselves against gravitational 

collapse with electron degeneracy pressure. The density of a white dwarf is some $10^6$ times greater than that of our Sun.

Even more extreme are neutron stars, formed by the 

supernova explosions of stars not quite large enough to 

produce black holes. Neutron stars are similar to white dwarfs except that they are composed almost entirely of neutrons

and supported with neutron degeneracy pressure instead 

of electron degeneracy. Physicists are still unsure exactly 

what types of matter are present at the very center of a 

neutron star, where density is the highest. Neutron stars 

are believed to be the densest macroscopic objects in the 

Universe, with densities about $10^{15}$ times higher than that

of the Sun. 

Relativistic corrections have small but noticeable impacts 

on white dwarfs, but are essential to study the pulsation 

frequencies of neutron stars. By better understanding the 

pulsations of neutron stars, we gain a better understanding 

of their interiors. Recent research even suggests that the 

matter in a neutron star may be the strongest material 

in the Universe, 10 billion times stronger than steel 

(Caplan, Schneider, & Horowitz, 2018). Relativistic 

asteroseismology can assist in evaluating the different 

models attempting to describe the neutron star interior 

by providing accurate experimental data on neutron star 

properties. 

In this paper, we analyze how general relativity impacts 

the stellar pulsations of compact objects. By understanding 

when the Newtonian approximation is justified for a 

given error tolerance, we can improve the computational 

efficiency of theoretical asteroseismology without 

decreasing accuracy. On the other hand, computational 

improvements to previously published algorithms make 

our results potentially more accurate than existing 

results. Additionally, this research has applications to 

understanding the yet unknown physics governing the 

dense neutron star cores. 

2. Stellar Equilibrium 

The radius-dependent characteristics of compact objects 

affect their stellar pulsations, so an accurate model of the 

equilibrium state is required before computing stellar 

pulsation eigenfrequencies and other characteristics. To 

simplify calculations, a polytropic model is used in both 

the Newtonian and relativistic calculations (Knapp, 2011). 

A polytrope is a star where pressure (p) and density (ρ) are 

continuous with respect to radius, and are related by the 

equation of state: 

$$p = \kappa \rho^{1 + 1/n} \qquad (1)$$

where κ is the constant of proportionality, and n is the 

polytropic index. A polytropic index between 0.5 and 1 

generally models a neutron star well, while white dwarfs 

are modeled with a polytropic index of 3. 

2.1 – Newtonian Equilibrium 

In flat spacetime, the Lane-Emden equation describes 

the relationship between radius and density for polytropic 

stars, derived from the equation of hydrostatic equilibrium 

and the mass-continuity equation (Knapp, 2011) 

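$$\frac{1}{\xi^2} \frac{d}{d\xi}\left(\xi^2 \frac{d\theta}{d\xi}\right) = -\theta^n \qquad (2)$$

where θ is defined by $\rho = \rho_c \theta^n$, $\rho_c$ being the central density. ξ is the dimensionless radius defined by

$$r = \xi \sqrt{\frac{(n+1)\,\kappa\,\rho_c^{1/n - 1}}{4\pi G}} \qquad (3)$$

where G is the universal gravitation constant. The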

boundary conditions for this differential equation are θ(0) 

= 1 and θ′(0) = 0. For n = 0, n = 1, and n = 5, analytic 

solutions are available. For any other polytropic index, 

numerical integration to θ = 0 is required to analyze the 

equilibrium conditions of the star. Specifically, the Lane-Emden equation can be separated into two coupled first-order ODEs using:

(4) (5) 

Adaptive step-size fourth-order Runge-Kutta numerical 

integration is run on the system until the first step where θ 

< 0. Newton’s Method is then used to locate a more precise 

ξ where θ = 0. At this point, pressure and density become 

0, marking the outer edge of the star (Fig. 2). 
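The integration described above can be sketched as follows. This is a simplified illustration, not the code used in this paper: it uses a fixed step size instead of adaptive stepping, starts from a series expansion slightly away from the singular point ξ = 0, and reduces the Lane-Emden equation with the auxiliary variable dθ/dξ, which may differ from the substitution in Equations 4 and 5. The function names are hypothetical.

```python
import math


def lane_emden_rhs(xi, theta, dtheta, n):
    """Lane-Emden equation as two first-order ODEs in (theta, dtheta/dxi)."""
    # Guard: fractional powers of a negative theta are undefined.
    theta_n = theta ** n if theta > 0.0 else 0.0
    return dtheta, -theta_n - 2.0 * dtheta / xi


def rk4_step(xi, theta, dtheta, h, n):
    """Advance the state by one classical fourth-order Runge-Kutta step of size h."""
    k1t, k1d = lane_emden_rhs(xi, theta, dtheta, n)
    k2t, k2d = lane_emden_rhs(xi + h / 2, theta + h / 2 * k1t, dtheta + h / 2 * k1d, n)
    k3t, k3d = lane_emden_rhs(xi + h / 2, theta + h / 2 * k2t, dtheta + h / 2 * k2d, n)
    k4t, k4d = lane_emden_rhs(xi + h, theta + h * k3t, dtheta + h * k3d, n)
    theta_new = theta + h / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
    dtheta_new = dtheta + h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
    return xi + h, theta_new, dtheta_new


def first_zero(n, h=1e-3, tol=1e-12, xi_max=50.0):
    """Dimensionless stellar radius: the first xi at which theta = 0 (requires n < 5)."""
    if n >= 5:
        raise ValueError("theta never reaches zero for n >= 5")
    # Series expansion about the center avoids dividing by xi = 0.
    xi = 1e-3
    theta = 1.0 - xi ** 2 / 6.0 + n * xi ** 4 / 120.0
    dtheta = -xi / 3.0 + n * xi ** 3 / 30.0
    # March outward, stopping just before the step that makes theta negative.
    while True:
        if xi > xi_max:
            raise RuntimeError("no zero found below xi_max")
        trial = rk4_step(xi, theta, dtheta, h, n)
        if trial[1] < 0.0:
            break
        xi, theta, dtheta = trial
    # Newton's Method on theta(xi) = 0, using dtheta/dxi from the integrated state.
    for _ in range(50):
        if abs(theta) < tol:
            break
        xi, theta, dtheta = rk4_step(xi, theta, dtheta, -theta / dtheta, n)
    return xi


print(first_zero(1.0), first_zero(3.0))
```

For n = 1 the analytic solution θ = sin(ξ)/ξ has its first zero at ξ = π, which gives a convenient check on the numerical result.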

Figure 2. θ vs. ξ for varying n. n = 0 has θ decline the fastest, while n = 5 decreases asymptotically but never reaches θ = 0. Neutron stars have n ≈ 1, and white dwarfs have n ≈ 3.

2.2 – Relativistic Equilibrium

While the Newtonian model is accurate in predicting the oscillation frequencies of main sequence stars, general relativity is needed to accurately describe compact objects with immense densities. The quantity σ approximates how relativistic a star is:

(6)

The greater σ, the greater the impacts of general relativity on both the equilibrium and stellar oscillations. For a white dwarf star, σ ≈ 0.001, while for a neutron star, σ ≈ 0.1.

In this paper, we shall consider all stars in Schwarzschild spacetime, where spherical symmetry is assumed and there is no stellar rotation or magnetism involved (Hartle, 2003). The Schwarzschild metric tensor describes the spacetime:

(7)

where $e^{-\lambda(r)} = 1 - 2M(r)/r$. $e^{\nu}$ relates to the mass of the star, but cannot be analytically represented for a relativistic polytropic star. This metric tensor is given in geometric units (c = 1, G = 1) and in standard Schwarzschild coordinates (t, r, θ, Φ).

A relativistic equivalent of the Lane-Emden equation, the Tolman-Oppenheimer-Volkoff (TOV) Equation, takes into account curved spacetime in describing polytropic stars. It calculates p, ρ, and ν as a function of radius. The TOV Equation can be written as three coupled first-order ODEs (Tooper, 1964).

(8) (9) (10)

With boundary conditions M(0) = 0, $e^{\nu(R)} = 1 - 2M/R$, and $p(0) = p_0$, we solve this system very similarly to the Lane-Emden equation. The numerical results of the equilibrium analysis, radius-dependent p, ρ, ν, λ, and M, are used when analyzing the stellar pulsations.

2.3 – Comparison

We compare the results of the Lane-Emden Equation and TOV Equation for a neutron star with typical characteristics. While the shapes of the curves are similar, there is a noticeable difference in radius and mass (the integral of density over the stellar volume) in Newtonian and relativistic spacetime (Fig. 3).

Figure 3. Comparison of solutions to Lane-Emden Equation and TOV Equation for a neutron star with polytropic index n = 1. The TOV Equation predicts a smaller radius and mass.

The TOV Equation is used in all relativistic stellar 

pulsation calculations as the equilibrium model. Relativistic 

effects can be attributed both to differences in the 

equations governing stellar equilibrium, and differences in 

the equations governing stellar pulsations. 
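A minimal sketch of the relativistic equilibrium calculation is given below. It is an illustration under stated assumptions rather than the code used in this paper: it works in geometric units (c = G = 1), takes ρ in the polytropic relation to be the total energy density, integrates only the mass and pressure equations in their standard textbook form, uses a fixed-step fourth-order Runge-Kutta scheme, and locates the surface by linear interpolation. The function names and the choice κ = 1 are hypothetical.

```python
import math


def density_from_pressure(p, kappa, n):
    """Invert the polytropic relation p = kappa * rho**(1 + 1/n); zero outside the star."""
    return (p / kappa) ** (n / (n + 1.0)) if p > 0.0 else 0.0


def tov_rhs(r, m, p, kappa, n):
    """Standard TOV mass and pressure equations in geometric units."""
    rho = density_from_pressure(p, kappa, n)
    dm_dr = 4.0 * math.pi * r * r * rho
    dp_dr = -(rho + p) * (m + 4.0 * math.pi * r ** 3 * p) / (r * (r - 2.0 * m))
    return dm_dr, dp_dr


def solve_tov(rho_c, kappa, n, h, max_steps=1_000_000):
    """Integrate outward from the center until the pressure vanishes.

    Returns (radius, mass) of the star.
    """
    r = 1e-3 * h                                   # start just off the singular center
    m = 4.0 / 3.0 * math.pi * r ** 3 * rho_c       # mass of a tiny uniform sphere
    p = kappa * rho_c ** (1.0 + 1.0 / n)           # central pressure
    for _ in range(max_steps):
        k1m, k1p = tov_rhs(r, m, p, kappa, n)
        k2m, k2p = tov_rhs(r + h / 2, m + h / 2 * k1m, p + h / 2 * k1p, kappa, n)
        k3m, k3p = tov_rhs(r + h / 2, m + h / 2 * k2m, p + h / 2 * k2p, kappa, n)
        k4m, k4p = tov_rhs(r + h, m + h * k3m, p + h * k3p, kappa, n)
        m_new = m + h / 6 * (k1m + 2 * k2m + 2 * k3m + k4m)
        p_new = p + h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        if p_new <= 0.0:
            # Linear interpolation for the radius at which p crosses zero.
            return r + h * p / (p - p_new), m_new
        r, m, p = r + h, m_new, p_new
    raise RuntimeError("stellar surface not reached")


KAPPA, N_INDEX = 1.0, 1.0
# Newtonian radius of an n = 1 polytrope with G = 1: R = pi * sqrt(kappa / (2 pi)).
newtonian_radius = math.sqrt(math.pi * KAPPA / 2.0)
step = newtonian_radius / 4000.0

# sigma = p_c / rho_c equals kappa * rho_c for n = 1.
weak_radius, _ = solve_tov(1e-6, KAPPA, N_INDEX, step)      # sigma = 1e-6
strong_radius, _ = solve_tov(0.1, KAPPA, N_INDEX, step)     # sigma = 0.1
print(newtonian_radius, weak_radius, strong_radius)
```

In the weakly relativistic case the radius matches the Newtonian value, while at σ = 0.1 the relativistic star is noticeably smaller, in line with the trend shown in Figure 3.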

3. Stellar Pulsations 

To analyze stellar pulsations, a perturbation is applied 

and propagated through the polytropic equilibrium state. 

Only under certain eigenfrequencies will the solution be 

continuous throughout the star. These oscillations can be either radial or nonradial, and each has a spherical harmonic degree l and mode m.

Furthermore, the oscillations can be grouped into 

families of modes, depending on their restoring forces. The 

two most important classifications are Pressure Modes 

(p-modes) and Gravity Modes (g-modes). P-modes are 

high frequency modes whose deviations from equilibrium 

are counteracted by pressure changes in the convective 

zone. G-modes are low frequency modes, counteracted by 

mass movement in the radiative zone. In this research, we 

focus on p-modes, although our methods apply to g-modes 

as well. 



For a specific spherical harmonic degree, spherical 

harmonic mode, and mode classification, there are 

multiple energy eigenmodes with ascending mode number 

k. The three variables, l, m, and k, along with the mode 

classification, fully describe a particular stellar pulsation. 

Multiple pulsations can occur simultaneously in a star, 

with resonant modes resulting from superposition (Fig. 4). 

Figure 4. P-mode propagation for two harmonics. The number of reflections is the degree. The resonant modes result from a superposition of component waves travelling in opposite directions (Tosaka, n.d.).

These stellar pulsations have separable time, angle, and radius dependence, given by:

(11)

(12)

where f(t, r, θ, Φ) is a perturbation function, ω is the frequency, $P_l^m(\cos\theta)$ is the associated Legendre polynomial, and N is a normalizing factor. By calculating $f_l(r)$, the radius-dependent perturbation for a specific eigenfrequency, the overall nature of the oscillations can be understood. While all degrees from 0 to ∞ could occur, in reality only the first few have substantial amplitude. l = 2 is the first degree at which gravitational radiation occurs in the relativistic model, making it the most suitable case study.

4. Newtonian Quadrupole Oscillations

4.1 – Pulsations Inside the Star

In Newtonian spacetime, a set of 4 homogeneous first-order differential equations describes the perturbations of radial displacement, pressure, gravitational potential, and gravitational acceleration. Physically, these relations are derived by maintaining continuous variables and appropriate boundary conditions.

The system of differential equations originally is dimensioned, but can be made dimensionless, with perturbation variables $y_1$ through $y_4$ representing fractional changes in radius, pressure, gravitational potential, and gravitational acceleration. The solutions to the system are independent of all stellar equilibrium factors, except the polytropic index, allowing this dimensionless analysis. The differential equations for these variables can be written as one matrix equation (Unno, 1989).

(13)

The 1/x term in front of the matrix causes potential singularities in the integration, and also demands finer resolution near x = 0, where the system changes faster. To improve computational accuracy, we apply a change of variables from x to ln(x). Because d(ln x) = dx/x, this substitution absorbs the 1/x prefactor, yielding this simpler form:

(14)

$A^*$, $U$, $V_g$, and $c_1$ are dimensionless stellar equilibrium quantities as defined in Equations 15-18 below (Unno, 1989). Although they contain ρ and p, all can be simplified to dimensionless form using ξ and θ. $A^*$ is the Eulerian pressure perturbation, $c_1$ is an inverse scaled average density, and $U$ and $V_g$ are common stellar variables.

(15) 

(16) 

(17) 

(18) 

x is the dimensionless radius, ranging from 0 to 1. ω refers 

to the frequency of the oscillation being tested, and is made 

dimensionless by multiplying the dimensioned frequency 

by . $\rho_c$ and $p_c$ are the central density and pressure, respectively.

The system of differential equations has central and 

surface boundary conditions, defined below. These 

conditions ensure the solution is physically acceptable at 

both boundaries (Unno, 1989). 

(19) 

(20) 

The differential equations are singular at both 

boundaries due to division by zero-valued variables. At 

the center of the star (x = 0), ln(x) is not defined, and 

at the outer surface of the star (x = 1), pressure is zero and $V_g$ and $A^*$ approach ∞. To handle this issue, we use

the Magnus Multiple Shooting Scheme (Townsend & 


Teitler, 2013). Two arbitrary solutions satisfying the boundary conditions are created at each boundary, and integrated to x = 0.5. The four resulting partial solutions are inserted into a square matrix M, and its determinant is computed. Eigenfrequencies are found when det(M) = 0. Adaptive

step-size fourth-order Runge-Kutta integration is used to 

integrate the system, and Newton’s Method is used during 

root-finding to locate where det(M) = 0 with quadratic 

convergence. 

4.2 – Algorithmic Roadmap 

In this section, we explain the specific steps taken to 

accurately compute the resonant modes of Newtonian 

polytropic stars. The code used to implement this 

algorithm was written in Python 3. 

1. The central pressure and density of the star are 

provided. The polytropic index n is given as well. 

From these, fourth-order Runge-Kutta integration is 

used on the Lane-Emden Equation (Eq. 2). 

2. For a test frequency and spherical harmonic degree, 

the perturbation variables are calculated at both 

boundaries using boundary conditions (Eq. 19-20). 

Two possible solutions on each end are integrated to 

r = 0.5R using fourth-order Runge-Kutta integration 

(Eq. 14). The equations are treated in matrix form for 

improved computational efficiency. 

3. Using the Magnus Multiple Shooting Scheme, the 

determinant of a 4x4 square matrix of partial solutions 

is calculated. Each row is a single integration 

from the previous step, and all 4 solutions are used. A 

determinant of 0 corresponds to an eigenfrequency. 

4. Steps 2 and 3 are repeated keeping the spherical 

harmonic degree constant and varying the test frequency. 

Newton’s Method is used to locate where 

det(M) = 0 with quadratic convergence. The derivative 

required for Newton’s Method is approximated by 

sampling 2 points slightly above and below the test 

frequency. Newton’s Method is run until a certain 

threshold accuracy is obtained. 

5. Steps 2 to 4 are repeated for each spherical harmonic

degree. In this paper, results for l = 2 are shown, 

although others can be calculated with this algorithm. 

l = 2 is of particular importance because it accounts 

for the majority of gravitational radiation in the 

relativistic system. 
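Step 1 of the roadmap can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the standard dimensionless form of the Lane-Emden equation, θ'' + (2/ξ)θ' + θ^n = 0 with θ(0) = 1 and θ'(0) = 0, uses a fixed step size, and starts from the usual series expansion to avoid the singular 2/ξ term at the center. The function names are hypothetical.

```python
import math

def lane_emden_rhs(xi, theta, dtheta, n):
    """Right-hand side of theta'' + (2/xi) theta' + theta^n = 0 as a first-order system."""
    # Clamp theta at zero so theta**n stays real for non-integer n near the surface.
    return dtheta, -2.0 / xi * dtheta - max(theta, 0.0) ** n

def first_zero(n, h=1e-3, xi_max=50.0):
    """Return the first zero of the Lane-Emden function, i.e. the stellar surface."""
    # Series start: theta ~ 1 - xi^2/6 + n xi^4/120 sidesteps the singularity at xi = 0.
    xi = h
    theta = 1.0 - xi**2 / 6.0 + n * xi**4 / 120.0
    dtheta = -xi / 3.0 + n * xi**3 / 30.0
    while xi < xi_max:
        k1 = lane_emden_rhs(xi, theta, dtheta, n)
        k2 = lane_emden_rhs(xi + h / 2, theta + h / 2 * k1[0], dtheta + h / 2 * k1[1], n)
        k3 = lane_emden_rhs(xi + h / 2, theta + h / 2 * k2[0], dtheta + h / 2 * k2[1], n)
        k4 = lane_emden_rhs(xi + h, theta + h * k3[0], dtheta + h * k3[1], n)
        new_theta = theta + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        new_dtheta = dtheta + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if new_theta <= 0.0:
            # Linear interpolation between the last two steps locates the zero crossing.
            return xi + h * theta / (theta - new_theta)
        xi, theta, dtheta = xi + h, new_theta, new_dtheta
    raise ValueError("no zero found below xi_max")

xi_surface = first_zero(3)   # n = 3 polytrope, as used for the tables in this paper
```

The analytic cases n = 0 (surface at √6) and n = 1 (surface at π) give a quick check on the integrator before moving to n = 3.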

5. Newtonian Model Results and Discussion 

We can visualize the normalized perturbations of a polytrope with index n = 3 as a function of dimensionless radius x. Although n = 3 best represents a white dwarf, a neutron star’s pulsations could be seen with n = 1. We use n = 3 for ease of comparison to prior calculations for main-sequence stars, which are also well approximated with n = 3. The spherical harmonic degree is l = 2, and this particular perturbation is the second-harmonic pressure-mode. We refer to the fundamental or lowest frequency mode in a family as the first-harmonic. l = 2 is chosen because it is the lowest spherical harmonic degree for which gravitational waves occur in the relativistic model.

Figure 5 shows the results of this calculation. The four graphs, left to right and top to bottom, are the radial perturbation, pressure perturbation, gravitational potential perturbation, and gravitational field perturbation (y_1 to y_4 in the above calculations). Radial displacement and pressure perturbations are largest near the center of the star, and all four perturbation variables approach zero near the surface of the star.

Figure 5. Dimensionless perturbations as a function of radius for the l = 2 2nd Harmonic Pressure-Mode of an n = 3 polytrope. x = 0 is the center of the star; x = 1 is the stellar surface.

While the perturbation dynamics are interesting, the eigenfrequency at which the pulsation occurs is generally more important, as it can be readily observed from Earth. The eigenfrequency for a Newtonian polytrope is solely a function of n among the equilibrium characteristics, and is also dependent on the spherical harmonic and particular mode.

Prior calculations by Christensen-Dalsgaard and Mullan have yielded the first few p-mode eigenfrequencies for l = 1, l = 2, and l = 3 to high precision (Christensen-Dalsgaard & Mullan, 1993). We compared the results of our method, described in Section 1.3, against these literature values. As a sample, Table 1 below shows a comparison of our calculations against theirs for the first 5 eigenfrequencies of a star with polytropic index n = 3 and spherical harmonic degree l = 2.


Table 1. Dimensionless Frequencies of low harmonic pressure-modes (l = 2) for n = 3 Polytrope

Harmonic        Literature   Calculated   Rel. Error
Fundamental     3.90687      3.90687      1.2491 × 10⁻⁷
2nd Harmonic    5.169468     5.169469     7.6588 × 10⁻⁸
3rd Harmonic    6.439991     6.439990     4.5185 × 10⁻⁸
4th Harmonic    7.708951     7.708951     1.8080 × 10⁻¹⁰
5th Harmonic    8.975891     8.975891     3.1879 × 10⁻⁸

With higher harmonics, the perturbation variables 

have a higher spatial frequency in the interior of the star, 

and have more zeroes and relative extrema. This makes 

numerically simulating these scenarios more complex, and 

less accurate than lower harmonics for equal number of 

integration steps. With increased integration steps, our 

model is sufficiently accurate even for higher harmonics. 

Table 2 uses the same polytropic equilibrium as Table 1, 

and the same spherical harmonic degree l = 2. 

Table 2. Dimensionless Frequencies of high harmonic pressure-modes (l = 2) for n = 3 Polytrope

Harmonic        Literature   Calculated   Rel. Error
31st Harmonic   41.5192      41.5221      6.8743 × 10⁻⁵
32nd Harmonic   42.7630      42.7664      7.8022 × 10⁻⁵
33rd Harmonic   44.0065      44.0104      8.7698 × 10⁻⁵
34th Harmonic   45.2497      45.2541      9.7599 × 10⁻⁵
35th Harmonic   46.4927      46.4977      1.0758 × 10⁻⁴

Although the error in the higher harmonics is roughly three orders of magnitude larger than the error in the lower harmonics, it is still only about 1 part in 10,000. Given the strong match for both low and high eigenfrequencies, this code can be used to calculate frequencies for higher harmonics than previously reported (Christensen-Dalsgaard & Mullan (1993) go up to the 50th). However, these higher harmonics require greater energy, and thus occur at smaller amplitudes in real compact objects. Their study is useful for understanding patterns in stellar pulsations, but not for experimental asteroseismology.

6. Relativistic Quadrupole Oscillations 

6.1 – Perturbation Metric 

Similar to the Newtonian case, we use a polytropic model 

of the equilibrium structure. A perturbation is applied, and 

as a result of the motion, the geometry of spacetime around 

the relativistic star is no longer described by Equation (7). 

Rather, the new metric, involving the perturbation metric h_μν, becomes

(21) 

In even-parity Regge-Wheeler gauge, the perturbation 

metric takes the form (Thorne & Campolattaro, 1967): 

(22) 

The variable μ is the dimensionless radius of the star (ranging from 0 to 1), and Y = e^(iωt) Y_lm is the time dependence multiplied by the spherical harmonic of the perturbation. H_0, H_1, and K are functions of r only. The Regge-Wheeler gauge is preferred because it introduces only two terms outside the main diagonal. Substituting into Equation (21), we obtain:

(23) 

6.2 – Perturbations Inside the Compact Object 

Inside the star, the perturbed fluid is described by a 

displacement ξ^α, where:

(24) (25) 

(26) 

The three fluid perturbations have separable time- and radius-dependence, allowing calculations done at a specified time to represent the system with the necessary transformations. The variables W and V are fluid perturbation variables that must be solved for to describe the nonradial stellar pulsations. Five variables depend on radius: H_0, H_1, K, W, and V. The first three relate to the initial spacetime perturbation, and W and V describe fluid perturbations (Lindblom & Detweiler, 1983).

Einstein’s Field Equations can be applied to the spacetime metric given in Equation (23) to give differential equations for each perturbation variable. Using these relations, we can eliminate one variable, creating a system of four differential equations. Following Detweiler and Lindblom, H_0 is eliminated instead of H_1, to avoid possible singularities (Lindblom & Detweiler, 1985). To simplify the resultant equations, X is defined as a function of W, V, and H_0:


(27) 

The four first-order differential equations for H_1, K, W, and X are (Lindblom & Detweiler, 1985):

(28)

(29)

(30)

(31)

Equations (28) to (31) can be expressed in matrix form similar to Equation (14), and are handled computationally in this manner. A major difference between the Newtonian and relativistic calculations is that the relativistic calculations are dimensioned while the Newtonian calculations are dimensionless. Only 2 of the 4 linearly independent solutions to this system are well-behaved at the center of the star (at r = 0). The perturbed pressure must vanish at r = R, so X(R) = 0. From these conditions, a single acceptable solution is specified for each frequency ω.

At the central boundary, r = 0, the differential equations are singular, as they contain multiple 1/r terms that diverge. Since the numerical integration cannot be started at r = 0, a power-series approximation is used to determine an appropriate starting condition slightly away from the center, following the procedure described in (Lindblom & Detweiler, 1983) and (Lindblom & Detweiler, 1985).

The power series approximations are used out to r = 0.01R. Then, the differential equations are integrated using fourth-order Runge-Kutta integration to r = 0.5R. There are two linearly independent solutions, labelled Y_1 and Y_2. Similarly, the three solutions from the exterior of the star are integrated to the midpoint of the interval, giving Y_3, Y_4, and Y_5. A linear combination of these five solutions exists that makes each variable H_1, K, W, and X continuous at the midpoint. With five solutions for four variables, there is an extra degree of freedom, which allows for free scaling of the solution.

6.3 – Perturbations Outside the Compact Object

Given any spherical harmonic degree and frequency, we can find the unique solution for the radially dependent variables H_1, K, W, and X that define the perturbations in the interior of the star. We are mainly interested in solutions composed only of outgoing waves, as these represent resonant oscillation and the energy radiated from the star. These frequencies are called Quasi-Normal Modes (QNMs), and include the relativistic equivalent of Newtonian p-modes.

To find these specific eigenfrequencies, we analyze the perturbation variables outside the compact object to determine the gravitational radiation produced. In the exterior of the star, the fluid perturbations W, V, and X are zero and the 2 metric perturbations H_1 and K can be combined to obtain the single second-order differential equation known as the Zerilli equation (Andersson et al., 1995).

(32)

with the effective potential V_Z given by:

(33)

The tortoise coordinate r* is defined by:

(34)
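For a Schwarzschild exterior of mass M in geometric units (G = c = 1), the tortoise coordinate has the standard closed form r* = r + 2M ln(r/2M − 1). The sketch below is illustrative only and is not taken from the authors' code: it evaluates r*(r) and inverts it by bisection, which is one way to recover the areal radius r (needed to evaluate an effective potential written in terms of r) while stepping in r*. The function names are hypothetical.

```python
import math

def tortoise(r, M):
    """Schwarzschild tortoise coordinate r* = r + 2M ln(r/2M - 1), valid for r > 2M."""
    return r + 2.0 * M * math.log(r / (2.0 * M) - 1.0)

def radius_from_tortoise(r_star, M, iterations=200):
    """Invert r*(r) by bisection; r*(r) increases monotonically for r > 2M."""
    # Upper bracket: r* >= r whenever r >= 4M, so max(4M, r*) lies at or above the root.
    hi = max(4.0 * M, r_star)
    # Lower bracket: move toward the horizon until r*(lo) falls to or below the target.
    eps = 1.0
    while tortoise(2.0 * M * (1.0 + eps), M) > r_star:
        eps /= 2.0
        if eps < 1e-15:
            raise ValueError("target r* is too close to the horizon for this sketch")
    lo = 2.0 * M * (1.0 + eps)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if tortoise(mid, M) < r_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r_star_example = tortoise(10.0, 1.0)                      # r = 10M with M = 1
r_recovered = radius_from_tortoise(r_star_example, 1.0)   # should return about 10
```

Bisection is used here instead of Newton's Method only because it cannot step inside r = 2M, where the logarithm is undefined.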

The Zerilli equation is notable because it provides 

a Schrödinger-type equation for even-parity Regge- 

Wheeler perturbations of Schwarzschild geometry. This 

presents simplifications to the analysis of wave equations 

(Fackerell, 1971). The Zerilli function is defined in terms of the perturbations H_0(r) and K(r):

(35) 

where the functions a(r), b(r), g(r), h(r), and k(r) are 

functions of the frequency, spherical harmonic degree, and 

mass and radius of the compact object, given in (Lindblom 

& Detweiler, 1983). We recover H_0 with the following

equation, similar to the relation defined between V and X 

in Equation (27). 

(36) 

Using Equation (35), we obtain initial conditions for Z(r*) and dZ(r*)/dr*. For a given (r*, Z) coordinate, Equation (32) can be used to calculate d²Z/dr*², and in this manner we propagate Z through r*. In practice, we integrate Z from r* = R* to r* = 25ω⁻¹ (Lindblom & Detweiler, 1983). Far away from the star, the Zerilli function can be expressed as a combination of 2 components, namely the ingoing and outgoing contributions. These individual solutions may be asymptotically expressed as power series.

(37) (38)

The solution Z_- represents purely outgoing gravitational radiation, while Z_+ represents purely ingoing waves. The constants a_j and the complex conjugates ā_j are recursively defined in (Chandrasekhar & Detweiler, 1975). A solution to the Zerilli equation will be given by a constant linear combination of Z_+ and Z_-.

(39) 

For Quasi-Normal Modes, all the gravitational radiation is outgoing, so the particular solution Z should be a multiple of Z_-, with no Z_+ component. At r = 25ω⁻¹, the numerically integrated Zerilli function is matched onto the asymptotic series, with j_max = 2. We determine values of constants β(ω) and γ(ω), and use Newton’s method to search for ω such that γ(ω) = 0. The eigenfrequencies found are those of quasinormal modes, a subset of which correspond to the Newtonian Pressure-Modes.

(40) 

We compare the difference in frequencies of these 

corresponding modes (Equation 40) against the relativity

parameter σ defined in Equation (6) to understand the 

effects of general relativity on pulsation frequencies of 

compact objects, the ultimate goal of this research. 
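The matching step described in this section, writing Z as a combination of outgoing and ingoing pieces and searching for the frequency at which the ingoing coefficient vanishes, can be sketched as below. Two things are assumptions rather than the authors' method. The sketch takes Z = β Z_- + γ Z_+, a form inferred from the text rather than copied from Equation (39). It also keeps only the leading asymptotic terms, Z_- ≈ e^(-iωr*) and Z_+ ≈ e^(+iωr*), consistent with the e^(iωt) time dependence used earlier, whereas the paper matches onto the series through j_max = 2. The function name is hypothetical.

```python
import cmath

def wave_amplitudes(Z, dZ, r_star, omega):
    """Split (Z, dZ/dr*) at r_star into outgoing (beta) and ingoing (gamma) amplitudes.

    Uses Z  = beta*exp(-i w r*) + gamma*exp(+i w r*)
         dZ = -i w beta*exp(-i w r*) + i w gamma*exp(+i w r*)
    """
    out_wave = cmath.exp(-1j * omega * r_star)   # leading term of Z_- (outgoing)
    in_wave = cmath.exp(1j * omega * r_star)     # leading term of Z_+ (ingoing)
    beta = (Z - dZ / (1j * omega)) / (2.0 * out_wave)
    gamma = (Z + dZ / (1j * omega)) / (2.0 * in_wave)
    return beta, gamma

# Example: a synthetic wave with known amplitudes, matched at r* = 25/omega.
omega = 0.7
r_match = 25.0 / omega
beta_true, gamma_true = 3.0 - 1.0j, 0.5 + 0.2j
Z = beta_true * cmath.exp(-1j * omega * r_match) + gamma_true * cmath.exp(1j * omega * r_match)
dZ = 1j * omega * (-beta_true * cmath.exp(-1j * omega * r_match)
                   + gamma_true * cmath.exp(1j * omega * r_match))
beta, gamma = wave_amplitudes(Z, dZ, r_match, omega)
```

In a full calculation, γ(ω) computed this way from the numerically integrated Z would take the place of det(M) as the function handed to the root finder.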

6.4 – Algorithmic Roadmap 

A similar approach is taken here compared to the 

Newtonian model (See Section 4.2). However, there are 

some key differences in the implementation. Instead 

of using the Lane-Emden Equation, the Tolman- 

Oppenheimer Equation is used (Eq. 8-10). After 

integrating the Equations (28)-(31) in the interior of the 

star, the Zerilli function and its derivatives are computed 

at r = R (Eq. 32-36). Runge-Kutta integration is used to 

iterate the Zerilli function far from the star where it is 

matched onto the asymptotic power series expansions and 

the coefficients β(ω) and γ(ω) are calculated (Eq. 39). γ(ω) 

replaces det(M) in the Newtonian model, and we proceed 

as before locating eigenfrequencies for various spherical 

harmonics. 

7. Relativistic Model Results and Discussion 

We calculate the normalized perturbations of a 

polytrope with n = 3 as a function of dimensionless radius 

r. Although n = 1 is most appropriate for a neutron star, we

use n = 3 initially to best compare to the Newtonian model. 

The spherical harmonic degree is l = 2, and this particular 

perturbation is the second harmonic pressure mode. 

Figure 6 shows the results of this calculation. The four graphs, left to right and top to bottom, are perturbation variables X, W, K, and X_0. Recall K and X_0 represent metric perturbations (Eq. 23). The shapes of these curves closely match the shapes of y_3 and y_4 in the Newtonian section (Fig. 5). X and W are different variables than y_1 and y_2, explaining the differences in the shapes of the top two panels between Figures 5 and 6.

Figure 6. Perturbation variables calculated for 

a specific pulsation, with corresponding Zerilli 

variable integrated outside the compact object using 

the Zerilli equation. 

These results show strong qualitative similarities to 

our previous Newtonian results, indicating the relativistic 

model is successful in predicting the general behavior of 

the interior perturbation variables. The single discernible 

frequency and sinusoidal shape of the Zerilli function 

indicate these methods can locate quasinormal modes 

fairly accurately. Further research is ongoing to search 

for exact quasinormal eigenfrequencies. Until then, we 

cannot numerically compare quantitative results between

the Newtonian and relativistic models. Nonetheless, 

our model successfully replicates the Newtonian model 

behavior within curved spacetime as well. 


8. Acknowledgements

I would like to thank Mr. Reece Boston (UNC-Chapel

Hill), Dr. Charles Evans (UNC-Chapel Hill), and Dr. 

Jonathan Bennett (NCSSM) for their continued support 

and guidance throughout this research project. 

9. References 

Bowman, D. M., Buysschaert, B., Neiner, C., Pápics, P. I., Oksala, M. E., & Aerts, C. (2018). K2 space photometry reveals rotational modulation and stellar pulsations in chemically peculiar A and B stars. A&A, 616, A77. Retrieved from https://doi.org/10.1051/0004-6361/201833037 doi: 10.1051/0004-6361/201833037

Caplan, M. E., Schneider, A. S., & Horowitz, C. J. (2018, Sep). Elasticity of nuclear pasta. Phys. Rev. Lett., 121, 132701. Retrieved from https://link.aps.org/doi/10.1103/PhysRevLett.121.132701 doi: 10.1103/PhysRevLett.121.132701

Chandrasekhar, S., & Detweiler, S. (1975). The quasinormal 

modes of the Schwarzschild black hole. The Royal

Society. 


Christensen-Dalsgaard, J., & Mullan, D. J. (1993). Accurate 

frequencies of polytropic models. Royal Astronomical 

Society. 

Fackerell, E. D. (1971). Solutions of Zerilli’s equation for

even-parity gravitational perturbations. The Astrophysical 

Journal. 

Hartle, J. B. (2003). Gravity: An introduction to Einstein’s

general relativity (1st ed.). San Francisco: Addison-Wesley. 

Knapp, J. (2011). Polytropes. 

Lindblom, L., & Detweiler, S. L. (1983). The quadrupole 

oscillations of neutron stars. The Astrophysical Journal. 

Lindblom, L., & Detweiler, S. L. (1985). On the nonradial 

pulsations of general relativistic stellar models. The 

Astrophysical Journal. 

Andersson, N., Kokkotas, K. D., & Schutz, B. F. (1995). A new

numerical approach to the oscillation modes of relativistic 

stars. 

Thorne, K. S., & Campolattaro, A. (1967, Sep). Nonradial pulsation of general-relativistic stellar models. I. Analytic analysis for l ≥ 2. The Astrophysical Journal, 149, 591. Retrieved from http://adsabs.harvard.edu/abs/1967ApJ...149..591T doi: 10.1086/149288

Tooper, R. F. (1964). General relativistic polytropic fluid 

spheres. The Astrophysical Journal, 140(434).

Tosaka, W. C. C.-B.-S.-. G. (n.d.).

Townsend, R., & Teitler, S. (2013). GYRE: An open-source stellar oscillation code based on a new Magnus multiple shooting scheme.

Unno, W. (1989). Nonradial oscillations of stars (2nd ed.). 

Tokyo: University of Tokyo Press. 



EFFECT OF ELLIPTIC FLOW FLUCTUATIONS ON THE 

TWO- AND FOUR-PARTICLE AZIMUTHAL CUMULANT 

Brian Lin 

Abstract 

We incorporate finite elliptic flow fluctuations for the 2-particle and 4-particle azimuthal cumulants. Starting from expressions that include transverse momentum conservation, we consider three potential v_2 distributions: a Gaussian distribution, a Bessel-Gaussian distribution, and a power law distribution. For the Bessel-Gaussian distribution, we find the results are sensitive to the size of fluctuations, and c_2{4} values at large multiplicity range from 0 to significantly negative. Therefore, the 4-particle cumulant c_2{4} with transverse momentum conservation can be used to study elliptic flow fluctuations in both small and large systems.


1. Introduction

In Pb+Pb and p+Pb collisions at heavy ion colliders, evidence indicates that a nearly perfect fluid of quarks and gluons is produced. The collective flow phenomenon that arises from these collisions is predicted very well by hydrodynamics. The collision’s initial geometric anisotropies cause azimuthal anisotropy in the produced particles, which is the clearest indicator of the collective flow phenomenon (Nagle and Zajc, 2018).

The study of relativistic heavy ion collisions originates 

from the desire to learn more about the basic origins of matter 

and, in particular, a new form of QCD matter: the Quark- 

Gluon Plasma (QGP). Understanding of the QGP will reveal 

fundamental properties of matter in high-temperature and 

high-density systems, such as systems existing in the core of 

neutron stars and theorized to have existed in the early stages 

of the Big Bang (Jacak and Steinberg, 2010). 

Elliptic flow is an essential observable that can reveal the equation of state of the QGP among other important characteristics of dense matter, so accurate measurement of elliptic flow has significant theoretical implications (Snellings, 2011). However, momentum conservation and jet quenching contribute non-flow effects to the measured elliptic flow values. Thus, the field of relativistic heavy ion collisions utilizes cumulants, which suppress the effects of non-flow factors and emphasize the true effects of collective flow. Elliptic flow reflects the initial geometric anisotropies of the overlapping nuclei. Even at the same centrality, the elliptic flow for each event differs due to fluctuations. Therefore, we need to consider the effects of fluctuations on multi-particle cumulants (Bilandzic et al., 2011).

The clearest way to remove non-flow effects from elliptic 

flow coefficients is to analyze the azimuthal cumulants 

associated with the collisions. However, even these azimuthal 

anisotropies differ as the overlap between the two nuclei 

varies. Thus, the measured elliptic flow coefficient is expected 

to follow a probability distribution. This paper calculates the 

effect of elliptic flow distributions on two-particle and four-particle cumulants, which have been calculated assuming

global transverse momentum conservation. 

We calculate new expressions for c_2{2} and c_2{4} by incorporating three predicted distributions of elliptic flow that originate from geometric anisotropies between events. We primarily analyze the effect of the distribution characteristics on the values of the azimuthal cumulants.

2. Methods 

The two- and four-particle azimuthal cumulants are functions of elliptic flow, v_2, so we determine new expressions for c_2{k} by incorporating the v_2 fluctuations as probability density distributions, P(v_2). The single-event average 2-particle and 4-particle azimuthal correlations are defined as follows, where the brackets represent averaging over all particles in the event:

The average 2-particle and 4-particle azimuthal correlations 

over many events may be written as follows, where we 

denote two averages, first over all particles in an event and 

then over all events: 

The single-event average 2-particle and 4-particle 

azimuthal cumulants are defined as: 

Due to fluctuations in the average azimuthal cumulants 

between events, we calculate the event-averaged 2-particle 

and 4-particle cumulant values, denoted and 

respectively, by incorporating a probability distribution on v_2 as follows:


We now introduce the formulas derived earlier (Bzdak and Ma, 2018) using the assumption of transverse momentum conservation (TMC). This assumption makes a key contribution for small systems because the last particle’s momentum is restricted. The effect of TMC diminishes as the number of particles, N, increases because the effect of one particle’s momentum also decreases with N.

2.1 – TMC Formulas

The original formulas (Bzdak and Ma, 2018) contained the variable v_2(p), which we denote for brevity v_2. In both instances, this represents the elliptic flow value at a specific momentum p. We see the 2-particle and 4-particle cumulant as a function of other variables such as transverse momentum p, number of produced particles N, and the expected value of the square of transverse momentum over the full phase space . We define

2.2 – General Distribution

We denote

When we incorporate the probability distribution P(v_2) we integrate in terms of v_2. Thus, for a general distribution, we may express

3. Results

3.1 – Gaussian Formulas

We begin by incorporating a Gaussian distribution of v_2, as follows:

For the Gaussian distribution, we have:

3.2 – Bessel-Gaussian Formulas

The Bessel-Gaussian distribution that we give v_2 is defined as:

where I_n(x) denotes the modified Bessel function of the first kind of order n. For the Bessel-Gaussian distribution, we have:

3.3 – Power Law Formulas

We continue by incorporating the power law distribution of v_2, given as:

For the Power-Law distribution, we have:

We plot the three distributions under the condition that ⟨v_2⟩ = 0.05.

Figure 1. Plotted above are sample probability distributions where ⟨v_2⟩ = 0.05.



The probability distributions are plotted so that the 

expected value of elliptic flow remains 0.05. We define w = 

for the Bessel-Gaussian distribution. When w = 0, 

the Bessel-Gaussian distribution reduces to the Gaussian 

distribution. We see the Gaussian, Bessel-Gaussian with w 

= 0, and Power Law curves are all very similar, while the 

Bessel-Gaussian curve with w = 2 differs from the other 

three curves. 

3.4 – An Example of Numerical Comparisons 

Our graphs take a reasonable value for ⟨v_2⟩ of 0.05.

Additionally, we assume = 0.025 and = 0.25

(GeV/c)². This allows us to solve for the unknown σ in the

Gaussian distribution, σ and in the Bessel-Gaussian 

distribution, and α in the Power Law distribution. The 

Gaussian σ value turns out to be around 0.0564, while α 

≈ 313. 
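The quoted parameter values can be checked numerically under assumed functional forms. The distribution formulas did not survive in this text, so the sketch below assumes the two-dimensional Gaussian P(v_2) = (2 v_2/σ²) exp(−v_2²/σ²) and the power law P(v_2) = 2α v_2 (1 − v_2²)^(α−1). These forms are assumptions, chosen because they reproduce the numbers stated above (σ ≈ 0.0564 and α ≈ 313 for ⟨v_2⟩ = 0.05). The flow-only combination ⟨v_2⁴⟩ − 2⟨v_2²⟩² is used as the large-multiplicity 4-particle cumulant, with no TMC terms. All names are hypothetical.

```python
import math

def gaussian_pdf(v, sigma):
    """Assumed two-dimensional Gaussian form P(v2) = (2 v2 / sigma^2) exp(-v2^2 / sigma^2)."""
    return 2.0 * v / sigma**2 * math.exp(-(v / sigma) ** 2)

def moment(pdf, k, v_max, n=20000):
    """Midpoint-rule estimate of the integral of v^k pdf(v) over [0, v_max]."""
    h = v_max / n
    return sum(((i + 0.5) * h) ** k * pdf((i + 0.5) * h) for i in range(n)) * h

mean_v2 = 0.05
sigma = 2.0 * mean_v2 / math.sqrt(math.pi)   # for this form, <v2> = sigma * sqrt(pi) / 2
pdf = lambda v: gaussian_pdf(v, sigma)
m1 = moment(pdf, 1, 10 * sigma)
m2 = moment(pdf, 2, 10 * sigma)
m4 = moment(pdf, 4, 10 * sigma)
c2_4_flow = m4 - 2.0 * m2**2                 # flow-only 4-particle cumulant, no TMC terms

def power_law_mean(alpha):
    """Mean of the assumed form P(v2) = 2 alpha v2 (1 - v2^2)^(alpha - 1) on [0, 1]."""
    return alpha * math.exp(math.lgamma(1.5) + math.lgamma(alpha) - math.lgamma(alpha + 1.5))

# Bisection on alpha: the mean decreases monotonically as alpha grows.
lo, hi = 1.0, 1.0e5
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if power_law_mean(mid) > mean_v2:
        lo = mid
    else:
        hi = mid
alpha = 0.5 * (lo + hi)
```

Under these assumed forms, the flow-only 4-particle combination for the Gaussian vanishes for any σ, which matches the statement that the Gaussian result approaches 0 at large multiplicity.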

For this section, we tune the Bessel-Gaussian mean 

and width to that of the Gaussian (i.e. we control both 

and ). Our solutions are ( , σ) = (0, 5.64 × 10⁻²) when we equate the variances of the Bessel-Gaussian and Gaussian distributions, and ( , σ) = (1.57 × 10⁻², 5.42 × 10⁻²) when we equate the variances of the Bessel-Gaussian

and Power Law distributions. 

The Gaussian and Bessel-Gaussian with w=0 

distributions are identical, and the power law distribution 

is very similar to them. Thus, these three distributions 

result in an upward shift of the 2-particle cumulant. 

Because of the similarity between all three distributions, all

three lead to essentially the same 2-particle cumulant (fig. 

2). 

The event-averaged 4-particle cumulant approaches 

for large N. Specifically for the 

Gaussian distribution, from section 3.1, the Gaussian 

approaches 0 regardless of σ. Because we set the 

variance of the Bessel-Gaussian distribution equal to 

that of the Gaussian distribution, and is similar to that 

of the power law distribution, the behaviors of all three 

distributions are very similar. More general features of the 

Bessel-Gaussian will be shown in section 3.5. 

Figure 2. The event-averaged 2-particle cumulant 

(top panel) and 4-particle cumulant 

(bottom panel) for all three distributions (Gaussian, 

Bessel-Gaussian, and Power Law distributions) 

plotted as a function of event multiplicity N. The 

results obtained without elliptic flow fluctuations 

are shown in comparison in black. Additionally, 

experimental results obtained from the ATLAS experiment are shown in green.

3.5 – Effects of Relative Fluctuation Size 

The Bessel-Gaussian probability distribution, defined 

in Section 3.2, may be rewritten in terms of two instead 

of three variables. Denoting u = /σ and w = /σ, we 

may rewrite the Bessel-Gaussian distribution in terms of 

u and w as: 

We define the relative fluctuation of v_2 as:

We can show that r(v_2) = r(u) is a function of w only, and so to manipulate the Bessel-Gaussian distribution, we only need to manipulate w. Specifically, when w = 0, the relative fluctuation of v_2 reaches a maximum of .

In the large event multiplicity limit, the 4-particle 

Bessel-Gaussian cumulant is given by: 

while the cumulant neglecting elliptic flow fluctuation is 


. Both these values depend on two parameters: 

and σ. However, their ratio, which approaches 

, is dependent only on w. The ratio of 

these two values is shown as the dashed curve, and the 

relative fluctuation of v_2 is shown as the solid curve (fig. 3).

When we equate the variance of the Bessel-Gaussian 

distribution to that of the Gaussian distribution, we 

obtain w = 0, and the maximum relative fluctuation (fig. 

3). The corresponding large-N limit of for w = 0 

is 0. On the other hand, for large w, relative fluctuations 

are small (fig. 3). In that limit, the Bessel-Gaussian event-averaged 4-particle cumulant will approach the results obtained through TMC that neglected v_2 fluctuation, i.e.

at large N, and so the ratio for large w approaches 1.

Because both curves (fig. 3) are solely functions of w, 

the ratio 

at large N may be expressed as a 

function of the v_2 relative fluctuation only. The 4-particle cumulant ratio against r(v_2) is shown as the dashed curve

in Figure 4. As the relative fluctuation increases, we see 

the ratio decrease from 1 to 0, i.e. the goes from 

significantly negative to 0 for large N. 

For the 2-particle cumulant, we observe its large event 

multiplicity limit to be . Additionally, the 2-particle 

cumulant neglecting elliptic flow fluctuation approaches 

, so at large N, the ratio between the two may be 

written as 

Figure 4. The ratio between Bessel-Gaussian and non-fluctuation cumulants at large N for both c_2{2} and c_2{4}.

Therefore, we see as the relative fluctuation increases, 

the ratio increases from 1 to 4/π. 

To visualize our results, we plot in Figure 5 the Bessel-Gaussian 2-particle and 4-particle cumulants for varying relative fluctuations of v_2 under the condition that ⟨v_2⟩ = 0.05. These relative fluctuations were chosen so that r(v_2) = 0.523, 0.466, 0.319, and 0 correspond with w values of 0, 1, 2, and ∞ respectively.
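The dependence of the relative fluctuation on w alone can be checked numerically. The scaled Bessel-Gaussian formula is missing from this text, so the sketch assumes the form P(u) = 2u exp(−(u² + w²)) I_0(2uw). That form is an assumption, chosen because it reproduces the quoted values r = 0.523, 0.466, and 0.319 at w = 0, 1, and 2. The Bessel function is summed from its power series to keep the example free of external libraries. The 4-particle ratio uses only the flow part, ⟨u⁴⟩ − 2⟨u²⟩², compared against −⟨u⟩⁴, with no TMC terms. Function names are hypothetical.

```python
import math

def bessel_i0(z):
    """Modified Bessel function I_0(z) from its power series (adequate for moderate z)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (z / 2.0) ** 2 / k**2
        total += term
    return total

def bg_pdf(u, w):
    """Assumed scaled Bessel-Gaussian: P(u) = 2u exp(-(u^2 + w^2)) I_0(2uw)."""
    return 2.0 * u * math.exp(-(u * u + w * w)) * bessel_i0(2.0 * u * w)

def bg_moment(k, w, n=4000):
    """Midpoint-rule estimate of <u^k> over [0, w + 8]; the tail beyond is negligible."""
    h = (w + 8.0) / n
    return sum(((i + 0.5) * h) ** k * bg_pdf((i + 0.5) * h, w) for i in range(n)) * h

def relative_fluctuation(w):
    """Standard deviation of u divided by its mean."""
    m1, m2 = bg_moment(1, w), bg_moment(2, w)
    return math.sqrt(m2 - m1 * m1) / m1

def c4_ratio(w):
    """Flow-only large-N ratio of the fluctuating to the non-fluctuating c_2{4}."""
    m1, m2, m4 = bg_moment(1, w), bg_moment(2, w), bg_moment(4, w)
    return (m4 - 2.0 * m2 * m2) / (-m1**4)

fluctuations = [relative_fluctuation(w) for w in (0.0, 1.0, 2.0)]
ratios = [c4_ratio(w) for w in (0.0, 1.0, 2.0)]
```

Because u = v_2/σ, the overall scale σ cancels in both quantities, which is why a single parameter w controls the curves in Figures 3 and 4.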

Figure 3. Relative fluctuation of v_2 for the Bessel-Gaussian distribution (solid) and the ratio between the large-N limits of the Bessel-Gaussian and non-fluctuation cumulants (dashed) as functions of w.

Figure 5. The event-averaged (first) and 

(second) for the Bessel-Gaussian distribution 

as a function of event multiplicity N. Results with 

various amounts of relative fluctuations of v_2 are

shown. Again, ATLAS results for the 4-particle 

cumulant are shown in green. 



4. Conclusions 

When elliptic flow fluctuations are included in the calculations of two- and four-particle azimuthal cumulants, there is a definite shift in the cumulant. For the two-particle cumulant, there is an increase for the Gaussian, Bessel-Gaussian, and power law distributions of v_2 fluctuations. Meanwhile, for the four-particle cumulant, we observe a large positive shift so that its value is close to 0 for large event multiplicity when we incorporate Gaussian or Power Law elliptic flow distributions. The Bessel-Gaussian distribution allows for variation of the relative fluctuation of v_2. When the relative fluctuation is small, the 4-particle cumulant tends towards a significantly negative value at large event multiplicity, approaching results obtained previously without including v_2 fluctuations. When the relative fluctuation is large, the cumulant goes to zero at large event multiplicity, approaching results from the Gaussian and power law elliptic flow distributions. Therefore, the c_2{4} observable may be used to probe the fluctuation of elliptic flow in both small and large systems.

5. Acknowledgements 

Firstly, we would like to acknowledge Dr. G.L. Ma of 

Fudan University for his patient, insightful mentorship. 

We would like to gratefully thank Dr. Z.W. Lin for 

valuable discussions and feedback. We also acknowledge 

Dr. J. Bennett for engaging in weekly discussions and 

offering advice as well as the NCSSM Foundation for 

providing the necessary support and resources to carry out 

this research.

6. References 

Nagle, J. L., & Zajc, W. A. (2018). Small System Collectivity 

in Relativistic Hadronic and Nuclear Collisions. Annual 

Review of Nuclear and Particle Science, 68 (1), 211-235. 

Snellings, R. (2011). Elliptic flow: A brief review. New Journal 

of Physics, 13(5), 055008. 

Jacak, B., & Steinberg, P. (2010). Creating the perfect liquid 

in heavy-ion collisions. Physics Today, 63(5), 39-43. 

Bzdak, A., & Ma, G. (2018). A remark on the sign change 

of the four-particle azimuthal cumulant in small systems. 

Physics Letters B, 781, 117-121. 

Bilandzic, A., Snellings, R., & Voloshin, S. (2011). Flow 

analysis with cumulants: Direct calculations. Physical Review 

C, 83(4). 


AN INTERVIEW WITH DR. VALERIE ASHBY 

From left, Navami Jain, BSS Editor-In-Chief; Emily Wang, BSS Editor-In-Chief; Dr. Jonathan Bennett, BSS Faculty 

Advisor; Dr. Valerie Ashby, Dean of Trinity College of Arts & Sciences at Duke University; Kathleen Hablutzel, Publication 

Editor-In-Chief; and Jackson Meade, BSS Essay Contest Winner 

What drew you to chemistry? 

That’s an easy answer. My dad was a math and science 

teacher; he taught chemistry and various versions of math 

in high school… so science was never scary to me. It just 

seemed like what we did… The second thing is I had a great 

high school chemistry teacher. I actually did something I 

don’t recommend to my own Duke students, which is to 

decide what you’re going to major in before you arrive. 

Leaving high school, I said that I’m going to be a chemistry 

major and I’m not going to change my major, because I had 

heard these stories about how college students hit their 

first hard course or their second hard course and they shift 

their major. I decided I was not going to do that. The good 

news was that I loved it, even at the college level… that’s 

how I decided I was going to major in chemistry. Science 

was always my thing. 

So you’re in more of an administrative position now. Do 

you ever wish you could go back to the lab? 

Oh, you mean every 30 seconds? I wish for you that every 

job you have is your favorite job. And I have led this crazy, 

lovely life where every single job that I have held has been 

my favorite job at that moment. When I was a faculty 

member, it was my favorite job. Who I am and what I do 

have overlapped my entire life. That’s a gift that I get to 

be who I am in my job. I am a teacher, that's who I am. 

Even though I’m out of the classroom, that’s still who I am. 

The way that it presents itself now is through inspiring 

other teachers, encouraging other faculty, and mentoring 

students... I have office hours with students every Friday

even though I’m not teaching. They come and talk to me 

about their lives and I get to do the thing that I love… 

I also miss running my old research group. I kept my 

research group at UNC when I took this job… I graduated 

my last PhD students from UNC Chapel Hill last year. 

For the first time in twenty years I haven’t had my own 

research group. I’m so busy that I don’t have time, but 

I miss training graduate students and I miss creating 

knowledge. There’s something about waking up every 

day trying to do something that nobody else has ever done 

and answering a question that remains open, and then 

teaching other people how to do that… it is so much fun. 

FEATURED ARTICLE 


We were wondering how the scientific and problem-solving

skills you’ve gained as a chemist have translated 

into other roles such as your current role? 

It was absolutely great training. When you do scientific 

research, it is team-based with vertically integrated teams; 

so a professor, a postdoc, graduate students, undergrad 

students, and then high school students who come in the 

summer or during the academic year. That team-based 

approach and learning how to work with every level of 

that team are great training for what I do here. 

I have an administrative team and it’s a vertically integrated 

team… When you run a research group, you’re not just 

doing science, you’re doing people - people who spend a 

lot of time together in close proximity. Teaching graduate 

students how to navigate being in a group that has a 

personality and a culture… I had to manage all the finances 

of the group, so I learned how to do big budgets for grants. 

You learn how to write, you learn how to communicate 

- so many different parts of running a research team. It’s 

like a small business if you're doing science... So what 

do I do in my present job? I run the finances - they’re 

my responsibility. Human resources, the well-being of 

students, faculty, and staff, making sure that we’re being 

collaborative and collegial - all my responsibility. It’s 

absolutely great training and I think I use all of that now. 

My day-to-day life is really all of those skills that you learn 

about being in a team and managing people. 

And my job is to raise money. If you’re going to do science, 

you better know how to raise money. You may know 

who Joe DeSimone is - I was his first PhD student so we 

have known each other for a very long time and one of 

my favorite Joe quotes is “Val, a vision without funding 

is just a hallucination.” And as a scientist, if that’s not 

your mindset, you can’t actually do your science. This 

enterprise doesn’t run without funding, so being a little bit 

entrepreneurial is important... for this job. 

While at UNC you worked with an NSF grant to 

increase the number of underrepresented minority 

students who receive doctoral degrees in STEM fields. 

What were some of your more effective policies and 

what challenges have you personally faced as a minority 

woman in STEM? 

Quite frankly, I never paid any attention to being a woman 

or being underrepresented. Now, that’s a luxury. People 

treated me so well it was never my experience. Now when 

I advocate for women and underrepresented people I have 

to say to them, “I haven’t had a bad experience. My goal

is for you not to. And if you have, my goal is to help you 

with it.” My PhD advisor was incredible - some people 

have trouble with that. The reason I want to help so many 

people is because I have had such a wonderful experience. 

I always say to people if somebody tried to offend me at 

some point or did something, I just didn’t take it in… it just 

never affected me. So that’s my history with that. 

I loved working in that program and it had a model that 

worked already and my job was to not break it and to 

try to expand it. It’s a cohort model of students and it’s 

everything from making sure students are onboarded into 

their departments. It’s very isolating to be a grad student. 

Especially if you are an underrepresented student, you 

could be the only one in the program. If you’re not in a 

group that welcomes you and has a great culture, it can feel 

even more isolating. We were the place where students 

could come when they hit roadblocks... Sometimes we

were the place that would support them in going to talk 

about their research... we would pay for travel for them 

to go to conferences... we would help them engage with 

faculty and collaborators... So many different ways. It 

was quite successful and we were able to expand it into 

the humanities, because all grad students need support for 

different reasons. 

What do you think is the future for women in STEM, 

and what can we do to make sure that the STEM fields 

are inclusive for all people? 

That’s a great question. When you look at the number of 

women faculty that we have in each one of our disciplines, 

we are not very different from most universities... we have 

more women who are humanists than social scientists 

and scientists. I think 23-27% of our science faculty are 

women. 50% of the graduate students are women... but 

the numbers just don’t translate into the faculty for several 

reasons... so we have a lot of work to do here for women 

in science. Part of that is making sure that we have a 

culture that is welcoming, but also that we are thinking 

about how families and having children affect women

and men differently. It’s serious when you’re a scientist 

because you have to be in the lab, right? There are several 

family-friendly things that we can do… but making sure 

that people have the mentorship that they need is really 

important… [and] making sure the climate is such that we 

are equally supportive of every single person. That’s not 

trivial to pull off. 

What can you do [referring to Navami, Emily, and 

Kathleen]? Stay in. Don’t quit. If you love it, stay in. Even 

if it gets hard just stay in there. Find some great mentors… 

I have four mentors that I’ve had for more than twenty 

years, including my PhD advisor. They keep me going. 

When it got hard, I wanted to quit. And they kept me 

going. Get good mentors. What can you do [referring to 

Jackson]? What you do is more important than what they 

do. All of my mentors are men. That actually is just what 

happened in my life. I’m not saying it’s a good or bad thing. 

But you being equally as supportive is important… I’ve got 

four of them [mentors] and they’ve been incredible. They 

were just the right people for me… 

If you love it, don’t let anything keep you from doing it. 

Your part is to find what you love and don't give anybody 

the power to take you out of doing what you are supposed 

to be doing. 

What advice do you have in general for STEM majors? 

Get some sleep is what I tell my Duke students. Just relax. 

It’s okay. It can be pretty intense. Have some fun. I’m 

serious about that. I think the reason I love what I was 

doing and what I have always done is because I have a 

balanced life. The sooner you start taking care of your 

whole self and form that habit, the better. 

The problem with being an independent scientist is that 

you’re independent, which is the same problem I have 

with this job… nobody’s telling me when to come to work 

every day and nobody’s telling me when to go home. The 

problem is that if you are a crazy workaholic, you can do 

this 24/7. As an independent scientist, you are actually 

working for yourself because you’re running your own 

small business. When do you not work because everything 

you're doing is for you and your group? Start practicing 

now being more balanced. The other recommendation I 

would have, at least my experience with STEM majors, is 

to make sure you really get a great liberal arts education. 

You’re going to be smart enough; that’s not the question. 

This navigating across culture, ethics, language… that’s 

actually going to make you a more creative scientist. You 

never know where you’re going to land in this world, 

right? You might be doing your science on the other side of 

the world. You need to feel an appreciation for differences 

in culture and religion. Get a great liberal arts education 

with depth in your science and I think it sets you up in a 

beautiful way. 

Can you tell us about a time that you failed and what 

you’ve learned from that experience? 

When I failed? Sure - you want to talk about last week or 

yesterday or 20 minutes ago? [laughs] 

So in graduate school at UNC, you can get a high pass, you 

can get a pass, you can get a low pass. That’s the grading 

scale. So I took a mechanistic organic chemistry class and 

I got an L. And what that means is that if you’re in a PhD 

program you get bumped out of the PhD program down 

to the Master's program. Let me give some context to you. 

We don’t admit Master's students typically into chemistry. 

Because you can go from a B.A. or B.S. to a PhD and almost 

nobody gets a Master’s degree intentionally and stops. 

So I got bumped down to the Master's and had to earn 

my way back into the PhD program, meaning that I had 

to pass. So having a good mentor is a good thing, because 

right there I would have been gone and everything after 

that would not have been possible had my PhD advisor 

not said, “Val, this is not a big deal. You weren’t prepared

because you didn’t know you were going to graduate 

school.” And watching somebody else not flinch is really 

good. He was so supportive. He said, “This is not a problem.

We’re going to do what we need to do here. We’re gonna 

pretend like this didn’t happen and we’re gonna keep you 

moving as if you’re on the PhD track.” So I took my PhD 

comps. 

And I did all of the hourly exams - we took them on 

Saturdays; you have to pass a certain number before you 

qualify to take the actual oral exam. And then after I took 

my comps I had to request in a letter to be readmitted. And 

I did and there I was. And it was as if it didn’t happen… 

Thank goodness for mentorship, because when your head 

is not in the right place, your mentor can keep your feet 

moving until your head catches back up… 

The beauty for me of that failure is that when a student 

comes in here and they have had an academic failure they 

don’t think I’ve had one, right? Because they think you 

can’t really do the Dean stuff, can you? What I get to say to 

them is, it turns out, you can. You’re fine. You can recover. 

And then I tell them my story. 

I mentor students who think that their first failure is the 

end of the road. Turns out you can get a C in physics and 

still be the Dean. Perfection is not required. 

For sports, Duke or UNC? 

Oh - so I’m glad you asked me this. So Duke. I have to tell 

you my story - this is so fun. So I hated Duke because I had 

two UNC degrees, and not only that, I had an undergraduate

degree, and when you have an undergraduate degree

from UNC, the hate is deep. It’s like genetic. I was such a

Duke hater that I would root for anybody playing Duke 

because I just wanted Duke to lose and badly, with shame. 

[laughs] So when one of my mentors suggested that I 

interview for this job, I said to him, “How am I going to 

be able to do this?” And he said, “Val, get over yourself.” 

And he is a UNC alum and he said this is a great job and 

it’s a great place and you’re going to love the people, 

you’re going to love the students. And all of that stuff 

is going to go away the moment you show up and meet 

people. And in my first interview, I walked out and I said 

if they offer me this job I’m taking it. And I just found 

my people sitting right there at the table and it was just 

stunning to me… It’s a serious lesson for me on diversity. 


It’s easy to not like people from a distance. The moment 

I know you, the game is over. Everything I told myself 

about you is no longer true. You just become another 

person, and that’s what I found. I sat at that table and I 

thought “I love these students.” I love the ideals and 

the values and I’m like, “These are my people.” I love 

this place. I’m all in Duke. I’m fiercely competitive in 

sports and I love great coaching. Duke 100%. On the 

weekends, I’m in full Duke gear. It drives my friends 

insane. [laughs] But it was surprisingly easy. The people 

made all the difference and I love this place. I really do. 

So this isn’t a newfound hatred for UNC, it’s a newfound 

understanding? 

It’s a newfound understanding and I never thought you 

could love both of those places. I so appreciate what UNC 

has done for me. I love how UNC grew me and supported 

me and got me here. And I love that these guys have 

accepted me but I also love what we do here - it’s pretty 

doggone special and those students are incredible. I get to 

love both. 

BROAD STREET SCIENTIFIC 

The North Carolina School of Science and Mathematics Journal of Student STEM Research 

ncssm.edu/bss 

VOLUME 8 | 2018-2019 
