SkyShot - Volume 1, Issue 1: Autumn 2020
The inaugural issue of SkyShot, an online publication for promoting understanding and appreciation for outer space. As an international community, we share the work of undergraduate and high school students through a multidisciplinary, multimedia approach. Features research papers, astrophotography, informative articles, guides, and poetry in astronomy, astrophysics, and aerospace.
a unique purpose, such as convolution
layers for generating feature maps from
the image, pooling layers for extracting
key features such as edges, dense layers
for combining features, and dropout layers
that prevent overfitting to the training
set. [10]
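The roles of the convolution and pooling layers can be illustrated with a minimal pure-Python sketch. The toy image, the hand-picked edge kernel, and the function names `convolve2d` and `max_pool` are illustrative assumptions; a real CNN learns its kernels from data and would be built with a deep learning library.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) to build a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the strongest response in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, edge_kernel)  # 4x4 map, peaks at the edge
pooled = max_pool(feature_map)                # 2x2 summary of the map
print(feature_map[0], pooled)
```

The feature map is nonzero only where the brightness changes, and pooling shrinks the map while keeping that response.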
This method was applied to galaxy
classification by researchers at the National
Astronomical Observatory of Japan
(NAOJ). The Subaru Telescope, an
8.2-meter optical-infrared telescope at
Maunakea, Hawaii, serves as a robust
source of data and images of galaxies
due to its wide coverage, high resolution,
and high sensitivity. [11] In fact, earlier
this year, astronomers used Subaru Telescope
data to train an algorithm to learn
theoretical galaxy colors and search for
specific spectroscopic signatures, or
light frequency combinations. The algorithm
was used to identify galaxies in the
early stage of formation from data containing
over 40 million objects. Through
this study, astronomers discovered HSC
J1631+4426, a relatively young galaxy that
broke the previous record for the lowest
oxygen abundance. [12]
In addition, NAOJ researchers have
been able to detect nearly 560,000 galaxies
in the images and have had access
to big data from the Subaru/Hyper
Suprime-Cam (HSC) Survey, which
contains deeper band images and has
a higher spatial resolution than images
from the Sloan Digital Sky Survey. Using
a convolutional neural network (CNN)
with 14 layers, they could classify galaxies
as non-spirals, Z-spirals, or
S-spirals. [10]
This application presents several important
takeaways for computational
astrophysics. The first is the augmentation
of data in the training set. Since
the number of non-spiral galaxies was
significantly greater than the number of
spiral galaxies, the researchers needed
more training set images for Z-spiral and
S-spiral galaxies. To achieve this
without acquiring new
images, they flipped, rotated,
and rescaled the existing images with
Z-spiral and S-spiral galaxies, generating
a training set with roughly similar numbers
for all types of galaxies.
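The augmentation idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' pipeline: the function names, the toy 2x2 "images", and the `per_class` target are hypothetical, and rescaling is omitted for brevity. One geometric point worth noting is that a mirror flip reverses the apparent winding direction of a spiral, so this sketch swaps the S and Z labels when it flips an image; whether the original study handled flips this way is an assumption.

```python
import random
from collections import Counter

# A mirror image of an S-spiral looks like a Z-spiral, and vice versa.
MIRROR_LABEL = {"S": "Z", "Z": "S", "non-spiral": "non-spiral"}


def rotate90(img):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in img]


def augment(img, label):
    """Return 8 (image, label) variants: 4 rotations, each with its mirror."""
    variants = []
    current = img
    for _ in range(4):
        variants.append((current, label))
        variants.append((flip_horizontal(current), MIRROR_LABEL[label]))
        current = rotate90(current)
    return variants


def balance(dataset, per_class, seed=0):
    """Draw per_class augmented examples for every label (with replacement)."""
    rng = random.Random(seed)
    pools = {}
    for img, label in dataset:
        for new_img, new_label in augment(img, label):
            pools.setdefault(new_label, []).append(new_img)
    balanced = []
    for label, pool in pools.items():
        balanced.extend((rng.choice(pool), label) for _ in range(per_class))
    return balanced


# Hypothetical toy dataset, heavily skewed toward non-spirals.
toy = [[1, 2], [3, 4]]
dataset = [(toy, "non-spiral")] * 6 + [(toy, "S"), (toy, "Z")]
balanced = balance(dataset, per_class=10)
counts = Counter(label for _, label in balanced)
print(counts)
```

Rotations preserve the winding direction, so only the flipped copies change label.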
Second, it is also important to note
that the accuracy of AI models may
decrease when working with celestial bodies
or phenomena that are rare, because
fewer training examples are available.
The galaxy classification CNN originally
achieved an accuracy of 97.5%, identifying
spirals in over 76,000 galaxies in
a testing dataset. However, this value
decreased to only 90% when the model
was trained on a set with fewer than 100
images per galaxy type, suggesting
that accuracy could suffer if the model
were applied to rarer galaxy types.
A final important takeaway concerns
the impact of misclassification and
differences between the training dataset
and the testing dataset. When applied
to the testing set of galaxy images,
the model found roughly
equal numbers of S-spirals and Z-spirals.
This contrasted with the training set, in
which S-spiral galaxies were more common.
Although this may appear concerning,
as one would expect the distribution
of galaxy types to remain consistent, the
training set may not have been representative,
likely due to human selection
and visual inspection bias. In addition,
the authors point out that the criterion
of what constitutes a clear spiral is ambiguous,
and that the training set images
were classified by eye. As a result,
while the training set only included images
that had unambiguous spirals, the
validation set may have included more
ambiguous cases, causing the model to
incorrectly classify them.
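A simple way to check for this kind of mismatch is to compare class fractions between the training labels and the model's predictions. The sketch below uses hypothetical label counts, not figures from the study, and the helper names are illustrative. The z-score assumes that S-spirals and Z-spirals are equally likely and that galaxies are independent draws.

```python
import math
from collections import Counter


def class_fractions(labels):
    """Fraction of examples carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def s_vs_z_zscore(labels):
    """Z-score of the S count under the hypothesis that S and Z are equally likely."""
    counts = Counter(labels)
    s, z = counts["S"], counts["Z"]
    n = s + z
    # Binomial with p = 0.5: mean n/2, standard deviation sqrt(n/4).
    return (s - n / 2) / math.sqrt(n / 4)


# Hypothetical labels: a training set with an S excess, and balanced predictions.
train_labels = ["S"] * 60 + ["Z"] * 40
predicted_labels = ["S"] * 505 + ["Z"] * 495

train_fractions = class_fractions(train_labels)
train_z = s_vs_z_zscore(train_labels)
predicted_z = s_vs_z_zscore(predicted_labels)
print(train_fractions, train_z, predicted_z)
```

A large z-score in the training set alongside a small one in the predictions would point to a selection effect in how the training images were chosen.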
Several strategies can be used to combat
such issues in scientific machine
learning research. In terms of datasets,
possible options include creating a new,
larger training sample or employing numerical
simulations to create mock images.
Alternatively, a completely
different machine learning approach,
unsupervised learning, could be used.
Unsupervised learning would not require
humans to visually classify the
training dataset, as the learning model
would identify patterns and create classes
on its own. [10]
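As an illustration of how a model can create classes without human labels, here is a minimal sketch of k-means clustering, one common unsupervised method. The source does not say which unsupervised technique would be used, so the choice of k-means, the two made-up features per galaxy, and the deterministic initialization are all assumptions.

```python
import math


def kmeans(points, k, iterations=20):
    """Group points into k clusters by repeatedly updating cluster centers."""
    # Assumption for reproducibility: start from the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 2: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assignments, centroids


# Hypothetical galaxies described by two made-up features each.
galaxies = [
    (0.10, 0.20), (0.90, 0.80), (0.20, 0.10),
    (0.80, 0.90), (0.15, 0.15), (0.85, 0.85),
]
assignments, centroids = kmeans(galaxies, k=2)
print(assignments)
```

No galaxy was labeled in advance; the two groups emerge from the feature values alone, and a human would only need to interpret the resulting classes.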
In fact, researchers at the Computational
Astrophysics Research Group at
the University of California, Santa Cruz, have taken
a very similar approach to the task of
galaxy classification, focusing on galaxy
morphologies, such as amorphous elliptical
or spheroidal. Their deep learning
framework, named Morpheus, takes in
astronomical image data and performs
pixel-level classification of various
features of the image, allowing it to
discern unique objects within the same
image rather than merely classifying the
image as a whole (like the models used
by the NAOJ researchers). A notable benefit
of this approach is that Morpheus
can discover galaxies by itself and would
not require as much visual inspection or
human involvement, which can be fairly
high for traditional deep learning approaches:
the NAOJ researchers worked
with a dataset that required nearly
100,000 volunteers. [13] This is crucial,
given that Morpheus could be used to
analyze very large surveys, such as the
Legacy Survey of Space and Time, which
would capture over 800 panoramic images
per night. [13]
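To show why per-pixel labels make it possible to separate several objects in one image, the sketch below labels each pixel and then groups adjacent source pixels into distinct objects. This is not how Morpheus works internally: a simple brightness threshold stands in for its deep learning classifier, and the function names, threshold, and toy image are hypothetical.

```python
def classify_pixels(image, threshold=0.5):
    """Stand-in per-pixel classifier: bright pixels are 'source', others 'background'."""
    return [
        ["source" if value > threshold else "background" for value in row]
        for row in image
    ]


def label_objects(pixel_classes):
    """Group touching 'source' pixels (up, down, left, right) into numbered objects."""
    h, w = len(pixel_classes), len(pixel_classes[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if pixel_classes[r][c] != "source" or labels[r][c]:
                continue
            # Found an unlabeled source pixel: flood-fill a new object from it.
            count += 1
            labels[r][c] = count
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and pixel_classes[ny][nx] == "source"
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


# Hypothetical 4x4 image containing two separate bright objects.
image = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.9, 0.8],
]
labels, count = label_objects(classify_pixels(image))
print(count, labels)
```

A whole-image classifier would return a single label for this image, whereas the per-pixel output identifies two distinct objects and their positions.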
Examples of a Hubble Space Telescope
image and its classification results
using Morpheus [13].