A Framework for Evaluating Early-Stage Human - of Marcus Hutter

whether the original utterance contained a target word, and then trained several classifiers on the labeled data. They used these classifiers to classify utterances based on whether they contained the target word. This technique achieved moderate success, but the dataset was small, and it does not produce word boundaries, which is the goal of this work.

This work makes use of the Voting Experts (VE) algorithm. VE was designed to do with discrete token sequences exactly what we are trying to do with real audio. That is, given a large time series, specify all of the logical breaks so as to segment the series into categorical episodes. The major contribution of this paper lies in transforming an audio signal so that the VE model can be applied to it.

Overview of Voting Experts

The VE algorithm is based on the hypothesis that natural breaks in a sequence are usually accompanied by two information theoretic signatures (Cohen, Adams, & Heeringa 2007) (Shannon 1951). These are low internal entropy of chunks, and high boundary entropy between chunks. A chunk can be thought of as a sequence of related tokens. For instance, if we are segmenting text, then the letters can be grouped into chunks that represent the words.

Internal entropy can be understood as the surprise associated with seeing the group of objects together. More specifically, it is the negative log of the probability of those objects being found together. Given a short sequence of tokens taken from a longer time series, the internal entropy of the short sequence is the negative log of the probability of finding that sequence in the longer time series. So the higher the probability of a chunk, the lower its internal entropy.

Boundary entropy is the uncertainty at the boundary of a chunk. Given a sequence of tokens, the boundary entropy is the expected information gain of being told the next token in the time series. This is calculated as

H_I(c) = -\sum_{h=1}^{m} P(h,c) \log P(h,c)

where c is this given sequence of tokens, P(h,c) is the conditional probability of symbol h following c, and m is the number of tokens in the alphabet. Well formed chunks are groups of tokens that are found together in many different circumstances, so they are somewhat unrelated to the surrounding elements.
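The two entropy measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not the VE implementation used in the paper: it estimates probabilities from raw n-gram counts held in a `Counter` rather than a trie, and the function names (`ngram_counts`, `internal_entropy`, `boundary_entropy`) are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every contiguous subsequence of length 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def internal_entropy(chunk, counts, total_tokens):
    """Negative log of the chunk's relative frequency among same-length windows."""
    chunk = tuple(chunk)
    windows = total_tokens - len(chunk) + 1
    if windows <= 0 or counts[chunk] == 0:
        return math.inf  # an unseen chunk is maximally surprising
    return -math.log(counts[chunk] / windows)


def boundary_entropy(context, counts, alphabet):
    """Entropy of the distribution over the token that follows `context`."""
    context = tuple(context)
    followers = [counts[context + (h,)] for h in alphabet]
    total = sum(followers)
    if total == 0:
        return 0.0  # context never observed with a successor
    entropy = 0.0
    for count in followers:
        if count > 0:
            p = count / total  # conditional probability P(h, c)
            entropy -= p * math.log(p)
    return entropy


tokens = list("thecatthedogthecat")
counts = ngram_counts(tokens, max_n=4)
alphabet = sorted(set(tokens))
inside_word = boundary_entropy("th", counts, alphabet)   # "th" is always followed by "e"
end_of_word = boundary_entropy("the", counts, alphabet)  # "the" is followed by "c" or "d"
```

In this toy text the boundary entropy after "th" is zero because only one continuation ever occurs, while after "the" it is positive, which is the signature the text associates with a chunk boundary.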
Because well-formed chunks are largely unrelated to their surroundings, no particular token is very likely to follow a given chunk.

In order to segment a discrete time series, VE preprocesses the time series to build an n-gram trie, which represents all its possible subsequences of length less than or equal to n. It then passes a sliding window of length n over the series. At each window location, two “experts” vote on how they would break the contents of the window. One expert votes to minimize the internal entropy of the induced chunks, and the other votes to maximize the entropy at the break. The experts use the trie to make these calculations. After all the votes have been cast, the sequence is broken at the “peaks”, the locations that received more votes than their neighbors. This algorithm can be run in linear time with respect to the length of the sequence, and can be used to segment very long sequences. For further details, see the journal article (Cohen, Adams, & Heeringa 2007).

It is important to emphasize the VE model over the actual implementation of VE. The goal of our work is to segment audio speech based on these information theoretic markers, and to evaluate how well they work for this task. In order to do this, we use a particular implementation of Voting Experts, and transform the audio data into a format it can use. This is not necessarily the best way to apply this model to audio segmentation, but it is one way to use this model to segment audio speech.

The model of segmenting based on low internal entropy and high boundary entropy is also closely related to the work in psychology mentioned above (Saffran et al. 1999). Specifically, they suggest that humans segment audio streams based on conditional probability. That is, given two phonemes A and B, we conclude that AB is part of a word if the conditional probability of B occurring after A is high. Similarly, we conclude that AB is not part of a word if the conditional probability of B given A is low. The information theoretic markers of VE are simply a more sophisticated characterization of exactly this idea. Internal entropy is directly related to the conditional probability inside of words.
Boundary entropy, in turn, is directly related to the conditional probability between words. So we would like to be able to use VE to segment audio speech, both to test this hypothesis and to possibly facilitate natural language learning.

Experimental Procedure

Our procedure can be broken down into three steps. 1) Temporally discretize the audio sequence while retaining the relevant information. 2) Tokenize the discrete sequence. 3) Apply VE to the tokenized sequence to obtain the logical breaks. These three steps are described in detail below, and illustrated in Figure 2.

Figure 1: A voiceprint of the first few seconds of one of our audio datasets. The vertical axis represents 33 frequency bins and the horizontal axis represents time. The intensity of each frequency is represented by the color. Each vertical line of pixels then represents a spectrogram calculated over a short Hamming window at a specific point in time.

Step 1

In order to discretize the sequence, we used the discrete Fourier transform in the Sphinx software package to obtain the spectrogram information (Walker et al. 2004). We also took advantage of the raised cosine windower and the pre-emphasizer in Sphinx. The audio stream was windowed into 26.6 ms wide segments called Hamming windows, taken every 10 ms (i.e. the windows were overlapping). The windower also applied a transformation on the window to emphasize the central samples and de-emphasize those on the edge. Then the pre-emphasizer normalized the volume across the frequency spectrum. This compensates for the natural attenuation (decrease in intensity) of sound as the frequency is increased.
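The windowing and pre-emphasis described in this step can be illustrated with a small standard-library sketch. It is not the Sphinx front end: the function names are hypothetical, the pre-emphasis coefficient of 0.97 is an assumed common default rather than a value given in the paper, and the Hamming coefficients 0.54 and 0.46 are the textbook raised-cosine values. The 16 kHz sample rate in the usage lines is the rate the paper reports for its audio files.

```python
import math


def pre_emphasize(samples, coeff=0.97):
    """First-order high-pass filter y[n] = x[n] - coeff * x[n-1] (coeff is assumed)."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]


def hamming(n):
    """Raised-cosine (Hamming) window: largest in the centre, smallest at the edges."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def frame_signal(samples, sample_rate, window_ms=26.6, hop_ms=10.0):
    """Cut the signal into overlapping frames and apply the window to each frame."""
    frame_len = int(round(window_ms * sample_rate / 1000))
    hop = int(round(hop_ms * sample_rate / 1000))
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# One second of a constant signal at 16 kHz, just to inspect the frame layout.
one_second = [1.0] * 16000
frames = frame_signal(one_second, sample_rate=16000)
frame_count, frame_length = len(frames), len(frames[0])
```

With these settings each frame spans 426 samples and consecutive frames start 160 samples apart, so neighbouring frames share most of their samples, which is the overlap the text mentions.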
Finally we used the discrete Fourier Transform to obtain the spectrogram. This is a very standard procedure to obtain the spectrogram information of an audio speech signal, and a technical explanation of each of these steps is available in the Sphinx documentation (Walker et al. 2004).

We performed the Fourier Transform at 64 points. However, since we are only concerned with the power of the audio signal at each frequency level, and not the phase, some of the points are redundant: the transform of a real-valued signal is conjugate-symmetric, so only the first 33 points contain unique information. This transformation converted a 16 kHz mono audio file into a sequence of spectrograms, representing the intensity information in 33 frequency bins, taken every 10 ms through the whole file. These spectrograms can be viewed as a voiceprint representing the intensity information over time. Figure 1 shows a voiceprint taken from the beginning of one of the datasets used in our experiments.

Step 2

After discretization the next step is tokenization. Once we obtained the spectrogram of each Hamming window over the entire audio sequence, we converted it to a time series composed of tokens drawn from a relatively small alphabet. In order to do this we trained a Self Organizing Map (SOM) on the spectrogram values (Kohonen 1988).

An SOM can be used as a clustering algorithm for instances in a high dimensional feature space. During training, instances are presented to a 2D layer of nodes. Each node has a location in the input space, and the node closest to the given instance “wins.” The winning node and its neighbors are moved slightly closer to the training instance in the input space. This process is repeated for some number of inputs. Once training is complete the nodes in the layer should be organized topologically to represent the instances presented in training. Instances can then be classified or clustered based on the map. Given a new instance, we can calculate the closest node in the map layer, and the instance can be associated with that node. This way we can group all of the instances in a dataset into clusters corresponding to the nodes in the SOM.
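The SOM training loop and the mapping from instances to nodes can be sketched as follows. This is a simplified illustration under stated assumptions, not the network used in the paper: the Gaussian neighbourhood, the linearly decaying learning rate and radius, the initialisation from randomly sampled training instances, and all names (`train_som`, `nearest_node`, `tokenize`) are choices made for the example.

```python
import math
import random


def nearest_node(weights, x):
    """Grid coordinate of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda rc: sum((w - v) ** 2 for w, v in zip(weights[rc], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, radius0=1.0, seed=0):
    """Train a rows x cols map; hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Start each node at a distinct training instance (needs len(data) >= rows*cols).
    starts = rng.sample(data, rows * cols)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    weights = {rc: list(x) for rc, x in zip(coords, starts)}
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = 1.0 - t / steps               # decays from 1 towards 0
            lr = lr0 * frac
            radius = max(radius0 * frac, 1e-3)
            wr, wc = nearest_node(weights, x)    # the winning node
            for (r, c), w in weights.items():
                grid_d2 = (r - wr) ** 2 + (c - wc) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))  # neighbourhood strength
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])  # move node towards the instance
            t += 1
    return weights


def tokenize(weights, data):
    """Replace every instance by the coordinate of its closest node."""
    return [nearest_node(weights, x) for x in data]


low = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
high = [[5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]]
som = train_som(low + high, rows=1, cols=2)
tokens = tokenize(som, low + high)
```

Each grid coordinate acts as one symbol of the alphabet, so `tokenize` turns a sequence of feature vectors into the kind of discrete token sequence a segmentation algorithm can consume. In the paper the inputs would be 33-bin spectrograms rather than the toy 2-D points used here.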
However, this approach has its drawbacks. For instance, it requires the specification of a set number of nodes in the network layer before training begins. Layer size selection is not an inconsequential decision. Selecting too many nodes means that similar instances will be mapped to different nodes, and selecting too few means dissimilar instances will be mapped to the same one.

Instead of guessing and checking, we used a Growing Grid self organizing network (Fritzke 1995). The Growing Grid starts with a very small layer of SOM nodes arranged in a rectangular pattern. It trains these nodes on the dataset as usual, and maps the dataset to the nodes based on their trained values. The node whose mapped instances have the highest variance is labeled as the error node. Then a new row or column is inserted into the map between the error node and its most dissimilar neighbor. The new nodes are initialized as the average of the two nodes they separate, and the map is retrained on the entire dataset.

Figure 2: Illustration of the Audio Segmentation Procedure.
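A single growth step of the Growing Grid, as described in the text, might look like the sketch below. It covers only the insertion of a new row or column; retraining after each insertion is left out. The assumptions are that "variance" means the summed per-dimension variance of the instances mapped to a node, that "most dissimilar" means the largest Euclidean distance between weight vectors of grid-adjacent nodes, and that the grid already holds at least two nodes. All function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(grid, x):
    """(row, col) of the node closest to instance x."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return min(cells, key=lambda rc: sq_dist(grid[rc[0]][rc[1]], x))


def total_variance(members):
    """Summed per-dimension variance of the instances mapped to one node."""
    n = len(members)
    mean = [sum(col) / n for col in zip(*members)]
    return sum(sq_dist(m, mean) for m in members) / n


def average(a, b):
    """Component-wise mean of two weight vectors."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def grow_once(grid, data):
    """Return a new grid with one row or column inserted next to the error node."""
    rows, cols = len(grid), len(grid[0])
    members = {}
    for x in data:
        members.setdefault(nearest(grid, x), []).append(x)
    # Error node: the node whose mapped instances vary the most.
    er, ec = max(members, key=lambda rc: total_variance(members[rc]))
    candidates = [(er - 1, ec), (er + 1, ec), (er, ec - 1), (er, ec + 1)]
    neighbours = [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
    # Most dissimilar neighbour in weight space (assumed Euclidean).
    nr, nc = max(neighbours, key=lambda rc: sq_dist(grid[er][ec], grid[rc[0]][rc[1]]))
    if nr != er:
        # Neighbour lies above or below: insert a whole row between the two rows.
        top = min(er, nr)
        new_row = [average(grid[top][c], grid[top + 1][c]) for c in range(cols)]
        return grid[:top + 1] + [new_row] + grid[top + 1:]
    # Neighbour lies left or right: insert a whole column between the two columns.
    left = min(ec, nc)
    return [
        row[:left + 1] + [average(row[left], row[left + 1])] + row[left + 1:]
        for row in grid
    ]


grid = [[[0.0, 0.0], [1.0, 0.0]],
        [[0.0, 4.0], [1.0, 4.0]]]
data = [[0.0, 0.0], [0.0, 0.1], [1.0, 0.0], [1.5, 0.5],
        [1.2, -0.5], [0.0, 4.0], [1.0, 4.0]]
grown = grow_once(grid, data)
```

In a full Growing Grid loop this step would alternate with retraining on the whole dataset until some stopping criterion is met. The text does not state that criterion, so none is assumed here.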