ong>Statusong> ong>Reportong> on UrbanSim and theOpen Platform for Urban SimulationPaul WaddellDepartment of City and Regional PlanningUC Berkeleywww.urbansim.orgMay 28, 2010
Agenda• Funded Research Projects• NSF: Dynamic Discrete Choice Models• FHWA: Urban Continuum• European Research Council: SustainCity• UC: SB375 Tools and Case Study• UCTC: ScenarioBuilder• Pending Proposals• NSF: Behavioral and Geometric Modeling• NSF: IGERT – Community Information Technologies:Interdisciplinary Education and Science (CITIES)• MTC: Climate Initiatives Program
Agenda• Brief Updates on UrbanSim Development• Zone version of UrbanSim• Parcel and Zone Database Schema Generators• New Models From Templates– Auto Ownership, Tenure, Building Type, Demolition• Demographic Models• Data Imputation• Model Uncertainty• Interactive Model Exploration• Nested Logit Models• Endogeneity and Nested Logit with Sampling• Correlated Outcomes with Latent Classes• Interactive Database and Scenario Creation
Zone Version of UrbanSim4
UrbanSim Models:Über-Über-Simplified Zonal VersionHouseholdLocationModelsHousehold Transition ModelHousehold Location Choice ModelEmploymentLocationModelsEmployment Transition ModelEmployment Location Choice ModelNo representation of supply side of real estate market, or prices. No relocation ofagents once placed. Becomes an ‘incremental’ model, allocating growth.
UrbanSim Models:Über-Simplified Zonal VersionHouseholdLocationModelsHousehold Transition ModelHousehold Relocation ModelHousehold Location Choice ModelEmploymentLocationModelsEmployment Transition ModelEmployment Relocation ModelEmployment Location Choice ModelBeing used in Research Triangle Park, North Carolina. No representation of supplyside of real estate market, or prices. Last resort when there is no data on supply.
UrbanSim Models: Zonal VersionHouseholdLocationModelsLandDevelopmentModelsReal Estate Price ModelResidential Development ProjectLocation Choice ModelNonresidential Development ProjectLocation Choice ModelBuilding Construction ModelEmploymentLocationModelsHousehold Transition ModelHousehold Relocation ModelHousehold Location Choice ModelEmployment Transition ModelEmployment Relocation ModelEmployment Location Choice Model
UrbanSim Models: Full Parcel VersionLandDevelopmentModelsProcess Pipeline EventsReal Estate Price ModelHouseholdLocationModelsExpected Sale Price ModelDevelopment Proposal Choice ModelBuilding Construction ModelEmploymentLocationModelsHousehold Transition ModelHousehold Relocation ModelHousehold Location Choice ModelEmployment Transition ModelEmployment Relocation ModelEmployment Location Choice ModelEconomic Transition ModelWorkplaceLocationModelsHome-based Job Choice ModelWorkplace Location Choice ModelJob Change Model
Zone Model System Documentation On-line9
Animated Map IndicatorsAutomated Animated GIF images produced from Mapnik maps, using ImageMagick10
Database Schema Generatorsfrom elixir import *# Select the database connection by (un)commenting and adapting# the following options######################metadata.bind = "sqlite:////Users/pwaddell/sqlite/sample_zone.db"#metadata.bind = "postgres://account:password@localhost/sample_zone"#metadata.bind = "mysql://account:password@localhost/sample_zone"######################metadata.bind.echo = Trueclass AnnualEmploymentControlTotal(Entity):using_options(tablename='annual_employment_control_totals')year = Field(Integer)sector = ManyToOne('EmploymentSector', colname='sector_id')home_based_status = Field(Integer)number_of_jobs = Field(Integer)11
Database Schema Definition using Elixirclass Building(Entity):using_options(tablename='buildings')building_id = Field(Integer, primary_key=True)building_quality_id = Field(Integer)building_type = ManyToOne('BuildingType', colname='building_type_id')improvement_value = Field(Integer)land_area = Field(Integer)non_residential_sqft = Field(Integer)residential_units = Field(Integer)sqft_per_unit = Field(Integer)year_built = Field(Integer)stories = Field(Integer)tax_exempt = Field(Integer)parcel = ManyToOne('Parcel', colname='parcel_id')12
New Models from Templates13
Demographic Process Modelshttp://www.urbansim.org/Research/DemographicProcessModelsImplemented:• Aging, Fertility, Mortality, Child Leaving Home, Separation, Workat Home, Workplace ChoiceRemaining:• Family Formation, Roommate, School Attainment, End Schooling,Begin Working, Change Jobs, Retirement, EarningsRelated Models for Future Development:• Private/Public School Choice, School Location Choice, SchoolDevelopment Model, School Enrollment Models14
Data Integration in UrbanSim15
Data Integration Challenges• Messy Data• Many outliers, errors, and missing data• Inconsistent coding schemes among data sources• Difficult to integrate with other data sources• Building-level data• Business establishment data• Market information (vacancies, prices, rents)• Volume of data too massive to manually correct• 2+ milliion parcels in Bay Area• Problems hard to diagnose• Which data is wrong? (which attributes/sources are incorrect? May havesystematic patterns of omissions – e.g. tax-exempt properties)• Misgeocoding: some businesses are geocoded to the wrong place.Complicates the diagnosis.16
The Magnitude of the ProblemSeattleThis map shows only buildings withmissing values for “Building TypeID”, a description variable.195,501 out of ~1,200,000Building Type ID = NullKing, Kitsap, Pierce & Snohomish Co.Tacoma
Data Imputation Tool?
Data Imputation Tool
Ways Forward on Data IntegrationOption 1:• Machine Learning/Data Mining• Model patterns in the observed data• Use the models to detect outliers, impute data• Preliminary work on this now implemented using WEKA libraryOption 2:• In some cases, the missingness level is very high• Developing countries (e.g. Ghana and South Africa)• Potential to Synthesize much of the data, subject toconstraints, using procedural modelingOption 3:• Potential to hybridize statistical/machine learning andprocedural modeling, to synthesize from disparate sources?
Data ImputationK-Nearest Neighbors for Continuous AttributesAttributes: Stories, Bldg SF,Improvement Value, etc."KNN Basics:"Finds k closest neighbors in ndimensional space."Uses k neighbors target valuesto make prediction."
Data ImputationSupport Vector Machines for Categorical AtributesAttributes: Building Use Code,Land Use Code, etc."SVM maps training instancesinto higher dimensionalspace."Creates hyper planes that havemaximum distances frominstances as categoryboundaries."
Machine Learning/Data MiningSo far, only applied to single tables, single outputNeed to develop analysis for:• multivariate outcomes,• across tables,• some of which have poorly-defined (spatial) crossreferences• and mixtures of continuous, categorical and orderedoutcomes24
Option 2: Procedural ModelingTown centerPopulationParcelsTerrainParksJobsBuildingsInputOutputVanegas, et al, 2009
Results: Completion and ValidationReal CitySynthetic CityVanegas, et al, 2009
Assessing Uncertainty: Bayesian Melding• Developed rigorous methodology for assessment ofuncertainty in integrated land use and transportmodels based on Bayesian Melding (published inTransportation Research A, 2007)• Currently testing an application to the question: whatwould happen if the Alaskan Way Viaduct adjacent tothe waterfront in the Seattle CBD were demolished? Itis at risk of collapse in the next earthquake.28
Alaskan Way ViaductScheduled for Demolition in…The next earthquake?Some claim thatalternatives which donot have comparableTraffic capacity willCause massive failureof traffic in CBD andon I5.Others claim thatwe should replaceit with surface streetand transit, and reclaimthe waterfront. It won’tcause much trafficimpact because peopleadapt.How much would a low-capacity alternative affect travel times over 10 years?29
Assessing Uncertainty withBayesian Melding30
Assessing Uncertainty withBayesian MeldingLikelihood and posterior distribution31
Assessing Uncertainty withBayesian MeldingHow much differencein travel time on thosefamous commutes ifwe remove the Viaductin 2010 and simulateland use and transportto 2020?32
Database and Scenario Creation Tools andGraphical Interface – A Work in Progress33
Database Schema Management• SQLAlchemy is an Object-Relational Mapper anddatabase abstraction library, allowing use of standardPython syntax to create and query databases inmultiple back-end database servers:• Postres, MySQL, SQLite, MS SQL Server, Access• Elixir is a Declarative Layer on SQLAlchemy• Makes it easy to define database schema• Has a GUI builder, called Camelot, for browsing/editing tables34
Scenario Creation Graphical Interface• Requirements• Easy means to select parcels or other geographies• Ability to edit attributes• Ability to edit geometry (where appropriate)• Open source, multi-platform, easy to integrate with OPUS• Strategy:• Quantum GIS for spatial data interaction, with customization• SQLALchemy/Elixir/Camelot for tabular data management35
Geographic Selection• Basic spatial selection Point and Click• e.g. parcels near a planned subway stop
Geographic Selection• Location or proximity selection• e.g. parcels within buffer distance of subway alignment
Geographic Selection• Selection by attributes• e.g. parcels within buffer with FAR =< 2
Simple Table Editing• Edit DevelopmentConstraints for SelectedParcels• Constraint changes aremade to a copy of the parceltable• Changed values overrideoriginal values in model runs
Simple Interactive Table Filtering/EditingFor example, to edit control totals tables