A Bayesian machine learning model for estimating building occupancy from open source data

Abstract

Understanding building occupancy is critical to a wide array of applications, including natural hazards loss analysis, green building technologies, and population distribution modeling. Because directly monitoring buildings is expensive, scientists also rely on a wide and disparate array of ancillary and open source information, including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods, a loose collection of formal and informal techniques for combining data into viable content for building occupancy estimation. In this paper, we add to the current state of the art by introducing the population data tables (PDT), a Bayesian model and informatics system for systematically arranging data and harmonization techniques into a consistent, transparent knowledge learning framework. The framework retains, in the final estimate, the uncertainty arising from data, expert judgment, and model parameterization. PDT aims to estimate ambient occupancy in units of people/1000 ft² for a number of building types at the national and sub-national level, with the goal of providing global coverage. We present the PDT model, situate the work within the larger community, and report on the progress of this multi-year project.

Publication
In Natural Hazards
Date