Projects

DeepSolar++

Understanding how solar PV technology diffuses over time requires a highly granular spatiotemporal dataset of solar installations, along with a method to construct and maintain it efficiently. Both are lacking today. In this project, we bridge this gap by developing computer vision models that work with low-resolution historical satellite and aerial images to identify when each solar PV system was installed. With these models, we constructed a nationwide spatiotemporal dataset for solar PVs. We further demonstrated the value of this dataset by analyzing it from the technology adoption lifecycle perspective to answer questions such as: What factors are associated with the onset of solar adoption? What factors are associated with the saturated adoption level? What types of financial incentives are associated with higher saturated adoption levels, especially for low-income communities?
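To make the idea of dating an installation from historical imagery concrete, the following is a minimal sketch and not the project's actual model. It assumes a hypothetical per-image detector that has already produced a score between 0 and 1 for each year with available imagery; the function name, the scores, and the 0.5 threshold are all illustrative assumptions. The sketch returns the earliest image year from which the system is detected in every later image, so one noisy early detection does not shift the estimate.

```python
def estimate_install_year(scores_by_year, threshold=0.5):
    """Return the earliest image year from which every later image
    scores at or above the threshold, or None if the most recent
    image is itself below the threshold.

    scores_by_year: dict mapping image year -> hypothetical detector score.
    """
    install_year = None
    # Walk backwards from the most recent image while the system is visible.
    for year in sorted(scores_by_year, reverse=True):
        if scores_by_year[year] >= threshold:
            install_year = year
        else:
            break  # first year (going back) where the system is absent
    return install_year


# Hypothetical scores for one site; 2008 is a spurious early detection.
site_scores = {2008: 0.6, 2010: 0.2, 2012: 0.7, 2014: 0.9, 2016: 0.95}
first_seen = estimate_install_year(site_scores)
```

Under these assumptions the result is the first image in which the system appears, so the real installation date lies somewhere between that image and the one before it.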

DeepSolar

We built a nationwide solar installation database for the contiguous US utilizing a novel deep learning model applied to satellite and aerial imagery. The data are published as the first publicly available, high-fidelity solar installation database covering all states in the contiguous US. For each solar installation, the database contains the geolocation, size, and subtype information. We demonstrated its value by identifying key environmental and socioeconomic factors correlated with solar deployment. We also developed high-accuracy machine learning models to predict solar deployment density utilizing these factors as input. We hope the data produced by DeepSolar can aid researchers, policymakers, and the industry in gaining a better understanding of solar adoption and its impacts. (project website)

DeepGrid

Detailed and location-aware distribution grid information is a prerequisite for various power system applications such as renewable energy integration, wildfire risk assessment, and infrastructure planning. However, a generalizable and scalable approach to obtain such information is still lacking. In this project, we developed a machine-learning-based framework to map both overhead and underground distribution grids using widely available multi-modal geospatial data. The framework was developed with U.S. data but can be directly transferred to Africa without any re-training or fine-tuning. By applying this model to California, we not only identified multiple levels of disparities in distribution grid vulnerability to wildfire, but also proposed a cost allocation scheme that can make distribution grid protection projects equitably affordable to communities at all income levels.

Urban2Vec

Understanding intrinsic patterns and predicting spatiotemporal characteristics of cities require a comprehensive representation of urban neighborhoods. Existing works relied on either inter- or intra-region connectivities to generate neighborhood representations but failed to fully utilize the informative yet heterogeneous data within neighborhoods. In this work, we propose Urban2Vec, an unsupervised multimodal framework that incorporates both street view imagery and point-of-interest (POI) data to learn neighborhood embeddings. Specifically, we use a convolutional neural network to extract visual features from street view images while preserving geospatial similarity. Furthermore, we model each POI as a bag of words containing its category, rating, and review information. Analogous to document embedding in natural language processing, we establish the semantic similarity between a neighborhood (the “document”) and the words from its surrounding POIs in the vector space. By jointly encoding visual, textual, and geospatial information into the neighborhood representation, Urban2Vec achieves better performance than baseline models and performance comparable to fully supervised methods in downstream prediction tasks. Extensive experiments on three U.S. metropolitan areas also demonstrate the model's interpretability, its generalization capability, and its value in neighborhood similarity analysis.
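The two ingredients described above can be illustrated with a small sketch. It is written under stated assumptions and is not the Urban2Vec implementation: it assumes a triplet-style hinge loss is used to keep embeddings of nearby items close (the text above only says geospatial similarity is preserved), and the POI fields, token format, and margin value are hypothetical. The same loss shape could be applied with a neighborhood embedding as the anchor, a word from one of its POIs as the positive, and a random word as the negative.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the positive is closer to the
    anchor than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def poi_to_bag_of_words(poi):
    """Turn one POI record into a list of tokens: category, rounded
    rating, and lowercased review words (hypothetical token format)."""
    words = [f"category:{poi['category']}", f"rating:{round(poi['rating'])}"]
    words += poi["review"].lower().split()
    return words


# Toy 2-D embeddings: the positive stands for an image taken near the
# anchor image, the negatives for images taken elsewhere.
loss_far = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])    # negative far away
loss_near = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])   # negative too close

# Hypothetical POI record.
bag = poi_to_bag_of_words(
    {"category": "cafe", "rating": 4.4, "review": "Great espresso"}
)
```

In this toy case the loss vanishes when the negative is already far from the anchor and stays positive when it is not, which is what pushes training toward embeddings that reflect proximity.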