Research

Current Projects

BikeButler: Creating and Previewing Personalized, Context-Sensitive Bicycle Routes Jan 2025 - Present

University of Washington

Bicycle route preferences are inherently contextual. We built BikeButler: a tool that lets you plan, generate, and view bike routes from preferences on eight different infrastructure features.

Urban Routing Computer vision Machine learning

Urban cycling benefits personal wellbeing, public health, and global sustainability. While current tools such as Google and Apple Maps provide bike route recommendations, they do not account for a person’s dynamic context (e.g., commuting, recreation). We introduce BikeButler, a personalized, context-sensitive bicycle route generation tool that enables users to generate, compare, virtually preview, and iteratively customize bike routes via custom profiles that encode seven bikeability features, including bike lane existence, slope, vegetation, and surface quality—fusing data from OpenStreetMap, open government data, and a custom VLM-based analysis of Street View images. To design BikeButler, we employed a human-centered, iterative approach starting with formative interviews and culminating in a user study (N=16). Our findings demonstrate that bike routing preferences change as a function of context, that BikeButler enables users to quickly create and iterate context-sensitive routes, and that generated routes differ significantly from Google Maps bike routing, reinforcing the importance of personalization.

Related Publications

BikeButler: A Personalized, Context-sensitive Bike Routing Tool using Open Data and VLM-based Analyses of Street View Imagery

Jared Hwang, John S O`Meara, Zeyu Wang, Jasmine Zhang, Jon E. Froehlich

CHI ’26

PDF | DOI | Code

AccessParkCV: Understanding and Detecting Disability Parking Oct 2024 - Present

University of Washington

Accessible parking is critical for people with disabilities (PwDs). This work aims to understand how PwDs use disability parking, and develop a way to scalably detect and characterize (e.g., the width) disability parking in the built environment.

Accessibility Computer vision Machine learning

Accessible parking is critical for people with disabilities (PwDs), allowing equitable access to destinations, independent mobility, and community participation. Despite mandates, there has been no large-scale investigation of the quality or allocation of disability parking in the US nor significant research on PwD perspectives and uses of disability parking. In this work, we first present a semi-structured interview study with 11 PwDs to advance understanding of disability parking uses, concerns, and relevant technology tools. We find that PwDs often adapt to disability parking challenges according to their personal mobility needs and value reliable, real-time accessibility information. Informed by these findings, we then introduce a new deep learning pipeline, called AccessParkCV, and parking dataset for automatically detecting disability parking and inferring quality characteristics (e.g., width) from orthorectified aerial imagery. We achieve a micro-F1=0.89 and demonstrate how our pipeline can support new urban analytics and end-user tools. Together, we contribute new qualitative understandings of disability parking, a novel detection pipeline and open dataset, and design guidelines for future tools.

Related Publications

“Where Can <em>I</em> Park?” Understanding Human Perspectives and Scalably Detecting Disability Parking from Aerial Imagery thumbnail

“Where Can I Park?” Understanding Human Perspectives and Scalably Detecting Disability Parking from Aerial Imagery

Jared Hwang, Chu Li, Hanbyul Kang, Maryam Hosseini, Jon E. Froehlich

ASSETS ’25 | Acceptance Rate: 29.7% (83 / 279)

PDF | DOI | Code | arXiv

Past Projects

Machine Learning for the Digital Typhoon Dataset Jan 2023 - Aug 2023

National Institute of Informatics (NII)

Preparing, publishing, and using the Digital Typhoon Dataset for machine learning. The dataset is the longest typhoon satellite image dataset available. We developed ways to interact with the dataset for ML, and performed benchmarks in analysis, reanalysis, and forecasting.

Machine learning Dataset Climate

The Digital Typhoon Project aims to collect, standardize, and curate typhoon (hurricane) satellite images in the Western North Pacific. It was started in 1997, and the Digital Typhoon Dataset covers over 40 years from 1978 to 2023 – the longest typhoon satellite dataset available.

With the dataset, the project hopes to reduce the burden of researchers to begin peforming ML on tropical cyclones without needing in depth meteorology and satellite remote sensing knowledge.


An overview of the dataset and projection of globe images to 2D.

Releasing the Dataset: the Last Mile

Before I joined the project, the dataset existed only as part of a browsable UI on the Digital Typhoon website. It wasn’t an ML-ready dataset:

There was no organized file structure. No defined delineation of seasons (years), typhoon sequences, etc.
No way to interact with the dataset in a controlled manner
- Loading into a program isnt as simple as reading in all the images and labels
- Without thoughtfully splitting the dataset into test/train/validation sets, leakage could occur between sequences and years
- How to link ground truth data to images? The ground truth is taken from best track data, which doesn’t line up 1:1 to the satellite images
- Doing engineered splits, such as splitting between satellite generations
We had no benchmarks as to how well the dataset worked with current ML models

The result was a highly valuable dataset only accessible via the web UI, which many people scraped to access, resulting in ML models trained off of lossly compressed JPEG images with potential graphical artifacting, and non-standardized.

pyphoon2


An overview of the structure of the dataset

To cover the last mile and make the Digital Typhoon Dataset an easily downloadable dataset with a fast download-to-ML turnaround time, I developed pyphoon2, a PyTorch dataloader which offers:

Automatic reading and loading of the data into sequence objects, with ground truth labels accessible via object attributes
Hands-off random splitting by image, sequence, or season
Filtering, transforms, and selection by any label attribute
Interpolation and dataset modification
All the flexibility of PyTorch, making using the dataset “plug and play”

You can find pyphoon2 here.

Benchmarking

I also led the benchmarking efforts between myself and the two other students on the project. We defined three tasks applicable to the dataset and developed models to address all three:

Analysis
Forecasting
Reanalysis

You can see the definitions and our results in our paper, where we published the benchmarks and dataset to the NeuRIPS 2023 Datasets and Benchmarks track:

Digital Typhoon: Long-term Satellite Image Dataset for the Spatio-Temporal Modeling of Tropical Cyclones [pdf]

Or browse the Digital Typhoon website.

Related Publications

Digital Typhoon: Long-term Satellite Image Dataset for the Spatio-Temporal Modeling of Tropical Cyclones

Asanobu Kitamoto, Jared Hwang, Bastien Vuillod, Lucas Gautier, Yingtao Tian, Tarin Clanuwat

NeuRIPS ’23 Dataset and Benchmarks Track | spotlight | Acceptance Rate: 32.6% (322 / 987)

PDF | DOI | Code | arXiv

Machine Learning on Liquid Argon Time Projection Chamber (LArTPC) images May 2020 - Sep 2021

Tufts University

Building tools with ML for analyzing neutrino detector images from a Liquid Argon Time Projection Chamber. Using a VQVAE to compress large image files, and performing instance segmentation on the LArTPC images to extract neutrino trails.

Machine learning Physics

Liquid argon time projection chambers (LArTPCs) are a type of particle detector used in neutrino experiments. To make the most of the detector data, we must reconstruct and identify particle paths through the detector accurately. The Neutrino Group at Tufts University explores the use of deep neural networks to reconstruct particle interactions.

Project 1: Attempting centroid based instance segmentation on LArTPC images

To separate the different elements of a neutrino interaction (trunk, shower, etc.), we attempt to use a novel centroid based loss-function to perform instance segmentation. See the loss function paper here.


The ground truth of instance segmentation in our project. Note the shower is missing from the highlight, a problem with the labeling upstream.

Unfortunately, application of the technique gave poor results for LArTPC data, possibly due to the vastly different shape of objects in the dataset.

Project 2: Compressing LArTPC images using Autoencoders

LArTPC detectors produce terabytes of data per second. Finding a way to make these images smaller would vastly decrease cost and increase flexibility and convenience of analysis. Using a VQVAE (Vector Quantized Variational Autoencoder), we can learn a latent space for the LArTPC images and interactions, train an encoder and decoder, and save just the encoded images. At analysis, we then can apply the decoder.

By tiling LArTPC images into VQVAE-acceptable image sizes, and reconstructing decoded tiles, we achieved up to 80% reduction in file size compared to traditional JPEG compression.


An example of tiles compressed and reconstructed using the VQVAE. Top is the original images, bottom is the reconstructed.

Isotope Decay at Rest Experiment (IsoDAR) Jun 2018 - March 2020

Massachusetts Institute of Technology (MIT)

Developing a simulation code from the ground up for Radio Frequency Quadrupoles (RFQs). Designed to support existing simulation infrastructure, accurately simulate space-charge (interaction between charged particles), and parallelization. Faster and more scientifically accurate than existing alternatives!

Physics Simulation

The IsoDAR experiment searches for sterile neutrino oscillations and non-standard neutrino interactions. Once built, the IsoDAR accelerator will send dihydrogen molecules through a radio frequency quadrupole (RFQ), then a cyclotron, to then impinge onto a Be and Li target, producing antineutrinos that we detect and can study.

To predict the behavior of the H₂+ through the RFQ, we simulate it.


Video of an RFQ simulation

The Problem

All RFQ simulation software at the time had at least one problem:

By not simulating space-charge (interaction between charged particles) it was inaccurate.
It was proprietary.
It didn’t play well with other software (such as the input and output of other particle accelerator elements), and was difficult to use and tune.
It was slow

The Solution

I developed PyRFQHelper, an RFQ simulation software from the ground up designed to solve those problems and more. To address the issues:

It’s built on top of WARP, a particle-in-cell simulation framework that takes space-charge into account.
It’s built open-source.
It’s built to accept a wide variety of input data forms and output a simple format.
Despite already being faster, we integrated parallel-processing capability.

In addition, it:

Visualizes simulation results
Automatically calculates statistics of interest (emittance, electric field, etc.)
Uses clustering to determine particle bunch size and statistics at simulation end

Simulations were also easily customizable (being built on top of WARP), so new components in different shapes could easily be added or removed from the simulation.

Results

A simulation software that performs as fast, and with parallelization, faster than competing software, that’s flexible and scientifically accurate.

More about IsoDAR can be read at the website or in the paper.

Project code can be found here.