Airbnb | How do they classify their listing photos?

Airbnb needs little introduction. Founded in 2008 by Brian Chesky, Joe Gebbia, and Nathan Blecharczyk, it is an American online marketplace for vacation rentals, based in San Francisco, California.

Airbnb has millions of listings from across the world, and each listing has many photos of the property. But until a few years ago, there was no way of using these photos to optimize the customer experience. Today, Airbnb's users can find relevant information in listing photos more easily, and we are going to look into how.

But Why?

Classifying images in home listings improves the customer experience in a few ways.

Take a home tour.

With rooms well classified, a simple virtual home tour is now possible.

Validation of Info

Validate information, such as the number of rooms.

Find the most important photos first

Re-lay out and re-rank photos by room type so that the photos people are most interested in surface first.

Review Listings

Helps Airbnb automatically review listings to ensure they abide by Airbnb marketplace standards.

Now that we have established its relevance, let's look into the data.

Data

One would think that Airbnb should have no trouble finding data. After all, it has millions of apartment listings. That is partially true, but how much of the data can actually be used? And how would they label it?

Airbnb's photos used to come unlabeled. Many companies would hire an external agency to label the data into room types - Bedrooms, Bathrooms, Living Rooms, Kitchens, Swimming Pools, and Views. That wasn't economically viable for Airbnb, where millions of photos needed to be labeled. Instead, they took a blended approach.

The first part was to ask vendors to label a relatively small number of photos correctly. This served as a gold dataset. Random sampling was used to ensure the data was unbiased.
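As a rough illustration of the sampling step, here is a minimal sketch that draws a uniform random sample of photos to send for labeling. The function name, the photo IDs, and the sample size are all hypothetical; the post does not say how large Airbnb's gold dataset was or how the sampling was implemented.

```python
import random

def sample_for_labeling(photo_ids, sample_size, seed=0):
    """Draw a uniform random sample, without replacement, from a sequence of photo IDs."""
    rng = random.Random(seed)  # fixed seed so the same sample can be reproduced
    return rng.sample(photo_ids, sample_size)

# Hypothetical photo IDs standing in for the full pool of unlabeled photos.
all_photo_ids = range(1_000_000)

# Hypothetical sample size; every photo has the same chance of being picked.
gold_candidates = sample_for_labeling(all_photo_ids, sample_size=1000)
```

Because every photo is equally likely to be chosen, the labeled subset should reflect the overall mix of room types rather than whatever is easiest to find.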

The second part was to make use of the image captions written by hosts on Airbnb. But how can we ensure these are accurate and reliable? Simply assuming that a photo shows a bedroom whenever a specific keyword, say "bedroom", is found in its caption is straightforward but, unfortunately, heavily error-prone. The team found that image captions often do not match the actual room type.

So to filter out such cases, the team wrote heuristic rules.
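To make the idea concrete, here is a minimal sketch of what caption-based labeling with heuristic filters could look like. The keyword lists, the ambiguous phrases, and the function name are assumptions made for illustration; the actual rules the Airbnb team wrote are not described in this post.

```python
import re

# Hypothetical keyword lists, one per room type from the post.
ROOM_KEYWORDS = {
    "bedroom": ["bedroom"],
    "bathroom": ["bathroom"],
    "living_room": ["living room"],
    "kitchen": ["kitchen"],
    "swimming_pool": ["swimming pool", "pool"],
    "view": ["view"],
}

# Hypothetical phrases suggesting the caption describes something near the room,
# not the room itself.
AMBIGUOUS_PHRASES = ["next to", "door to", "from the", "leads to"]

def label_from_caption(caption):
    """Return a room type if the caption points to exactly one, otherwise None."""
    text = caption.lower()
    if any(phrase in text for phrase in AMBIGUOUS_PHRASES):
        return None  # too risky to trust this caption
    matched = set()
    for room, keywords in ROOM_KEYWORDS.items():
        for keyword in keywords:
            # Whole-word match, allowing a plural "s".
            if re.search(r"\b" + re.escape(keyword) + r"s?\b", text):
                matched.add(room)
    # Captions mentioning zero or several room types are discarded.
    return matched.pop() if len(matched) == 1 else None

print(label_from_caption("Cozy bedroom with a queen bed"))      # bedroom
print(label_from_caption("Hallway that leads to the bedroom"))  # None
print(label_from_caption("Bedroom with a view"))                # None
```

The rules deliberately throw away any caption that is not clear-cut. With millions of captions available, losing some data costs little, while wrong labels would hurt training.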

Problem & Solution:

The problem was a classic Image Classification problem.

The expected output labels were

Bedrooms
Bathrooms
Living Rooms
Kitchens
Swimming Pools
Views

This image classification problem is highly similar to the ImageNet classification problem: millions of images mapped to a set of output labels. After a few experiments, the Airbnb team decided to go with a modified ResNet50.

ResNet50

I have written a brief overview of ResNets here.

Re-training a DNN is a heavy task. The Airbnb team used an AWS P2.8xlarge instance with Nvidia 8-core K80 GPUs and sent a batch of 128 images to the 8 GPUs per training step. They performed parallel training with TensorFlow as the backend, and compiled the model after parallelizing it. To speed things up further, the model was initialized with pre-trained ImageNet weights loaded from keras.applications.resnet50.ResNet50.
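The data-parallel idea in the paragraph above can be sketched without any deep learning framework: each training step takes one batch and splits it across the available GPUs. The snippet below is only an illustration in plain Python, not TensorFlow code, and the helper name is hypothetical. It also assumes the batch of 128 is split evenly, which the post does not state explicitly.

```python
def split_batch(batch, num_devices):
    """Split one training batch into equally sized shards, one per device."""
    if len(batch) % num_devices != 0:
        raise ValueError("batch size must be divisible by the number of devices")
    per_device = len(batch) // num_devices
    return [batch[i * per_device:(i + 1) * per_device] for i in range(num_devices)]

# Stand-in for a batch of 128 images (integers used here in place of image tensors).
batch = list(range(128))

# Under the even-split assumption, each of the 8 GPUs processes 16 images per step.
shards = split_batch(batch, num_devices=8)
print(len(shards), len(shards[0]))  # 8 16
```

In a real setup, each device would run the same model on its own shard, and the resulting gradients would be combined before the weights are updated.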

The best model was obtained after 3 epochs of training, which lasted about 6 hours.

But unlike what we might expect, the final model was not a single multi-class model. Instead, they went for multiple binary classification models, one per room type. In the end, 6 models were shipped to different product teams at Airbnb. Interestingly, these 6 models vary considerably in precision and recall. For example, bedrooms have higher precision than, say, living rooms. But overall, precision is generally above 95% and recall is generally above 50% for all models.
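To see how a binary model can end up with very high precision but modest recall, here is a small sketch that computes both metrics at a given score threshold. The labels and scores are made-up toy data, and the post does not say how Airbnb chose its thresholds; this only illustrates the trade-off.

```python
def precision_recall(labels, scores, threshold):
    """Compute precision and recall of a binary classifier at a score threshold."""
    tp = fp = fn = 0
    for is_positive, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive and not is_positive:
            fp += 1
        elif not predicted_positive and is_positive:
            fn += 1
    # Convention used here: report 0.0 when a metric is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy data for a hypothetical "is this a bedroom?" model: 1 = bedroom, 0 = not.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.99, 0.97, 0.60, 0.40, 0.55, 0.30, 0.20, 0.10]

print(precision_recall(labels, scores, threshold=0.9))  # (1.0, 0.5)
print(precision_recall(labels, scores, threshold=0.5))  # (0.75, 0.75)
```

Raising the threshold makes the model label only the photos it is most sure about, so precision goes up while recall drops. Having one binary model per room type lets each one be tuned to its own operating point.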

And this is an overview of how listing image classification happens within Airbnb.

Resources:

Official Paper: ResNet50

Official blog from Airbnb

Team includes :

Shijing James Yao

After obtaining his doctorate from the University of California, Berkeley, Dr. Shijing James Yao worked with Uber before joining Airbnb in 2017 as Senior Machine Learning Scientist, Tech Lead of Applied ML. Now he serves as the Head of Core/Platform Data Science (ML, Inference, Analytics), China.

Krishna Puttaswamy

Krishna Puttaswamy earned his doctorate from the University of California, Santa Barbara. He then joined Alcatel-Lucent Bell Labs as a Cloud Computing Researcher, and later went on to work with LinkedIn and Airbnb. He is currently with Uber as Senior Staff Engineer, Marketplace.

Xiangyu Zhang

After earning his doctorate from Xi'an Jiaotong University, Dr. Xiangyu Zhang joined MEGVII Technology, where he now serves as Research Lead & Senior Researcher.

Alfredo Luque

Alfredo Luque is a Senior Software Engineer of ML Infra at Airbnb. A BA Economics graduate from the University of Chicago, he previously worked with several companies, including CogoLabs, and co-founded a startup before joining Airbnb in 2017.

Terms used in the blog:

Random Sampling:

Random sampling is a sampling technique in which each member of the population has an equal probability of being chosen. A sample chosen randomly is meant to be an unbiased representation of the total population. Read more here

Heuristic Rules:

A heuristic process may include running tests and getting results by trial and error. As more sample data is tested, it becomes easier to create an efficient algorithm to process similar data types. As stated previously, these algorithms are not always perfect but work well most of the time. Source

ImageNet Classification Problem

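ImageNet is a publicly available image database organized into thousands of categories. The subset most often used for the classification benchmark (ILSVRC) has 1,000 output labels and roughly 1.3 million training images. Visit the webpage here.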

See more here

Being Enfa