Effects of various factors on Seattle Airbnb prices

Annie Thomas
Sep 17, 2020
Source: airbnb.ca

Introduction

Airbnb is an internet marketplace for short-term house and apartment rentals. It allows you to rent out (list) your house for a week while you are away, or rent out your empty bedroom. One challenge that Airbnb hosts face is determining the optimal rental price. In many areas, guests are presented with a good selection of listings and can filter by criteria like price, number of bedrooms, room type, and more. Since Airbnb is a market, the amount a host can charge is ultimately tied to market prices.

What features of Airbnb listings have the biggest impact on price, and can price be accurately predicted based upon these features?

When creating a host listing on Airbnb, a variety of information is needed. This includes standard quantitative information such as the number of bedrooms and bathrooms, as well as more personalized data such as host information and written descriptions of the property. In short, it covers everything that guests can see when deciding which listing to book.

This research contributes a better understanding of the factors that lead to higher profits in online marketplaces and provides a model to predict where prices lie.

You can find my project on GitHub here. For data, I am using the Seattle Airbnb dataset from Kaggle, which you can find here.

Let’s dive in

The dataset has many features. Data wrangling removed a large number of them, and many still remained. We will look at some of the features that correlate with the listing price of Seattle Airbnb rentals. We performed an ordinary least squares (OLS) regression to predict price from a few other variables.

Table of Initial OLS results

The regression returned an R-squared value of 0.581, which indicates that the selected variables explain a majority of the variation in price. We also found that features such as bathrooms, bedrooms, and number of guests had a larger effect on pricing.
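To make the R-squared figure concrete, here is a minimal sketch of a least squares fit on a single predictor. The numbers are made-up toy values, not rows from the Seattle dataset, and the function name `fit_simple_ols` is only illustrative. The regression in this article used several predictors at once, so treat this as a simplified picture of the same idea.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot


# Toy data (hypothetical): guests a listing accommodates and its nightly price.
accommodates = [1, 2, 2, 4, 4, 6, 8]
price = [45, 70, 85, 120, 110, 180, 230]

intercept, slope, r_squared = fit_simple_ols(accommodates, price)
print(f"price ~ {intercept:.1f} + {slope:.1f} * accommodates")
print(f"R-squared = {r_squared:.3f}")
```

An R-squared of 0.581 means that about 58% of the variation in price is explained by the fitted variables, and the rest is left in the residuals.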

Visualisation is a good way to start exploring data. We plotted a correlation heatmap of several factors.

Correlation Heat Map

From the heatmap we can see that price is related to listed features such as accommodates (the number of people a listing can host), bathrooms, and bedrooms. It was also interesting to see how the review sub-categories correlated with each other.
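A heatmap like this is drawn from a table of pairwise correlations. The sketch below computes Pearson correlations in plain Python on made-up toy columns. The column names follow fields in the listings data, but the values are hypothetical and the `pearson` helper is only an illustration, not the code used for the chart.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Toy columns (hypothetical values).
columns = {
    "price": [45, 70, 85, 120, 110, 180, 230],
    "accommodates": [1, 2, 2, 4, 4, 6, 8],
    "bedrooms": [1, 1, 1, 2, 2, 3, 4],
    "bathrooms": [1, 1, 1, 1, 2, 2, 3],
}

# Correlation matrix as a nested dict: matrix[row][col].
matrix = {
    row: {col: pearson(columns[row], columns[col]) for col in columns}
    for row in columns
}

for row, values in matrix.items():
    print(row.ljust(13), "  ".join(f"{v:5.2f}" for v in values.values()))
```

Each cell is between -1 and 1. Values near 1 mean the two columns rise together, which is what the darker cells of a heatmap usually show.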

Neighborhood

Does listing price relate to the neighborhood in which the listing is located? Houses generally have a higher price if they are located in a good neighborhood, so we assume that neighborhood is a likely predictor of listing price. To understand this better, we can calculate the average listing price for each neighborhood in the Seattle dataset. The graph compares the average listing price of different neighborhoods.

Average Listing Price Per Neighborhood

We observed that Magnolia stands out from the group with an average price of nearly 175 dollars, about 100 dollars more than the lowest average, Delridge, at around 75 dollars per listing. This suggests that the neighborhood has a large impact on listing price.
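The averages behind a chart like this come from grouping listings by neighborhood and taking the mean price of each group. Here is a minimal sketch of that calculation. The rows are made-up toy values chosen for illustration, not figures from the dataset.

```python
from collections import defaultdict

# Toy rows (hypothetical): (neighborhood, nightly price in dollars).
listings = [
    ("Magnolia", 150.0),
    ("Magnolia", 200.0),
    ("Delridge", 70.0),
    ("Delridge", 80.0),
    ("Capitol Hill", 120.0),
]

# Collect all prices that belong to the same neighborhood.
prices_by_neighborhood = defaultdict(list)
for neighborhood, price in listings:
    prices_by_neighborhood[neighborhood].append(price)

# Mean price per neighborhood.
average_price = {
    name: sum(prices) / len(prices)
    for name, prices in prices_by_neighborhood.items()
}

# Sort from most to least expensive, as in a ranked bar chart.
ranked = sorted(average_price.items(), key=lambda item: item[1], reverse=True)
for name, avg in ranked:
    print(f"{name}: ${avg:.2f}")
```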

Do Reviews Matter?

Airbnb allows guests to rate their stay. These ratings could give us some insight into what makes an enjoyable Airbnb experience. Using these factors, we can analyse how reviews affect pricing.

Our method was the same as before: we first ran a regression, then built a predictive model and measured its accuracy in predicting the overall review score of a listing.

OLS Regression Results for ratings

Although our least squares regression for ratings returned a high R-squared value of 0.719, this regression also included the sub-review scores, which account for much of the variance it explains. Intuitively, a feature like cleanliness can be a strong predictor of the overall review score. Other features, such as price and number of reviews, also contributed to the overall rating.

To see how well the model performs without the sub-reviews, we ran another multivariate linear regression that excluded them.

OLS Regression Results for ratings (without sub-reviews)

This dropped our R-squared value significantly, to 0.190, indicating that the remaining variables explain only about 19% of the variation in Airbnb rating. We hypothesized that this is because the variables we are left with represent only a small fraction of what goes into a positive or negative Airbnb experience. Factors such as property type or price simply describe the listing and do not give much insight into a negative or positive experience. However, the regression did show some significant factors: for example, a higher price generally meant better reviews.

Do Other Features Matter?

To understand how the other features affect the average listing price, we look at how they correlate with the listing price of Seattle Airbnb rentals.

The accommodates field, which shows the number of people that the property can accommodate, has the strongest correlation to listing price. The number of bedrooms and bathrooms and the number of guests that are included in the price also show a strong correlation to the listing price. These features help us predict the listing price.

How Popular Are the Property Types?

Another important feature for predicting the price of a rental is the type of property, such as an apartment, loft, house, or other type. It makes sense to ask for a higher price if the offer is for the entire flat or house.

Comparison between property_type for a low and high price in Seattle

Condominiums are a bit more popular than Bed & Breakfast properties in Seattle. The property type helps us predict the listing price of a rental.
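A comparison like this can be built by counting listings per property type in a low and a high price band. The sketch below uses made-up toy listings, and the $100 cut-off between the two bands is a hypothetical threshold chosen only for illustration.

```python
from collections import Counter

PRICE_THRESHOLD = 100.0  # hypothetical cut-off between "low" and "high" price

# Toy rows (hypothetical): (property type, nightly price in dollars).
listings = [
    ("Apartment", 85.0),
    ("House", 150.0),
    ("Apartment", 120.0),
    ("Condominium", 95.0),
    ("Condominium", 140.0),
    ("Bed & Breakfast", 99.0),
    ("House", 60.0),
    ("Apartment", 75.0),
]

# Count listings for every (property type, price band) pair.
counts = Counter()
for property_type, price in listings:
    band = "high" if price >= PRICE_THRESHOLD else "low"
    counts[(property_type, band)] += 1

for property_type in sorted({ptype for ptype, _ in listings}):
    low = counts[(property_type, "low")]
    high = counts[(property_type, "high")]
    print(f"{property_type}: low={low}, high={high}")
```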

Can We Predict Listing Price?

Airbnb listing prices in Seattle are a strong function of neighborhood, size, property type, and other features. Let's turn our attention to the next question: can we predict listing prices? The Seattle Airbnb data was relatively clean, but removing null and obviously flawed data reduced the size of the listing dataset. See the cleaned data in the GitHub repository as cleaned-listings-dummies.csv.
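As a rough illustration of this kind of cleaning, the sketch below drops rows with a missing or obviously flawed price and turns a categorical column into 0/1 dummy columns. It assumes prices arrive as strings such as "$85.00". The helper names `parse_price` and `one_hot` and the toy rows are hypothetical, and this is not the code from the repository.

```python
def parse_price(raw):
    """Convert a string like "$1,250.00" to a float.

    Returns None if the value is missing, unparsable, or not positive.
    """
    if raw is None:
        return None
    text = raw.replace("$", "").replace(",", "").strip()
    try:
        value = float(text)
    except ValueError:
        return None
    # Treat non-positive prices as obviously flawed.
    return value if value > 0 else None


def one_hot(rows, column):
    """Replace a categorical column with 0/1 dummy columns, one per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Toy rows (hypothetical values).
raw_rows = [
    {"price": "$85.00", "bedrooms": 1, "property_type": "Apartment"},
    {"price": "$1,250.00", "bedrooms": 4, "property_type": "House"},
    {"price": None, "bedrooms": 2, "property_type": "House"},  # missing price
    {"price": "$0.00", "bedrooms": 1, "property_type": "Apartment"},  # flawed price
]

# Keep only rows whose price parses to a positive number.
cleaned = []
for row in raw_rows:
    price = parse_price(row["price"])
    if price is not None:
        cleaned.append({**row, "price": price})

dummies = one_hot(cleaned, "property_type")
for row in dummies:
    print(row)
```

Dummy columns let a regression use a text category such as property type as a set of numeric inputs.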

I measured my results using R-squared, mean absolute error, and mean squared error. The results are shown below.
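For readers new to these three metrics, here is a minimal sketch of how each one is computed from actual and predicted prices. The prices are made-up toy values, not my model's output, and the function name `regression_metrics` is only illustrative.

```python
def regression_metrics(actual, predicted):
    """Return R-squared, mean absolute error, and mean squared error."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    # Mean absolute error: average size of the miss, in dollars.
    mae = sum(abs(e) for e in errors) / n
    # Mean squared error: average squared miss, which penalises large misses more.
    mse = sum(e ** 2 for e in errors) / n
    # R-squared: share of the variation in actual prices that the model explains.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e ** 2 for e in errors)
    r_squared = 1 - ss_res / ss_tot
    return {"r_squared": r_squared, "mae": mae, "mse": mse}


# Toy prices (hypothetical values).
actual = [100.0, 150.0, 80.0, 200.0]
predicted = [110.0, 140.0, 85.0, 190.0]

metrics = regression_metrics(actual, predicted)
print(f"R-squared: {metrics['r_squared']:.3f}")
print(f"MAE: {metrics['mae']:.2f}")
print(f"MSE: {metrics['mse']:.2f}")
```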

Plotting the test results for this algorithm as a histogram, we can see that most predictions were within $10 of the actual value.
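One way to check a statement like this numerically is to compute the share of predictions that land within a given tolerance of the actual price. The sketch below does this for made-up toy prices. The function name `share_within` is hypothetical.

```python
def share_within(actual, predicted, tolerance):
    """Fraction of predictions whose absolute error is at most `tolerance`."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Toy prices (hypothetical values).
actual = [100.0, 150.0, 80.0, 200.0, 120.0]
predicted = [104.0, 143.0, 95.0, 198.0, 121.0]

share = share_within(actual, predicted, tolerance=10.0)
print(f"{share:.0%} of predictions are within $10 of the actual price")
```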

Looking at the results as a scatter plot also illustrates how good they were. There are a few outliers, but there is a huge clustering around 0 for most of the points.

To interpret how good these results are, it helps to compare them with the scatter plot that I got for the random forest regressor. Note the wider spread of values.

As good as the current results are, they could still be improved. There are additional fields available that have some correlation to price. Based on the tests above, we can predict the listing price of a rental.

Conclusions

In this article, we gathered some insight into how Airbnb rental prices in Seattle depend noticeably on many factors. The results of this analysis can help potential Airbnb hosts understand how to predict a price. The factors that most influence price include neighborhood, property type, total number of rooms, accommodates (the total number of people a listing can accommodate), room type, and reviews. This information will allow Airbnb hosts to identify and improve the variables of their listing that are the strongest indicators when it comes to setting a price.

Annie Thomas

I am a beginner in data science and in blog writing, and I am exploring and expanding my knowledge by writing blogs.