Hieu Nguyen — October 30, 2020
Google Sheets as a database service is great for many scenarios, though comes with some limitations
Spreadsheet services like Google Sheets or Airtable have already used by many people, teams, and companies. What makes them more interesting is, they introduce non-tech users to the no-code movement, open doors for interesting possibilities.
No-code is like Starbuck, a kind of coffee also for people who don’t drink coffee. What great about Starbuck isn’t the taste, but it allows more people to enjoy coffee. In other words, no-code is a programming language for non-programmers.
Recently, I’ve examined using Google Sheet as a database service when integrating to Inverr — a no-code builder. It can possibly operate for sites or apps with less than 5M visits/month, with big enough data storage for small apps. Though, it comes with some limitations.
In this post, I’ll address a few technical issues when using Google Sheets as a database service, its effects, and how to overcome it.
The main functions of the database are store, retrieve, and search for data. While Google Sheets is doing excellent at storing and retrieving data, the searching part in Google Sheets is a bit tricky.
On many apps, search provides discoverability. For example, when you’re building a movie discovery app, users search a movie by name, filter by a category, or popularity. Without search, the movie app is no different than a movie list.
To use the searching feature, Google Sheets APIs provide a property called “developerMetadata”. It’s an additional data on each cell, and invisible to Google Sheets users. When searching, it will use these additional data.
“developerMetadata” has a limit of 30,000 characters for each spreadsheet. It’s too small to do anything great. Though, you can optimize for sentence, paragraph by only store important terms.
Google Sheets has a storage limit at 5,000,000 cells (including blank ones), and up to 256 columns per sheet.
Let’s take an example as a movie discovery app, it needs at least 6 properties (id, title, description, release date, rating, category). As a rough estimation, you can only store up to 5,000,000 / 6 = ~833k movies. Since there are currently 555,913 movies released, in theory, it’s still acceptable.
Another limitation when using Google Sheet APIs is its request quota. Starting at 500 requests per 100 seconds (or 5 requests per second), and start charging when excessed. In many cases, using proper caching techniques will reduce the number of requests to Google Cloud APIs services.
Query Performance Google Sheets
This is an unrealistic measure, but to test the limit and get a better overview, I tried getting data from a spreadsheet with a large dataset.
The above graph demonstrates the performance of retrieving thousands of cells per second from Google Sheets. It took around 2 seconds when getting 100k cells all at once, which 100k / 6 = ~16k movies. This is acceptable performance, and grows as the number of cells grows.
Google Sheets is like a database with its own presentation. For example, you have a spreadsheet for monthly expenses. It contains a list of expenses, their category. And in the end, it can have a total expense value by the sum of all expenses in the month.
The layout freedom is convenient when using it as a spreadsheet, but a nightmare for a database. Because there is always some kind of fixed layout that the database follows. It allows the database engine to calculate the distance and navigate between rows of data. And no database designed without a layout.
Besides, unless you decided that no-one will ever touch the spreadsheet, someone might accidentally screw it up by renaming the tab or remove it.
To conclude, Google Sheets is widely used as a tool. But as a database, it comes with some limitations. With proper caching and optimization, it is safe to use it as a database for webs/apps with < 5.000.000 visits monthly.
Besides, its search limitation can be a huge bottleneck in many scenarios. You can use “developerMetadata” to enable data lookup. With 30k characters limits, it’s a good practice to store only “meta” data.
Share this post
I'm a developer, hobbyist photographer. Building Inverr — a NoCode Site Builder