Host a static JSON file with bunny.net and keep it up to date using serverless functions

Liz Fedak
7 min read · Jan 5, 2022


I run a niche dev agency where I help no-code agencies add custom code to websites built on a platform called Duda (Duda is a really amazing platform, by the way, and you should check it out if you're a WordPress shop). This article is a high-level overview of how you can use the same tools I use. My use case was pretty specific, so I don't share that code here.

Recently a customer had a very specific request—they needed to make an Airtable base with potentially up to 12,000 records load quickly on their customer’s website and wanted to add live search updates and filtering of the records. The Airtable records were queried using Duda’s built-in Collections feature and were displayed as searchable articles on their website. It worked great—but was taking upwards of 15 seconds for the first load.

I tested my own widgets with 15,000 mock data records, and while it was faster, it seemed like the lag came from Duda handling the cached data. To see if I could speed it up, I uploaded the mock data as a JSON file to GitHub and loaded that instead. The difference was night and day: the widget now loaded in milliseconds.

Now that I knew how to make their widgets fast, I had to solve the question of "How can I keep an up-to-date JSON file at a static endpoint with the contents of the Airtable base, so the customer has minimal day-to-day maintenance effort?" I decided the most straightforward solution would be to use the Airtable API and rewrite the JSON file daily (or more frequently if needed) using a static file hosting solution, a serverless function, a cron job, and the applicable APIs.

Tools for this configuration

  • A static JSON file hosted on bunny.net (storage zone + pull zone)
  • Google Cloud Platform (GCP) serverless functions to rebuild the file
  • Cloud Scheduler (built into GCP) to trigger the function on a schedule
  • The Airtable API as the data source

Why I chose those tools

JSON file: This was the fastest option I could personally think of. As for rewriting the entire file, I decided to do that because records get updated daily and can be deleted or moved. I could handle each of those as individual updates and keep them in a serverless database, but then I'd have to manage the table for the user if they restructured their data in any way. The JSON file meant they could add any data they'd like to the Airtable base and have each update simply grab the most recent version.

Serverless functions: My usual go-to is Vercel. However, I knew the timeout might be problematic in this scenario. Airtable sends you 100 records per request and only permits 5 requests per second, so with 15K records the function could run for around 30 seconds minimum. Each response includes an offset parameter that needs to be passed in the next request, so I wrote a recursive request that gathered all of the records and spit out a new JSON file that overwrote the previous one.
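
If it helps to picture that pagination, here's a minimal sketch assuming Node with the node-fetch package and placeholder environment variables (AIRTABLE_API_KEY, BASE_ID, TABLE_NAME) rather than anything from my actual project:

// Pull every record from an Airtable table, 100 at a time, following the offset.
const fetch = require('node-fetch');

async function fetchAllRecords() {
  const base = `https://api.airtable.com/v0/${process.env.BASE_ID}/${process.env.TABLE_NAME}`;
  let records = [];
  let offset;
  do {
    const params = new URLSearchParams({ pageSize: '100' });
    if (offset) params.set('offset', offset);
    const res = await fetch(`${base}?${params}`, {
      headers: { Authorization: `Bearer ${process.env.AIRTABLE_API_KEY}` }
    });
    const body = await res.json();
    records = records.concat(body.records);
    offset = body.offset; // only present while there are more pages to fetch
    await new Promise(r => setTimeout(r, 250)); // stay under the 5 requests/second limit
  } while (offset);
  return records;
}

I wrote mine recursively, but a simple loop like this gets the same result.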

As for AWS vs. GCP, I don't like AWS. :) Also, with GCP you can code your function directly in their serverless function UI and easily add Node packages in the package.json file. They have a ton of training materials and examples to reduce the friction of using their serverless function tool. I could hardly even figure out where to activate a serverless function with AWS (exaggerating), and I wanted this to be something I could easily replicate for future customers who have a boatload of records to display on a Duda site. I saw some other platform options, but once I looked at GCP for 5 minutes, I knew it was the one! The pricing tiers also meant it would be free for this very small project to run.

Bunny.net: This one was due to a bit more "I hate AWS." Bunny.net is amazing! It has an easy-to-use API with a simple authorization token: instead of needing a PhD in AWS to configure Signature Version 4 authorization, I could just plop the token into an environment variable.
Now, when I first ran my request I had the wrong region in the URL, and it was showing an authentication error. I emailed their support and they got back to me within the hour. How many days do you think it would've taken Amazon to reply? lol. I wound up finding that error on my own (I had assumed the API docs page would fill in the right region, so keep that in mind!), but they were extremely helpful and followed up until I got the upload running with curl locally, and they even recommended adding --data-binary for the JSON file I was uploading.

Cloud Scheduler: It's built into GCP. Easy win.

The Setup

First, create a new account in Bunny.net. Add a storage zone and a pull zone. Upload a file. You can also upload it with a simple curl request:

curl --request PUT \
  --url '{YOUR FILE PATH}' \
  --header 'AccessKey: {YOUR ACCESS KEY}' \
  --header 'Content-Type: application/octet-stream' \
  --data-binary '@{YOUR FILE NAME AND PATH}'

Next, write your code. I like to use repl.it or Codepen while I’m writing requests just to easily test them as I go. You’ll need to write your function that handles rewriting the JSON file based on where you are pulling the data. If you need help querying Airtable and handling the offsets, leave a comment, as I can write a tutorial specifically on that later. Here’s a basic request to update the file in Bunny using GCP:

const options = {
  method: 'PUT',
  headers: { AccessKey: process.env.BUNNY_ACCESS_KEY, 'Content-Type': 'application/json' },
  body: JSON.stringify(records) // the records you pulled from Airtable
};
fetch('{YOUR FILE PATH IN BUNNY.NET}', options)
  .then(fetchResponse => {
    response.status(200).send('complete');
  })
  .catch(err => console.error(err));

The line with response.status(200).send('complete'); is how I end the GCP function.

To set up the GCP cloud function, you'll need a Google Cloud account with billing activated. Then, create a project and navigate to the Cloud Functions section to create a new function. The URL will be something like this: https://console.cloud.google.com/functions/list?project={your project name}. Be sure to enable the APIs for Cloud Functions and Cloud Scheduler. Use the URL https://console.cloud.google.com/apis/library?project={your project id} to open the API enablement portal.

On the first page, you’ll set up the Trigger and the Runtime, build, connections and security settings.

Settings:

  • Trigger: HTTP
  • Runtime: Increase limits depending on what your function needs for memory and a preferred timeout limit. Use as few resources as possible to avoid getting an outrageous bill. :)
  • Instances: This controls how many copies of the function can run at once. For something like updating a JSON file, I keep the instances between 0 (min) and 1 or 2 (max). For something like training a model I might consider more, but I'm no ML person.
  • Environment variables: Don't publish your secret keys in your code. Add them here and refer to them in your code as process.env.YOUR_VAR_NAME.

Then click "Next" and you'll be able to add your function code in index.js. If you're using Node, you'll see files for the function code and your package.json. Add any dependencies, like the fetch library, to package.json and require them in your function. Remember to end the GCP function in your code (that's the response.status(200).send(...) line) once your promises wrap up, so it doesn't hang until the timeout.
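
To make that concrete, here's a rough sketch of how those two files might fit together. The function name updateJsonFile, the package versions, and the fetchAllRecords helper (the paginated Airtable pull sketched earlier, assumed to be pasted into the same index.js) are placeholders for illustration, not the exact code from this project.

package.json (declares the fetch dependency so GCP installs it on deploy):

{
  "name": "airtable-to-bunny",
  "main": "index.js",
  "dependencies": { "node-fetch": "^2.6.7" }
}

index.js (the exported name must match the entry point you set in the GCP UI):

const fetch = require('node-fetch');

exports.updateJsonFile = async (req, res) => {
  try {
    const records = await fetchAllRecords(); // the paginated Airtable pull from earlier
    await fetch('{YOUR FILE PATH IN BUNNY.NET}', {
      method: 'PUT',
      headers: { AccessKey: process.env.BUNNY_ACCESS_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify(records)
    });
    res.status(200).send('complete'); // ends the function so it doesn't hang until the timeout
  } catch (err) {
    console.error(err);
    res.status(500).send('error');
  }
};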

After you set up the function, you'll need to grant the allUsers principal the roles that permit invoking and viewing the function.

To do so, check the box next to your function then click “Permissions” after that appears. Google changes their UI all the time, so if you don’t see that exact button, look for something similar.

The Permissions panel will open up. Click the "Add principal" button. (It might change to say something else, so click whatever button lets you add a principal.) Add allUsers as the principal, then assign it roles similar to Cloud Functions Invoker and Cloud Functions Viewer. This makes the function public, so limit it to those roles. If you don't see those exact role names, use logic to figure out what the new names are. 😅 This is as of 1/5/22. Save that, confirm you want to save it, and you'll be all set to call your function.
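
If the console UI has drifted since then, you can make the same grant from the command line. This is a sketch assuming you have the gcloud CLI installed and your own function name and region filled in; it isn't how I set mine up, but it should land you in the same place:

gcloud functions add-iam-policy-binding YOUR_FUNCTION_NAME \
  --region=YOUR_REGION \
  --member="allUsers" \
  --role="roles/cloudfunctions.invoker"

Repeat with roles/cloudfunctions.viewer if you also want the viewer role from the UI steps.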

In your cron scheduler, set up a schedule to call your GCP function. You can get the URL from the overview screen by clicking the “Trigger” tab. You can test the request in your terminal with curl -X POST YOUR_FUNCTION_URL_HERE.

If you're using Google Cloud Scheduler, you'll need to enable the Cloud Scheduler API. To open Cloud Scheduler, search for the app in the search bar and click to open it (not the API). Once you're on the page, click "Create a job." If you haven't memorized how to write a unix-cron schedule expression, search for a website that can translate it for you (there's also a small example after the list below). For the execution, add these settings:

  • Target type: HTTP
  • URL: Trigger path for your GCP function
  • HTTP Method: POST
  • User agent: leave the default value
  • Configure the retries as you wish
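
For instance, here's an assumed schedule (not the one from this project) that would rebuild the JSON file once a day:

# unix-cron format: minute hour day-of-month month day-of-week
0 6 * * *   # run every day at 06:00 in the job's configured time zone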

That’s about it! Be sure to test your job by clicking the “Run now” button and verifying that it runs successfully. Let me know how you’ll use this in your static sites in the comments below! ↓

Author's note: SHOUT OUT TO THIS BLOG for the allUsers explanation. That one was a real pain in the butt! :)
