Creating an Open Graph Image API
The Open Graph spec allows websites like this one to attach metadata to their pages so that it can be consumed by third parties. In this post, I explore how I built a low-cost copy of GitHub's framework for generating Open Graph images.
A user (or more likely, myself) decides to share an article on social media. An image very much like the following should pop up for easy sharing.
What this means is that we'll need a few things.
- An endpoint. I've elected to expose a subdomain of this site (https://opengraph.jackofalltrad.es/) to the internet via Cloudflare's DNS settings. This routes to an Azure Container App. The container app has one path, `/ogimage`.
- A query string. The query string of our API should have a `url` key with a value that corresponds to the URL of the article being shared.
- A service that returns an image with information about the article. To do that we'll need some browser automation and some server code that can run in a container.
- An HTML element. A `<meta>` tag in the `<head>` like the one below. It needs a `property` attribute with a value of `og:image`, then a `content` attribute with a value that corresponds to the endpoint of our image-generating API.
```html
<meta
  property="og:image"
  content="https://opengraph.jackofalltrad.es/ogimage?url=https://jackofalltrad.es/posts/creating-an-open-graph-image-api"
/>
```
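In a template, that tag can be generated from the article URL. Here's a minimal sketch; `ogImageTag` is a hypothetical helper, not something my 11ty setup actually exports, and note that `URLSearchParams` percent-encodes the value (which servers will decode anyway):

```javascript
// Hypothetical helper: build the og:image meta tag for a given article URL.
const ogImageTag = (articleUrl) => {
  const endpoint = new URL("https://opengraph.jackofalltrad.es/ogimage");
  // searchParams.set percent-encodes the article URL for us
  endpoint.searchParams.set("url", articleUrl);
  return `<meta property="og:image" content="${endpoint.href}" />`;
};
```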
Before you go on ...
I'd like to highlight the two big-kid-on-the-block solutions from Vercel and GitHub.
Deciding on the browser automation tool to use
In GitHub's adventures getting something similar finished, they used browser automation to get the job done, so I resolved to do the same. But while they used Google's Puppeteer, I decided to use the newer Microsoft version of the same thing: Playwright. The goal here is to have Playwright running, receive a request for an image, load up the page being shared, grab a few necessary pieces of information, such as URL, title, and description, then pop it into a nearly-blank HTML page, take a screenshot, and send the image back to the user.
There are some immediately obvious downsides to my chosen approach (that I will pigheadedly stick to). Loading up a page takes time, meaning this API will not be particularly fast. It will be running in the broke-ass tier of an Azure Container App, meaning if no traffic is hitting the URL (a very realistic scenario for a tiny blog like this), then the container app will be shut down and take a few extra seconds to cold start on the next request. If I start to see real traffic this might be fine, but otherwise the issue will persist. (Side note: you can specify the maximum and minimum number of containers to scale up/down to in the Azure portal or your ARM template.)
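For reference, in an ARM template those scale knobs live under the container app's template. A minimal fragment (property names per the `Microsoft.App/containerApps` schema; setting `minReplicas` to 1 keeps one replica warm, at a cost):

```json
{
  "properties": {
    "template": {
      "scale": {
        "minReplicas": 1,
        "maxReplicas": 1
      }
    }
  }
}
```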
Other solutions, such as Vercel's, don't use browser automation at all, but rather use a tool like Satori to convert HTML into an SVG. I didn't delve deeply into this, but it may well be a more performant solution.
Using Docker and Azure Container Apps instead of Azure Functions
My first attempt at this was mired in obstacles to actually installing Playwright in the Azure Functions runtime. Eventually, I found that using a Docker image with Playwright preinstalled and deploying that to an Azure Container App produced the right results with little to no fiddling around. Luckily there's already a container in the Microsoft Container Registry for this very purpose. Extending from that image was very easy, and I ended up with a Dockerfile that looked like the following.
```dockerfile
# Extend from the Playwright Docker image
FROM mcr.microsoft.com/playwright

# Administrative container bits
LABEL org.opencontainers.image.title="Playwright og image endpoint" \
      org.opencontainers.image.description="A bit of function code to return an og:image" \
      org.opencontainers.image.authors="@wibjorn"

# Create directory in container image for app code
RUN mkdir -p /usr/src/app

# Copy app code (.) to /usr/src/app in container image
COPY . /usr/src/app

# Set working directory context
WORKDIR /usr/src/app

# Install dependencies
RUN npm install

# Build TypeScript files
RUN npm run build

# Port to expose
EXPOSE 8080

# Command for container to execute
CMD ["npm", "run", "fastify-start"]
```
Deploying this via the command line
So far, I've not gone through the trouble of adding continuous integration for this little project. Instead I've been deploying it with the Azure CLI, using a command like the following:
```shell
az containerapp up --name <nameyourapp> --resource-group <name-of-resource-group> --environment <nameyourenvironment> --location westus --source .
```
You'll need to ensure the resource group you are deploying to already exists. There's also a tutorial about deploying Container Apps on Microsoft Learn. (P.S. I am part of the Microsoft Learn engineering team, but it is not my intention to shill for any particular cloud on this blog.)
Setting up the folder, package.json and TypeScript
We have a direction and an environment for the code to run in consistently. Time to set up a server. Fastify is one of the preeminent Node.js server frameworks around these days. We can get a small server up and running with just a few lines of code. The project folder will look like this:
```
package.json
src/
  index.ts
  markup.ts
tsconfig.json
```
Our package.json will contain our dependencies and the start script for the server itself. These days, I use a handy tool called wireit to orchestrate scripts that depend on one another. Perhaps a topic for another post. But for now, I'll simply explain the scripts. The first, `build`, will compile our TypeScript (targeting `src/index.ts`) into a `dist` folder by running `tsc`. The second, `fastify-start`, will invoke our built JavaScript with `node ./dist/src/index.js`.
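The relevant slice of package.json might look something like this (a sketch; the wireit wiring is omitted for brevity):

```json
{
  "scripts": {
    "build": "tsc",
    "fastify-start": "node ./dist/src/index.js"
  }
}
```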
A small Fastify server
```javascript
const fastify = require("fastify")({ logger: true });

const options = {
  // tbd, see query section later in article
};

fastify.get("/ogimage", options, async (request, reply) => {
  // We'll put our image generating code in here.
});

// What follows from here is just server startup boilerplate.
const port = process.env.PORT || 8080;
const host = "0.0.0.0";

const start = async () => {
  try {
    await fastify.listen({ port, host });
    console.log(`Server listening at ${host}:${port}`);
  } catch (err) {
    fastify.log.error(err);
    process.exit(1);
  }
};

// Start it!
start();
```
With the above we create one GET route, available at `/ogimage`. We first import Fastify, then call the `get` method with the name of our route, `options`, and a callback function. Further down, we then create a `start` function, the key part of which is `fastify.listen`, which actually starts up the server.
Fastify Route Options
It's a good idea when accepting query strings with Fastify to allow for some validation via route options. There is a lot you can do with this `options` object, as you'll see in the options section of the Fastify route docs. What we need to do for our purposes is ensure our query string is validated. In order to do that, our options object needs a `schema` property, which will in turn have a `querystring` property. We'll then specify what query params this route will accept, and Fastify will take care of the rest - even producing tidy error responses when the query string does not match what we specify. See here:
```javascript
const options = {
  schema: {
    querystring: {
      type: "object",
      properties: {
        url: {
          type: "string",
        },
      },
      required: ["url"],
    },
  },
};
```
You can see inside `querystring` that we're writing a JSON schema. It specifies that the query string needs to be an object with a required `url` key whose value is of type string. So when our request for an image comes in, if the query string doesn't have a `url` key, Fastify will automatically respond with a 400. You can try it by visiting this URL: https://opengraph.jackofalltrad.es/ogimage?noturl=https://jackofalltrad.es/posts/a-horizontal-snap-scroll-container.
You should get back a 400 (`Bad Request`) error response in JSON that looks like this:
```json
{
  "statusCode": 400,
  "code": "FST_ERR_VALIDATION",
  "error": "Bad Request",
  "message": "querystring must have required property 'url'"
}
```
Pretty neat! It's a very easy way to ensure things are the way they should be before you get into your route handler. Always nice to sidestep some error handling if you can.
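Fastify compiles that schema with Ajv under the hood. As a plain-JavaScript stand-in for what that check does, here's a sketch (the `validateQuery` helper is hypothetical, purely to illustrate the behavior; it is not part of Fastify's API):

```javascript
// Hypothetical stand-in for the schema validation Fastify performs via Ajv.
// Returns the error body Fastify would send, or null if the query is valid.
function validateQuery(query) {
  if (!query || typeof query.url !== "string") {
    return {
      statusCode: 400,
      code: "FST_ERR_VALIDATION",
      error: "Bad Request",
      message: "querystring must have required property 'url'",
    };
  }
  return null;
}
```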
A route handler that takes pictures of web pages
Now that we have a Fastify route with some basic query string validation, we finally get to write the route handler itself. At its most basic, this handler will be a function that takes a request and a reply parameter and at some point calls `reply.send`. A status of 200 (`OK`) is the default, so we don't even really need to specify it.
```javascript
fastify.get("/ogimage", options, async (request, reply) => {
  return reply.send("This is the response body");
});
```
But of course we'll need to do more than that. At the top of the function, we'll add a little bit more validation. Sure, the URL needs to exist, but I want to make sure it only works for URLs on my site. I'm not trying to pay for everybody's Open Graph images, wink, wink. You could probably also do this with the query validation object we specified above (JSON schema strings can have a `pattern` property), but whenever I can avoid writing a JSON-escaped regexp, I tend to, even if that regexp is an easy one like `https:\/\/jackofalltrad\.es`.
```javascript
const urlArg = request.query.url;
if (!urlArg.startsWith("https://jackofalltrad.es")) {
  reply.code(400);
  return reply.send("Only valid for https://jackofalltrad.es urls");
}
```
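For completeness, if you did want the schema to handle this instead, the `pattern` keyword would look something like the following sketch (note the doubled backslash once the regexp lives inside a JS string):

```javascript
// Alternative sketch: push the origin check into the JSON schema itself.
const optionsWithPattern = {
  schema: {
    querystring: {
      type: "object",
      properties: {
        url: {
          type: "string",
          // Anchored so only this site's URLs validate
          pattern: "^https://jackofalltrad\\.es",
        },
      },
      required: ["url"],
    },
  },
};
```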
Now that we're past the boring stuff, we need to be able to open up a browser. First install the required dependencies: `playwright` and `playwright-chromium`.
```shell
npm i --save playwright playwright-chromium
# and
npx playwright install
```
Second, import `chromium` from Playwright at the top of your file.
```javascript
import { chromium } from "playwright-chromium";
```
Now that `chromium` is available to us, we can use it! The below is what it takes to visit the web page.
```javascript
const urlArg = request.query.url;
// [... code omitted, see above ]

// launch a headless browser
const browser = await chromium.launch({ headless: true });
// create a new browser context
const context = await browser.newContext();
// create a new page object
const page = await context.newPage();
// visit the url of the page
await page.goto(urlArg, {
  waitUntil: "domcontentloaded",
});
```
Here is where we have to begin to make some design decisions. You'll note that in the above snippet we're only waiting for `domcontentloaded` rather than slower events such as `load` or `networkidle`. Playwright has several options for the `waitUntil` property listed in its docs. I've decided that in fact I don't really want to take a picture of the page itself and send it to users. I want to do something a bit more stylized. Taking a picture of the page itself, or even manipulating the DOM, would be more brittle and tricky than what I'm hoping to do.
The approach
I've decided to ...

- Grab the text from the page's `h1`.
- Grab the description from the page's `og:description` meta tag, which I write in the YAML front matter of my 11ty markdown files.
- Use the URL of the page itself to put above the title.
- Create a sandbox page with a Playwright route that I can visit. That page will just need the above information.
- Style that page sort of like the real web page.
- Resize the browser.
- Take the screenshot.
Grabbing useful metadata from the page itself
To get the page's URL, we can reference the URL arg from before or use Playwright to get it.
```javascript
const slug = page.url();
```
To get the title of the page, we can use a Playwright locator with a CSS selector.
```javascript
const title = await page.locator("css=h1").allTextContents();
```
To get the description out of the Open Graph description meta tag, we can do something similar.
```javascript
const description = await page
  .locator('css=meta[property="og:description"]')
  .getAttribute("content");
```
Just to keep things as quick as possible with this series of relatively slow tasks, we should wrap anything asynchronous in a `Promise.all` to ensure the tasks are performed in parallel. First, we'll just quickly assign the page's URL to the `slug` variable. Then we'll grab the text in the page's level-one header (`h1`) and assign it to the `title` variable, then get the `og:description` `meta` tag and read its `content`, assigning the result to the `description` variable.
```javascript
const slug = page.url();
const [title, description] = await Promise.all([
  page.locator("css=.article > h1").allTextContents(),
  page.locator('css=meta[property="og:description"]').getAttribute("content"),
]);
```
A picture is worth a thousand characters of code
We now have all the information we need! But the fun is just beginning. I wanted the image taken to share styles with my site. That said, I did not want to load up the site itself (thus introducing even more slowness and flakiness with a network request). I resolved to avoid the network request by creating a fake HTML page, using Playwright's routing capabilities to serve my fake markup, and using a CSS inlining tool to help me make my fake page. After doing that, I'd create a function `getMarkup` that I could give the page's `title`, `description`, and `slug`.
```javascript
// in markup.ts
export const getMarkup = (title, description, slug) => `<html>...etc...</html>`;
```
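Fleshed out, it might look something like this. This is only a sketch with placeholder styles; the real version inlines the site's full CSS and a data-URI image, and the class names here are invented for illustration:

```javascript
// A rough sketch of getMarkup; the real markup is far larger (inline
// styles, a data-URI header image) but follows the same shape.
const getMarkup = (title, description, slug) => `<!DOCTYPE html>
<html>
  <head>
    <style>
      body { font-family: sans-serif; padding: 2rem; }
      .slug { color: #666; font-size: 0.8rem; }
    </style>
  </head>
  <body>
    <p class="slug">${slug}</p>
    <h1>${title}</h1>
    <p>${description}</p>
  </body>
</html>`;
```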
Side note, I also converted the fox image in the header to a data uri. This means the markup for the page is horrible to look at. Inline styles, a massive data uri image. Yuck! But this is all in the service of avoiding loading any other external content and still getting a stylized look.
Time to set up a page that is ready-made to be screenshot (screenshotted?). I've opted for a small 640x320 pixel screenshot, so it's time to resize the Playwright browser window to that size. This should ensure the image we take matches those proportions.
```javascript
await page.setViewportSize({ width: 640, height: 320 });
```
Now that the viewport is the desired size, I need to set up a route that's ready to serve my fake markup. After setting up the route, we can tell Playwright to visit that page.
```javascript
const fakeOrigin = "https://this.url.does.not.matter.at.all/";
await page.route(fakeOrigin, (route) => {
  return route.fulfill({
    status: 200,
    contentType: "text/html",
    body: getMarkup(title, description, slug),
  });
});
await page.goto(fakeOrigin, { waitUntil: "domcontentloaded" });
```
There's little to do now but take the screenshot, close the browser, and send the image back to the user!

```javascript
// Take the screenshot as a buffer, then clean up and respond
const imageBuffer = await page.screenshot({ type: "jpeg" });
await browser.close();
return reply.header("Content-Type", "image/jpeg").send(imageBuffer);
```
Hooray! We did it!
Wrapping up
This solution is far from perfect. The consumption-plan container app has already gone offline (many times!) due to inactivity, which has yielded slow requests while the app wakes up. I've had to go in and ensure it never scales down past one running replica (which means more cost).
In terms of maintainability, there are also some obvious flaws, namely that if I change the look and feel of my site, I'll have to go into the open graph image app and change the look and feel there too. Using a CSS inlining tool was helpful for generating the markup, but it's hardly 100% effective (human edits still required!), nor is it a joy to work with.
That said, I'm happy enough with my little vanity project. We'll see how long it lasts!