
Last edited: Mar 06, 2023

Using AFFiNE as your own blog - technical

The Open Source Engineering

Peng Xiao

AFFiNE is great and we think it is ready for use in the wild. So in order to align our actions with our beliefs, we have decided to migrate our existing blog to AFFiNE.

In this article, we will share with you the reasons behind our decision to switch to our own product and the benefits we hope to achieve, as well as the challenges we faced along the way. We hope that our experience will inspire and guide other users who are looking to integrate and utilise AFFiNE in their own workflows.

Switching to our own product

Aside from explicit testing, we are always looking for more ways to expand how we use AFFiNE in-house. Not only do we want to better understand and relate to our users, but we are also looking to promote a positive work culture.

The more we use AFFiNE the more we can embrace and appreciate the feature ideas and/or issues raised by our users. We take community feedback very seriously and we give serious thought to everything that is brought up. By using the product internally, we will create more shared experiences among employees, which will foster our own sense of community and collaboration. We want everyone to feel proud of the work they have done, and the product that has been produced.

It will also make the process of writing on our blog much easier. Before this post, writers needed to learn the basics of Markdown, Git and GitHub before contributing to the AFFiNE Blog, which is a rather steep learning curve for non-technical members. AFFiNE offers the perfect solution to make blogging easy and accessible for everyone.

Setting up the new blog

The old blog was a standard Next.js blog whose posts were static Markdown files managed in a GitHub repo. The file directories and names generate the file tree and routes; post metadata takes the form of frontmatter placed at the beginning of a post; and the content itself is written in Markdown. Each file is parsed and pre-rendered as a static Next.js page.

There are lots of great community-driven projects that introduce different approaches and tools for rendering content and sharing it on the internet. We wondered if it would be possible to write our blog using AFFiNE.

Although the AFFiNE project is relatively new, we think it would be cool to do the same. However, we have no official public APIs that make it easy to do this yet. AFFiNE Downhills introduces the public workspace feature, which means your workspace content can be shared with anyone without the need for authentication. This gives us an opportunity to read the workspace documents (which are in Yjs binary format) from the public AFFiNE API. For example, you can get the workspace data by calling https://app.affine.pro/api/workspaces/{workspaceId}/docs/{docId}.
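As a rough sketch of that call, the helpers below build the document URL from the pattern above and download the raw bytes. The names docUrl and fetchDocBinary are hypothetical, the fetch function is passed in so the sketch does not depend on a particular runtime, and decoding the Yjs data is deliberately left out.

```typescript
// Hypothetical helpers; only the URL pattern comes from the article.
const API_BASE = "https://app.affine.pro/api";

/** Builds the document URL for a workspace ID and doc ID. */
function docUrl(workspaceId: string, docId: string): string {
  return `${API_BASE}/workspaces/${encodeURIComponent(workspaceId)}/docs/${encodeURIComponent(docId)}`;
}

/** The minimal part of an HTTP response that this sketch relies on. */
interface BinaryResponse {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}

/** Downloads the Yjs-encoded document as raw bytes using the supplied fetch function. */
async function fetchDocBinary(
  workspaceId: string,
  docId: string,
  fetchFn: (url: string) => Promise<BinaryResponse>
): Promise<Uint8Array> {
  const res = await fetchFn(docUrl(workspaceId, docId));
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return new Uint8Array(await res.arrayBuffer());
}
```

In Node 18+ or a browser, the global fetch can be passed as fetchFn. The returned bytes still need to be applied to a Yjs document before any page content can be read.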

Getting the markdown files from public workspace API

To build a Next.js site, it is essential to have a list of available files and the file contents. For real-world usage, we will also put some metadata information for the file contents, such as tags, authors, whether it is a draft or not (publish status), and so on.

Technically, affine-reader does the following:

Reads the page list from yDoc
For each page (which is a list of blocks), gets the Quill delta of each block
Converts delta to markdown (using a fork of https://github.com/frysztak/quill-delta-to-markdown)
Concatenates block-level markdowns into a single markdown page
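The last three steps can be illustrated with a much-simplified sketch. This is not the affine-reader implementation: the Block shape and its flavour values are hypothetical, and only bold, italic and link attributes are handled.

```typescript
// Simplified sketch; the real reader handles many more block and inline types.
interface DeltaOp {
  insert: string;
  attributes?: { bold?: boolean; italic?: boolean; link?: string };
}

interface Block {
  flavour: "paragraph" | "heading" | "list"; // hypothetical block types
  delta: DeltaOp[];
}

/** Converts one block's Quill delta into inline markdown. */
function deltaToMarkdown(delta: DeltaOp[]): string {
  return delta
    .map((op) => {
      let text = op.insert;
      if (op.attributes?.bold) text = `**${text}**`;
      if (op.attributes?.italic) text = `*${text}*`;
      if (op.attributes?.link) text = `[${text}](${op.attributes.link})`;
      return text;
    })
    .join("");
}

/** Adds the block-level prefix, then joins all blocks into one markdown page. */
function blocksToMarkdown(blocks: Block[]): string {
  return blocks
    .map((block) => {
      const inline = deltaToMarkdown(block.delta);
      if (block.flavour === "heading") return `# ${inline}`;
      if (block.flavour === "list") return `- ${inline}`;
      return inline;
    })
    .join("\n\n");
}
```

Keeping the inline conversion separate from the block-level prefixing mirrors the split between the "delta to markdown" and "concatenate" steps in the list above.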

If you are familiar with Next.js, you can use getStaticPaths to generate the static page routes and getStaticProps to get each page’s markdown content. With the Next.js 13 app directory, you can even render the post list and a single post quite easily, like this:

// [workspaceId]/page.tsx
import { getBlocksuiteReader } from "blocksuite-reader";

// Page components in the app directory receive route params via the `params` prop.
export default async function PostList({
  params,
}: {
  params: { workspaceId: string };
}) {
  const reader = getBlocksuiteReader({
    workspaceId: params.workspaceId,
  });
  const pages = await reader.getWorkspacePages();
  return (
    <div>
      <ul>{/* ... render page list */}</ul>
    </div>
  );
}

// [workspaceId]/[pageId]/page.tsx
import { getBlocksuiteReader } from "blocksuite-reader";

import { mdToHTML } from "./md-to-html";

export default async function Post({
  params,
}: {
  params: { workspaceId: string; pageId: string };
}) {
  const reader = getBlocksuiteReader({
    workspaceId: params.workspaceId,
  });
  const page = await reader.getWorkspacePage(params.pageId);
  return (
    <div>
      <article dangerouslySetInnerHTML={{ __html: mdToHTML(page.md) }} />
    </div>
  );
}

The primary issue is resolved. To be more creative, we can include frontmatter at the beginning of the page to define some metadata for the current page, such as authors, tags, whether or not to publish this page, etc. Additionally, we can use MDX to enrich both the default and custom elements of the page by embedding link previews and other interactive components.
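To make the frontmatter idea concrete, here is a minimal sketch of splitting a page's markdown into metadata and body. The names parseFrontmatter and isPublished are hypothetical, and the parser only understands flat key: value lines rather than full YAML.

```typescript
interface ParsedPost {
  meta: Record<string, string>;
  body: string;
}

/** Splits a markdown string into frontmatter key/value pairs and the remaining body. */
function parseFrontmatter(markdown: string): ParsedPost {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  if (!match) return { meta: {}, body: markdown };

  const meta: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip lines that are not "key: value"
    meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, body: markdown.slice(match[0].length) };
}

/** A post is listed only when its frontmatter says publish: true. */
function isPublished(post: ParsedPost): boolean {
  return post.meta["publish"] === "true";
}
```

A list page could then filter the parsed pages with isPublished so that drafts stay hidden until they are ready.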

By implementing Next.js infrastructure and enabling ISR, we eliminate the need to rebuild and deploy the blog each time content is updated. Moving forward, we can focus on using AFFiNE as our CMS to write posts for the blog. This will allow us to easily manage and publish our content with minimal hassle.

Migrating the old blog

At the time of writing there were around 50 articles that needed migrating over. That would be a fairly boring manual task, so why not make it a little more interesting and utilise some code to help us do it a little quicker?

Challenge 1: Getting our files

Our old blog lived inside the general AFFiNE homepage repo and followed a structure of /blog/{YEAR}/{MONTH}/{SLUG}/ . An example would be blog/2023/02/moving-to-app-affine-pro and then inside this folder there is an index.md file which is the contents of the article, supported by an images folder with all of the relevant images for the article.

So first things first, let's take a copy of that blog folder. It would make things easier if everything was in one folder rather than in different subdirectories, and that's where our first challenge comes in.

What do we want to do?

We want to find every article file
The structure is /blog/{YEAR}/{MONTH}/{SLUG}/index.md
We want to move these files to a shared directory
So we cannot leave them all with the index.md name

How can we do this?

Let's just move all our files into /blog/. We just need to match the pattern /blog/*/*/*/index.md, which deals with the first task fairly easily. And for the file name, let's use the slug (the last part of the URL of the article), as that's going to give a unique name without special characters. We can get the slug by checking the last folder name in /blog/{YEAR}/{MONTH}/{SLUG}/.

So how shall we go about doing this? Well, let's just put together a simple little bash script.

#!/bin/bash

for file in /blog/*/*/*/index.md
do
  directory=$(dirname "$file")
  article_title=$(basename "$directory")
  mv "$file" "$directory/../../../${article_title}.md"
done

So first we just declare this is a bash file #!/bin/bash
Next we get every file that matches our pattern /blog/*/*/*/index.md
One by one we take the matched file and then perform the do operations
Let's get the current directory with directory=$(dirname "$file")
Let's get the article title by checking the last directory with article_title=$(basename "$directory")
And finally let's move all these files - take our original $file and move it up three folders (../../../) to the /blog folder, giving it the new article_title name.
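The same path mapping can be written as a small pure function, which makes it easy to check before any files are moved. This is a sketch only; flattenedPath is a hypothetical name and it assumes the /blog/{YEAR}/{MONTH}/{SLUG}/index.md layout described above.

```typescript
/** Maps {root}/{YEAR}/{MONTH}/{SLUG}/index.md to {root}/{SLUG}.md. */
function flattenedPath(filePath: string): string {
  const match = /^(.*)\/[^/]+\/[^/]+\/([^/]+)\/index\.md$/.exec(filePath);
  if (!match) {
    throw new Error(`Unexpected path layout: ${filePath}`);
  }
  const [, blogRoot, slug] = match;
  return `${blogRoot}/${slug}.md`;
}
```

Because every article lands in the same folder, two posts with the same slug in different months would overwrite each other, so it is worth checking the computed names for duplicates first.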

Great, all our files are in the blog folder with unique names and we can just upload them now? No, not quite yet...

Challenge 2: Updating the metadata

There are some key things that our blog shows that are not strictly part of the article - these items are our metadata that contains lots of configurable options to be displayed on the article but are not part of the main content.

---
title: Moving to app.affine.pro
author: Christopher Smolak
tags: User Guide
cover: ./images/cover.jpg
description: A big step forward for AFFiNE
created: 2023-02-20
updated: 2023-02-20
layout: blog
---

Now this doesn't really present too many issues - our new blog system builds on the old setup, so it can understand this information and knows to exclude it from the main article content. However, with this update we added two new options that are crucial for an article to be shown and available on the blog.

What do we need to update?

We can remove cover as it's no longer relevant - we can upload images as blobs to AFFiNE and can simply use the first image in the article as the cover image
We need to add the new slug option to make our blog URLs beautiful and to avoid a mess of random characters. Good job we set our filename to our slug earlier.
We need to add the new publish option and set it to true

How can we do this?

Let's again go with a simple little bash script.

#!/bin/bash

for file in $(ls -t -r /blog/*.md)
do
  slug=$(basename "$file" | sed 's/\.md//' | tr '[:upper:]' '[:lower:]')
  sed -i '' '5d' "$file"
  sed -i '' "4a\ 
slug: ${slug}\

" "$file"
  sed -i '' "9a\ 
publish: true\

" "$file"
done

As usual, let's acknowledge this is a bash file with #!/bin/bash
Again we are going to do things to each file in our blog folder - all of the .md files
We want to keep the order of files (from oldest to newest) to import them in the right order, hence we use ls -t -r /blog/*.md
Firstly let's get the file name - which I have called slug.
Get the filename with basename "$file"
Remove the .md extension with sed 's/\.md//'
Ensure the string is all lowercase with tr '[:upper:]' '[:lower:]'
Next we remove line 5, which is the cover option, with sed -i '' '5d' "$file" (the empty '' after -i is the BSD/macOS form of in-place editing)
Then after line 4 (tags) we start a new line and define our slug option with sed -i '' "4a\ slug: ${slug}\ " "$file" - the new lines force the new line to be entered in the doc
Similarly, after line 9 (layout, the last option before the closing ---) we start a new line and set publish: true with sed -i '' "9a\ publish: true\ " "$file" - again the formatting is matched in the file (new lines)
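The sed commands depend on every file having its options on exactly the same line numbers. As an alternative sketch (not the script we ran), the same three changes can be made by key instead. The function name updateFrontmatter is hypothetical, and it assumes flat key: value frontmatter delimited by --- lines.

```typescript
/** Drops `cover`, adds `slug` after `tags`, and adds `publish: true` before the closing ---. */
function updateFrontmatter(markdown: string, slug: string): string {
  const lines = markdown.split("\n");
  if (lines[0] !== "---") return markdown; // no frontmatter: leave untouched
  const end = lines.indexOf("---", 1);
  if (end === -1) return markdown; // unterminated frontmatter: leave untouched

  const meta = lines.slice(1, end).filter((line) => !line.startsWith("cover:"));
  const tagsIndex = meta.findIndex((line) => line.startsWith("tags:"));
  // Insert after `tags` when present, otherwise at the end of the metadata.
  const slugAt = tagsIndex === -1 ? meta.length : tagsIndex + 1;
  meta.splice(slugAt, 0, `slug: ${slug.toLowerCase()}`);
  meta.push("publish: true");

  return ["---", ...meta, "---", ...lines.slice(end + 1)].join("\n");
}
```

For the example frontmatter above, this produces the same result as the three sed commands, but it keeps working if a post has an extra or missing option.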

Great, all done? Well we still need to upload these files to AFFiNE right?

Challenge 3: Uploading to AFFiNE

AFFiNE doesn't yet, at the time of writing, offer an API. That makes this task impossible right? Not quite... Again let's add some automation and see what we can do.

And poof! A neat little product that can automatically import your data into AFFiNE. For its initial purpose it has been set up to work with .md files, but it could be re-configured to import any content files - such as .doc files and .html files.

ShortCipher5/AFFiNE-Importer (github.com)

It's a much longer script, so maybe another article if you are interested in this little code project specifically. It's open-source, so go ahead, check it out and have a play. We look forward to seeing your own projects, and PRs are welcome.

To summarise, this script uses Microsoft Playwright to automate Chromium and perform the copy and paste actions which you'd otherwise have to do manually.

Even this app didn't solve everything. There was still the process of having to manually add images (this also could be automated with more time). And the automation code itself relies on copy-paste of data, which means we sometimes have to force the code to wait - with an arbitrary wait time - to allow the copy and paste actions to complete, which no doubt can also be improved on.

The main part is that this helped significantly in lowering the manual cost and time needed to migrate these articles over.

Let's take a quick look at it in action:

[Demo: AFFiNE Importer in action]

🗓️ Updates on Jan 22, 2024

Since this blog post was first published nearly 10 months ago, more than 170 blog articles have been published in this way under https://affine.pro/blog. During this time, Shule also rebranded the whole AFFiNE.pro website, which is now served by Nuxt.js, a popular web framework similar to Next.js.

Most of the approach explained in the Setting up the new blog section should still make sense. However, due to some product changes in AFFiNE and new requirements for supporting new media types in blog content, we are gradually evolving how we produce the blog by consuming content from the AFFiNE app.

Intermediate Markdown Files are Now Cached in GitHub Repo

At the very beginning, blogs on affine.pro were served in an ISR (Incremental Static Regeneration) style, where the blog assets were dynamically rendered and updated when a user visited any blog post. We found that this approach was sometimes unreliable and could cause a lot of unexpected traffic to our API server. Later we adopted a workaround, which is now broken into two stages:

Use a GitHub Action to fetch the page list and docs from the public workspace. In this step, we use affine-reader to generate Markdown files and update the files stored in the affine.pro project repo.
Our website repo is served on Vercel. When any repo file is changed on GitHub, the Vercel CI kicks in and regenerates the static assets by reading the updated Markdown files.
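Since a commit is what triggers the rebuild in the second stage, the first stage only needs to write files whose content actually changed. A minimal sketch of that comparison is below; the FileMap type and changedFiles name are hypothetical and not part of affine-reader or our workflow.

```typescript
/** Maps a file path to its markdown content. */
type FileMap = Record<string, string>;

/** Returns the paths whose content is new or differs from the committed copy. */
function changedFiles(committed: FileMap, fetched: FileMap): string[] {
  return Object.keys(fetched)
    .filter((path) => committed[path] !== fetched[path])
    .sort();
}
```

If the returned list is empty, the action can exit without committing, so no rebuild is triggered.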

Public Workspace UI Entry is Removed

We used to provide an option to turn any private cloud workspace into a public one, where a public workspace lets any user access the contents in read-only mode without credentials. However, this option has been removed from the UI for now due to changes in the product roadmap. The only way to access a doc via the API is to provide a valid session token explicitly.

To get a valid token, make sure you are logged in and then copy the __Secure-next-auth.session-token cookie stored under https://app.affine.pro, which should look something like d628fcc9-4fde-455d-9000-894a21ce77fe. You can then pass this value to getBlocksuiteReader as an option. Please also note that this token may not be long-lasting and could expire at any time.

AFFiNE Docs Site

Along with the blog posts on https://affine.pro, we are also building the AFFiNE docs site using this approach. Compared to the blog, the docs site is an organized static site that provides product documentation, developer documentation and so on, and is more focused on AFFiNE itself.

[Figure: the AFFiNE docs site]

A new thing about this one is that it uses a special page titled navigation:config to store the main navigation topology as nested lists:

[Figure: the navigation:config page with nested lists]
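To illustrate how nested lists can describe a navigation tree, here is a small sketch that turns an indented bullet list into nested nodes. It assumes the page is exported as a markdown list with two spaces of indentation per level. That format, the NavNode type and the parseNavigation name are assumptions for illustration, not how the docs site is necessarily built.

```typescript
interface NavNode {
  title: string;
  children: NavNode[];
}

/** Parses a "- item" list, indented two spaces per level, into a tree. */
function parseNavigation(markdown: string): NavNode[] {
  const roots: NavNode[] = [];
  // stack[d] holds the most recent node seen at depth d.
  const stack: NavNode[] = [];

  for (const line of markdown.split("\n")) {
    const match = /^(\s*)[-*]\s+(.*)$/.exec(line);
    if (!match) continue; // ignore anything that is not a list item
    // Never go more than one level deeper than the current stack.
    const depth = Math.min(Math.floor(match[1].length / 2), stack.length);
    const node: NavNode = { title: match[2].trim(), children: [] };

    stack.length = depth; // drop nodes at the same or deeper levels
    const parent = stack[depth - 1];
    if (parent) parent.children.push(node);
    else roots.push(node);
    stack.push(node);
  }
  return roots;
}
```

The resulting tree can be rendered directly as a sidebar, with each title linking to the page of the same name.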

Simplified Solution is On the Way

Our goal of using the AFFiNE app as a CMS for our blog site has already been achieved. Unfortunately, the solution described here still requires some programming experience with Next.js or similar web frameworks. There are also some inconveniences that should be improved, such as providing a long-term token for fetching the data.

In the future, we will provide a simpler, near-one-click solution for regular users to start with, which might be a template that you can find when creating a new Vercel project. Please stay tuned!

Conclusion

We hope you enjoyed this article and found some of the points and discussions useful. Be sure to check out the two GitHub projects mentioned. We welcome you to help expand, develop and even utilise these in your own projects.

AFFiNE Reader: https://github.com/pengx17/affine-reader
AFFiNE Importer: https://github.com/ShortCipher5/AFFiNE-Importer

Let us know if you enjoyed this article, and what type of technical (or non-technical) articles you'd like to see next.
