codensuch

Dedicated AI server build complete! AI performance improvements incoming

Added 2022-01-10 07:34:14 +0000 UTC

First of all, happy new year! Hope everyone had a great holiday. I got a good chunk of time to work on the project and I have some great progress to report!

AI dedicated server is done!

The dedicated AI server is up and working! All of the software has been set up and migrated over to it, and future changes can be deployed to it quickly. The server will go live soon.

Performance speed-up

A lot of work went into parallelizing the AI processing so that the dedicated server is used more efficiently when serving several users at once. The app used to process all incoming AI requests serially. This essentially created a single queue for all AI processing requests, which is obviously not ideal for a system that is meant to be shared among many users.

Here is a quick summary of the parallelization work, along with some interesting discoveries:

1. OCR + translation

This was by far the largest bottleneck. Every bubble's OCR and translation used to be processed serially. Now bubbles can be processed concurrently, and the number of concurrent bubbles can be adjusted based on the hardware of the AI server.
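To give a rough idea of what bounded concurrent bubble processing can look like, here is a minimal Python sketch using a thread pool. It is not the app's actual code: `ocr_and_translate`, `process_bubbles`, and `MAX_CONCURRENT_BUBBLES` are hypothetical names, and the OCR + translation step is replaced by a trivial stand-in.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tuning knob: how many bubbles to process at once.
# In practice this would be chosen based on the AI server's hardware.
MAX_CONCURRENT_BUBBLES = 4


def ocr_and_translate(bubble):
    """Stand-in for the real OCR + translation step (assumption, not the app's code)."""
    return bubble.upper()


def process_bubbles(bubbles, max_workers=MAX_CONCURRENT_BUBBLES):
    """Process bubbles concurrently, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map runs the calls concurrently but returns results
        # in the same order as the input bubbles.
        return list(executor.map(ocr_and_translate, bubbles))
```

Raising or lowering `max_workers` is the knob that would correspond to "the number of concurrent bubbles" here. A thread pool suits work that waits on I/O or on native libraries that release the GIL; pure-Python CPU work would likely need a process pool instead.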

2. Requests

A request is any operation that a user wants the AI server to perform, for example "detect all bubbles" or "typeset this bubble". Requests now also go into a concurrent pool. All image pre- and post-processing also happens within the request, which further improves overall throughput.
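As an illustration of the idea, the sketch below submits each request to a shared worker pool and does the pre- and post-processing inside the worker, so that work overlaps across requests. Everything here is an assumption made for the example: the function names, the pool size of 8, and the string-based stand-ins for images and results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared pool for all incoming requests; the size is an assumption.
request_pool = ThreadPoolExecutor(max_workers=8)


def preprocess(image):
    """Stand-in for image preprocessing (e.g. resizing or normalizing)."""
    return image.strip()


def run_operation(operation, image):
    """Stand-in for the actual AI operation, such as detecting bubbles."""
    return f"{operation}:{image}"


def postprocess(result):
    """Stand-in for postprocessing the raw result into a response."""
    return {"result": result}


def handle_request(operation, image):
    # Pre- and post-processing run inside the worker, so they overlap
    # with other requests instead of blocking a single serial queue.
    return postprocess(run_operation(operation, preprocess(image)))


def submit_request(operation, image):
    """Queue a request on the pool and return a Future for its response."""
    return request_pool.submit(handle_request, operation, image)
```

A caller would do something like `submit_request("detect_bubbles", page).result()` to wait for the response, while other requests keep running on the remaining workers.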

3. The neural net inference

This one was weird. After much testing, it turned out that my current bubble detection model cannot be parallelized. Any attempt at parallelization (either multi-threading or increasing the batch size) gets serialized by TensorFlow regardless. Luckily it is currently fairly fast, at about 150ms per page on an RTX 3070 (~7 pages per second). So I'm leaving this one as is and will revisit it when I train a new model.
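Since inference effectively runs one page at a time, the throughput ceiling is just the inverse of the per-page latency. A tiny Python check of the figure above (the helper name is made up for this example):

```python
def pages_per_second(latency_ms):
    """Throughput when pages are processed strictly one after another."""
    return 1000.0 / latency_ms


# 150 ms per page works out to roughly 6.7 pages per second, i.e. about 7.
print(round(pages_per_second(150), 1))
```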

The result of all of this is higher throughput through more efficient CPU usage. The AI will be able to easily take advantage of the processing power of future server upgrades (and there are still more optimizations that can be done!). Those who have tried the app will notice at worst a 2x speed-up per page (likely 5x or more), and the server will be able to serve many more users at the same time.

Limited public testing coming

As I mentioned in a previous post, I'm currently planning a limited public testing run. I'm building a very simple UI for people to upload pages and get back a cleaned page. This is mainly to test the stability of the AI server and gather some additional feedback. I'll be reaching out to people on Discord very soon, so stay tuned!