AWS Textract Simple OCR with NodeJS 🚀🔎

OCR Overview 🔎

First, we need to know what OCR is:

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast).

From Wikipedia of course!

If you need a service that performs OCR, you can use AWS Textract. Functionally, AWS Textract converts an image into text. To make this concrete, suppose you are a university admin employee in charge of registration. Registration forms usually arrive as PDFs or handwritten documents. Checking them one by one takes a lot of time and invites errors, and you still have to transfer their contents into digital form.

AWS services are building blocks, which means we can combine one or several services into a complete system. Here we use AWS Textract for OCR. We can then export the results as JSON and save them to S3 (object storage), or save them in DynamoDB (managed NoSQL database) to analyze later.

In this article, we will learn a little about using AWS Textract, and I'll explain why it is such a good fit for this kind of job.

At a high level, the first step is to upload an image to S3. We then point AWS Textract at the image file that is already in S3, Textract analyzes it, and we get back the results of the analysis.

We could do all of this in the AWS Console with just a few clicks, but that's too easy. Here I will explain how to use Textract with JavaScript. You can call it from your own code with the AWS SDK for JavaScript, or from a Lambda function. In this article I will use the AWS SDK.

Just make sure you have the right role for using AWS Textract. For more information about the role, you can check this link. For credentials, you have to configure your IAM credentials with the AWS CLI.

After you set up the AWS CLI, test it with a few AWS CLI commands and make sure it runs well. I assume you have done this, so let's go to the next step.

In plain JavaScript (NodeJS), you first have to install the AWS SDK by running this command:

npm install aws-sdk

With simple code like this, we can run the program with node index.js.

You will get the result in JSON form. If you want to keep the results on your local machine, you can save them as a text or JSON file.
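One possible way to save the results locally, as a sketch: the response contains a Blocks array, and each block with BlockType "LINE" carries one recognized line in its Text field. The helper names and output file names below are hypothetical.

```javascript
const fs = require('fs');

// Collect the text of every LINE block, in the order Textract returned them.
function extractLines(result) {
  return (result.Blocks || [])
    .filter((block) => block.BlockType === 'LINE')
    .map((block) => block.Text);
}

// Write the full response as JSON and the recognized lines as plain text.
function saveResult(result, basePath) {
  fs.writeFileSync(`${basePath}.json`, JSON.stringify(result, null, 2));
  fs.writeFileSync(`${basePath}.txt`, extractLines(result).join('\n'));
}
```

Calling saveResult(result, 'form-001') would then write form-001.json and form-001.txt next to your script, where result is the response from Textract and form-001 is a placeholder name.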

You may want to keep the document on your local machine: in essence, you keep the source files in a private place and want to keep the results private too. I think this is a good feature: you don't have to save the source file in S3, you can pass it straight to AWS Textract.

An example is like the picture above. With just a few changes to your code, you can use Textract for OCR only, without any other services.

Let's look at the changes. The code only differs in whether we specify the source file from S3 or from the local file system. We don't even have to convert the file to base64. We only need to read the file and pass its bytes to AWS Textract.
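As a sketch of that change, again assuming the v2 aws-sdk package and the DetectDocumentText operation: the Document field carries Bytes instead of an S3Object, and the SDK takes care of the encoding. The function names and the file path are hypothetical.

```javascript
const fs = require('fs');

// Build a DetectDocumentText request from a local file instead of S3.
function buildLocalParams(filePath) {
  return {
    Document: {
      // A plain Buffer is enough; no manual base64 conversion is needed.
      Bytes: fs.readFileSync(filePath),
    },
  };
}

// Send the local file's bytes to Textract and resolve with the raw response.
async function detectTextFromFile(filePath, region) {
  // Loaded lazily so buildLocalParams can be reused without the SDK installed.
  const AWS = require('aws-sdk');
  const textract = new AWS.Textract({ region });
  return textract.detectDocumentText(buildLocalParams(filePath)).promise();
}
```

Compared with the S3 variant, only the contents of Document differ, so the rest of the program can stay the same.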

Now we see how beautiful it is 😀. From the models above, we can choose the best way to implement this. The key is AWS Textract; you can then choose the next step to process the results.

Thanks for reading, stay healthy, happy coding, PROFIT! 💻 👩‍💻