
Amazon AWS - Textract

The BotCity plugin provides a simple way to interact with Amazon Textract.

It allows you to quickly analyze and extract text from hundreds of documents, whether typed or handwritten.

Installation

pip install botcity-aws-textract-plugin

Importing the Plugin

After installing the package, import the plugin into your code to start using its functions.

from botcity.plugins.aws.textract import BotAWSTextractPlugin

Setting up connection

Note

There are two ways to authenticate.

1. Create the .aws folder in your home directory and add two files to it, config and credentials.

# ~/.aws/config
[default]
region=<region_code>

# ~/.aws/credentials
[default]
aws_access_key_id=<your_aws_access_key_id>
aws_secret_access_key=<your_aws_secret_access_key>

2. Pass the credentials to the class constructor.

# Using the `.aws` folder
textract = BotAWSTextractPlugin()

# Alternative using the credentials as constructor arguments
textract = BotAWSTextractPlugin(
    region_name='<region_code>',
    use_credentials_file=False,
    access_key_id='<your_aws_access_key_id>',
    secret_access_key='<your_aws_secret_access_key>',
)
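
If a bot runs both on a developer machine (with the .aws folder) and on a server (without it), the choice between the two authentication methods can be made at runtime. The sketch below is one possible approach, not part of the plugin: `textract_kwargs` is a hypothetical helper, and reading the standard AWS environment variables `AWS_DEFAULT_REGION`, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` is an assumption about how the server is configured.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def textract_kwargs(
    home: Optional[Path] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, object]:
    """Hypothetical helper: pick constructor arguments for BotAWSTextractPlugin.

    Returns an empty dict when <home>/.aws/credentials exists (method 1),
    otherwise explicit credentials read from environment variables (method 2).
    """
    home = Path.home() if home is None else home
    env = os.environ if env is None else env

    # Method 1: the credentials file is present, so no arguments are needed.
    if (home / ".aws" / "credentials").is_file():
        return {}

    # Method 2: a missing variable raises KeyError, which surfaces
    # misconfiguration before any request is sent.
    return {
        "region_name": env["AWS_DEFAULT_REGION"],
        "use_credentials_file": False,
        "access_key_id": env["AWS_ACCESS_KEY_ID"],
        "secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }


# Usage (requires the plugin to be installed):
# from botcity.plugins.aws.textract import BotAWSTextractPlugin
# textract = BotAWSTextractPlugin(**textract_kwargs())
```

Keeping the secret in environment variables rather than in the source code also avoids committing credentials to version control.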

To demonstrate the plugin, let's build a simple example that extracts the text from the following image:

otter_crossing.jpg

Reading text from the image

Now let's read the text from the image.

# Read the text from the image
textract.read("otter_crossing.jpg")

# Print the text from the image
print(textract.full_text())

The output should look like this:

CAUTION
Otters
crossing
for
next
6
miles
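
Because each detected word appears on its own line in this output, it can be convenient to collapse the result into a single sentence before further processing. The snippet below is a minimal sketch that assumes `full_text()` returns a plain string shaped like the output above; `to_sentence` is a hypothetical helper, and the sample string stands in for a real Textract response.

```python
def to_sentence(full_text: str) -> str:
    """Collapse newlines and repeated whitespace into single spaces."""
    return " ".join(full_text.split())


# Sample string copied from the output shown above (stand-in for textract.full_text()).
sample = "CAUTION\nOtters\ncrossing\nfor\nnext\n6\nmiles\n"
print(to_sentence(sample))  # CAUTION Otters crossing for next 6 miles
```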

Complete code

Let's take a look at the complete code:

from botcity.plugins.aws.textract import BotAWSTextractPlugin

# Instantiate the plugin using the `.aws` folder
textract = BotAWSTextractPlugin()

# Read the text from the image
textract.read("otter_crossing.jpg")

# Print the text from the image
print(textract.full_text())

Tip

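This plugin supports method chaining, so the code above could also be written as:

from botcity.plugins.aws.textract import BotAWSTextractPlugin

# Instantiate the plugin, read the image and get its text in one chain
text = BotAWSTextractPlugin() \
    .read("otter_crossing.jpg") \
    .full_text()

# Print the text from the image
print(text)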
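
The same chain can be applied to many files in a loop. The sketch below shows one way to do that; `extract_all` is a hypothetical helper, not part of the plugin, and the `documents` folder in the usage comment is an assumed location. The extraction step is passed in as a function, so the loop can be tried without AWS credentials.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def extract_all(
    paths: Iterable[Path],
    extract_text: Callable[[str], str],
) -> Dict[str, str]:
    """Hypothetical helper: map each file name to the text extracted from it."""
    results: Dict[str, str] = {}
    for path in paths:
        results[path.name] = extract_text(str(path))
    return results


# Usage (requires the plugin to be installed and credentials to be configured):
# from botcity.plugins.aws.textract import BotAWSTextractPlugin
# texts = extract_all(
#     sorted(Path("documents").glob("*.jpg")),
#     lambda p: BotAWSTextractPlugin().read(p).full_text(),
# )
```

Results are keyed by file name only, so files with the same name in different folders would overwrite each other; key by the full path if that matters for your use case.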