Amazon AWS - Textract
The BotCity plugin offers a simple way to interact with Amazon Textract.
The BotCity plugin for AWS Textract lets you quickly analyze and extract text from hundreds of documents, whether typed or handwritten.
Installation
pip install botcity-aws-textract-plugin
Importing the Plugin
After installing the package, import it into your code to start using its functions.
from botcity.plugins.aws.textract import BotAWSTextractPlugin
Setting up connection
Note
There are two ways to authenticate.
1. Creating the .aws folder in your home directory and adding the following two files to it.
# ~/.aws/config
[default]
region=<region_code>
# ~/.aws/credentials
[default]
aws_access_key_id=<your_aws_access_key_id>
aws_secret_access_key=<your_aws_secret_access_key>
2. Passing credentials in the class constructor.
# Using the `.aws` folder
textract = BotAWSTextractPlugin()
# Alternative using the credentials as constructor arguments
textract = BotAWSTextractPlugin(
region_name='<region_code>',
use_credentials_file=False,
access_key_id='<your_aws_access_key_id>',
secret_access_key='<your_aws_secret_access_key>',
)
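If you pass credentials through the constructor, you may prefer not to hardcode them in the script. The sketch below is one possible approach and is not part of the plugin: a hypothetical helper, `textract_kwargs_from_env`, collects the constructor arguments shown above from the standard AWS environment variables `AWS_DEFAULT_REGION`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY`. It assumes those variables are already set in your environment.

```python
import os


def textract_kwargs_from_env(env=None):
    """Build constructor keyword arguments from environment variables.

    Hypothetical helper for illustration only; it is not part of the plugin.
    """
    env = os.environ if env is None else env
    # Constructor argument name -> environment variable that supplies it.
    required = {
        "region_name": "AWS_DEFAULT_REGION",
        "access_key_id": "AWS_ACCESS_KEY_ID",
        "secret_access_key": "AWS_SECRET_ACCESS_KEY",
    }
    # Fail early with a clear message if anything is unset or empty.
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    kwargs = {name: env[var] for name, var in required.items()}
    # Tell the plugin not to look for the `.aws` folder.
    kwargs["use_credentials_file"] = False
    return kwargs
```

The resulting dictionary could then be unpacked into the constructor, for example `BotAWSTextractPlugin(**textract_kwargs_from_env())`.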
As a demonstration of the library, let's build a simple example together that will parse the text from the following image:
Reading text from the image
Now let's read the text from the image.
# Read the text from the image
textract.read("otter_crossing.jpg")
# Print the text from the image
print(textract.full_text())
The output should look like this:
CAUTION
Otters
crossing
for
next
6
miles
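The output above has one word per line. If a single line is more convenient, the text can be normalized with plain string operations. This is a minimal sketch, assuming `full_text()` returns a string like the one shown above; the `sample` value is copied from that output rather than produced by a real Textract call.

```python
# Sample copied from the output above; in real use this would come from
# textract.full_text() (assumed to return a plain string).
sample = "CAUTION\nOtters\ncrossing\nfor\nnext\n6\nmiles"

# split() with no arguments splits on any whitespace, including newlines,
# and drops empty pieces, so joining with a space yields a single line.
one_line = " ".join(sample.split())

print(one_line)  # CAUTION Otters crossing for next 6 miles
```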
Complete code
Let's take a look at the complete code:
# Import the plugin
from botcity.plugins.aws.textract import BotAWSTextractPlugin
# Instantiate the plugin using the `.aws` folder
textract = BotAWSTextractPlugin()
# Read the text from the image
textract.read("otter_crossing.jpg")
# Print the text from the image
print(textract.full_text())
Tip
This plugin supports method chaining, so the code above could be written as:
text = BotAWSTextractPlugin() \
.read("otter_crossing.jpg") \
.full_text()
# Print the text from the image
print(text)
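Method chaining generally works because each chained method returns the object itself. The toy class below illustrates the idea; `ToyReader` is hypothetical and does not reflect how the plugin is implemented. It also shows that wrapping the chain in parentheses is an alternative to backslash line continuations.

```python
class ToyReader:
    """Hypothetical stand-in used only to illustrate method chaining."""

    def __init__(self):
        self._text = ""

    def read(self, text):
        # Store the input and return self so another call can follow.
        self._text = text
        return self

    def full_text(self):
        # Final call in the chain: returns the stored text, not self.
        return self._text


# Parentheses let the chain span several lines without backslashes.
text = (
    ToyReader()
    .read("CAUTION Otters crossing")
    .full_text()
)
print(text)  # CAUTION Otters crossing
```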