This article shows how to store the rows of a Pandas DataFrame in DynamoDB using its batch write operations.
First, we have to create a DynamoDB client:
import boto3

# Credentials are left blank here on purpose; in practice, let boto3
# pick them up from your environment or AWS configuration instead of
# hardcoding them in the script.
dynamodb = boto3.resource('dynamodb', aws_access_key_id='', aws_secret_access_key='')
table = dynamodb.Table('table_name')
When the connection handler is ready, we must create a batch writer using the with statement:
with table.batch_writer() as batch:
    pass  # we will change that
Now, we can create an iterator over the Pandas DataFrame inside the with block:
with table.batch_writer() as batch:
    for index, row in df.iterrows():
        pass  # to be changed
Inside the loop, we extract the fields we want to store in DynamoDB and put them in a dictionary:
with table.batch_writer() as batch:
    for index, row in df.iterrows():
        content = {
            'field_A': row['A'],
            'field_B': row['B']
        }
        # there is still something missing
In the end, we use the put_item function to add the item to the batch:
with table.batch_writer() as batch:
    for index, row in df.iterrows():
        content = {
            'field_A': row['A'],
            'field_B': row['B']
        }
        batch.put_item(Item=content)
The batch writer buffers the items and sends them to DynamoDB in batches as the buffer fills up; when our code exits the with block, it flushes any remaining items.
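One practical caveat worth knowing before running the code above: the boto3 DynamoDB resource API rejects Python floats, so numeric DataFrame columns usually need converting to Decimal first, and DynamoDB cannot store NaN values at all. Below is a minimal sketch of a conversion helper under those assumptions (the column names are just illustrative):

```python
from decimal import Decimal

import pandas as pd


def row_to_item(row):
    """Convert a DataFrame row to a DynamoDB-compatible dict.

    Floats become Decimal (boto3's DynamoDB serializer rejects float)
    and NaN values are skipped, since DynamoDB does not store them.
    """
    item = {}
    for key, value in row.items():
        if pd.isna(value):
            continue  # DynamoDB has no NaN; drop missing values
        if isinstance(value, float):
            value = Decimal(str(value))  # round-trip via str for an exact value
        item[key] = value
    return item


df = pd.DataFrame({'A': [1.5, 2.0], 'B': ['x', 'y']})
items = [row_to_item(row) for _, row in df.iterrows()]
```

With a helper like this, the loop body becomes batch.put_item(Item=row_to_item(row)) instead of building the dictionary by hand.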