dizlv, 2015-09-03 16:56:33
Hadoop

How do I export data from DynamoDB and then modify it in AWS Data Pipeline?

We store logs in DynamoDB. The task is to use AWS Data Pipeline to select the logs from a DynamoDB table that do not contain any of a set of strings ("bot", "python", "requests", etc.) and write them to another DynamoDB table. Both tables have the same design; the only difference is that the first holds the "dirty" logs and the second holds the "clean" ones.
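
To make the filtering rule concrete, here is a minimal sketch in plain Python. It is not Data Pipeline or Hive code, only the predicate the pipeline would need to apply. The attribute name `user_agent` and the case-insensitive substring match are assumptions for illustration; the real tables may use a different attribute.

```python
# Substrings that mark a log entry as "dirty" (taken from the list above; extend as needed).
BLOCKED_SUBSTRINGS = ("bot", "python", "requests")


def is_clean(log_item, field="user_agent"):
    """Return True if the chosen field contains none of the blocked substrings.

    `field` is a hypothetical attribute name; case-insensitive matching is an assumption.
    """
    value = str(log_item.get(field, "")).lower()
    return not any(s in value for s in BLOCKED_SUBSTRINGS)


def filter_clean(log_items, field="user_agent"):
    """Keep only the items that would be written to the "clean" table."""
    return [item for item in log_items if is_clean(item, field)]
```

For example, given items whose `user_agent` values are "Mozilla/5.0", "python-requests/2.7" and "Googlebot/2.1", `filter_clean` keeps only the first one.
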
As far as I understand so far, I need to create two DynamoDBDataNode objects, one for input and one for output (which Data Format should I set?), and use HiveCopyActivity to run the query (how?). I tried to set this up, but the runs crash with various errors that do not carry any useful or recognizable information for me.
Does anyone have a ready-made recipe, or at least a rough description of how to do this? The official documentation is very superficial and does not answer the questions that came up during my research.
Thank you.
