COVID tracking API in AWS using AWS Lambda.

One of the most common tasks you will come across as an AWS developer is consuming data from an API (external or internal) in a Lambda function and then saving that data to either an S3 bucket or one of the AWS data stores.

In this article we are going to develop a Node.js Lambda that consumes data from the public COVID Tracking Project endpoint and then stores the JSON data in an S3 bucket. This data can later be used to create dashboards with Amazon QuickSight (not in scope here) or queried with Athena to run statistics.

The complete code is available on GitHub.

COVID 19 tracking project REST API for states data – https://covidtracking.com/api/states

Lambda Implementation

Make a REST API call using the https module and wrap it in a promise that the async handler awaits, so that your Lambda function doesn't exit before the data has been consumed from the REST API.

Convert the response into a JSON string.

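// At the top of the file
const https = require('https');

// Inside the async handler: accumulates the response body as chunks arrive
let dataString = '';

const response = await new Promise((resolve, reject) => {
    const req = https.get(STATES_API, (res) => {
        res.on('data', (chunk) => {
            dataString += chunk;
        });
        res.on('end', () => {
            try {
                resolve({
                    statusCode: 200,
                    // Parse and re-serialize to validate and pretty-print the JSON
                    body: JSON.stringify(JSON.parse(dataString), null, 4)
                });
            } catch (e) {
                // Invalid JSON: reject instead of throwing inside the callback
                reject(e);
            }
        });
    });

    req.on('error', (e) => {
        console.error(e);
        reject({
            statusCode: 500,
            body: 'Something went wrong!'
        });
    });
});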
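Note that the promise reports statusCode 200 whatever the endpoint actually returned. A minimal sketch of a stricter check, assuming a hypothetical helper named parseStatesResponse (it is not part of the original repo) that would be called from the 'end' callback with res.statusCode and the accumulated string:

```javascript
// Hypothetical helper: validate the HTTP status and the body before resolving.
// The function name and the error messages are illustrative assumptions.
function parseStatesResponse(statusCode, rawBody) {
  if (statusCode < 200 || statusCode >= 300) {
    throw new Error(`Unexpected status code: ${statusCode}`);
  }
  let parsed;
  try {
    parsed = JSON.parse(rawBody);
  } catch (e) {
    throw new Error(`Response is not valid JSON: ${e.message}`);
  }
  // Pretty-print with 4 spaces, the same indentation the Lambda uses.
  return JSON.stringify(parsed, null, 4);
}

// Example: a small, made-up payload in place of the real API response.
console.log(parseStatesResponse(200, '[{"state":"NY"}]'));
```

Inside the 'end' callback you would wrap the call in try/catch and pass any thrown error to reject, so the handler fails visibly instead of storing an error page in S3.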

Call putObject on the S3 client with the required parameters. Optionally, you can also use a folder (key prefix) inside your S3 bucket; in that case you will have to adjust the IAM permissions for the Lambda role (refer to the GitHub repo).

// Save JSON response into S3
    const key = "covid-data-" + Date.now() + ".json";
    const params = {
        // response.body is already a JSON string; stringifying it again would double-encode it
        Body: response.body,
        Bucket: BUCKET,
        Key: key
    };
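If you do want the optional folder (prefix) mentioned above, the key simply starts with the folder name. A small sketch, assuming a hypothetical helper buildPutParams and an illustrative prefix "covid/daily" (neither comes from the repo):

```javascript
// Hypothetical helper: build putObject parameters with an optional key prefix.
// "covid/daily" below is an illustrative folder name, not one from the repo.
function buildPutParams(bucket, jsonString, prefix = '', now = Date.now()) {
  // Strip leading/trailing slashes so the key never starts with "/"
  // and never gets a doubled slash before the file name.
  const cleanPrefix = prefix.replace(/^\/+|\/+$/g, '');
  const fileName = `covid-data-${now}.json`;
  return {
    Bucket: bucket,
    Key: cleanPrefix ? `${cleanPrefix}/${fileName}` : fileName,
    Body: jsonString,
    ContentType: 'application/json'
  };
}

const params = buildPutParams('my-bucket', '[]', 'covid/daily', 1586000000000);
console.log(params.Key); // covid/daily/covid-data-1586000000000.json
```

S3 has no real folders; the prefix is just part of the object key, which is why the IAM resource ARN has to cover it.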

Finally, the putObject wrapper, which turns the callback-style call into a promise.

const putObjectWrapper = (params) => {
  return new Promise((resolve, reject) => {
    s3.putObject(params, (err, result) => {
      // Settle exactly once: reject on error, otherwise resolve
      if (err) return reject(err);
      resolve(result);
    });
  });
};
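To see how the pieces fit together, here is a rough sketch of a handler that chains the fetch and the upload. To keep it runnable without the AWS SDK or network access, the two I/O steps are passed in as functions; createHandler, fetchStates and putObject are hypothetical names for illustration, and in the real Lambda they would be the HTTPS call and the putObjectWrapper shown above.

```javascript
// Hypothetical factory: returns a Lambda-style async handler.
// fetchStates: () => Promise<string>     (JSON text from the API)
// putObject:   (params) => Promise<any>  (e.g. putObjectWrapper)
function createHandler({ bucket, fetchStates, putObject }) {
  return async function handler() {
    try {
      const json = await fetchStates();
      const key = `covid-data-${Date.now()}.json`;
      await putObject({ Bucket: bucket, Key: key, Body: json });
      return { statusCode: 200, body: `Saved ${key}` };
    } catch (err) {
      // Log the cause so it shows up in CloudWatch Logs, then report failure.
      console.error(err);
      return { statusCode: 500, body: 'Something went wrong!' };
    }
  };
}

// Example with in-memory stubs instead of real network and S3 calls.
const saved = [];
const handler = createHandler({
  bucket: 'my-bucket',
  fetchStates: async () => '[{"state":"NY"}]',
  putObject: async (params) => { saved.push(params); }
});
handler().then((result) => console.log(result.statusCode, result.body));
```

Because both steps are awaited, the Lambda does not return until the object has been written or an error has been logged.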

This Lambda uses two environment variables:

  • DATA_BUCKET = your bucket name
  • STATES_API = https://covidtracking.com/api/states
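In a Node.js Lambda these values are available on process.env. The snippets in this article refer to BUCKET and STATES_API, so presumably they are read from the environment near the top of the file. A minimal sketch, assuming a hypothetical loadConfig helper that fails fast when a variable is missing:

```javascript
// Hypothetical helper: read the two settings and fail fast if one is missing.
function loadConfig(env = process.env) {
  const config = {
    bucket: env.DATA_BUCKET,
    statesApi: env.STATES_API
  };
  for (const [name, value] of Object.entries(config)) {
    if (!value) {
      throw new Error(`Missing environment variable for "${name}"`);
    }
  }
  return config;
}

// Example with an explicit object instead of the real process.env.
const config = loadConfig({
  DATA_BUCKET: 'my-bucket',
  STATES_API: 'https://covidtracking.com/api/states'
});
console.log(config.bucket); // my-bucket
```

Failing at start-up gives a clear message in the logs instead of a confusing S3 or HTTPS error later in the invocation.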

Below are my memory and timeout settings for the Lambda. I give it a few more seconds than it needs in case the COVID Tracking Project endpoint has latency issues. Adjust these values based on your findings from CloudWatch Logs.

[Screenshot: Lambda memory and timeout settings]

Do not forget to grant appropriate (least-privilege) IAM permissions to the Lambda role.

{
    "Effect": "Allow",
    "Action": "logs:CreateLogGroup",
    "Resource": "arn:aws:logs:us-east-1:xxx-ACCOUNT-NUM-xxx:*"
},
{
    "Effect": "Allow",
    "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents"
    ],
    "Resource": [
        "arn:aws:logs:us-east-1:xxx-ACCOUNT-NUM-xxx:log-group:/aws/lambda/consumeCOVID19DataLambda:*"
    ]
},
{
    "Effect": "Allow",
    "Action": [
        "s3:List*",
        "s3:Get*",
        "s3:Put*"
    ],
    "Resource": "arn:aws:s3:::bucket-name/*"
}

Let's set up a CloudWatch schedule to automate the Lambda so that it runs every day and saves the data into the S3 bucket. Go to the CloudWatch console, click on Events -> Rules, and create a new rule by selecting “Schedule” as the event source, as shown in the screenshot.

The cron expression is set to run the Lambda daily at 10 AM, every day of the week.

[Screenshot: CloudWatch schedule for the Lambda]
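For reference, CloudWatch Events cron expressions have six fields (minutes, hours, day of month, month, day of week, year) and are evaluated in UTC. The exact expression from the screenshot is not reproduced in the text, so the sketch below shows one expression that matches "daily at 10 AM", wrapped in the kind of parameters you might pass to the SDK's putRule call. The helper and the rule name are made up for illustration.

```javascript
// Hypothetical helper: parameters for a daily schedule rule.
function dailyRuleParams(name, hourUtc) {
  if (!Number.isInteger(hourUtc) || hourUtc < 0 || hourUtc > 23) {
    throw new RangeError('hourUtc must be an integer from 0 to 23');
  }
  return {
    Name: name,
    // Fields: minutes hours day-of-month month day-of-week year.
    // One of the two day fields must be "?".
    ScheduleExpression: `cron(0 ${hourUtc} * * ? *)`,
    State: 'ENABLED'
  };
}

console.log(dailyRuleParams('daily-covid-data', 10).ScheduleExpression);
// cron(0 10 * * ? *)
```

If you need 10 AM in your local time zone, convert the hour to UTC before building the expression.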

Also refer to my article on how to use an IAM permissions boundary to restrict broader access for developers.
