
I have a scenario where I need to upload a .csv file from a React frontend, using axios, to an AWS Lambda function written in Node.js, through AWS API Gateway. From there I will process the lines of the .csv file and save them to a database.

So far I've managed to get the data into the lambda function using the code below:

React Snippet

...
export const upload = async (file: FormData) => {
  return axios({
    method: "post",
    url: url,
    data: file,
    headers: { "Content-Type": "multipart/form-data" },
  })
    .then(function (response) {
      // handle success
      console.log(response);
      return response;
    })
    .catch(function (error) {
      // handle error
      console.log(error);
      return error;
    });
};

// Component click handler (distinct name so it doesn't clash with upload() above)
const handleUpload = async () => {
  setIsLoadng(true);

  if (selectedFile) {
    const formData = new FormData();

    formData.append("File", selectedFile);

    await upload(formData);

    setIsLoadng(false);
  } else {
    setIsLoadng(false);
    alert("Please choose a file to upload.");
  }
};
...

but it arrives encoded in base64. At this point, this is how I extract the data inside the lambda function:

AWS Lambda Function

exports.handler = async (event) => {

  try {
    console.log(`Event: ${JSON.stringify(event)}`);

    // Decode the base64-encoded body delivered by AWS API Gateway
    const buff = Buffer.from(event.body, "base64");

    console.log(`Buffer data decoded: ${buff}`);

    console.log(`Type: ${typeof buff}`);

    const response = {
      statusCode: 200,
      body: JSON.stringify("Some data here..."),
    };
    return response;
  } catch (err) {
    console.log(err);
    const response = {
      statusCode: 500, // report failures as errors rather than 200
      body: JSON.stringify(err.message),
    };
    return response;
  }
};

This is all fine and well; however, after decoding it I can't seem to parse it easily to JSON for processing, because the data is wrapped in a WebKit form boundary:

'------WebKitFormBoundary0YYQOz5kaQyRTGk4
Content-Disposition: form-data; name="File"; filename="file001.csv"
Content-Type: application/vnd.ms-excel
...
the actual .csv data
...
------WebKitFormBoundary0YYQOz5kaQyRTGk4--
'

I think that axios has formatted my data and sent the request correctly based on the Content-Type: multipart/form-data header. I also believe that AWS API Gateway encodes data like this into base64, which is what I receive in the lambda function. However, once I decode the data I get a string that includes all the header information and the "------WebKitFormBoundary0YYQOz5kaQyRTGk4" top and bottom bits.
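
For what it's worth, the proxy event also carries an isBase64Encoded flag, so the decode step can be guarded with something roughly like this (a minimal sketch):

// Minimal sketch: only base64-decode when API Gateway flags the body as encoded
const rawBody = event.isBase64Encoded
  ? Buffer.from(event.body, "base64").toString("utf8")
  : event.body;
// rawBody is still the raw multipart payload, boundaries and all, at this point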

Is there a better way to handle the .csv data so that I end up with a nicely parsed JSON object in my lambda function to process? So far I've been trying to hack together a string "find and concat" solution, but this feels so wrong.

Any help or guidance is sincerely appreciated!

  • "I've managed to get the data into the lambda function" - Show the code that does that. Commented Dec 6, 2021 at 14:59
  • Do you mean the axios function or the code inside the lambda function? Commented Dec 6, 2021 at 15:39
  • I mean the code that causes the un-parsed request body to appear in your AWS function. You're missing a processing step there. Commented Dec 6, 2021 at 15:41
  • I use the Buffer to decode the incoming data if that makes sense...? Commented Dec 6, 2021 at 16:11
  • The body is encoded, you need a thing called a body parser to sort it out. That's a bread-and-butter operation, you're just not doing it. Don't try to write your own, use an existing library. Usually web server frameworks do that transparently, you might have to plug it in as a middleware somewhere in your server-side code. Commented Dec 6, 2021 at 16:19

1 Answer


So I finally managed to parse the data received from AWS API Gateway. As Tomalak said, I needed a body-parsing library to get at the .csv data.

Here's what I did in the end:

  • Set up AWS API Gateway with proxy integration, which gave me the .csv data in a base64-encoded format in the lambda function.
  • Used Node's "Buffer" to decode the data from base64.
  • Parsed the decoded body with a library called "parse-multipart".
  • Converted the data of the parsed file part (I only had one file, so I took the part at index 0) to JSON using a library called "csvtojson".

From there I had a nicely formatted JSON object to work with. I didn't realise you could pull the boundary information directly from the Content-Type header, which you need in order to parse the data with the "parse-multipart" library.

Lambda Code Snippet

const multipart = require("parse-multipart");
const csv = require("csvtojson");

exports.handler = async (event) => {
  console.log(`Event: ${JSON.stringify(event)}`);

  // API Gateway (proxy integration) delivers the multipart body base64-encoded
  const body = Buffer.from(event.body, "base64");

  // The boundary sits in the Content-Type header (header casing can vary)
  const contentType = event.headers["Content-Type"] || event.headers["content-type"];
  const boundary = multipart.getBoundary(contentType);

  // parse-multipart returns an array of parts; I only upload one file, so take index 0
  const parts = multipart.Parse(body, boundary);
  const csvDataString = parts[0].data.toString("utf8");

  // csvtojson turns the CSV text into an array of row objects
  const csvData = await csv().fromString(csvDataString);
  console.log(csvData);
  // return a valid proxy response (here, the parsed rows)
  return { statusCode: 200, body: JSON.stringify(csvData) };
};
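
To give an idea of the shape csvtojson produces (hypothetical sample contents, purely for illustration), the first CSV line becomes the keys and each following line becomes an object, with values kept as strings by default:

const csv = require("csvtojson");

// Hypothetical sample, just to show the resulting structure
const sample = "name,qty\napples,3\npears,5";

csv()
  .fromString(sample)
  .then((rows) => {
    console.log(rows); // [ { name: 'apples', qty: '3' }, { name: 'pears', qty: '5' } ]
  });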