Row Zero is now in public beta

← Back

Connect S3 Bucket to Spreadsheet for Data Analysis

05-16-2023 // Nick End, Founder

The easiest way to quickly analyze data stored in S3 is with Row Zero, a blazingly fast spreadsheet with native S3 connectivity.

Skip straight to instructions to connect a S3 bucket.

What is AWS S3?

Amazon S3 (Simple Storage Service) is a cloud-based object storage service provided by Amazon Web Services (AWS). It enables individuals and organizations to store and retrieve data, images, videos, and other files on the internet.

AWS S3 is designed to be highly scalable, durable, and secure. The service allows users to store unlimited amounts of data in virtually any format, with high availability and reliability.

Some common use cases for Amazon S3 include backup and disaster recovery, data archiving, content storage, distribution, and web hosting. Many businesses also use Amazon S3 as a data lake for big data analytics and machine learning.

Challenges with analyzing data in S3

While S3 makes it incredibly easy to store large amounts of data, it can be challenging to analyze data stored in S3 for 3 reasons:

Dataset Size

Processing large datasets is a challenge for spreadsheets, like Google Sheets and Microsoft Excel, and requires a more powerful spreadsheet. Additionally, exporting data from S3 to a local machine and then uploading to a hosted application is a slow and tedious process when datasets are large. It is much easier to connect directly to a S3 bucket and browse for files that open in your analytics tool of choice.

Access Control

S3 provides various access control mechanisms, such as IAM policies and bucket policies, which can restrict or limit access to data stored in S3. This can make it challenging to ensure that users have the necessary permissions to access and analyze the data they need.

Data Integration

Data stored in S3 may need to be integrated with other data sources before analysis. This can require additional processing and transformation steps to ensure data consistency and accuracy.

Analyze S3 Data In a Spreadsheet

There are 3 steps to easily analyze S3 data in a spreadsheet. Follow the instructions below to get started.

  1. Connect a S3 bucket to Row Zero.
  2. Browse the S3 bucket to select file.
  3. Open file in Row Zero.

Connect to a S3 bucket

The following instructions explain how to create an IAM role in AWS S3 and grant permissions for a Row Zero workbook to connect a specific S3 bucket and ingest data. To begin the process, in the Data menu, select Import from Amazon S3. Import from Amazon S3 You will see a link to the public datasets S3 bucket and a button in the upper right hand corner that says + connect bucket. Click the button to connect a new bucket. The public datasetst bucket is available to demo the integration and Row Zero features. Connect Bucket

The first step is indentifying the bucket name you intend to connect to Row Zero and entering the name in the provided text box. Bucket name Next, provide the Amazon Resource Name (ARN). To identify the name, follow the instructions below to create an IAM role for Row Zero and grab the ARN from the AWS console. At the top of the screen your Row Zero AWS Account ID and your External ID will be visible. Both will be needed to create the IAM role. ID Numbers

Create an IAM Role

  1. Log in to the AWS Management Console

  2. Select AWS Account

  3. Select Another AWS Account and enter 732940336628 as the account ID

  4. Select Require External Id and enter Your Account ID

  5. Click Next Enter AWS account IDs

  6. Select Create Policy (this will open a new tab) AWS create policy

  7. Select the tab that says JSON and replace the text with this: aws json text

  8. Click Next: Tags

  9. Click Next: Review

  10. Give the policy a name like 'RowZero-myS3bucket-ReadOnly'

  11. Click Create policy and close this browser tab AWS Policy Name

  12. On the first browser tab, refresh the list of policies

  13. Search for the policy you just created (e.g. RowZero-myS3bucket-ReadOnly), click the checkbox to its left, and click Next AWS permission policies

  14. Give the role a name, like RowZero-myS3bucket-ReadOnly

  15. Scroll to the bottom of the page, and click Create Role AWS create role

  16. Click View Role at the top of the screen and copy the role ARN and paste it in the text box at the top of the window. Click Next. AWS view role Copy ARN number Paste ARN

Import Data

After completing the above instructions your S3 bucket is connected to Row Zero. Now when selecting the Data tab from the top menu, you can easily browse your S3 bucket and select a file, which will import to Row Zero where you can begin working with it. Even if the file is large, Row Zero is a blazingly fast spreadsheet designed to support large file sizes down. To see the product in action, try it here.

Browse S3