How to edit a big CSV file?
Open the CSV in Row Zero. It's free and doesn't require a download. Row Zero works just like Excel or Google Sheets but won't crash or freeze when importing large CSV files. Row Zero is a fast, flexible CSV editor and viewer. Sort, filter, search (Ctrl + F), edit columns, rows, and values, merge CSVs, and export back to CSV. For more information, read on or skip to the sections below:
- Open a big CSV
- Clean and enrich CSVs
- Merge or separate CSVs
- Analyze a big CSV
- Export a CSV
- What are CSVs?
- What tools are best for editing big CSVs?
- What are common use cases for editing big CSVs?
- Conclusion
Open a big CSV
Before attempting any of the methods below, first import your data to Row Zero using these steps. More information about opening big CSVs can be found here.
- Click on the 'Data' menu
- Select 'upload from computer,' import from S3, or connect an S3 bucket.
- Review import details and determine whether to import into cells or as a data table. Then click 'Import.'
Row Zero as a CSV editor
A powerful CSV editor needs features that are optimized for large data and flexible enough to combine, transform, clean, enrich, and modify specific records. Below is a review of the Row Zero features you can use to edit your largest CSV files.
Clean and enrich CSVs
Cleaning and enriching large data sets requires finding specific records and deleting or modifying them. Row Zero provides three features to help clean and enrich your biggest CSV files: Search (Ctrl + F), Sort, and Filter. Instructions for each feature can be found in the bullets below.
Search (Ctrl + F)
- The Ctrl + F search function allows you to easily locate specific values or entries in a large CSV file. Search through data to make updates, corrections, or deletions to specific entries.
- To use the Ctrl + F feature, navigate to the sheet in your Row Zero workbook that contains the imported CSV file. Press Ctrl + F and the search feature will pop up in the upper right-hand corner. Enter the text or number you want to search for and Row Zero will identify and highlight every instance. Use the up and down arrows to jump to the previous or next occurrence.
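If you would rather script the same kind of lookup, the idea behind a search can be sketched in a few lines of Python: scan every cell and report where the term occurs. This is only an illustration using the standard library, not how Row Zero implements search; the function name `find_matches` and the sample data are hypothetical.

```python
import csv
import io


def find_matches(lines, term):
    """Return (row_number, column_name, value) for every cell containing term.

    `lines` is any iterable of CSV text lines (an open file works too).
    Matching is case-insensitive.
    """
    reader = csv.DictReader(lines)
    needle = term.lower()
    matches = []
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            # Skip missing cells and overflow fields, which are not strings.
            if isinstance(value, str) and needle in value.lower():
                matches.append((row_number, column, value))
    return matches


# Hypothetical sample data standing in for a large file.
sample = io.StringIO(
    "id,species,island\n"
    "1,Adelie,Torgersen\n"
    "2,Gentoo,Biscoe\n"
    "3,Adelie,Dream\n"
)
hits = find_matches(sample, "adelie")
for hit in hits:
    print(hit)
```

Because the reader streams one row at a time, the same function can be pointed at an open file of any size; only the matches are held in memory.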
Sort
- Sorting arranges the CSV file's content in ascending or descending order based on the values in a selected column. It can be helpful for identifying duplicates or anomalies.
- To sort your data in Row Zero, select your data with CTRL + A or CTRL + SHIFT and the arrow keys. Once your data is selected, select sort and drop-down arrows will appear in the header row. Click the header of the column you want to sort by and use the sort menu to choose the sort method. More detailed instructions can be found in the sort documentation here.
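For comparison, here is a rough Python sketch of sorting a CSV by one column. It assumes the whole file fits in memory, and the `sort_rows` helper and sample values are hypothetical; it is not a description of Row Zero's internals.

```python
import csv
import io


def sort_rows(lines, column, descending=False, numeric=False):
    """Return (header, rows) with rows ordered by the values in `column`."""
    reader = csv.DictReader(lines)
    rows = list(reader)  # sorting needs every row in memory

    def sort_key(row):
        value = row[column]
        # CSV values are text; convert when a numeric order is wanted.
        return float(value) if numeric else value

    rows.sort(key=sort_key, reverse=descending)
    return reader.fieldnames, rows


# Hypothetical sample data.
sample = io.StringIO(
    "id,body_mass_g\n"
    "1,3750\n"
    "2,3800\n"
    "3,3250\n"
)
header, ordered = sort_rows(sample, "body_mass_g", numeric=True)
print([row["id"] for row in ordered])
```

Once rows are sorted, equal values sit next to each other, which is why sorting is a handy first step when looking for duplicates.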
Filter
- Filtering isolates subsets of data that meet certain criteria. It simplifies the process of focusing on particular categories, which can be useful for detailed examination. It also makes it possible to delete specific data points from a CSV.
- To filter your data in Row Zero, select your data with CTRL + A or CTRL + SHIFT and the arrow keys. Once your data range is selected, select filter and drop-down arrows will appear in the header row. Click the header of the column you want to filter by and use the filter menu to choose the method. More detailed instructions can be found in the filter documentation here.
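The same idea can be sketched in Python: keep only the rows that meet a condition and drop the rest. The `filter_rows` helper and the sample columns below are hypothetical, and this is a minimal illustration rather than Row Zero's filter implementation.

```python
import csv
import io


def filter_rows(lines, keep):
    """Yield each row (as a dict) for which keep(row) is true."""
    for row in csv.DictReader(lines):
        if keep(row):
            yield row


# Hypothetical sample data: drop rows whose sex was not recorded.
sample = io.StringIO(
    "id,sex\n"
    "1,MALE\n"
    "2,FEMALE\n"
    "3,NA\n"
)
kept = list(filter_rows(sample, lambda row: row["sex"] != "NA"))
print([row["id"] for row in kept])
```

Since `filter_rows` is a generator, rows are processed one at a time, so the approach also works on files too large to load at once.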
Merge or separate CSVs
Merging or separating big CSVs requires features that can modify millions of records at once. Row Zero provides several features that are easy to use with giant CSV files. The first set of features is cut, copy, and paste. Traditionally these features are slow with big data sets, but Row Zero makes them easy to use at that scale. The second set of features is VLOOKUP and XLOOKUP. These spreadsheet functions are optimized to work with large CSVs and can make combining two or more data sets easy.
Cut, copy, and paste
- If you need to select specific rows or groups of columns to combine, merge or separate parts of the CSV, use cut, copy, or paste and quickly modify the CSV. Row Zero can cut, copy, and paste millions of rows at once.
- To use cut, copy, or paste, right-click and select the option from the context menu. Alternatively, use the keyboard shortcuts CTRL + X for cut, CTRL + C for copy, and CTRL + V for paste. Use copy, cut, and paste on millions of rows to create the data set you want.
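As a scripted point of comparison, copying the rows of one CSV underneath another can be sketched in Python as follows. The sketch assumes both files share exactly the same header; the `stack_csvs` name and sample data are hypothetical.

```python
import csv
import io


def stack_csvs(sources):
    """Append the rows of several CSV sources that share one header."""
    header = None
    combined = []
    for lines in sources:
        reader = csv.reader(lines)
        this_header = next(reader, None)
        if this_header is None:
            continue  # empty source, nothing to add
        if header is None:
            header = this_header
        elif this_header != header:
            raise ValueError(f"header mismatch: {this_header} != {header}")
        combined.extend(reader)  # remaining rows, header already consumed
    return header, combined


# Hypothetical sample data.
first = io.StringIO("id,island\n1,Torgersen\n")
second = io.StringIO("id,island\n2,Biscoe\n3,Dream\n")
header, rows = stack_csvs([first, second])
print(header)
print(rows)
```

Checking the headers before appending is a deliberate choice here: silently stacking files whose columns differ is one of the easier ways to corrupt a merged data set.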
VLOOKUP or XLOOKUP
- Combine data from multiple CSV files based on a unique identifier in one of the columns. The functions can be used to add or combine columns from separate datasets.
- To combine two large CSVs using VLOOKUP or XLOOKUP, import both CSVs to sheets using Row Zero's data import options. Once imported, it is best to use XLOOKUP, the successor to VLOOKUP, but Row Zero supports both at lightning-fast speeds. In the example pictures below, we were working with a 2.5M row data set. Identify which column the XLOOKUP will go in. Type your function using the normal XLOOKUP syntax. Once the formula executes in the first cell, double-click the lower right-hand corner of that cell and it will fill down the entire column instantly.
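To make the lookup idea concrete, here is a minimal Python sketch of an exact-match lookup between two CSVs on a shared identifier. It is an analogy, not the spreadsheet function itself: the `lookup_join` helper, the `not_found` placeholder, and the sample columns are all hypothetical.

```python
import csv
import io


def lookup_join(left_lines, right_lines, key, value_column, not_found=""):
    """Add `value_column` from the right CSV to each row of the left CSV.

    Exact-match only: the first right-hand row with a matching key wins,
    and left-hand rows with no match receive `not_found`.
    """
    lookup = {}
    for row in csv.DictReader(right_lines):
        lookup.setdefault(row[key], row[value_column])  # keep first match

    joined = []
    for row in csv.DictReader(left_lines):
        row[value_column] = lookup.get(row[key], not_found)
        joined.append(row)
    return joined


# Hypothetical sample data: orders enriched with customer names.
orders = io.StringIO("order_id,customer_id\n100,c1\n101,c2\n102,c9\n")
customers = io.StringIO("customer_id,name\nc1,Ada\nc2,Grace\n")
joined = lookup_join(orders, customers, "customer_id", "name", not_found="N/A")
print([row["name"] for row in joined])
```

Building the dictionary once and then probing it per row keeps the work roughly proportional to the combined size of the two files, rather than rescanning the lookup table for every row.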
Analyze a big CSV
If your CSV contains numeric or categorical data to be analyzed, Row Zero has graphing and pivoting features that can be used to quickly analyze millions of rows of data. Quickly aggregate and group data sets with pivot tables and graph the output or raw data to see what your data shows.
Graph
- Creating graphs provides visual representation of the data, making it easy to identify trends and anomalies.
- To graph a data set in Row Zero, select the data set range with CTRL + A or by selecting the columns to be graphed and hit the graph button in the top menu bar. Once the graph has been created, use the graphing menu to add or edit series, update formatting, or plot multiple axes. For more details on graphing capabilities in Row Zero, see the graphing documentation here.
Pivot Table
- Pivot tables allow you to summarize and consolidate data, making it easier to grasp overall trends and patterns based on various categories and groupings within a big data set.
- To pivot a big CSV in Row Zero, select the range with CTRL + A or CTRL + SHIFT and the arrow keys. Once the desired range is selected, right-click and select 'Pivot' from the context menu. Follow the steps to select the desired location and select values. For more detailed instructions view the Row Zero pivot documentation here.
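At its simplest, a pivot table groups rows by one column and aggregates another. The Python sketch below shows that single-column, sum-only case under the assumption that every value in the aggregated column is numeric; the `pivot_sum` helper and sample data are hypothetical.

```python
import csv
import io
from collections import defaultdict


def pivot_sum(lines, group_by, value_column):
    """Sum `value_column` for each distinct value of `group_by`."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[group_by]] += float(row[value_column])
    return dict(totals)


# Hypothetical sample data.
sample = io.StringIO(
    "island,body_mass_g\n"
    "Torgersen,3750\n"
    "Torgersen,3800\n"
    "Biscoe,3250\n"
)
summary = pivot_sum(sample, "island", "body_mass_g")
print(summary)
```

Only the running totals are kept in memory, so this streaming approach scales with the number of groups rather than the number of rows.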
Export a big CSV
After the big CSV has been merged, combined, de-duped, cleaned, or enriched, export your new CSV from Row Zero by going to File > Download as CSV to save it to your computer.
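If you finish the job in a script instead, writing rows back out as CSV might look like the sketch below. It writes to an in-memory buffer so the result can be inspected; the `write_csv` helper and sample row are hypothetical.

```python
import csv
import io


def write_csv(rows, fieldnames, out):
    """Write dict rows to `out` as CSV, header first."""
    # lineterminator="\n" overrides the csv module's default of "\r\n".
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample row; note the comma inside the value.
rows = [{"id": "1", "stage": "Adult, 1 Egg Stage"}]
buffer = io.StringIO()
write_csv(rows, ["id", "stage"], buffer)
print(buffer.getvalue())
```

The writer quotes the value that contains a comma so the file still parses into two columns. To write to disk, pass a file opened with `open(path, "w", newline="")` in place of the buffer, as the `csv` module documentation recommends.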
What are CSVs?
CSV stands for comma-separated values. CSV files contain data separated by commas, with one record per line, as shown in the example below.
"ID","studyName","Sample Number","Species","Region","Island","Stage","Individual ID","Clutch Completion","Date Egg","Culmen Length (mm)","Culmen Depth (mm)","Flipper Length (mm)","Body Mass (g)","Sex","Delta 15 N (o/oo)","Delta 13 C (o/oo)","Comments"
"1","PAL0708",1,"Adelie Penguin (Pygoscelis adeliae)","Anvers","Torgersen","Adult, 1 Egg Stage","N1A1","Yes",2007-11-11,39.1,18.7,181,3750,"MALE",NA,NA,"Not enough blood for isotopes."
"2","PAL0708",2,"Adelie Penguin (Pygoscelis adeliae)","Anvers","Torgersen","Adult, 1 Egg Stage","N1A2","Yes",2007-11-11,39.5,17.4,186,3800,"FEMALE",8.94956,-24.69454,NA
"3","PAL0708",3,"Adelie Penguin (Pygoscelis adeliae)","Anvers","Torgersen","Adult, 1 Egg Stage","N2A1","Yes",2007-11-16,40.3,18,195,3250,"FEMALE",8.36821,-25.33302,NA
"4","PAL0708",4,"Adelie Penguin (Pygoscelis adeliae)","Anvers","Torgersen","Adult, 1 Egg Stage","N2A2","Yes",2007-11-16,NA,NA,NA,NA,NA,NA,NA,"Adult not sampled."
"5","PAL0708",5,"Adelie Penguin (Pygoscelis adeliae)","Anvers","Torgersen","Adult, 1 Egg Stage","N3A1","Yes",2007-11-16,36.7,19.3,193,3450,"FEMALE",8.76651,-25.32426,NA
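One detail in the example is worth a closer look: the value "Adult, 1 Egg Stage" contains a comma, and the surrounding quotes are what keep it in a single column. The short Python sketch below parses a trimmed-down version of the first record (four of the eighteen columns, chosen here only for brevity) and compares a CSV-aware parser with a naive split on commas.

```python
import csv
import io

# Header and first record from the example, trimmed to four columns.
text = (
    '"ID","Species","Stage","Body Mass (g)"\n'
    '"1","Adelie Penguin (Pygoscelis adeliae)","Adult, 1 Egg Stage",3750\n'
)

# A CSV-aware parser respects the quotes around the Stage value.
rows = list(csv.reader(io.StringIO(text)))
header, first = rows
print(len(first), first[2])

# Splitting on every comma breaks the quoted value into two pieces.
naive = text.splitlines()[1].split(",")
print(len(naive))
```

This is why a plain text split is rarely a safe way to process CSVs, even though the format looks simple.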
CSVs are a common file format because their simple schema makes them easy to process and they can be opened by a large number of software products. That simplicity makes CSVs an attractive choice for storing any dataset, but particularly large ones. CSVs are a common export format for ERPs, CRMs, databases, and business intelligence tools. The challenge is that when an application exports a giant CSV, greater than 1 million rows, very few applications make it easy to edit the contents.
What tools are best for editing big CSVs?
The reality is there are not many tools that make it easy to edit big CSVs. Spreadsheets like Google Sheets and Microsoft Excel make it very easy to edit CSVs but those applications were not designed to handle large data sets, which have now become common in every job function. If you try to import a giant CSV into Google Sheets, you'll get an error message like the one below. You can read more about the Google Sheets row limit here.
Importing a big CSV, greater than 1,048,576 rows, into Excel will truncate all the rows beyond the Excel row limit. You can import larger CSVs into Excel using the Data Model, Power Query, or Power Pivot, but those features are complicated and require some additional technical skills. Once CSVs get too big for legacy spreadsheets, there aren't many other applications that are easily interactive for editing purposes. Data analysts, business intelligence professionals, and software engineers are likely to use SQL or Python in tools like SQL databases or Jupyter Notebook. Those solutions require programming knowledge. Other alternatives are traditional BI tools, like Tableau, Power BI, and Looker, which were designed for big data sets but are better suited for pivot table style aggregations, not transformations or editing of the data.
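Before opening a file in Excel, it can help to check whether it would be truncated. The Python sketch below counts the records in a CSV and compares the total with the 1,048,576-row limit mentioned above. The helper names are hypothetical, and the small `limit` values in the demo exist only so the behavior can be shown on a tiny sample.

```python
import csv
import io

EXCEL_MAX_ROWS = 1_048_576  # Excel worksheet row limit cited in the text


def count_records(lines):
    """Count CSV records, header included, without loading the file."""
    # csv.reader (rather than counting raw lines) keeps a quoted field
    # that contains a line break from being counted as two records.
    return sum(1 for _ in csv.reader(lines))


def exceeds_excel_limit(lines, limit=EXCEL_MAX_ROWS):
    """Return True if the CSV has more records than `limit`."""
    return count_records(lines) > limit


# Hypothetical sample: three records, one with a line break inside quotes.
sample_text = 'id,comment\n1,"two\nlines"\n2,ok\n'
print(count_records(io.StringIO(sample_text)))
print(exceeds_excel_limit(io.StringIO(sample_text), limit=2))
```

For a real file, pass `open(path, newline="")` instead of the in-memory sample; the count streams through the file, so memory use stays flat regardless of size.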
A previous post, How to Open Big CSVs, covered 5 options for opening big CSVs. Of those options, Row Zero is the fastest and most interactive tool for editing big CSVs because it works just like Excel and Google Sheets but won't crash or freeze when importing more than 1 million rows.
What are common use cases for editing big CSVs?
Across business functions, there is a wide range of reasons why you may need to edit or transform CSV files. Often exports from one application need to be explored, cleaned or transformed, and imported into other platforms. Some common use cases are:
- Marketing: cleaning and enriching lists of contacts from CRMs
- Operations: combining logistics data from a 3PL with ecommerce data
- Manufacturing: searching for anomalies within production data
- Finance: opening general ledger transaction data or exports from ERPs
- Healthcare: reviewing and analyzing a list of insurance claims
Conclusion
In conclusion, editing and exploring large CSV files can be a complex task if you don't have the right tool. While CSVs are a common and simple file format, challenges arise when handling giant CSV files. Row Zero provides a seamless solution for these challenges, offering a blazingly fast spreadsheet designed for big data.
Row Zero easily imports your CSV files and provides a variety of tools to make your data management more straightforward. As a spreadsheet, it is an extremely powerful CSV viewer and editor.
Additionally, all the features that make Row Zero great at editing big CSVs are also useful for working with large data sets stored in data warehouses. For more information on connecting Row Zero to your data warehouse, view all of Row Zero's data import options.