Data Cleaning in Row Zero:

Typical spreadsheet data cleaning and editing

Row Zero generally works like Excel and Google Sheets and supports the typical data cleaning and editing options you would expect in a spreadsheet, such as deleting, adding, editing, formatting, sorting, and filtering cells, columns, and rows. Row Zero also has advanced data cleaning features like remove duplicates, conditional formatting, formulas for finding outliers, and a native Python window for writing custom functions and importing Python packages.

Remove Duplicates

Row Zero makes it easy to remove duplicate values in a sheet or range. Here's how to remove duplicates:

  1. Select the range of cells you would like to remove duplicates from. If you don't select a range of cells, the feature will select the entire sheet by default, which is generally recommended.
  2. There are two ways to access the feature:
    • In the header menu, go to Data > Remove duplicates.
    • Click the remove duplicates icon in the formatting menu in the header. Note: Depending on your screen size and zoom level, you may need to click the 3 dots to expand the menu to access remove duplicates.
  3. Select the column(s) that you want to remove duplicates from. If you select multiple columns, only rows that have duplicate values in every selected column will be removed. Note that all columns are selected by default, but in many instances you may want to select just one or two columns.
  4. Click Apply and all matching duplicate rows will be deleted from the sheet.
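
The effect of the column selection in step 3 can be illustrated outside the product with a short Python sketch. This is not Row Zero's implementation: the function name remove_duplicates, the sample rows, and the choice to keep the first occurrence of each duplicate are all assumptions made for illustration.

```python
def remove_duplicates(rows, key_columns):
    """Return rows with duplicates removed, comparing only key_columns.

    Assumption for this sketch: the first occurrence of each duplicate is kept.
    """
    seen = set()
    unique_rows = []
    for row in rows:
        # Two rows count as duplicates only if they match in every key column.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data.
rows = [
    {"email": "ann@example.com", "city": "Austin"},
    {"email": "ann@example.com", "city": "Boston"},
    {"email": "ann@example.com", "city": "Austin"},
    {"email": "bob@example.com", "city": "Austin"},
]

by_email = remove_duplicates(rows, ["email"])
by_email_and_city = remove_duplicates(rows, ["email", "city"])
```

Selecting only the email column leaves one row per address, while selecting both columns removes only the row that repeats in both email and city. This is why choosing fewer columns usually removes more rows.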

Read more about finding duplicates and unique values in your sheet.

Conditional Formatting

The conditional formatting feature makes it easy to find and highlight data that meets certain conditions. You can find particular values, outliers, empty cells, duplicate cells, etc.

Finding outliers

In addition to conditional formatting, Row Zero offers several ways for finding outliers in your data:

Sort and Filter

Sorting and filtering works just like traditional spreadsheets. You can sort ascending or descending and filter by condition or category.

Formulas

Row Zero offers a large library of built-in spreadsheet functions and formulas that are Excel-compatible, as well as some formulas unique to Row Zero. Several functions are useful for finding outliers, including MAX, MAXIFS, MIN, and MINIFS. Other conditional and error-handling functions like IFS, SUMIFS, COUNTIFS, AVERAGEIFS, IFNA, and IFERROR can be useful for cleaning and analyzing data. View all spreadsheet functions.
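
To make the idea behind a conditional maximum or minimum concrete, here is a minimal Python sketch of what a single-criterion MAXIFS or MINIFS style lookup computes. It is an illustration only, not how Row Zero evaluates formulas: the helper names max_ifs and min_ifs and the sample data are hypothetical, and returning None when nothing matches is a choice made for this sketch.

```python
def max_ifs(values, criteria_values, criterion):
    """Largest value whose paired criteria value equals criterion.

    Returns None when no row matches (a choice made for this sketch).
    """
    matches = [v for v, c in zip(values, criteria_values) if c == criterion]
    return max(matches) if matches else None


def min_ifs(values, criteria_values, criterion):
    """Smallest value whose paired criteria value equals criterion."""
    matches = [v for v, c in zip(values, criteria_values) if c == criterion]
    return min(matches) if matches else None


# Hypothetical sample data: one region label per order amount.
regions = ["East", "West", "East", "West"]
amounts = [120, 95, 4800, 110]

east_max = max_ifs(amounts, regions, "East")
east_min = min_ifs(amounts, regions, "East")
```

A large gap between the conditional maximum and minimum for one category, as with the East values here, is a quick signal that the category may contain an outlier worth inspecting.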

Pivot tables

Pivot tables are an easy way to summarize data and either find or ignore outliers. Row Zero pivot tables support the typical value calculations like Max, Min, Median, Standard Deviation, etc. as well as some new calculations like Count Unique and Percentiles that can be useful for summarizing dirty data or accounting for outliers.
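
The sketch below shows, using Python's standard statistics module, why calculations such as Median, Percentiles, and Count Unique are useful when a column contains an outlier. It is not Row Zero's pivot table implementation: the sample values are hypothetical, and the percentile interpolation method used here may differ from the one Row Zero uses.

```python
import statistics

# Hypothetical column of values with one obvious outlier (500).
values = [10, 12, 11, 13, 12, 500]

# statistics.quantiles with n=10 returns the 9 cut points between deciles;
# the last cut point is the 90th percentile.
deciles = statistics.quantiles(values, n=10, method="inclusive")

summary = {
    "max": max(values),
    "min": min(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "p90": deciles[-1],
    "count_unique": len(set(values)),
}
```

In this sample the mean is pulled far above the typical value by the single outlier, while the median stays close to the bulk of the data, which is the reason to prefer median or percentile summaries for dirty data.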

Python functions and packages

The native Python window can be used to create custom spreadsheet functions and import Python packages to help clean, transform, and analyze data. Here's an example Python function to extract the domain portion of an email in your spreadsheet:

def extract_domain(email: str) -> str:
    if "@" in email:
        return email.split("@")[1]
    else:
        return "Invalid email"

# For this example, type '=extract_domain(A1)' in a spreadsheet cell and it will extract the domain portion of the email in cell A1.
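
Real-world email columns often contain stray whitespace, mixed case, or malformed entries. The variant below is a minimal sketch of a stricter cleaner built on the same idea. The name extract_domain_strict is hypothetical, and it is assumed (not confirmed here) that it could be called from a cell the same way as extract_domain.

```python
def extract_domain_strict(email: str) -> str:
    """Return the lowercased domain of an email, or "Invalid email"."""
    cleaned = email.strip()
    # rpartition splits on the last "@", so the text after it is the domain.
    local_part, separator, domain = cleaned.rpartition("@")
    if not separator or not local_part or not domain:
        return "Invalid email"
    return domain.lower()
```

Lowercasing the result means that values such as Example.COM and example.com group together when you later sort, filter, or remove duplicates on the domain column.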


Searching with Ctrl + F

The Ctrl + F shortcut makes it easy to find specific values in your sheet. Press Ctrl + F to open the Find feature and search for a specific value.

Cleaning and editing big files

Row Zero's big data power makes it easy to open, clean, and edit big CSV files and other file formats like TXT, Parquet, JSONL, GZ, and XLSX. If you're on a paid plan, you can download your workbooks as a CSV at any time or import directly into your database (e.g. Postgres) or data warehouse (e.g. Snowflake). This can be a convenient way to preview and clean data before importing to your ecosystem.

Read more about data cleaning techniques here.