# Supported Formulas

Arcwise allows you to use almost all formulas on arbitrarily large data. Formulas can be used directly on Arcwise connected data: [Adding Columns](/using-arcwise-in-sheets/arcwise-connected-data/adding-columns.md). Formulas can also be used in your Google Sheet.

{% hint style="info" %}
To refer to other columns in formulas you can either use the name of the column (e.g. `ship_date`) or spreadsheet-style `A:A`, `B:B` notation. Clicking on columns while the formula editor is open will automatically insert a reference to the column in the editor.
{% endhint %}

Examples formulas:

* `MONTH(created_date)` will extract the month from a date column
* `MID(name, FIND(" ", name), LEN(name))` would extract the last name from a string column containing full names (e.g. "Firstname Lastname")

## Column Formulas and Operators

A limited subset of spreadsheet functions are supported in formula fields. These will be automatically translated to their SQL equivalent. Standard arithmetic operators are also supported (+, -, /, \*, ^).

**Logical functions**

* `AND(x, y)`
* `IF(logical_expression, value_if_true, value_if_false)`
* `IFNA(value, value_if_na)`
  * NOTE: this actually checks against `NULL` values in the database
  * **Warning: IFERROR is not currently supported. For the common 'catching division by zero errors' use case, you can instead use:** `DIV0(a, b)` in Snowflake and `SAFE_DIVIDE(a, b)` in BigQuery, which will result in a NULL (empty) value instead of an error.
* `NOT(x)`
* `OR(x, y)`

**String functions**

* `DATEVALUE(string)`
* `FIND(needle, haystack, [starting_position = 1])`
* `LEFT(string, number_of_characters)`
* `LEN(string)`
* `LOWER(string)`
* `INT(string)`
  * Converts a string to an integer (rounding to the nearest integer)
* `MID(string, starting_at, extract_length)`
* `REGEXEXTRACT(text, regular_expression)`
  * NOTE: The REGEXEXTRACT function of Google Sheets will populate multiplle columns with multiple match groups. Our implementation will always return the entire matching part of the subject. Different warehouse types (Snowflake, Bigquery, PostgreSQL) have each their own implementations of REGEX. When writing REGEX, expect the specific behavior from the warehouse that your data is connected to, as we execute your REGEX without any processing on that data warehouse.&#x20;
* `RIGHT(string, number_of_characters)`
* `TRIM(string)`
* `UPPER(string)`
* String concatenation - both `&` and `CONCATENATE(string1, [string2, ...])`

**Numerical functions**

* `INT(value)`
* `MAX(value1, [value2, ...])`
* `MIN(value1, [value2, ...])`
* `RAND()`
* `RANDBETWEEN(low, high)`
* `ROUND(value, [places])`
* `ROUNDDOWN(value, [places])`
* `ROUNDUP(value, [places])`
* `VALUE(val)`
  * Converts a value (typically a string) into a numeric value.
* **Standard math functions:** ABS, ACOS, ASIN, ATAN, COS, COT, DEGREES, EXP, FLOOR, LN, LOG, LOG10, MOD, PI, POWER, RADIANS, SIGN, SIN, SQRT, TAN

**Date functions**

* `DATEDIF(date1, date2)`
* `DATEVALUE(string)`
* `DAY(date)`
* `EOMONTH(date)`
* `EDATE(date, number_of_months)`
* `MONTH(date)`
* `WEEKDAY(date)`
* `WEEKNUM(date)`
* `YEAR(date)`

#### Advanced usage: Rolling/window functions

It may be helpful to skip straight to the examples (see below.)

* `ROLLINGSUM(value)`
  * Calculates the rolling sum of a value or column (requires a sort order to be added in the tab as well.)
* `ROLLINGSUMBY(value, group_column1, [group_column2, ...])`
  * Calculates the rolling sum of a column over all rows with the same group\_column1, group\_column2, etc.
* `ROLLINGMIN(value)` & `ROLLINGMINBY(value, group...)`
  * Calcululates the rolling **min** of a column (same syntax as above)
* `ROLLINGMAX(value)` & `ROLLINGMAXBY(value, group, ...)`
  * Calcululates the rolling **max** of a column (same syntax as above)
* `FILLDOWN(value)` & `FILLDOWNBY(value, group, ...)`
  * "Fills down" non-empty / non-NULL values in a column downwards. In other words, if the `value` argument is currently NULL or empty, then we will instead take the previous non-empty value of the same grouping (if one exists.)

We also have "non-rolling" versions of min/max/sum (`MINBY`, `MAXBY`, `SUMBY`) which calculate the overall min/max/sum over all rows within the same group.

**Examples**

This example table shows the result of various different "rolling calculations" using hypothetical `value` and `group` columns. (assuming that the table is already in sorted order.)

Providing a `group` basically means that all the rows within a certain group (e.g. group "a" or group "b") will have their rolling values calculated independently. You can provide multiple `group` columns to refine the grouping even further.

<table data-full-width="true"><thead><tr><th>value</th><th>group</th><th data-type="number">ROLLINGSUM(value)</th><th>ROLLINGMINBY(value, group)</th><th>FILLDOWNBY(value, group)</th><th data-type="number">SUMBY(value, group)</th></tr></thead><tbody><tr><td>1</td><td>a</td><td>1</td><td>1</td><td>1</td><td>3</td></tr><tr><td>(empty)</td><td>a</td><td>1</td><td>1</td><td>1</td><td>3</td></tr><tr><td>2</td><td>a</td><td>3</td><td>1</td><td>2</td><td>3</td></tr><tr><td>(empty)</td><td>b</td><td>3</td><td>(empty)</td><td>(empty)</td><td>4</td></tr><tr><td>4</td><td>b</td><td>7</td><td>4</td><td>4</td><td>4</td></tr><tr><td>(empty)</td><td>b</td><td>7</td><td>4</td><td>4</td><td>4</td></tr></tbody></table>

If `value` is in column A and `group` is in column B, here are the equivalent Google Sheets formulas:

(Assuming C2, D2, E2, F2 all start at $A2)

* `ROLLINGSUM(value)`
  * `C3 = $C2 + $A3` (drag down)
* `SUMBY(value, group)`
  * `D3 = SUMIF($B2:$B, $B3, $A2:$A)`
* `ROLLINGMINBY(value, group)`
  * `E3 = IF($B3=$B2, MIN($E3, $E2), $E3)` (drag down)
* `FILLDOWNBY(value, group)`
  * `F3 = IF($B3=$B2, IF($F3<>"", $F3, $F2), $F3)` (drag down)

#### Advanced usage: SQL functions

Though not officially documented, most SQL functions will work inside Arcwise formula fields. Consult the official docs for your warehouse.

Helpful functions that generally work across databases (WARNING: these often are DIFFERENT from the Google Sheets functions of the same name.)

* `REPLACE(source_string, search_string, replace_string)`
  * Example: `REPLACE("the quick fox", "quick", "slow")`&#x20;
* `REGEXP_REPLACE(subject, pattern, replacement)`
  * <https://docs.snowflake.com/en/sql-reference/functions/regexp_replace>
* `REGEXP_SUBSTR(subject, pattern[, position[, occurrence]])`
  * <https://docs.snowflake.com/en/sql-reference/functions/regexp_substr>

**Links to references:**

* Snowflake functions: <https://docs.snowflake.com/en/sql-reference/intro-summary-operators-functions>
* BigQuery functions: <https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators>

## Aggregation Formulas

{% hint style="info" %}
Aggregation formulas can not be used to create new columns in Arcwise connected data tabs: [Adding Columns](/using-arcwise-in-sheets/arcwise-connected-data/adding-columns.md). They are exclusivly used within Google Sheets: [Using Formulas in Sheets](/using-arcwise-in-sheets/formulas.md).
{% endhint %}

* `COUNT` / `COUNTIF` / `COUNTIFS`
* `COUNTUNIQUE / COUNTUNIQUEIFS`
* `SUM` / `SUMIF` / `SUMIFS`
* `AVERAGE` / `AVERAGEIF` / `AVERAGEIFS`
* `MIN` / `MINA` / `MINIFS`
* `MAX` / `MAXA` / `MAXIFS`
* `MEDIAN` / `MODE`
* `COUNTA` / `ROWS`
  * NOTE: Regardless of the column passed in, these will always return the full number of rows in the connected dataset.
* `LOOKUP` / `XLOOKUP`
  * Both LOOKUP & XLOOKUP are the same with Arcwise (and do not require "sorted data"). However, if there are multiple matches, one of the matches will be randomly selected.
  * Note: only the first three arguments (search\_key, lookup\_range, result\_range) are supported at the moment.
* `ARCFILTERIFS`
  * A cross between the Sheets `FILTER` and an `IFS` function (will return all matching rows in the table, but takes pairs of criteria like COUNTIFS/SUMIFS.)
* `ARCUNIQUE`
  * Returns all unique values in a column.
* `ARCQUERY`
  * Executes inline SQL. See the section below on [#arcquery](#arcquery "mention").

NOTE: Numeric aggregations like SUM, AVERAGE, MIN, MAX, etc. will not work with string columns at the moment.

### Notes on criteria functions (\*IF / \*IFS)

* For columns that are not strings in the database, we interpret the empty string criteria `""` to count the number of `NULL` values. Conversely, `"<>"` will count the number of non-null values.
  * Example: `=COUNTIF(ARCTable!A:A, "")` will count the number of `NULL` values in column A.
* Date columns can be compared against `YYYY/MM/DD, MM/DD/YYYY, YYYY-MM-DD, MM-DD-YYYY` criteria (or numeric spreadsheet dates like `44322`). Other formats may not work as expected.

### ARCQUERY

We expose a special formula, `ARCQUERY`, to allow inline SQL queries in a spreadsheet cell. Syntactically, the first argument(s) should be columns from Arcwise tables, and the second argument should be a SQL query. In the SQL query, use the **lowercase** source sheet name to reference the source Arcwise table. Example:

```cpp
=ARCQUERY(ArcTable!$A:$A, "SELECT * FROM arctable LIMIT 10")
```

You can pass in columns from multiple Arcwise sheets and query them all at once:

```cpp
=ARCQUERY(ArcTable!$A:$A, Sheet2!$B:$B, "SELECT * FROM arctable JOIN sheet2
 ON arctable.id = sheet2.id")
```

> Note: Sheets with spaces or other non-alphanumeric characters may need to be quoted. Usually this means the table name will be e.g. `"sheet 1"` (if you're on BigQuery, then use backticks: `` `sheet 1` ``). Note that double-quotes need to be escaped with **two** double-quotes:
>
> `=ARCQUERY('Sheet 1'!$A:$A, "SELECT * FROM ""sheet 1""")`

### Known limitations

When using Arcwise columns, the following functions are currently unsupported (as of July 2023). Please contact us if you have a use case that requires these!

* `ARRAYFORMULA` (and using array literals like `{1,2,3}` in general)
* `ROW` / `COLUMN` / `INDIRECT`
* `LET` / `LAMBDA`


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.arcwise.app/using-arcwise-in-sheets/supported-formulas.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
