Skip to content

Advanced Calculation Functions

HENGSHI SENSE provides several advanced calculation functions for period-over-period comparison, cumulative calculation, moving calculation, retention calculation, activity calculation, and more. The examples for these functions are all based on the advancedcal table in the database. Below is the data from advancedcal.

logindateusernameusertypeenergy
2019-01-05 00:00:00A1U110
2019-01-05 00:00:00A2U220
2019-01-05 00:00:00A3U130
2019-01-05 00:00:00A4U15
2019-01-05 00:00:00A5U115
2019-01-05 00:00:00A6U316
2019-01-05 00:00:00A7U318
2019-01-05 00:00:00A8U121
2019-01-06 00:00:00A2U222
2019-01-06 00:00:00A3U12
2019-01-06 00:00:00A1U112
2019-01-06 00:00:00A6U39
2019-01-06 00:00:00A5U18
2019-01-06 00:00:00A4U15
2019-01-07 00:00:00A5U124
2019-01-07 00:00:00A4U112
2019-01-07 00:00:00A3U116
2019-01-07 08:00:00A5U128
2019-01-09 00:00:00A8U127

bottom_n

Retain the bottom N values of a specified column within a given group and sorting order, and unify all other values as a special value.

  • Function Syntax
text
bottom_n(N, field1, constant1) over([ PARTITION BY expr1 ] ORDER BY expr2 [ DESC ])
  • Parameter Description

    • N: Required. The number of values to retain.
    • field1: Required. The name of the field whose values are to be retained.
    • constant1: Required. The value to represent non-retained values, which must be of the same data type as field1.
    • expr1: Optional. The grouping field. If a grouping field is specified, the retained values of field1 are calculated within each group.
    • expr2: Required. The sorting rule for field1, usually field1 itself or an aggregated value of another field.
  • Return Value Type

    Same as the data type of field1

  • Example 1

Group by userType, retain the bottom 2 usernames ranked by count({energy}), and represent all others as "Other".

text
{"kind":"formula","op":"summarize(dataset(2),{usertype} as t1, bottom_n(2, {username}, 'Other') over(PARTITION BY {usertype} order by count({energy}) asc) as test)"}

calculate

Pre-calculate some data and then use it together with the original data to compute the chart data. This is generally used when creating new metrics or for chart filtering. When used for metric calculation, it requires an appropriate aggregation function as its outer layer. The value of the calculate expression in a chart will change with subtotals; the calculation dimension of the detailed data and the subtotal dimension are different, which affects the value of the calculate expression.

  • Function Syntax
text
calculate(ARG1, [ARG2],... )
  • Parameter Description

    • ARG1: The expression required for pre-calculating the data
    • ARG2: Optional, can be multiple, represents the grouping fields in the chart to be removed
  • Return Value Type

    Any type

  • Example 1

For example, if the chart has dimensions prime_genre and month, and the measure is the sum of votes. The filter condition is to select the top three in the sum of votes grouped by prime_genre. The filter condition is written as follows:

text
calculate(rank() OVER (ORDER BY sum({votes}) ASC), {prime_genre}) < 4

calculatep

Pre-calculate some data and then compute the chart data together with the original data. This is generally used for creating new metrics or for chart filtering. When used for metric calculation, it should be wrapped with an appropriate aggregation function. The value of the calculatep expression in a chart will not be affected by subtotals.

  • Function Syntax
text
calculatep(ARG1, [ARG2],... )
  • Parameter Description

    • ARG1: The expression required for pre-calculating the data
    • ARG2: Optional, can be multiple, indicates the grouping fields in the chart to be removed
  • Return Type

    Any type

  • Example 1

For example, if the chart has dimensions prime_genre and month, and the metric is the sum of votes. The filter condition is to select the top three by the sum of votes grouped by prime_genre. The filter condition is written as follows:

text
calculate(rank() OVER (ORDER BY sum({votes}) ASC), {prime_genre}) < 4

calculatex

Pre-calculate some data, and then use it together with the original data to compute the chart data. This is generally used when creating new metrics or for chart filtering. When used for metric calculation, it requires an appropriate aggregation function as its outer layer. It is not controlled by dimensions, so it can also be used on dimensions.

  • Function Syntax
text
calculatex(ARG1, ARG2,... )
  • Parameter Description

    • ARG1: The expression required to pre-calculate the data
    • ARG2: Required, the grouping needed for the calculation expression.
  • Return Value Type

    Any type

  • Example 1

For example, the chart metric is the sum of votes. The filter condition is to select the top three in the sum of votes grouped by prime_genre. The filter condition is written as follows:

text
calculatex(rank() OVER (ORDER BY sum({votes}) ASC), {prime_genre}) < 4

continuous_retention

Continuous activity calculation under the time dimension. This function returns the number of continuously active records within a specified time period.

  • Function Syntax
text
continuous_retention(expression, date expression, lowExpr, upperExpr, period)
  • Parameter Description

    • expression: Required. The field to be calculated.
    • date expression: Required. The retention time dimension. The chart must include the same time dimension as the date expression in its dimensions to use this metric. The non-aggregated parameters of the summarize function must also include the same time dimension as the date expression in order to use this function in the aggregation parameters.
    • lowExpr: Required. The start of the retention/activity period.
    • upperExpr: Required. The end of the retention/activity period. It should be one day later than the actual end time.
    • period: Required. The activity period. Type: String. Optional values: day, week, month, quarter, year.
  • Return Value Type

    Number

  • Example 1

Calculate the number of users (username) who were continuously active on both yesterday and the day before yesterday.

  1. Add a new metric using the following advanced expression:
text
continuous_retention({username}, day({logindate}), dayadd(day({logindate}), -2), dayadd(day({logindate}), 0), 'day')
  1. Create a new chart, select {logindate} as the dimension, and then drag the above metric into the measures.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        0
      ],
      [
        "2019-01-06",
        0
      ],
      [
        "2019-01-07",
        3
      ],
      [
        "2019-01-09",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_34393f485a5d2eb3_1"
      }
    ]
  }
}
  • Example 2

Use HE to calculate the continuous activity rate of username for yesterday and the day before yesterday.

json
{
  "kind": "formula",
  "op": "summarize(dataset(1), day({logindate}), growth_rate(distinct_count({username}), continuous_retention({username}, day({logindate}), dayadd(day({logindate}), -2), dayadd(day({logindate}), 0), 'day')))"
}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        null
      ],
      [
        "2019-01-06",
        null
      ],
      [
        "2019-01-07",
        0
      ],
      [
        "2019-01-09",
        null
      ]
    ],
    "schema": [
      {
        "fieldName": "hs_new_field_1"
      },
      {
        "fieldName": "hs_new_field_2"
      }
    ]
  }
}

custom_filters

Custom Filter Calculation

  • Function Syntax
text
custom_filters(expression, [{field1},{field2}], condition)
  • Parameter Description

    • expression: Required. Must be a basic aggregation operation.
    • [{field1},{field2}]: Optional. null means not to remove any previous filters; [] means to remove all previous filters; field1/field2 means that if any of these fields are used in a filter, that filter will be removed.
    • condition: Optional. The new filter condition to be added when calculating the expression.
  • Return Value Type Same as the return type of expression

  • Example 1

Use a newly added metric to calculate the sum of energy, remove filters related to logindate from previous context, and add a new time filter.

  1. Add a new metric using the following advanced expression:
text
custom_filters(sum({energy}),[{logindate}],{logindate}='2000-01-01')

date_accumulate

This function returns the result of cumulative calculations on data within a specified time range. It can only be used when adding new metrics or in API queries.

  • Function Syntax
text
date_accumulate(function1(function2(expression)), date expression1, reset period)
  • Parameter Description

    • function1: Required. The type of cumulative calculation. Optional values: sum, avg, min, max.
    • function2: Required. The basic operation on the data, which must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
    • expression: Required. The field to be calculated.
    • date expression1: Required. The time dimension for the basic calculation. The chart must include a time dimension identical to date expression1 in its dimensions to use this metric. The non-aggregated parameters of the summarize function must also include a time dimension identical to date expression1 in order to use this function in the aggregation parameters.
    • reset period: Required. The repeat cycle for the cumulative calculation. The calculation will reset and start over at the beginning of each cycle. Optional values: hour, day, week, month, quarter, year.
  • Return Value Type Number

  • Example 1

Use a new metric to calculate the weekly cumulative average of energy.

  1. Add a new metric using the following advanced expression:
text
date_accumulate(sum(avg({energy})), day({logindate}), 'week')
  1. Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the average of {energy} as a measure, and then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        16.8750000000000000,
        16.8750000000000000
      ],
      [
        "2019-01-06",
        9.6666666666666667,
        26.5416666666666667
      ],
      [
        "2019-01-07",
        20.0000000000000000,
        20.0000000000000000
      ],
      [
        "2019-01-09",
        27.0000000000000000,
        47.0000000000000000
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_7cd2518b65642b48_1"
      },
      {
        "fieldName": "u_c3f773f19fcb0912_2"
      }
    ]
  }
}
  • Example 2

Use an HE query to calculate the weekly cumulative average of energy, only including data where userType='U1' in the cumulative calculation.

text
{"kind":"formula","op":"summarize(dataset(1), day({logindate}), avg({energy}), date_accumulate(sum(avg({energy}) filter (where {usertype} = 'U1')), day({logindate}), 'week'))"}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        16.875,
        16.2
      ],
      [
        "2019-01-06",
        9.6666666666666667,
        22.95
      ],
      [
        "2019-01-07",
        20,
        20
      ],
      [
        "2019-01-09",
        27,
        47
      ]
    ],
    "schema": [
      {
        "fieldName": "hs_new_field_1"
      },
      {
        "fieldName": "hs_new_field_2"
      },
      {
        "fieldName": "hs_new_field_3"
      }
    ]
  }
}

date_compare

This function returns the period-over-period comparison value of data relative to a specified date.

  • Function Syntax
text
date_compare(function(expression), date expression1, date expression2)
  • Parameter Description

    • function: Required. The basic operation on the data, must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
    • date expression1: Required. The time dimension for the base calculation. The chart must include a time dimension identical to date expression1 in its dimensions to use this metric. The non-aggregated parameters of the summarize function must also include a time dimension identical to date expression1 in order to use this function in the aggregation parameters.
    • date expression2: Required. The comparison time dimension.
  • Return Value Type

    Number

  • Example 1

Use a new metric to calculate the comparison value of the sum of energy from two days ago.

  1. Add a new metric using the following advanced expression:
text
date_compare(sum({energy}), day({logindate}), add_day(day({logindate}), -2))
  1. Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the sum of {energy} as a measure, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        135,
        null
      ],
      [
        "2019-01-06",
        58,
        null
      ],
      [
        "2019-01-07",
        80,
        135
      ],
      [
        "2019-01-09",
        27,
        80
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_7cd2518b65642b48_2"
      },
      {
        "fieldName": "u_c8f6e35a15596acf_3"
      }
    ]
  }
}
  • Example 2

Use HE to calculate the growth rate of the sum of energy from two days ago.

text
{"kind":"formula","op":"summarize(dataset(1), day({logindate}), sum({energy}), growth_rate(sum({energy}), date_compare(sum({energy}), day({logindate}), add_day(day({logindate}), -2))))"}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "2019-01-09",
        27,
        -0.6625
      ],
      [
        "2019-01-06",
        58,
        null
      ],
      [
        "2019-01-07",
        80,
        -0.40740740740740740741
      ],
      [
        "2019-01-05",
        135,
        null
      ]
    ],
    "schema": [
      {
        "fieldName": "hs_new_field_1"
      },
      {
        "fieldName": "hs_new_field_2"
      },
      {
        "fieldName": "hs_new_field_3"
      }
    ]
  }
}

date_shift

Moving calculation. This function shifts each item in a time series and performs statistics on a certain number of items in sequence.

  • Function Syntax
text
date_shift(function1(function2(expression)), date expression, lowExpr, upperExpr)
  • Parameter Description

    • function1: Required. The calculation type for moving calculation. Optional values: sum, avg, min, max.
    • function2: Required. The basic operation on the data, must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
    • expression: Required. The field to be calculated.
    • date expression1: Required. The time dimension for the basic calculation. The chart must contain the same time dimension as date expression1 to use this metric. The non-aggregated parameters of the summarize function must also contain the same time dimension as date expression1 to use this function in the aggregation parameters.
    • lowExpr: Required. The amount to shift backward for the moving calculation. Type: number.
    • upperExpr: Required. The amount to shift forward for the moving calculation. Type: number.
  • Return Value Type

    Number

  • Example 1

Shift backward by one day and forward by one day, and calculate the cumulative sum of energy using moving calculation. Use a new metric for calculation.

  1. Add a new metric using the following advanced expression:
text
date_shift(sum(sum({energy})), day({logindate}), -1, 1)
  1. Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the sum of {energy} as a measure, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        135,
        193
      ],
      [
        "2019-01-06",
        58,
        273
      ],
      [
        "2019-01-07",
        80,
        138
      ],
      [
        "2019-01-09",
        27,
        27
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_7cd2518b65642b48_2"
      },
      {
        "fieldName": "u_2fb8394ac0a23a02_3"
      }
    ]
  }
}
  • Example 2

Shift backward by one day and forward by one day, and calculate the cumulative sum of energy using moving calculation. Use HE calculation.

text
{"kind":"formula","op":"summarize(dataset(1), day({logindate}), sum({energy}), date_shift(sum(sum({energy})), day({logindate}), -1, 1))"}```

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "2019-01-09",
        27,
        27
      ],
      [
        "2019-01-06",
        58,
        273
      ],
      [
        "2019-01-07",
        80,
        138
      ],
      [
        "2019-01-05",
        135,
        193
      ]
    ],
    "schema": [
      {
        "fieldName": "hs_new_field_1"
      },
      {
        "fieldName": "hs_new_field_2"
      },
      {
        "fieldName": "hs_new_field_3"
      }
    ]
  }
}

lookupvalue

Calculates results from other non-associated data and displays them in the current chart. Can only be used for creating new metrics.

  • Function Syntax
text
lookupvalue(ARG1, ARG2, ARG3)
  • Parameter Description

    • ARG1: The aggregation expression to be calculated on the non-associated data. The field information of the non-associated dataset must include the dataset id.
    • ARG2: The expression on the non-associated data that can be associated with the current chart. The field information of the non-associated dataset must include the dataset id.
    • ARG3: The expression in the current chart used for association. It must be the same as one of the dimensions in the chart.
  • Return Value Type

    Any type

  • Example 1

For example, the chart uses the dataset "Sales Orders" with the dimension year({date}), and you need to display the total amount per year from the "Purchase Details" dataset at the same time. The metric expression is as follows. Note that the id of the "Purchase Details" dataset is used in the expression.

text
lookupvalue(SUM({{2}}.{Amount}), year({{2}}.{date}), year({date}))

repetition_count

Repetition calculation. This function calculates the number of repetitions of the data.

  • Function Syntax
text
repetition_count(expression)
  • Parameter Description

    • expression: Required. The field to be calculated.
  • Return Value Type

    Number

  • Example 1

Use a calculated metric to compute the number of repeated occurrences of username for each type.

  1. Add a new metric using the following advanced expression:
text
repetition_count({username})
  1. Create a new chart, select {logindate} as the dimension, set the calculation method to daily, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        0
      ],
      [
        "2019-01-06",
        0
      ],
      [
        "2019-01-07",
        2
      ],
      [
        "2019-01-09",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_409214f201c6ce01_2"
      }
    ]
  }
}
  • Example 2

Use HE to calculate the number of repeated occurrences of username for each type.

text
{"kind":"formula","op":"summarize(dataset(1), {usertype}, repetition_count({username}))"}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "U2",
        2
      ],
      [
        "U1",
        14
      ],
      [
        "U3",
        2
      ]
    ],
    "schema": [
      {
        "fieldName": "usertype"
      },
      {
        "fieldName": "hs_new_field_1"
      }
    ]
  }
}

repetition

Repetition calculation. Returns the number of repetitions of data according to conditions.

  • Function Syntax
text
repetition(expression, lowExpr, upperExpr)
  • Parameter Description

    • expression: Required. The field to be calculated.
    • lowExpr: Required. Minimum number of repetitions. Type: number.
    • upperExpr: Required. Maximum number of repetitions. Type: number.
  • Return Value Type

    Number

  • Example 1

Use a calculated metric to compute the number of repetitions where username appears between two and three times for each type.

  1. Add a new metric using the following advanced expression:
text
repetition({username}, 2, 3)
  1. Create a new chart, select {userType} as the dimension, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "U1",
        4
      ],
      [
        "U2",
        1
      ],
      [
        "U3",
        1
      ]
    ],
    "schema": [
      {
        "fieldName": "u_452a834347a47a1e_0"
      },
      {
        "fieldName": "u_34393f485a5d2eb3_1"
      }
    ]
  }
}
  • Example 2

Use HE to calculate the number of repetitions where username appears between two and three times for each type.

text
{"kind":"formula","op":"summarize(dataset(1), {usertype}, repetition({username}, 2, 3))"}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "U2",
        1
      ],
      [
        "U1",
        4
      ],
      [
        "U3",
        1
      ]
    ],
    "schema": [
      {
        "fieldName": "usertype"
      },
      {
        "fieldName": "hs_new_field_1"
      }
    ]
  }
}

retention

Retention/active calculation under time dimension. This function returns the number of retained or active data within the specified time.

  • Function Syntax
text
retention(expression, date expression, lowExpr, upperExpr, includeStartExpr, includeEndExpr, excludeStartExpr, excludeEndExpr)
  • Parameter Description

    • expression: Required. The field to be calculated.
    • date expression: Required. The time dimension for retention. The chart must contain the same time dimension as date expression to use this metric. The non-aggregated parameters of the summarize function must also contain the same time dimension as date expression to use this function in the aggregation parameters.
    • lowExpr: Required. The start of the retention/active time.
    • upperExpr: Required. The end of the retention/active time, which should be one day more than the actual end time.
    • includeStartExpr: Optional. The start time of the period in which the field to be calculated must exist.
    • includeEndExpr: Optional. The end time of the period in which the field to be calculated must exist.
    • excludeStartExpr: Optional. The start time of the period in which the field to be calculated must not exist.
    • excludeEndExpr: Optional. The end time of the period in which the field to be calculated must not exist.

includeStartExpr, includeEndExpr, excludeStartExpr, and excludeEndExpr must all be present together or all be absent, and the order is strictly required.

  • Return Value Type

    Number

  • Example 1

Use a calculated metric to compute the retention number of username in the next first and second days.

  1. Add a new metric using the following advanced expression:
text
retention({username}, day({logindate}), dayadd(day({logindate}), 1), dayadd(day({logindate}), 3))
  1. Create a new chart, select {logindate} as the dimension, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        6
      ],
      [
        "2019-01-06",
        3
      ],
      [
        "2019-01-07",
        0
      ],
      [
        "2019-01-09",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_cbc569ab32ff31d1_2"
      }
    ]
  }
}
  • Example 2

Use HE to calculate the number of active usernames in the previous one to two days.

json
{
  "kind": "formula",
  "op": "summarize(dataset(1), day({logindate}), retention({username}, day({logindate}), dayadd(day({logindate}), -2), dayadd(day({logindate}), 0)))"
}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "2019-01-09",
        0
      ],
      [
        "2019-01-06",
        6
      ],
      [
        "2019-01-07",
        3
      ],
      [
        "2019-01-05",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "hs_new_field_1"
      },
      {
        "fieldName": "hs_new_field_2"
      }
    ]
  }
}
  • Example 3

Use a calculated metric to compute the retention number of new usernames in the next first and second days.

  1. Add a new metric using the following advanced expression:
text
retention({username}, day({logindate}), dayadd(day({logindate}), 1), dayadd(day({logindate}), 3), null, null, null, day({logindate}))
  1. Create a new chart, select {logindate} as the dimension, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        6
      ],
      [
        "2019-01-06",
        null
      ],
      [
        "2019-01-07",
        null
      ],
      [
        "2019-01-09",
        null
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_cbc569ab32ff31d1_2"
      }
    ]
  }
}

previous

This function, used in conjunction with the time dimension on the chart axis, returns the period-over-period comparison value of data relative to the specified date.

  • Function Syntax
text
previous(function(expression), period, delta)
  • Parameter Description

    • function: Required. The basic operation on the data, must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
    • period: Optional. The granularity to be compared. Optional values: year, quarter, month, week, day, hour. If not specified, it will use the smallest time axis granularity in the chart.
    • delta: Optional. The time offset to be compared. Type: integer.
  • Return Value Type

    Number

  • Example 1

Use a new metric to calculate the day-over-day comparison value of the sum of energy.

  1. Add a new metric using the following advanced expression:
text
previous(sum({energy}))
  1. Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the sum of {energy} as a measure, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "2019-01-05",
        135,
        null
      ],
      [
        "2019-01-06",
        58,
        135
      ],
      [
        "2019-01-07",
        80,
        58
      ],
      [
        "2019-01-09",
        27,
        null
      ]
    ],
    "schema": [
      {
        "fieldName": "u_6f28278382639341_0"
      },
      {
        "fieldName": "u_7cd2518b65642b48_2"
      },
      {
        "fieldName": "u_c8f6e35a15596acf_3"
      }
    ]
  }
}

rollup_value

It can calculate measures based on partial dimensions and is generally used for creating new metrics.

  • Function Syntax
text
rollup_value(ARG) OVER( [ PARTITION BY expr1 ] )
  • Parameter Description

    • ARG: Aggregation calculation expression
  • Return Value Type

    Any type

  • Example 1

Calculate the total sum of all votes, which can be used in any chart.

text
rollup_value(sum({votes})) OVER ()
  • Example 2

After grouping by month, calculate the total sum of votes within the group. It can be used in charts with the month dimension.

text
rollup_value(sum({votes})) OVER (partition by {month})

static_continuous_retention

Continuous active calculation for non-time dimensions. This function returns the number of continuously active data within the specified time period.

  • Function Syntax
text
static_continuous_retention(expression, date expression, start date, end date, compare start date, compare end date, period)
  • Parameter Description

    • expression: Required. The field to be calculated.
    • date expression: Required. The time field for calculating activity.
    • start date: Required. The start time to be calculated. Type: time.
    • end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
    • compare start date: Required. The start time of the retention period to be calculated. Type: time.
    • compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
    • period: Required. The activity period. Type: String. Optional values: day, week, month, quarter, year.
  • Return Value Type

    Number

  • Example 1

Use a calculated metric to compute the number of continuously active usernames online today in the previous one to two years, with the activity period as year.

  1. Add a new metric using the following advanced expression:
text
static_continuous_retention({username}, {logindate}, day(now()), dayadd(day(now()), 1), yearadd(year(now()), -2), yearadd(year(now()), 0), 'year')
  1. Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "U1",
        0
      ],
      [
        "U2",
        0
      ],
      [
        "U3",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "u_452a834347a47a1e_0"
      },
      {
        "fieldName": "u_c126c85c957b406e_2"
      }
    ]
  }
}

static_continuous_retention_rate

Continuous active rate calculation for non-time dimensions. This function returns the continuous active rate of data within the specified time period.

  • Function Syntax
text
static_continuous_retention_rate(expression, date expression, start date, end date, compare start date, compare end date, period)
  • Parameter Description

    • expression: Required. The field to be calculated.
    • date expression: Required. The time field for calculating activity.
    • start date: Required. The start time to be calculated. Type: time.
    • end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
    • compare start date: Required. The start time of the retention period to be calculated. Type: time.
    • compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
    • period: Required. The activity period. Type: String. Optional values: day, week, month, quarter, year.
  • Return Value Type

    Number

  • Example 1

Use HE to calculate the number of continuously active usernames online today in the previous one to two years, with the activity period as year.

text
{"kind":"formula","op":"summarize(dataset(1), {usertype}, static_continuous_retention({username}, {logindate}, day(now()), dayadd(day(now()), 1), yearadd(year(now()), -2), yearadd(year(now()), 0), 'year'))"}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "U2",
        0
      ],
      [
        "U1",
        0
      ],
      [
        "U3",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "usertype"
      },
      {
        "fieldName": "hs_new_field_1"
      }
    ]
  }
}

static_retention

Retention/active calculation for non-time dimensions. This function returns the number of retained data within the specified time period.

  • Function Syntax
text
static_retention(expression, date expression, start date, end date, compare start date, compare end date, includeStartExpr, includeEndExpr, excludeStartExpr, excludeEndExpr)
  • Parameter Description

    • expression: Required. The field to be calculated.
    • date expression: Required. The time field for calculating activity.
    • start date: Required. The start time to be calculated. Type: time.
    • end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
    • compare start date: Required. The start time of the retention period to be calculated. Type: time.
    • compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
    • includeStartExpr: Optional. The start time of the period in which the field to be calculated must exist.
    • includeEndExpr: Optional. The end time of the period in which the field to be calculated must exist.
    • excludeStartExpr: Optional. The start time of the period in which the field to be calculated must not exist.
    • excludeEndExpr: Optional. The end time of the period in which the field to be calculated must not exist.

includeStartExpr, includeEndExpr, excludeStartExpr, and excludeEndExpr must all be present together or all be absent, and the order is strictly required.

  • Return Value Type

    Number

  • Example 1

Use a calculated metric to compute the number of usernames online during 2019-01-05 to 2019-01-06 retained in the next 1 to 3 days.

  1. Add a new metric using the following advanced expression:
text
static_retention({username}, {logindate}, '2019-01-05', '2019-01-07', '2019-01-07', '2019-01-10')
  1. Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "U1",
        4
      ],
      [
        "U2",
        0
      ],
      [
        "U3",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "u_452a834347a47a1e_0"
      },
      {
        "fieldName": "u_55c18b8d38923c85_2"
      }
    ]
  }
}
  • Example 2

Use HE to calculate the number of active usernames online during 2019-01-07 to 2019-01-09 in 2019-01-05.

json
{
  "kind": "formula",
  "op": "summarize(dataset(1), {usertype}, static_retention({username}, {logindate}, '2019-01-07', '2019-01-10', '2019-01-05', '2019-01-06'))"
}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "U2",
        0
      ],
      [
        "U1",
        4
      ],
      [
        "U3",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "usertype"
      },
      {
        "fieldName": "hs_new_field_1"
      }
    ]
  }
}
  • Example 3

Use a calculated metric to compute the number of old usernames online during 2019-01-05 to 2019-01-06 retained in the next 1 to 3 days.

  1. Add a new metric using the following advanced expression:
text
static_retention({username}, {logindate}, '2019-01-05', '2019-01-07', '2019-01-07', '2019-01-10', null, '2019-01-05', null, null)
  1. Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "U1",
        4
      ],
      [
        "U2",
        0
      ],
      [
        "U3",
        0
      ]
    ],
    "schema": [
      {
        "fieldName": "u_452a834347a47a1e_0"
      },
      {
        "fieldName": "u_55c18b8d38923c85_2"
      }
    ]
  }
}

static_retention_rate

Retention/active rate calculation for non-time dimensions. This function returns the retention rate of data within the specified time period.

  • Function Syntax
text
static_retention_rate(expression, date expression, start date, end date, compare start date, compare end date)
  • Parameter Description

    • expression: Required. The field to be calculated.
    • date expression: Required. The time field for calculating activity.
    • start date: Required. The start time to be calculated. Type: time.
    • end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
    • compare start date: Required. The start time of the retention period to be calculated. Type: time.
    • compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
  • Return Value Type

    Number

  • Example 1

Use a calculated metric to compute the retention rate of usernames online during 2019-01-05 to 2019-01-06 in the next 1 to 3 days.

  1. Add a new metric using the following advanced expression:
text
static_retention_rate({username}, {logindate}, '2019-01-05', '2019-01-07', '2019-01-07', '2019-01-10')
  1. Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
  2. The calculation result is as follows:
json
{
  "data": {
    "data": [
      [
        "U1",
        0.80000000000000000000
      ],
      [
        "U2",
        0E-20
      ],
      [
        "U3",
        0E-20
      ]
    ],
    "schema": [
      {
        "fieldName": "u_452a834347a47a1e_0"
      },
      {
        "fieldName": "u_6ef7c5c463a74495_1"
      }
    ]
  }
}
  • Example 2

Use HE to calculate the retention rate of usernames online during 2019-01-07 to 2019-01-09 in 2019-01-05.

json
{
  "kind": "formula",
  "op": "summarize(dataset(1), {usertype}, static_retention_rate({username}, {logindate}, '2019-01-07', '2019-01-10', '2019-01-05', '2019-01-06'))"
}

The calculation result is as follows:

json
{
  "data": {
    "data": [
      [
        "U2",
        null
      ],
      [
        "U1",
        1
      ],
      [
        "U3",
        null
      ]
    ],
    "schema": [
      {
        "fieldName": "usertype"
      },
      {
        "fieldName": "hs_new_field_1"
      }
    ]
  }
}

top_n

Retain the top n values of a column under specified grouping and sorting, and unify the remaining values as a special value.

  • Function Syntax
text
top_n(N, field1, constant1) over([ PARTITION BY expr1 ] ORDER BY expr2 [ DESC ])
  • Parameter Description

    • N: Required. The number of values to retain.
    • field1: Required. The field name of the value to retain.
    • constant1: Required. The value to represent non-retained values, which must be of the same type as field1.
    • expr1: Optional. Grouping field. If there is a grouping field, it means to calculate the retained values of field1 under that group.
    • expr2: Required. The sorting rule for field1, usually field1 itself or the aggregation value of another field.
  • Return Value Type

    Same as the field type of field1

  • Example 1

Group by userType, retain the top 2 usernames ranked by count({energy}), and represent the rest as "Other".

text
{"kind":"formula","op":"summarize(dataset(2),{usertype} as t1, top_n(2, {username}, 'Other') over(PARTITION BY {usertype} order by count({energy}) asc) as test)"}

User Manual for Hengshi Analysis Platform