Advanced Calculation Functions
HENGSHI SENSE provides several advanced calculation functions for period-over-period comparison, cumulative calculation, moving calculation, retention calculation, activity calculation, and more. The examples for these functions are all based on the advancedcal table in the database. Below is the data from advancedcal.
logindate | username | usertype | energy |
---|---|---|---|
2019-01-05 00:00:00 | A1 | U1 | 10 |
2019-01-05 00:00:00 | A2 | U2 | 20 |
2019-01-05 00:00:00 | A3 | U1 | 30 |
2019-01-05 00:00:00 | A4 | U1 | 5 |
2019-01-05 00:00:00 | A5 | U1 | 15 |
2019-01-05 00:00:00 | A6 | U3 | 16 |
2019-01-05 00:00:00 | A7 | U3 | 18 |
2019-01-05 00:00:00 | A8 | U1 | 21 |
2019-01-06 00:00:00 | A2 | U2 | 22 |
2019-01-06 00:00:00 | A3 | U1 | 2 |
2019-01-06 00:00:00 | A1 | U1 | 12 |
2019-01-06 00:00:00 | A6 | U3 | 9 |
2019-01-06 00:00:00 | A5 | U1 | 8 |
2019-01-06 00:00:00 | A4 | U1 | 5 |
2019-01-07 00:00:00 | A5 | U1 | 24 |
2019-01-07 00:00:00 | A4 | U1 | 12 |
2019-01-07 00:00:00 | A3 | U1 | 16 |
2019-01-07 08:00:00 | A5 | U1 | 28 |
2019-01-09 00:00:00 | A8 | U1 | 27 |
bottom_n
Retain the bottom N values of a specified column within a given group and sorting order, and unify all other values as a special value.
- Function Syntax
bottom_n(N, field1, constant1) over([ PARTITION BY expr1 ] ORDER BY expr2 [ DESC ])
Parameter Description
- N: Required. The number of values to retain.
- field1: Required. The name of the field whose values are to be retained.
- constant1: Required. The value to represent non-retained values, which must be of the same data type as field1.
- expr1: Optional. The grouping field. If a grouping field is specified, the retained values of field1 are calculated within each group.
- expr2: Required. The sorting rule for field1, usually field1 itself or an aggregated value of another field.
Return Value Type
Same as the data type of field1
Example 1
Group by userType, retain the bottom 2 usernames ranked by count({energy}), and represent all others as "Other".
{"kind":"formula","op":"summarize(dataset(2),{usertype} as t1, bottom_n(2, {username}, 'Other') over(PARTITION BY {usertype} order by count({energy}) asc) as test)"}
calculate
Pre-calculate some data and then use it together with the original data to compute the chart data. This is generally used when creating new metrics or for chart filtering. When used for metric calculation, it requires an appropriate aggregation function as its outer layer. The value of the calculate expression in a chart will change with subtotals; the calculation dimension of the detailed data and the subtotal dimension are different, which affects the value of the calculate expression.
- Function Syntax
calculate(ARG1, [ARG2],... )
Parameter Description
- ARG1: The expression required for pre-calculating the data
- ARG2: Optional, can be multiple, represents the grouping fields in the chart to be removed
Return Value Type
Any type
Example 1
For example, if the chart has dimensions prime_genre and month, and the measure is the sum of votes. The filter condition is to select the top three in the sum of votes grouped by prime_genre. The filter condition is written as follows:
calculate(rank() OVER (ORDER BY sum({votes}) ASC), {prime_genre}) < 4
calculatep
Pre-calculate some data and then compute the chart data together with the original data. This is generally used for creating new metrics or for chart filtering. When used for metric calculation, it should be wrapped with an appropriate aggregation function. The value of the calculatep expression in a chart will not be affected by subtotals.
- Function Syntax
calculatep(ARG1, [ARG2],... )
Parameter Description
- ARG1: The expression required for pre-calculating the data
- ARG2: Optional, can be multiple, indicates the grouping fields in the chart to be removed
Return Type
Any type
Example 1
For example, if the chart has dimensions prime_genre and month, and the metric is the sum of votes. The filter condition is to select the top three by the sum of votes grouped by prime_genre. The filter condition is written as follows:
calculate(rank() OVER (ORDER BY sum({votes}) ASC), {prime_genre}) < 4
calculatex
Pre-calculate some data, and then use it together with the original data to compute the chart data. This is generally used when creating new metrics or for chart filtering. When used for metric calculation, it requires an appropriate aggregation function as its outer layer. It is not controlled by dimensions, so it can also be used on dimensions.
- Function Syntax
calculatex(ARG1, ARG2,... )
Parameter Description
- ARG1: The expression required to pre-calculate the data
- ARG2: Required, the grouping needed for the calculation expression.
Return Value Type
Any type
Example 1
For example, the chart metric is the sum of votes. The filter condition is to select the top three in the sum of votes grouped by prime_genre. The filter condition is written as follows:
calculatex(rank() OVER (ORDER BY sum({votes}) ASC), {prime_genre}) < 4
continuous_retention
Continuous activity calculation under the time dimension. This function returns the number of continuously active records within a specified time period.
- Function Syntax
continuous_retention(expression, date expression, lowExpr, upperExpr, period)
Parameter Description
- expression: Required. The field to be calculated.
- date expression: Required. The retention time dimension. The chart must include the same time dimension as the date expression in its dimensions to use this metric. The non-aggregated parameters of the summarize function must also include the same time dimension as the date expression in order to use this function in the aggregation parameters.
- lowExpr: Required. The start of the retention/activity period.
- upperExpr: Required. The end of the retention/activity period. It should be one day later than the actual end time.
- period: Required. The activity period. Type: String. Optional values: day, week, month, quarter, year.
Return Value Type
Number
Example 1
Calculate the number of users (username) who were continuously active on both yesterday and the day before yesterday.
- Add a new metric using the following advanced expression:
continuous_retention({username}, day({logindate}), dayadd(day({logindate}), -2), dayadd(day({logindate}), 0), 'day')
- Create a new chart, select {logindate} as the dimension, and then drag the above metric into the measures.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
0
],
[
"2019-01-06",
0
],
[
"2019-01-07",
3
],
[
"2019-01-09",
0
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_34393f485a5d2eb3_1"
}
]
}
}
- Example 2
Use HE to calculate the continuous activity rate of username for yesterday and the day before yesterday.
{
"kind": "formula",
"op": "summarize(dataset(1), day({logindate}), growth_rate(distinct_count({username}), continuous_retention({username}, day({logindate}), dayadd(day({logindate}), -2), dayadd(day({logindate}), 0), 'day')))"
}
The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
null
],
[
"2019-01-06",
null
],
[
"2019-01-07",
0
],
[
"2019-01-09",
null
]
],
"schema": [
{
"fieldName": "hs_new_field_1"
},
{
"fieldName": "hs_new_field_2"
}
]
}
}
custom_filters
Custom Filter Calculation
- Function Syntax
custom_filters(expression, [{field1},{field2}], condition)
Parameter Description
- expression: Required. Must be a basic aggregation operation.
- [{field1},{field2}]: Optional. null means not to remove any previous filters; [] means to remove all previous filters; field1/field2 means that if any of these fields are used in a filter, that filter will be removed.
- condition: Optional. The new filter condition to be added when calculating the expression.
Return Value Type Same as the return type of expression
Example 1
Use a newly added metric to calculate the sum of energy, remove filters related to logindate from previous context, and add a new time filter.
- Add a new metric using the following advanced expression:
custom_filters(sum({energy}),[{logindate}],{logindate}='2000-01-01')
date_accumulate
This function returns the result of cumulative calculations on data within a specified time range. It can only be used when adding new metrics or in API queries.
- Function Syntax
date_accumulate(function1(function2(expression)), date expression1, reset period)
Parameter Description
- function1: Required. The type of cumulative calculation. Optional values: sum, avg, min, max.
- function2: Required. The basic operation on the data, which must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
- expression: Required. The field to be calculated.
- date expression1: Required. The time dimension for the basic calculation. The chart must include a time dimension identical to date expression1 in its dimensions to use this metric. The non-aggregated parameters of the summarize function must also include a time dimension identical to date expression1 in order to use this function in the aggregation parameters.
- reset period: Required. The repeat cycle for the cumulative calculation. The calculation will reset and start over at the beginning of each cycle. Optional values: hour, day, week, month, quarter, year.
Return Value Type Number
Example 1
Use a new metric to calculate the weekly cumulative average of energy.
- Add a new metric using the following advanced expression:
date_accumulate(sum(avg({energy})), day({logindate}), 'week')
- Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the average of {energy} as a measure, and then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
16.8750000000000000,
16.8750000000000000
],
[
"2019-01-06",
9.6666666666666667,
26.5416666666666667
],
[
"2019-01-07",
20.0000000000000000,
20.0000000000000000
],
[
"2019-01-09",
27.0000000000000000,
47.0000000000000000
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_7cd2518b65642b48_1"
},
{
"fieldName": "u_c3f773f19fcb0912_2"
}
]
}
}
- Example 2
Use an HE query to calculate the weekly cumulative average of energy, only including data where userType='U1' in the cumulative calculation.
{"kind":"formula","op":"summarize(dataset(1), day({logindate}), avg({energy}), date_accumulate(sum(avg({energy}) filter (where {usertype} = 'U1')), day({logindate}), 'week'))"}
The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
16.875,
16.2
],
[
"2019-01-06",
9.6666666666666667,
22.95
],
[
"2019-01-07",
20,
20
],
[
"2019-01-09",
27,
47
]
],
"schema": [
{
"fieldName": "hs_new_field_1"
},
{
"fieldName": "hs_new_field_2"
},
{
"fieldName": "hs_new_field_3"
}
]
}
}
date_compare
This function returns the period-over-period comparison value of data relative to a specified date.
- Function Syntax
date_compare(function(expression), date expression1, date expression2)
Parameter Description
- function: Required. The basic operation on the data, must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
- date expression1: Required. The time dimension for the base calculation. The chart must include a time dimension identical to date expression1 in its dimensions to use this metric. The non-aggregated parameters of the summarize function must also include a time dimension identical to date expression1 in order to use this function in the aggregation parameters.
- date expression2: Required. The comparison time dimension.
Return Value Type
Number
Example 1
Use a new metric to calculate the comparison value of the sum of energy from two days ago.
- Add a new metric using the following advanced expression:
date_compare(sum({energy}), day({logindate}), add_day(day({logindate}), -2))
- Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the sum of {energy} as a measure, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
135,
null
],
[
"2019-01-06",
58,
null
],
[
"2019-01-07",
80,
135
],
[
"2019-01-09",
27,
80
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_7cd2518b65642b48_2"
},
{
"fieldName": "u_c8f6e35a15596acf_3"
}
]
}
}
- Example 2
Use HE to calculate the growth rate of the sum of energy from two days ago.
{"kind":"formula","op":"summarize(dataset(1), day({logindate}), sum({energy}), growth_rate(sum({energy}), date_compare(sum({energy}), day({logindate}), add_day(day({logindate}), -2))))"}
The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-09",
27,
-0.6625
],
[
"2019-01-06",
58,
null
],
[
"2019-01-07",
80,
-0.40740740740740740741
],
[
"2019-01-05",
135,
null
]
],
"schema": [
{
"fieldName": "hs_new_field_1"
},
{
"fieldName": "hs_new_field_2"
},
{
"fieldName": "hs_new_field_3"
}
]
}
}
date_shift
Moving calculation. This function shifts each item in a time series and performs statistics on a certain number of items in sequence.
- Function Syntax
date_shift(function1(function2(expression)), date expression, lowExpr, upperExpr)
Parameter Description
- function1: Required. The calculation type for moving calculation. Optional values: sum, avg, min, max.
- function2: Required. The basic operation on the data, must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
- expression: Required. The field to be calculated.
- date expression1: Required. The time dimension for the basic calculation. The chart must contain the same time dimension as date expression1 to use this metric. The non-aggregated parameters of the summarize function must also contain the same time dimension as date expression1 to use this function in the aggregation parameters.
- lowExpr: Required. The amount to shift backward for the moving calculation. Type: number.
- upperExpr: Required. The amount to shift forward for the moving calculation. Type: number.
Return Value Type
Number
Example 1
Shift backward by one day and forward by one day, and calculate the cumulative sum of energy using moving calculation. Use a new metric for calculation.
- Add a new metric using the following advanced expression:
date_shift(sum(sum({energy})), day({logindate}), -1, 1)
- Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the sum of {energy} as a measure, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
135,
193
],
[
"2019-01-06",
58,
273
],
[
"2019-01-07",
80,
138
],
[
"2019-01-09",
27,
27
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_7cd2518b65642b48_2"
},
{
"fieldName": "u_2fb8394ac0a23a02_3"
}
]
}
}
- Example 2
Shift backward by one day and forward by one day, and calculate the cumulative sum of energy using moving calculation. Use HE calculation.
{"kind":"formula","op":"summarize(dataset(1), day({logindate}), sum({energy}), date_shift(sum(sum({energy})), day({logindate}), -1, 1))"}```
The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-09",
27,
27
],
[
"2019-01-06",
58,
273
],
[
"2019-01-07",
80,
138
],
[
"2019-01-05",
135,
193
]
],
"schema": [
{
"fieldName": "hs_new_field_1"
},
{
"fieldName": "hs_new_field_2"
},
{
"fieldName": "hs_new_field_3"
}
]
}
}
lookupvalue
Calculates results from other non-associated data and displays them in the current chart. Can only be used for creating new metrics.
- Function Syntax
lookupvalue(ARG1, ARG2, ARG3)
Parameter Description
- ARG1: The aggregation expression to be calculated on the non-associated data. The field information of the non-associated dataset must include the dataset id.
- ARG2: The expression on the non-associated data that can be associated with the current chart. The field information of the non-associated dataset must include the dataset id.
- ARG3: The expression in the current chart used for association. It must be the same as one of the dimensions in the chart.
Return Value Type
Any type
Example 1
For example, the chart uses the dataset "Sales Orders" with the dimension year({date}), and you need to display the total amount per year from the "Purchase Details" dataset at the same time. The metric expression is as follows. Note that the id of the "Purchase Details" dataset is used in the expression.
lookupvalue(SUM({{2}}.{Amount}), year({{2}}.{date}), year({date}))
repetition_count
Repetition calculation. This function calculates the number of repetitions of the data.
- Function Syntax
repetition_count(expression)
Parameter Description
- expression: Required. The field to be calculated.
Return Value Type
Number
Example 1
Use a calculated metric to compute the number of repeated occurrences of username for each type.
- Add a new metric using the following advanced expression:
repetition_count({username})
- Create a new chart, select {logindate} as the dimension, set the calculation method to daily, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
0
],
[
"2019-01-06",
0
],
[
"2019-01-07",
2
],
[
"2019-01-09",
0
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_409214f201c6ce01_2"
}
]
}
}
- Example 2
Use HE to calculate the number of repeated occurrences of username for each type.
{"kind":"formula","op":"summarize(dataset(1), {usertype}, repetition_count({username}))"}
The calculation result is as follows:
{
"data": {
"data": [
[
"U2",
2
],
[
"U1",
14
],
[
"U3",
2
]
],
"schema": [
{
"fieldName": "usertype"
},
{
"fieldName": "hs_new_field_1"
}
]
}
}
repetition
Repetition calculation. Returns the number of repetitions of data according to conditions.
- Function Syntax
repetition(expression, lowExpr, upperExpr)
Parameter Description
- expression: Required. The field to be calculated.
- lowExpr: Required. Minimum number of repetitions. Type: number.
- upperExpr: Required. Maximum number of repetitions. Type: number.
Return Value Type
Number
Example 1
Use a calculated metric to compute the number of repetitions where username appears between two and three times for each type.
- Add a new metric using the following advanced expression:
repetition({username}, 2, 3)
- Create a new chart, select {userType} as the dimension, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"U1",
4
],
[
"U2",
1
],
[
"U3",
1
]
],
"schema": [
{
"fieldName": "u_452a834347a47a1e_0"
},
{
"fieldName": "u_34393f485a5d2eb3_1"
}
]
}
}
- Example 2
Use HE to calculate the number of repetitions where username appears between two and three times for each type.
{"kind":"formula","op":"summarize(dataset(1), {usertype}, repetition({username}, 2, 3))"}
The calculation result is as follows:
{
"data": {
"data": [
[
"U2",
1
],
[
"U1",
4
],
[
"U3",
1
]
],
"schema": [
{
"fieldName": "usertype"
},
{
"fieldName": "hs_new_field_1"
}
]
}
}
retention
Retention/active calculation under time dimension. This function returns the number of retained or active data within the specified time.
- Function Syntax
retention(expression, date expression, lowExpr, upperExpr, includeStartExpr, includeEndExpr, excludeStartExpr, excludeEndExpr)
Parameter Description
- expression: Required. The field to be calculated.
- date expression: Required. The time dimension for retention. The chart must contain the same time dimension as date expression to use this metric. The non-aggregated parameters of the summarize function must also contain the same time dimension as date expression to use this function in the aggregation parameters.
- lowExpr: Required. The start of the retention/active time.
- upperExpr: Required. The end of the retention/active time, which should be one day more than the actual end time.
- includeStartExpr: Optional. The start time of the period in which the field to be calculated must exist.
- includeEndExpr: Optional. The end time of the period in which the field to be calculated must exist.
- excludeStartExpr: Optional. The start time of the period in which the field to be calculated must not exist.
- excludeEndExpr: Optional. The end time of the period in which the field to be calculated must not exist.
includeStartExpr, includeEndExpr, excludeStartExpr, and excludeEndExpr must all be present together or all be absent, and the order is strictly required.
Return Value Type
Number
Example 1
Use a calculated metric to compute the retention number of username in the next first and second days.
- Add a new metric using the following advanced expression:
retention({username}, day({logindate}), dayadd(day({logindate}), 1), dayadd(day({logindate}), 3))
- Create a new chart, select {logindate} as the dimension, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
6
],
[
"2019-01-06",
3
],
[
"2019-01-07",
0
],
[
"2019-01-09",
0
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_cbc569ab32ff31d1_2"
}
]
}
}
- Example 2
Use HE to calculate the number of active usernames in the previous one to two days.
{
"kind": "formula",
"op": "summarize(dataset(1), day({logindate}), retention({username}, day({logindate}), dayadd(day({logindate}), -2), dayadd(day({logindate}), 0)))"
}
The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-09",
0
],
[
"2019-01-06",
6
],
[
"2019-01-07",
3
],
[
"2019-01-05",
0
]
],
"schema": [
{
"fieldName": "hs_new_field_1"
},
{
"fieldName": "hs_new_field_2"
}
]
}
}
- Example 3
Use a calculated metric to compute the retention number of new usernames in the next first and second days.
- Add a new metric using the following advanced expression:
retention({username}, day({logindate}), dayadd(day({logindate}), 1), dayadd(day({logindate}), 3), null, null, null, day({logindate}))
- Create a new chart, select {logindate} as the dimension, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
6
],
[
"2019-01-06",
null
],
[
"2019-01-07",
null
],
[
"2019-01-09",
null
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_cbc569ab32ff31d1_2"
}
]
}
}
previous
This function, used in conjunction with the time dimension on the chart axis, returns the period-over-period comparison value of data relative to the specified date.
- Function Syntax
previous(function(expression), period, delta)
Parameter Description
- function: Required. The basic operation on the data, must be a basic aggregation operation. Optional values: sum, avg, min, max, percentile, count, distinct_count.
- period: Optional. The granularity to be compared. Optional values: year, quarter, month, week, day, hour. If not specified, it will use the smallest time axis granularity in the chart.
- delta: Optional. The time offset to be compared. Type: integer.
Return Value Type
Number
Example 1
Use a new metric to calculate the day-over-day comparison value of the sum of energy.
- Add a new metric using the following advanced expression:
previous(sum({energy}))
- Create a new chart, select {logindate} as the dimension, set the calculation method to daily, drag in the sum of {energy} as a measure, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"2019-01-05",
135,
null
],
[
"2019-01-06",
58,
135
],
[
"2019-01-07",
80,
58
],
[
"2019-01-09",
27,
null
]
],
"schema": [
{
"fieldName": "u_6f28278382639341_0"
},
{
"fieldName": "u_7cd2518b65642b48_2"
},
{
"fieldName": "u_c8f6e35a15596acf_3"
}
]
}
}
rollup_value
It can calculate measures based on partial dimensions and is generally used for creating new metrics.
- Function Syntax
rollup_value(ARG) OVER( [ PARTITION BY expr1 ] )
Parameter Description
- ARG: Aggregation calculation expression
Return Value Type
Any type
Example 1
Calculate the total sum of all votes, which can be used in any chart.
rollup_value(sum({votes})) OVER ()
- Example 2
After grouping by month, calculate the total sum of votes within the group. It can be used in charts with the month dimension.
rollup_value(sum({votes})) OVER (partition by {month})
static_continuous_retention
Continuous active calculation for non-time dimensions. This function returns the number of continuously active data within the specified time period.
- Function Syntax
static_continuous_retention(expression, date expression, start date, end date, compare start date, compare end date, period)
Parameter Description
- expression: Required. The field to be calculated.
- date expression: Required. The time field for calculating activity.
- start date: Required. The start time to be calculated. Type: time.
- end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
- compare start date: Required. The start time of the retention period to be calculated. Type: time.
- compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
- period: Required. The activity period. Type: String. Optional values: day, week, month, quarter, year.
Return Value Type
Number
Example 1
Use a calculated metric to compute the number of continuously active usernames online today in the previous one to two years, with the activity period as year.
- Add a new metric using the following advanced expression:
static_continuous_retention({username}, {logindate}, day(now()), dayadd(day(now()), 1), yearadd(year(now()), -2), yearadd(year(now()), 0), 'year')
- Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"U1",
0
],
[
"U2",
0
],
[
"U3",
0
]
],
"schema": [
{
"fieldName": "u_452a834347a47a1e_0"
},
{
"fieldName": "u_c126c85c957b406e_2"
}
]
}
}
static_continuous_retention_rate
Continuous active rate calculation for non-time dimensions. This function returns the continuous active rate of data within the specified time period.
- Function Syntax
static_continuous_retention_rate(expression, date expression, start date, end date, compare start date, compare end date, period)
Parameter Description
- expression: Required. The field to be calculated.
- date expression: Required. The time field for calculating activity.
- start date: Required. The start time to be calculated. Type: time.
- end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
- compare start date: Required. The start time of the retention period to be calculated. Type: time.
- compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
- period: Required. The activity period. Type: String. Optional values: day, week, month, quarter, year.
Return Value Type
Number
Example 1
Use HE to calculate the number of continuously active usernames online today in the previous one to two years, with the activity period as year.
{"kind":"formula","op":"summarize(dataset(1), {usertype}, static_continuous_retention({username}, {logindate}, day(now()), dayadd(day(now()), 1), yearadd(year(now()), -2), yearadd(year(now()), 0), 'year'))"}
The calculation result is as follows:
{
"data": {
"data": [
[
"U2",
0
],
[
"U1",
0
],
[
"U3",
0
]
],
"schema": [
{
"fieldName": "usertype"
},
{
"fieldName": "hs_new_field_1"
}
]
}
}
static_retention
Retention/active calculation for non-time dimensions. This function returns the number of retained data within the specified time period.
- Function Syntax
static_retention(expression, date expression, start date, end date, compare start date, compare end date, includeStartExpr, includeEndExpr, excludeStartExpr, excludeEndExpr)
Parameter Description
- expression: Required. The field to be calculated.
- date expression: Required. The time field for calculating activity.
- start date: Required. The start time to be calculated. Type: time.
- end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
- compare start date: Required. The start time of the retention period to be calculated. Type: time.
- compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
- includeStartExpr: Optional. The start time of the period in which the field to be calculated must exist.
- includeEndExpr: Optional. The end time of the period in which the field to be calculated must exist.
- excludeStartExpr: Optional. The start time of the period in which the field to be calculated must not exist.
- excludeEndExpr: Optional. The end time of the period in which the field to be calculated must not exist.
includeStartExpr, includeEndExpr, excludeStartExpr, and excludeEndExpr must all be present together or all be absent, and the order is strictly required.
Return Value Type
Number
Example 1
Use a calculated metric to compute the number of usernames online during 2019-01-05 to 2019-01-06 retained in the next 1 to 3 days.
- Add a new metric using the following advanced expression:
static_retention({username}, {logindate}, '2019-01-05', '2019-01-07', '2019-01-07', '2019-01-10')
- Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"U1",
4
],
[
"U2",
0
],
[
"U3",
0
]
],
"schema": [
{
"fieldName": "u_452a834347a47a1e_0"
},
{
"fieldName": "u_55c18b8d38923c85_2"
}
]
}
}
- Example 2
Use HE to calculate the number of active usernames online during 2019-01-07 to 2019-01-09 in 2019-01-05.
{
"kind": "formula",
"op": "summarize(dataset(1), {usertype}, static_retention({username}, {logindate}, '2019-01-07', '2019-01-10', '2019-01-05', '2019-01-06'))"
}
The calculation result is as follows:
{
"data": {
"data": [
[
"U2",
0
],
[
"U1",
4
],
[
"U3",
0
]
],
"schema": [
{
"fieldName": "usertype"
},
{
"fieldName": "hs_new_field_1"
}
]
}
}
- Example 3
Use a calculated metric to compute the number of old usernames online during 2019-01-05 to 2019-01-06 retained in the next 1 to 3 days.
- Add a new metric using the following advanced expression:
static_retention({username}, {logindate}, '2019-01-05', '2019-01-07', '2019-01-07', '2019-01-10', null, '2019-01-05', null, null)
- Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"U1",
4
],
[
"U2",
0
],
[
"U3",
0
]
],
"schema": [
{
"fieldName": "u_452a834347a47a1e_0"
},
{
"fieldName": "u_55c18b8d38923c85_2"
}
]
}
}
static_retention_rate
Retention/active rate calculation for non-time dimensions. This function returns the retention rate of data within the specified time period.
- Function Syntax
static_retention_rate(expression, date expression, start date, end date, compare start date, compare end date)
Parameter Description
- expression: Required. The field to be calculated.
- date expression: Required. The time field for calculating activity.
- start date: Required. The start time to be calculated. Type: time.
- end date: Required. The end time to be calculated. Type: time. Should be one day later than the actual end time.
- compare start date: Required. The start time of the retention period to be calculated. Type: time.
- compare end date: Required. The end time of the retention period to be calculated. Type: time. Should be one day later than the actual end time of the retention period.
Return Value Type
Number
Example 1
Use a calculated metric to compute the retention rate of usernames online during 2019-01-05 to 2019-01-06 in the next 1 to 3 days.
- Add a new metric using the following advanced expression:
static_retention_rate({username}, {logindate}, '2019-01-05', '2019-01-07', '2019-01-07', '2019-01-10')
- Create a new chart, select {usertype} as the dimension, then drag in the above metric as a measure.
- The calculation result is as follows:
{
"data": {
"data": [
[
"U1",
0.80000000000000000000
],
[
"U2",
0E-20
],
[
"U3",
0E-20
]
],
"schema": [
{
"fieldName": "u_452a834347a47a1e_0"
},
{
"fieldName": "u_6ef7c5c463a74495_1"
}
]
}
}
- Example 2
Use HE to calculate the retention rate of usernames online during 2019-01-07 to 2019-01-09 in 2019-01-05.
{
"kind": "formula",
"op": "summarize(dataset(1), {usertype}, static_retention_rate({username}, {logindate}, '2019-01-07', '2019-01-10', '2019-01-05', '2019-01-06'))"
}
The calculation result is as follows:
{
"data": {
"data": [
[
"U2",
null
],
[
"U1",
1
],
[
"U3",
null
]
],
"schema": [
{
"fieldName": "usertype"
},
{
"fieldName": "hs_new_field_1"
}
]
}
}
top_n
Retain the top n values of a column under specified grouping and sorting, and unify the remaining values as a special value.
- Function Syntax
top_n(N, field1, constant1) over([ PARTITION BY expr1 ] ORDER BY expr2 [ DESC ])
Parameter Description
- N: Required. The number of values to retain.
- field1: Required. The field name of the value to retain.
- constant1: Required. The value to represent non-retained values, which must be of the same type as field1.
- expr1: Optional. Grouping field. If there is a grouping field, it means to calculate the retained values of field1 under that group.
- expr2: Required. The sorting rule for field1, usually field1 itself or the aggregation value of another field.
Return Value Type
Same as the field type of field1
Example 1
Group by userType, retain the top 2 usernames ranked by count({energy}), and represent the rest as "Other".
{"kind":"formula","op":"summarize(dataset(2),{usertype} as t1, top_n(2, {username}, 'Other') over(PARTITION BY {usertype} order by count({energy}) asc) as test)"}