Documentation

  1. Home
  2. Docs
  3. Documentation
  4. What Questions Can I Ask?
  5. Aggregations and Grouping

Aggregations and Grouping

Summary: AnswerDock allows you to easily aggregate metrics, count records and group by dimensions. You just need to type the aggregate function (e.g. sum) followed by its related metric and the dimension to group by.

Understanding Metrics and Dimensions

Columns in your datasets belong to two main types: Metrics or Dimensions (or attributes):

Metrics are numerical fields that can be aggregated (e.g. Sum, Average, etc..).

Dimensions are typically text fields used to segment the data (e.g. Country, Customer Name, OrderDate, etc..). Dimensions can also be numerical fields, with low number of distinct values (e.g. Customer Satisfaction score from 1-5). The Satisfaction Score in this case can be averages (treated as a metric) or used to group the data (e.g. Number of Sessions by Customer Satisfaction Score).

Performing Aggregations and Groupings 

Performing  aggregations of data is quite straightforward in AnswerDock. You just need to type the Metric or Dimension name. You can also look up aggregates, metrics or dimensions from the search guidance menu.

For aggregations. you can use any of the following aggregates functions:

  • Sum: Returns the sum of all values in a column
  • AverageReturns the Average of all values in a column
  • CountReturns the count of values in a column (i.e. count of records)
  • Distinct CountReturns the unique count of values in a column (similar values are counted only once)
  • Standard DeviationReturns the Standard deviation of all values in a column
  • VarianceReturns the Variance of all values in a column
  • MinReturns the minimum values of all values in a column
  • MaxReturns the maximum value of all values in a column

Note that you can use any synonym for aggregates (e.g. Total instead of Sum, Unique Count instead of Distinct Count, Avg. instead of Average, etc..). Also note that in the search box, aggregates are highlighted in green and columns (metrics or dimensions) are highlighted in blue.

AnswerDock also  supports complex / accumulative aggregations as below:

  • Cumulative Sum:Returns the sum of the metric, accumulated by the dimension in your question
  • Cumulative Average: Returns the Average of the metric, accumulated by the dimension in your question
  • Cumulative Count: Returns the count of records, accumulated by the dimension in your question
  • Cumulative Min:Returns the minimum value of the metric, accumulated by the dimension in your question
  • Cumulative Max:Returns the maximum value of the metric, accumulated by the dimension in your question

You can know more about Cumulative aggregations in the next section of the documentation.

 Some examples of questions you can ask:
  • Number of leads by sales person
  • Total Revenue this month
  • Average temperature by city in may 2018
  • Variance of humidity this year
  • Standard Deviation of ratings this week
  • Count of orders by category this month
  • MaX temperature per month of year in last 50 years
  • Minimum delivery duration per destination last quarter
  • Unique count of customers with order date between jan to may this year

Below is an example of a question with Aggregate (average), metric (wait time) and two dimensions (department, visit type)

Default Aggregates

In some cases, Metric Fields will default to their default aggregate set in the Fields settings for the dataset. For example, typing “Sales” will automatically Sum the Sales metric if it has it as a default aggregate. In other cases, you would need to explicitly mention the aggregate in order to Sum the data. For example, “Sales vs. Sessions” will not aggregate Sales by default, it will list the Sales Values next to Sessions in order to be able to plot them on a scatter chart. If you need the Sum of Sales next to the Sum of sessions, you can just add Sum at the beginning :

  •  Sum Sales Sessions
Using the Grouping “By” word

In most cases, you do not need to use the “By” word to indicate grouping by a certain dimension (e.g. sales by category). This is because most dimensions are text fields and the default behavior would be to group the metrics by these dimensions.

In certain cases, however, the “By” or “For Each” or “Per” word is useful to indicate grouping by a certain column, if that column is numeric and the default behavior for it is to be aggregated.

For example, lets say we have a column of “Rating” that has customer rating scores from 1 to 5. Since this is Numeric, we can define this column as a metric with an average as a default aggregate in the dataset settings. in this case typing “Sales Rating” will show the sum of sales next to average of ratings. However, if we’d like the Sales for each rating score, we can then use the grouping word and type “Sales by Rating”. In this case the Rating column is treated as a dimension and AnswerDock will return the Sum of Sales grouped for each Rating.

Aggregating Dimensions

Dimensions can be aggregated if they contain only numeric values. In our previous example for the “Rating” column, we could have defined it as a dimension in the dataset column settings. In this case, typing “Rating” will not average the rating as default, but will list down its values. If we want to compute the average, we can type “Average Rating” and this will display the average, even though we have the field defined as a dimension.

Counting Records

In AnswerDock, you can define in the datasets settings page what each record in your dataset represents. For A search log dataset for example , you can name the records as “Queries” or “Searches”. This will let you ask questions that relate to the number of records:

  • Count of queries this month

Note that queries in this case is not a column in the dataset, its the record name defined in the dataset settings. Also note that you can define multiple names for the records in the dataset (e.g. query, search, hit, etc.. ). You do not need to cater for the plural of the record name, AnswerDock handles these automatically. So even if you define the record name as “search”, you will still be able to ask questions as “Searches this month”. Finally, note that even if you do not define a name for the records in your dataset, you can still simply ask” Number of records this month by category“. AnswerDock understands the keyword “records” and will count the number of records as per the asked question.