Aggregate your data

The Aggregate function allows you to aggregate values such as by taking the mean in one or more columns at a time based on groups defined by another column. A common use is to aggregate a collection of values that have a sampling time column to collection of annual average values. The Aggregate function is essentially the ‘Group by’ function with the addition of a summary method by which to combine the group elements.

Step 1: Select your data from the left file tree or from “files” tab that needs to be manipulated.

Step 2: Once you select the file, it will be opened in the “Data Wrangler”.

Step 3: Click on the “Aggregate” and a popup will appear.

Step 4: Click on the columns in which you want to apply aggregate under “Select data”.

Step 5: After selecting the columns you want to group, the method area drop-down will allow you to select the column that you wish to base the groups on. Depending on the grouping column that you select from the drop down, you will be given a series of grouping options.

  1. Date-time column: your options for grouping are year, month, day of the week or hour of the day.
  2. Numeric column: you can choose the number of bins to group the data into. The bins are of equal size and are calculated as the range in values divided into equal range bins.
  3. Text: each unique text string will be used to form a group. For example, if the column contains names, then the output will have a new column for each person.

Note: The output file will have the same number of columns as the input file (or the subset of columns that you select), but the number of rows will be reduced so that each row represents the summary value for all the group elements.

Step 6: Once the groups are selected, the next step is to describe how all the elements in the groups needs to be summarized. your options are:

  1. Average: The sum of all the values divided by the total number of values in the collection.
  2. Median: The middle value when all the values are sorted. This is also the 50th percentile.
  3. Mode: The most common value in a collection.
  4. Maximum: The maximum value in the collection.
  5. Minimum: The minimum value in the collection.
  6. Sum: The sum of all the values in the collection.
  7. Count: The number of values in the collection.
  8. StDev: The standard deviation is a measure of the amount of variation or dispersion of a data collection.

Step 7: The default “Result” is to overwrite the original file with the updated file.

Select what you want to save as your output file from the “output” drop down menu.

If you want to keep the original file along with your updated file then, click on “Keep all columns” or “only keep selected columns” under “create new file”. Click “run”. The new file will be saved in the same folder as original file.