We are given a set \(\mathcal{S}\) of numbers:
16,1,2,0,4,2,7,1,2,14
We are looking for another number \(N\) (integer) which minimizes a certain measurement: the sum of the cost between \(N\) and each element of \(\mathcal{S}\).
For Part One, the cost between two numbers is just the absolute value of their difference.
For the given example set, the answer will be 37.
For part two, the cost is defined as:
That is, if the absolute difference is \(k\), the cost is the \(k^{\text{th}}\) triangle number.
For the given example set, the answer will be 168.
We will present two ways of solving the problem, a brute force one, and one which uses statistics.
For triangle numbers, we have:
\[ 1 + 2 + \ldots + n = \sum_{k = 1}^n k = \frac{n * (n + 1)}{2} \]
First, we read in the data:
my @nums = <> =~ /[0-9]+/g;
Next, we define two functions to calculate costs. cost1
calculates
the cost for part one, while cost2
calculates the cost for part two.
Both functions take two arguments: a number ($target
), and a reference
to an array with numbers ($nums
). What they return is the sum of the
costs between $target
and each of the elements of $nums
:
sub cost1 ($target, $nums) {
sum map {abs ($target - $_)} @$nums;
}
sub cost2 ($target, $nums) {
sum map {my $n = abs ($target - $_); $n * ($n + 1) / 2} @$nums;
}
For the brute force approach, we calculate the sum of the costs
for every possible integer between the minimum and maximum values
of @nums
, and report the minimum:
say "Solution 1: ", min map {cost1 $_, \@nums} min (@nums) .. max (@nums);
say "Solution 2: ", min map {cost2 $_, \@nums} min (@nums) .. max (@nums);
Find the full program on GitHub.
The sum of the costs of part one will be minimized for the median of \(\mathcal{S}\), while the sum of the costs for part two will be minimized for the arithmetic mean of \(\mathcal{S}\).
Now, both the median and the mean may be non-integers. But if we look at the sum of the costs when going from the minimum value of \(\mathcal{S}\) to the maximum value of \(\mathcal{S}\), the sum of the costs will strictly decrease from the minimum value of \(\mathcal{S}\) to the median/mean, and then strictly increase till we have reached the maximum value of \(\mathcal{S}\).
So, we will be calculating the costs for the median/mean rounded up, and rounded down: the minimum of those will be the answer.
Which leads to:
my $median = median @nums;
my $mean = mean @nums;
say "Solution 1: ", min cost1 (floor ($median), \@nums),
cost1 ( ceil ($median), \@nums);
say "Solution 2: ", min cost2 (floor ($mean), \@nums),
cost2 ( ceil ($mean), \@nums);
We are importing min
and sum
from List::Util
,
median
and mean
from Statistics::Basic
, and
floor
and ceil
from
POSIX
.
Find the full program on GitHub.
Below is a graph show the fuel costs depending on where the crabs converge.