Perl Weekly Challenge 371: Missing Letter

by Abigail

Challenge

You are given a sequence of 5 lowercase letters, with one letter replaced by ‘?’. Each letter maps to its position in the alphabet (‘a = 1’, ‘b = 2’, …, ‘z = 26’). The sequence follows a repeating pattern of step sizes between consecutive letters. The pattern is either a constant step (e.g., ‘+2, +2, +2, +2’) or a simple alternating pattern of two distinct steps (e.g., ‘+2, +3, +2, +3’).

Examples

Input:  a c ? g i
Output: e

Input:  a d ? j m
Output: g

Input:  a e ? m q
Output: i

Input:  a c f ? k
Output: h

Input:  b e g ? l
Output: j

Solution

It is not necessary to determine whether the given sequence uses a constant step, or an alternating pattern of steps. In either case, the difference between the first and second character is the same as the difference between third and fourth character, and the difference between the second and third character is the same as the difference between the fourth and fifth.

So, depending on the position of the missing character, we can take a neighbouring character (base), and add (or subtract) a difference from two other characters (from and to).

We use the following table with character positions:

missing base from to
0 1 3 2
1 2 4 3
2 3 1 0
3 2 0 1
4 3 1 2

For instance, if we look at the first example, a c ? g i, the missing character is on position 2. So, we use the character on position 3 (g), and we subtract the difference between the characters on positions 1 and 0. So, we get g(=7) - (c(=3) - a(=1)) = e(=5).

Note also that to do all we need is the difference between the characters — it is not necessary to map them to numbers 1 .. 26. Using their ASCII or Unicode values will work as well.

Input

We will be reading from standard input. Input consists of lines, with five characters per line, separated by a single space. Input for the given examples will be:

a c ? g i
a d ? j m
a e ? m q
a c f ? k
b e g ? l

Perl

With a line of input in $_, we start off by splitting it on whitespace (leaving a list of characters), mapping them to their Unicode values, and storing them in an array:

my @letters = map {ord} split;

Next, we need to find the position of the missing character (which is the ?):

my ($qi) = grep {$letters [$_] == ord "?"} keys @letters;

Using the table above, we can extract the Unicode values of the base, from and to characters:

my $base = $letters [$qi <= 2 ? $qi + 1 :                      $qi - 1];
my $from = $letters [$qi <  2 ? $qi + 3 : $qi == 2 ? $qi - 1 : $qi - 3];
my $to   = $letters [$qi <  2 ? $qi + 2 :                      $qi - 2];

And that's enough to calculate the missing character, and then print it:

say chr $base - $from + $to;

Find the full program on GitHub.

Go

Our Go solution is similar, but instead of splitting the line of input on whitespace, we turn the line of input (text) into an slice of bytes (which we can do, as it's given the input is ASCII only):

letters := [] byte (text)

We can still use the table above, but the positions will be doubled, as the spaces are still in letters. We iterate once over the letters slice searching for the ?, and once found, get the values for base, from, and to. Go doesn't have a ternary operator, so we have the position checks outside of the assignment. Note the use of switch/case, which is the preferred way in Go to do if/elseif/else.

var base, from, to byte
for i := 0; i <= 8; i += 2 {
    if letters [i] == '?' {
        switch {
            case i <  4:
                base = letters [i + 2]
                from = letters [i + 6]
                to   = letters [i + 4]
            case i == 4:
                base = letters [i + 2]
                from = letters [i - 2]
                to   = letters [i - 4]
            case i >  4:
                base = letters [i - 2]
                from = letters [i - 6]
                to   = letters [i - 4]
        }
        break
    }
}

Calculating the result and printing it is now easy:

fmt . Printf ("%c\n", base - from + to)

Find the full program on GitHub.

BC

Bc poses a bit of a challenge. It only deals with numbers, and our input are letters, right?

Well, if you consider numbers in base 36, then the letters a .. z are just numbers: 10 .. 35. And you can tell bc to treat numbers in a different base. By setting the special variable ibase to 36, it will consider input (including values in the program (!)) to be in base 36. (Note, according to the documentation, the highest value ibase can be is 16, but setting it to 36 does work (GNU bc 1.08.2 on MacOS)). And, conveniently, if you try reading in a question mark (?), it reads it as a 0.

Bc doesn't allow use to read in more than a single token at a time. It also has no way of signalling end-of-file; trying to read past it results in the read call just waiting for more input. So we use a value of -1 to signal end of input.

We read in data in this way:

letters [0] = read ()
if (letters [0] < 0) {
    halt
}
letters [1] = read ()
letters [2] = read ()       
letters [3] = read ()
letters [4] = read ()

Now letters is a 5-element array with values. Next is finding the position of the question mark:

for (i = 0; i < 5; i ++) {
    if (letters [i] == 0) {
        qi = i
    }
}

We then find the character values for base, from and to in a similar way as we did for the Go implementation:

if (qi <  2) {
    base = letters [qi + 1]
    from = letters [qi + 3]
    to   = letters [qi + 2]
}
if (qi == 2) {
    base = letters [qi + 1]
    from = letters [qi - 1]
    to   = letters [qi - 2]
}
if (qi >  2) {
    base = letters [qi - 1]
    from = letters [qi - 3]
    to   = letters [qi - 2]
}

Calculating the value of the missing character is easy (base - from + to), but we can't just print it. Setting the output base (obase = 36) doesn't help — bc won't output letters, but groups three digits indicating the value of the output "digit". So, printing a value of a (in base 36), is outputted as 010.

To get around this, we use a function which takes a value, and prints out the corresponding letter. Note that values in the program are in base 36, using capital letters for values above 10:

define print_chr (ch) {
    if (ch == A) {print "a\n"}
    if (ch == B) {print "b\n"}
    if (ch == C) {print "c\n"}
    if (ch == D) {print "d\n"}
    if (ch == E) {print "e\n"}
    if (ch == F) {print "f\n"}
    if (ch == G) {print "g\n"}
    if (ch == H) {print "h\n"}
    if (ch == I) {print "i\n"}
    if (ch == J) {print "j\n"}
    if (ch == K) {print "k\n"}
    if (ch == L) {print "l\n"}
    if (ch == M) {print "m\n"}
    if (ch == N) {print "n\n"}
    if (ch == O) {print "o\n"}
    if (ch == P) {print "p\n"}
    if (ch == Q) {print "q\n"}
    if (ch == R) {print "r\n"}
    if (ch == S) {print "s\n"}
    if (ch == T) {print "t\n"}
    if (ch == U) {print "u\n"}
    if (ch == V) {print "v\n"}
    if (ch == W) {print "w\n"}
    if (ch == X) {print "x\n"}
    if (ch == Y) {print "y\n"}
    if (ch == Z) {print "z\n"}
}

which we call in this way:

skip = print_chr (base - from + to)

Note that we assign the return value of print_chr to a variable. Otherwise, bc will write those return values (0) to the output, and we don't want that.

Find the full program on GitHub.

We also have solutions in AWK, Bash, C, Lua, Node.js, Python, R, Ruby, and Tcl.


Please leave any comments as a GitHub issue.