Perl Weekly Challenge 373: Equal List

by Abigail

Challenge

You are given two arrays of strings.

Write a script to return true if the two given array represent the same strings otherwise false.

Examples

Input:  @arr1 = ("a", "bc")
        @arr2 = ("ab", "c")
Output: true

Input:  @arr1 = ("a", "b", "c")
        @arr2 = ("a", "bc")
Output: true

Input:  @arr1 = ("a", "bc")
        @arr2 = ("a", "c", "b")
Output: false

Input:  @arr1 = ("ab", "c", "")
        @arr2 = ("", "a", "bc")
Output: true

Input:  @arr1 = ("p", "e", "r", "l")
        @arr2 = ("perl")
Output: true

Input

We prefer to take our input not as arrays of strings, but reading from standard input — where each line corresponds with an excercise. We will represent the strings as whitespace separated sequences of letters. A dot represents the empty string, and the two arrays are separated by a semi-colon. So, the examples above translate to:

a bc;ab c
a b c;a bc
a bc;a c b
ab c .;. a bc
p e r l;perl

Solution

This is a pretty simple exercise. After reading in a line of input, we

Perl

With a line of input in $line, we start off by removing whitespace and dots:

$line =~ s/\s+//g;
$line =~ s/\.//g;

Then we split on the semi-colon:

my @parts = split /;/ => $line;

Finally, comparing the strings and printing the results:

say $parts [0] eq $parts [1] ? "true" : "false";

Find the full program on GitHub.

Ruby

Our Ruby solution uses the same steps as our Perl solution. But due to it's OO nature, we can chain the first steps:

parts = line . gsub(/\s+/, "")
             . gsub(/\./,  "")
             . split /;/

This removes the whitespace from the line of input, the dots, and then splits the line on a semi-colon, leaving both parts in the array parts.

Finally, compare the parts and print the results:

puts parts [0] == parts [1] ? "true" : "false"

Find the full program on GitHub.

C

In C, we have to work much harder. Strings are just arrays of characters, terminated by a NUL character (a character with byte value 0). We don't have any easy array or string modifying functions available.

We will parse the line of input (line, with linelen its length), once, copying the non-space, non-dot characters into two strings: one for the part before the semi-colon, and one for the part after the semi-colon.

We start off by claiming memory for the two strings — they cannot exceed the length of the input, so we just the lenght of the input string as the size of the two new strings. This is more than needed, but it will be enough.

char * part0;
char * part1;
if ((part0 = (char *) malloc (linelen * sizeof (char))) == NULL) {
   perror ("Malloc failed");
   exit (1);
}
if ((part1 = (char *) malloc (linelen * sizeof (char))) == NULL) {
   perror ("Malloc failed");
   exit (1);
}

Now we iterate over the input, skipping white space and dots, and initially copying to part0; after encountering a semi-colon, we copy into part1.

# include <ctype.h>
# define NUL '\0'

char * dest = part0;
for (char * ptr = line; * ptr; ptr ++) {
   if (isspace (* ptr) || * ptr == '.') {    /* Skip spaces and dots */
       continue;
   }
   if (* ptr == ';') {                       /* Split on ;           */
       * dest = NUL;                         /* Terminate part 0     */
       dest = part1;                         /* Switch to part 1     */
       continue;
   }
   * dest ++ = * ptr;                        /* Copy character       */
}
* dest = NUL;                                /* Terminate part 1     */

Note that strings in C need to be terminated by a NUL character, so we need to explicitly add a NUL character to the strings.

isspace requires the use of the ctype.h header file, hence the # include.

Comparing strings is easy, using the strcmp function, which returns the same values as Perl's <=> operator. In particular, it returns 0 is the strings are equal:

# include <strings.h>

printf ("%s\n", strcmp (part0, part1) == 0 ? "true" : "false");

Find the full program on GitHub.

Sed

Sed programs by default iterate over the input, executing the program for each line of the input. Regexes and substitutions are always applied against the current line of input. By default, at the end of the program, the current line will be printed.

First steps are similar to our Perl solution, removing white-space and dots:

s/\s+//g; 
s/\.//g;

But we cannot split a line into two, and compare the two halves. However, we can check whether the part of the string before the semi-colon equals the part after the semi-colon. If so, we replace the entire line by true:

s/^(.*);\1$/true/

This means that if the two parts do not match, the string will still have a semi-colon in it — if so, we replace the entire line with false:

s/^.*;.*$/false/

Find the full program on GitHub.

Befunge-93

Befunge-93 is a programming language very different from many others. Traditional programs are one dimensional, going top to bottom, with occasional jumps (which include loops and return statements). Their program counter (which points to the next instruction) is a single number, incrementing on each instruction — and jumps are implemented by assigning a new number of the program counter.

Befunge-93 programs run on a two dimensional 80x24 grid, which wraps around (so, they run on a torus). Programs start in the top left corner, moving right-ward. But instructions can make it move left-ward, up-ward or down-ward. Befunge-93 doesn't have variables; it does, however, have a single stack. Beside storing things on the stack, it also allows writing to the 80x24 grid itself. One can use this to modify the program, or to store data.

All instructions are single characters — which implies to get any numbers above 9 (or below 0) requires making them using arithmetic.

Before showing the program and drilling into the details, an overview of the program:

Our full program (line numbers are only for reference, they're not part of the program)

00: > > 1 0 45*0+p 1 0 45*1+p ">" 6 6p  v                     Init
01: ^   v < < < < < < < < < < < < < < < < p6 6 "v" $  < <     Switch
02: ^   > ~:1+!#@_ :" "-!#v_ :"."-!#v_ :55+-!#v_ :";"-!#^_v   Parse line
03: ^   ^ < < < < < < $ < < < < < < <         v           v   Skip spaces and dots
04: ^ v   < < < < < < $ < < < < < < < < < < < <           v   End of line
05: ^ v ^ v < < < < < < < < < < < < < < < < < < < < < < < <
06: ^ v ^   0 45*0+g 45*0+p 0 45*0+g 1+ 0 45*0+p  v           Store chars before ;
07: ^ v ^ > 0 45*1+g 45*1+p 0 45*1+g 1+ 0 45*1+p  v           Store chars after  ;
08: ^ v ^ < < < < < < < < < < < < < < < < < < < < <
09: ^ > > 0 45*0+g 0 45*1+g -#v_ 0 45*0+g > v                 Compare length
10: ^                         > > > > > > >   v
11: ^   v < < < < < < < < < < < < < < < < < < v
12: ^   > 1-: !#v_ ::45*0+g \ 45*1+g - #v_  ^ v               Compare strings
13: ^           > > > > > > > > v       v     v
14: ^ < $ ,+55 ,,,, < <  "true" <       v     v
15:                 ^ , "false" < < < < < < < <

As you have have guessed, the instructions >, v, <, and ^, send the program counter right, down, left or up. (Note the text on the right — Befunge-03 doesn't have comments, but Befunge-93 only cares what characters there if it encounters an instruction — so you can put free text anywhere where the program isn't going to be executed).

As said before, Befunge-93 only has single character instructions, which requires to do arithmetic to get numbers exceeding 9. This is how we make the number 21: 45*1+:

(154*+ would also have made 21). In a similar way, we make 20 using the instructions 45*0+. The 0+ is redundant, as adding 0 to something doesn't change it, but we use it for the symmetry of making 21.

Now, look at the first part of line 00, ignoring the arrows:

1 0 45*0+p

This puts a 1 on the stack, then a 0, then 20 (as explained above), then we encounter the p instruction. The p instruction pops off three things from the stack, in order, a row number, a column number, and a value. It then write the value on the grid, with the given row and column number. We use row 20 to store the string before the semi-colon, keeping track of its length using column 0. Column 0 will contain the index where the next character will be written (so, one more than its length). So, we write 1 to grid position (20, 0).

The next part is similar:

1 0 45*1+p

This writes 1 to grid position (21, 0).

The final part of this line is:

">" 6 6p

The " instruction puts the program into string mode. This means that for every character it encounters, it places the ASCII value of that character on the stack; this continues until the program encounters the next ". (It cannot change direction, or execute any other instruction while doing so). So, it places (the ASCII value of) a > on the stack, a 6, another 6, and then excutes the p instruction, effectively writing a > on column 6, row 6. This make lines 05, 06 and 07 look like:

^ v ^ v < < < < < < < < < < < < < < < < < < < < < < < <
^ v ^ > 0 45*0+g 45*0+p 0 45*0+g 1+ 0 45*0+p  v
^ v ^ > 0 45*1+g 45*1+p 0 45*1+g 1+ 0 45*1+p  v

We'll come back to this later.

Now we follow the arrows, and we encounter line 02:

~:1+!#@_ :" "-!#v_ :"."-!#v_ :55+-!#v_ :";"-!#^_v

This is the loop which parses our input, character by character.

Let's break it down. First part:

~ : 1+ ! #@_

The ~ reads a single character from the input, pushing its ASCII value on the stack. If EOF is encountered, the ~ instruction puts -1 on the stack. And this is the first thing we check for. Since we don't want to lose the value of the read in character, we duplicate it using the : instruction, then we add one to it (1+), and negate the value (!). This means that if we just have encounted EOF, we now have 1 at the top of the stack, else we have 0. The # instruction makes us skip the next one — since we're moving left to right, we skip over the @, and encounter _. _ is a conditional statement: it pops up a value from the stack, if it's true, it moves the program counter left-ward, else right-ward. So, if we encountered EOF, we now get the @ instruction — which just terminates the program. Otherwise, we continue:

: " "- ! #v_

First, we duplicate the read character again (as the first duplicate has been consumed), then we push the ASCII value of a space on the stack, and subtract the two value, then negate it. So, if the just read character was a space, we now have a true value on the stack, else a false value. The sequence #v_ is similar as above: if we just read a space, we jump over the v, then go back (due to the _), and execute the v anyway, else we continue to the right. If we do execute the v (so we have read a space), we go back to the beginning of the loop, after executing the $ instruction, which simply pops up a value from the stack, and discard it (the read space is still on top of the stack). If it's not a space, we get:

: "."- ! #v_

which does the same thing as above: check if the read character was a dot, and if so, discard it and restart the loop. Otherwise, we continue with the next check:

: 55+- ! #v_

This is a check whether we just read a newline — whose ASCII value is 10. And we get 10 by adding 5 and 5 together. If we did read an newline, we have finished processing an entire line of input, and we can compare strings — which we do starting from line 09 (just follow the arrows). We'll explain that process below. If we also don't encounter a newline, we're off to the next check:

: ";"- ! #^_ v

Here we check against reading a semi-colon. If we do, we go up and execute line 01 (going right to left (!)), else we execute line 06 or line 07, storing the read character.

First, line 02, to be executed when encountering a semi-colon. Its is executed left to right, but if we place the instruction in the way they're encountered, we get:

$ "v" 6 6p

We pop off a value from the stack (the read in semi-colon), then we write v on row 6, column 6, turning lines 05, 06, and 07 from

^ v ^ v < < < < < < < < < < < < < < < < < < < < < < < <
^ v ^ > 0 45*0+g 45*0+p 0 45*0+g 1+ 0 45*0+p  v
^ v ^ > 0 45*1+g 45*1+p 0 45*1+g 1+ 0 45*1+p  v

into

^ v ^ v < < < < < < < < < < < < < < < < < < < < < < < <
^ v ^ v 0 45*0+g 45*0+p 0 45*0+g 1+ 0 45*0+p  v
^ v ^ > 0 45*1+g 45*1+p 0 45*1+g 1+ 0 45*1+p  v

a very subtle, but important, difference. We then enter the main loop again.

Now, if we didn't encounter any of the special characters (EOF, space, dot, newline, semi-colon), we need to store the read character, by executing line 06 (before having encountered a semi-colon) or line 07 (afterwards). They're very similar, only difference is whether we write to row 20 or 21. So, we'll explain just one of the lines:

It's important to realize that when we're about to execute this line, the read in character (its ASCII value) is on top of the stack.

0 45*0+ g

This pushes 0 on the stack, followed by 20, then we execute the g command. The g command pops off a row number, a column number, then pushes the value found on that position to the stack. So, it pushes the value found on position (20, 0) on the stack, which is the index where to write the next character. So, we have the column on top of the stack, and the value below it. All we need is to push the row number (20) and execute the p instruction:

45*0+ p

Next we need to increment the value found in column 0. First, we get it:

0 45*0+ g

We add 1 to it:

1+

push the column and row number, and write it back:

0 45*0+ p

Finally, we go back to the main loop, which you can see by following the arrows.

Before explaining the comparison of the strings, let us explain the printing of result (true or false), which happens in lines 14 and 15:

^ < $ ,+55 ,,,, < <  "true" <
                ^ , "false" <

We come here from the right, going to the left. We have seen the " instruction before, which puts the ASCII values of the string on the stack. So, for line 14, we push 101 (e), 117 (u), 114 (r), and 116 (t) on the stack (in that order), and for line 15, it will be 101 (e), 115 (s), 108 (l), 97 (a), and 102 (f). The comma instruction (,) pops up a single value from the stack and prints the character representated by that ASCII value. So, if we have pushed false, we print 5 characters; if we have pushed true, we print 4 characters. Then we push 10 on the stack (55+) and print it as a character, so we terminate each line of our output with a newline. We then clear the stack, and go all the way back to the beginning of the program, getting ready for the next exercise.

So, how do we compare strings? We need to check two things: are the strings of the same length, and do all characters match up.

Comparing the lengths is done in the first part of line 09:

0 45*0+ g 0 45*1+ g - #v_

We fetch the (next) indices of the strings from the columns 0 of the rows 20 and 21, and subtract. If they are unequal, their difference will be a true value, and we down, and towards line 15 (which will print false).

Otherwise, we fetch the index again, and continue with line 12:

0 45*0+ g > v

Line 12 looks like this:

1-: !#v_ ::45*0+g \ 45*1+g - #v_  ^

When we enter here for the first time, on top of the stack is the first index past the end of the strings. So, we start of by subtracting 1 it.

1-

Now, if this value equals 0, we have compared the entire strings, haven't found a mismatch, so the answer is going to be true. We duplicate the value, negate it, and send our program to line 15 if it's 0, else we continue:

: ! #v_

We now need to fetch the two characters. Since we need the index (which is now on top of the stack) twice more, we first duplicate it twice:

::

Then we fetch the character from the first string (on row 20):

45*0+ g

So, we now have the first character on top of the stack, and the index to fetch the second character from below it. The \ swaps those values, so we do that, and fetch the second character:

\ 45*1+ g

We now have the two characters we want to compare on top of the stack. We can simply subtract them. If the result is 0 (false), they're equal, else, they're not. If they're not equal, we'll print false, else we continue the compare loop:

- #v_ ^

And that completes our program.


Please leave any comments as a GitHub issue.