Perl Weekly Challenge 145: Palindromic Tree

by Abigail

Challenge

You are given a string $s.

Write a script to create a Palindromic Tree for the given string.

I found this blog exaplaining Palindromic Tree in detail.

Example 1

Input: $s = 'redivider'
Output: r redivider e edivide d divid i ivi v

Example 2

Input: $s = 'deific'
Output: d e i ifi f c

Example 3

Input: $s = 'rotors'
Output: r rotor o oto t s

Example 4

Input: $s = 'challenge'
Output: c h a l ll e n g

Example 5

Input: $s = 'champion'
Output: c h a m p i o n

Example 6

Input: $s = 'christmas'
Output: c h r i s t m a

Discussion

This is one of the many challenges which makes we want to turn away from the Perl Weekly Challenge, and never look back.

The challenge description asks us to do a specific thing (create a Palindromic Tree, which is a graph based data structure), then the examples completely ignore that, and just lists all the palindromic substrings of the input, with duplicates removed.

So, is it:

We will opt for the latter. And since there is, obviously, no requirement on the order of the outputted substrings, we will assume the order given in the examples (same order as their first appearance in the input) is an artifact of the algorithm used, and not a requirement.

Unicode and Combining Characters

With Unicode, graphemes are not the same as characters. A grapheme can consist of more than one character, or can be expressed in multiple ways.

Take for instance 'ò'. This could be a single character, Latin Small Letter O with Grave. Or it could be two: Latin Small Letter O followed by Combining Grave Accent. If the former is a palindrome, should the latter be? What if we have 'òò' where one of the graphemes consists of an 'o' and a combining accent?

We decide to ignore all of this, and only look at the character level. So, 'ò' is a palindrome if and only if it's a single character.

Solutions

We will assume the input consists of one or more lines; for each line (sans its newline), we calculate the palindromic substrings.

Perl

The "usual way" to get all the substrings of string to have have two nested for loops, iterating over the indices. However, we can abuse Perls regexp engine:

/(.+)(?{code})(*FAIL)/

Here, (.+) matches any substring, putting the matched substring in $1. Perl will prefer the left-most, longest substring, but only under the condition the rest of the pattern matches. (?{code}) is a construct which always matches — its point is to execute code. It will be executed each time (.+) matches, and code will have access to $1. (*FAIL) however, is a construct which will always fail to match, causing the engine to backtrack.

The result is that Perl will try to match each and every substring with (.+) before finally concluding there is no match. We don't care about the actual not matching; we do care about the side effect: code is executed exactly once for every substring.

So, what will be put into code? A check if the substring is a palindrome, and if so, we collect it:

my %seen;
/(.+)(?{$seen {$1} ++ if $1 eq reverse $1})(*FAIL)/;

A palindrome reads the same back to front as front to back, so comparing it to its reverse will do.

Now %seen will have all the palindromic substrings as its keys, which we can easily print:

local $, = $";
say keys %seen;

Find the full program on GitHub.

Tcl

Our Tcl solution will use two nested loops to extract all substrings. If the substring is a palindrome, we put it in a dictionary. At the end, we print the keys of the dictionary:

while {[gets stdin line] >= 0} {
    set palindromes [dict create]
    for {set i 0} {$i < [string length $line]} {incr i} {
        for {set j $i} {$j < [string length $line]} {incr j} {
            set str [string range $line $i $j]
            if {$str == [string reverse $str]} {
                dict set palindromes $str 1
            }
        }
    }
    puts [dict keys $palindromes]
}

string range gives the substring.

Find the full program on GitHub.

Go

Go doesn't have a method to reverse a string, so, we first create a method to check whether a string is a palindrome:

func is_palindrome (str string) bool {
    runes := [] rune (str)
    for i, j := 0, len (runes) - 1; i < j; i, j = i + 1, j - 1 {
        if runes [i] != runes [j] {
            return false
        }
    }
    return true
}

In Go, strings can be split into sequences of runes, where a rune is more or less equivalent to a Unicode code point. Once split into runes we'll use two pointers, one from the start and counting upwards, and one from the end counting downwards. We'll compare the runes under the pointers, and return false if there is any pair which differs. If all pairs agree, it's a palindrome.

We can now use two nested loops to get substrings of the input line (text), and if the substring is a palindrome, we store it in a map. At the end, we print the keys from the map:

palindromes := make (map [string] bool)
for i := 0; i < len (text) - 1; i ++ {
    for j := i; j < len (text) - 1; j ++ {
        substr := text [i : j + 1]
        if is_palindrome (substr) {
            palindromes [substr] = true
        }
    }
}
for k := range palindromes {
    fmt . Printf ("%s ", k)
}
fmt . Println ("")

Find the full program on GitHub.

Other languages

We also have implementations in:

AWK, Bash, C, Java, Lua, Node.js, Pascal, Python, R, Ruby, and Scheme.


Please leave any comments as a GitHub issue.