Given a word, you can sort its letters alphabetically (case insensitive). For example, 'beekeeper' becomes 'beeeeekpr' and 'dictionary' becomes 'acdiinorty'. Write a script to find the longest English words that don't change when their letters are sorted.
The English language isn't well defined. There's no authority which dedices which words belong to the English language, and which don't. We therefore resort to a word list, which we assume to be on standard input (one word per line).
This has a pro and a con. The pro is that our solutions are language agnostic — feed it words from a different language, and it finds the longest word with the given constraint (of course, the language needs to be written using ASCII letters).
The con is that the answer depends on the given word list. And even for some word lists, the answer may not be unique: there can be multiple words with the same length.
For instance, the public domain ENABLE word list has two words with seven letters where the letters are in alphabetical order: beefily and billowy, and no longer words.
The file /usr/share/dict/words
on my OS six such words. In addition
to the two words from the ENABLE list, it also has Adelops, alloquy,
begorry, and egilops.
The wordlist originally compiled by infochimps.com has an eight letter word: aegilops.
We will only consider words containing ASCII letters; any word containing anything else, hyphens, digits, accented letters, we will ignore.
We will consider the input words to be caseless. That is, 'a' sorts before 'B', which sorts before 'c'.
Three different approaches spring to mind.
/^a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*$/i
matches
any word where the letters are in alphabetical order.For our solutions below, if the language supports regular expressions, we use the third option. Else, we use the second option.
my $pat = join "" => map {"$_*"} 'a' .. 'z';
my $longest = "";
while (<>) {
$longest = $_ if /^$pat$/i && length ($_) > length ($longest)
}
print $longest;
There's not much to it. Of all the input words matching the pattern (so its letters are in sorted order), remember the longest.
Find the full program on GitHub.
GNU AWK support IGNORECASE
, to do case insensitive matching.
First, we create a pattern, by iterating over the ASCII values:
BEGIN {
IGNORECASE = 1
longest = ""
#
# Create a pattern /^a*b*...z*$/
#
pat = "^"
for (i = 97; i <= 122; i ++) {
pat = pat sprintf ("%c*", i)
}
pat = pat "$"
}
We can then use the pattern to remember the longest word:
match ($0, pat) && length ($0) > length (longest) {
longest = $0
}
In an END
block, we print the longest word:
END {
print longest
}
Find the full program on GitHub.
In Bash, sub patterns can have quantifiers, but the syntax uses a prefix
notation instead of a postfix. And it needs to be enabled with shopt
:
shopt -s extglob
pat1="*(a)*(b)*(c)*(d)*(e)*(f)*(g)*(h)*(i)*(j)*(k)*(l)*(m)"
pat2="*(n)*(o)*(p)*(q)*(r)*(s)*(t)*(u)*(v)*(w)*(x)*(y)*(z)"
Bash doesn't have pattern matching — but it does have substitution of a pattern. So, we apply the pattern to the word, replace what was matched by an empty string, and then check whether we're left with an empty string. Those are the matching words.
Note also the unique way of lowercasing a string: ${word,,}
.
while read line
do lower=${line,,} # Lower case input
left=${lower/#$pat1$pat2/} # Remove pattern
# Test whether nothing left,
# and string larger longest found
if [ "X$left" == "X" -a ${#line} -gt ${#longest} ]
then longest=$line
fi
done
Find the full program on GitHub.
No regular expressions in C. Instead, we iterate over the read string, and check whether all its characters are in alphabetical order:
char * line = NULL;
size_t len = 0;
while (getline (&line, &len, stdin) != -1) {
size_t i = 0;
bool in_order = 1;
while (line [i] != '\n') {
if ((i && tolower (line [i]) < tolower (line [i - 1])) ||
!isalpha (line [i])) {
in_order = false;
break;
}
i ++;
}
Of the words in alphabetical order, we have to remember the longest:
char * longest = NULL;
size_t longest_size = 0;
if (in_order && i > longest_size) {
longest_size = i;
if ((longest = (char *) realloc (longest, (i + 1) * sizeof (char)))
== NULL) {
perror ("Realloc failed");
exit (1);
}
for (size_t j = 0; j <= i; j ++) { /* Copies the newline as well */
longest [j] = line [j];
}
longest [i + 1] = '\0';
}
Find the full program on GitHub.
A short Lua program, using a regular expression. Lua doesn't have case insensitive matching, so we match against a lower case copy of the input:
local longest = ""
for line in io . lines () do
if line : lower ()
: find ("^a*b*c*d*e*f*g*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*$")
and line : len () > longest : len ()
then longest = line
end
end
print (longest)
Find the full program on GitHub.
Node.js supports regular expressions, leading to:
let longest = ""
require ('readline')
. createInterface ({input: process . stdin})
. on ('line', _ => {
if (_ . match (/^a*b*c*d*e*f*g*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*$/i) &&
_ . length > longest . length) {
longest = _
}
})
. on ('close', _ => {
console . log (longest)
})
Find the full program on GitHub.
Pascal doesn't have regular expressions, so we iterate over the characters, comparing each character with the previous one.
uses sysutils;
var
line, longest: string;
ch, prev_ch: char;
valid: boolean;
begin
longest := '';
while not eof () do
begin
readln (line);
valid := true;
prev_ch := ' '; (* Any char less than 'a' will do *)
for ch in lowercase (line) do
begin
if (ch < 'a') or (ch > 'z') or (ch < prev_ch) then
begin
valid := false;
break;
end;
prev_ch := ch;
end;
if valid and (length (line) > length (longest)) then
begin
longest := line;
end;
end;
writeln (longest);
end.
Find the full program on GitHub.
First, we create a pattern:
pat = "^"
for x in list (string . ascii_lowercase):
pat = pat + x + "*"
pat += "$"
Then we match, and remember the longest:
longest = ""
for line in fileinput . input ():
line = line . strip ()
if re . match (pat, line . lower ()) and len (line) > len (longest):
longest = line
print (longest)
Find the full program on GitHub.
Creating the pattern:
pat = "^"
("a" .. "z") . each do
|letter|
pat += letter + "*"
end
pat += "$"
Matching:
longest = ""
ARGF . each_line do
|line|
if line . downcase =~ /#{pat}/ && line . length > longest . length
longest = line
end
end
puts (longest)
Find the full program on GitHub.