You are given an array of strings.
Write a script to find out if the given strings can be chained to form a circle. Print
1
if found otherwise0
.A string
$S
can be put before another string$T
in circle if the last character of$S
is same as first character of$T
.
Input: @S = ("abc", "dea", "cd") Output: 1
Since we can form a circle, e.g. "abc"
, "cd"
, "dea"
.
Input: @S = ("ade", "cbd", "fgh") Output: 0
We cannot form a circle.
It is not clear whether the question is whether we can for a cycle (we opt to use the term cycle instead of circle here, as this is a more common term in this context). making use of all the strings, or whether any cycle is ok, even if it does not include all the strings. The examples are useless in that respect, as one example forms a cycle with all the strings, and the latter doesn't have any cycle.
We opt for the latter interpretation here, that is, we're looking for any cycle. The former interpretation is equivalent to finding a Eulerian Cycle.
We can turn the challenge into a graph problem. Let \(\mathcal{G} = (V, E)\) be a directed graph where the set of vertices \(V\) is defined by the beginning and end characters of the given strings. For each of the given strings, there will be a directed edge in the set of edges \(E\): an edge from the beginning character to the end character of the strings.
So, for the first example, we get \(V = (a, c, d)\) and \(E = (\langle a, c\rangle, \langle d, a\rangle, \langle c, d\rangle)\). Note since \(\mathcal{G}\) is a directed graph, the direction of the edges matter: \(\langle a, b \rangle\) is a different edge than \(\langle b, a \rangle\).
To determine whether the graph contains a cycle, we will calculate the transitive closure of the graph. The transitive closure of a graph \(\mathcal{G} = (V, E)\) is \(\mathcal{G}^T = (V, E^T)\), where \(\langle v_1, v_2 \rangle \in E^T \iff v_1 \in V \wedge v_2 \in V \wedge\) there is a path from \(v_1\) to \(v_2\) in \(\mathcal{G}\). A graph \(\mathcal{G}\) contains a cycle, iff the set \(\{v \in V | \langle v, v \rangle \in E^T\}\) is not empty. That is, the transitive closure has at least one edge with the same begin and end point.
To calculate the transitive closure, we make use of a simplified Floyd-Warshall algorithm. This algorithm is used to calculate the length all the shortest paths between any pair of vertices in a graph. However, for the transitive closure, all we care about is that the shortest path isn't \(\infty\).
The algorithm takes a graph as a adjacency matrix as input.
That is, matrix [x] [y]
is true of the graph has a directed edge
between x
and y
, and false otherwise.
In pseudo code, the algorithm is as follows (V
is the set of vertices
of the graph, matrix
is the adjacency matrix of the graph):
for k in V
do for i in V
do for j in V
do matrix [i] [j] = true if matrix [i] [k] and matrix [k] [j]
done
done
done
At the end of the procedure, we're left with the transitive closure
in matrix
.
It is easy to see that the running time of this algorithm is \(\Theta(|V|^3)\). For some cases, this may not be optimal; running Dijkstra's algorithm for each vertex takes \(\mathcal{O} (|E||V| + |V|^2 \log |V|)\), which is better than \(\Theta(|V|^3)\) if \(|E| \ll |V|^2\). However, Floyd-Warshall is really easy to implement, and if the strings are ASCII strings, \(|V|\) is small. So, we go with Floyd-Warshall.
We will be assuming we get a set of strings on each line of input (so, we get one answer per line of input), the string separated by white space.
There is a module on CPAN
which calculates the transitive closure,
Algorithm::Graphs::TransitiveClosure
, and it uses the
Floyd Warshall algorithm. Normally, I wouldn't use a CPAN module
doing all or most of the task, except when I authored the module.
And I did write and upload Algorithm::Graphs::TransitiveClosure
to CPAN back in 1998.
So, all what is needed to do is to read the input and create an adjacency matrix:
foreach my $node (split) {
$$graph {substr $node, 0, 1} {substr $node, -1} = 1;
}
call the algorithm (which modifies the given matrix):
use Algorithm::Graphs::TransitiveClosure qw [floyd_warshall];
floyd_warshall $graph;
and check for cycles:
say grep ({$$graph {$_} {$_}} keys %$graph) ? 1 : 0;
Find the full program on GitHub.
First, we read in the strings, and build a graph from it. We keep
two datastructures, nodes
, which tracks the nodes in our graph
(each string contributes its first and last character as a node in
the graph), and graph
, which is the adjacency matrix of the graph.
It's a two dimensional array, which has a true value if there a
directed edge between the nodes. AWK has proper multi-dimensional
arrays and autovivifies, so we don't have to declare any (sub-)arrays.
delete graph
delete nodes
for (i = 1; i <= NF; i ++) {
first = substr ($i, 1, 1)
last = substr ($i, length ($i))
graph [first, last] = 1
nodes [first] = 1
nodes [last] = 1
}
Now, we can apply Floyd-Warshall:
for (k in nodes) {
for (i in nodes) {
for (j in nodes) {
if (graph [k, j] > 0 && graph [i, k] > 0) {
graph [i, j] = 1
}
}
}
}
All what is left is to find out whether there is a node which
is reachable from itself. If so, the answer is 1
, else the
answer is 0
:
out = 0
for (i in nodes) {
if (graph [i, i] > 0) {
out = 1
}
}
print (out)
Find the full program on GitHub.
The Lua solution is similar to the AWK solution, but Lua doesn't have
autovivification, so we have to do a little bit more work. Reading
in the data (the line of input is in the variable line
), and building
the graph:
local graph = {}
local nodes = {}
for s in line : gmatch ("%S+")
do local first = s : sub ( 1, 1)
local last = s : sub (-1, -1)
if graph [first] == nil
then graph [first] = {}
end
graph [first] [last] = 1
nodes [first] = 1
nodes [last] = 1
end
Making sure the right sub arrays (or tables as Lua calls them) exist:
for node1 in pairs (nodes)
do for node2 in pairs (nodes)
do if graph [node1] == nil
then graph [node1] = {}
end
if graph [node1] [node2] == nil
then graph [node1] [node2] = 0
end
end
end
Floyd-Warshall:
for k in pairs (nodes)
do for i in pairs (nodes)
do for j in pairs (nodes)
do if graph [i] [j] == 0 and graph [k] [j] == 1 and
graph [i] [k] == 1
then graph [i] [j] = 1
end
end
end
end
Finding the final answer, and printing it:
local out = 0
for i in pairs (nodes)
do if graph [i] [i] == 1
then out = 1
end
end
print (out)
Find the full program on GitHub.
There are not hashes or associative arrays build in in C. Therefore, to make life for ourselves easier, we assume the strings consist of lowercase ASCII letters. There are 26 of them, so we have hard coded the size of the datastructure:
# include <stdbool.h>
# define NR_OF_LETTERS ('z' - 'a' + 1)
bool graph [NR_OF_LETTERS] [NR_OF_LETTERS];
for (size_t i = 0; i < NR_OF_LETTERS; i ++) {
for (size_t j = 0; j < NR_OF_LETTERS; j ++) {
graph [i] [j] = false;
}
}
We can now parse the data, and populate the adjacency matrix. Here,
line
is a pointer to a line from stdin
# include <ctype.h>
char * line_ptr = line;
while (*line_ptr) {
while (*line_ptr && !islower (*line_ptr)) {
line_ptr ++; /* Skip whitespace */
}
if (!*line_ptr) {
break; /* End of string reached */
}
char start = *line_ptr;
char end = *line_ptr ++;
while (*line_ptr && islower (*line_ptr)) {
end = *line_ptr ++;
}
graph [start - 'a'] [end - 'a'] = true;
}
Floyd-Warshall:
for (size_t k = 0; k < NR_OF_LETTERS; k ++) {
for (size_t i = 0; i < NR_OF_LETTERS; i ++) {
for (size_t j = 0; j < NR_OF_LETTERS; j ++) {
graph [i] [j] = graph [i] [j] ||
(graph [k] [j] && graph [i] [k]);
}
}
}
Finding and printing the answer:
short out = 0;
for (size_t i = 0; i < NR_OF_LETTERS; i ++) {
if (graph [i] [i]) {
out = 1;
break;
}
}
printf ("%d\n", out);
Find the full program on GitHub.
Bash does have associative arrays, but no easy way do multidimensional
arrays. But we can use a trick. Each of our nodes is exactly one character,
so in this case, we can simulate 2-d indices by concatenating both
coordinates. And since in Bash, concatenation is done by just sticking
tokens together, instead of writing graph [x] [y]
or graph [x, y]
,
we will write graph[$x$y]
.
We have to declare we're using associative arrays:
declare -A nodes
declare -A graph
Reading in a line from stdin
, splitting them on whitespace, and then
placing the results in an array strings
:
while read -a strings
We can now parse the data, and build our graph:
for string in ${strings[@]}
do first=${string:0:1}
last=${string:$((${#string}-1)):1}
nodes[$first]=1
nodes[$last]=1
graph[$first$last]=1
done
Note the way to get substrings in Bash: ${string:P:L}
which is
the substring of string
from position P
and length L
. To get
the length of a string, we use the syntax ${#string}
.
${array[@]}
expands the array array
to a list of tokens, each
token a value of the array. ${!array[@]}
does the same, but
then for the keys of the array. (This is more or less equivalent to
values @array
(or values %hash
) and keys @array
(or keys %hash
)
in Perl.)
We can now apply Floyd-Warshall:
for k in ${!nodes[@]}
do for i in ${!nodes[@]}
do for j in ${!nodes[@]}
do if [ X${graph[$k$j]} == X1 -a \
X${graph[$i$k]} == X1 ]
then graph[$i$j]=1
fi
done
done
done
And finding the answer:
out=0
for i in ${!nodes[@]}
do if [ X${graph[$i$i]} == X1 ]
then out=1
fi
done
echo $out
Find the full program on GitHub.
We also have implementations in Node.js, Python, and Ruby. They're all similar to the AWK or Lua solution above.