mirror of
https://github.com/boostorg/graph.git
synced 2026-02-01 08:32:11 +00:00
305 lines
10 KiB
HTML
305 lines
10 KiB
HTML
<HTML>
|
|
<!--
|
|
-- Copyright (c) Jeremy Siek 2000
|
|
--
|
|
-- Permission to use, copy, modify, distribute and sell this software
|
|
-- and its documentation for any purpose is hereby granted without fee,
|
|
-- provided that the above copyright notice appears in all copies and
|
|
-- that both that copyright notice and this permission notice appear
|
|
-- in supporting documentation. Silicon Graphics makes no
|
|
-- representations about the suitability of this software for any
|
|
-- purpose. It is provided "as is" without express or implied warranty.
|
|
-->
|
|
<Head>
|
|
<Title>Kevin Bacon Example</Title>
|
|
<BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
|
|
ALINK="#ff0000">
|
|
<IMG SRC="../../../c++boost.gif"
|
|
ALT="C++ Boost" width="277" height="86">
|
|
|
|
<BR Clear>
|
|
|
|
<H1>Six Degrees of Kevin Bacon</H1>
|
|
|
|
<P>
|
|
A fun application of graph theory comes up in the popular game ``Six
|
|
Degrees of Kevin Bacon''. The idea of the game is to connect an actor
|
|
to Kevin Bacon through a trail of actors who appeared together in
|
|
movies, and do so in less than six steps. For example, Theodore
|
|
Hesburgh (President Emeritus of the University of Notre Dame) was in
|
|
the movie ``Rudy'' with the actor Gerry Becker, who was in the movie
|
|
``Sleepers'' with Kevin Bacon. Why Kevin Bacon? For some reason the
|
|
three students who invented the game, Mike Ginelli, Craig Fass, and
|
|
Brian Turtle decided that Kevin Bacon is the center of the
|
|
entertainment world.
|
|
|
|
<P>
|
|
The ``Six Degrees of Kevin Bacon'' game is really a graph problem. If
|
|
you assign each actor to a vertex, and add an edge between two actors
|
|
if they have appeared together in a movie, then you have a graph that
|
|
represents the data for this game. Then the problem of finding a trail
|
|
of actors to Kevin Bacon becomes a traditional graph problem: that of
|
|
finding a <I>path</I> between two vertices. Since we wish to find a
|
|
path that is shorter than six steps, ideally we would find the
|
|
<I>shortest path</I> between the vertices. Now it turns out that there
|
|
are no algorithms for finding the shortest path between a pair of
|
|
vertices that are asymptotically faster than the algorithms for
|
|
finding the shortest path from one vertex to all other vertices in the
|
|
graph. For this reason we do not have algorithms for finding a
|
|
shortest path between a pair of vertices in BGL. There are two
|
|
single-source shortest paths algorithms implemented in BGL:
|
|
Dijkstra's and the Bellman-Ford algorithm. In the following example we
|
|
will show how to use Dijkstra's algorithm to calculate shortest paths
|
|
to Kevin Bacon.
|
|
|
|
<P>
|
|
|
|
<H2>Input File and Graph Setup</H2>
|
|
|
|
<P>
|
|
For this example we will use a much smaller graph than all movies and
|
|
all actors. The full source code for this example is in <a
|
|
href="../examples/kevin_bacon.cpp"><TT>examples/kevin_bacon.cpp</TT></a>.
|
|
We have supplied a file <TT>kevin_bacon.dat</TT> which contains a list
|
|
of actor pairs who appeared in the same movie. Each line of the file
|
|
contains an actor's name, a movie, and another actor that appeared in
|
|
the movie. The ``|'' character is used as a separator. Here is an
|
|
excerpt.
|
|
|
|
<P>
|
|
<PRE>
|
|
Patrick Stewart|Prince of Egypt, The (1998)|Steve Martin
|
|
</PRE>
|
|
|
|
<P>
|
|
Our first task will be to read in the file and create a graph from
|
|
it. We use a <TT>std::ifstream</TT> to input the file.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::ifstream datafile("./kevin_bacon.dat");
|
|
if (!datafile) {
|
|
cerr << "No ./kevin_bacon.dat file" << endl;
|
|
return -1;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
We will want to attach the actor's names to the vertices in the graph,
|
|
and the movie names to the edges. In addition, Dijkstra's algorithm
|
|
will expect edges to have a weight property (we will just set the
|
|
weights to one). We use the <TT>property</TT> class to specify the
|
|
addition of these vertex and edge properties to the graph.
|
|
We choose to use an undirected graph, because the relationship
|
|
of actors appearing together in a movie is symmetric.
|
|
|
|
<P>
|
|
<PRE>
|
|
using namespace boost;
|
|
typedef adjacency_list<vecS, vecS, undirectedS,
|
|
property<vertex_name_t, string>,
|
|
property<edge_name_t, string, property<edge_weight_t, int> >
|
|
> Graph;
|
|
</PRE>
|
|
|
|
<P>
|
|
To access the properties, we will need to obtain property map
|
|
objects from the graph. The following code shows how this is done.
|
|
|
|
<P>
|
|
<PRE>
|
|
boost::property_map<Graph, vertex_name_t>::type actor_name = get(vertex_name, g);
|
|
boost::property_map<Graph, edge_name_t>::type connecting_movie = get(edge_name, g);
|
|
boost::property_map<Graph, edge_weight_t>::type weight = get(edge_weight, g);
|
|
</PRE>
|
|
|
|
<P>
|
|
Next we can start to loop through the file, parsing each line into a
|
|
list of tokens. <TT>stringtok</TT> is a C++ version of the old
|
|
<TT>strtok</TT> standard C function.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::string line;
|
|
while (std::getline(datafile,line)) {
|
|
std::list<std::string> line_toks;
|
|
boost::stringtok(line_toks, line, "|");
|
|
...
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
Each line of the input corresponds to an edge in the graph, with the
|
|
two actors as the vertices incident to the edge, and the movie name
|
|
will be a property attached to the edge. One challenge in creating the
|
|
graph from this format of input is that the edges are specified by a
|
|
pair of actor names. As we add vertices to the graph, we'll need to
|
|
create a map from actor names to their vertices so that later
|
|
appearances of the same actor (on a different edge) can be linked with
|
|
the correct vertex in the graph. The <a
|
|
href="http://www.sgi.com/Technology/STL/Map.html"><TT>std::map</TT></a>
|
|
can be used to implement this mapping.
|
|
|
|
<P>
|
|
<PRE>
|
|
typedef graph_traits<Graph>::vertex_descriptor Vertex;
|
|
typedef std::map<string, Vertex> NameVertexMap;
|
|
NameVertexMap actors;
|
|
</PRE>
|
|
|
|
<P>
|
|
The first token will be an actor's name. If the actor is not already
|
|
in our actor's map we will add a vertex to the graph, set the name
|
|
property of the vertex, and record the vertex descriptor in the map.
|
|
If the actor is already in the graph, we will retreive the vertex
|
|
descriptor from the map's <TT>pos</TT> iterator.
|
|
|
|
<P>
|
|
<PRE>
|
|
NameVertexMap::iterator pos;
|
|
bool inserted;
|
|
Vertex u, v;
|
|
|
|
std::list<std::string>::iterator i = line_toks.begin();
|
|
|
|
boost::tie(pos, inserted) = actors.insert(make_pair(*i, Vertex()));
|
|
if (inserted) {
|
|
u = add_vertex(g);
|
|
actor_name[u] = *i;
|
|
pos->second = u;
|
|
} else
|
|
u = pos->second;
|
|
++i;
|
|
</PRE>
|
|
|
|
<P>
|
|
The second token is the name of the movie, and the third token is the
|
|
other actor. We use the same technique as above to insert the second
|
|
actor into the graph.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::string movie_name = *i++;
|
|
|
|
boost::tie(pos, inserted) = actors.insert(make_pair(*i, Vertex()));
|
|
if (inserted) {
|
|
v = add_vertex(g);
|
|
actor_name[v] = *i;
|
|
pos->second = v;
|
|
} else
|
|
v = pos->second;
|
|
</PRE>
|
|
|
|
<P>
|
|
The final step is to add an edge connecting the two actors, and
|
|
record the name of the connecting movie. We also set the weight
|
|
of the edge to one. Since we chose <TT>setS</TT> for the <TT>EdgeList</TT>
|
|
type of the <TT>adjacency_list</TT>, any parallel edges in the input
|
|
will not be inserted into the graph.
|
|
|
|
<P>
|
|
<PRE>
|
|
Edge e;
|
|
boost::tie(e, inserted) = add_edge(u, v, g);
|
|
if (inserted) {
|
|
connecting_movie[e] = movie_name;
|
|
weight[e] = 1;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
|
|
<H2>Applying Shortest Paths</H2>
|
|
|
|
<P>
|
|
The documentation for variant 4 of <A
|
|
HREF="./dijkstra_shortest_paths.html"><tt>dijkstra_shortest_paths()</tt></a>
|
|
algorithm shows that we will need to provide several more property
|
|
maps for distance, color, and vertex ID (edge weights are already
|
|
on the graph) . The distance will be the shortest distances from each
|
|
actor to Kevin Bacon calculated by the algorithm. This distance has
|
|
also been referred to as the ``Bacon Number'' in memory of the ``Erdos
|
|
Number'' used to rank mathematicians according to how many
|
|
publications separate them from Paul Erdos. We will use a vector to
|
|
store the bacon numbers, and another vector to store the color
|
|
property needed by Dijkstra's algorithm. The <TT>predecessor</TT>
|
|
vector will be used to record the shortest path from each actor to
|
|
Kevin. For each vertex, the <TT>predecessor</TT> vector will contain a
|
|
pointer to the next vertex on the shortest path towards Kevin Bacon.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::vector<int> bacon_number(num_vertices(g));
|
|
std::vector<int> color(num_vertices(g));
|
|
std::vector<Vertex> predecessor(num_vertices(g));
|
|
</PRE>
|
|
|
|
<P>
|
|
Dijkstra's algorithm also requires a source vertex as input; of
|
|
course this will be Kevin Bacon. We now call the BGL Dijkstra
|
|
algorithm, passing in the graph, source vertex, and all the property
|
|
maps. We use the <TT>predecessor_visitor</TT> to record the
|
|
predecessors, since this is not part of the basic algorithm.
|
|
|
|
<P>
|
|
<PRE>
|
|
Vertex src = actors["Kevin Bacon"];
|
|
|
|
dijkstra_shortest_paths
|
|
(g, src, bacon_number.begin(),
|
|
get(edge_weight, g), color.begin(), get(vertex_index, g),
|
|
make_ucs_visitor(record_predecessors(predecessor.begin(),
|
|
on_edge_relaxed())));
|
|
</PRE>
|
|
|
|
<P>
|
|
We can output the Bacon number for each actor simply by looping
|
|
through all the vertices in the graph and use them to index into the
|
|
<TT>bacon_number</TT> vector.
|
|
|
|
<P>
|
|
<PRE>
|
|
graph_traits<Graph>::vertex_iterator i, end;
|
|
for (boost::tie(i, end) = vertices(g); i != end; ++i)
|
|
std::cout << actor_name[*i] << "'s bacon number is "
|
|
<< bacon_number[*i] << std::endl;
|
|
</PRE>
|
|
|
|
<P>
|
|
Note that vertex descriptor objects can not always be used as indices
|
|
into vectors or arrays such as <TT>bacon_number</TT>. This is valid
|
|
with the <TT>adjacency_list</TT> class with <TT>VertexList=vecS</TT>,
|
|
but not with other variations of <TT>adjacency_list</TT>. A more
|
|
generic way to index based on vertices is to use the ID property
|
|
map (<TT>vertex_index_t</TT>) in coordination with the <A
|
|
HREF="../../property_map/iterator_property_map.html"><TT>random_access_iterator_property_map</TT></a>.
|
|
|
|
<P>
|
|
Here are some excepts from the output of the program.
|
|
<PRE>
|
|
William Shatner was in Loaded Weapon 1 (1993) with Denise Richards
|
|
Denise Richards was in Wild Things (1998) with Kevin Bacon
|
|
Patrick Stewart was in Prince of Egypt, The (1998) with Steve Martin
|
|
...
|
|
William Shatner's bacon number is 2
|
|
Denise Richards's bacon number is 1
|
|
Kevin Bacon's bacon number is 0
|
|
Patrick Stewart's bacon number is 2
|
|
Steve Martin's bacon number is 1
|
|
...
|
|
</PRE>
|
|
|
|
|
|
|
|
|
|
<br>
|
|
<HR>
|
|
<TABLE>
|
|
<TR valign=top>
|
|
<TD nowrap>Copyright © 2000</TD><TD>
|
|
<A HREF=http://www.boost.org/people/jeremy_siek.htm>Jeremy Siek</A>, Univ.of Notre Dame (<A HREF="mailto:jsiek@lsc.nd.edu">jsiek@lsc.nd.edu</A>)
|
|
</TD></TR></TABLE>
|
|
|
|
</BODY>
|
|
</HTML>
|