mirror of
https://github.com/boostorg/graph.git
synced 2026-01-30 20:02:12 +00:00
311 lines
11 KiB
HTML
311 lines
11 KiB
HTML
<HTML>
|
|
<!--
|
|
-- Copyright (c) Jeremy Siek 2000
|
|
--
|
|
-- Permission to use, copy, modify, distribute and sell this software
|
|
-- and its documentation for any purpose is hereby granted without fee,
|
|
-- provided that the above copyright notice appears in all copies and
|
|
-- that both that copyright notice and this permission notice appear
|
|
-- in supporting documentation. Silicon Graphics makes no
|
|
-- representations about the suitability of this software for any
|
|
-- purpose. It is provided "as is" without express or implied warranty.
|
|
-->
|
|
<Head>
|
|
<Title>Kevin Bacon Example</Title>
|
|
<BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
|
|
ALINK="#ff0000">
|
|
<IMG SRC="../../../c++boost.gif"
|
|
ALT="C++ Boost" width="277" height="86">
|
|
|
|
<BR Clear>
|
|
|
|
<H1>Six Degrees of Kevin Bacon</H1>
|
|
|
|
<P>
|
|
A fun application of graph theory comes up in the popular game ``Six
|
|
Degrees of Kevin Bacon''. The idea of the game is to connect an actor
|
|
to Kevin Bacon through a trail of actors who appeared together in
|
|
movies, and do so in less than six steps. For example, Theodore
|
|
Hesburgh (President Emeritus of the University of Notre Dame) was in
|
|
the movie ``Rudy'' with the actor Gerry Becker, who was in the movie
|
|
``Sleepers'' with Kevin Bacon. Why Kevin Bacon? For some reason the
|
|
three students who invented the game, Mike Ginelli, Craig Fass, and
|
|
Brian Turtle decided that Kevin Bacon is the center of the
|
|
entertainment world. The Kevin Bacon game is quite similar to the <a
|
|
href="http://www.oakland.edu/~grossman/erdoshp.html">Erdös
|
|
Number</a> which has ``been a part of the folklore of
|
|
mathematicians throughout the world for many years''.
|
|
|
|
<P>
|
|
The ``Six Degrees of Kevin Bacon'' game is really a graph problem. If
|
|
you assign each actor to a vertex, and add an edge between two actors
|
|
if they have appeared together in a movie, then you have a graph that
|
|
represents the data for this game. Then the problem of finding a trail
|
|
of actors to Kevin Bacon becomes a traditional graph problem: that of
|
|
finding a <I>path</I> between two vertices. Since we wish to find a
|
|
path that is shorter than six steps, ideally we would find the
|
|
<I>shortest path</I> between the vertices. One might think to apply
|
|
Dijkstra's shortest path algorithm to this problem, but that would be
|
|
overkill since Dijkstra's algorithm is meant for situations when each
|
|
edge has an associated length (or weight) and the goal is to find the
|
|
path with the shortest cumulative length. Since we are only concerned
|
|
with finding the shortest paths in terms of the number of edges the
|
|
breadth-first search algorithm will solve the problem (and use less
|
|
time than Dijkstra's).
|
|
|
|
<P>
|
|
|
|
<H2>Input File and Graph Setup</H2>
|
|
|
|
<P>
|
|
For this example we will use a much smaller graph than all movies and
|
|
all actors. The full source code for this example is in <a
|
|
href="../example/kevin-bacon.cpp"><TT>examples/kevin-bacon.cpp</TT></a>.
|
|
We have supplied a file <TT>kevin_bacon.dat</TT> which contains a list
|
|
of actor pairs who appeared in the same movie. Each line of the file
|
|
contains an actor's name, a movie, and another actor that appeared in
|
|
the movie. The ``|'' character is used as a separator. Here is an
|
|
excerpt.
|
|
|
|
<P>
|
|
<PRE>
|
|
Patrick Stewart|Prince of Egypt, The (1998)|Steve Martin
|
|
</PRE>
|
|
|
|
<P>
|
|
Our first task will be to read in the file and create a graph from
|
|
it. We use a <TT>std::ifstream</TT> to input the file.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::ifstream datafile("./kevin_bacon.dat");
|
|
if (!datafile) {
|
|
cerr << "No ./kevin_bacon.dat file" << endl;
|
|
return -1;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
We will want to attach the actor's names to the vertices in the graph,
|
|
and the movie names to the edgesWe use the <TT>property</TT> class to
|
|
specify the addition of these vertex and edge properties to the graph.
|
|
We choose to use an undirected graph, because the relationship of
|
|
actors appearing together in a movie is symmetric.
|
|
|
|
<P>
|
|
<PRE>
|
|
using namespace boost;
|
|
typedef adjacency_list<vecS, vecS, undirectedS,
|
|
property<vertex_name_t, string>,
|
|
property<edge_name_t, string, property<edge_weight_t, int> >
|
|
> Graph;
|
|
</PRE>
|
|
|
|
<P>
|
|
To access the properties, we will need to obtain property map
|
|
objects from the graph. The following code shows how this is done.
|
|
|
|
<P>
|
|
<PRE>
|
|
boost::property_map<Graph, vertex_name_t>::type actor_name = get(vertex_name, g);
|
|
boost::property_map<Graph, edge_name_t>::type connecting_movie = get(edge_name, g);
|
|
boost::property_map<Graph, edge_weight_t>::type weight = get(edge_weight, g);
|
|
</PRE>
|
|
|
|
<P>
|
|
Next we can start to loop through the file, parsing each line into a
|
|
list of tokens. <TT>stringtok</TT> is a C++ version of the old
|
|
<TT>strtok</TT> standard C function.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::string line;
|
|
while (std::getline(datafile,line)) {
|
|
std::list<std::string> line_toks;
|
|
boost::stringtok(line_toks, line, "|");
|
|
...
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
Each line of the input corresponds to an edge in the graph, with the
|
|
two actors as the vertices incident to the edge, and the movie name
|
|
will be a property attached to the edge. One challenge in creating the
|
|
graph from this format of input is that the edges are specified by a
|
|
pair of actor names. As we add vertices to the graph, we'll need to
|
|
create a map from actor names to their vertices so that later
|
|
appearances of the same actor (on a different edge) can be linked with
|
|
the correct vertex in the graph. The <a
|
|
href="http://www.sgi.com/tech/stl/Map.html"><TT>std::map</TT></a>
|
|
can be used to implement this mapping.
|
|
|
|
<P>
|
|
<PRE>
|
|
typedef graph_traits<Graph>::vertex_descriptor Vertex;
|
|
typedef std::map<string, Vertex> NameVertexMap;
|
|
NameVertexMap actors;
|
|
</PRE>
|
|
|
|
<P>
|
|
The first token will be an actor's name. If the actor is not already
|
|
in our actor's map we will add a vertex to the graph, set the name
|
|
property of the vertex, and record the vertex descriptor in the map.
|
|
If the actor is already in the graph, we will retreive the vertex
|
|
descriptor from the map's <TT>pos</TT> iterator.
|
|
|
|
<P>
|
|
<PRE>
|
|
NameVertexMap::iterator pos;
|
|
bool inserted;
|
|
Vertex u, v;
|
|
|
|
std::list<std::string>::iterator i = line_toks.begin();
|
|
|
|
boost::tie(pos, inserted) = actors.insert(make_pair(*i, Vertex()));
|
|
if (inserted) {
|
|
u = add_vertex(g);
|
|
actor_name[u] = *i;
|
|
pos->second = u;
|
|
} else
|
|
u = pos->second;
|
|
++i;
|
|
</PRE>
|
|
|
|
<P>
|
|
The second token is the name of the movie, and the third token is the
|
|
other actor. We use the same technique as above to insert the second
|
|
actor into the graph.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::string movie_name = *i++;
|
|
|
|
boost::tie(pos, inserted) = actors.insert(make_pair(*i, Vertex()));
|
|
if (inserted) {
|
|
v = add_vertex(g);
|
|
actor_name[v] = *i;
|
|
pos->second = v;
|
|
} else
|
|
v = pos->second;
|
|
</PRE>
|
|
|
|
<P>
|
|
The final step is to add an edge connecting the two actors, and
|
|
record the name of the connecting movie. We also set the weight
|
|
of the edge to one. Since we chose <TT>setS</TT> for the <TT>EdgeList</TT>
|
|
type of the <TT>adjacency_list</TT>, any parallel edges in the input
|
|
will not be inserted into the graph.
|
|
|
|
<P>
|
|
<PRE>
|
|
Edge e;
|
|
boost::tie(e, inserted) = add_edge(u, v, g);
|
|
if (inserted) {
|
|
connecting_movie[e] = movie_name;
|
|
weight[e] = 1;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
|
|
<H2>Applying Breadth-First Search</H2>
|
|
|
|
<P>
|
|
The documentation for <A
|
|
HREF="./breadth_first_search.html"><tt>breadth_first_search()</tt></a>
|
|
algorithm shows that we will need to provide property maps for color
|
|
and vertex ID. In addition, we will be using visitors to record
|
|
predecessors and distances so we will need property maps for these as
|
|
well. The distance will be the shortest distances from each actor to
|
|
Kevin Bacon calculated by the algorithm. This distance has also been
|
|
referred to as the ``Bacon Number'' in memory of the ``Erdos Number''
|
|
used to rank mathematicians according to how many publications
|
|
separate them from Paul Erdos. We will use a vector to store the bacon
|
|
numbers, and another vector to store the color property needed by
|
|
the BFS algorithm. The <TT>predecessor</TT> vector will be used to
|
|
record the shortest path from each actor to Kevin. For each vertex,
|
|
the <TT>predecessor</TT> vector will contain a pointer to the next
|
|
vertex on the shortest path towards Kevin Bacon.
|
|
|
|
<P>
|
|
<PRE>
|
|
std::vector<int> bacon_number(num_vertices(g));
|
|
std::vector<int> color(num_vertices(g));
|
|
std::vector<Vertex> predecessor(num_vertices(g));
|
|
</PRE>
|
|
|
|
<P>
|
|
The BFS algorithm also requires a source vertex as input; of course
|
|
this will be Kevin Bacon. We now call the BGL BFS algorithm, passing
|
|
in the graph, source vertex, the property maps, and a visitor composed
|
|
of a predecessor and distance recorder. We use the <a
|
|
href="./predecessor_recorder.html"><TT>predecessor_recorder</TT></a>
|
|
to record the predecessors and the <a
|
|
href="./distance_recorder.html"><tt>distance_recorder</tt></a> to
|
|
record distances.
|
|
|
|
<P>
|
|
<PRE>
|
|
Vertex src = actors["Kevin Bacon"];
|
|
|
|
breadth_first_search
|
|
(g, src,
|
|
visitor(make_bfs_visitor
|
|
(make_list(record_predecessors(&predecessor[0],
|
|
on_examine_edge()),
|
|
record_distances(&bacon_number[0],
|
|
on_examine_edge()))))
|
|
);
|
|
</PRE>
|
|
|
|
<P>
|
|
We can output the Bacon number for each actor simply by looping
|
|
through all the vertices in the graph and use them to index into the
|
|
<TT>bacon_number</TT> vector.
|
|
|
|
<P>
|
|
<PRE>
|
|
graph_traits<Graph>::vertex_iterator i, end;
|
|
for (boost::tie(i, end) = vertices(g); i != end; ++i)
|
|
std::cout << actor_name[*i] << "'s bacon number is "
|
|
<< bacon_number[*i] << std::endl;
|
|
</PRE>
|
|
|
|
<P>
|
|
Note that vertex descriptor objects can not always be used as indices
|
|
into vectors or arrays such as <TT>bacon_number</TT>. This is valid
|
|
with the <TT>adjacency_list</TT> class with <TT>VertexList=vecS</TT>,
|
|
but not with other variations of <TT>adjacency_list</TT>. A more
|
|
generic way to index based on vertices is to use the ID property
|
|
map (<TT>vertex_index_t</TT>) in coordination with the <A
|
|
HREF="../../property_map/iterator_property_map.html"><TT>iterator_property_map</TT></a>.
|
|
|
|
<P>
|
|
Here are some excepts from the output of the program.
|
|
<PRE>
|
|
William Shatner was in Loaded Weapon 1 (1993) with Denise Richards
|
|
Denise Richards was in Wild Things (1998) with Kevin Bacon
|
|
Patrick Stewart was in Prince of Egypt, The (1998) with Steve Martin
|
|
...
|
|
William Shatner's bacon number is 2
|
|
Denise Richards's bacon number is 1
|
|
Kevin Bacon's bacon number is 0
|
|
Patrick Stewart's bacon number is 2
|
|
Steve Martin's bacon number is 1
|
|
...
|
|
</PRE>
|
|
|
|
|
|
|
|
|
|
<br>
|
|
<HR>
|
|
<TABLE>
|
|
<TR valign=top>
|
|
<TD nowrap>Copyright © 2000-2001</TD><TD>
|
|
<A HREF="../../../people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>)
|
|
</TD></TR></TABLE>
|
|
|
|
</BODY>
|
|
</HTML>
|