Oracle Tip: Solving directed graph problems with SQL, part 1
Takeaway: There are many algorithms and papers about how to solve problems of traversing along each possible route and finding the route that is the shortest or costs the least. Most of these algorithms are procedural or resort to recursion to solve. However, SQL's declarative language makes solving complex directed graph problems easier.
A common advanced computer science problem can be described under the heading of "directed graphs." A directed graph is a finite set of nodes connected by a set of vectors or edges. For example, a node can be imagined as a "city" and each vector as a "flight" between two cities.There are many algorithms and papers about how to solve problems of traversing along each possible route and finding the route that is the shortest or costs the least. Most of these algorithms are procedural or resort to recursion to solve. However, SQL's declarative language makes solving complex directed graph problems easier and doesn't require much coding.
Let's take the example of flights between cities and create a table to hold some imaginary data:
create table airports
(
code char(3)
constraint airports_pk primary key,
description varchar2(200)
);
insert into airports values ('LHR','London Heathrow, UK');
insert into airports values ('JFK','New York-Kennedy, USA');
insert into airports values ('GRU','Sao Paulo, Brazil');
create table fares
(
depart char(3),
arrive char(3),
price number,
constraint fares_pk primary key
(depart,arrive),
constraint fares_depart_fk foreign key
(depart) references airports,
constraint fares_arrive_fk foreign key
(arrive) references airports
);
insert into fares values('LHR','JFK',700);
insert into fares values('JFK','GRU',600);
insert into fares values('LHR','GRU',1500);
insert into fares values('GRU','LHR',1600);
You cannot use CONNECT BY syntax to resolve how to get from London to Sao Paulo because there is data that creates a loop in the graph (flying back to Sao Paulo):
select * from fares connect by prior arrive =
depart start with depart = 'LHR';
ERROR:
ORA-01436: CONNECT BY loop in user data
To solve directed graph problems, we need to create a temporary table to hold all the possible paths between two nodes. We have to be careful not to duplicate paths we already processed and, in this case, we don't want paths that lead to the same place we started. I'd also like to trace the number of hops it took to get to the destination, as well as a description of the route taken.
The temporary table is created using the following:
create global temporary table faretemp
(
depart char(3),
arrive char(3),
hops integer,
route varchar2(30),
price number,
constraint faretemp_pk primary key
(depart,arrive)
);
A simple view can somewhat simplify the code used in this example. The view can calculate the data for the next hop from a path in faretemp based on a single hop in fares:
create or replace view nexthop
as
select src.depart,
dst.arrive,
src.hops+1
hops,
src.route||','||dst.arrive
route,
src.price
+ dst.price price
from faretemp src,fares dst
where src.arrive = dst.depart
and dst.arrive !=
src.depart;
/
show errors;
The algorithm is fairly straightforward. First, populate the faretemp table with the data from the fare table as the initial hop. Then, take all the values we just inserted and use them to build all the possible two-hop paths. Keep repeating as long as a new path can be created between two nodes. The loop will exit when all possible paths between nodes are described. We could also restrict the first insert to reduce the amount of data being loaded if we were only interested in a certain starting condition. Here is the code for just discovering paths:
truncate table faretemp;
begin
-- initial connections
insert into faretemp
select
depart,arrive,1,depart||','||arrive,price from fares;
while sql%rowcount > 0 loop
insert into
faretemp
select
depart,arrive,hops,route,price from nexthop
where
(depart,arrive)
not
in (select depart,arrive from faretemp);
end loop;
end;
/
show errors;
select * from faretemp order by depart,arrive;
You can view the output in Table A.
There's one small problem with the above data. The data is the set of shortest routes between points (least number of hops). However, the hop from London to Sao Paulo isn't the cheapest possibility.
To solve for the least expensive fare, we need to add an update into our loop that replaces a route with a cheaper route when it's found in a hop. The code for that looks like this:
truncate table faretemp;
declare
l_count integer;
begin
-- initial connections
insert into faretemp
select
depart,arrive,1,depart||','||arrive,price from fares;
l_count := sql%rowcount;
while l_count > 0 loop
update faretemp
set
(hops,route,price) =
(select
hops,route,price from nexthop
where
depart = faretemp.depart
and
arrive = faretemp.arrive)
where
(depart,arrive) in
(select
depart,arrive from nexthop
where
price < faretemp.price);
l_count :=
sql%rowcount;
insert into
faretemp
select
depart,arrive,hops,route,price from nexthop
where
(depart,arrive)
not
in (select depart,arrive from faretemp);
l_count := l_count
+ sql%rowcount;
end loop;
end;
/
show errors;
select * from faretemp order by depart,arrive;
You can view the output in Table B.
The algorithm noticed that the LHR,JFK,GRU route was cheaper than the LHR,GRU route and replaced it. The loop will exit once there are no cheaper fares and no other possible routes.
TechRepublic's Oracle newsletter covers automating Oracle utilities, generating database alerts, solving directed graph problems, and more. Automatically subscribe today!
SponsoredWhite Papers, Webcasts, and Downloads
- Seeing the Big Picture: A Corporate Guide to Better Decisions through IT SAP
- Watts and Volt-Amps: Powerful Confusion American Power Conversion (APC)
- The Economist: A new mandate for IT SAP
Article Categories
- Security
- Security Solutions, IT Locksmith
- Networking and Communications
- E-mail Administration NetNote, Cisco Routers and Switches
- CIO and IT Management
- Project Management, CIO Issues, Strategies that Scale
- Desktops, Laptops & OS
- Windows 2000 Professional, Microsoft Word, Microsoft Excel, Microsoft Access, Windows XP,
- Data Management
- Oracle, SQL Server
- Servers
- Windows NT, Linux NetNote, Windows Server 2003
- Career Development
- Geek Trivia
- Software/Web Development
- Web Development Zone, Visual Basic, .NET


