Project #1: Due Feb. 23 (Before Class Begins)
Last modified: Mon Jan 26 17:37:29 EST 2004
PDA (Personal Database Application), Courtesy of J. Widom
Your IST210 team project will be to build a substantial database
application for a real-world domain of your choosing. You will
design a schema on paper, then you will create an actual database
using DB2 provided by PSU. You will populate the database on your
own (either via an automated program that you write or manual
data hunting), write interactive queries and modifications on the
database, explore other aspects such as constraints and triggers
(if possible), create programs that manipulate the database, and
finally develop a simple Web front-end to your database
application.
|
Your first step is to identify the domain you would like to
manage with your database, and to construct an
entity-relationship (E/R) diagram for the database. We suggest
that you pick an application that you will enjoy working with,
since you'll be stuck with it for the whole semester!
It is especially nice if you pick an application where you can
populate your database using real (as opposed to fabricated)
data. Possible examples are:
- Online Bookstore (e.g., Amazon)
- E-Auction (e.g., eBay)
- Online Shopping Mall (e.g., Buy.com)
- Membership Management System
- Library Information System
- Online Technical Support system (e.g., Dell)
- ...
Try to pick an application that is relatively substantial, but
not too enormous. For example, your E/R design should have in the
range of five or so entity sets, and a similar number of
relationship sets. Note that this is a ballpark figure only! You
should certainly include different kinds of relationships (e.g.,
many-one, many-many) and different kinds of data (strings,
integers, or even picture, etc.), but your application need not
necessarily require advanced features such as weak entity sets,
"is-a" relationships, or roles (but "strongly encouraged").
To get started on your Personal Database Application (PDA for
short), your project #1 assignment consists of four parts:
Part 1: E/R Design
- (Most of you already did this bullet, so skip it) Create an initial
(possibly blank) webpage for your team under PSU (must be somewhere in
the PSU web space to use DB2 later) and email me: (1) team code, (2)
URL. In future, your team needs to add stuff there.
- Write a short (maximum 1 page) description of the database
application you propose to work with throughout the course. Your
description should be brief and relatively informal. If there are any
unique or particularly difficult aspects of your proposed application,
please point them out. Your description will be graded on suitability
and conciseness.
- Draw an E/R diagram for your proposed database. Don't forget
to underline key attributes for entity sets and include arrowheads
indicating the proper cardinalities of relationship sets. If there are
weak entity sets or "is-a" relationships, make sure to notate them
appropriately.
- Using the method for translating an E/R diagram to relations,
produce a set of relations for your database design. As usual,
please be sure to underline key attributes in your relations.
For each attribute, think of proper data types.
Part 2: Database Creation
In this part of the project, you will create a relational schema
for your PDA in the DB2 database system, and you will populate
the tables in your database with initial data sets. Consult URLs
in the Resources section of the class web page for help.
NOTE: since later you are likely to refine your E/R design and relational schema, make sure you write the data population steps in a script so that you can run it over and over
-
Familiarize yourself with the DB2 relational DBMS by reading DB2
Manual, logging into Unix and DB2
resource, changing your password, trying some of the
examples in the book, and experimenting with the help
command. (All basic DB2 and SQL commands were already covered in the Lab on Friday.)
You don't need to turn anything in for this part.
- Create relations for your PDA based on your final
relational schema from PDA Part 1. In addition to creating the
appropriate attributes and types, please declare keys for your
relations. You may have to change the types of your attributes
depending on the specific data types that DB2 supports (e.g.,
some multimedia types or date/time). But don't worry; DB2
supports a really a lot of types. Once you create tables,
to see all, you can use:
prompt> LIST TABLES
- Turn in a script showing an DB2 session in which your
relations are created successfully. Also show, for each relation,
the result of the DESCRIBE command once the relation has
been created: for a relation Foo, type
prompt> DESCRIBE TABLE Foo
- Data population time !! Now, you need to somehow populate
data and load them into your tables. You can take any of the
following ways:
- Manually create fabricated meaningless gibberish data
- From real web sites similar to your PDA, copy data manually
- Write a program that generate meaningless gibberish data automatically
- Write a program that access real web sites similar to your
PDA, and download/clean data automatically
It doesn't matter which option you use -- the same point will be given.
Whichever way you choose, the two conditions that you should satisfy are:
- All your tables must have at least 10 tuples each, and
- TWO of those tables must have at least 50 tuples
If your application naturally includes relations that are
expected to be relatively small (e.g., schools within a
university), then put that fact in your writing clearly.
When you generate or gather data, there
are two important points to keep in mind:
- You will need to make sure not to generate duplicate values for key attributes.
- Your PDA almost certainly includes relations that are
expected to join with each other. For example, you may have a
Student relation with attribute courseNum that's expected to join
with attribute number in relation Course. When generating data,
be sure to generate values that actually do join - otherwise all
of your interesting queries will have empty results! There are a
couple of ways to properly generate joining values. One way is to
generate records for multiple relations (e.g., Course and
Student) at the same time. Another way is to generate the records
for one relation first, and then use the joining values for the
other relation. For example, you could generate records for
relation Course first, then use the Course.number values when
creating values for Student.courseNum.
Part 3: Turn-In
Turn in a project report with the followings:
- Team information on the Cover (eg, nick name, member names)
- Work distribution information (e.g., if work was done
unequally, say clearly how many points a student "Foo" should
get, etc. Similarly, if contributed equally, say "equal
contribution") -- As I said, I do hope to see "Equal
Contribution" at all submitted reports, but...
- E/R diagram and database application description that you are
trying to model. Explain the reasons of your decision (e.g.,
cardinality, entity set vs. attribute, 3-way vs. 2-way, etc)
- The data generation program code (if you wrote one) or a short
description how you gathered the data,
- A small sample of the tuples from each relation (two or three
per relation),
- DB2 script showing the successful loading of your data into DB2,
- DB2 session showing the number of tuples of each table (e.g.,
using SQL command "SELECT count(*) FROM foo;") proving your
tables satisfied the constraint of the minimal number of tuples.
Part 4: Web Posting
Finally, convert your project report into HTML format, store it in
somewhere in the web space, and add a link to it in your project team
webpage.