Project #1: Due Feb. 23 (Before Class Begins)

Last modified: Mon Jan 26 17:37:29 EST 2004

PDA (Personal Database Application), Courtesy of J. Widom

Your IST210 team project will be to build a substantial database application for a real-world domain of your choosing. You will design a schema on paper, then you will create an actual database using DB2 provided by PSU. You will populate the database on your own (either via an automated program that you write or manual data hunting), write interactive queries and modifications on the database, explore other aspects such as constraints and triggers (if possible), create programs that manipulate the database, and finally develop a simple Web front-end to your database application.

Your first step is to identify the domain you would like to manage with your database, and to construct an entity-relationship (E/R) diagram for the database. We suggest that you pick an application that you will enjoy working with, since you'll be stuck with it for the whole semester! It is especially nice if you pick an application where you can populate your database using real (as opposed to fabricated) data. Possible examples are:

Try to pick an application that is relatively substantial, but not too enormous. For example, your E/R design should have in the range of five or so entity sets, and a similar number of relationship sets. Note that this is a ballpark figure only! You should certainly include different kinds of relationships (e.g., many-one, many-many) and different kinds of data (strings, integers, or even picture, etc.), but your application need not necessarily require advanced features such as weak entity sets, "is-a" relationships, or roles (but "strongly encouraged").

To get started on your Personal Database Application (PDA for short), your project #1 assignment consists of four parts:


Part 1: E/R Design

  1. (Most of you already did this bullet, so skip it) Create an initial (possibly blank) webpage for your team under PSU (must be somewhere in the PSU web space to use DB2 later) and email me: (1) team code, (2) URL. In future, your team needs to add stuff there.

  2. Write a short (maximum 1 page) description of the database application you propose to work with throughout the course. Your description should be brief and relatively informal. If there are any unique or particularly difficult aspects of your proposed application, please point them out. Your description will be graded on suitability and conciseness.

  3. Draw an E/R diagram for your proposed database. Don't forget to underline key attributes for entity sets and include arrowheads indicating the proper cardinalities of relationship sets. If there are weak entity sets or "is-a" relationships, make sure to notate them appropriately.

  4. Using the method for translating an E/R diagram to relations, produce a set of relations for your database design. As usual, please be sure to underline key attributes in your relations. For each attribute, think of proper data types.

Part 2: Database Creation

In this part of the project, you will create a relational schema for your PDA in the DB2 database system, and you will populate the tables in your database with initial data sets. Consult URLs in the Resources section of the class web page for help.

NOTE: since later you are likely to refine your E/R design and relational schema, make sure you write the data population steps in a script so that you can run it over and over

  1. Familiarize yourself with the DB2 relational DBMS by reading DB2 Manual, logging into Unix and DB2 resource, changing your password, trying some of the examples in the book, and experimenting with the help command. (All basic DB2 and SQL commands were already covered in the Lab on Friday.) You don't need to turn anything in for this part.

  2. Create relations for your PDA based on your final relational schema from PDA Part 1. In addition to creating the appropriate attributes and types, please declare keys for your relations. You may have to change the types of your attributes depending on the specific data types that DB2 supports (e.g., some multimedia types or date/time). But don't worry; DB2 supports a really a lot of types. Once you create tables, to see all, you can use:
      prompt> LIST TABLES
    

  3. Turn in a script showing an DB2 session in which your relations are created successfully. Also show, for each relation, the result of the DESCRIBE command once the relation has been created: for a relation Foo, type
      prompt> DESCRIBE TABLE Foo
    

  4. Data population time !! Now, you need to somehow populate data and load them into your tables. You can take any of the following ways:

    1. Manually create fabricated meaningless gibberish data
    2. From real web sites similar to your PDA, copy data manually
    3. Write a program that generate meaningless gibberish data automatically
    4. Write a program that access real web sites similar to your PDA, and download/clean data automatically

    It doesn't matter which option you use -- the same point will be given. Whichever way you choose, the two conditions that you should satisfy are:

    If your application naturally includes relations that are expected to be relatively small (e.g., schools within a university), then put that fact in your writing clearly. When you generate or gather data, there are two important points to keep in mind:


Part 3: Turn-In

Turn in a project report with the followings:

  1. Team information on the Cover (eg, nick name, member names)
  2. Work distribution information (e.g., if work was done unequally, say clearly how many points a student "Foo" should get, etc. Similarly, if contributed equally, say "equal contribution") -- As I said, I do hope to see "Equal Contribution" at all submitted reports, but...
  3. E/R diagram and database application description that you are trying to model. Explain the reasons of your decision (e.g., cardinality, entity set vs. attribute, 3-way vs. 2-way, etc)
  4. The data generation program code (if you wrote one) or a short description how you gathered the data,
  5. A small sample of the tuples from each relation (two or three per relation),
  6. DB2 script showing the successful loading of your data into DB2,
  7. DB2 session showing the number of tuples of each table (e.g., using SQL command "SELECT count(*) FROM foo;") proving your tables satisfied the constraint of the minimal number of tuples.

Part 4: Web Posting

Finally, convert your project report into HTML format, store it in somewhere in the web space, and add a link to it in your project team webpage.