IST 210: Organization of Data

   		             Spring 2009

                    Proj #1 (DUE: Feb. 26, 1PM)

              Last Updated: Mon Feb  3 15:35:02 EDT 2009


NOTE: Team Assignment (also available in ANGEL)

Your IST 210 team project will be to build a substantial database
application for a real-world domain. You will design a schema on
paper, and then you will create an actual database using Microsoft SQL
Server provided by IST Infrastructure. You will populate the database
on your own, write interactive queries and modifications on the
database, create programs that manipulate the database.

In a series of 3 projects, you are asked to design a database-backed
application for the on-line retailer, called "X", that hopes to
outweigh the someday. Your first task is to:

- Come up with a name for "X"
- Decide the products to sell (e.g., Books, Electronics, Clothes, etc).

Note a few assumptions as follows:

- X is a purely on-line based business so that all transactions (e.g.,
  purchase, review, return) are designed for on-line activities.

- Nation wide, X has three warehouses (W1 to W3) and three shipping
  sites (S1 three S3). Products are shipped from a supplier to a
  warehouse to a shipping center, and then to a customer.

- The database for X should handle two types of queries -- one for
  regular customers and the other for employees.

In particular, your E/R design should be able to handle the following
scenarios. Consider two users -- a regular customer "Kim" and an
employee "Joe" for each scenario.

1. After Kim registers her information, she can log in back using her
chosen ID and PWD later. X keeps track of the log-in information of
each account so that if one does not log in for more than a year, the
account is suspended and the ID/PWD is overwritten.

2. 10 copies of a product "Foo" arrive from the supplier "IGotAll" to
the warehouse "W2" located at 220 S. Atherton on 1/1/2009. In
addition, 20 copies of a product "Foo2" arrive from another supplier
"IGotSome" to "W2" on 1/2/2009.

3. A product "Foo" can have a new or old edition (e.g., 2nd edition of
a book or successor model of a particular HDTV TV). If "Foo" has a new
edition "NewFoo" or an old edition "OldFoo", Kim should be able to
follow the link to see the detailed information of "NewFoo" or

4. Kim purchases 2 copies of a product "Foo" for $54.50 each (with 20%
discount each) and a product "Bar" for $23.00 (with no discount).  In
her account, Kim can see both purchases in a single transaction that
is uniquely identified by 4 digit number (e.g., 1110). Later, Kim can
make a comment (i.e., 100 words) and assign a rating (i.e., 1 to 5
with 5 being the best) to each product that she purchased.

5. Effective from 1/1/2009 for a week, all products in some category
(e.g., all TVs by Samsung or all books from Oprah's book club) go on
sale with 50% discount.

6. Kim purchases a product "Foo" for $100 on 1/1/2009. Once she
receives the product, she changes her mind and returns it for
credit. If the return is within 30 days of the purchase, she can get
her money back (e.g., credited to her credit card account). Otherwise,
she needs to pick another product, say "Foo2", of the same (or lesser)
value than Foo is. The fact that Kim returns Foo and gets Foo2 instead
must be recorded somehow. Whether the return is treated as a new
transaction or not is upto your design. In theory, Kim can repeat this
purchase and return cycle for arbitrary many times.

7. For any sold products, Joe can retrieve the flow as to how products
were shipped from the beginning to the end (i.e., supplier --
warehouse -- shipping site -- customer).

8. Joe likes to know the number of product items (not the number of
transactions) and the total amount of those product items that "Kim"
has purchased so far for each year. If product items were purchased
and returned, then those items must NOT be included in the number of
items and total amount. 

9. Suppose a product has a category assigned (e.g., SciFi genre for a
book or a Game category for Wii). Now, Joe wants to generate a
quarterly report showing the total sales of each category per year for
2007 and 2008. Again, total sales should NOT include those products
returned. Furthermore, total sales should use only the actual sale
price, not the original price of products. For instance, if a product
"Foo" originally costed $100 but was sold for $50 with a 50% discount,
then the TotalSales should include $50 (actual sale price), not $100
(original price). The final answers would look something like
(assuming Electronics to sell):

Year   Quarter   Category  TotalSales
2007   1         TV-Video  $1.5M
2007   1         Audio     $1.2M
2007   1         ...
2007   2         TV-Video  $2.5M
2007   2         Audio     $0.2M
2007   2         ...
2008   4         TV-Video  $1.4M
2008   4         Audio     $1.12M
2008   4         Music     $3.2M
2008   4         Game      $2.1M

10. One scenario of your own.

Note that in the above descriptions, any quoted names such as "Kim" or
"Foo" can be replaced by any other arbitrary names.


Part 1: E/R Design

With the above scenarios and reasonable assumptions, in proj #1, you
need to design an E/R diagram.

Draw an E/R diagram using MS Word or Visio or any drawling tool (do
NOT hand-write). List all your assumptions for things that are NOT
explicitly specified or ambiguous (e.g., all items can be associated
with some discounts). Otherwise, we will grade your design based on
the common sense, and you may lose points if ours and yours are
different. Don't forget to specify key attributes by * symbol for each
entities. Also, specify cardinalities and modalities for all

Consult the PBL 3.1 AFV in class, where various queries that the given
design is trying to to answer. In your scenario, you should go through
the similar thinking process, and make sure that your design is
*expressive* enough to handle all of the above 10 scenarios.

You goal is to design a good E/R that supports all 10 scenarios.  If
your design cannot support some scenarios since you don't have proper
entities/relationships or your cardinality/modality is not sufficient
enough, then you will lose points. Some E/R design, even though all 10
scenarios can be supported, may have a lot of unnecessary or redundant
components in it or the support of a scenario is not very efficient
(e.g., if one has to use 7-8 entities to support a single scenario,
then in practice, this will result in a lot of JOIN operations and may
take a while to process it). Therefore, I'll also consider how "good"
your E/R design is as well. In short, the main grading rubrics are:

Max points: 6

- Correct support of 10 scenarios: 5 (0.5 point each)
- Goodness of the overall E/R design : 0.5
- Misc (notation errors, report, assumption) : 0.5


Part 2: Turn-In of Proj #1 Report

Your report should consist of at least the following:

- On the cover page, list project team name (e.g., Team-01), and all
  member names

- List 10 major assumptions that you made

- E/R diagram

- For each of 10 scenario, briefly explain why the scenario can be
supported by your E/R design (e.g., "Scenario 3 can be supported by
the binary relationship between the entities X and Y and their
attributes Z...")

- Meeting log: a detailed log showing each meeting date and activity:

Date   Absent Members       Activity
2/3    John Doe             brainstorming
2/4    None                 initial ER sketch

Do NOT make your report lengthy unnecessarily -- be succinct and get
to the point. The length of your report has nothing to do with the
score that you are getting.

If your project team has some problems (eg, some members don't come to
project meetings), you need to inform me early so that I can
intervene. Otherwise, all team members share the * IDENTICAL * scores
for the projects.

TURN-IN: Only *ONE* person from each team drops the report to
ANGEL once AND give me a hard-copy in class.