Some issues are very important in all 134A projects: good user interface design, good software design, teamwork, and a well-written report. I'll be talking more later about the others. About the report:
You should start early, and you are welcome to bring your outline or
your draft to a TA office hour to discuss it and get ideas for improving
it. The report should be detailed enough to be interesting, but it should
also be six pages single-spaced at most. You should use the report-writing
skills that you learned in your general education and lab science classes.
These skills are important for graduate school, for software jobs with
top companies, and for project leader and management jobs in all organizations.
select dayofweek(date) as d, count(*) from messages group by d;Note that dayofweek() is a builtin function. Other builtin group-based functions include min, max, sum, and avg.
select messages.*, address.zip from messages, addressThis type of query is called a "join" query.
where messages.author = address.author
Notice that the join is specified in the where clause, just
like a logical condition on values in a single record.
A simple index is on exactly one column of a table. Fundamentally, it allows the table to be scanned efficiently with records sorted according to the value in that column.
To see why indexes are necessary, suppose that you want to scan messages sometimes by author in alphabetical order, and sometimes by date in time order. You could do either of these if the table was stored in the right sequence, but not both.
The general way to create an index is with the alter table command, whose syntax is rather verbose. The column you want the index on is specified inside parentheses. For example:
alter table messages add index dx (date)Now queries with order by date or group by date can be efficient.
select c1, c2 from t1, t2 where c1=c2With an index on tablet2 this query can be executed by one scan of table t1.
Doing direct accesses to each row of a table is still much slower than
scanning the table sequentially. Reason: for each row, one block
must be read from disk. When scanning, many rows are processed per
read of a block.
For most queries, the index is used to guide a read from the table. For some queries, it is possible to use data from the index directly, so accessing the table is not necessary at all.
We also discussed the definition of scalability from the point of view of a user: constant or near-constant response time despite increases in amount of usage, e.g. database size, number of users, number of simultaneous users.
Every index causes additional slowdown for inserts and updates, and
uses space proportional to the number of rows in the table. So, only
create an index if it is necessary.