SQL 101: Introducing the UMADB Database, Part 2 | SQL

Last time around, I started introducing the UMADB database. Let’s proceed with that discussion now.

I discussed the Students table in the previous article, so now let’s go over the rest of the tables of our fictitious database, starting with the Courses table.

The Courses Table

Let me take a moment to explain the structure of the information in this database. The students take classes, which are taught by teachers and are part of courses. At the end of each semester, the teachers grade the students on each of the classes they attended. This may sound obvious and redundant, but it’s important to keep the structure in mind from this point on. In a way, the Students and Courses tables are the center of the database, because all the other tables are somehow linked to one (or both) of these tables.

Now let’s take a look at the Courses table structure, shown in Table 1.

Table Name	Column Name	Data Type	Length	Dec. Pos.	Description
PFCOM	CONM	Char	60		Course name
PFCOM	CODS	Char	100		Course description
PFCOM	CODN	Char	50		Department name
PFCOM	CODE	Char	60		Course director name
PFCOM	COTA	Char	60		Course teaching assistant name
PFCOM	COSC	Char	1		Status

Table 1: The Courses table structure

Again, the table is pretty typical: the same cryptic names and the lack of a unique record identifier that characterized the Students table. By the way, I imagine that you’re curious about the hidden flaw in the Students table, mentioned earlier. Don’t worry, it’s going to become obvious in a moment, when we look at the Teachers table.

Meanwhile, there’s something common to all the tables in this database: status. As the name implies, it indicates the status of the record. The convention used here is the following:

0: Created but not active record
1: Active record
9: Inactive (deleted) record

This is something that has to be taken into account when querying the database, and it has been the source of many misunderstandings. Sometimes the users forget to include a condition in their queries and end up mixing active and inactive records, which leads to inconsistent or just plain wrong information.

The Teachers Table

The teachers are a very important part of any teaching system. They’re also a very important part of the application our database supports, even though the table that keeps their records is not very “polished.” You’ll see what I mean when we analyze the Teachers table, shown in Table 2.

Table Name	Column Name	Data Type	Length	Dec. Pos.	Description
PFTEM	TENM	Char	60		Teacher name
PFTEM	TETR	Char	20		Teacher rank
PFTEM	TEDB	Decimal	8	0	Date of birth
PFTEM	TEAD	Char	60		Home address
PFTEM	TEPN	Char	15		Home phone number
PFTEM	TEMN	Char	15		Mobile number
PFTEM	TEEM	Char	60		Email address
PFTEM	TEDL	Char	20		Driver’s license
PFTEM	TESN	Char	11		Social security number
PFTEM	TEST	Char	200		Subjects taught
PFTEM	TESA	Decimal	11	2	Salary
PFTEM	TESC	Char	1		Status

Table 2: The Teachers table structure

This table follows the same line as the previous ones, but it introduces the first sensitive piece of information of the database: the teacher’s salary. As things stand, anyone with access to the table can see how much each teacher earns, which might not be a very good idea. I’ll get back to this later, when I discuss how to hide a column’s data from prying eyes.

Notice the similarities between this and the Students table: the personal information (addresses and IDs) is the same. Even though this makes sense—both teachers and students are people and share the same type of information—it brings up a question: what if a student becomes a teacher or vice versa? There will be duplicate and possibly inconsistent information in the database. I’ll address this issue in a later article, in the discussion about database normalization and how that translates to SQL.

Having said that, let me take a moment to explain the other columns in the table. Besides the obvious teacher name and the aforementioned personal information, this table also includes “teacher rank” (which can be something like Assistant Professor, Professor, and so on) and “subjects taught.” The latter is supposed to link to the Classes table, presented in the next section, but the connection is kept by humans, not the database. Because the same person can teach multiple classes in the same school year, the application’s manager thought it would be simpler to manually track the link between teachers and classes—yet another shortcoming we’ll need to solve later.

It’s now time to move on to the next section and review the Classes table.

The Classes Table

Here’s where things start to get interesting: finally, a table with links to other tables. The Classes table contains information about the students that form a class of a given subject during a given year and the course to which the class belongs. As I said before, the teacher is not part of the setup, at least not at database level. Table 3 shows the complete Classes table structure.

Table Name	Column Name	Data Type	Length	Dec. Pos.	Description
PFCLM	CLNM	Char	60	0	Class name
PFCLM	CLCY	Decimal	4	0	Class year
PFCLM	CLCN	Char	60	0	Course name
PFCLM	CLSN	Char	60		Student name
PFCLM	CLSA	Char	60		Student home address
PFCLM	CLSE	Char	60		Student email address
PFCLM	CLSC	Char	1		Status

Table 3: The Classes table structure

As you can see from this table, the links I mentioned before are based on the names of the student and the course, which might cause some problems. The ideal situation would be having record identifiers in each of the tables and keep those IDs, instead of the respective names, on the Classes table records. The next issue is the duplicate student information. The application manager thinks this duplication makes sense, because the student information might change from school year to school year, and keeping the information here allows the teacher to contact the student using the most current address. We’ll also have to deal with this situation later.

Finally, the last table of the downsized version of the UMADB database is the Grades table. Let’s analyze it in the next section.

The Grades Table

After the end of the semester, the students are graded on their performance in each of the classes they attended. The results are stored in the Grades table, shown in detail in Table 4.

Table Name	Column Name	Data Type	Length	Dec. Pos.	Description
PFGRM	GRSN	Char	60		Student name
PFGRM	GRCN	Char	60	0	Class name
PFGRM	GRCY	Decimal	4	0	Class year
PFGRM	GRGR	Char	2		Grade

Table 4: The Grades table structure

Just like the Classes table, this one also depends on another table’s information to form its unique key. In this case, that key is formed by the student name, class name, and class year. Of these three, two are names stored in character strings. This makes them prone to error (character fields usually make awful keys because of the possible mismatches caused by different character cases—“John” is not the same as “john,” for example) and slower to work with (because it takes longer to process a string of characters than a numeric value). The other problem with this table is the Grade column: there’s no validation on the database to prevent inconsistent values, such as invalid grades. It’s assumed that the letters A, B, C, D, and F will be used, optionally followed by a plus or minus sign, but there’s no actual check for a valid grade at the database level. Just like the student’s date of birth validation, this one also exists at the application level, buried in some RPG program.

Just a Few Tables, and So Many Problems

From what you’ve read so far, you probably concluded this (exaggerated) scenario has some similarities with real-life issues on IBM i databases you’ve seen. Probably not all at the same time, but you know what I mean. It’s true that some of the issues are very basic and easy to solve, while others require some database redesign and ingenuity. I’ll address all these issues and a few more, which are related to the non-implemented functionalities that are currently handled outside the application’s scope, over the next subseries of the SQL 101 TechTip series.

You can skip a few articles, but keep in mind that the database will evolve and each subseries will build upon the foundations laid by its predecessor. If you’re comfortable with the topics discussed in a subseries, you can simply have a quick look at the SQL code samples to keep track of the changes to the database.

And that’s all for now. In the next few articles, I’ll start by reviewing some SQL data manipulation language (DML) statements and sharing a few tricks I’ve learned over the years that can, hopefully, help you get more productive when it comes to manipulating data using SQL. Later, we’ll start to fix this broken database!

Analytics & Cognitive Categories

Latest Analytics & Cognitive News

Career Catgories

Latest Career News

Cloud Categories

Latest Cloud News

IT Infrastructure Categories

Latest IT Infrastructure News

News Categories

Latest News

Programming Categories

Latest Programming News

Security Categories

Latest Security News

Tools

Typography

Share This

LATEST COMMENTS

MC Press Online

Support MC Press Online

MC Press Users

Explore

Advertise

Help

Articles

Resources

Social