Common SQL interview questions

1. What is database normalization?

Database normalization is a data design and organization process applied to data structures to minimize redundancy. Normalization usually involves dividing a large table into two or more smaller tables and defining relationships between the tables. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.

The main purpose of normalization is to minimize the redundancy and remove Insert, Update and Delete Anomaly.

Data redundancy

happens when the same piece of data is held in two separate place. Data redundancy is a problem because it consumes unnecessary space.

Insert Anomaly:

An Insert Anomaly occurs when certain attributes cannot be inserted into the database without the presence of other attributes.

S.NoCourseNum StudentName AddressCourse
19000JonesJones Address Accounts
29001Jones Jones Address Accounts
39002SmitSimth’s Address Physics
49003RichardsRichards AddressSQL
59004 Richards Richards Address SQL

Considering the above example, if you want to add a new course, you also need to provide student’s details. If the students table and course table had been different, the department can easily add a new course to the curriculum with no student currently enrolled. This is what normalization helps in.

Update Anomaly:

For example, the phone number of a student is there with both Academics Department and Hostel Department. Suppose the student wants to change his phone number then that change has to be reflected in both the places.

Delete Anomaly:

A Delete Anomaly exists when certain attributes are lost because of the deletion of other attributes. For example, consider what happens if Student S30 is the last student to leave the course – All information about the course is lost.

Advantages of normalizing database are:

⦁ No duplicate entries
⦁ Saves storage space
⦁ Boasts the query performances.

2. What Are Different Normalization Forms?

1NF: Eliminate Repeating Groups

Make a separate table for each set of related attributes, and give each table a primary key. Each field contains at most one value from its attribute domain.

2NF: Eliminate Redundant Data

If an attribute depends on only part of a multi-valued key, remove it to a separate table.

3NF: Eliminate Columns Not Dependent On Key

If attributes do not contribute to a description of the key, remove them to a separate table. All attributes must be directly dependent on the primary key.

BCNF: Boyce-Codd Normal Form

If there are non-trivial dependencies between candidate key attributes, separate them out into distinct tables.

4NF:

Isolate Independent Multiple Relationships- No table may contain two or more 1:n or n:m relationships that are not directly related.

5NF:

Isolate Semantically Related Multiple Relationships- There may be practical constrains on information that justify separating logically related many-to-many relationships.

ONF:

Optimal Normal Form- A model limited to only simple (elemental) facts, as expressed in Object Role Model notation.

DKNF:

Domain-Key Normal Form- A model free from all modification anomalies. Remember, these normalization guidelines are cumulative. For a database to be in 3NF, it must first fulfill all the criteria of a 2NF and 1NF database.

3. What is Denormalization?

Denormalization is a database optimization technique in which we add redundant data to one or more tables. This can help us avoid costly joins in a relational database. Note that denormalization does not mean not doing normalization. It is an optimization technique that is applied after doing normalization.

For example, in a normalized database, we might have a Courses table and a Teachers table.Each entry in Courses would store the teacherID for a Course but not the teacherName. When we need to retrieve a list of all Courses with the Teacher name, we would do a join between these two tables. In some ways, this is great; if a teacher changes is or her name, we only have to update the name in one place.

The drawback is that if tables are large, we may spend an unnecessarily long time doing joins on tables.

Denormalization, then, comes into picture. Under denormalization, we decide that we’re okay with some redundancy and some extra effort to update the database in order to get the efficiency advantages of fewer joins.

4. What Is Stored Procedure?

A stored procedure is a set of Structured Query Language (SQL) statements with an assigned name, which is stored in a relational database management system as a group, so it can be reused and shared by multiple programs.

Stored procedures can access or modify data in a database, but it is not tied to a specific database or object, which offers a number of advantages.

A stored procedure provides an important layer of security between the user interface and the database. It supports security through data access controls because end users may enter or change data, but do not write procedures. You can grant execute rights to a stored procedure but the user will not need to have read/write permissions on the underlying tables.

A stored procedure provides an important layer of security between the user interface and the database. It supports security through data access controls because end users may enter or change data, but do not write procedures. You can grant execute rights to a stored procedure but the user will not need to have read/write permissions on the underlying tables.

5. What is a Schema?

A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied on the data. As part of a data dictionary, a database schema indicates how the entities that make up the database relate to one another, including tables, views, stored procedures, and more.

Typically, a database designer creates a database schema to help programmers whose software will interact with the database. The process of creating a database schema is called data modeling.

6. What Is Trigger?

In a DBMS, a trigger is a SQL procedure that initiates an action (i.e., fires an action) when an event (INSERT, DELETE or UPDATE) occurs. Since triggers are event-driven specialized procedures, they are stored in and managed by the DBMS. A trigger cannot be called or executed; the DBMS automatically fires the trigger as a result of a data modification to the associated table. Each trigger is attached to a single, specified table in the database.

Triggers can be viewed as similar to stored procedures in that both consist of procedural logic that is stored at the database level. Stored procedures, however, are not event-driven and are not attached to a specific table as triggers are. Stored procedures are explicitly executed by invoking a CALL to the procedure while triggers are implicitly executed. Triggers are used to maintain the referential integrity of data by changing the data in a systematic fashion.

7. What Is View?

Views in SQL are a kind of virtual tables. A view also has rows and columns as they are in a real table in the database. A view can be created from one or many tables which depends on the written SQL query to create a view. A View can either have all the rows of a table or specific rows based on certain condition. You can consider it as a subset of a table or union of two or more tables with all or few columns in those tables

A view is nothing more than a SQL statement that is stored in the database with an associated name. A view is actually a composition of a table in the form of a predefined SQL query. Once a view is created you can use it as you would use any table in a SELECT statement.

8. What Is Index?

Indexing is a data structure technique to efficiently retrieve records from the database files based on some attributes on which the indexing has been done. Simply put, an index is a pointer to data in a table.

An index in a database is very similar to an index in the back of a book. For example, if you want to reference all pages in a book that discusses a certain topic, you first refer to the index, which lists all the topics alphabetically and are then referred to one or more specific page numbers.

Indexing is a way to optimize performance of a database by minimizing the number of disk accesses required when a query is processed.