Fundamentals
SQL Joins
SQL joins are used to combine rows from two or more tables based on a related column between them. Here are the main types of SQL joins:
INNER JOIN
Returns records that have matching values in both tables. It creates the result-set by combining all rows from both tables where the condition is satisfied.
Example:
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate FROM Orders INNER JOIN Customers ON Orders.CustomerID=Customers.CustomerID;
LEFT JOIN
Returns all records from the left table, and the matched records from the right table. For unmatched records, NULL values are returned for the right table columns.
Example:
SELECT Student.NAME, StudentCourse.COURSE_ID FROM Student LEFT JOIN StudentCourse ON Student.ROLL_NO = StudentCourse.ROLL_NO;
RIGHT JOIN
Returns all records from the right table, and the matched records from the left table. For unmatched records, NULL values are returned for the left table columns.
Example:
SELECT Student.NAME, StudentCourse.COURSE_ID FROM Student RIGHT JOIN StudentCourse ON Student.ROLL_NO = StudentCourse.ROLL_NO;
FULL JOIN
Returns all records when there is a match in either left or right table. For unmatched records, NULL values are returned for the other table columns.
Example:
SELECT * FROM Artist FULL OUTER JOIN Album ON Artist.artist_id = Album.artist_id;
CROSS JOIN
Returns the Cartesian product of rows from the two joined tables. It multiplies the number of rows from the first table by the number of rows from the second table.
Example:
SELECT * FROM Artist CROSS JOIN Album;
Views in SQL
In SQL, views are virtual tables that represent the result of a query. They can simplify complex queries, enhance security, and provide a customized interface to the underlying data. There are several types of views in SQL, each serving different purposes. Here are the main types:
1. Simple Views
- Definition: Based on a single table and do not include functions or groupings.
- Usage: Useful for straightforward data retrieval without complexity.
2. Complex Views
- Definition: Derived from multiple tables and can include functions, joins, and groupings.
- Usage: Suitable for aggregating data from various sources, allowing more intricate queries.
3. Materialized Views
- Definition: Store the result of a query physically, which improves performance for large datasets.
- Usage: Ideal for scenarios where data is queried frequently and performance is critical.
4. Indexed Views
- Definition: A type of materialized view that has a unique clustered index created on it.
- Usage: Enhances performance for certain types of queries, especially those that aggregate many rows. However, they are not suitable for frequently updated data sets[1][3].
5. Partitioned Views
- Definition: Combine horizontally partitioned data from multiple tables, making it appear as a single table.
- Usage: Useful for distributing data across different servers or databases while maintaining a unified view.
6. System Views
- Definition: Predefined views in SQL Server that expose metadata about the database.
- Usage: Used to retrieve information about the database structure and objects, such as user-defined databases.
7. User-defined Views
- Definition: Created by users to meet specific requirements.
- Usage: Allow customization of data access and representation based on user needs
What are Computed Columns in SQL?
Computed columns are virtual columns in a table whose values are calculated from an expression using other columns in the same table. The expression can include:
- Non-computed column names
- Constants
- Functions
- Combinations of these connected by operators
Some key points about computed columns:
- They are not physically stored in the table unless marked as
PERSISTED
- Values are recalculated every time the column is referenced in a query
- Values are automatically updated when any columns used in the expression change
- Can be used in indexes, constraints, and views if marked as
PERSISTED
- Cannot be the target of
INSERT
orUPDATE
statements
Creating Computed Columns
Computed columns can be created in two ways:
- Using the
CREATE TABLE
orALTER TABLE
statements:
CREATE TABLE dbo.Products ( ProductID int IDENTITY(1,1) NOT NULL, QtyAvailable smallint, UnitPrice money, InventoryValue AS QtyAvailable * UnitPrice );
- Using the table designer in SQL Server Management Studio (SSMS):
- Right-click the table and select "Design"
- Add a new column
- Set the data type to "Computed Column Specification"
- Enter the expression formula
Persisted vs Non-Persisted Computed Columns
By default, computed columns are non-persisted, meaning their values are calculated on-the-fly when queried. This can impact performance for complex expressions.To physically store the computed values, mark the column as PERSISTED :
ALTER TABLE dbo.Products ADD InventoryValue AS QtyAvailable * UnitPrice PERSISTED;
Persisted computed columns:
- Improve query performance
- Allow indexing for faster lookups
- Require extra storage space
- Incur overhead when data changes (DML operations)
Normalization vs Denormalization
Factor | Normalization | Denormalization |
Purpose | Aims to minimize data redundancy and improve data integrity. | Aims to improve query performance by introducing redundancy. |
Data Redundancy | Reduces redundancy by organizing data into multiple related tables. | May introduce redundancy by combining multiple tables into one. |
Performance | Can lead to complex queries and may affect read performance due to multiple joins. | Improves read performance by simplifying queries and reducing the number of joins. |
Complexity | Increases complexity for write operations due to multiple tables. | Simplifies read operations but can complicate write operations due to redundancy. |
Data Integrity | Maintains data integrity, ensuring consistency across the database. | May compromise data integrity due to redundancy, leading to potential anomalies. |
Use Cases | Best for transactional systems where data accuracy and consistency are critical. | Best for read-heavy workloads, reporting, and analytics where speed is prioritized. |
Memory Consumption | Optimizes memory usage by minimizing data duplication. | May lead to increased memory usage due to redundant data storage. |
OLTP vs OLAP
Feature | OLTP (Online Transaction Processing) | OLAP (Online Analytical Processing) |
Definition | A system designed for managing day-to-day transaction-oriented applications. | A system designed for complex queries and data analysis for decision-making. |
Data Structure | Uses a normalized database structure to minimize redundancy. | Often employs a denormalized structure, such as a star or snowflake schema, for efficient querying. |
Transaction Type | Supports a large number of short online transactions (Insert, Update, Delete). | Primarily supports complex queries (mostly Select operations). |
Response Time | Typically has a response time in milliseconds. | Response time is generally in seconds to minutes due to complex queries. |
Data Volume | Handles current operational data, usually in smaller volumes. | Handles large volumes of historical data, often in terabytes or petabytes. |
User Type | Used by operational staff (e.g., cashiers, clerks) for daily tasks. | Used by analysts and decision-makers (e.g., managers, executives) for insights. |
Purpose | Focused on transaction integrity and speed for daily operations. | Focused on data analysis, reporting, and business intelligence. |
Data Integrity | Strong emphasis on maintaining data integrity and consistency. | Less emphasis on data integrity; data is often read-only and updated less frequently. |
Usage Frequency | Frequent updates and modifications to the database. | Infrequent updates; primarily read operations for analysis. |
Complexity of Queries | Queries are simple and standardized. | Queries are complex, often involving aggregations and multiple joins. |
Backup Strategy | Requires complete and incremental backups frequently. | Backups are less frequent and less critical compared to OLTP. |
Examples | Banking systems, retail transactions, online booking systems. | Data warehousing, business reporting, financial analysis. |
Functions vs Stored Procedures
Feature | Functions | Stored Procedures |
Return Type | Returns a single value or a table. | Can return zero, one, or multiple values. |
SQL Operations | Only allows SELECT statements; cannot modify data. | Can perform any SQL operation (SELECT, INSERT, UPDATE, DELETE). |
Parameters | Supports only input parameters. | Supports both input and output parameters. |
Transactions | Cannot contain transactions. | Can contain transactions. |
Calling Other Objects | Cannot call stored procedures. | Can call functions. |
Execution | Can be used inline in queries and JOINs. | Executed separately using the EXEC statement. |
Complexity | Generally simpler and more limited in functionality. | More complex and can contain multiple SQL statements. |
Use Case | Best for calculations and returning single values. | Best for executing complex business logic and multiple operations. |
What are the differences between SQL and PL/SQL?
SQL | PL/SQL |
SQL is a query execution or commanding language | PL/SQL is a complete programming language |
SQL is a data-oriented language. | PL/SQL is a procedural language |
SQL is very declarative in nature. | PL/SQL has a procedural nature. |
It is used for manipulating data. | It is used for creating applications. |
We can execute one statement at a time in SQL | We can execute blocks of statements in PL/SQL |
SQL tells databases, what to do? | PL/SQL tells databases how to do. |
We can embed SQL in PL/SQL | We can not embed PL/SQL in SQL |
Operators in SQL
Operator Type | Operator | Description | Example |
Arithmetic Operators | + | Addition | SELECT salary + 500 AS new_salary FROM employees; |
ㅤ | - | Subtraction | SELECT price - discount AS final_price FROM products; |
ㅤ | * | Multiplication | SELECT quantity * price AS total_cost FROM orders; |
ㅤ | / | Division | SELECT total / number_of_items AS average FROM sales; |
ㅤ | % | Modulus (remainder) | SELECT 10 % 3 AS remainder; |
Comparison Operators | = | Equal to | SELECT * FROM products WHERE price = 100; |
ㅤ | > | Greater than | SELECT * FROM products WHERE price > 100; |
ㅤ | < | Less than | SELECT * FROM products WHERE price < 100; |
ㅤ | >= | Greater than or equal to | SELECT * FROM products WHERE price >= 100; |
ㅤ | <= | Less than or equal to | SELECT * FROM products WHERE price <= 100; |
ㅤ | <> | Not equal to | SELECT * FROM products WHERE price <> 100; |
Logical Operators | AND | True if both conditions are true | SELECT * FROM customers WHERE country = 'USA' AND state = 'CA'; |
ㅤ | OR | True if either condition is true | SELECT * FROM customers WHERE country = 'USA' OR country = 'Canada'; |
ㅤ | NOT | Negates the condition | SELECT * FROM customers WHERE NOT country = 'USA'; |
Special Operators | BETWEEN | Checks if a value is within a range | SELECT * FROM orders WHERE order_date BETWEEN '2022-01-01' AND '2022-12-31'; |
ㅤ | LIKE | Performs pattern matching with wildcards | SELECT * FROM customers WHERE name LIKE 'A%'; |
ㅤ | IN | Checks if a value matches any in a list | SELECT * FROM products WHERE category IN ('Electronics', 'Clothing'); |
ㅤ | EXISTS | Checks if a subquery returns any results | SELECT * FROM customers WHERE EXISTS (SELECT * FROM orders WHERE customers.id = orders.customer_id); |
ㅤ | IS NULL | Checks for null values | SELECT * FROM employees WHERE manager_id IS NULL; |
QnA
What is the difference between CHAR and VARCHAR2 datatype in SQL?
Both of these data types are used for characters, but varchar2 is used for character strings of variable length, whereas char is used for character strings of fixed length. For example, if we specify the type as char(5) then we will not be allowed to store a string of any other length in this variable, but if we specify the type of this variable as varchar2(5) then we will be allowed to store strings of variable length. We can store a string of length 3 or 4 or 2 in this variable.
What do you mean by data definition language?
Data definition language or DDL allows to execution of queries like CREATE, DROP, and ALTER. That is those queries that define the data.
What do you mean by data manipulation language?
Data manipulation Language or DML is used to access or manipulate data in the database. It allows us to perform the below-listed functions:
- Insert data or rows in a database
- Delete data from the database
- Retrieve or fetch data (SELECT)
- Update data in a database.
What is the view in SQL?
Views in SQL are a kind of virtual table. A view also has rows and columns as they are on a real table in the database. We can create a view by selecting fields from one or more tables present in the database. A View can either have all the rows of a table or specific rows based on certain conditions.
The CREATE VIEW statement of SQL is used for creating views.
CREATE VIEW view_name AS SELECT column1, column2..... FROM table_name WHERE condition; view_name: Name for the View table_name: Name of the table condition: Condition to select rows
What do you mean by foreign key?
A Foreign Key is a field that can uniquely identify each row in another table. And this constraint is used to specify a field as a Foreign key. That is this field points to the primary key of another table. This usually creates a kind of link between the two tables.