Demystifying SQL 101: A Beginner's Journey

What is SQL?

SQL stands for Structured Query Language. It is a programming language invented in the 1970s, and it is used to access and manipulate data from databases.

A database is a tool to store and organise data. Data is any sort of information (numbers, images, text, etc.) A database can store structured or unstructured data. Structured databases are also referred to as relational databases, and are characterised for having well organised and consistent data structure (the data to be stored follows a well-defined structure, which is known as schema). SQL, as you may already inferred because of the name, works with structured data. In fact, unstructured databases are also known as NoSQL databases! 

But wait, is it better to use structured databases over unstructured databases? Well, the short answer is "it depends". Structured databases are easier to query (you can access and manipulate the data easier) because they follow a predefined schema, they are more suitable when data consistency and data modelling is relevant, on the other hand, NoSQL databases are very flexible and more suitable when data types and formats vary widely.

Within relational databases usually comes the concept of table. A table is a structure inside a database where you store and organise data. A table is made up of two things: rows and columns (is just like an Excel Sheet). A database may contain multiple tables, each table storing different data. For example: you may have a customer database, and within this customer database you may have two tables: one storing customer's personal information (name, age, email, sex, etc.) and another table containing the customer's purchases. In summary, a database is a collection of data, and a table is the fundamental data structure where the data is stored within a database. 

To give you an analogy of a database and a table, think of it as a closet. In the closet you have different components where you store different clothes: 
  • Drawers (where you may store the underwear and the socks)
  • Shelfs (where you store the pants and t-shirts)
  • Hanger rods (where you store your shirts and your coats)
In this example, the closet is the database and each component of the closet (drawers, shelfs, hanger rods) is a different table within this database.

SQL Community

You may be wondering, who uses SQL and where do they use it? While SQL is a very simple programming language, it is very powerful. You can create, modify or retrieve data from a table, you can join multiple tables to obtain a single view of them, aggregate data, perform data analysis, and so many other thing. Some of the users and use cases of SQL are listed below:
  1. Database Administrators (DBA): they use SQL to manage, maintain and optimise databases.
  2. Data Analysts: they use SQL to query and analyze data to extract insights and generate reports.
  3. Business Analysts: they use SQL to query and analyze data to identify market trends and make data-driven business decisions.
  4. Data Scientists: they use SQL to retrieve and process data for machine learning, statistical analysis, and modelling.
  5. Data Engineers: Data engineers use SQL queries to transform and prepare data for storage, analytics, and reporting.
  6. Web Developers: SQL is used in web development to retrieve and display dynamic content in a website. Have you ever use a filtering option or a search feature in a webpage? Well, they may be using SQL in the background to filter the results and only show you products that meet the request.

SQL Dialects

SQL has different dialects, which are variations of the language for specific Database Management Systems (a software to create, manage, store, and retrieve data in a database, they are also known as DBMS). Each dialect is based on the core SQL standard, but they may have their own specific features and syntax. Here are some of the SQL dialects and the DBMS they are used with:
  1. Transact-SQL (T-SQL): it is used with Microsoft SQL Server
  2. MySQL SQL: used with MySQL and MariaDB
  3. SQLite SQL: used with SQLite
  4. PostgreSQL SQL: used with PostgreSQL
  5. DB2 SQL: used with IBM Db2
  6. HiveQL: used with Apache Hive
  7. Spark SQL: used with Apache Spark


desert with multiple paths
Desert with multiple paths as an analogy of SQL having distinct dialects.


Do not worry, you won't need to learn all of these dialects. Most of them share many similarities, with additional unique functions or operators, but following the same original SQL syntax. That being said, you can start with one dialect, learn the basics of SQL, and then you can easily jump to another dialect if needed!

Is SQL a programming language?

Well, some people say it is, some say it isn't. A programming language is a system used to communicate instructions to the computer. This set of instructions is what is called "code". Below are some of the conditions a programming language must meet to be considered as one: 
  • Syntax and Semantics: programming languages have a well-defined syntax and a set of rules for writing code. Programming languages have a well-defined syntax, which includes rules for writing code, keywords, operators, functions, etc. And if you are wondering, yes, SQL has a well-defined syntax and semantics.
  • Data Types: the type of values that can be stored in a program, such as numbers, strings, date, and booleans. SQL has data types for defining the type of values that can be stored in a database.
  • Variables: variables are basically a named space in memory that can store values (numbers, text, etc.) Some SQL dialects support variable declaration.
  • Operators: these are symbols used to perform specific operations, such as arithmetic operators (addition, subtraction, multiplication, etc.), comparison operators (greater than, less than, equal to, etc.), and logical operators (AND, OR). SQL includes these operators.
  • Control Structures: Statements used to control the flow of a program, allowing the program to do some of the following:
    • Conditional Statement - make decisions (execute part of the code only if a condition is met). This is usually known as IF statements. 
    • Looping Statements - loop or iterate (execute part of the code multiple times until a condition is met). Usually known as FOR and/or WHILE loops.
    • BREAK statement - stop the program (if a condition is met)
    SQL does support some of these control structure, such as CASE statements (which are conditional statements) and loop statements.
  • Data Manipulation: it is the ability to handle and modify data in various ways. SQL is limited to SELECT, CREATE, INSERT, DELETE or UPDATE tables from databases only.
  • Libraries: programming languages have libraries to develop applications. Libraries are a collection of pre-written code that can be used to perform a task. Libraries are reusable and allow developers to save time and effort. SQL have specific libraries used for database-related task only, and commonly, these libraries are used to provide connectivity with a specific database management system (such as MySQL, PostgreSQL, Oracle, and SQLite). An example of these libraries is when you use Python to develop a program and need to connect to a PostgreSQL database with user information for authentication; to achieve this, you need to use use the psycopg2 Python library to connect to the database and be able to authenticate the user.
  • Application Development: programming languages can be used to develop web applications, software, games, mobile apps, etc. SQL is often used within applications to interact with a database, although SQL is not suitable to build an application alone, it can only be used to query and manipulate data from a database.
In summary, SQL is not a General-Purpose Language (GPL), meaning that SQL cannot be used to build applications, perform a wide range of computational tasks, or control hardware and software systems. In fact, SQL can only interact with databases, therefore, it is a Domain-Specific Language (DSL). A DSL is a specialised programming language designed for a particular domain, problem, or application. So we can consider SQL as a programming language! 

How to use and where to practice SQL?

For me, SQL Fiddle or DB Fiddle are the number 1 tools to use if you are new to the SQL world. They are FREE ONLINE sites where you are able to test and share SQL queries in different SQL dialects (MySQL, PostgreSQL, and SQLite). In the next posts we will cover some basics of SQL and explain how to use SQL Fiddle and DB Fiddle.

I like to recommend these tools first as when you are learning you will be faced with multiple SQL dialects, which may require you to install the DBMS specific software to use them; and if you are unsure of which dialect you want to learn you may need to download and install all these different DBMS in order to use the different SQL dialects. With SQL Fiddle and DB Fiddle you can test and share SQL queries in various SQL dialects.

Comments

Popular Posts