Name: sqlstyle.guide
Owner: AutoScout24
Description: A consistent code style guide for SQL to ensure legible and maintainable projects
Forked from: treffynnon/sqlstyle.guide
Created: 2018-02-19 10:00:09.0
Updated: 2018-02-19 10:04:22.0
Pushed: 2018-02-19 11:56:20.0
Size: 330
Language: HTML
GitHub Committers
User | Most Recent Commit | # Commits |
---|
Other Committers
User | Most Recent Commit | # Commits |
---|
These guidelines are designed to be compatible with Joe Celko's book to make adoption for teams who have already read that book easier. This guide is a little more opinionated in some areas and in others a little more relaxed. It is certainly more succinct where Celko's book contains anecdotes and reasoning behind each rule as thoughtful prose.
SQL style guide by Simon Holywell is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
YYYY-MM-DD HH:MM:SS.SSSSS
).WHERE
clauses that can otherwise be derived./*
and
closing */
where possible otherwise precede comments with --
and finish
them with a new line.CT file_hash -- stored ssdeep hash
file_system
E file_name = '.vimrc';
pdating the file record after writing to the file */
TE file_system
le_modified_date = '1980-02-22 13:19:01.00000',
le_size = 209732
E file_name = '.vimrc';
sp_
or tbl
.staff
instead of employees
or people
instead of individuals
.first_name
).CT first_name
staff;
staff
and employees
.tbl
or any other such descriptive prefix or Hungarian
notation.cars_mechanics
prefer services
.id
as the primary identifier for the table.AS
keyword?makes it easier to read as it is explicit.SUM()
or AVG()
) use the name you would give it were it
a column defined in the schema.CT first_name AS fn
staff AS s1
students AS s2
s2.mentor_id = s1.staff_num;
CT SUM(s.monitor_tally) AS monitor_total
staff AS s;
sp_
or any other such descriptive prefix or Hungarian
notation.The following suffixes have a universal meaning ensuring the columns can be read and understood easily from SQL code. Use the correct suffix where appropriate.
_id
?a unique identifier such as a column that is a primary key._status
?flag value or some other status of any type such as
publication_status
._total
?the total or sum of a collection of values._num
?denotes the field contains any kind of number._name
?signifies a name such as first_name
._seq
?contains a contiguous sequence of values._date
?denotes a column that contains the date of something._tally
?a count._size
?the size of something such as a file size or clothing._addr
?an address for the record could be physical or intangible such as
ip_addr
.Always use uppercase for the reserved keywords like SELECT
and WHERE
.
It is best to avoid the abbreviated keywords and use the full length ones where
available (prefer ABSOLUTE
to ABS
).
Do not use database server specific keywords where an ANSI SQL keyword already exists performing the same function. This helps to make code more portable.
CT model_num
phones AS p
E p.release_date > '2014-09-30';
To make the code easier to read it is important that the correct compliment of spacing is used. Do not crowd code or remove natural language spaces.
Spaces should be used to line up the code so that the root keywords all end on the same character boundary. This forms a river down the middle making it easy for the readers eye to scan over the code and separate the keywords from the implementation detail. Rivers are bad in typography, but helpful here.
ECT
species_name,
G(f.height) AS average_height,
G(f.diameter) AS average_diameter
flora AS f
E f.species_name = 'Banksia'
f.species_name = 'Sheoak'
f.species_name = 'Wattle'
P BY f.species_name, f.observation_date
N ALL
ECT
species_name,
G(b.height) AS average_height,
G(b.diameter) AS average_diameter
botanic_garden_flora AS b
E b.species_name = 'Banksia'
b.species_name = 'Sheoak'
b.species_name = 'Wattle'
P BY
species_name,
observation_date
Notice that the main keywords such as SELECT
, FROM
, etc. are all left aligned.
Although not exhaustive always include spaces:
=
),
)'
) where not within parentheses or with a trailing
comma or semicolon.CT
title,
release_date,
recording_date
albums AS a
E a.title = 'Charcoal Lane'
a.title = 'The New Danger';
Always include newlines/vertical space:
AND
or OR
Keeping all the keywords aligned to the righthand side and the values left aligned creates a uniform gap down the middle of query. It makes it much easier to scan the query definition over quickly too.
RT INTO albums (title, release_date, recording_date)
LUES
('Charcoal Lane', '1990-01-01 01:01:01.00000', '1990-01-01 01:01:01.00000'),
('The New Danger', '2008-01-01 01:01:01.00000', '1990-01-01 01:01:01.00000');
ql
TE albums
release_date = '1990-01-01 01:01:01.00000'
E title = 'The New Danger';
ql
CT
title,
release_date,
recording_date,
production_date -- grouped dates together
albums AS a
E a.title = 'Charcoal Lane'
a.title = 'The New Danger';
To ensure that SQL is readable it is important that standards of indentation are followed.
Joins should be indented to the other side of the river and grouped with a new line where necessary.
CT r.last_name
riders AS r
NER JOIN bikes AS b
ON r.bike_vin_num = b.vin_num
AND b.engines > 2
NNER JOIN crew AS c
ON r.crew_chief_last_name = c.last_name
AND c.chief = 'Y';
Subqueries should also be aligned to the right side of the river and then laid out using the same style as any other query. Sometimes it will make sense to have the closing parenthesis on a new line at the same character position as its opening partner?this is especially true where you have nested subqueries.
CT
last_name,
ELECT MAX(YEAR(championship_date))
ROM champions AS c
HERE c.last_name = r.last_name
AND c.confirmed = 'Y'
AS last_championship_year
riders AS r
E r.last_name IN
ELECT c.last_name
ROM champions AS c
HERE YEAR(championship_date) > '2008'
AND c.confirmed = 'Y'
BETWEEN
where possible instead of combining multiple statements
with AND
.IN()
instead of multiple OR
clauses.CASE
expression. CASE
statements can be nested to form more complex logical structures.UNION
clauses and temporary tables where possible. If the
schema can be optimised to remove the reliance on these features then it most
likely should be.CT
SE postcode
WHEN 'BN1' THEN 'Brighton'
WHEN 'EH1' THEN 'Edinburgh'
D AS city
office_locations
E country = 'United Kingdom'
D opening_time BETWEEN 8 AND 9
D postcode IN ('EH1', 'BN1', 'NN1', 'KW1')
When declaring schema information it is also important to maintain human readable code. To facilitate this ensure the column definitions are ordered and grouped where it makes sense to do so.
Indent column definitions by four (4) spaces within the CREATE
definition.
REAL
or FLOAT
types where it is strictly necessary for floating
point mathematics otherwise prefer NUMERIC
and DECIMAL
at all times. Floating
point rounding errors are a nuisance!DECIMAL
do not provide an INTEGER
default value.NOT NULL
statement.Constraints and their subset, keys, are a very important component of any database definition. They can quickly become very difficult to read and reason about though so it is important that a standard set of guidelines are followed.
Deciding the column(s) that will form the keys in the definition should be a carefully considered activity as it will effect performance and data integrity.
It is a reasoned and considered balancing act to be performed at the definition of a database. Should requirements evolve in the future it is possible to make changes to the definitions to keep them up to date.
Once the keys are decided it is possible to define them in the system using constraints along with field value validation.
UNIQUE
, PRIMARY KEY
and FOREIGN KEY
where the database vendor will generally supply sufficiently
intelligible names automatically.CREATE TABLE
statement.CREATE TABLE
definition.ON DELETE
comes before ON UPDATE
.NOT NULL
definitions could start at the same
character position. This is not hard and fast, but it certainly makes the code
much easier to scan and read.LIKE
and SIMILAR TO
constraints to ensure the integrity of strings
where the format is known.CHECK()
to prevent incorrect values entering the database or the silent
truncation of data too large to fit the column definition. In the least it
should check that the value is greater than zero in most cases.CHECK()
constraints should be kept in separate clauses to ease debugging.TE TABLE staff (
IMARY KEY (staff_num),
staff_num INT(5) NOT NULL,
first_name VARCHAR(100) NOT NULL,
pens_in_drawer INT(2) NOT NULL,
NSTRAINT pens_in_drawer_range
ECK(pens_in_drawer >= 1 AND pens_in_drawer < 100)
CHECK()
to ensure valid data is
inserted into the column.UNION
rather than just simply querying one table.