Skip to content

Using Bmg when coming from SQL, a cheatsheet

This pages summarizes usual SQL queries and their equivalent in Bmg. Please carefully read the notes below each example, as subtle semantics differences exist between Bmg and SQL regarding: duplicate tuples, duplicate attribute names, tuple ordering and null.

And remember: Bmg truly composes. You can for instance use union or restrict after a group by (summarize), use restrict multiple times in a row, and make any other composition that would fail in SQL or with query composer libraries.

SELECT

SELECT * FROM suppliers

suppliers

SELECT sid, name FROM suppliers

suppliers
.project([:sid, :name])

DISTINCT

SELECT DISTINCT city FROM suppliers

suppliers
.project([:city])

Unlike SQL, Bmg never returns duplicates, hence the absence of a .distinct operator. Bmg’s SQL compiler will automatically introduce DISTINCT if needed to filter duplicates.

Renaming…

SELECT
sid AS supplier_id
FROM suppliers

suppliers
.rename(:sid => :supplier_id)

SELECT
id AS supplier_id,
name AS supplier_name,
city AS supplier_city,
status AS supplier_status
FROM suppliers

suppliers
.prefix(:supplier_)

SELECT
id AS id_supplier,
name AS name_supplier,
city AS city_supplier,
status AS status_supplier
FROM suppliers

suppliers
.suffix(:_supplier)

Unlike SQL, Bmg does not support multiple attributes having the same name (aka duplicate attributes). Bmg’s .prefix and .suffix have no equivalent in SQL: you’ll have to do the job manually, as in the example above.

All but an attribute…

SELECT
sid,
name,
city
FROM suppliers

suppliers
.allbut([:status])

SQL does not support an ALLBUT keyword. You’ll have to select all attributes but those that you want to throw away, as in the example above.

WHERE

SELECT * FROM suppliers
WHERE city = 'London'

suppliers
.restrict(city: 'London')

or

p = Predicate.eq(:city, 'London')
suppliers
.restrict(p)

SELECT * FROM suppliers
WHERE city <> 'London'

suppliers
.exclude(city: 'London')

or

p = Predicate.neq(:city, 'London')
suppliers
.restrict(p)

SELECT * FROM suppliers
WHERE city IN ('London', 'Paris')

suppliers
.restrict(city: ['London', 'Paris'])

or

p = Predicate.in(:city, ['London', 'Paris'])
suppliers
.restrict(p)

Please refer to Predicate’s documentation or Bmg’s README for the full list of supported predicates.

EXISTS / IN

SELECT * FROM suppliers s
WHERE EXISTS (
SELECT * FROM supplies ss
WHERE ss.sid = s.sid
)

suppliers
.matching(supplies, [:sid])

SELECT * FROM suppliers s
WHERE NOT EXISTS (
SELECT * FROM supplies ss
WHERE ss.sid = s.sid
)

suppliers
.not_matching(supplies, [:sid])

We discourage the usage of WHERE (NOT) IN in SQL since its semantics in presence of NULL may generate subtle bugs. Bmg’s matching and not_matching can be used everytime you’d be using IN in SQL.

JOIN

CROSS

SELECT s.*, p.*
FROM
suppliers s,
parts p

suppliers
.cross_join(parts)

or

suppliers
.join(parts, [])

Please note that Bmg does not allow duplicate attribute names. Hence, the examples above are only equivalent if suppliers and parts have no attribute in common. In practice, Bmg simply ignores attributes of the right operand that already exist on the left operand, in the join result.

In other words, the following compilation actually applies:

suppliers
.cross_join(parts)

SELECT
s.sid,
s.name,
s.status,
s.city,
p.pid
-- p.name, p.status and p.city are ignored
FROM
suppliers s,
parts p

EQUI

SELECT s.*, ss.*
FROM
suppliers s
JOIN
supplies ss ON ss.sid = s.sid

suppliers
.join(supplies, [:sid])

SELECT s.*, ss.*
FROM
suppliers s
JOIN
supplies ss ON ss.supplier_id = s.id

suppliers
.join(supplies, :supplier_id => :sid)

The same remark about duplicate attribute names applies here. In presence of conflictual attributes, the following compilation applies:

suppliers
.join(parts, [:city])

SELECT
s.sid,
s.name,
s.city,
s.status,
p.pid
-- p.name, p.status and p.city are ignored
FROM
suppliers s
JOIN
parts p ON p.city = s.city

THETA

SELECT s.*, p.*
FROM
suppliers s
JOIN
parts p ON s.status > p.status

suppliers
.rename(:status => :s_status)
.cross_join(parts.rename(:status => :p_status))
.restrict(Predicate.gt(:s_status, :p_status))

The same remark about duplicate attribute names applies here.

LEFT

SELECT
s.sid,
s.name,
s.city,
s.status,
COALESCE(p.pid, 'none') AS pid
FROM
suppliers s
LEFT JOIN
parts p ON s.city = p.city

suppliers
.left_join(parts, [:city], :pid => 'none')

Bmg allows providing an optional ‘default’ tuple for missing ones on right operand. Not using it generates NULL/nil like SQL, which may expose subtle bugs further down in your data transformation.

UNION

SELECT city FROM suppliers
UNION
SELECT city FROM parts

suppliers
.project([:city])
.union(parts.project([:city]))

MINUS

SELECT city FROM suppliers
MINUS
SELECT city FROM parts

suppliers
.project([:city])
.minus(parts.project([:city]))

GROUP BY

SELECT
city, MAX(status) AS status
FROM
suppliers s
GROUP BY
city

suppliers
.summarize([:city], :status => :max)

SELECT
city, COUNT(*) AS c
FROM
suppliers s
GROUP BY
city

suppliers
.summarize([:city], :c => Bmg::Summarizer.count)

HAVING

SELECT
city, MAX(status) AS status
FROM
suppliers s
GROUP BY
city
HAVING
status < 100

suppliers
.summarize([:city], :status => :max)
.restrict(Predicate.lt(:status, 100))

Bmg has no special HAVING construct, it’s as simple as using restrict (WHERE) like SQL should do.

LIMIT / OFFSET

SELECT * FROM suppliers s
ORDER BY status DESC
LIMIT 3

suppliers
.page([[:status, :desc]], 1, :page_size => 3)

SELECT * FROM suppliers s
ORDER BY status DESC
OFFSET 6 LIMIT 3

suppliers
.page([[:status, :desc]], 3, :page_size => 3)

Unlike SQL, Bmg does not support limit/offset without specifying an ordering relation (SQL’s ORDER BY), since the result is non deterministic.