a developer: 1月 2022

Requirement

Developers keep releasing new software, the version format was {major}.{minor}.{micro}.{build}
When a device ask for upgrade information, we need figure out the latest versions of software
But not only the latest one, we may need to know following version information, so that we can provide suitable recommendation

is there are 3 newer versions of software?
How many newer software versions?

Challenge

Database sort text by "Natural Sorting"
These 2 versions, the older version will be treated as the newer one if we use natural sorting

1.2.3.100 => Will be treated as older because 2 > 1
1.2.3.20 => So the version 20 will be treated as newer

Ideas

Persist major, minor, micro, build in different columns
Calculate versions to be a number, which can be sorted correctly
Merge versions to be a single text, and need to be sorted correctly (We need padLeft 0 to let all versions become the same length)
Persist major, minor, micro, build in a byte array (BLOB),
assume the natural sorting will compare BLOB from the first element of an array.

Test Steps

Prepare TestContainer for PostgreSQL testing
Generate 10000 records

major 0
minor 0-9
micro 0-9
build 0-99

Compare with a specified version: 0.6.7.58
Compare query plan and query result (Query the latest 3 records)

Code for the test

https://github.com/axxdeveloper/postgresql-schema-testing/tree/master/src/test/java/com/shooeugenesea/dao

Tables

Version1

DDL

CREATE TABLE version1 (
    id  uuid PRIMARY KEY,
    major int,
    minor int,
    micro int,
    build int
);

CREATE INDEX version1_major_idx ON version1 (major DESC);
CREATE INDEX version1_major_minor_idx ON version1 (major DESC, minor DESC);
CREATE INDEX version1_major_minor_micro_idx ON version1 (major DESC, minor DESC, micro DESC);
CREATE INDEX version1_major_minor_micro_build_idx ON version1 (major DESC, minor DESC, micro DESC, build DESC);

SQL

String sql = "SELECT * FROM Version1 " +
                "WHERE (major > 0) " +
                "OR (major = 0 AND minor > 6) " +
                "OR (major = 0 AND minor = 6 AND micro > 7) " +
                "OR (major = 0 AND minor = 6 AND micro = 7 AND build > 58) " +
                "ORDER BY major DESC, minor DESC, micro DESC, build DESC " +
                "LIMIT 3";

Query Plan

QUERY PLAN: Limit  (cost=0.29..1.16 rows=3 width=32) (actual time=0.065..0.193 rows=3 loops=1)
QUERY PLAN:   ->  Index Scan using version1_major_minor_micro_build_idx on version1  (cost=0.29..982.84 rows=3366 width=32) (actual time=0.051..0.073 rows=3 loops=1)
QUERY PLAN:         Filter: ((major > 0) OR ((major = 0) AND (minor > 6)) OR ((major = 0) AND (minor = 6) AND (micro > 7)) OR ((major = 0) AND (minor = 6) AND (micro = 7) AND (build > 58)))
QUERY PLAN: Planning Time: 0.920 ms
QUERY PLAN: Execution Time: 0.307 ms
QUERY RESULT: {major=0, minor=9, micro=9, build=99, id=7e5878ff-9d25-46dc-a0c0-79196fd8c5d3}
QUERY RESULT: {major=0, minor=9, micro=9, build=98, id=ee2097ef-fe86-491e-869b-afda5976a354}
QUERY RESULT: {major=0, minor=9, micro=9, build=97, id=ada64c48-72e2-4a39-a6b0-5eabda135d0d}

Version2

DDL

CREATE TABLE version2 (
    id  uuid PRIMARY KEY,
    build int
);

CREATE INDEX version2_order_idx ON version2 (build DESC);

SQL

String sql = "SELECT * FROM Version2 " +
                "WHERE build > " + getVersion2Number(0,6,7,58) + 
		" ORDER BY build DESC" +
		" LIMIT 3";

Query Plan

QUERY PLAN: Limit  (cost=0.29..0.61 rows=3 width=20) (actual time=0.053..0.124 rows=3 loops=1)
QUERY PLAN:   ->  Index Scan using version2_order_idx on version2  (cost=0.29..391.76 rows=3627 width=20) (actual time=0.037..0.060 rows=3 loops=1)
QUERY PLAN:         Index Cond: (build > 6758)
QUERY PLAN: Planning Time: 0.388 ms
QUERY PLAN: Execution Time: 0.195 ms
QUERY RESULT: {build=9999, id=e32b8067-9cae-4c49-b776-372e8a2137e4}
QUERY RESULT: {build=9998, id=c2902963-6579-48d3-bb91-deac6a8528bd}
QUERY RESULT: {build=9997, id=de28843c-b245-4b46-8e2d-7e549f0b862e}

Version3

DDL

CREATE TABLE version3 (
    id  uuid PRIMARY KEY,
    build bytea
);

CREATE INDEX version3_order_idx ON version3 (build DESC);

SQL

String sql = "SELECT * FROM Version3 " +
                "WHERE build > ?" +
                " ORDER BY build DESC" +
                " LIMIT 3";

Query Plan

QUERY PLAN: Limit  (cost=0.28..0.77 rows=3 width=48) (actual time=0.063..0.130 rows=3 loops=1)
QUERY PLAN:   ->  Index Scan using version3_order_idx on version3  (cost=0.28..368.24 rows=2283 width=48) (actual time=0.047..0.068 rows=3 loops=1)
QUERY PLAN:         Index Cond: (build > '\x000506073a'::bytea)
QUERY PLAN: Planning Time: 0.329 ms
QUERY PLAN: Execution Time: 0.214 ms
[0, 9, 9, 99]
[0, 9, 9, 98]
[0, 9, 9, 97]

Version4

DDL

CREATE TABLE version4 (
    id  uuid PRIMARY KEY,
    build VARCHAR 
);

CREATE INDEX version4_build_idx ON version4 (build DESC);

SQL

String sql = "SELECT * FROM Version4 " +
                "WHERE build > '" + getVersion4Text(0, 6, 7, 58) + "'" +
                " ORDER BY build DESC" +
                " LIMIT 3";

Query Plan

QUERY PLAN: Limit  (cost=0.29..0.80 rows=3 width=48) (actual time=0.074..0.137 rows=3 loops=1)
QUERY PLAN:   ->  Index Scan using version4_build_idx on version4  (cost=0.29..512.72 rows=2996 width=48) (actual time=0.060..0.079 rows=3 loops=1)
QUERY PLAN:         Index Cond: ((build)::text > '0000000600070058'::text)
QUERY PLAN: Planning Time: 0.313 ms
QUERY PLAN: Execution Time: 0.256 ms
QUERY RESULT: {build=0000000900090099, id=29f9f1a0-40eb-4a31-873f-ff04868fa3d1}
QUERY RESULT: {build=0000000900090098, id=1ead5f4d-73d9-4cda-b899-f745b90f8598}
QUERY RESULT: {build=0000000900090097, id=bc9a4dfe-12c9-4c34-8925-836e8c4a7ad3}

Comparison

Options	cost	Pros & Cons
Version1 (split columns)	(cost=0.29..982.84 rows=3366 width=32)	Pros: Flexible, can change SQL easily Cons: Slow
Version2 (calculate to number)	(cost=0.29..391.76 rows=3627 width=20)	Pros: Fast Cons: Need calculate before persist and may need migrate if the logic to compare changed Sorting will be broken if the number exceed max number
Version3 (calculate to string)	(cost=0.28..368.24 rows=2283 width=48)	Pros: Fast and don't have max number issue Cons: Need calculate before persist and may need migrate if the logic to compare changed
Version4 (calculate to byte array)	(cost=0.29..512.72 rows=2996 width=48)	Pros: Don't need extra calculation Cons: Easy to exceed max byte number, so still need extra calculation Slow

a developer

PostgreSQL version schema performance comparison

別名演算法 Alias Method

Visit

Translate