Versioning, Conventional Commit


Software Engineering

(for Intelligent Distributed Systems)

Module “Principles and Methods”

A.Y. 2023/2024

Giovanni Ciatto (reusing material made by Danilo Pianini)


Compiled on: 2024-05-07

Software versioning

Versioning: the process of assigning a unique identifier to a unique state of some software

  • Used to distinguish different software states
  • Used to refer to different states of the same software
  • The identifier is normally a sequence of alphanumeric characters separated by dots, slashes, and dashes
  • Assigning IDs in a predictable way can help convey information about the software itself

Versioning levels

Versioning can happen at different levels, for instance:

  • Code (DVCS)
    • Fine grained
    • Automatic
    • Non-progressive
    • Non-linear (i.e., versions can’t be sorted)
  • Subproject / feature version
  • Software release
    • Usually manual
    • Usually linear

Versioning scopes

W.r.t. the administrative boundaries (e.g. the organization, the department, the team) versioning can be:

  • Internal
    • Identifies a point in development
    • Changes do not impact the “outer world”
  • External
    • A publicly visible release of the software
    • Changes are disruptive

Often for open source projects, internal and external versioning coincide

Versioning approaches

Code naming

The version is represented by a (usually pronounceable) word, short phrase or acronym

Examples: macOS Sierra, Windows Vista, Ubuntu Bionic Beaver

  • With few exceptions, does not provide any direct information on the project (often on purpose)
  • Very frequently used internally to refer to new feature sets
  • Often used to “protect” the corporation from information leaking
  • Often changed to purposely create confusion
  • Used to separate pre-release versions from final releases
  • Often associated, for commercial reasons, with other version numbers
    • e.g. Ubuntu 18.04 Bionic Beaver or macOS 10.13.5 High Sierra
  • Reasons for code-names are often political and commercial rather than technical

Versioning approaches

Date based versioning

The version is represented by a string representing the release date

Examples: Ubuntu 18.04, Windows 2000, Office 2007, Visual Studio 2022

  • Dates don’t always match development rate

    • A project may change more in a week than in a year
  • Useful for projects that are fast-paced (multiple releases per week)

  • Useful as a companion for other versioning schemes

  • Useful at the commercial level for clearly indicating the novelty (and the age) of a project

    • e.g. Windows 98, Office 2003

Versioning approaches

Unary numbering

The version is represented by a string whose length grows at each version

Example: $\TeX$ 3.1, 3.14, 3.141, …, $\pi$

  • Only useful for projects that have reached maturity

  • Extremely unlikely in today’s software world

  • May lead to version length explosion
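
A minimal sketch of unary numbering in the style of the $\TeX$ example above: each release appends one more decimal of $\pi$. The function name `unary_version` and the hard-coded digit string are illustrative assumptions.

```python
# First decimals of pi, hard-coded for this sketch.
PI_DIGITS = "3.14159265358979"


def unary_version(n: int) -> str:
    """Version of the n-th release: '3.' followed by the first n decimals of pi."""
    if not 1 <= n <= len(PI_DIGITS) - 2:
        raise ValueError("n is out of range for the hard-coded digits")
    return PI_DIGITS[: 2 + n]


for n in (1, 2, 3):
    print(unary_version(n))  # 3.1, then 3.14, then 3.141
```

The version string grows by one character per release, which is why the scheme only suits projects that release very rarely.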

Versioning approaches

Degree of backward compatibility

The version is represented by one or more numbers, separately incremented, that
reflect incrementally widespread changes in the product

Example: 1.0.1, 1.1.0, 2.0.0

  • Often used in conjunction with other techniques
  • Often used badly (see the Linux kernel)
  • Formal methodologies for applying it exist
  • Sometimes instead of indicating API-level changes, the version may indicate user-level perceivable changes
    • Very much depends on who the clients/customers are

Versioning in the real world

Microsoft Windows versioning

Combination of all the techniques:

  • Dates (Windows 9x, 2000)

  • Codenames (NT, Vista, XP, Millennium Edition)

  • Pre-release code-names (Longhorn)

  • Dates for internal builds

  • Incremental versions on multiple levels

    • e.g., Windows 95 is also MS-DOS 7.0 and Windows 4.00
  • Separation between “commercial” versions and “actual” versions

    • e.g., Windows 7 is actually Windows 6.1, and early Windows 10 preview builds were Windows 6.4 (the final release jumped to 10.0)
    • Had the internal numbering continued, a future Windows could have become Windows 7.0, clashing with the “commercial” version of an older product

Proliferation of methodologies and inconsistencies leads to issues

Versioning in the real world

Macintosh versioning

  • Different versioning styles over time

  • Version 1 to 9: commercial names were linear

  • Version X (10 in Roman numerals) became part of the commercial name

  • Mac OS X was a series of products, distinguished by big-cat names and sub-numbers

    • e.g. Mac OS X 10.0 (code name Cheetah) … Mac OS X 10.8 (code name Mountain Lion)
  • The product name became simply macOS with version 10.12 (Sierra)

  • Version 11 (Big Sur) is the first version of macOS to leave the 10.x numbering

  • Current version is 14 (Sonoma)

  • The versioning scheme of Darwin (i.e. the macOS kernel) is completely different from the commercial version number

    • the latest version is 23.4.0

Versioning in the real world

Canonical Ubuntu versioning

Association of a date in the format YY.MM and a two-word code name in the form Adjective AnimalName. Both words of the code name begin with the same letter.

  • Version number does not track changes

    • The development is arguably linear
    • But actually new versions may bring in substantial novelties, e.g. entirely new desktop environments
  • “LTS” can be optionally appended to identify “Long Term Support” versions

  • Two versions of Ubuntu can be compared by date but also by the first letter of their codename

    • “Zesty Zapus” is newer than “Utopic Unicorn”
    • Unfortunately, since there are a limited number of letters, “Bionic Beaver” is newer than “Xenial Xerus” and “Zesty Zapus”
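
The two ways of comparing Ubuntu versions can be sketched as follows. The year/month pairs are read off the YY.MM version numbers of the releases named above (14.10, 16.04, 17.04, 18.04). The helper name `version_number` is an illustrative assumption.

```python
# Release year and month, as encoded in each YY.MM version number.
releases = {
    "Utopic Unicorn": (2014, 10),
    "Xenial Xerus": (2016, 4),
    "Zesty Zapus": (2017, 4),
    "Bionic Beaver": (2018, 4),
}


def version_number(year: int, month: int) -> str:
    """Build the YY.MM version string from a release year and month."""
    return f"{year % 100:02d}.{month:02d}"


# Ordering by date is always reliable.
by_date = sorted(releases, key=releases.get)
# Ordering by the first letter of the code name breaks once the alphabet wraps around.
by_letter = sorted(releases, key=lambda name: name[0])

print(by_date)    # Bionic Beaver comes last: it is the newest
print(by_letter)  # Bionic Beaver comes first: the letter ordering is misleading
```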

“Support” of software products (pt. 1)

What happens to a software product after a new version is released?

  1. Let’s say product X is released under code name Abstract Alfred with version 1.0.0

  2. Developers keep working on the product, and release a new version 1.1.0 under code name Boring Beatrix

  3. Some severe security issue is found in version 1.1.0 and a patch 1.1.1 is released

    • version 1.1.1 is now released under Boring Beatrix v2 code name
  4. The bug is still affecting Abstract Alfred, which still has a lot of active users

    • a patch 1.0.1 is released for Abstract Alfred, containing the same fix as 1.1.1, but none of the new features of 1.1.0
    • the patch is released under code name Abstract Alfred v2

Takeaways:

  • a version does not cease to exist when the next one is released
  • bug fixes are often back-ported to older versions

“Support” of software products (pt. 2)

  • Big software companies often provide support for their products, for a given amount of time

    • this is to give time to users/customers to migrate to newer versions
  • When releasing a new version of a product, the company shall keep spending money to maintain the old one

    • e.g. keep fixing bugs, keep providing security patches
    • e.g. providing guides for how to migrate to the new versions
  • The end of support is the date after which the company will not provide any more support for the product

    • no more updates will be released
    • no more help for users on how to migrate to newer versions
  • Most commonly, the end of support is announced well in advance

    • upon release, the company will announce the end of support date
  • On the user side, it is important to know the end-of-support date, in order to plan the migration to newer versions

    • most commonly, updates are very critical situations for development teams
    • they may want to assess the new version before migrating to it, and adjust issues after migration

Support vs. Long-Term Support (LTS)

  • Most commonly, when releases are periodic, the default support time window is relatively short

  • Examples:

    • Canonical releases one new version of Ubuntu every 6 months
    • OpenJDK releases a new version of Java every 6 months
  • Once every few years, a version is released with Long-Term Support (LTS)

    • LTS versions are supported for a much longer period of time
    • LTS versions are often preferred by user companies, as they provide a stable platform for a longer time

Versioning in the real world

Wine versioning

Formerly a pure date, in ISO format without hyphens, e.g. 20040505.

The project then switched to classic versioning in the form major.minor

  • The change may give some headaches to dependency managers, since 20040505 is bigger than 3.9 and other subsequent versions.
  • A 0.Date format for initial development releases would have been advisable with hindsight
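
A small sketch of why the switch is troublesome for tools that compare versions numerically, segment by segment. The `parse` helper is an illustrative assumption, not the comparison logic of any specific dependency manager.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a dotted version into a tuple of integers, compared left to right."""
    return tuple(int(part) for part in version.split("."))


old_date_based = "20040505"
new_major_minor = "3.9"

# The older, date-based release looks "bigger" than the newer one.
print(parse(old_date_based) > parse(new_major_minor))  # True

# With a 0.Date format, the old releases would sort before 3.9 as intended.
print(parse("0.20040505") < parse(new_major_minor))  # True
```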

Versioning in the real world

$\TeX$ versioning

Purely unary numbering converging to $\pi{}$

  • Current version (released in February 2021) is 3.141592653
  • Every time a new version is produced, the next digit of $\pi{}$ is appended to the version string
  • Sustainable just because $\TeX$ is now extremely stable, and development is almost frozen

At the time of my death, it is my intention that the then-current versions of $\TeX$ […] be forever left unchanged, except that the final version numbers to be reported in the “banner” lines of the programs should become: TeX, Version $\pi $ […]. From that moment on, all “bugs” will be permanent “features”.

Donald E. Knuth

Versioning in the real world

Python Enhancement Proposal 440 (PEP440)

It is the way Python software should be versioned

  • Flexible but complicated
  • Order of release segments is mandated

Format: [N!]N(.N)*[{a|b|rc}N][.postN][.devN]

  1. Optional epoch segment: an integer number N followed by exclamation mark (e.g. 1!)
  2. Mandatory release segment: as many integer numbers N1, N2, …, separated by . (e.g. 1.2.3.4.5)
  3. Optional pre-release segment: one of a, b, rc followed by an integer number N (e.g. a1, b2, rc3)
  4. Optional post-release segment: one . followed by post and an integer number N (e.g. .post1)
  5. Optional development release segment: one . followed by dev and an integer number N (e.g. .dev1)
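
The format above can be checked with a regular expression. This is a simplified sketch covering only the segments listed here. The full PEP 440 grammar accepts more spellings (and local versions), which are deliberately ignored. The names `PEP440_PATTERN` and `parse_pep440` are illustrative assumptions.

```python
import re
from typing import Optional

# Simplified pattern for [N!]N(.N)*[{a|b|rc}N][.postN][.devN]
PEP440_PATTERN = re.compile(
    r"(?:(?P<epoch>[0-9]+)!)?"           # optional epoch, e.g. 1!
    r"(?P<release>[0-9]+(?:\.[0-9]+)*)"  # mandatory release, e.g. 1.2.3
    r"(?P<pre>(?:a|b|rc)[0-9]+)?"        # optional pre-release, e.g. rc3
    r"(?:\.post(?P<post>[0-9]+))?"       # optional post-release, e.g. .post1
    r"(?:\.dev(?P<dev>[0-9]+))?"         # optional development release, e.g. .dev1
)


def parse_pep440(version: str) -> Optional[dict]:
    """Return the segments of a version string, or None if it does not match."""
    match = PEP440_PATTERN.fullmatch(version)
    if match is None:
        return None
    parts = match.groupdict()
    parts["release"] = tuple(int(n) for n in parts["release"].split("."))
    return parts


print(parse_pep440("1!2.0.1rc3.post1.dev2"))
print(parse_pep440("1.0-beta"))  # None: not in the format above
```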

Versioning in the real world

Semantic Versioning (SemVer)

Encodes version numbers and their change to convey meaning about the underlying code and what has been modified from one version to the next.

  • The specification is written in RFC style
  • No retraction: a released version must never be modified
  • The specification is itself versioned using Semantic Versioning
  • Format X.Y.Z[-P][+B]
    • X $\Rightarrow$ mandatory Major (integer number)
    • Y $\Rightarrow$ mandatory Minor (integer number)
    • Z $\Rightarrow$ mandatory Patch (integer number)
    • P $\Rightarrow$ optional Pre-release (dot-separated alphanumeric identifiers, preceded by -)
    • B $\Rightarrow$ optional Build metadata (dot-separated alphanumeric identifiers, preceded by +)
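
A sketch of how the X.Y.Z[-P][+B] format could be parsed. The pattern is a simplification: unlike the full specification, it does not reject leading zeroes in numeric pre-release identifiers. The names `SEMVER_PATTERN` and `parse_semver` are illustrative assumptions.

```python
import re
from typing import Optional

IDENTIFIERS = r"[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*"  # dot-separated identifiers
NUMBER = r"0|[1-9][0-9]*"                           # integer without leading zeroes

SEMVER_PATTERN = re.compile(
    rf"(?P<major>{NUMBER})\.(?P<minor>{NUMBER})\.(?P<patch>{NUMBER})"
    rf"(?:-(?P<prerelease>{IDENTIFIERS}))?"
    rf"(?:\+(?P<build>{IDENTIFIERS}))?"
)


def parse_semver(version: str) -> Optional[dict]:
    """Return the components of a SemVer string, or None if it is malformed."""
    match = SEMVER_PATTERN.fullmatch(version)
    if match is None:
        return None
    parts = match.groupdict()
    for key in ("major", "minor", "patch"):
        parts[key] = int(parts[key])
    return parts


print(parse_semver("1.2.3-rc.1+build.5"))
print(parse_semver("1.2"))  # None: the patch number is mandatory
```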

Versioning in the real world

Semantic Versioning (SemVer) overview

  • Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it should be precise and comprehensive.
  • Once a versioned package has been released, the contents of that version MUST NOT be modified. Any modifications MUST be released as a new version.
  • A normal version number MUST take the form X.Y.Z where X, Y, and Z are non-negative integers, and MUST NOT contain leading zeroes.
  • X is the major version, Y is the minor version, and Z is the patch version. Each element MUST increase numerically.

Versioning in the real world

Semantic Versioning (SemVer) criteria

  • Patch version Z MUST be incremented if only backwards compatible bug fixes are introduced. A bug fix is defined as an internal change that fixes incorrect behavior.
  • Minor version Y MUST be incremented if new, backwards compatible functionality is introduced to the public API.
    • It MUST be incremented if any public API functionality is marked as deprecated.
    • It MAY be incremented if substantial new functionality or improvements are introduced within the private code.
    • It MAY include patch level changes.
    • Patch version MUST be reset to 0 when minor version is incremented.
  • Major version X MUST be incremented if any backwards incompatible changes are introduced to the public API.
    • It MAY include minor and patch level changes.
    • Patch and minor version MUST be reset to 0 when major version is incremented.
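
The increment and reset rules above can be sketched as a small function. The `Change` enumeration and the `bump` name are illustrative assumptions.

```python
from enum import Enum
from typing import Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


class Change(Enum):
    PATCH = "backwards compatible bug fix"
    MINOR = "backwards compatible new functionality"
    MAJOR = "backwards incompatible change"


def bump(version: Version, change: Change) -> Version:
    """Increment the relevant number and reset the less significant ones to 0."""
    major, minor, patch = version
    if change is Change.MAJOR:
        return (major + 1, 0, 0)
    if change is Change.MINOR:
        return (major, minor + 1, 0)
    return (major, minor, patch + 1)


print(bump((1, 4, 2), Change.PATCH))  # (1, 4, 3)
print(bump((1, 4, 2), Change.MINOR))  # (1, 5, 0)
print(bump((1, 4, 2), Change.MAJOR))  # (2, 0, 0)
```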

Versioning in the real world

Semantic Versioning (SemVer) details

  • Version number 0.1.0 is the customary first one (the SemVer FAQ suggests starting initial development there)

  • Major version zero (0.y.z) is for initial development: anything may change at any time

    • the public API should not be considered stable
  • Version 1.0.0 defines the public API (or at least its first version)

    • the way in which the version number is incremented after this release is dependent on how the public API changes
  • A pre-release version MAY be denoted by appending a hyphen and a series of dot separated identifiers immediately following the patch version

    • identifiers MUST comprise only ASCII alphanumerics and hyphen [0-9A-Za-z-].
    • this may be used to mark alpha releases, beta releases, or release candidates
  • Build metadata MAY be denoted by appending a plus sign and a series of dot separated identifiers immediately following the patch or pre-release version

    • identifiers MUST comprise only ASCII alphanumerics and hyphen [0-9A-Za-z-].
    • this may be used to keep track of from which commit the version was built

Versioning in the real world

The importance of a versioning methodology

Think before choosing a versioning scheme, and then be consistent

  • Semantic versioning is strongly recommended

    • Can be integrated with the DVCS!
    • Dates can be added (e.g. in the pre-release or build-metadata sections)
  • Codenames can be used informally

    • For internal subprojects
    • As part of product promotion or for commercial activities
  • Dates may make sense for projects with fast and steady development

    • Possibly as part of a Semantic Versioned project
    • Dates are useful as part of a better versioned system
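
As a sketch of the "dates inside SemVer" idea, a release date can be carried in the build-metadata section, which does not affect how versions are ordered. The function name `with_build_date` and the YYYYMMDD layout are illustrative assumptions.

```python
from datetime import date


def with_build_date(version: str, day: date) -> str:
    """Append a YYYYMMDD stamp to the build metadata of a SemVer string."""
    stamp = day.strftime("%Y%m%d")
    # Build metadata starts with '+'; further identifiers are separated by '.'.
    separator = "." if "+" in version else "+"
    return f"{version}{separator}{stamp}"


print(with_build_date("1.2.3", date(2024, 5, 7)))  # 1.2.3+20240507
```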

DVCS-based versioning

  • The underlying state of the DVCS can be used to assign version numbers to the software

  • The practice can change, but consider for instance a case in which:

    • Manually added tags identify versions
      • And are in X.Y.Z format
    • An automated system searches for the closest past tag T
      • if no tag is found, then T=0.1.0
      • If the current commit is tagged, then the version is T
      • Otherwise, if C is the count of intermediate commits and H the current hash, it is T-C+H
    • Automatically generates a SemVer compatible version!
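
The procedure above can be sketched as a pure function. Here the closest tag, the commit count, and the hash are passed in as arguments. With Git, they could for instance be read from the output of `git describe --tags --long`. The function name `dvcs_version` and its parameters are illustrative assumptions.

```python
from typing import Optional


def dvcs_version(closest_tag: Optional[str], commits_since_tag: int, commit_hash: str) -> str:
    """Compute a version from the closest past tag T, the commit count C, and the hash H."""
    if closest_tag is None:
        # No tag found: fall back to T=0.1.0; the current commit cannot be tagged.
        return f"0.1.0-{commits_since_tag}+{commit_hash}"
    if commits_since_tag == 0:
        # The current commit is the tagged one: the version is exactly T.
        return closest_tag
    return f"{closest_tag}-{commits_since_tag}+{commit_hash}"


print(dvcs_version("1.2.3", 0, "abc1234"))  # 1.2.3
print(dvcs_version("1.2.3", 4, "abc1234"))  # 1.2.3-4+abc1234
print(dvcs_version(None, 7, "abc1234"))     # 0.1.0-7+abc1234
```

Note that C lands in the pre-release section and H in the build metadata, so every intermediate commit gets a syntactically valid SemVer string.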

Commit message-based versioning

What do we need commit messages for?

Identify what is different between changes

But isn’t this essentially what DVCS is about?

Idea

find a way to write conventional commit messages such that some automatic tool can understand whether a new version should be released

Put humans and sentiments out of the loop

Conventional commits

One of the possible ways to write standardized commits – https://www.conventionalcommits.org/

Heavily inspired by the Angular convention: https://bit.ly/3VnAp4T

Format (optional parts in [square brackets]):

type[(scope)][!]: description

[body]

[BREAKING CHANGE: <breaking change description>]
  • type: what the commit introduces
    • Can differ among projects
    • fix (bug fix, no API change) and feat (new feature) always present
    • common optional types: build, chore, ci, docs, style, refactor, perf, test
  • Optionally, the scope identifies the module of the software that was changed
  • Breaking changes are identified by a ! before the : and/or by a description in the footer of the commit after BREAKING CHANGE:
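
A minimal sketch of how a tool might parse a commit message written in this format. It is a simplification of the Conventional Commits specification (for instance, it only looks for the `BREAKING CHANGE: ` prefix at the start of a line). The names `HEADER_PATTERN` and `parse_commit` are illustrative assumptions.

```python
import re
from typing import Optional

# type[(scope)][!]: description
HEADER_PATTERN = re.compile(
    r"(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?(?P<bang>!)?: (?P<description>.+)"
)


def parse_commit(message: str) -> Optional[dict]:
    """Parse a commit message; return None if the first line is not conventional."""
    lines = message.splitlines()
    match = HEADER_PATTERN.fullmatch(lines[0]) if lines else None
    if match is None:
        return None
    # Breaking changes: a '!' before the ':' and/or a BREAKING CHANGE footer.
    has_footer = any(line.startswith("BREAKING CHANGE: ") for line in lines[1:])
    return {
        "type": match["type"],
        "scope": match["scope"],
        "description": match["description"],
        "breaking": match["bang"] is not None or has_footer,
    }


print(parse_commit("feat(parser)!: drop legacy syntax"))
print(parse_commit("update stuff"))  # None: not a conventional commit
```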

Semantic release

Idea

Assuming a conventional way to commit, use the information to understand when and how to release

Practice

  1. Decide which branch should be looked at for triggering releases
  2. Define which kind of release should be associated with which kind of commit
    • Rules can be customized per project, as long as they are consistent
    • e.g., fix and docs are PATCH, feat is MINOR, breaking changes are MAJOR
    • Usually the commit type is relied upon, but the scope may be used as well
  3. Scan all commits from the last tag, searching for the “largest” version change
  4. If at least one version change was found, and this is still the last commit on the branch triggering releases, create a release tag and perform the release procedure
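
Steps 2 and 3 can be sketched as follows, using the example rules above (fix and docs are PATCH, feat is MINOR, breaking changes are MAJOR). Commits are modelled as (type, breaking) pairs. This is an illustration of the idea, not the implementation of any specific tool, and the names `RULES`, `release_level`, and `next_version` are assumptions.

```python
from typing import Iterable, Optional, Tuple

NONE, PATCH, MINOR, MAJOR = 0, 1, 2, 3

# Project-specific mapping from commit type to release kind (step 2).
RULES = {"fix": PATCH, "docs": PATCH, "feat": MINOR}


def release_level(commits: Iterable[Tuple[str, bool]]) -> int:
    """Scan (type, breaking) pairs and return the largest version change (step 3)."""
    level = NONE
    for commit_type, breaking in commits:
        change = MAJOR if breaking else RULES.get(commit_type, NONE)
        level = max(level, change)
    return level


def next_version(last_tag: str, commits: Iterable[Tuple[str, bool]]) -> Optional[str]:
    """Return the version to release, or None if no commit triggers a release."""
    major, minor, patch = (int(part) for part in last_tag.split("."))
    level = release_level(commits)
    if level == MAJOR:
        return f"{major + 1}.0.0"
    if level == MINOR:
        return f"{major}.{minor + 1}.0"
    if level == PATCH:
        return f"{major}.{minor}.{patch + 1}"
    return None


print(next_version("1.4.2", [("chore", False), ("fix", False)]))  # 1.4.3
print(next_version("1.4.2", [("fix", False), ("feat", False)]))   # 1.5.0
print(next_version("1.4.2", [("feat", True)]))                    # 2.0.0
```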

Automatic Semantic Release with semantic-release

An implementation of Semantic release: https://github.com/semantic-release/semantic-release

  • Automatically computes the version number after commit & push
  • Automatically generates commit-based release notes
  • Automatically runs the publishing commands
