SQUIDs

It is often useful to produce short, quasi-unique identifiers (SQUIDs) without the benefit of a central authority to prevent duplication. Universally Unique Identifiers (UUIDs) provide for this, but they are unwieldy; for example, the most used UUID, version 4, is 36 characters long. SQUIDs are short (8 characters) at the expense of having more collisions. These can be mitigated by combining SQUIDs with human-produced suffixes, yielding relatively brief, half human-readable, almost-unique identifiers (see for example the identifiers used for Decentralized Construct Taxonomies; Peters & Crutzen, 2024, doi.org/mr4n). A SQUID is the number of centiseconds elapsed since the beginning of 1970, expressed in a base 30 system. This package contains functions to produce SQUIDs as well as convert them back into dates and times.

Details

SQUIDs are defined as 8-character strings that express a timestamp (the number of centiseconds that passed since the UNIX Epoch) in a base 30 system. The lowest possible SQUID, therefore, is 00000001 (one centisecond past 1970-01-01 00:00:00 UTC), and the highest possible SQUID is zzzzzzzz (30^8 - 1 centiseconds past the Epoch), which corresponds to 2177-11-28 11:59:59 UTC.
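The upper bound can be checked with a little arithmetic in base R. This is only a sketch: the variable names are made up for illustration, and it does not use any function from {squids}.

```r
# Largest number that fits in 8 base 30 digits,
# in centiseconds since the UNIX Epoch
maxCentiseconds <- 30^8 - 1;

# Whole seconds since the Epoch (integer division drops the centiseconds)
maxSeconds <- maxCentiseconds %/% 100;

# Express those seconds as a date and time in UTC
maxDateTime <-
  format(
    as.POSIXct(maxSeconds, origin = "1970-01-01", tz = "UTC"),
    format = "%Y-%m-%d %H:%M:%S"
  );

maxDateTime;
```

The result should match the date and time given for zzzzzzzz.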

The base 30 system

The characters used in SQUIDs are Arabic digits (0-9) and (lowercase) Latin letters, omitting vowels. This yields the sequence listed at the bottom of this page. This means that in the base 30 system used by SQUIDs:

“10” represents “30” in the decimal system;
“3b” represents “100” in the decimal system (which, for SQUIDs, corresponds to 100 centiseconds, so one second);
“6n0” represents “6000” in the decimal system (corresponding to one minute);
“fb00” represents “360000” in the decimal system (corresponding to one hour);
“bn000” represents “8640000” in the decimal system (corresponding to one day).
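The conversions in this list can be reproduced with a minimal sketch in base R. The helpers toBase30 and fromBase30 are hypothetical names, not functions exported by {squids}. Note that ten digits plus the 21 non-vowel letters make 31 characters, so one more letter has to be left out to arrive at base 30; the sketch assumes this is “v”, which should be checked against the digit sequence documented for the package. The values in this list only use digits up to “n”, so they do not depend on that assumption.

```r
# Assumed digit sequence: 0-9, then the consonants, with "v" also left out
# (an assumption; compare with the package's own list of SQUID digits)
squidDigits <-
  c(0:9, strsplit("bcdfghjklmnpqrstwxyz", "")[[1]]);

# Hypothetical helper: non-negative whole number to base 30 string
toBase30 <- function(x) {
  if (x == 0) return("0");
  res <- character(0);
  while (x > 0) {
    # The remainder selects the next digit, from right to left
    res <- c(squidDigits[(x %% 30) + 1], res);
    x <- x %/% 30;
  }
  return(paste0(res, collapse = ""));
}

# Hypothetical helper: base 30 string back to a number
fromBase30 <- function(x) {
  values <- match(strsplit(x, "")[[1]], squidDigits) - 1;
  # Weigh each digit by its power of 30, highest power first
  return(sum(values * 30^rev(seq_along(values) - 1)));
}

toBase30(100);       # one second in centiseconds
toBase30(6000);      # one minute
toBase30(8640000);   # one day
fromBase30("fb00");  # back to centiseconds
```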

Avoiding collisions

Because SQUIDs represent centiseconds, if you generate two or more sequences of SQUIDs in quick succession, these will likely overlap (i.e. contain the same SQUIDs, called “collisions” in “identifier speak”).

For example, if you produce a sequence of 1000 SQUIDs, this covers an interval of 10 seconds, and if you produce a sequence of 6000 SQUIDs, this covers an interval of one minute. This means that if you request 6000 SQUIDs and after 30 seconds request another 6000 SQUIDs, and assuming you use the default origin of the current time, the last 3000 SQUIDs of the first sequence and the first 3000 SQUIDs of the second sequence will be the same.
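The overlap in this scenario can be counted directly. The sketch below uses plain centisecond offsets instead of real SQUIDs, with the moment of the first request taken as 0; the variable names are made up for illustration.

```r
# First request: 6000 consecutive centiseconds, starting at offset 0
firstSequence <- seq(from = 0, length.out = 6000);

# Second request: 30 seconds (3000 centiseconds) later
secondSequence <- seq(from = 3000, length.out = 6000);

# Timestamps that occur in both sequences are collisions
collisions <- intersect(firstSequence, secondSequence);

length(collisions);
range(collisions);
```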

To avoid this, {squids} lets you pass an existing sequence of SQUIDs in the follow argument, so that the new SQUIDs start after that sequence. You can also follow the first sequence at a distance using the followBy argument: if you specify one or more SQUIDs in the follow argument and specify followBy = 1000, the new sequence of SQUIDs will have an origin 1001 centiseconds after the last SQUID in the sequence you passed in follow.

For example, let’s create five SQUIDs and store them:

exampleSQUIDs <-
  squids::squids(5);

Let’s look at what we got:

exampleSQUIDs;

🦑 7zyq041c, 7zyq041d, 7zyq041f, 7zyq041g & 7zyq041h

To then follow this sequence, we can specify them when creating new SQUIDs:

squids::squids(
  5,
  follow = exampleSQUIDs
);

🦑 7zyq041j, 7zyq041k, 7zyq041l, 7zyq041m & 7zyq041n

And we can use followBy to specify we want a gap:

squids::squids(
  5,
  follow = exampleSQUIDs,
  followBy = 30
);

🦑 7zyq042j, 7zyq042k, 7zyq042l, 7zyq042m & 7zyq042n
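The gaps in these outputs can be verified by converting the SQUIDs back to centiseconds. This is a sketch in base R, not the conversion function of the package: fromBase30 is a hypothetical helper, and the digit sequence assumes that “v” is left out along with the vowels. The differences computed here do not depend on that assumption, because the SQUIDs share the same leading characters.

```r
# Assumed digit sequence (0-9, then consonants without "v")
squidDigits <-
  c(0:9, strsplit("bcdfghjklmnpqrstwxyz", "")[[1]]);

# Hypothetical helper: base 30 string to number of centiseconds
fromBase30 <- function(x) {
  values <- match(strsplit(x, "")[[1]], squidDigits) - 1;
  return(sum(values * 30^rev(seq_along(values) - 1)));
}

# Last SQUID of the sequence that was followed
lastFollowed <- fromBase30("7zyq041h");

# Without followBy, the new sequence starts 1 centisecond later
gapDefault <- fromBase30("7zyq041j") - lastFollowed;

# With followBy = 30, it starts 30 + 1 centiseconds later
gapFollowBy <- fromBase30("7zyq042j") - lastFollowed;

c(gapDefault, gapFollowBy);
```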
