order to keep our code today understandable for
future generations.
We assume that the design concepts behind the code
will become less relevant within a certain number of
years. Our assumption is based on the fast pace of
software evolution. In the 1970s, efficiency was
among the major requirements for C programs,
which were often enriched with assembly fragments.
Nowadays, thanks to modern and powerful
computers (which can accommodate huge libraries),
readability and maintainability have become
“affordable” and topical issues. The design rationale
behind modern OO libraries (such as Java Beans)
may become less relevant in the next 30-40
years, yet we may still want to run
programs based on these libraries. We need to be
able to understand the execution effects rather than
the design details of old programs.
We therefore focus on bit preservation, i.e., the
ability to restore the bits of a program stored for a
long period of time and obtain their effect, namely,
run the bytecode of the restored program. The
bit-level interpreter should thus embody some
fundamental computing ideas based on bit
manipulation (e.g., the Random Access Machine;
Goldstine et al., 1947). This concept is powerful,
simple to simulate, efficient (relative to a Turing
machine) and has not changed (and will probably
not change) since the beginning of the computer
science era.
However, long-term code preservation faces several
challenges (Factor et al., 2009): storage media
degradation, hardware obsolescence and hardware
failures. These threats are aggravated by targeted
attacks against media systems staged by malicious
parties. As a result, corrupted data may be fetched
from storage. To address this challenge, we use the
Subleq machine (Mazonka et al., 2011) as an
appropriate format for keeping and running code.
Redundancy is also employed when storing the data:
the data is augmented with error-correcting
information. The use of error-correcting codes
enables integrity checks and correction during data
access. One possible error-correcting code is the
Hamming code, which can be used to protect every
bit of a command.
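
As an illustration of how such redundancy allows a
corrupted bit to be located and repaired, the
following minimal sketch implements the classic
Hamming(7,4) code. The choice of the (7,4) variant,
the bit layout and the function names
hamming74_encode and hamming74_decode are
assumptions made for this example only; the scheme
described here does not prescribe a particular
Hamming variant.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Assumed layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return (data_bits, error_position).

    error_position is 0 when no error was detected, otherwise the
    1-based position of the bit that was corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the damaged bit back
    return [c[2], c[4], c[5], c[6]], syndrome


# Simulate media degradation: flip one bit of a stored codeword.
stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1
data, position = hamming74_decode(stored)
print(data, position)  # [1, 0, 1, 1] 5
```

Hamming(7,4) corrects only a single flipped bit per
7-bit block; heavier damage would call for a
stronger code or additional redundancy.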
RAM-based Subleq has been proven to be Turing-
equivalent and is therefore capable of expressing
code for any feasible computation. In our scheme,
any code is converted into the Subleq assembly
format and then into a bit stream. Such bit streams,
enriched with Hamming error-correcting codes
(Arazi, 1988), are placed into media storage units
(e.g., CD-ROMs, or even a plate with Braille-style
marks on its surface).
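
To make the Subleq format more concrete, the
following is a minimal interpreter sketch for the
single "subtract and branch if less than or equal to
zero" instruction. The function name run_subleq,
the max_steps guard and the convention that a jump
to a negative address halts the machine are
assumptions of this example, not details taken from
the scheme above; memory is also finite here, and no
bounds checks are performed on operand addresses.

```python
def run_subleq(memory, max_steps=10_000):
    """Run a Subleq program stored as a flat list of integers.

    Each instruction occupies three consecutive cells A, B, C:
    mem[B] -= mem[A]; if the result is <= 0, jump to C,
    otherwise continue with the next instruction.
    Assumption: a negative jump target halts the machine.
    """
    mem = list(memory)  # work on a copy of the stored image
    pc = 0
    steps = 0
    while 0 <= pc and pc + 2 < len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem


# Program: Y = Y + X, using a scratch cell Z.
# Cells 0..8 hold three instructions; cells 9..11 hold X, Y, Z.
program = [
    9, 11, 3,    # Z = Z - X  (Z becomes -X)
    11, 10, 6,   # Y = Y - Z  (Y becomes Y + X)
    11, 11, -1,  # Z = Z - Z  (Z becomes 0), then halt
    5, 7, 0,     # X = 5, Y = 7, Z = 0
]
result = run_subleq(program)
print(result[10])  # 12
```

Because the whole machine state is a single list of
integers, the same list can be serialized directly
into a bit stream, which is what makes the format
attractive for storage.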
The second challenge is to ensure that our
descendants will want to learn more about our lives
and will be able to do so. Even if we focus solely
on execution effects, what is the input/output
format? Textual input/output is usually expressed in
some modern language (today mostly in English).
Images and sound might be based on modern
cultural references (such as modern computers or
cars). However, modern languages and other
cultural references may be forgotten in the future,
sharing the destiny of undeciphered ancient scripts.
A possible solution would be to propose a set of
visual or tactile signs allowing any language group
to express their own utterances; that is, using a
universal writing system (Universal Writing System,
2015). Possible inputs/outputs can be translated into
this system and deciphered in the future. There is a
clear need to bring together computer scientists,
linguists, psychologists, etc. in a joint
interdisciplinary project aimed at developing a
universal writing system.
Paper Organization. Section 2 outlines the
principles behind preserving code excerpts as bit
streams. It includes compiling any procedural
language into the Subleq assembly language and
further into bytecode as well as integrating error-
correcting capabilities. Section 3 outlines the main
idea behind the usage of a universal writing system
in our context and provides an example of a simple
C program converted into the bit stream format.
Finally, some conclusions are drawn in Section 4.
2 PRESERVATION OF CODE EXCERPTS
2.1 Keeping Code as Bit Stream
A Subleq program represents an infinite array of
memory cells, where each cell holds an integer
number. This number can be an address of another
memory cell. The Subleq interpreter considers this
array as a sequence of instructions; each instruction
has 3 operands: A, B, C. An operand may be either
an integer number or a symbolic label describing a
memory address. Execution of a Subleq instruction
subtracts the value in the memory cell at the address
stored in A from the content of a memory cell at the
address stored in B and then writes the result back
into the cell with the address in B. If the value after
subtraction in B is less than or equal to zero, the