Saturday, February 27, 2021

Chapter 13
In search of a blind watchmaker

A discussion of
The Blind Watchmaker:
Why the Evidence of Evolution Reveals a Universe without Design
by the evolutionary biologist Richard Dawkins.
First posted October 2010. Reposted in July 2017,
with several paragraphs deleted and other minor changes.

Surely it is quite unfair to review a popular science book published years ago. Writers are wont to have their views evolve over time [1]. Yet in the case of Richard Dawkins's The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design (W.W. Norton 1986), a discussion of the mathematical concepts seems warranted, because books by this eminent biologist have been so influential and the "blind watchmaker" paradigm is accepted by a great many people, including a number of scientists.

Dawkins's continuing importance can be gauged by the fact that his most recent book, The God Delusion (Houghton Mifflin 2006), was a best seller. In fact, Watchmaker, also a best seller, was re-issued in 2006.

I do not wish to disparage anyone's religious or irreligious beliefs, but I do think it important to point out that non-mathematical readers should beware the idea that Dawkins has made a strong case that the "evidence of evolution reveals a universe without design."

There is little doubt that some of Dawkins's conjectures and ideas in Watchmaker are quite reasonable. However, many readers are likely to think that he has made a mathematical case that justifies the theory(ies) of evolution, in particular the "modern synthesis" that combines the concepts of passive natural selection and genetic mutation.

Dawkins wrote his apologia back in the eighties when computers were becoming more powerful and accessible, and when PCs were beginning to capture the public fancy. So it is understandable that, in this period of burgeoning interest in computer-driven chaos, fractals and cellular automata, he might have been quite enthusiastic about his algorithmic discoveries.

However, interesting computer programs may not be quite as enlightening as at first they seem.

Cumulative selection
Let us take Dawkins's argument about "cumulative selection," in which he uses computer programs as analogs of evolution. In the case of the phrase "METHINKS IT IS LIKE A WEASEL," the probability -- using 26 capital letters and a space -- of coming up with such a sequence randomly is 27^-28 (an astonishingly remote 8.3 x 10^-41). However, that is also the probability for any random string of that length, he notes; and we might add that for most probability distributions, when n is large, any distinct probability approaches 0.

Such a string would be fantastically unlikely to occur in "single step evolution," he writes. Instead, Dawkins employs cumulative selection, which begins with a random 28-character string and then "breeds from" this phrase. "It duplicates it repeatedly, but with a certain chance of random error -- 'mutation' -- in the copying. The computer examines the mutant nonsense phrases, the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL."

Three experiments evolved the precise sentence in 43, 64 and 41 steps, he wrote.
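For the curious, here is a minimal Python sketch of a cumulative-selection run of the kind Dawkins describes. The brood size of 100 and the 5 percent per-character mutation rate are my own assumptions, since the passage does not give the parameters he used; with them, the target is typically reached in a few dozen generations.

import random

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def resemblance(phrase):
    # number of characters that match the target in place
    return sum(a == b for a, b in zip(phrase, TARGET))

def cumulative_selection(brood_size=100, mutation_rate=0.05, seed=1):
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)  # random 28-character start
    generation = 0
    while parent != TARGET:
        generation += 1
        # breed: copy the parent repeatedly, with a chance of random error per character
        brood = ["".join(c if rng.random() > mutation_rate else rng.choice(ALPHABET)
                         for c in parent)
                 for _ in range(brood_size)]
        # keep the progeny that most resembles the target, however slightly
        parent = max(brood, key=resemblance)
    return generation

print(cumulative_selection())   # a few dozen generations, not 10^40 random tries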

Dawkins's basic point is that an extraordinarily unlikely string is not so unlikely via "cumulative selection."

Once he has the readers' attention, he concedes that his views of how natural selection works preclude use of a long-range target. Such a target would fulfill the dread "final cause" of Aristotle, which implies purpose. But then Dawkins has his nifty "biomorph" computer visualizations (to be discussed below).

Yet it should be obvious that Dawkins's "methinks" argument applies specifically to evolution once the mechanisms of evolution are at hand. So the fact that he has been able to design a program which behaves like a neural network really doesn't say much about anything. He has achieved a proof of principle that was not all that interesting, although I suppose it would answer a strict creationist, which was perhaps his basic aim.

But which types of string are closer to the mean? Which ones occur most often? If we were to subdivide chemical constructs into various sets, the most complex ones -- which as far as we know are lifeforms -- would be farthest from the mean. (Dawkins, in his desire to appeal to the lay reader, avoids statistics theory other than by supplying an occasional quote from R.A. Fisher.)[2]

Dawkins then goes on to talk about his "biomorph" program, in which his algorithm recursively alters the pixel set, aided by his occasional selecting out of unwanted forms. He found that some algorithms eventually evolved insect-like forms, and thought this a better analogy to evolution, there having been no long-term goal. However, the fact that "visually interesting" forms show up with certain algorithms again says little. In fact, the remoteness of the probability of insect-like forms evolving was disclosed when he spent much labor trying to repeat the experiment because he had lost the exact initial conditions and parameters for his algorithm. (And, as a matter of fact, he had become an intelligent designer with a goal of finding a particular set of results.)

Again, what Dawkins has really done is use a computer to give his claims some razzle dazzle. But on inspection, the math is not terribly significant.

It is evident, however, that he hoped to counter Fred Hoyle's point that the probability of life organizing itself spontaneously was equivalent to that of a tornado blowing through a junkyard and assembling a fully functioning 747 jetliner from the scraps. Hoyle made this point not only with respect to the origin of life, but also with respect to evolution by natural selection.

So before discussing the origin issue, let us turn to the modern synthesis.

The modern synthesis
I have not read the work of R.A. Fisher and others who established the modern synthesis merging natural selection with genetic mutation, and so my comments should be read in this light. [Since this was written I have examined the work of Fisher and of a number of statisticians and biologists, and I have read carefully a modern genetics text.]

Dawkins argues that, although most mutations are either neutral or harmful, there are enough progeny per generation to ensure that an adaptive mutation proliferates. And it is certainly true that, if we look at artificial selection -- as with dog breeding -- a desirable trait can proliferate in very short time periods, and there is no particular reason to doubt that a population of dogs isolated on some island for tens of thousands of years would diverge into a new species, distinct from the many wolf sub-species.

But Dawkins is of the opinion that neutral mutations that persist because they do no harm are likely to be responsible for increased complexity. After all, relatively simple lifeforms are enormously successful at persisting.

And, as Stephen Wolfram points out (A New Kind of Science, Wolfram Media 2002), any realistic population size at a particular generation is extremely unlikely to produce a useful mutation because the ratio of possible mutations to the number of useful ones is some very low number. So Wolfram also believes neutral mutations drive complexity.

We have here two issues:
1. If complexity is indeed a result of neutral mutations alone, increases in complexity aren't driven by selection and don't tend to proliferate.
2. Why is any species at all extant? It is generally assumed that natural selection winnows out the lucky few, but does this idea suffice for passive filtering?

Though Dawkins is correct when he says that a particular mutation may be rather probable by being conditioned by the state of the organism (previous mutation), we must consider the entire chain of mutations represented by a species.

If we consider each species as representing a chain of mutations from the primeval organism, then we have a chain of conditional probability. A few probabilities may be high, but most are extremely low. Conditional probabilities can be graphed as trees of branching probabilities, so that a chain of mutation would be represented by one of these paths. We simply multiply each branch probability to get the total probability per path.

As a simple example, a 100-step conditional probability path with 10 probabilities of 0.9, 60 of 0.7 and 30 of 0.5 yields an overall probability of 1.65 x 10^-19.
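A quick Python check of that arithmetic, treating the chain as a simple product of branch probabilities (the branch values 0.9, 0.7 and 0.5 are, of course, only illustrative):

# product of the branch probabilities along one 100-step path
p = (0.9 ** 10) * (0.7 ** 60) * (0.5 ** 30)
print(p)   # roughly 1.65e-19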

In other words, the more mutations and ancestral species attributed to an extant species, the less likely it is to exist via passive natural selection. The actual numbers are so remote as to make natural selection by passive filtering virtually impossible, though perhaps we might conjecture some nonlinear effect going on among species that tends to overcome this problem.

Dawkins's algorithm demonstrating cumulative selection fails to account for this difficulty. Though he realizes a better computer program would have modeled lifeform competition and adaptation to environmental factors, Dawkins says such a feat was beyond his capacities. However, had he programmed in low probabilities for "positive mutations," cumulative selection would have been very hard to demonstrate.

Our second problem is what led Hoyle to revive the panspermia conjecture, in which life and proto-lifeforms are thought to travel through space and spark earth's biosphere. His thinking was that spaceborne lifeforms rain down through the atmosphere and give new jolts to the degrading information structures of earth life. (The panspermia notion has received much serious attention in recent years, though Hoyle's conjectures remain outside the mainstream.)

From what I can gather, one of Dawkins's aims was to counter Hoyle's sharp criticisms. But Dawkins's vigorous defense of passive natural selection does not seem to square with the probabilities, a point made decades previously by J.B.S. Haldane.

Without entering into the intelligent design argument, we can suggest that the implausible probabilities might be addressed by a neo-Lamarckian mechanism of negative feedback adaptations. Perhaps a stress signal on a particular organ is received by a parent and the signal transmitted to the next generation. But the offspring's genes are only acted upon if the other parent transmits the signal. In other words, the offspring embryo would not strengthen an organ unless a particular stress signal reached a threshold.

If that be so, passive natural selection would still play a role, particularly with respect to body parts that lose their role as essential for survival.

Dawkins said Lamarckianism had been roundly disproved, but since the time he wrote the book, molecular biology has shown the possibility of reversal of genetic information (retroviruses and reverse transcription). However, my real point here is not about Lamarckianism but about Dawkins's misleading mathematics and reasoning.

Joshua Mitteldorf, an evolutionary biologist with a physics background and a Dawkins critic, points out that an idea proposed more than 30 years ago by David Layzer is just recently beginning to gain ground as a response to probability issues. Roughly I would style Layzer's proposal a form of neo-Lamarckianism [3].

Dawkins concedes that the primeval cell presents a difficult problem, the problem of the arch. If one is building an arch, one cannot build it incrementally stone by stone because at some point, a keystone must be inserted and this requires that the proto-arch be supported until the keystone is inserted. The complete arch cannot evolve incrementally. This of course is the essential point made by the few scientists who support intelligent design.

Dawkins essentially has no answer. He says that a previous lifeform, possibly silicon-based, could have acted as "scaffolding" for current lifeforms, the scaffolding having since vanished. Clearly, this simply pushes the problem back. Is he saying that the problem of the arch wouldn't apply to the previous incarnation of "life" (or something lifelike)?

Some might argue that there is a possible answer in the concept of phase shift, in which, at a threshold energy, a disorderly system suddenly becomes more orderly. However, this idea is left unaddressed in Watchmaker. I would suggest that we would need a sequence of phase shifts that would have a very low overall probability, though I hasten to add that I have insufficient data for a well-informed assessment.

Cosmic probabilities
Is the probability of life in the cosmos very high, as some think? Dawkins argues that it can't be all that high, at least for intelligent life, otherwise we would have picked up signals. I'm not sure this is valid reasoning, but I do accept his notion that if there are a billion life-prone planets in the cosmos and the probability of life emerging is a billion to one, then it is virtually certain to have originated somewhere in the cosmos.

Though Dawkins seems not to have accounted for the fact that much of the cosmos is forever beyond the range of any possible detection, as well as the fact that time gets to be a tricky issue on cosmic scales, let us, for the sake of argument, grant that the population of planets extends to any time and anywhere, meaning it is possible life came and went elsewhere, or hasn't arisen yet but will, elsewhere.

Such a situation might answer the point made by Peter Ward and Donald Brownlee in Rare Earth: Why Complex Life Is Uncommon in the Universe (Springer 2000) that the geophysics undergirding the biosphere represents a highly complex system (the authors make efforts to quantify the level of complexity), meaning that the probability of another such system is extremely remote. (Though the book was written before numerous discoveries concerning extrasolar planets, thus far their essential point has not been disproved. And non-carbon-based life is not terribly likely, in that carbon's valences permit high levels of complexity in its compounds.)

Now some may respond that it seems terrifically implausible that our planet just happens to be the one where the, say, one-in-a-billion event occurred. However, the fact that we are here to ask the question is perhaps sufficient answer to that worry. If it had to happen somewhere, here is as good a place as any. A more serious concern is the probability that intelligent life arises in the cosmos.

The formation of multicellular organisms is perhaps the essential "phase shift" required, in that central processors are needed to organize their activities. But what is the probability of this level of complexity? Obviously, in our case, the probability is one, but, otherwise, the numbers are unavailable, mostly because of the lack of a mathematically precise definition of "level of complexity" as applied to lifeforms.

Nevertheless, probabilities tend to point in the direction of cosmically absurd: there aren't anywhere near enough atoms -- let alone planets -- to make such probabilities workable. Supposing complexity to result from neutral mutations, the probability of multicellular life would be far, far lower than for unicellular forms whose speciation is driven by natural selection. Also, what is the survival advantage of self-awareness, which most would consider an essential component of human-like intelligence?

Hoyle's most recent idea was that probabilities were increased by proto-life in comets that eventually reached earth. But, despite enormous efforts to resolve the arch problem (or the "jumbo jet problem"), in my estimate he did not do so.

Interestingly, Dawkins argues that people are attracted to the idea of intelligent design because modern engineers continually improve machinery designs, giving a seemingly striking analogy to evolution. Something that he doesn't seem to really appreciate is that every lifeform may be characterized as a negative-feedback controlled machine, which converts energy into work and obeys the second law of thermodynamics. That's quite an arch!

The problem of sentience
Watchmaker does not examine the issue of emergence of human intelligence, other than as a matter of level of complexity.

Hoyle noted in The Intelligent Universe (Holt, Rinehart and Winston 1984) that over a century ago, Alfred Russel Wallace was perplexed by the observation that "the outstanding talents of man... simply cannot be explained in terms of natural selection."

Hoyle quotes the Japanese biologist S. Ohno:
Did the genome (genetic material) of our cave-dwelling predecessors contain a set or sets of genes which enable modern man to compose music of infinite complexity and write novels with profound meaning? One is compelled to give an affirmative answer...It looks as though the early Homo was already provided with the intellectual potential which was in great excess of what was needed to cope with the environment of his time.

Hoyle proposes in Intelligent that viruses are responsible for evolution, accounting for mounting complexity over time. However, this seems hard to square with the point just made that such complexity doesn't seem to occur as a result of passive natural winnowing and so there would be no selective "force" favoring its proliferation.

At any rate, I suppose that we may assume that Dawkins in Watchmaker saw the complexity inherent in human intelligence as most likely to be a consequence of neutral mutations.

An issue not addressed by Dawkins (or Hoyle for that matter) is the question of self-awareness. Usually the mechanists see self-awareness as an epiphenomenon of a highly complex program (a notion Roger Penrose struggled to come to terms with in The Emperor's New Mind (Oxford 1989) and Shadows of the Mind (Oxford 1994)).

But let us think of robots. Isn't it possible in principle to design robots that multiply by replication and maintain homeostasis until they replicate? Isn't it possible in principle to build in programs meant to increase the probability of successful replication as environmental factors shift?

In fact, isn't it possible in principle to design a robot that emulates human behaviors quite well? (Certain babysitter robots are even now posing ethical concerns as to an infant's bonding with them.)

I don't suggest that some biologists haven't proposed interesting ideas for answering such questions. My point is that Watchmaker omits much, making the computer razzle dazzle that much more irrelevant.

Conclusion
In his autobiographical What Mad Pursuit (Basic Books 1988), written when he was about 70, Nobelist Francis Crick expresses enthusiasm for Dawkins's argument against intelligent design, citing with admiration the "methinks" program. Crick, who trained as a physicist and was also a panspermia advocate, doesn't seem to have noticed the difference in issues here. If we are talking about an analog of the origin of life (one-step arrival at the "methinks" sentence), then we must go with a distinct probability of 8.3 x 10^-41. If we are talking about an analog of some evolutionary algorithm, then we can be convinced that complex results can occur with application of simple iterative rules (though, again, the probabilities don't favor passive natural selection).

One can only suppose that Crick, so anxious to uphold his lifelong vision of atheism, leaped on Dawkins's argument without sufficient criticality. On the other hand, one must accept that there is a possibility his analytic powers had waned.

At any rate, it seems fair to say that the theory of evolution is far from being a clear-cut theory, in the manner of Einstein's theory of relativity. There are a number of difficulties and a great deal of disagreement as to how the evolutionary process works. This doesn't mean there is no such process, but it does mean one should listen to mechanists like Dawkins with care.


1. In a 1996 introduction to Watchmaker, Dawkins wrote that "I can find no major thesis in these chapters that I would withdraw, nothing to justify the catharsis of a good recant."

2. In previous drafts, I permitted myself to get bogged down in irrelevant, and obscure, probability discussions. Plainly, I like a challenge; yet it's all too true that a writer who is his own sole editor has a fool for a client.

3. David Layzer, "Genetic Variation and Progressive Evolution," The American Naturalist, Vol. 115, No. 6 (June 1980), pp. 809-826. Published by The University of Chicago Press for The American Society of Naturalists.

Chapter 16
Brahman as Unknown God


What follows is a discussion of some material found in A Short History of Philosophy by Robert C. Solomon and Kathleen M. Higgins (Oxford 1996).
Though their book is somewhat uneven, the interludes of analytical brilliance make a read-through worthwhile. I was intrigued by what seems to be a succinct summary of the essence of much of Indian philosophy, which prompted some thought.
This chapter also appears in another of Conant's e-books, Dharma Crumbs.

As paralleled in Heraclitus and some of the other pre-Socratic philosophers, in Vedanta, Brahman is the ground, the value, and the essence of everything. This ultimate unity is therefore a coincidence of opposites – hot and cold, dry and wet, consciousness and world – which is incomprehensible to us. Brahman is "beyond all names and forms," and, like Yahweh, Brahman is a name for the unnameable, a reference to what cannot be understood or analyzed. Brahman is always "not this, not that."

But, we are assured, Brahman can be experienced, in meditation and mysticism, with Brahman being ultimately identical to one's true self or atman. It is thus the awareness of Brahman, most importantly, that is every person's supreme personal good. One of the obstacles to this good, especially among the learned, is the illusion of understanding.

The apostle Paul would very likely have identified Brahman as the "unknown God" – the utterly mysterious mind behind and within all existence.

Yet, he would say, we cannot connect to this mind without the intermediary Jesus, the Savior. The Unknown God decided to reveal his great love of humanity by this means. That mind is far beyond our rational capacities, whether the mind is called Brahman or Yahweh.
Yahweh (=Jehovah),
which means,
He is,
hence suggesting,
I am.
Thence
Jesus (=Joshua=Yeshua),
which means,
I am salvation
or
I save.

Jesus is the human face of the Unknown God, or Brahman. As the Son, Jesus is the projection of God into the world of humans.

Ultimately, Brahman is in fact one's true self, we are told. This idea runs parallel to Jesus, quoting scripture, saying "you are gods" (hence strongly implying "you are God"), and to his saying that he would bring to his right mind, or wake up, the person who turns to him. Those who turn to him are, says Paul, junior partners in Christ, welded into a spiritual oneness. All share the Holy Spirit, an inexhaustible fount of wisdom and cheer. In other words, they share in God's mind. So if believers have God's Spirit, they begin to awaken – sometimes very slowly – to their true, higher selves. They are returning to the state of perfect oneness from which their angels – atmans – have fallen.

Also, we are told that in Vedanta, recognizing oneself as atman is at the same time recognizing one's true self as Brahman. "An individual person is really just one aspect, one of infinitely many transient manifestations, of the One." Even so, there is plenty of room for interpretation as to whether Brahman is to be considered as the One who created those manifestations, or is identical to them, or is incomprehensibly different from them.

So, I suppose many Buddhists and some Vedantists turn away from the concept of vast, unfathomable mind. Yet are they not reaching toward superior mind or consciousness for their destinies? Why should such greatly enlightened minds be the pinnacle of the cosmos? It seems to me that that would mean that something that is less than the cosmos would still be superior to it, even if, perhaps, only temporarily. (Do you hear an echo of the ontological proof of God's existence in that argument?)

Yet, we must be careful here because of the apparent difference, for Buddhists, between mind and consciousness. In Buddhist parlance, the idea is to empty one's mind or self, obtaining the state of the anatman, which essentially means no-mind. That is to say, the Buddhists equate the human mind with the self, which needs to go away in order for the person to reach a state of bliss. From my perspective, both the Vedantist and Buddhist ideas are summed up by the New Testament injunction that one must die to self, to lose one's carnal mind (stop being a meat-head).

It should be noted that the authors say that Buddhists, in general, view Brahman and atman as illusions. Yet, if there is no ground of being, what is it that Buddhists are attempting to reach? How can any kind of eventuality exist without a ground of being?

Now, the Buddhist aim of enlightenment, either in this life and this body, or in a future life and body, yields this puzzle:

What is it that will suffer or experience bliss in the future? If the basic Buddhist theory holds that the objects of all desires are transitory, that the mind and soul are both temporarily existent illusions, that nothing lasts forever, then why desire a state of non-mind bliss, that supposedly implies an end to suffering? "You" won't be "there" to enjoy nothing anyway. Similarly, why worry about karma (you reap what you sow) in a subsequent life if it isn't really you proceeding to that next life?

So then, a Buddhist would desire to share in the bliss of Nirvana. He or she does yearn for some continuity of existence between his or her present state and the future. Of course, Buddhists will attribute such a contradiction to the inadequacy, when it comes to sublime mysteries, of human logic and language.

(We acknowledge that the Northern – Mahayana – school favors that devotees strive to become bodhisattvas, or enlightened beings, who delay attainment of Nirvana in order to help others become free of the bondage of suffering, whereas the Southern – Theravada – school favors Nirvana first followed by the helping of others. In either case, our puzzle remains.)
In response James Conant, a Buddhist, quotes Chögyam Trungpa:
The bad news is you’re falling through the air, nothing to hang on to, no parachute. The good news is, there’s no ground.

We can draw a parallel here based on these scriptures:

Psalm 46:10
Be still, and know that I am God: I will be exalted among the heathen, I will be exalted in the earth.
Being still here, I suggest, implies a deep, meditative awareness, letting our transitory thoughts and desires subside so as to permit the "ground of being" to be heard.

1 Kings 19:9-12
9 And [Elijah] came thither unto a cave, and lodged there; and, behold, the word of the Lord came to him, and he said unto him, What doest thou here, Elijah?
10 And he said, I have been very jealous for the Lord God of hosts: for the children of Israel have forsaken thy covenant, thrown down thine altars, and slain thy prophets with the sword; and I, even I only, am left; and they seek my life, to take it away.
11 And he said, Go forth, and stand upon the mount before the Lord. And, behold, the Lord passed by, and a great and strong wind rent the mountains, and brake in pieces the rocks before the Lord; but the Lord was not in the wind: and after the wind an earthquake; but the Lord was not in the earthquake:
12 And after the earthquake a fire; but the Lord was not in the fire: and after the fire a still small voice.
At the core of existence is God. He is not "in" the phenomena, even though he causes them. (I note that there is a distinction between the "word of the Lord" that asked Elijah why he was hiding in a cave and the "still small voice." I suggest that Elijah was led to commune with God at a deeper level, at the "ground of being" if you like.)

Mark 4:37-41
37 And there arose a great storm of wind, and the waves beat into the ship, so that it was now full.
38 And he was in the hinder part of the ship, asleep on a pillow: and they awake him, and say to him, Master, care you not that we perish?
39 And he arose, and rebuked the wind, and said to the sea, Peace, be still. And the wind ceased, and there was a great calm.
40 And he said to them, Why are you so fearful? how is it that you have no faith?
41 And they feared exceedingly, and said one to another, What manner of man is this, that even the wind and the sea obey him?
The world's phenomena, that we take to be so real, are subject to the human mind when it is in accord with God's mind.

A key difference between the Christian and Eastern outlooks is the assurance that Jesus will assist the believer to die to self (granting the fact that it doesn't always appear that very many believers actually do so).

Matthew 16:25
For whosoever will save his life shall lose it: and whosoever will lose his life for my sake shall find it.
For a list of other supporting scriptures, please see:
https://zion78.blogspot.com/2018/02/we-must-die-to-self.html

The spiritual seekers of ancient India had had some important revelations. Yet, in Christian eyes, they were yearning for the big revelation that did not occur until the resurrection of Jesus.

We observe that Jesus himself pulled in those of low estate, who were acutely conscious of their need and not so inclined to intellectualize themselves out of drinking in the water of life. The "poor in spirit" (meek) are the ones positioned to break through the barrier of self-justifying delusion. Even today, as through the centuries, very strong belief flourishes best among the poor and lowly.

Matthew 11:28-29
28 Come unto me, all ye that labour and are heavy laden, and I will give you rest.
29 Take my yoke upon you, and learn of me; for I am meek and lowly in heart: and ye shall find rest unto your souls.

Brings to mind:
  • Free fall in orbit or outer space
  • Life in the amniotic sac
  • The "unbearable lightness of being."

Chapter 12
Proto-integers and (very) naive classes


Below is another bit of philosophy of mathematics. I am well aware that readership will be quite limited.

Deriving the four arithmetic operations
without appeal to standard sets


Aim
We use a Peano-like approach to build "proto-integers" from "proto-sets." These proto-integers are then used to justify numerical quantifiers that can then be applied to sets forthwith, alongside the standard ∀ and ∃ (and perhaps ∃!) and accepting the "not" sign, ~ .

As we shall see, however, it is not required that we call our objects "sets" -- although psychologically it is hard to distinguish them from some sort of set or other, as the next paragraph indicates. Though it is a useful distinction elsewhere, no attempt is made to distinguish the word "set" from "class."

Once we axiomatize our system, we find implied a priori proto-sets. These are well justified by Quine's argument concerning how children learn abstraction and communication. That is, a word represents an idea that does not precisely match every imagined instance. So it becomes necessary to say that a word represents an idea associated with other ideas or images. These secondary ideas are known as properties or attributes. So a word represents an abstracted idea, shorn of the potential distinctive properties.

From this basic law of thought, the idea of collection, class or set must follow. So we are entitled to accept such primitive sets as self-evident. Beyond this we may not go without formulating a proper set theory with an associated system of logic. But what we may do is apply the numerical quantifiers to these primitive sets right away. We don't need to establish the foundations of arithmetic in terms of a proper class theory or to define numbers with formal sets, as in the von Neumann derivation of integers.

Our method does in fact derive the basic arithmetical operations, but this is a frill that does not affect the basic aim of our approach. We indulge in anachronisms when the method is applied to "advanced" systems for which we have not troubled to lay the groundwork.

So once we have these integer quantifiers, we may then go on to establish some formal class theory or other, such as ZFC or NBG. If we like, we can at some point dispense with the proto-integers and accept, say, the von Neumann set theoretic definition of integer.

There is nothing terribly novel in this method. The point is to show that we can accompany basic set theory with exact integer quantifiers right away.

Method

Some Euclidean axioms that we appropriate
1. A line on the plane may be intersected by two other lines A and B such that the distance from A to B is definite (measurable in principle with some yardstick), which we shall call magnitude AB.

2. Magnitude, or distance, AB can be exactly duplicated with another intersecting line C such that AB and CA have no distance between their nearer interior end points.

3. Some line A may be intersected by a line B such that A and B are perpendicular.
From these, we are able to construct and imply two parallel lines A and B, each of which is intersected by lines spaced a "unit" length apart, and we can arrange that the perpendicular distance from A to B is unity. In other words, we have a finite strip of squares all lined up. We have given no injunction against adding more squares along the horizontals.

We define "0" magnitude as the distance covered by the intersection of two lines.

Now, for graphic purposes, we shall imagine an S perfectly inscribed within a square. The "S" is for our convenience; it is the square upon which we rely.

Now as we consider a strip of squares (say beginning on the right with the eye going leftward), we observe that there is a vertical line at the beginning on the right. Beyond that there is no square. We shall symbolize that condition with a "0" .

At this point we interject that the term "adjacent square" means that squares have a common side and that we are pointing to, or designating, a specific square.

The word "consecutive" implies that there is some strip of squares such that if we examine any specific square we find that it is always adjacent to some other(s). To be more precise, we use the concepts of "left" and "right," which, like "top" and "bottom," are not defined. If at the leftmost or rightmost side of a strip of squares (or "top or bottom"), we find there is no adjacent square, we may use that extreme square as a "beginning." We then sheer off only that square.

We then repeat the process. There is now a new "beginning" square that meets the original conditions. This is then shorn off. This algorithm may be performed repeatedly. This process establishes the notions of "consecutive" and "consecutive order." If there is no "halt" order implied, then the process is open-ended. We cannot say that an infinity is really implied, as we have not got to more advanced class theory (which we are not going to do).

We can say that a halt order is implied whenever we have decided to name a strip, which becomes obvious from what follows.

So then, all this permits us to use the unoriginal symbolism

S0

Under Peano's rule, 0 has no predecessor and every S has a predecessor.

From this we obtain 0 --> no square to the right.

S0 is the successor of 0 and is named "1."

SS0 is called the successor of "1" and is named "2."

From here we may justify "any" constructible integer without resort to mathematical induction, an axiom of infinity, or an infinite axiom scheme. We do not take the word "any" as it is used with the "all" quantifier. Rather, what we mean is that if a number is constructible by the open-ended successor algorithm, it can also be used for counting purposes.

Now we derive the arithmetical operations.

Addition

Example

"1 + 2 = 3" is justified thus:

We write S0 + SS0, retaining the plus sign as convenient.

This tells us to eliminate or ignore the interior '0' and slide the left-hand S (or S's, as the case may be) to the right, thus giving the figure

SSS0.

We can, if desired, be fussy and not talk about sliding S's but about requiring that the strip S0 must extend leftward from the leftward vertical side of strip SS0, on the ground that 0 implies no distance between the two strips.
"Two" is not defined here as a number, but as a necessary essential idea that we use to mean a specified object of attention and an other specified object of attention. I grant that the article "an" already implies "oneness" and the word "other" already implies "twoness." Yet these are "proto-ideas" and not necessarily numbers. BUT, since we have actually defined numbers by our successor algorithm, we are now free to apply them to our arithmetic operations.

Subtraction

Example

5 - 2 = 3

Subtraction is handled by first forming a third horizontal parallel line that is also a unit distance apart from the nearest other parallel. Thus we have two rows of squares that can be designated by S place-holders.

We write

SSSSS0
000SS0

where we have designated with 0's those squares on the second strip that are to the left of the bottom successor strip and directly under squares of the top successor strip. Hence, we require that only vertical strips with no 0's be erased and collapsed (again, we could be more finicky in defining "erase and collapse" but won't be bothered).

The result is

SSS[SS]0 = SSS0

Note that we may reverse the procedure to obtain negative numbers.

A negative number is defined for K - J with J < K. The less-than relation is determined if we have two strips, as above, in which one strip contains leftward 0's. The strip with the leftward 0's is "less than" the strip without leftward 0's.

So then,

1 - 2 = -1

results from

0S0
SS0

Similarly we cross out the vertical S's, preserving the top strip, which gives

0S0

Though that last expression is OK, for symmetry, we should drop the right-hand 0. In that case 0S is -1, 0SS is -2, and so on.

We require (this must be an axiom) that

Axiom: 0S + S0 = 00 = 0

In which case, we may reduce matching opposed numbers K + -K to the form

0S
S0

which is 0.

For example, 0SS + SS0 gives
0SS
SS0

and by erasing the columns of S's (no 0's), we obtain the 0 identity.
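Continuing the sketch above (strips as strings of S's ending in 0; the helper names are again my own), subtraction stacks the second strip under the first, erases every column that holds two S's, and reads off what survives: top-row leftovers are positive, bottom-row leftovers negative, written with the leading 0 as in the text.

def subtract(a, b):
    # right-align the two strips, padding the shorter with leading 0's
    width = max(len(a), len(b))
    top, bottom = a.rjust(width, "0"), b.rjust(width, "0")
    # erase the columns holding two S's; count what survives in each row
    top_left = sum(1 for t, u in zip(top, bottom) if t == "S" and u != "S")
    bottom_left = sum(1 for t, u in zip(top, bottom) if u == "S" and t != "S")
    if bottom_left == 0:
        return "S" * top_left + "0"      # positive result, e.g. SSS0
    return "0" + "S" * bottom_left       # negative result, e.g. 0S for -1

print(subtract("SSSSS0", "SS0"))   # SSS0   (5 - 2 = 3)
print(subtract("S0", "SS0"))       # 0S     (1 - 2 = -1)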

Multiplication

Example

3 x 2 = 6

We decide that we will associate the left of the multiplication sign with horizontal rows and the right with vertical columns.

Thus,
SS0
SS0
SS0
000

We match each horizontal "2" with a strip under it, until we have reached the vertical number "3." As a nicety, I have required a bottom row of 0's, to assure that the columnar number is defined. We then slide each row onto the top framework of squares, thus:

SS0SS0SS0

However, interior 0's imply that there is no distance between sub-strips. Hence we erase them and of course get

SSSSSS0

which we have decided to name "6."
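The multiplication rule in the same sketch: lay down one copy of the right-hand strip for each S in the left-hand strip, then slide the copies together by erasing the interior 0's (again, a rough rendering with names of my own choosing).

def multiply(a, b):
    rows = [b] * a.count("S")                 # one row of b for each S in a
    slid = "".join(row[:-1] for row in rows)  # slide the rows together, erasing interior 0's
    return slid + "0"                         # restore the single terminal 0

print(multiply("SSS0", "SS0"))   # SSSSSS0, the strip named "6"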

Division

Exact division

i) If two strips are identical we say that only one name is to be assigned -- say, "K." That is, they both take the same name. Thus if two strips completely match (no difference in magnitude), then the number K is said to be divided exactly by K.

ii) Let a shorter strip be placed under another strip.
SSSS0
00SS0

The shorter strip is said to divide exactly the longer if the shorter strip is replicated and placed leftward under the longer strip and, after erasure of interior 0's, the two row strips are identical.
SSSS0
SSSS0


But before we may do that, we must ascertain what number the exact division yields. In other words, we have proved that 2 divides 4 exactly. But we have not shown that 4/2 = 2. This requires another step, which harks back to the multiplication procedure.

Each sub-strip in row 2 must be placed in a new row, using the former row 2 as the present top row, and, for clarity, we add a row of 0's. From the above example:
SS0
SS0
000

We may now read down the left-hand column to obtain the desired quotient.

SSSSSSSSS0

divided by the strip SSS0 yields
SSS0
SSS0
SSS0
0000

By this, we have proved that the number known as 9 is exactly divided by the number called 3 into 3 strips, all with the name 3.
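Exact division in the same sketch: replicate the shorter strip leftward under the longer one until, after erasure of interior 0's, the two rows match, counting the copies to read off the quotient. I also bar division by the 0 strip, anticipating the axiom stated for rationals below.

def divide_exactly(a, b):
    if b.count("S") == 0:
        raise ValueError("division by the 0 strip is not permitted")
    copies, row = 0, ""
    while row.count("S") < a.count("S"):
        row += b[:-1]       # lay another copy of b leftward, erasing the interior 0
        copies += 1
    if row + "0" != a:      # the replicated row must match the longer strip exactly
        return None         # b does not divide a exactly
    return "S" * copies + "0"

print(divide_exactly("SSSSSSSSS0", "SSS0"))   # SSS0   (9 / 3 = 3)
print(divide_exactly("SSSSS0", "SS0"))        # None   (2 does not divide 5 exactly)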

Rationals

Rationals are defined by putting one successor number atop another and calling it a ratio.

We do not permit (axiom) division by 0; that is, a ratio's denominator may not be the strip with no predecessor.
0SS0
SSS0

is 2/3 and likewise,
SSS0
0SS0

is 3/2.

and
0S0
SS0

yields - 1/2

while
0S0
0S0

yields + 1

and
00S0
0SS0

yields + 1/2

We do not enter into the subject of equivalence classes, which at this point would be a highly anachronistic topic.

So then we can say, for example, (2x ∈ X)(x,a), which reads: there are at least 2 x in X that have the property a. Of course, that does not mean we are not obliged to build up the sets and propositions in some coherent fashion.

By establishing proto-integers through the use of some routine axioms, we are able to give exact quantifiers for any sets we intend to build. As a bonus, we have established the basic arithmetic operations without resort to formal set theory.

The definition of successor/predecessor relation is easily derived from the discussion above of "consecutive" and "next."

For our purposes, a proto-set, or "set," is a successor number the elements of which are predecessors (so this is similar to the von Neumann method, in which the successor of x is defined as x ∪ {x}). Our method, I would say, is a tad more primitive.

Our "elements" may be visualized by placing each immediate predecessor on an adjacent horizontal strip, as shown:
S S S 0
  S S 0      S 0          0

Or we may have an equivalent graph
S S S 0
0 S S 0
0 0 S 0
0 0 0 0

which is handy because we now have a matrix with its row and column vectors -- though of course we are not anywhere near that level of abstraction in our specific business.

So, if we like, we may denote each strip with the name "set." The bottom strip we call the "0 set" which means that it is the class with no predecessor. That it is equivalent to the empty set of standard class theories is evident because it implies no predecessor, which is to say there can be no "element" beneath it. Also, note that Russell's paradox does not arise in this primitive system, because a strip number's "element" (we are free to avoid that loaded word if we choose) strips are always below it.

Now note that the top number has an S on the extreme left -- rather than a 0 -- such that S3 ⇒ S2 ⇒ S1 ⇒ 0 (where the subscript numbers are only for our immediate convenience and have no intrinsic meaning; we could as well use prime marks or arbitrary names, as in Sdog ⇒ Sstarship).


In any case we may, only if we so desire, name the entire graphic above as a "set" or "proto-set." Similarly, we may so name each sub-graphic that occurs when we erase a top strip. Obviously, this parallels the usual set succession rules.

Though our naming these graphics "sets" is somewhat user-friendly, it is plainly unnecessary. We could as well name them with some random string of characters, say "xxjtdbn." The entire graph has the general name "xxjtdbn." Under it is another xxjtdbn, which differs from the other and so must take another name as a mark of distinction. In fact, every permissible graph must take some distinct name.

So for the entire graph we have "xxjtdbn." For the "next" sub-graph, we have "agbfsaf." For the one below that, we have "dtdmitg." And for the "0" strip we have "zbhikeb."

We are expected to know that each name applies to a specific strip and thus is either a successor or a predecessor or both. So then, we don't really need to employ the abstract concept of "set" (though we are employing abstract Euclidean axioms and a couple of other axioms).

Now if we write, for example,

(2x ∈ X)P

we seem to be saying that the more "advanced" set definitions are in force for "X" and so "2x ∈ X" is not a legitimate quantifier of the assertion (=proposition) P.

It is true that we are not done with our quantifier design. We are saying that "2" is a name given to our graphic, which is also known as "agbfsaf." We accept that there may be objects of some sort that go by the generic name "x." We must be able to establish a 1-1 correspondence between our agbfsaf graphic and any x's.

That is, we must be able to draw a single line (at least notionally) between every strip of agbfsaf, except the 0 strip, and one and only one of the x's. If that 1-1 correspondence is exact -- no unconnected ends -- then we may write (2!x ∈ X)P, which tells us that there are exactly 2 x's that apply to P "truthfully."
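A toy rendering of these numerical quantifiers in Python, reducing the 1-1 pairing to a count of x's satisfying P set against the S's of the numeral graphic (the function names and the example property are mine):

def at_least(numeral, xs, P):
    # (n x ∈ X)P: every non-0 strip of the numeral can be paired 1-1 with a distinct x satisfying P
    return sum(1 for x in xs if P(x)) >= numeral.count("S")

def exactly(numeral, xs, P):
    # (n!x ∈ X)P: the pairing has no unconnected ends on either side
    return sum(1 for x in xs if P(x)) == numeral.count("S")

X = [1, 2, 3, 4, 6]
print(at_least("SS0", X, lambda x: x % 2 == 0))   # True: at least 2 even x's in X
print(exactly("SS0", X, lambda x: x % 2 == 0))    # False: there are in fact 3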

We have implied in our notation that there is a class X of which the x's are members. This isn't quite necessary. We may just say that X is a name for various x's. We might even say that we have a simple pairing system (aka "binary relation," though we must beware this terminology as probably anachronistic) such that xo,X, meaning that x may vary but that X may not.

Now suppose we wish to talk about equality, as in P means "x = x."

We may write (∀ x)(x = x), (∃ x)(x = x) and ~(∀ x)(x = x) in standard form.

We interrupt here to deny that the ∀ quantifier must be taken to require a set. Instead of using the concept "set," we say that there exists some formation of triples y,p,Y, such that y varies but p and Y do not. Y is associated with distinct y's.

Also, we are prissy about the words "all," "any" and "each." If a set or class is thought to be strictly implied by the word "all," then we disavow that word. Rather the ∀ quantifier is to be read as meaning that "any" y may be paired with (p,Y). That is, we say that we may select a y arbitrarily and must find that it has the name Y and the property p under consideration.[1]

Though the word "each" (="every") may connote "consecutive order" in the succession operation described above, it is neater for our purposes to use only the word "any" in association with the all quantifier.

Now if we mean by x one of the graphic "numbers," each graph shows a 1-1 correspondence with a copy of itself and so we can establish the first two statements as true and the last as false. If we mean that x represents arbitrary ZFC or NBG class theoretic numbers, we can still use the correspondence test (though we need not deal with indefinite unbounded algorithms -- loosely dubbed "infinity").

Further, we may use the correspondence test for, say, the graphic named "2." If we find that we can draw single lines connecting strips of "2" with objects known as x's among NBG or ZFC "numbers," then we can say 2x(x = x). That means that graphic "2" is 1-1 with some part of ZFC or NBG. Or, we would normally say, "there are at least 2 x's in NBG or ZFC that are equal to themselves (where "self" = "duplicate")," as opposed to 2!x(x = x), which would normally be verbalized as "there are exactly 2 x's in NBG or ZFC that are equal to themselves (or their duplicates)." The last statement can only be true if we specify what x's we are talking about.

I concede that in these last few paragraphs I mix apples and oranges. Why would we need the graphic "numbers" if we already have ZFC or NBG? But, we do have proof of principle -- that much can be done with these successor graphics, or what some might term pseudo-sets.
1. In his book Logic for Mathematicians (Chelsea Publishing 1978, McGraw Hill 1953), J. Barkley Rosser cautions against the word any. "Sometimes any means each and sometimes it means some. Thus, sometimes "for any x..." means (x) and sometimes "for any x..." means (Ex)."
¶ [Note: (x) is an old-fashioned way to denote ∀x and (Ex) is Rosser's way of denoting ∃x.]
¶ After giving an ambiguous example, Rosser says, "If one wishes to be sure that one will be understood, one should never use any in a place where either each or some can be used. For instance, "I will not violate any law." The statement "I will not violate each law" has quite a different meaning, and the statement, "I will not violate some law" might be interpreted to mean that there is a particular law which I am determined not to violate."
¶ "Nonetheless," says Rosser, "many writers use any in places where each or some would be preferable."

Version 5 after 4 very rough drafts

Friday, February 26, 2021

Chapter 11
Note on Wolfram's
principle of computational equivalence


Below is a bit of philosophy of mathematics. I am well aware that readership will be quite limited.
Stephen Wolfram has discussed his "principle of computational equivalence" extensively in his book A New Kind of Science and elsewhere. Herewith is this writer's understanding of the reasoning behind the PCE:

1. At least one of Wolfram's cellular automata is known to be Turing complete. That is, given the proper input string, such a system can emulate an arbitrary Turing machine. Hence, such a system emulates a universal Turing machine and is called "universal."

2. One very simple algorithm is Wolfram's CA Rule 110, which Matthew Cook has proved to be Turing complete (a minimal sketch of the Rule 110 update rule follows this list). Wolfram also asserts that another simple cellular automaton algorithm has been shown to be universal or Turing complete.

3. In general, there is no means of checking to see whether an arbitrary algorithm is Turing complete. This follows from Turing's proof that there is no general way to see whether a Turing machine will halt.

4. Hence, it can be argued that very simple algorithms are quite likely to be Turing complete, but because there is no way to determine this in general, the position taken isn't really a conjecture. Only checking one particular case after another would give any indication of the probability that a simple algorithm is universal.

5. Wolfram's principle of computational equivalence appears to reduce to the intuition that the probability is reasonable -- thinking in terms of geochrons -- that simple algorithms yield high information outputs.
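As a concrete reference point for item 2 above, here is a minimal Python sketch of Rule 110 itself. It shows only the update rule; nothing here bears on universality, which depends on Cook's elaborate encoding of inputs.

def rule110_step(cells):
    # apply Rule 110 to one row of 0/1 cells, treating cells beyond the edges as 0
    table = {(1,1,1): 0, (1,1,0): 1, (1,0,1): 1, (1,0,0): 0,
             (0,1,1): 1, (0,1,0): 1, (0,0,1): 1, (0,0,0): 0}
    padded = [0] + list(cells) + [0]
    return [table[(padded[i-1], padded[i], padded[i+1])] for i in range(1, len(padded) - 1)]

row = [0] * 30 + [1]          # a single black cell at the right edge
for _ in range(15):
    print("".join("#" if c else "." for c in row))
    row = rule110_step(row)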

Herewith the writer's comments concerning this principle:

1. Universality of a system does not imply that high information outputs are common (recalling that a bona fide Turing computation halts after a finite number of steps). The normal distribution would seem to cover the situation here. One universal system is some algorithm (perhaps a Turing machine) which produces the function f(n) = n+1. We may regard this as universal in the sense that it prints out every Turing machine description number, which could then, notionally, be executed as a subroutine. Nevertheless, as n approaches infinity, the probability of happening on a description number goes to 0. It may be possible to get better efficiency, but even if one does so, many description numbers are for machines that get stuck or produce low information outputs.

2. The notion that two systems in nature might both be universal, or "computationally equivalent," must be balanced against the point that no natural system can be in fact universal, being limited by energy resources and the entropy of the systems. So it is conceptually possible to have two identical systems, one of which has computation power A, based on energy resource x, and the other of which has computation power B, based on energy resource y. Just think of two clone mainframes, one of which must make do with half the electrical power of the other. The point here is that "computational equivalence" may turn out not to be terribly meaningful in nature. The probability of a high information output may be mildly improved if high computation power is fairly common in nature, but it is not easy to see that such outputs would be rather common.

A mathematician friend commented:
I'd only add that we have very reasonable ideas about "most numbers," but these intuitions depend crucially on ordering of an infinite set. For example, if I say, "Most integers are not divisible by 100", you would probably agree that is a reasonable statement. But in fact it's meaningless. For every number you show me that's not divisible by 100, I'll show you 10 numbers that are divisible by 100. I can write an algorithm for a random number generator that yields a lot more numbers that are divisible by 100 than otherwise. "But," you protest, "not every integer output is equally likely under your random number generator." And I'd have to agree, but I'd add that the same is true for any random number generator. They are all infinitely biased in favor of "small" numbers (where "small" may have a different meaning for each random number generator).

Given an ordering of the integers, it is possible to make sense of statements about the probability of a random integer being thus-and-so. And given an ordering of the cellular automata, it's possible to make sense of the statement that "a large fraction of cellular automata are Turing complete."


My reply:

There are 256 elementary cellular automata in NKS. The most obvious way to order each of these is by input bit string, which expresses an integer. That is, the rule operates on a bit-string stacked in a pyramid of m rows. It is my thought that one would have to churn an awfully long time before hitting on a "universal." Matthew Cook's proof of the universality of CA Rule 110 is a proof of principle, and gives no specific case.

As far as I know, there exist few strong clues that could be used to improve the probability that a specific CA is universal. Wolfram argues that those automata that show a pseudorandom string against a background "ether" can be expected to show universality (if one only knew the correct input string). However, let us remember that it is routine for functions to approach chaos via initial values yielding periodic outputs.

So one might need to prove that a set of CA members can only yield periodic outputs before proceeding to assess probabilities of universality.

Perhaps there is a relatively efficient means of forming CA input values that imply a high probability of universality, but I am unaware of it.

Another thought: Suppose we have the set of successive integers in the interval [1,10]. Then the probability that a randomly chosen set member is even is 1/2. However, if we want to talk about an infinite set of integers, in line with my friend's point, the probability of a randomly selected number being even is meaningless (or, actually, 0, unless we invoke the axiom of choice). Suppose we order the set of natural numbers thus: {1,3,5,7,9,2,11,13,15,17,4...}, an ordering in which the evens appear only occasionally, so that in any long initial segment far fewer than half the members are even. So we see that the probability of a specific property depends not only on the ordering, but on an agreement that an observation can only take place for a finite subset.
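A quick Python illustration of the ordering point. I read the ordering as "five odds, then one even, repeating"; the exact pattern is not important, only that the evens are made scarce even though every natural number still appears.

from itertools import islice

def reordered_naturals():
    # one reading of {1,3,5,7,9,2,...}: five odds, then one even, repeating
    odd, even = 1, 2
    while True:
        for _ in range(5):
            yield odd
            odd += 2
        yield even
        even += 2

prefix = list(islice(reordered_naturals(), 60000))
print(sum(1 for n in prefix if n % 2 == 0) / len(prefix))   # about 0.167, not 1/2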

As my friend points out, perhaps the probability of hitting on a description number doesn't go to 0 in the limit; it depends on the ordering. But we have not encountered a clever ordering, and Wolfram has not presented one.

Chapter 10
Drunk and disorderly:
The fall and rise of entropy


Some musings posted Nov. 20, 2010
One might describe the increase of the entropy of a gas to mean that the net vector -- the sum of the vectors of all particles -- between time t_0 and t_n tends toward 0, and that once this equilibrium is reached at t_n, the net vector stays near 0 at any subsequent time. One would expect a nearly 0 net vector if the individual particle vectors are random. This randomness is exactly what one would find in an asymmetrical n-body scenario, where the bodies are close together and about the same size. The difference is that gravity isn't the determinant, but rather collisional kinetic energy. It has been demonstrated that n-body problems can yield orbits that become extraordinarily tangled. The randomness is then of the Chaitin-Kolmogorov variety: determining the future position of a particular particle becomes computationally very difficult. And usually, over some time interval, the calculation errors increase to the point that all predictability for a specific particle is lost.

But there is also quantum randomness at work. The direction in which a photon exits an excited atom is probabilistic only, meaning that the recoil is random. This recoil vector must be added to the electric-charge recoil vector associated with particle collision -- though its effect is very slight and usually ignored. Further, if one were to observe one or more of the particles, the observation would affect the knowledge of the momentum or position of the observed particles.

Now suppose we keep the gas at a single temperature in a closed container attached via a closed valve to another, evacuated container. When we open the valve, the gas expands to fill both containers. This expansion is a consequence of the effectively random behavior of the particles, which on average "find less resistance" in the direction of the vacuum.

In general, gases tend to expand by the inverse square law, which is to say spherically (or really, as a ball), which implies randomization of the molecules.

The drunkard's walk

Consider a computerized random walk (aka "drunkard's walk") in a plane. As n increases, the region covered by the walk tends toward a disk. In the infinite limit, there is probability 1 that any given disk around the origin has been covered (though probability 1 in such cases does not exclude exceptions). So the real question is: what is it about the n-body problem that yields this circular randomness? It is really a statistical question. When enough collisions occur in a sufficiently small volume (or area), the particle vectors tend to cancel each other out.
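
Here is a minimal sketch of that statistical point, under the assumption of a simple lattice walk: many independent drunkard's walks spread out with no preferred direction, which is what gives the covered region its disk-like shape on average.

# Run many independent lattice random walks and look at the endpoints:
# the means are near 0 and the x and y variances are nearly equal
# (about steps/2 each), i.e., the walk has no preferred direction.

import random
import statistics

def walk_endpoint(steps, rng):
    x = y = 0
    for _ in range(steps):
        dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
        x += dx
        y += dy
    return x, y

rng = random.Random(7)
endpoints = [walk_endpoint(2_000, rng) for _ in range(3_000)]
xs, ys = zip(*endpoints)

print(round(statistics.mean(xs), 2), round(statistics.mean(ys), 2))
print(round(statistics.pvariance(xs), 1), round(statistics.pvariance(ys), 1))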

Let's go down to the pool hall and break a few racks of balls. It is possible to shoot the cue ball in such a way that the rack of balls scatters symmetrically. But in most situations, the cue ball strikes the triangular array at a point that yields an asymmetrical scattering. This is the sensitive dependence on initial conditions associated with mathematical chaos. We also see Chaitin-Kolmogorov complexity enter the picture, because the asymmetry means that predicting where any particular ball will be after a few ricochets is computationally very difficult.

Now suppose we have perfectly elastic, perfectly spherical pool balls that encounter idealized banks, and we neglect friction. After a few minutes, the asymmetrically scattered balls are "all over the place" in effectively random motion. Now such idealized discrete systems eventually return to their original state: the balls coalesce back into a triangle and then repeat the whole cycle over again, which implies that such a closed system, left to its own devices, requires entropy to decrease -- a seeming contradiction of the second law of thermodynamics. But the time scales required mean we needn't hold our breath waiting. Also, in nature there are darned few closed systems (and as soon as we observe one, it's no longer closed at the quantum level), allowing us to conclude that even in the ideal of zero friction the pool-ball system may become aperiodic, implying that the second law holds in this case.

(According to Stephen Wolfram in A New Kind of Science, a billiard ball launched at any irrational angle to the banks of an idealized, frictionless square pool table must visit every bank point [I suppose he excludes the corners]. Since each point must be visited after a discrete time interval, it would take eternity to reach the point of reversibility.)

And now, let us exorcize Maxwell's demon, which, though meant to elucidate, to this day bedevils discussions of entropy with outlandish "solutions" to the alleged "problem." Maxwell gave us a thought experiment whereby he posited a little being controlling the valve between canisters. If (in this version of his thought experiment) the gremlin opened the valve to let speedy particles past in one direction only, the little imp could divide the gas into a hot cloud in one canister and a cold cloud in the other. Obviously the energy the gremlin adds is equivalent to adding energy via a heating/cooling system, but Maxwell's point was about the very, very minute possibility that such a bizarre division could occur randomly (or, some would say, pseudo-randomly).

This possibility exists. In fact, as said, in certain idealized closed systems, entropy decrease MUST happen. Such a spontaneous division into hot and cold clouds would also probably happen quite often at the nano-nano-second level. That is, when time intervals are short enough, quantum physics tells us the usual rules go out the window. However, observation of such events cannot occur over such brief quantum intervals (so there is no change in information or entropy), and as for the "random" chance of observing an extremely high ordering of gas molecules: even if someone witnessed such an occurrence, not only would the event not conform to a repeatable experiment, but no one would be likely to believe the report, even if true.

Truly universal?
Can we apply the principle of entropy to the closed system of the universe? A couple of points: We're not absolutely sure the cosmos is a closed system (perhaps, for example, "steady state" creation supplements "big bang" creation). If there is a "big crunch," then, some have speculated, we might expect complete devolution to original states (people would reverse grow from death to birth, for example). If space curvature implies otherwise, the system remains forever open or asymptotically forever open.

However, quantum fuzziness probably rules out such an idealization. Are quantum systems precisely reversible? Yes and no. When one observes a particle collision in an accelerator, one can calculate the reverse paths. However, in line with the Heisenberg uncertainty principle one can never be sure of observing a collision with precisely identical initial conditions. And if we can only rarely, very rarely, replicate the exact initial conditions of the collision, then the same holds for its inverse.

Then there is the question of whether perhaps a many worlds (aka parallel universes) or many histories interpretation of quantum weirdness holds. In the event of a collapse back toward a big crunch, would the cosmos tend toward the exact quantum fluctuations that are thought to have introduced irregularities in the early universe that grew into star and galactic clustering? Or would a different set of fluctuations serve as the attractor, on grounds both sets were and are superposed and one fluctuation is as probable as the other? And, do these fluctuations require a conscious observer, as in John von Neumann's interpretation?

Of course, we face such difficulties when trying to apply physical or mathematical concepts to the entire cosmos. It seems plausible that any system of relations we devise to examine properties of space and time may act like a lens that increases focus in one area while losing precision in another. I.e., a cosmic uncertainty principle.

Conservation of information?
A cosmic uncertainty principle would make information fuzzy. As the Heisenberg uncertainty principle shows, information about a particle's momentum is gained at the expense of information about its position. But, you may respond, the total information is conserved. But wait! Is there a law about the conservation of information? Arguably, information cannot be conserved -- indeed cannot exist -- without memory, which in the end requires the mind of an observer. And the "law" of increase of entropy says that memories fade and available information decreases. In terms of pure Shannon information, entropy quantifies our uncertainty -- what we do and don't know.
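
For readers who want the formula, here is a minimal sketch of Shannon entropy as a gauge of what we don't know; the probability lists are illustrative.

# Shannon entropy H = -sum(p * log2(p)): the expected number of bits of
# information we lack about the outcome of a random variable.

import math

def shannon_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

print(shannon_entropy([0.5, 0.5]))   # 1.0 bit: a fair coin, maximum uncertainty
print(shannon_entropy([0.9, 0.1]))   # ~0.47 bits: a biased coin, less uncertainty
print(shannon_entropy([1.0]))        # 0.0 bits: a certain outcome, nothing to learn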

Thus entropy is introduced by noise entering the signal. In realistic systems, supposing enough time elapses, noise eventually overwhelms the intended signal. For example, what would you say is the likelihood that this essay will be accessible two centuries from now? (I've already lost a group of articles I had posted on the now defunct Yahoo Geocities site.) Or consider Shakespeare's plays. We cannot say with certainty exactly how the original scripts read.

In fact, can we agree with some physicists that a specified volume of space contains a specific quantity of information? I wonder. A Shannon transducer is said to contain a specific quantity of information, but no one can be sure of that prior to someone reading the message and measuring the signal-to-noise ratio.

And quantum uncertainty qualifies as a form of noise, not only with respect to random jiggles in the signal, but also with respect to which signal was sent. If two signals are "transmitted" in quantum superposition, observation randomly determines which signal is read. So one may set up a quantum measurement experiment and say that, for a specific volume, the prior information describes the experiment. But quantum uncertainty still says that the experiment's outcome cannot be fully specified in advance. So if we try to extrapolate information about a greater volume from the experiment volume, we begin to lose accuracy until the uncertainty reaches its maximum. We see that quantum uncertainty can progressively change the signal-to-noise ratio, meaning entropy increases until the equilibrium level of no knowledge.

This of course would suggest that, from a human vantage point, there can be no exact information quantity for the cosmos.

So this brings us to the argument about whether black holes decrease the entropy of the universe by making it more orderly (i.e., simpler). My take is that a human observer in principle can never see anything enter a black hole. If one were to detect, at a safe distance, an object approaching a black hole, one would observe that its time pulses (its Doppler shift) get slower and slower. In fact, the time pulses slow down asymptotically; the crossing is never observed in finite time.

So the information represented by the in-falling object is, from this perspective, never lost. But suppose we agree to an abstraction that replaces the human observer with a vastly more gifted intelligence. In that case, perhaps the cosmos has an exact quantity of information at time ta. It then makes sense to talk about whether a black hole affects that quantity.

Consider a particle that falls into a black hole. It is said that all the information available about a black hole consists of the quantities for its mass and its surface area. Everything this super-intelligence knew about the particle, or ever could know, is seemingly gone. Information about the particle is lost, and the cosmos is a simpler, more orderly place -- and hence, in the sense discussed below, higher in information -- in violation of the second law... maybe.

But suppose the particle is a twin of an entangled pair. One particle stays loose while the other is swallowed by the black hole. If we measure, say, the spin of one such particle, we would ordinarily automatically know the spin of the other. But who's to tell what the spin is of a particle headed for the gravitational singularity at the black hole's core? So the information about the particle vanishes and entropy increases. This same event, however, means the orderliness of the universe increases and the entropy decreases. So, which is it? Or is it both? Have no fear: this issue is addressed in the next section.

Oh, and of course we mustn't forget Hawking radiation, whereby a black hole slowly leaks radiation as particles every now and then "tunnel" through the gravitational energy barrier and escape into the rest of the cosmos. The mass decreases over eons and eons until -- having previously swallowed everything available -- the black hole eventually evaporates, Hawking conjectures.

A question: suppose an entangled particle escapes the black hole. Is the cosmic information balance sheet rectified? Perhaps, supposing the swallowed twin never reached the singularity. But what of particles down near the singularity? Perhaps they morph as the fields transform into something resembling conditions close to the cosmic big bang. So it seems implausible that the spin information is retained. But, who knows?

Where's that ace?
There is a strong connection between thermodynamic entropy and Shannon information entropy. Consider the randomization of the pool break on the frictionless table after a few minutes. This is the equivalent of shuffling a deck of cards.

Suppose we have an especially sharp-eyed observer who watches where the ace of spades is placed in the deck as shuffling starts. We then have a few relatively simple shuffles. After the first shuffle, he knows to within three cards how far down in the deck the ace is. On the next shuffle he knows where it is with less accuracy -- let's say to a precision of (1/3)(1/3) = 1/9. After some more shuffles, the ace is, for all he knows, equally likely to be in any of the 52 positions, meaning he has no knowledge of the ace's whereabouts.

The increase in entropy occurs from one shuffle to the next. But at the last shuffle, equilibrium has been reached. Further shuffling can never increase his knowledge of where the ace is, meaning the entropy won't decrease.
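
A minimal sketch of the ace-watching story, using the Gilbert-Shannon-Reeds riffle-shuffle model as a stand-in (the text does not specify a shuffle procedure, so this model is an assumption): the entropy of the ace's position climbs with each shuffle until it plateaus near log2(52), after which more shuffling adds nothing.

# Estimate, by Monte Carlo, the Shannon entropy of the ace's position after
# k riffle shuffles (Gilbert-Shannon-Reeds model). Entropy rises toward
# log2(52), about 5.7 bits, and then levels off: equilibrium.

import math
import random

def riffle(deck, rng):
    cut = sum(rng.random() < 0.5 for _ in range(len(deck)))  # binomial cut
    left, right = deck[:cut], deck[cut:]
    out, i, j = [], 0, 0
    while i < len(left) or j < len(right):
        rem_left, rem_right = len(left) - i, len(right) - j
        # drop a card from a packet with probability proportional to its size
        if rng.random() < rem_left / (rem_left + rem_right):
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out

def ace_position_entropy(k_shuffles, trials, rng):
    counts = [0] * 52
    for _ in range(trials):
        deck = list(range(52))            # card 0 plays the ace of spades
        for _ in range(k_shuffles):
            deck = riffle(deck, rng)
        counts[deck.index(0)] += 1
    return -sum((c / trials) * math.log2(c / trials) for c in counts if c)

rng = random.Random(11)
for k in range(8):
    print(k, round(ace_position_entropy(k, 10_000, rng), 2))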

The runs test gives a measure of randomness [1] based on the normal distribution of the number of runs, with the mean at about n/2. "Too many" runs fall in one tail and "too few" in the other. That is, a high z score implies that the sequence is non-random, or "highly ordered." What, however, is meant by order? (This is where we tackle the conundrum of a decrease in one sort of cosmic information versus an increase in another sort.) Entropy is often defined as the tendency toward decrease of order, and the related idea of information is sometimes thought of as the surprisal value of a digit string. Sometimes a pattern such as HHHH... is considered to have low information because we can easily calculate the nth value (assuming we are using some algorithm to obtain the string). So the Chaitin-Kolmogorov complexity is low -- that is, the information is low. On the other hand, a string that by some measure is effectively random is considered here to be highly informative, because the observer has almost no chance of knowing the string in detail in advance.

However, we can also take the opposite tack. Using runs testing, most digit strings (multi-value strings can often be transformed, for test purposes, to bi-value strings) are found under the bulge in the runs test bell curve and represent probable randomness. So it is unsurprising to encounter such a string. It is far more surprising to come across a string with far "too few" or far "too many" runs. These highly ordered strings would then be considered to have high information value.

So, once the deck has been sufficiently shuffled the entropy has reached its maximum (equilibrium). What is the probability of drawing four royal flushes? If we aren't considering entropy, we might say it is the same as that for any other 20-card deal. But, a runs test would give a z score of infinity (probability 1 that the deal is non-random) because drawing all high cards is equivalent to tossing a fair coin and getting 20 heads and no tails. If we don't like the infinitude we can posit 21 cards containing 20 high cards and 1 low card. The z score still implies non-randomness with a high degree of confidence.
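
Here is a minimal sketch of the runs test just described (the Wald-Wolfowitz z score), applied to a pseudorandom coin sequence, to the strictly alternating pattern, and to the 21-card deal of 20 high cards and one low card.

# Wald-Wolfowitz runs test: z score for the number of runs in a two-valued
# sequence. Large |z| means "too many" or "too few" runs -- a highly
# ordered, surprising string.

import math
import random

def runs_test_z(seq):
    symbols = sorted(set(seq))
    if len(symbols) != 2:
        raise ValueError("need exactly two distinct symbols")
    n1 = sum(1 for s in seq if s == symbols[0])
    n2 = len(seq) - n1
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    mean = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (runs - mean) / math.sqrt(var)

rng = random.Random(3)
print(round(runs_test_z([rng.choice("HT") for _ in range(40)]), 2))  # usually within about +/-2
print(round(runs_test_z(list("HT" * 20)), 2))                        # ~ +6: far too many runs
print(round(runs_test_z(list("H" * 20 + "T")), 2))                   # ~ -3: far too few runs
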
0. Taken from a Wikipedia article:
The dimension of thermodynamic entropy is energy divided by temperature, and its SI unit is joules per kelvin.

1. We should caution that the runs test, which works for n1 > 7 and n2 > 7, fails for the pattern HH TT HH TT... This failure seems to be an artifact of the runs test's assumption that the usual number of runs is about n/2. I suggest that we simply say that the probability of that pattern is less than or equal to that of H T H T H T..., a pattern whose z score rises rapidly with n. Other patterns, such as HHH TTT HHH..., also climb away from the randomness area with n, though more slowly. With these cautions, however, the runs test gives striking results.
2. Taken from Wikipedia:
In information theory, entropy is a measure of the uncertainty associated with a random variable. In this context, the term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message, usually in units such as bits. Equivalently, the Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable. The concept was introduced by Claude E. Shannon in his 1948 paper "A Mathematical Theory of Communication."

Shannon's entropy represents an absolute limit on the best possible lossless compression of any communication, under certain constraints: treating messages to be encoded as a sequence of independent and identically-distributed random variables, Shannon's source coding theorem shows that, in the limit, the average length of the shortest possible representation to encode the messages in a given alphabet is their entropy divided by the logarithm of the number of symbols in the target alphabet.

A fair coin has an entropy of one bit. However, if the coin is not fair, then the uncertainty is lower (if asked to bet on the next outcome, we would bet preferentially on the most frequent result), and thus the Shannon entropy is lower. Mathematically, a coin flip is an example of a Bernoulli trial, and its entropy is given by the binary entropy function. A long string of repeating characters has an entropy rate of 0, since every character is predictable. The entropy rate of English text is between 1.0 and 1.5 bits per letter, or as low as 0.6 to 1.3 bits per letter, according to estimates by Shannon based on human experiments.

Chapter 9
My about-face on the Shroud


This article first appeared a number of years earlier (ca. 2005); I reposted it in 2013 and again in 2020.
Sometimes I am wrong
I haven't read The Da Vinci Code but...
. . . I have scanned a book by the painter David Hockney, whose internet-driven survey of Renaissance and post-Renaissance art makes a strong case for a trade secret: use of a camera obscura technique for creating precision realism in paintings.

Hockney's book, Secret Knowledge: rediscovering the lost legacy of the old masters, 2001, uses numerous paintings to show that European art guilds possessed this technical ability, which was a closely guarded and prized secret. Eventually the technique, along with the related magic lantern projector, evolved into photography. It's possible the technique also included the use of lenses and mirrors, a topic familiar to Leonardo da Vinci.

Apparently the first European mention of a camera obscura is in Leonardo's Codex Atlanticus.

I didn't know about this when first mulling over the Shroud of Turin controversy and so was quite perplexed as to how such an image could have been formed in the 14th century, when the shroud's existence was first reported. I was mistrustful of the carbon dating, realizing that the Kremlin had a strong motive for deploying its agents to discredit the purported relic.

See my old page

Science, superstition and the Shroud of Turin
http://www.angelfire.com/az3/nuzone/shroud.html
Also,
https://needles515.blogspot.com/2020/09/science-superstition-and-shroud-of-turin.html

But Hockney's book helps to bolster a theory by fellow Brits Lynn Picknett and Clive Prince that the shroud was faked by none other than Leonardo, a scientist, "magician" and intriguer. Their book The Turin Shroud was a major source of inspiration for The Da Vinci Code, it has been reported.

The two are not professional scientists but, in the time-honored tradition of English amateurs, did an interesting sleuthing job.

As they point out, the frontal head image is way out of proportion with the image of the scourged and crucified body. They suggest the face is quite reminiscent of a self-portrait by Leonardo. Yet two Catholic scientists at the Jet Propulsion Lab who used a computer method in the 1980s to analyze the image had supposedly demonstrated that it was "three-dimensional." But a much more recent analysis, commissioned by Picknett and Prince, found that the "three-dimensionalism" did not hold up. From what I can tell, the Jet Propulsion pair showed that the image was not made by conventional brushwork, while the later analysis indicates some type of projection.

Picknett and Prince suggest that Leonardo used projected images of a face and of a body -- perhaps a cadaver on which various crucifixion wounds had been inflicted -- to create a death-mask type of impression. But the image collation was imperfect, leaving the head size wrong and the body that of, by Mideast standards, a giant. This is interesting, in that Hockney discovered that camera obscura art often failed at proportion and depth of field between spliced images, just as when a collage piece is pasted onto a background.

Still, the shroud's official history begins in 1358, about a hundred years prior to the presumed Da Vinci hoax. It seems plausible that some shroud-like relic had passed to a powerful family and that its condition was poor, either because of its age or because it wasn't that convincing upon close inspection. The family then secretly enlisted Leonardo, the theory goes, in order to obtain a really top-notch relic. Remember, relics were big business in those days, being used to generate revenues and political leverage.

For if Leonardo was the forger, we must account for the fact that the highly distinctive "Vignon marks" on the shroud face have been found in Byzantine art dating to the 7th century. I can't help but wonder whether Leonardo only had the Mandylion (the face) to work with, and added the body as a bonus (I've tried scanning the internet for reports of exact descriptions of the shroud prior to da Vinci's time but haven't succeeded).

The Mandylion refers to an image not made by hands. This "image of Edessa" must have been very impressive, considering the esteem in which it was held by Byzantium. Byzantium also was rife with relics and with secret arts -- which included what we'd call technology along with mumbo-jumbo. The Byzantine tradition of iconography may have stemmed from display of the Mandylion.

Ian Wilson, a credentialed historian who seems to favor shroud authenticity, made a good case for the Mandylion having been passed to the Knights Templar -- perhaps when the crusaders sacked Constantinople in 1204. The shroud then showed up in the hands of a descendant of one of the Templars after the order was ruthlessly suppressed. His idea was that the shroud and the Mandylion were the same, but that in the earlier centuries it had been kept folded in four, like a map, with the head on top and had always been displayed that way.

The other possibility is that a convincing relic of only the head was held by the Templars. A discovery at Templecombe, England, in 1951 showed that regional Templar centers kept paintings of a bearded Jesus face, which may well have been copies of a relic that Templar enemies tried to find but couldn't. The Templars had been accused of worshiping a bearded idol.

Well, what made the Mandylion so convincing? A possibility: when the Templars obtained the relic they also obtained a secret book of magical arts that told how to form such an image. This of course implies that Leonardo discovered the technique when examining this manuscript, which may have contained diagrams. Or, it implies that the image was not counterfeited by Leonardo but was a much, much older counterfeit.

Obviously all this is pure speculation. But one cannot deny that the shroud images have a photographic quality, that they are out of kilter with each other, and that the secret of camera obscura projection in Western art seems to stem from Leonardo's studios.

The other point is that the 1988 carbon analysis dated the shroud to the century before Leonardo. If one discounts possible political control of the result, then one is left to wonder how such a relic could have been so skillfully wrought in that era. Leonardo was one of those once-in-a-thousand-year geniuses who had the requisite combination of skills, talents, knowledge and impiety to pull off such a stunt.

Of course, the radiocarbon dating might easily have been off by a hundred years (but, if fairly done, is not likely to have been off by 1300 years).

All in all, I can't be sure exactly what happened, but I am strongly inclined to agree that the shroud was counterfeited by Leonardo based on a previous relic. The previous relic must have been at least "pretty good" -- or why all the fuss in previous centuries? But it is hard not to suspect Leonardo's masterful hand in the Shroud of Turin.

Of course, the thing about the shroud is that there is always more to it. More mystery. I know perfectly well that, no matter how good the scientific and historical analysis, trying to nail down a proof one way or the other is a will-o'-the-wisp.

Table of contents of Mind Journeys


This is the revised edition of November 2022, containing a new 18th chapter.
This e-book contains about 100,000 words.

Chapter 18: Chasing Schrödinger's cat

This chapter was added in November 2022.