Can we prove what the Voynich Manuscript is? Well, we need to take that first step sooner or later.
With ROS (Robot Operating System) being much less co-operative than I had envisioned, I haven’t had time at all to investigate the Voynich Manuscript as much as I had promised myself and others. That said, I looked at David Jackson’s interesting essay A Logical Consideration of the Voynich Manuscript where he evaluates all the evidence out there and comes to some conclusions.
Here are my thoughts (and in writing them, I have created more questions to consider).
Introduction and hypothesis
“There is more than enough work “out there” on the web on the Voynich. In fact, everytime I have an idea, I find that somebody else has already had that idea and has investigated it. Surely there is enough evidence out there to start getting a glimpse of the truth?” (Jackson 1)
I don’t think there’s enough evidence out there yet. There is still a lot of physical science analysis that could be done on the manuscript itself, as Pelling has described in some detail.
“I start by defining a hypothesis to prove, and a null (contrary) hypothesis.
• H0 : The Voynich Manuscript contains understandable content.
• H1 : The Voynich Manuscript does not contain understandable content.” (Jackson 3)
This is the question I was planning to start off with too. However, the hypotheses need to be worded more precisely. There is a difference between something containing information and something being understandable.
“Understandable” now or at the time of creation? Some have suggested that the emendations and degradation could have changed the text so much that the original meaning cannot be recovered (Stolfi). The re-ordering of the pages (which almost everyone including Jackson agrees on) also would make it less understandable. From the point of view of deciphering the manuscript, it makes no difference whether it never contained information or if it contained information but became too degraded. But from the point of view of deciding what it is, the distinction matters a great deal.
“Understandable” to who(m)? My mother has awful handwriting and often can’t read what she has written. Her notes and letters cannot be understood by anyone, but they still do (or did?) contain meaningful information. It would be inaccurate to call them gibberish (though I sometimes do :P), meaningless, a hoax or a forgery.
“Understandable” in what fraction? The text, the illustrations, or both? If 10% of the manuscript is gibberish and 90% is meaningful, is this understandable? Probably everyone would say yes. What about 90% gibberish and 10% meaningful? 95% to 5%? 99% to 1%? What if the text was gibberish but the pictures mean something? If we mean to ask if it contains anything understandable, then yes, we can indisputably understand the month names (Vogt and Schwerdtfeger). If you want to point out that they are later additions, then yes, we can understand the zodiac illustrations. If you want to point out that obviously we’re asking about the mysterious “text” and the manuscript as a whole (not just a small section), then we have to really specify where to look and why.
“Understandable” with what intention? With so many details to consider, it’s mentally a lot easier to simplify the issue and just ask “does the text contain information?” without regard to context or intention. But inevitably it must come up, since someone must have created it for some reason. Hypothetically, consider if the text simply said “lol you got trolled” in cipher, and the rest was gibberish. Technically it did contain something meaningful and decipherable to uncover. But overall is the manuscript truly meaningful? Not really.
I know, this all sounds very pedantic. But with the ambitious reach of Jackson’s project, the logic it employs and the scope of the Voynich Manuscript, it’s important to be specific.
Explanations and definitions
“I collect all of the most common “explanations” for the Voynich Manuscript, collate all possible third analysis of each explanation, and attempt to evaluate these analysis.” (Jackson 3)
Is this referring to general hypotheses like “it could be a natural language”, or specific theories like the proto-Ukrainian translation theory? It’s not clear. Jackson also could have listed the explanations he has collected.
I’m disappointed that “forgery” and “hoax” are the only definitions used for gibberish explanations, considering how many other variations there are. These two definitions both include “a deliberate attempt to pass itself off as a 15th century text”. If it was gibberish but actually produced in the 15th century, what would Jackson call it, I wonder?
Origins – Materials
Nothing to discuss here. The current physical analyses are well accepted.
Origins – Inking
I also agree with this conclusion. But while the other discussions about paints and ordering are indeed tedious, they really do need to be considered for the analysis to be comprehensive.
“I therefore suggest we discard the numbering from our analysis as not relevant, and probably as being added later by an unknown owner of the VM.” (Jackson 7)
I disagree with the first statement. All facts are relevant, even if indirectly. The work on the page numbering helps us to know about the original ordering a bit more, which gives us clues about the production process and the illustration patterns, for example the ones that flow across bifolios when the original order is restored (Pelling). Also, the fact that those who numbered the quires and pages didn’t understand the text is a good indicator that the cipher/code/language/whatever was not part of a long-lasting cult.
“By their very nature, they do not form part of the main body of work and there appear to be no theories linking them to the original work. In either case, they are so indistinct that debate rages about their very nature. Therefore, I suggest we discard them as irrelevant to this study.” (Jackson 7)
Wait, what!? For someone who has considered the McCrone ink analysis, how did Jackson miss the finding that the Latin marginalia on f116v were determined to have used the same ink as the main text (Barabe 5)?
That is a significant research failure.
“There is no consensus on the Voynich alphabet.” (Jackson 8)
Agreed, but two other details probably should have been mentioned in his summary:
- The transcriptions often differ because many symbols have an extremely strong affinity for each other and can be grouped together. For example, EVA “i” is usually only repeated before EVA “n”, “r” and “m”, leading to Currier counting “in”, “iin”, and so forth as glyphs in their own right. But there are a frustrating number of exceptions to these sorts of rules.
- Whatever the glyph system is, it was clearly thought out and prepared before the manuscript was written.
Jackson moves on to talk about the text itself. It would have been good to consider the geographic and socioeconomic background of the author.
What is the nature of the text?
Unfortunately, the second half of the essay is where it falls apart. Jackson tries to answer the first question “is it a natural language?” by investigating these sub-questions:
- Is it any known writing system?
- Is it a type of shorthand?
- Is the VM an invented (“constructed”) language or based on an existing natural one?
- Is the VM in code?
- Is it gibberish?
To reach conclusions, Jackson seems to be using this sort of reasoning:
- The Voynich Manuscript text can only be possibility 1, possibility 2, possibility 3, and so on.
- Consider possibility 1. Possibility 1 has these properties: [mentions some properties]. The Voynich Manuscript text does not have these properties. Therefore it is not possibility 1.
- Ditto for possibility 2.
- And so on…
- Using the method of elimination, it is most likely gibberish.
The problem with this approach is that it only works if you truly consider every possibility for the nature of the “text”, which has not been done and cannot be done. The question “is it any known writing system” already assumes that it is writing. Fair enough, it does look like writing, but we all know that nothing is as it seems in this manuscript. Various people have suggested that it could encipher non-textual information such as music or map co-ordinates. While these are very rare proposals that would have been filtered out by Jackson’s initial popularity check, I think this category should at least have been mentioned.
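To make the gap concrete, here is a minimal Python sketch of reasoning by elimination. The candidate labels are paraphrased from the sub-questions above; which of them count as "ruled out" is an assumption made purely for illustration and is not Jackson's actual tally.

```python
def surviving(candidates, ruled_out):
    """Return the candidates that have not been eliminated, in their original order."""
    return [c for c in candidates if c not in ruled_out]


# Candidate explanations paraphrased from the sub-questions above.
listed = [
    "known writing system",
    "shorthand",
    "constructed language",
    "code or cipher",
    "gibberish",
]

# Hypothetical tally, for illustration only.
ruled_out = {"known writing system", "shorthand", "constructed language", "code or cipher"}

# With only the listed candidates, elimination appears to settle the matter.
print(surviving(listed, ruled_out))

# Add options the list never considered and the apparent certainty disappears.
unlisted = ["non-textual information", "combination of meaning systems"]
print(surviving(listed + unlisted, ruled_out))
```

The first call leaves a single survivor, while the second leaves three. Elimination can only point to one answer when the starting list is known to be complete.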
The problem with considering the properties of each possibility is shown in conclusion nine:
“The VM is not written in shorthand alone.” (Jackson 10)
If we accept that different meaning systems could have been used in combination, then it is almost impossible to compare the text against the known properties of each one individually.
“If it is a constructed language, it is of a different level to any other known invented language of before the 19th century.” (Jackson 11)
“However, when I look at these languages I see three main types” (Jackson 11)
These sorts of groupings are bound to end up as arbitrary as the Celestial Emporium of Benevolent Knowledge. Edit: For those unfamiliar, I am referring to Jorge Luis Borges’ hypothetical taxonomy supposedly sourced from an ancient Chinese encyclopedia. Its purpose is to point out the arbitrariness and cultural specificity of any classification, whether it is formally stated or one that we unconsciously assume and that guides our thoughts. It is reproduced below:
“On those remote pages it is written that animals are divided into
- those that belong to the Emperor
- embalmed ones
- those that are trained
- suckling pigs
- fabulous ones
- stray dogs
- those that are included in this classification
- those that tremble as if they were mad
- innumerable ones
- those drawn with a very fine camel’s hair brush
- those that have just broken a flower vase
- those that resemble flies from a distance” (Borges in Weinberger 231)
In the end, the categorization says more about us than it does about the topic. Even worse, categories encourage us to mentally mask the nuances and complexity within them. The full range of patterns and purposes in artificial/insane meaning systems is huge, probably even larger than that of natural languages. To re-conceptualize and classify this gamut through the mental lens of a sane, 21st Century mind is bound to strip out significant distinctions that cannot be readily understood in a modern context.
This is not to say that strict groupings are useless, just that they can’t be applied here. The Voynich Manuscript is already notoriously difficult to pigeonhole, so now most that study it simply classify it primarily as itself – it just is what it is. It has vague links to other categories but never fits squarely in just one, making us realize that these categories are more like loose mental associations than discrete “boxes”. In the same way, artificial meaning systems arise from extremely unique and personal circumstances and can only be subjectively correlated with loose descriptions. To then discuss these types as if they are objective, distinct and with inherent properties is to miss the true nature of mental categorization¹, to project opinion onto fact, and to equate personal perception with reality.
¹(the human brain cannot simultaneously consider and process too many objects in short term memory, so it subjectively simplifies them into groups as a work-around, especially if it does not fully comprehend the details involved. These groups are derived from subjective and personal experience according to current knowledge and what’s useful right now. They are treated as discrete and distinct objects in their own right to allow for continued processing and judgement. When pushed to extremes, this is where stereotyping comes from. (Lehrer))
“The VM does not seem to contain any known religious imagery (see concepts). I am unaware of any attempts from this era to create a ‘universal language’ that did not include these concepts.” (Jackson 12)
Does there have to be a precedent for this? The Voynich Manuscript is in itself already unique in many ways. Jackson already mentioned Balaibalan, which is itself a unique case: a 16th Century Middle Eastern constructed language.
“The underlying construct of the language is simply too ordered for it to be a naturally generated language.” (Jackson 12)
I agree, but unfortunately this isn’t easy to quantify.
“A constructed language would not display this characteristic unless the constructed words were of a unified length, something that wouldn’t necessarily have made sense in the pre-computer era.” (Jackson 13)
If we are considering situations like insane babble, mystical experiences and personal circumstances, why should it make sense?
“Basically they want to find repeating CVCV blocks in the text. Nobody has ever found any, that I’m aware of.” (Jackson 14)
I’m not aware of any either. An important detail is that the glyphs are deliberately designed to resemble consonants and vowels, regardless of whether they actually represent them.
“The VM may be exhibiting elements of a phonetic language (ie Chinese)” (Jackson 14)
Minor detail, but Chinese writing is mostly semantic, not phonetic.
I assume Jackson means both codes and ciphers in this section.
“If he was stopping and working out the cipher as he went along, it can not have been a very laborious process.” (Jackson 15)
It could have plausibly been copied from a prepared draft that had already been enciphered.
“There were no computers. Any cipher or code that has ever been found from that time is limited in its nature to the essential. We can develop great theories about quantum encoding processes, but this is irrelevant to the VM. Any cipher would be limited to a) the realities of the era’s philosophical and mathematical abilities and b) the realities of the encoding / decoding process.” (Jackson 15)
The fact that the Voynich Manuscript itself is an outlier could easily suggest that its author and/or generation method is an outlier too. Jackson had mentioned Ramon Llull’s ars combinatoria logic system, which doesn’t exactly fit with the era. As for the abilities of the time, we must consider the range of the human brain, which has not been fully explored by science. In the modern day we find people who can intuitively play invisible Tetris (e.g. Jin8), instantly calculate the day of the week for any given calendar date (e.g. Orlando Serrell) or memorize entire pages in a matter of seconds (e.g. Kim Peek). Who can say what unrecorded minds may have been able to accomplish? This is purely speculative of course, but the point is that if we are discussing and evaluating all possibilities, we really must refrain from underestimating people with a lower technological level than us.
Also, the cipher doesn’t have to be impossibly complex, it could simply be very creative. Maybe it’s the sort of idea that only arises independently every few hundred years.
Going back to mathematical ability, complexity and encoding/decoding realities, it is possible to have enciphering methods that are mathematically extremely complex yet very simple to invent and execute. I have an example in my head (it’s not the synesthesia thing below) but I haven’t had time to write about it. Methods that rely on external information or rules can add to the apparent complexity.
“The VM is not a simple substitution cipher nor code.” (Jackson 16)
“No, to create a cipher that produces Voynich like text you need to use concepts and methods from the modern era.” (Jackson 17)
Anachronistic, ahead-of-their-time inventions act as precedents for this sort of thing, e.g. Babbage’s Difference Engine, the Antikythera Mechanism or Roman concrete. There are probably more out there that are lost to time. Not to mention the other possibilities I mentioned: it could have used a powerful biological computer (i.e. the human brain), it could be a creative cipher, it could use external information, and so on.
“If you were bright enough to work out a cypher this complicated in the Middle Ages, and had secrets enough to hide, why the hell invent some funny alphabet that’s only going to attract attention?” (Jackson 17)
I think that speculating on motivations is the least reliable type of reasoning here. As Tumblr reminds us – we know their name, not their story.
“C15: The VM is not written in a sophisticated cipher nor code.” (Jackson 17)
If Jackson has evaluated the evidence and followed the reasoning above to personally reach a conclusion like this, then so be it. But to present this conclusion at the same level of confidence as the statements “The parchment of the manuscript is from the first half of the 15th century” or “There is no consensus on the Voynich alphabet” is premature in my opinion.
I’ll state a hypothetical example that combines some of the above possibilities. It also relates to my point about the variations in “artificial meaning systems” that others cannot even begin to imagine. Synesthesia is a neurological phenomenon where experiencing one stimulus (e.g. the letter D) will trigger another unrelated experience (e.g. the musical note C#). The brains of synesthetes can link anything in all five senses (or space, or time, or feelings, or…) with other things in a very intense and personal way. These experiences can be very creative and detailed, like the qualities of a voice being linked to the circular movement patterns of bright lights. Pat Duffy once said “I realized that to make an R all I had to do was first write a P and draw a line down from its loop. And I was so surprised that I could turn a yellow letter into an orange letter just by adding a line.” They can state things like the smell of a cube, or the texture of a direction, just as naturally as we associate music and dance. They don’t have to make sense to anyone else.
Anyway, imagine that I have synesthesia and I experience colors when I hear phonemes. The rules that link these are intuitive and natural, I don’t have to think about them. In fact I don’t even know what they are, the result just comes to me. (it’s like humor, nobody knows the full set of rules and formulas to mathematically determine if a joke is funny, you just “know” if something is funny to you once you experience it). For example “the” is bright blue and “an” is dark violet blending into red. However “and the” is altogether bright sea green because a violet sound next to a blue one usually makes it quarter-way closer to “fi”, though there are plenty of exceptions. And so on. One day I assign symbols to each color I experience (possibly including colors that don’t exist for anyone else), as well as properties like brightness and paleness. This is my artificial “language” where the symbols represent meaning in a very personal and unique way. I write some text using the following method: 1. I read out a word in my head, 2. I write down the color of the word. I could map this instantly, and “read” it instantly too. The result would be unreadable to anyone but me.
This method is computationally more complex than what modern supercomputers could process. It relies on the state of interconnections between billions and billions of neurons, i.e. simulating the technique would involve simulating the entire brain. Yet for the encipherer (pre-modern or otherwise) this would be trivially easy to execute. The method is also too surreal to usefully categorize from the point of view of a neurotypical person, too creative to decipher (there is no other example to compare to or learn from), and it relies on too many external rules to decipher (even if you knew the method, and gave it to a synesthetic person, they wouldn’t know where to start because their brain isn’t wired in the same way as the encipherer and there is no way to retrieve, record or explain this “system”).
This is not actually a proposal or explanation, just a thought experiment to show the futility in certain lines of inquiry. We can say that a certain level of complexity is improbable but not conclusively impossible. We can consider artificial meanings and mental hijinks but we can never conclusively understand them, nor their range. And no amount of artificial intelligence could analyze or identify such unique works to reach meaningful conclusions, since there are no collections of other examples, nor any scientific knowledge about them.
“But Torstein Timm in his academic paper “How the Voynich Manuscript was created” describes a very simple process for creating the Voynich Manuscript using 15th century methods.” (Jackson 16)
His paper seems promising but unfortunately I haven’t had time to read it in full.
“why would a star be called the same as a plant?” (Jackson 16)
If the 13th-16th Century Hygromanteia is anything to go by, it’s because the spirit of the star is imbued in the moisture of a specific plant species, which is what gives it its medicinal properties. I had always suspected a link with the Hygromanteia, as it features encrypted text, astrological spirits, herbs, plant moisture, circular diagrams, ball-and-stick diagrams like the ones on the rosette folio, and an emphasis on time cycles. Not to mention that the unconventional and unique Aries symbol marking the astrology section on f1r appears to be lifted directly from a copy of the Hygromanteia (Zandbergen). So this finding with the labels doesn’t make it seem meaningless to me; ironically it strengthens my suspicion instead of undermining it.
“A possibility is my ‘Volvelle’ theory, in which a wheel is used to start the generation of words. A series of concentric paper rings are constructed, with different glyphs on each section. The wheels are rotated to give new words. This accounts for Stolfi’s ‘crust core mantle’ paradigm as glyphs from each section are simply the ones on each ring.
The scribe uses the Volvelle to generate the first word of a sentence. He then uses Timm’s system to generate the rest of the text as he goes along. For labels on pages with many similar words he uses Timm’s system – on pages where there are no continuation between concepts he uses the generator each time and generates similar looking words (perhaps the scribe only moves the middle ring or the last one each time).” (Jackson 17)
Although I haven’t read Timm’s paper in full [update: I’ve read it now] and Jackson’s Volvelle theory is only in its draft form, my current understanding of Jackson’s proposal could still be consistent with a cipher. Allow me to explain.
We’ll start with a code. We have a supplementary code-book – this can be any dictionary, book, or list of words. For each word in our plaintext, we convert it to a number to indicate where to find it in the code-book. Let’s start with “the” as an example. “The” can be found on page 1, position 4 of a book I am holding, so we can encode it as 1-04. Or maybe 004 as it is the fourth word overall. The details don’t matter, the point is that we are converting each word to a number.
We also have Jackson’s proposed Volvelle. This is a device with three paper rings, each with a set of “Voynichese” glyphs. Only my hypothetical version has a number for each glyph. We generate the first ciphertext word by finding the glyph in position 1 from the first ring, position 0 from the second and position 4 from the third, to encode 1-04 as, say, EVA “oledy”. We then fill out the rest of the ciphertext line with filler using Timm’s method. Repeat with the next plaintext word at the start of each line. Or perhaps there are multiple words per line that use the Volvelle, which can be detected only in hindsight.
I’m perfectly aware that this method is extremely inefficient and pointless. Why convert the number to a Voynichese word if it is already encoded, for example? This isn’t an actual proposal, however; it is just a concept to demonstrate that any process that uses a generator can also be used to encode information one way or another.
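For anyone who prefers code to prose, the code-book plus Volvelle idea above can be sketched in a few lines of Python. Everything here is hypothetical: the code-book entries, the glyph strings on each ring and the ten-slot ring size are placeholders I chose so that 1-04 comes out as “oledy”, and the filler is plain random picking that stands in for Timm’s method rather than implementing it.

```python
import random

# Hypothetical code-book: word -> (page, position). Any agreed book or word list would do.
CODEBOOK = {"the": (1, 4), "plant": (2, 17), "star": (3, 58)}
LOOKUP = {location: word for word, location in CODEBOOK.items()}

# Hypothetical Volvelle rings with ten numbered slots each (positions 0-9).
# Rings 1 and 2 hold single glyphs so that a word splits unambiguously when decoding.
RING_1 = list("qodsykptfc")
RING_2 = list("laeikrtsno")
RING_3 = ["y", "dy", "in", "iin", "edy", "eedy", "ar", "al", "or", "ol"]


def encode_word(word):
    """Turn a plaintext word into a Voynichese-looking word via its code-book number."""
    page, position = CODEBOOK[word]
    tens, units = divmod(position, 10)
    return RING_1[page] + RING_2[tens] + RING_3[units]


def decode_word(cipher_word):
    """Read the three ring positions back off and look the number up in the code-book."""
    page = RING_1.index(cipher_word[0])
    tens = RING_2.index(cipher_word[1])
    units = RING_3.index(cipher_word[2:])
    return LOOKUP[(page, tens * 10 + units)]


def encode_line(word, rng, filler_words=5):
    """First word carries the meaning; the rest is filler (random here, NOT Timm's method)."""
    filler = [
        rng.choice(RING_1) + rng.choice(RING_2) + rng.choice(RING_3)
        for _ in range(filler_words)
    ]
    return " ".join([encode_word(word)] + filler)


def decode_line(line):
    """Only the first word of each line is meaningful in this sketch."""
    return decode_word(line.split()[0])


print(encode_word("the"))  # oledy
line = encode_line("star", random.Random(0))
print(decode_line(line))   # star
```

The only point of the sketch is that a reader holding the same code-book and rings can recover the plaintext word from the first word of each line, while the rest of the line looks the same but carries nothing.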
Conclusions and postscript
Naturally, I didn’t come to the same conclusions as Jackson. Here is a brief comment on each of them:
- C1: The parchment of the manuscript is from the first half of the 15th century. – Agree
- C2: The ink used on the parchment was available at this period in time. – Agree
- C3: The ink used for the writing and the drawings is essentially the same. – Agree
- C4: The ink used for the numbering of the quires / pages, and the Latin alphabet, differ from one another and the writing / drawing. – Agree
- C5: The illustrations were sketched first and then the text added afterwards. – Agree
- C6: The quire and page numbers do not correspond to the true pagination of the VM as indicated by the illustrations and text flow. – Agree
- C7: There is no consensus on the Voynich alphabet. – Agree
- Corollary to C7: Analysis using only transcripts must attempt to adjust for erroneous input. – Agree
- C8: The VM is not written in a known writing system. – Partially agree. You can’t find the entire glyph set anywhere else, but I don’t think the wording of this is accurate. I would not call it a “writing system” yet.
- C9: The VM is not written in shorthand alone. – Agree
- C10: The VM is not a natural language. – Partially agree. It is not a natural language written clearly and without any significant distortion, that’s for sure.
- C11: The VM may be a restricted constructed language. – Agree
- C12: The VM appears (remember C7!) to have a strong underlying pattern that hints at a generator. – Agree
- C13: The VM was written in a fluent and confident manner. – Partially agree. The manuscript itself is. The underlying text may not have been generated fluently and confidently as implied, especially if it is a copy.
- C14: The VM is not a simple substitution cipher nor code. – Agree
- C15: The VM is not written in a sophisticated cipher nor code. – Disagree. My reasons have been stated above.
- C16: A simple and fast process to generate random text for the VM was available. – Partially agree. Depends on how you want to define “simple” and “fast”.
What about his final conclusion?
“I would say, having weighed over 100 years worth of study of the VM in my hand, that we have three
• The VM is written in a long lost script and language. In which case, unless we find more
examples or a key, we’ll never know what it says.
• The VM is written in an amazingly sophisticated code. In which case, it’s probably a modern
• The VM is gibberish.
So, the VM does [sic] null hypothesis is to be favoured:
the VM does not contain understandable content.” (Jackson 18)
While his reasoning makes sense, I think the evidence base is too small and the full range of possibilities has not been tested, so this conclusion is premature. Other possibilities that are not discounted in Jackson’s essay, yet are missing from his final list of three, include non-textual information, a combination of different meaning systems, or a very creative cipher. And I’m sure there are more.
Also, “the VM does not contain understandable content” was H1, not the null hypothesis (Jackson 3). So either I am misreading Jackson’s conclusion, or he has swapped his hypotheses somewhere.
“I had no axe to grind or pet theory when I started this. I simply wanted to ‘weigh the evidence’ that was out there.” (Jackson 19)
Good. We need that kind of thinking – and not just in the study of the Voynich Manuscript. I’m sure it will get Jackson far and I wish him the best of luck in his investigations and development of his Volvelle theory.
Barabe, Joseph. Materials Analysis of the Voynich Manuscript. 1 April 2009.
Jackson, David. A Logical Consideration of the Voynich Manuscript (version 1). 2014. PDF file. <http://www.davidjackson.info/voynich/wp-content/uploads/sites/3/2014/08/consideration_v1.pdf>
Lehrer, Jonah. How We Decide. Boston, MA, USA: Houghton Mifflin Harcourt, 2009. Print.
Pelling, Nicholas. “Voynich Codicology”. Cipher Mysteries. 21 October 2008. Web. 22 August 2014. <http://www.ciphermysteries.com/the-voynich-manuscript/voynich-codicology>
Stolfi, Jorge. “Evidence of Text Retouching on f1r”. Voynich Manuscript stuff. 15 April 2004. Web. 22 August 2014. <http://www.ic.unicamp.br/~stolfi/voynich/04-07-15-retouching>
Vogt, Elmar, and Schwerdtfeger, Elias. “Writings on the Wall: A Discussion of the Voynich Manuscript Marginalia”. Voynich Thoughts. 2010. PDF file. <http://voynichthoughts.wordpress.com/treatise-on-marginalia>
Weinberger, Eliot. Selected Nonfictions. Westminster, London, UK: Penguin Books, 1999. Print.
Zandbergen, René. “Codex Taurinensis C VII 15”. Voynich.nu. 5 April 2010. Web. 23 August 2014. <http://www.voynich.nu/extra/mstaur.html>