Language Purification

Language purification, or "debabling", is the process of aligning words within a specific language with their underlying semantics. This process can also occur between two or more languages.

We have tried to keep the content correct both practically and academically. Practical use is an important value. The concepts and mechanisms are often already applied in open source (software) projects and in the broader information technology domain, for example (excuse the technical term) in "Object Oriented Analysis and Design" (OOAD): think of concepts such as 'inheritance', 'non-ambiguity', 'redundancy avoidance', 'semantics', 'language constructs', etc.

We hope you enjoy reading it!

----
!!!Spread the word, but watch out not to kill the message!!!
----

Bableism in one language
When someone 'discovers' that a certain term is used improperly with regard to its semantics, one of the following steps should be taken. Either the definition in Wiktionary is corrected so that usage and semantics match again, or a new term is defined (this is called semantic splitting) and its definition is added to Wiktionary. In the latter case, the definition of the 'corrupt' term should be altered as well.

For example, consider the sentence "This is a really fast car." This could mean that the car has a high acceleration ability, i.e. it can go from zero to 100 km/hour in only 2,6 seconds, or it could mean that it has a high maximum speed, such as 200 km/hour. There already is a proper word for one part, speed, but we miss a word for the acceleration side. Perhaps the word "faacc" could be used for fast acceleration, and/or "qis" for quick at high speed. After defining a proper word, it should be added to Wiktionary. So if "qis" is selected, it should be added to Wiktionary. The definition of "fast" should then be altered to read: "Use this word only if both the characteristics 'high acceleration' and 'high maximum speed' apply. Otherwise use the terms 'qis' or 'speed'. What exactly does someone mean?"

If the old definition causes too much confusion, it can be decided to make the term a zombiword, in which case the corresponding entry in Wiktionary should read: "This term should not be used anymore; the following terms ... should be used instead."

Bableism over multiple languages

Really a challenge...
This gets more complex. Ideally, all the language-specific wikis would use a pointer to one definition (semantics) in the English wiki (or better: in a separate wiki for semantics, called e.g. 'semanticwiki', abbreviated to 'wikimantic'). By doing so 'bableism' can be avoided, and even partly 'corrected'. For now the English (.org) wiki can be used, because such a migration is too large an undertaking at this moment.

The maintenance of the 'word' is then done in one place only. In this way all energy can be put into using the word instead of resolving confusion about it... 'Spread the word' has been taken too literally: even the 'maintenance' of the connected semantics (definition) got spread, and the semantics started to evolve in all kinds of places, with the result that people who understood each other at the moment of 'conception' of the word now no longer understand each other. Babylonia, with all kinds of consequences, followed...

A very good example of keeping the maintenance of a word in one place: only one day after the word 'wordsynthesis' was created (conceived!), it was decided to use the term 'wordconception' instead. Think what would have happened if the word 'wordsynthesis' had already been 'propagated' into various language-specific wikis, dictionaries, documents, etc. Just one day after its creation, the concept would have been stuck completely. 'Word dynamics' or 'word evolution' (another interesting concept) would no longer be possible for this word. The only thing left to do might (already) be to give the word 'wordsynthesis' the status 'zombiword' and add the new term... This would be a waste of 'wordspace' and of the effort spent by people entering the definition in all those systems, databases and locations. Passing semantics 'by reference' suddenly gets a more important meaning!

Other rules:
*Do not add semantics to a word when those semantics are not specific to that language. In other words: if the semantics (definition / description) should be global, then the language-specific wiki only contains the term (or maybe more than one!) and a pointer to the global term and the global semantics, stored in the SemanticWiki.
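
The 'pass semantics by reference' rule can be pictured with a small data structure. The following is only a minimal sketch: the class names, the `concept_id` field and the example definition are assumptions made for illustration, not part of any existing wiki software.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GlobalConcept:
    """One entry in the (hypothetical) central SemanticWiki."""
    concept_id: str
    definition: str


@dataclass
class LocalEntry:
    """An entry in a language-specific wiki: terms and a pointer, no definition."""
    language: str
    terms: List[str]
    concept_id: str  # the 'reference' to the global semantics


# Illustrative content only.
semantic_wiki: Dict[str, GlobalConcept] = {
    "car": GlobalConcept("car", "A road vehicle powered by an engine."),
}
local_entries = [
    LocalEntry("en", ["car"], "car"),
    LocalEntry("nl", ["auto", "wagen"], "car"),
    LocalEntry("fr", ["voiture"], "car"),
]


def definition_for(entry: LocalEntry) -> str:
    """Follow the pointer from a local entry to the single global definition."""
    return semantic_wiki[entry.concept_id].definition


# The definition is maintained in one place; every local wiki sees the change.
semantic_wiki["car"].definition = "A wheeled motor vehicle used for transport."
for entry in local_entries:
    print(entry.language, entry.terms, "->", definition_for(entry))
```

Because the local entries hold no definition of their own, there is nothing to keep consistent between languages: one edit in the central store is enough.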

Global solution proposal: WikiMantics?

As mentioned earlier, it is crucial to separate global definitions from local definitions (semantics), and to put the global definitions in a central database. See the figure below:

See Figure - idea behind WikiMantics

The idea behind it is that 'local' wikis become far more lightweight than they are now. Currently all global knowledge is duplicated in about 24 wikis (and this number is rising). This means that currently probably about 700.000 articles are duplicated 24 times (about 16.800.000 articles, still rising fast). I think that about 98% of the storage is wasted due to redundancy (I think that in most countries definitions and semantics are more or less the same, or should be the same). Only national / local concepts should additionally be defined in 'local' wikis / databases.

Business case calculation, benefit estimation (huge)
Far more important is the number of man hours involved in editing and maintaining all those articles, and keeping them consistent with other languages. Some quantification:
*About 13.000.000 articles too many are produced.
*It is not unrealistic to think that each article 'costs' 5 man hours on average (this is a very conservative estimate; it will probably be more than 50 man hours on average).
*Let's say that with WikiMantics in place still 20% of the man hours is needed per local wiki instance; then we could save about 50.000.000 man hours.
*In money terms (that mostly works better), let's calculate with a rate of 70 euro an hour: 3.500.000.000 euro can be (or could have been) 'saved'.
*I think a normal national wiki would easily grow towards 10.000.000 articles. Imagine that we had the SemanticWiki in place right now; we could save about 25 (local wikis) * 8.000.000 (assumed work reduction) * 5 (man hours) * 70 (euro an hour) = 7E+10 euro (a seven with 10 zeros: 70.000.000.000). This is of course a very rough and maybe partly incorrect calculation, but it gives a good idea. The benefits are far greater still if we take into account that we no longer need separate dictionaries, and that the number of local wikis will grow to at least 500(? - how many countries and/or languages are there?).
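
The arithmetic behind these estimates can be checked with a few lines of code. This is only a sketch: the inputs are the figures stated in the list above, the variable names are illustrative, and applying the 70 euro hourly rate to the second estimate is an assumption.

```python
HOURS_PER_ARTICLE = 5   # conservative estimate from the text
EURO_PER_HOUR = 70      # hourly rate used in the text

# First estimate: the articles that exist 'too many' today.
surplus_articles = 13_000_000
remaining_share_percent = 20  # share of the work still needed per local wiki
saved_hours = surplus_articles * HOURS_PER_ARTICLE * (100 - remaining_share_percent) // 100
saved_euro = saved_hours * EURO_PER_HOUR

# Second estimate: 25 local wikis growing towards 10.000.000 articles each.
local_wikis = 25
articles_saved_per_wiki = 8_000_000  # assumed work reduction per wiki
future_hours = local_wikis * articles_saved_per_wiki * HOURS_PER_ARTICLE
future_euro = future_hours * EURO_PER_HOUR

print(f"saved now:   {saved_hours:,} man hours, {saved_euro:,} euro")
# saved now:   52,000,000 man hours, 3,640,000,000 euro
print(f"saved later: {future_hours:,} man hours, {future_euro:,} euro")
# saved later: 1,000,000,000 man hours, 70,000,000,000 euro
```

The exact first result (52.000.000 man hours, 3.640.000.000 euro) is what the text rounds down to about 50.000.000 man hours and 3.500.000.000 euro.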

Avoid 20th century Babylon
Far more important is that we are creating our own Babylonia by having so much redundancy. See the section 'Buddha plays with Babylon' further on. The 'costs' (and human suffering!) are very hard to quantify. In money terms this will be hundreds of billions of euros. There are far better purposes for this money: supporting all kinds of aid funds and/or research funds (e.g. disease research). And of course the money could be spent on making the SemanticWiki a great base for knowledge sharing (with a lot of detail, if we wish; I think this is a good idea). Of course with proper decomposition: take a look again at software development, where it is an important discipline (I would love to see information technologists work together with linguists...).

So what to do?
*Make the wiki platform ready to use the WikiMantics concept
*Test this functionality thoroughly.
*Start migrating to the new style with one or two 'local' wikis.
*When it all works, start migrating the rest.
*Do not make separate wiki dictionaries. Why should we? Dogma?
In my opinion, it is more important to get this done than to facilitate the 'uncontrolled' growth of all the separate wikis at this moment. Wiki might die if nothing changes...

Note: the proposed concept is not new! In the information industry it is normal to avoid redundancy in databases. The 'only' things that make this 'project' different are its huge scale and the difficulty of managing it (a large part of the community is doing things autonomously, which is a good thing). But things are getting out of hand, and rigid, high-quality coordination is needed to get into the 'proper' direction again. This can be seen as a new life cycle of wiki as well: it started with a concept and a 'just start' mentality. Now it is clear that it is a very good concept, maybe the best 'invention' of this age(?), or at least in the top 100 (not that this is important; it can serve mankind in a great way, and that is what counts). What goes for words (wordevotion) goes for wiki (and that is a sign that it is good). The next cycle is to make it more efficient, and to be able to absorb the enthusiasm of the global community...

General concepts/terms:
To wocon a new word (abbreviation of 'wordconception')
This concept is partly equal to 'word synthesis' in general. But there are some very specific things about the wocon concept: words / terms are 'designed' for use on the internet, globally, and within IT systems. It is a wiki-slang word. The concept 'word synthesis' has too broad a definition. This makes it hard to define specific rules for e.g. the wiki domain, or the special context for which the 'wocon' word is intended. In addition: googling 'word synthesis' is cumbersome; it gives far too diverse hits.
[Help wanted: I encountered an article (linguistics) about word synthesis, but I can't find it anymore. Please help and add the link.]

Therefore you can say: woconing is a specific form of word synthesis (this is important to state; it describes the semantics!).

'Conception' purely means 'getting pregnant' (conception, collide). In the wiki world it means:
----
"When a word is lacking for a certain concept, and the concept is relevant for others, think about creating a new word by using 'wordconcepting' (practical usage: 'to wocon', e.g. 'I woconed a new word today...')."
----

Woconing has a strong relationship with etymology. The ground rules for woconing are more or less the same as for etymological analysis / study. Actually, woconing is doing etymology in reverse.

[Note: In the Dutch wiki some topics about sound 'scales' and 'sound' relations between words were included. I did not encounter them in the English version. Those topics are very important... Can anybody add these parts to the English version? (A nice example of what happens to the global understanding of semantics when there are various local wikis; see the section about 'over multiple languages'.)]


Note: a good woconing ('word design') aspect of the word 'wocon' itself: it sounds like 'walk on', which is very appropriate.

When to use?
*When a certain concept is relevant for a group but has no unique name, and it is therefore difficult to 'address' it, teach it to others and/or 'develop' it further.
*As a consequence it is difficult to 'speak' about it (you always have to use several sentences to express what you mean).
*And maybe the most important thing: everybody has their own 'idea' about it, which easily leads to miscommunication, arguments, inefficiency, etc.
After a word has been created (woconed), it can be put in a dictionary, and a commonly accepted definition (semantics) can be added.

Note: some ground rules probably have to be developed on when a new word may be included in e.g. a wiki. A first shot:
*The new word must be relevant for at least 50.000 people.
*...?

Some first rules of thumb for woconing:
*A new term (word) should be easily pronounceable.
*A new term must be semantically 'logical', based upon the 'word parts' used and the related words which 'pop up' by 'free association'. Try writing it down, together with all kinds of words which you feel sound the same, mean the same, are related, etc.
*The new term must be usable for the intended 'user group'. Ideally the term is 'prepared' to scale up to global usage (that is a reason why using e.g. French, German, Dutch, etc. is mostly not a good idea). By doing so future bableism is avoided. Only use a local language when it is practically absolutely sure that the term is not globally relevant, e.g. when we look for a term for a specific habit of people living in the north of France. Then a French-feeling term may be constructed.
*It must be one word (no spaces, backslashes, etc.). This is required for creating unique hits for search engines.
*It must be practically unique in the global world (Google the intended term before adding it to a wiki!).
*How does the new word feel? And sound? Look? Does it raise the proper 'feeling', even without explaining the 'word design background' and/or definition to people? Emotion and sound could be 'stronger' than exact spelling. It is a bit like poetry...
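
The mechanical part of these rules of thumb (one word, practically unique) can be sketched as a small check. This is an illustration only: the function name is hypothetical, and the set of already known terms is an assumed stand-in for googling the candidate. Pronounceability, 'feeling' and sound cannot be checked this way.

```python
import re


def check_candidate(term, existing_terms):
    """Return a list of problems found for a candidate word (empty list = none found)."""
    problems = []
    # Rule: it must be one word (no spaces, slashes, backslashes).
    if not term or re.search(r"[\s/\\]", term):
        problems.append("must be one word without spaces or slashes")
    # Rule: it must be practically unique (here: not in a supplied set of known terms).
    if term.lower() in {known.lower() for known in existing_terms}:
        problems.append("already in use")
    return problems


known = {"synthesis", "conception", "purism"}
for candidate in ["wocon", "word conception", "Purism"]:
    print(candidate, "->", check_candidate(candidate, known) or "no problems found")
```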

Debabling
The process of 'language purification' is given one unique term: 'debabling'.

Rationale:
*One unique word or term is required for such an important concept.
*When one enters the keywords 'language purification' in Google, one term that expresses the same idea is 'purism'. When entering 'purism', however, the intended concept is not listed within the top 5 hits.

The relevance of the term 'Debabling':
When a term is not 'pure', things go wrong, e.g.:
*Two persons think that they understand each other, but they may both be using completely or slightly (maybe even more dangerous) different concepts.
*When translating semantics to another language, things can get even more difficult. The following example happened in real life when the author (Dutch) spoke with a Frenchman. Their common language was English. In Dutch the word for 'overstrained' (sickness) is 'overspannen'. In French the (probably) most proper term is 'Pression'. So far so good, but then the opposite of 'overspannen' in Dutch would be something like 'stressed by having too little to do' (in Dutch there is no real opposite). In French, however, the word 'Depression' seems to be the opposite of 'Pression'. But in Dutch the word 'Depression' means something quite different: it means the 'disease' 'depression' (the same goes for the English language). And there we have Babylonia... Note: the author feels that the term 'depression' is itself dangerously misdefined (or at least incomplete), but that is not a topic for this article.
*Within software development there is broad experience of what happens when a part of a system is not defined properly, e.g. is ambiguous in an unwanted manner. Whole systems can collapse due to this. The same goes for natural language, especially when people aren't awake enough to ask themselves what a word actually means, or what the other party means by it.
*Within software development there is, sadly, broad experience as well of how hard it is to get a term (label and its semantics) removed from the 'system' (the software, developers, designers, etc.). Sometimes the only thing left to do is to 'demilitarize' the term: don't use it anymore, and use new terms with proper semantics instead. Avoid referencing the old term (in software, documentation, mindset, etc.) as much as possible. After some years the false term is no longer used, but it is still in the system (it does no harm, as long as nobody starts using it again). That's why the term / classification 'zombiword' came up.
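
In software this 'demilitarizing' is usually called deprecation. A minimal sketch of how a zombiword could be handled in the same way is given below; the table of zombiwords and the function name are assumptions for illustration, and only Python's standard `warnings` module is used.

```python
import warnings

# Illustrative table: zombiword -> term that should be used instead.
ZOMBIWORDS = {"wordsynthesis": "wordconception"}


def resolve_term(term):
    """Return the term to use; warn when a zombiword is still being used."""
    replacement = ZOMBIWORDS.get(term)
    if replacement is None:
        return term
    warnings.warn(
        f"'{term}' is a zombiword; use '{replacement}' instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return replacement


print(resolve_term("wocon"))          # not a zombiword, returned unchanged
print(resolve_term("wordsynthesis"))  # still resolves, but emits a warning
```

The old term stays in the system and keeps working, but every use of it points the user to the proper term.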

Wordevotion
Wordevotion stands for the idea that a word has a life cycle as well (creation, change, split, etc., death):
*1. Creation (wordconception, see above)
*2. Change of its semantics (no unique term woconed yet)
*3. Split of a part of its semantics (no unique term woconed yet)
*4. Union with the semantics (or part of) another word (no unique term woconed yet)
*5. Zombinize, no more new use is allowed (it becomes a 'zombiword')
*6. Death: it is completely removed (this will probably not occur often, as it is hard to remove all instances) (no unique term woconed yet, maybe 'word starvation', abbreviated to 'wostarf'?)
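
The cycle above can be read as a small state machine. The sketch below is one possible reading, not a definition: the state and event names are illustrative, and it is an assumption that a word can only die after it has first become a zombiword.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"        # after creation (step 1)
    ZOMBIWORD = "zombiword"  # no more new use allowed (step 5)
    DEAD = "dead"            # completely removed (step 6)


class Event(Enum):
    CHANGE = "change of semantics"  # step 2
    SPLIT = "split of semantics"    # step 3
    UNION = "union of semantics"    # step 4
    ZOMBINIZE = "zombinize"         # step 5
    DEATH = "death"                 # step 6


# Allowed transitions; anything not listed here is rejected.
TRANSITIONS = {
    (State.ACTIVE, Event.CHANGE): State.ACTIVE,
    (State.ACTIVE, Event.SPLIT): State.ACTIVE,
    (State.ACTIVE, Event.UNION): State.ACTIVE,
    (State.ACTIVE, Event.ZOMBINIZE): State.ZOMBIWORD,
    (State.ZOMBIWORD, Event.DEATH): State.DEAD,
}


def apply_event(state, event):
    """Return the next state, or raise ValueError if the step is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event.name} is not allowed in state {state.name}") from None


state = State.ACTIVE  # a freshly woconed word
for event in (Event.CHANGE, Event.SPLIT, Event.ZOMBINIZE, Event.DEATH):
    state = apply_event(state, event)
    print(event.name, "->", state.name)
```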

Wordevotion is the woconing of 'word' (term), 'evolution' (Darwin) and 'emotion'. About emotion: the sound of a word and the emotions it evokes (or its underlying parts evoke) make the word alive (or not). It will indeed be a survival of the fittest, or better: the most pure, loved and used words will remain. In this way debabling will take place, and our language will become more and more meaningful.

Global relevance
A global wiki (or the dictionary alternative) and proper 'change management' of terms and their semantics are the key to understanding each other globally and to raising our communication to a significantly higher level...

Why now?
Wondering why it is so hard to get the language pure? Some reasons probably are:

How did language evolution happen in 'the early' days?
In the 'earlier' world adding or changing a word would have led to bableism for years: how to inform everybody about the new word? Words and semantics had to develop and spread 'autonomously': a new word developed somewhere (such processes will have taken many years), and with 'success' it got adopted by more and more people.

This was however an 'unconscious' process. In the same manner the semantics of words could change, and words could 'starf', get recombined, etc.

This ('starf') might already be a good example: everybody knows the word 'starvation' (dying due to too little nutrition). But I actually think this word might derive from a verb 'to starf' (= to die); I haven't done any research on this topic yet (the wiki gives the earlier mentioned definition of starvation). In Dutch 'sterven' or 'ik sterf' means 'to die' or 'I die'. But natural language evolution made the word 'starf' fade. Nobody uses the word 'starf', but if I said 'Bill starf this morning', everybody would know (feel / sense?) directly what I mean. Probably by the sound of the word. It is therefore thought that the sound of words is maybe more important than how we write them (spelling).

This could be logical: in the earlier days no such thing as written communication existed. But then... how do we 'know' what 'to starf' means? The author(s) of this article never learned it in their youth...

But the 'natural' development of language led to 'specific' languages all over the world (specific words, sounds, grammars, etc.). This led to a high degree of bableism: only people living 'close' together understood each other. Global communication was not possible.

Then people started to write words down. This helped a lot in global communication. Dictionaries were created. Many books were written, etc.

This however meant that people did not like languages to change: every new word needed a new dictionary, and that causes a lot of work (and should everybody then buy a new dictionary each time?). So languages fell 'asleep'. They were not 'allowed' to change anymore.

But nature takes care of that. One way or the other, change (adaptation) is in the nature of all things. Is this maybe the reason 'dead languages' (Greek, Latin) exist nowadays? This could very well be true (why did they die?). A better term might be 'sleeping' language: the language is still 'in the world', and when we allow the language to change again it might awake again...

And how language evolution could go nowadays...
Let's get back to our 'time': the reasons for not changing a language in a more flexible manner are gone. Why? Due to information technology: global usage of a wiki or dictionary (should those two not merge to avoid redundancy?). We can centrally change the definition of a word, and make the change directly visible to all the 'stakeholders': people all over the world using that language. Of course we should be careful: too much change will kill the language as well.

So, not changing a language will kill it in the end. What about the soul of the language? The soul moves on to another language when a language dies (or falls asleep?); this happened e.g. when Latin and Greek stopped developing (writing this, maybe those languages are dead, and we are trying to keep them alive, something we maybe shouldn't do...). Take a look at this sentence: "...the reason why dead languages still exist..." (A dead language exists??? A dead person exists as well???). Something that is dead should be buried, and not exist anymore (other than leaving its footprint in a new language (or multiple languages)).

So what to do now? Let the language change itself, but in a controlled manner! See what happens.

Note on spelling: to 'debable' or 'debabble'? That's the question.
The first spelling used two 'b's, for example "debabbling". But then it was realised that this was not pure itself: 'Babylon' is spelled with one 'b'. However, on the internet there are several occurrences of related words using two 'b's.

An important related term 'Babel fish' uses one 'b'...

Free association (proposed term: freesociation) can help:
*'Bable' associates with 'Bible', which is nice, because the Bible contains the story of Babylon. And, not very important, the Bible is one of the most important examples of the 'written word' (Dutch: "Heilige Geschrift", 'Holy Scripture'). And in my opinion it is also the most important example of what happens in the world when taking the 'word' literally... Wars, disputes, etc. Why? The underlying semantics of the word are (or should be) the same for all parties: love, compassion, etc.
*'Bable' also associates with 'Google', which is nice (and appropriate) as well, because Google (and other search engines) make language transparent. "Language cannot hide itself on the internet". And so its defects become (or can be made) very clear. Just google a certain term (or its composites, related terms, etc.). See what it means in several languages. Use translation tools (for forward and backward translation), look at the different wikis, etc. Debabling is fun!
*Debabling associates with 'debating', which is good / true as well. Buddhist monks tend to reach a high level in debating. A debate can follow when someone proposes a certain debabling action.

Note on the process of 'freesociation': in order to wocon a new word, dogma-free association and fantasy (and etymology / word breakdown / analysis of parts) can help. See the section on 'wocon' as well (ground rules).

But why not spell it as 'Debabel'? It was almost proposed, but after playing a little word game with Babel Fish, it was found that this is not a good idea. Babel Fish babbles (creates bableism) itself! See the section "Buddha plays with Babylon".

Why need Esperanto?

And don't forget: language, and specific (e.g. per country) language occurrences, are so beautiful. And there is so much information and wisdom in words; you can see it when you 'free' your mind and start 'free' and dogma-free association. But don't try to find things. The harder you try, the less you will see...

Information technology helps us BOTH to keep our specific languages (which contain the soul of a nation / group of people?) AND to get the semantics right (by using the 'pass semantics by reference' concept).

[Call for help: A proper term has yet to be woconed for this concept ('pass semantics by reference')! Who has a good idea? Just add it on the wiki, and see what happens! Of course you can contact me first for discussion. This can be done here (public) or by email: contact@spirilogic.com. Go for the challenge, don't be afraid, act like children do, and make this world a better place.]

Decision: it is clear: one 'b' is used, due to the strong association with the term 'Bible' (and Babylon is spelled with one 'b', not two).

It's all in the name...

But the opinions of wiki users are very interesting; please post them on this 'site' (talk), and please also mention it if you think nothing has to be changed on this topic. Thank you!

Buddha plays with Babylon: a good example of the impact of bableism...
What does the Fish say?
Try the following:
*1. Start the Babel Fish translator
*2. Select 'English to Dutch'
*3. Type 'Budha' and translate it. Look at the result.
*4. Type 'Buddha' and translate it. Look at the result.
*5. Select 'Dutch to English'
*6. Type 'Bhoeda' and translate it. Look at the result.
*7. Type 'Bhoedda' and translate it. Look at the result.

Saw anything strange? If not, do it again. There is (or at least, there was on 3 September 2009) something strange about the results.

What happened?
*3 gave 'Budha'. This is really wrong. In Dutch Buddha is never spelled with a 'u'. But, as experienced Babel Fish users know, when the Fish doesn't know a word, it doesn't say so, but gives back the same word that was entered... Great. The problem is that in a lot of languages this might be right (when a term has been adopted across languages). Users have learned a new (incorrect!) spelling... Here the first group of fishermen go into the mist (lost in translation).
*4 gave 'Boedha'. OK, people who just want to know the right spelling stop here and start writing it down, start talking about it, etc. There is a huge chance that, when they have to communicate in English, they will use the term 'Budha' (Wrong!). Why? Because the Fish said so, even when we entered 'Buddha' originally. There another (huge) group of fishermen crash on the rocks.
*6 gave 'Bhoeda'. This must be very wrong again, but I think only one or two fishermen crash, because the use of 'oe' in English is really awkward.
*7 gave 'Buddha'. This is (maybe?) the correct spelling.
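
The 'echo' behaviour described above (giving back the input when the word is unknown) can be contrasted with a lookup that admits it does not know. This is a toy sketch: the one-entry dictionary and both function names are assumptions for illustration and say nothing about how Babel Fish actually works internally.

```python
from typing import Dict, Optional

# Toy English-to-Dutch word list, for illustration only.
EN_TO_NL: Dict[str, str] = {"buddha": "Boeddha"}


def translate_echo(word: str, table: Dict[str, str]) -> str:
    """Unknown words are silently returned unchanged (the behaviour criticised above)."""
    return table.get(word.lower(), word)


def translate_strict(word: str, table: Dict[str, str]) -> Optional[str]:
    """Unknown words yield None, so the caller can tell that nothing was found."""
    return table.get(word.lower())


for word in ("Buddha", "Budha"):
    print(word, "| echo:", translate_echo(word, EN_TO_NL),
          "| strict:", translate_strict(word, EN_TO_NL))
```

With the echo variant a misspelled 'Budha' comes back looking like a valid translation; the strict variant makes the gap visible instead.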

Jesus still walks on the water (he sees through it), but are there any fishermen left? Lost in translation again...

Note: Google gives 6.280.000 hits on 'Jezus' and 272.000.000 on 'Jesus'.

What does Google say?
Something that I, and many with me, do to find the 'correct' spelling of a word is a simple Google hit count comparison. This gives:
*Buddha - 38.600.000 hits
*Budha - 1.940.000 hits
*Budda - 2.470.000 hits
*Buda - 22.500.000 hits
*Bhuda - 2.590 hits
*Bhudha - 5.760 hits

There are a lot more variations, but I think these are the most interesting.
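
The hit count comparison itself is easy to reproduce. The sketch below only ranks the figures listed above; the counts are the ones quoted in this article and are not fetched from Google.

```python
# Hit counts as quoted in the list above.
hit_counts = {
    "Buddha": 38_600_000,
    "Budha": 1_940_000,
    "Budda": 2_470_000,
    "Buda": 22_500_000,
    "Bhuda": 2_590,
    "Bhudha": 5_760,
}

# Rank the spellings from most to fewest hits.
ranked = sorted(hit_counts.items(), key=lambda item: item[1], reverse=True)
total_hits = sum(hit_counts.values())

for spelling, hits in ranked:
    print(f"{spelling:<7} {hits:>11,}  {hits / total_hits:6.1%}")

most_used_spelling = ranked[0][0]
print("most used spelling:", most_used_spelling)
```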

This is very interesting. Some observations:
*The result for 'Buddha' is OK (it has by far the most hits and is the spelling used in the wikis).
*The result for 'Buda' is surprising. I have encountered four different spellings in the wikis (English alone), but not this one. An explanation could be that this spelling is the most logical one for a Western-language-oriented person (and with so many different spellings, one might choose the 'sound' spelling).
*I would have expected the results for 'Bhuda' and 'Bhudha' to be higher than for 'Buda'. See the text above.

Notes:
*It is a very good thing that the .org wiki and Google are aligned: when searching for 'Budha', Google says: "Did you mean 'Buddha'?", which is the term used within wiki.org. Another good thing is that the spelling aligns with other language wikis, e.g. Dutch and French (respectively Boeddha and Bouddha).
*I do not want to discredit the Babel Fish translator (Dutch: 'zwartmaken', literally 'make black'; who knows a better English synonym?). I have used the tool often (and still do), with much added value!

What should be done about this? It feels like we should start by eliminating all the different dictionaries. Or better: the wiki is leading (after entering all information into the wiki, both the historical, nowadays incorrect, and the actual, correct). Then we should do the same with translation tools. Or better: all tools should use the same underlying base (something like 'TransWiki'?).

Not Buddha but Budha?
But Budha, or whatever his name is, has more to play. Just when all wikis are filled with Buddhas, it could seem that the 'most correct' spelling is not 'Buddha' but 'Budha'... ouch. See the section on 'spelling' within the 'Buddhism' article.

In Holland, Buddha does not exist...
Then try the following.
#Type the following URL: http://www.mijnwoordenboek.nl/
#Select 'Dutch' to 'English'
#Type 'Boeddha' (most likely spelling probably)
#See the result: it does not exist in the dictionary
#Then try different spellings (I tried 'Boedha', 'Buda', 'Boeda', 'Boedda', 'Bhoedha', 'Bhoedda' and 'Bhoeddha'). Nothing.
(At least this was the case on 4 September 2008).

My conclusion: Buddha does not exist in Holland. Buddha laughs again.

Am I going insane? Just to be sure I enter 'Buddha' in the Babel Fish translator again, and translate it from English to Dutch. Result: 'Boedha'. (You remember, one 'd' instead of two?) Once again I enter this in the Dutch translation tool. Nothing again; for me (a Dutchman) Buddha unfortunately does not exist (in the written word, or at least, I can't find him...).

Reuse, commercial use, publication, etc
Notes:
#I hope the content of this article is interesting for publication, reuse in further investigations, or other purposes. Please do, it's free (really)! Spread the word, but don't get lost in translation (and don't kill the message)... Even if you can make money out of it: fine! (But please think about supporting wiki financially as well.)
#Note II: Maybe it is a good idea to post a discussion entry if you want to publish an article in a magazine, start a research project (graduation thesis?) or something like it (this may be a critical paper as well); in this way we avoid 'conflicts'. 'Token process': post a discussion topic with the title 'I use the material for...' etc. (please add, if possible, a contact email as well; create a new one, because otherwise the spambots will spoil your email address), and then 'the token' is yours. I hope other people respect this, and respect the action of the person who dared to communicate in the open (which I would find great) what his/her plans are (this makes one vulnerable to misuse...). Live in abundance...