SuperFly Guest
Posted: Wed Sep 17, 2003 2:39 am Post subject: funny information about text files
Hi all,
Just read this post in sci.crypt. If a human can parse this back into
normal text, an 'intelligent' context-based algorithm might be able to do
the same. That might be an angle to boost text compression results :)
The algorithm: keep the first and last letter of each word and sort the
letters in between.
*************************************************************************
This is such a marvelous example of English language entropy/character
I must share it with sci.crypt. A friend sent me this today:
-------------------------------------------------------------------------
Subject: Dslyecixs taek haret
Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deson't mttaer
in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht frist and lsat ltteer is at the rghit pclae. The rset can be a
toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae
we do not raed ervey lteetr by it slef but the wrod as a wlohe.
--------------------------------------------------------------------------
Isn't that great? :-)
Native English speaker/readers should have no trouble parsing the text
because of the redundancy of printed English. Only the first and last
letters of each word remain in their expected positions.
John A. Malley
*************************************************************************
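A rough way to try out the compression angle suggested above is to deflate a text before and after sorting the middle letters of each word. This is only a sketch under assumptions: the function names are made up for illustration, zlib stands in for "a normal compressor", and only runs of ASCII letters count as words. The sort is lossy, so a real decoder would need a word list to undo it, and that cost is not counted here.

```python
import re
import zlib


def sort_middle(word: str) -> str:
    """Keep the first and last letter in place; sort the letters in between."""
    if len(word) < 4:
        return word  # fewer than two middle letters, nothing to reorder
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


def sort_middles_in_text(text: str) -> str:
    """Apply sort_middle to every run of ASCII letters; leave the rest untouched."""
    return re.sub(r"[A-Za-z]+", lambda m: sort_middle(m.group(0)), text)


def deflate_sizes(text: str) -> tuple[int, int]:
    """Return (deflated size of the original, deflated size of the sorted text)."""
    original = zlib.compress(text.encode("utf-8"), 9)
    transformed = zlib.compress(sort_middles_in_text(text).encode("utf-8"), 9)
    return len(original), len(transformed)


if __name__ == "__main__":
    sample = (
        "It does not matter in what order the letters in a word are, "
        "the only important thing is that the first and last letter "
        "are at the right place."
    )
    print(sort_middles_in_text(sample))
    print(deflate_sizes(sample))
```

Whether the second number comes out smaller depends on the text and the compressor, so it is worth running this on a large real file before drawing any conclusion.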
David A. Scott Guest
Posted: Wed Sep 17, 2003 3:18 am Post subject: Re: funny information about text files
I don't believe the letters are totally random, since I had no trouble
reading it. But if it's true, one could design a compressor that packs
written text smaller than a normal compressor that likes correct spelling.
If you took a word like "english" you would replace it with "egilnsh":
keep the first and last letters accurate and allow the rest only in
sorted order. This would make the text much easier to compress, since the
strings between the first and last letters would be sorted. The drawback
is that there are bound to be several words that become duplicates. But
then again, that already exists in English: take the word "read", which
can be pronounced like the word "red" or the word "reed". Also, for words
that are plurals you might have to leave the last two letters in place,
like the word "letters" in the text above, where the "rs" was preserved
and "ltteers" was used.
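The transform described above, including the option of leaving the last two letters alone for plurals, could look roughly like this. It is a minimal sketch, not a finished compressor: `canonical`, `keep_tail` and `collisions` are names invented here, and the word list is just a handful of examples picked to show the duplicate problem.

```python
from collections import defaultdict


def canonical(word: str, keep_tail: int = 1) -> str:
    """Keep the first letter and the last `keep_tail` letters; sort the rest."""
    end = max(1, len(word) - keep_tail)  # index where the preserved tail starts
    middle = word[1:end]
    if len(middle) < 2:
        return word  # nothing to reorder
    return word[0] + "".join(sorted(middle)) + word[end:]


def collisions(words, keep_tail: int = 1) -> dict:
    """Group distinct words that end up with the same canonical form."""
    groups = defaultdict(set)
    for w in words:
        groups[canonical(w, keep_tail)].add(w)
    return {form: sorted(ws) for form, ws in groups.items() if len(ws) > 1}


print(canonical("english"))               # egilnsh
print(canonical("letters", keep_tail=2))  # leettrs
print(collisions(["form", "from", "trial", "trail", "word"]))
```

Pairs such as "form"/"from" and "trial"/"trail" collapse to one form each, so a decoder would need context (or extra bits) to pick the right word back out.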
David A. Scott
--
My Crypto code
http://cryptography.org/cgi-bin/crypto.cgi/Misc/scott19u.zip
http://cryptography.org/cgi-bin/crypto.cgi/Misc/scott16u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptographic
system is only as strong as its weakest link"
Pete Fraser Guest
Posted: Wed Sep 17, 2003 4:41 am Post subject: Re: funny information about text files
When I first saw this reported, it came with a perl script to do the
scrambling.
It would be interesting to replace that with a sort.
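The perl script itself is not reproduced in this thread, so the following is only a guess at what such a scrambler does, written in Python, next to the sort that would replace it. `scramble_word` and `sort_word` are hypothetical names, and the seed is fixed only to make runs repeatable.

```python
import random


def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters; the first and last letter stay put."""
    if len(word) < 4:
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def sort_word(word: str) -> str:
    """Same shape as scramble_word, but with a sort instead of a shuffle."""
    if len(word) < 4:
        return word
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


rng = random.Random(2003)  # fixed seed, chosen arbitrarily
for word in ["according", "research", "university", "important"]:
    scrambled = scramble_word(word, rng)
    # Sorting erases whatever order the shuffle produced.
    assert sort_word(scrambled) == sort_word(word)
    print(word, scrambled, sort_word(word))
```

The assertion is the point of swapping the shuffle for a sort: every scrambling of a word, and the word itself, lands on the same sorted form.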
Petrut Guest
Posted: Wed Sep 17, 2003 11:55 pm Post subject: Re: funny information about text files
You should read up on the star-encoding algorithm.
I think that it's a little bit better than that (however, it's an
interesting thing to see how our brain works (or rather "doesn't work")).
It's the same thing in French: the first and the last letter of
a word are enough to make it readable.
bye
petrut