www.GetXFactor.com

funny information about text files
   Science and Technology news... Forum Index -> Compression Forum  
SuperFly
Guest

Posted: Wed Sep 17, 2003 2:39 am    Post subject: funny information about text files

Hi all,

Just read this post in sci.crypt. If a human can parse this back into
normal text, an 'intelligent' context-based algorithm might be able to
do the same. That might be an angle to boost text compression results :)
using an algorithm that keeps the first and last letter of each word and
sorts the letters in between.

*************************************************************************

This is such a marvelous example of English language entropy/character
I must share it with sci.crypt. A friend sent me this today:


-------------------------------------------------------------------------
Subject: Dslyecixs taek haret

Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deson't mttaer
in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht frist and lsat ltteer is at the rghit pclae. The rset can be a
toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae
we do not raed ervey lteetr by it slef but the wrod as a wlohe.
--------------------------------------------------------------------------

Isn't that great? :-)

Native English speaker/readers should have no trouble parsing the text
because of the redundancy of printed English. Only the first and last
letters of each word remain in their expected positions.

John A. Malley

*************************************************************************
David A. Scott
Guest

Posted: Wed Sep 17, 2003 3:18 am    Post subject: Re: funny information about text files

I don't believe the letters are totally random, since I had no trouble
reading it. But if it's true, one could design a compressor that
compresses written text smaller than a normal compressor that likes
correct spelling. If you took a word like "english", you would replace
it with "egilnsh": just keep the first and last letters accurate and
allow the letters in between only in sorted order. This would make the
text much easier to compress, since the strings between the first and
last letters would always be sorted. The drawback is that there are
bound to be several words that become duplicates. But then again, that
already exists in English: take the word "read", which could be
pronounced like the word "red" or the word "reed". Also, for words that
are plurals you might have to leave the last two letters in place, like
the word "letters": in the text above the "rs" was preserved and
"ltteers" was used.
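
A minimal Python sketch of the idea above, under stated assumptions. The names `sort_interior` and `find_collisions`, the `keep_tail` parameter (for the "leave the last two letters in place" variant) and the small word list are all made up for illustration; words are assumed to be plain lower-case ASCII. It only shows the transform and how duplicates arise. It is not a compressor.

```python
from collections import defaultdict


def sort_interior(word: str, keep_tail: int = 1) -> str:
    """Keep the first letter and the last `keep_tail` letters; sort the rest."""
    if keep_tail < 1:
        raise ValueError("keep_tail must be at least 1")
    if len(word) <= keep_tail + 2:
        # Zero or one interior letter: there is nothing to reorder.
        return word
    head, middle, tail = word[0], word[1:-keep_tail], word[-keep_tail:]
    return head + "".join(sorted(middle)) + tail


def find_collisions(words):
    """Group words that map to the same sorted form (the 'duplicates')."""
    groups = defaultdict(list)
    for w in words:
        groups[sort_interior(w)].append(w)
    return {key: ws for key, ws in groups.items() if len(ws) > 1}


words = ["form", "from", "trial", "trail", "read", "english"]
print(sort_interior("english"))               # egilnsh
print(sort_interior("letters", keep_tail=2))  # leettrs
print(find_collisions(words))
```

In this toy list, "form"/"from" and "trial"/"trail" end up with the same sorted form, so a decoder would need context (or extra bits) to tell them apart. How often that happens on a real dictionary is not something this sketch measures.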

David A. Scott
--
My Crypto code
http://cryptography.org/cgi-bin/crypto.cgi/Misc/scott19u.zip
http://cryptography.org/cgi-bin/crypto.cgi/Misc/scott16u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptographic
system is only as strong as its weakest link"
Pete Fraser
Guest

Posted: Wed Sep 17, 2003 4:41 am    Post subject: Re: funny information about text files

When I first saw this reported, it came with a Perl script to do the
scrambling.
It would be interesting to replace that with a sort.
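
A rough Python sketch of that experiment (not the Perl script mentioned above, which is not reproduced here). All names (`scramble_text`, `sort_text`, the `WORD` pattern, the seed value) are assumptions for illustration. Words are taken to be runs of ASCII letters, so a word with an apostrophe is treated as two words.

```python
import random
import re

WORD = re.compile(r"[A-Za-z]+")  # assumption: a "word" is a run of ASCII letters


def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters, leaving the first and last in place."""
    if len(word) <= 3:
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def sort_word(word: str) -> str:
    """Sort the interior letters instead of shuffling them."""
    if len(word) <= 3:
        return word
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


def scramble_text(text: str, seed: int = 0) -> str:
    rng = random.Random(seed)
    return WORD.sub(lambda m: scramble_word(m.group(), rng), text)


def sort_text(text: str) -> str:
    return WORD.sub(lambda m: sort_word(m.group()), text)


original = "The rest can be a total mess and you can still read it without problem."
scrambled = scramble_text(original, seed=2003)

# Every scramble of a text sorts to the same output as the original text,
# so the sorted form works as one canonical spelling per word.
assert sort_text(scrambled) == sort_text(original)
print(scrambled)
print(sort_text(original))
```

The point of the sort is that it is deterministic: the random scramble has many possible outputs per word, while the sorted version has exactly one, which is what would make it a candidate preprocessing step for a compressor.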
Petrut
Guest

Posted: Wed Sep 17, 2003 11:55 pm    Post subject: Re: funny information about text files

You should read up on the star-encoding algorithm.

I think that it's a little bit better than that (however, it's an
interesting thing to see how our brain works (or rather "doesn't work")).

It's the same thing in French: the first and the last letter of
a word are enough to make it readable.

bye

petrut
All times are GMT

 