Thomas Richter Guest
|
Posted: Mon Jul 21, 2008 12:39 pm Post subject: Re: compression method |
|
|
mcjason@gmail.com wrote:
[quote]for who says random can't be compressed...
[/quote]
OK, here's an exercise for you, or rather a question:
Please give a definition of "random". (I'm not asking for more.)
Then, once you have that definition, we can work backwards from there.
So long,
Thomas |
|
| |
|
Guest
|
Posted: Tue Jul 22, 2008 4:02 am Post subject: Re: compression type |
|
|
On Jul 21, 3:52 am, Willem <wil...@stack.nl> wrote:
[quote]mcja...@gmail.com wrote:
) What am I ignoring that's fundamental?
)
) I'm taking the understanding into account that compression works with
) the idea of reducing redundancy...
Compression does not "work with" reducing redundancy.
Compression *IS* reducing redundancy.
Two different names for the same thing.
) so far the only idea I think is repeat occurrences that can be said
) once and explained more often right?
Wrong. Reducing redundancy is not only about finding repeat occurrences.
Perhaps you should first research existing methods of redundancy reduction.
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
[/quote]
Does anybody see what proportion a method like this has for
compression?
I think it might be different than the ideas used for any other ways
of compressing...
Because for "frog toad" and "ford goat" only "f o r d g at" would be
what gets stored. Like, stored in a sphere where each part is
oriented so that a curved line can connect the parts in different
ways. So then the tokens themselves are actually a curved-line
descriptor in math that connects the parts together in the order for
what the token means. So for example with just that stored in the
sphere now there can be a token for "gord, grad, rat, droat, foad,
dorf, dator, fat rat, rat fog" as taking only the size of the token
for more space.
Anybody want to check the math on this one ? |
|
| |
|
Guest
|
Posted: Tue Jul 22, 2008 9:11 am Post subject: Re: compression type |
|
|
On Jul 22, 3:23 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
[quote]mcja...@gmail.com wrote:
Does anybody see what proportion a method like this has for
compression?
Well, why don't you implement it then and test it? My personal
impression is: Not much, because LZ is not so different and works pretty
well - it's not so easy to compete against.
Anybody want to check the math on this one ?
It's not yet in a form where math can be applied in the first place. Only
you can do that - and I doubt many will volunteer. If you want to prove
that your method works, I think it's up to you to provide an implementation.
So long,
Thomas
[/quote]
I'm convinced this way of compressing has a proportional difference to
methods tried...
I even see no reason why random can't be compressed quite well
actually.
say given a random string:
"jht845mnh82hk"
right.. no repeat occurrences significant enough to call tokened with
traditional compression's string repeat occurrence tactic...
however,
"j h t 8 4 5 m n 2 k"
can be stored, and tokens that say:
curve line j - h - t - 8 - 4 - 5 - m - n
curve line 2 - h - k
maybe not how this actually works out to be with the size of how you
say a curved line, but it's not stringent on any limits of how this
can be a proportion that achieves for random a compression ratio
better off. |
|
| |
|
Guest
|
Posted: Tue Jul 22, 2008 10:34 am Post subject: Re: genuine random compressability |
|
|
On Jul 22, 5:11 am, mcja...@gmail.com wrote:
[quote]I even see no reason why random can't be compressed quite well
actually.
[/quote]
I think for sure this method would achieve a good result with
random...
take for example how it can work this way...
say all the file put in a sphere... but just as one block in the
sphere, now you can just say the sphere in the file and a token to the
one block.. ok?
so say always then everything in the file goes in the sphere, and how
the file content is can only be tokens.
so now no compression but no loss to have a pattern token system in
place, get it? so a sphere and 1 token to the data block of the whole
file. same size.
so now to say instead of a big block of data, have it in the sphere
broken apart and the file many tokens.. so now the file is only
getting bigger right? with tokens to inside the sphere?
so with the whole file as a big block in the sphere, you can break the
block up where pieces reorganize to be part of the file, but you can
also get rid of any pieces of the block that show a reorganization
strategy that is already found or make one already there extend.
so a file the same size to start with, but only smaller as you can
keep saying that instead of having something in the file be stored
as a whole block, it's either nothing at all if the blocks already can
have a curved line draw them together, or just the part that extends
how a curved line can say some but not all.
so isn't that like saying that even if among a lot of random data, all
you can find is "qlkzg[p387b" and "nalncqgbzz" then you get a lesser
storage size since
you can just say "q lk z g[p387b nalnc gb" in the sphere, and file
content said as a few lines that connect the blocks together in order.
so now file contents is line, like a curved line for in the sphere, as
tokens.
this isn't edging a condition of token size as thinking the tradeoff
of whether or not it's worth making a token for repeat occurrences, it
isn't like that at all. |
|
| |
|
Guest
|
Posted: Tue Jul 22, 2008 10:54 am Post subject: Re: compression method |
|
|
On Jul 20, 6:27 pm, mcja...@gmail.com wrote:
[quote]On Jul 20, 5:54 pm, mcja...@gmail.com wrote:
On Jul 20, 5:35 pm, mcja...@gmail.com wrote:
On Jul 20, 3:59 pm, Willem <wil...@stack.nl> wrote:
mcja...@gmail.com wrote:
) what would happen if this though...
If you ignore fundamental principles and simple arguments,
then you will either get laughed at or get ignored.
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
What am I ignoring that's fundamental?
I'm taking the understanding into account that compression works with
the idea of reducing redundancy...
so far the only idea I think is repeat occurrences that can be said
once and explained more often right?
what a way to achieve the reduction of information... but I wouldn't
say the only way, it's just said so the way to be about it.
I was trying to not be far from an idea that says different, would
work in idea of thinking about it, and has something else to it when
it comes to what proportion can be achieved in how much information
can be reduced.
now think of a string of text... find the string of text said another
way as a shape somehow, where every word there can be would draw a
different shape. ok ?
now find another string of text, find the shape for it, now find math
that transforms one shape to another, find that in some cases the math
to transform the first shape into the second shape is smaller than the
second shape itself... so say this now, hold the first shape and the
math to transform the shape as the information...
so now in idea it's compression not working for the idea of repeat
occurrences, but for how a shape is math transform in size bigger or
smaller than another shape.
so not like there's any math for the idea, or how even any example
tries to fare, it's just the idea of how it's working to achieve
compression.
see how that's completely different than finding repeat occurrences of
even the same string?
see how it doesn't even depend on how many repeat occurrences can be
found?
so in simplicity of the same proportion I think this idea of
compression would work, I don't get stuck thinking of it anyway...
like... say for every time abc cba or bca is found as part of the file,
you say coordinate in area and a curve where you start at the first
letter and the curve follows through across each letter. so now only
the letters "abc" are in the geometry area, but a token that says the
letters rearranged any way.
so that achieves another way besides repeat occurrences of the same
string.
I think files say a lot better about rearranged patterns than repeat
occurrences.. and _no matter what_ it's doing exactly redundancy
reduction the same as repeat occurrences is too, it definitely says
that _at least_, but could only be better.
This is being different than redundant information if that only says
repeat occurrences, is it not ?
I would think of it working like....
say first of all none of the file for real mixed with tokens, but the
idea like this....
put all of the file in a geometric area, where parts are further or
closer apart.
keep putting it in the geometric area where like if "had been here" is
already in there, it might be broken apart as words or together maybe?
but now putting "here already" in is what, so put the word "already"
like near the word "here".
so now for "had been here" and "here already" you only keep "had been
here already", because the word "here" already found but not like a
repeat occurrence, but like a pattern to find another way.
so it should be like "gunsmith", "muts", "record", "buns", "thrill"
has it so there's maybe in geometric area
g uns mi th muts record b rill
and then for those words a token that has a plot coordinate map said
shorter, like just a curved line to connect the parts in an ordering.
see how this can achieve better? it has no limit the same as finding
repeat occurrences this way.
also, I'd like to add... to think about maybe?
I'd like to find the theoretical compression limit to be said a better
way maybe...
isn't it always possible to write a small software program, that when
run, has software runtime of generating a greater amount of
information?
so not said for any example that could work, but can't it always be a
smaller software program that runs to output more information?
can't a small program be like what goes through a loop of transforming
a string, inside another loop, inside another loop, inside another
loop... like all loops transform the string in some complicated math
bizarre that ends up being the output? like be a few strings being
transformed where the loop is a run-on series of string transform the
way it's the output maybe?
so can't any file like it's compressed just be a small program that
runs how it outputs the file if it runs a way to compute for what
output is?
so can't compression in idea be what goes the way of being the
smallest possible program that can run to generate output? if for
example it's the smallest software program that can run to generate
output? nothing to say that can work, but in idea... ?
isn't it fair to say the smallest program that can be made to run and
generate file output is the best it can be ?
[/quote]
so given the idea of this as what to compress:
"abcdefghijklmnopqrstuvwxyz abstract wizard ward start"
you can store:
"ab cdefghijklmnopqrstuvwxyz stract w izard ar d st t"
or
"ab cdefghijklmnopqrstuvwxyz st ract wiz ar d ward t"
or
"ab cdefghijklmnopqrstuv w xyz st ract iz ar d ard t"
or
"a b cdefghijklmnopqrs t uv w xyz st rct iz ar d rd"
or
"a b c d efghijklmnopqrs t uv w xyz st rct iz ar r"
or
"a b c d efghijklmnopqrs t uv w xyz s rc iz ar r"
or
"a b c d efghijklmnopq r s t uv w xyz s rc iz"
and so on..
each way has connect-the-dots to say the same. |
|
| |
|
Mark Nelson Guest
|
Posted: Tue Jul 22, 2008 11:30 am Post subject: Re: compression method |
|
|
On Jul 21, 2:39 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
[quote]mcja...@gmail.com wrote:
for who says random can't be compressed...
OK, here's an exercise for you, or rather a question:
Please give a definition of "random". (I'm not asking for more.)
Then, once you have that definition, we can work from there backwards.
[/quote]
I think a good definition for the purpose of discussion here is
something like this:
"A random sequence is defined as any sequence which cannot be
generated by a program shorter than itself."
The only catch to this definition is that it is with respect to the
machine on which the program is going to run. Other than that I think
it works very well.
It also stops any discussion of compressing random data dead in its
tracks by defining the problem away.
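As a rough illustration of that definition (not a measurement of program length itself), one can hand both patterned and patternless bytes to an off-the-shelf compressor. The sketch below assumes Python with its standard `zlib` module; the names `compressed_size`, `noise` and `pattern` are made up for this example, and a zlib stream is only an upper bound on how short a description could be.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib stream for `data` at the highest level."""
    return len(zlib.compress(data, 9))


n = 4096
rng = random.Random(0)  # fixed seed so the run is repeatable (an assumption of this sketch)
noise = bytes(rng.getrandbits(8) for _ in range(n))  # pseudo-random bytes
pattern = b"abcdefgh" * (n // 8)                     # highly regular bytes

print("noise  :", n, "->", compressed_size(noise))
print("pattern:", n, "->", compressed_size(pattern))
```

The seeded generator is itself a short program, so by the definition above `noise` is not random at all; zlib simply cannot see that structure. That is the "with respect to the machine" catch showing up in practice.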
|
| Mark Nelson - http://marknelson.us
| |
|
| |
|
Guest
|
Posted: Tue Jul 22, 2008 11:33 am Post subject: Re: compression method |
|
|
On Jul 22, 6:54 am, mcja...@gmail.com wrote:
[quote]each way has connect-the-dots to say the same.
[/quote]
reorganized patterns _not_ recurring patterns.
say "ab gh tu" as stored, then a single token can be for
tughab ghtuab ghabtu abtugh abghtu tuabgh |
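One way to check the "single token" idea: a token that can stand for any of those six orderings still has to say which one is meant. A minimal sketch in Python, assuming the three stored blocks from the post; `ordering_bits` is a hypothetical helper for this example, not part of any proposed format.

```python
import math
from itertools import permutations

blocks = ["ab", "gh", "tu"]

# Every way of stringing the three stored blocks together.
orderings = ["".join(p) for p in permutations(blocks)]


def ordering_bits(k: int) -> int:
    """Whole bits needed to single out one ordering of k distinct blocks."""
    choices = math.factorial(k)
    return (choices - 1).bit_length()


print(orderings)
print("bits to pick one ordering of 3 blocks:", ordering_bits(3))
print("bits to pick one ordering of 10 blocks:", ordering_bits(10))
```

So the token is not free: its size grows with the number of orderings it has to distinguish, roughly like k log k bits for k blocks, before counting the bits that say which blocks are used at all.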
|
| |
|
Guest
|
Posted: Tue Jul 22, 2008 11:47 am Post subject: Re: compression method |
|
|
On Jul 22, 7:33 am, mcja...@gmail.com wrote:
[quote]reorganized patterns _not_ recurring patterns.
say "ab gh tu" as stored, then a single token can be for
tughab ghtuab ghabtu abtugh abghtu tuabgh
[/quote]
"abcdefgh eic feg cad bad gag"
can be as
"a b c d efg h eic feg"
to think even one big curve line but that stored maybe.. |
|
| |
|
Thomas Richter Guest
|
Posted: Tue Jul 22, 2008 12:23 pm Post subject: Re: compression type |
|
|
mcjason@gmail.com wrote:
[quote]Does anybody see what proportion a method like this has for
compression?
[/quote]
Well, why don't you implement it then and test it? My personal
impression is: Not much, because LZ is not so different and works pretty
well - it's not so easy to compete against.
[quote]Anybody want to check the math on this one ?
[/quote]
It's not yet in a form where math can be applied in the first place. Only
you can do that - and I doubt many will volunteer. If you want to prove
that your method works, I think it's up to you to provide an implementation.
So long,
Thomas |
|
| |
|
Jim Leonard Guest
|
Posted: Tue Jul 22, 2008 7:28 pm Post subject: Re: compression type |
|
|
On Jul 22, 4:11 am, mcja...@gmail.com wrote:
[quote]I even see no reason why random can't be compressed quite well
actually.
say given a random string:
"jht845mnh82hk"
right.. no repeat occurrences significant enough to call tokened with
traditional compression's string repeat occurrence tactic...
however,
"j h t 8 4 5 m n 2 k"
can be stored, and tokens that say:
curve line j - h - t - 8 - 4 - 5 - m - n
curve line 2 - h - k
[/quote]
Right. So how would you describe these "curves"? What would the data
look like? Probably some floating-point numbers for vectors/points
and, what, Bézier curves? Floating-point numbers for those too,
right? For sufficient precision in the mantissa and exponent you'd
use IEEE 80-bit floating-point numbers, yes?
Why don't you work this all out? You'd see that the "curve"
definitions would actually take up more space just for themselves than
1. the data they were trying to reconstruct, for small datasets, or 2.
other established methods like LZ77, for larger datasets.
If you want to stick to one dimension, you can convert your "curves"
into "lines" since it doesn't matter if the "lines" are curved or not
as long as they point to the right data. So you'd only use integer
numbers to point to matches. And that's LZ77. |
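To make the "integer pointers to matches" point concrete, here is a minimal sketch of a greedy LZ77-style coder in Python. It is a toy under stated assumptions, not any production format: the triple layout `(offset, length, next_char)`, the `window` default and the function names are choices made for this example only.

```python
def lz77_encode(text: str, window: int = 255) -> list[tuple[int, int, str]]:
    """Greedy LZ77: emit (offset, length, next_char) triples."""
    tokens = []
    i = 0
    while i < len(text):
        best_off, best_len = 0, 0
        # Try every earlier start position inside the window.
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one character early so a literal next_char always exists.
            while (i + length < len(text) - 1
                   and text[j + length] == text[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        tokens.append((best_off, best_len, text[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decode(tokens: list[tuple[int, int, str]]) -> str:
    """Rebuild the text by copying `length` characters from `offset` back."""
    out: list[str] = []
    for off, length, ch in tokens:
        for _ in range(length):
            out.append(out[-off])  # one at a time, so overlapping copies work
        out.append(ch)
    return "".join(out)


sample = "jht845mnh82hk"
tokens = lz77_encode(sample)
print(tokens)
print(lz77_decode(tokens) == sample)
```

On the 13-character string from the quoted post this toy coder emits 11 triples, each carrying two integers and a character, so the "pointers" alone already cost more than the text they describe.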
|
| |
|
Jim Leonard Guest
|
Posted: Tue Jul 22, 2008 7:31 pm Post subject: Re: genuine random compressability |
|
|
On Jul 22, 5:34 am, mcja...@gmail.com wrote:
[quote]you can just say "q lk z g[p387b nalnc gb" in the sphere, and file
content said as a few lines that connect the blocks together in order.
so now file contents is line, like a curved line for in the sphere, as
tokens.
[/quote]
You are completely neglecting the size of the data necessary to
describe these spheres and curves/lines. Saying "put all the file in
a sphere" implies that you are defining a sphere and indicating where
the file goes. That takes up space. It is not free. Try working it
out and you'll see that this arbitrary arrangement takes up more space
than it saves. |
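"Working it out" can be sketched in a few lines of Python. The accounting below is an assumption-laden simplification: it stores each distinct character once (the "sphere"), replaces the text by fixed-width indices into that pool (the "lines"), charges 8 bits per stored character, and ignores every header or length field, which only flatters the scheme. The function names are hypothetical.

```python
def pool_and_pointers(text: str) -> tuple[str, list[int]]:
    """Store each distinct character once, then describe the text as indices."""
    pool = "".join(dict.fromkeys(text))  # distinct characters, first-seen order
    index = {ch: i for i, ch in enumerate(pool)}
    return pool, [index[ch] for ch in text]


def cost_bits(text: str) -> tuple[int, int]:
    """Return (raw bits, pool bits + pointer bits) at 8 bits per character."""
    pool, pointers = pool_and_pointers(text)
    bits_per_pointer = max(1, (len(pool) - 1).bit_length())
    raw = 8 * len(text)
    encoded = 8 * len(pool) + bits_per_pointer * len(pointers)
    return raw, encoded


sample = "jht845mnh82hk"
pool, pointers = pool_and_pointers(sample)
print(pool, pointers)
print(cost_bits(sample))
```

For the string from the earlier post the pool comes out as the same ten characters the poster listed, and the total is 132 bits against 104 bits for the raw text. The pool only pays for itself when the input is long and uses few distinct symbols, which is ordinary redundancy rather than randomness.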
|
| |
|
Thomas Richter Guest
|
Posted: Tue Jul 22, 2008 11:57 pm Post subject: Re: compression method |
|
|
Mark Nelson wrote:
[quote]On Jul 21, 2:39 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
mcja...@gmail.com wrote:
for who says random can't be compressed...
OK, here's an exercise for you, or rather a question:
Please give a definition of "random". (I'm not asking for more.)
Then, once you have that definition, we can work from there backwards.
I think a good definition for the purpose of discussion here is
something like this:
"A random sequence is defined as any sequence which cannot be
generated by a program shorter than itself."
The only catch to this definition is that it is with respect to the
machine on which the program is going to run. Other than that I think
it works very well.
[/quote]
That (namely, Kolmogorov's) is a very good definition indeed, and it
ends the argument.
For the purpose of the thread, I actually considered a simpler one,
namely "data the compressor did not expect" - speaking as a
mathematician, I would say it is a matter of how you place your
quantifiers: The OP claimed:
There is a compressor such that for any data sequence the output of the
compressor for that sequence is shorter than the input.
which is wrong - and what I would call "compression of random data" or
(equivalently) "recursive compression". Correct is, however, only
For any data sequence there exists a compressor such that the output of
the compressor for that sequence is shorter than the input.
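The first statement fails by plain counting, which a short Python sketch can show. It assumes data sequences are bit strings and that a usable compressor must be reversible, so two different inputs may not share an output; `bit_strings` is a helper invented for this example.

```python
from itertools import product


def bit_strings(length: int) -> list[str]:
    """All bit strings of exactly `length` bits."""
    return ["".join(bits) for bits in product("01", repeat=length)]


n = 3
inputs = bit_strings(n)                                  # 2**n candidate inputs
shorter = [s for k in range(n) for s in bit_strings(k)]  # lengths 0 .. n-1

print(len(inputs), "inputs of length", n)
print(len(shorter), "possible strictly shorter outputs")
# More inputs than shorter outputs: at least two inputs would have to share
# an output, and a decompressor could not tell them apart.
```

The same gap appears for every length, since there are 2**n strings of length n but only 2**n - 1 strings that are shorter. The second statement survives because the compressor may be chosen after seeing the sequence.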
[quote]It also stops any discussion of compressing random data dead in its
tracks by defining the problem away.
[/quote]
Well, indeed. Though it would have been nice if the OP had to make up
his mind himself. The question is not quite as trivial as it seems.
So long,
Thomas |
|
| |
|
mcjason Guest
|
Posted: Wed Jul 23, 2008 1:53 am Post subject: Re: genuine random compressability |
|
|
On Jul 22, 3:31 pm, Jim Leonard <MobyGa...@gmail.com> wrote:
[quote]On Jul 22, 5:34 am, mcja...@gmail.com wrote:
You can just say "q lk z g[p387b nalnc gb" in the sphere, and the file
content is said as a few lines that connect the blocks together in order.
So now the file contents are lines, like curved lines in the sphere, used
as tokens.
You are completely neglecting the size of the data necessary to
describe these spheres and curves/lines. Saying "put all the file in
a sphere" implies that you are defining a sphere and indicating where
the file goes. That takes up space. It is not free. Try working it
out and you'll see that this arbitrary arrangement takes up more space
than it saves.
[/quote]
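The objection quoted above can be put into numbers under one simple reading of the sphere idea. This is only a sketch under assumptions that are not stated in the thread: the file is cut into n distinct blocks of equal size, the sphere stores each block once, and the "curve" token does nothing more than select the order in which the blocks are read back. The function names are hypothetical.

```python
import math


def bits_to_pick_an_order(n_blocks: int) -> int:
    """Fewest whole bits that can tell apart all orderings of n distinct blocks.

    There are n! orderings, and distinguishing m outcomes needs
    ceil(log2(m)) bits, computed exactly here as (m - 1).bit_length().
    """
    return (math.factorial(n_blocks) - 1).bit_length()


def raw_cost_bits(n_blocks: int, block_bits: int) -> int:
    """Cost of simply writing the blocks out in file order."""
    return n_blocks * block_bits


def sphere_cost_bits(n_blocks: int, block_bits: int) -> int:
    """Assumed cost model: every block stored once, plus one ordering token."""
    return n_blocks * block_bits + bits_to_pick_an_order(n_blocks)


# Compare the two costs for a few block counts, using 8-bit blocks.
for n in (2, 4, 8, 16):
    raw = raw_cost_bits(n, 8)
    sphere = sphere_cost_bits(n, 8)
    print(f"{n:2d} blocks: raw {raw} bits, sphere + token {sphere} bits")
```

Under these assumptions the token is pure overhead, because writing the blocks in file order already encodes the order for free. A saving can only appear when blocks repeat, and at that point the scheme is an ordinary dictionary coder whose dictionary and references both have to be counted.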
OK, so leaving aside any difference in size, I was thinking it would
be a file of about the same size as the original if it just says
(sphere)TOKEN, that is, if you put all the data into the sphere without
breaking it apart.
So that's the same size. Then everything that makes the size smaller has
to come from cases where something like
ajkl12 and ajlk12 can be said another way, and that other way of saying
it takes less than it takes to say one data block.
So a sphere with something like "(data but not aj k, and 12: aj:k:12:
data)TOKENS"...
So for a lot of data, it's like saying that you only need a
reorganized pattern to turn up a few times somewhere in all of it to come
out ahead no matter what. Right, maybe?

mcjason Guest
Posted: Wed Jul 23, 2008 2:03 am Post subject: Re: genuine random compressability
On Jul 22, 9:53 pm, mcjason <mcja...@gmail.com> wrote:
[quote]On Jul 22, 3:31 pm, Jim Leonard <MobyGa...@gmail.com> wrote:
On Jul 22, 5:34 am, mcja...@gmail.com wrote:
You can just say "q lk z g[p387b nalnc gb" in the sphere, and the file
content is said as a few lines that connect the blocks together in order.
So now the file contents are lines, like curved lines in the sphere, used
as tokens.
You are completely neglecting the size of the data necessary to
describe these spheres and curves/lines. Saying "put all the file in
a sphere" implies that you are defining a sphere and indicating where
the file goes. That takes up space. It is not free. Try working it
out and you'll see that this arbitrary arrangement takes up more space
than it saves.
OK, so leaving aside any difference in size, I was thinking it would
be a file of about the same size as the original if it just says
(sphere)TOKEN, that is, if you put all the data into the sphere without
breaking it apart.
So that's the same size. Then everything that makes the size smaller has
to come from cases where something like
ajkl12 and ajlk12 can be said another way, and that other way of saying
it takes less than it takes to say one data block.
So a sphere with something like "(data but not aj k, and 12: aj:k:12:
data)TOKENS"...
So for a lot of data, it's like saying that you only need a
reorganized pattern to turn up a few times somewhere in all of it to come
out ahead no matter what. Right, maybe?
[/quote]
I mean to say:
Isn't it like saying that, no matter what the size of the data is, the
way to just say it without getting any smaller is
(sphere)TOKEN, but the way to say it smaller is to find, _at whatever
size it is_, only a few patterns that reorganize?
It's neat how the size of the curves and how you work the sphere area
matter, but the tradeoff is mostly of another kind.
Right: lines or curves or something, where a token plots an
arrangement.

mcjason Guest
Posted: Wed Jul 23, 2008 2:06 am Post subject: Re: genuine random compressability
On Jul 22, 9:53 pm, mcjason <mcja...@gmail.com> wrote:
[quote]On Jul 22, 3:31 pm, Jim Leonard <MobyGa...@gmail.com> wrote:
On Jul 22, 5:34 am, mcja...@gmail.com wrote:
You can just say "q lk z g[p387b nalnc gb" in the sphere, and the file
content is said as a few lines that connect the blocks together in order.
So now the file contents are lines, like curved lines in the sphere, used
as tokens.
You are completely neglecting the size of the data necessary to
describe these spheres and curves/lines. Saying "put all the file in
a sphere" implies that you are defining a sphere and indicating where
the file goes. That takes up space. It is not free. Try working it
out and you'll see that this arbitrary arrangement takes up more space
than it saves.
OK, so leaving aside any difference in size, I was thinking it would
be a file of about the same size as the original if it just says
(sphere)TOKEN, that is, if you put all the data into the sphere without
breaking it apart.
So that's the same size. Then everything that makes the size smaller has
to come from cases where something like
ajkl12 and ajlk12 can be said another way, and that other way of saying
it takes less than it takes to say one data block.
So a sphere with something like "(data but not aj k, and 12: aj:k:12:
data)TOKENS"...
So for a lot of data, it's like saying that you only need a
reorganized pattern to turn up a few times somewhere in all of it to come
out ahead no matter what. Right, maybe?
[/quote]
Does anybody know how to explain that tradeoff better than I can?