www.GetXFactor.com

Leading Technology, Science,
Agriculture News and information


Part of the Identityscape.com network...

getxfactor.com jmoodmusic.com smartbusinesschoices.com mintdepot.com lowfaresalways.com evangelicalview.com shoppingpodder.com soproudlywehail.com webnews.ws currenthumor.com

 

 

The fundamental problems in data compression
   Science and Technology news... Forum Index -> Compression Forum  
View previous topic :: View next topic  
Author Message
Andrey I. Savov
Guest






PostPosted: Sun Jul 20, 2003 2:30 am    Post subject: The fundamental problems in data compression Reply with quote

I was trying to understand what are the *fundamental* problems data
compression needs to solve, so I came up with these three and a little
synopsis about their solutions. Have I missed a fundamental problem that
cannot be categorized among these or combination thereof?

1) Create the best model for the data.

Intractable if you only know the data. However, in most cases, source of
the data is at least partially known and allows approximations.

2) Map data model to probability model.

May be considered part of (1), although purposes are different. Usually
probabilities are assigned
to model elements, based on their occurrence in some/all conforming data
with or withour regard of context. May be performed dynamically per input,
or statically -- on the model.

3) Create shortest average length encoding for conforming data.

Solutions here vary drastically. Examples are entropy coders, MTF,
positional/permutation order coding, etc.
Back to top
Mark Nelson
Guest






PostPosted: Thu Jul 24, 2003 7:17 am    Post subject: Re: The fundamental problems in data compression Reply with quote

[quote]"Andrey I. Savov" <savov@hotmail.com> wrote in message

1) Create the best model for the data.
[/quote]
A lot of compression techniques, both lossless and lossy,
use some sort of transform to to manipulate the data
into an easier-to-compress format. Does this count as
modeling? I>m not sure - it seems like it could be.

--
|
| Mark Nelson - markn@ieee.org - http://MarkNelson.us
| The Ultimate Compression Resource - http://www.datacompression.info
| Sponsored by Xceedsoft.com - Serious About Components
|
"Andrey I. Savov" <savov@hotmail.com> wrote in message
news:oMiSa.4274$ff.3523@fed1read01...
[quote]I was trying to understand what are the *fundamental* problems data
compression needs to solve, so I came up with these three and a little
synopsis about their solutions. Have I missed a fundamental problem that
cannot be categorized among these or combination thereof?

1) Create the best model for the data.

Intractable if you only know the data. However, in most cases, source of
the data is at least partially known and allows approximations.

2) Map data model to probability model.

May be considered part of (1), although purposes are different. Usually
probabilities are assigned
to model elements, based on their occurrence in some/all conforming data
with or withour regard of context. May be performed dynamically per
input,
or statically -- on the model.

3) Create shortest average length encoding for conforming data.

Solutions here vary drastically. Examples are entropy coders, MTF,
positional/permutation order coding, etc.





[/quote]
Back to top
Andrey I. Savov
Guest






PostPosted: Thu Jul 24, 2003 8:13 am    Post subject: Re: The fundamental problems in data compression Reply with quote

"Mark Nelson" <markn@ieee.org> wrote in message
news:nlHTa.118331$sY2.54034@rwcrnsc51.ops.asp.att.net...
[quote]"Andrey I. Savov" <savov@hotmail.com> wrote in message

1) Create the best model for the data.

A lot of compression techniques, both lossless and lossy,
use some sort of transform to to manipulate the data
into an easier-to-compress format. Does this count as
modeling? I>m not sure - it seems like it could be.
[/quote]
There is a distinction I make between *data* (1) modeling and *probability*
(2) modeling which is based purely on the purpose for the activity.

In your case, if you refer to BWT or DCT they would qualify as data
modeling, because these transforms by themselves do not assign probabilities
to the symbols or the matrix elements. Processes external to the transforms
do.
Back to top
Andrey I. Savov
Guest






PostPosted: Thu Jul 24, 2003 8:15 am    Post subject: Re: The fundamental problems in data compression Reply with quote

"Stuart Caie" <kyzer@4u.net> wrote in message
news:3f1a5721$0$965$cc9e4d1f@news.dial.pipex.com...
[quote]Andrey I. Savov wrote:
I was trying to understand what are the *fundamental* problems data
compression needs to solve, so I came up with these three and a little
synopsis about their solutions. Have I missed a fundamental problem
that
cannot be categorized among these or combination thereof?

Reversibility? It must be possible to decompress your compressed data.
Either all relevant information can be included in the compressed data
itself, or it can be implicitly known both by the compressor and
decompressor. Breaking this rule is how a number of compressors claim they
can compress "any" file.
[/quote]
Reversibility in itself is not a problem, but should be an attribute
(property) of all solutions. I agree that it is a rather important one.
Back to top
Dale King
Guest






PostPosted: Fri Jul 25, 2003 9:20 am    Post subject: Re: The fundamental problems in data compression Reply with quote

In article <oMiSa.4274$ff.3523@fed1read01>, savov@hotmail.com says...
[quote]I was trying to understand what are the *fundamental* problems data
compression needs to solve, so I came up with these three and a little
synopsis about their solutions. Have I missed a fundamental problem that
cannot be categorized among these or combination thereof?

1) Create the best model for the data.

Intractable if you only know the data. However, in most cases, source of
the data is at least partially known and allows approximations.

2) Map data model to probability model.

May be considered part of (1), although purposes are different. Usually
probabilities are assigned
to model elements, based on their occurrence in some/all conforming data
with or withour regard of context. May be performed dynamically per input,
or statically -- on the model.

3) Create shortest average length encoding for conforming data.

Solutions here vary drastically. Examples are entropy coders, MTF,
positional/permutation order coding, etc.
[/quote]
#3 is trivial. If you know the probability of the message then you know
the optimal length encoding for it (that was of course what Shannon>s
work was all about). The issue is that we don>t know the probability #2,
mainly because we don>t know the model #1.

--
Dale King
Back to top
Sachin Garg
Guest






PostPosted: Sat Jul 26, 2003 4:07 am    Post subject: Re: The fundamental problems in data compression Reply with quote

Dale King <kingd@tmicha.net> wrote in message news:<MPG.1987db91afe74e9c989702@netnews.insightBB.com>...
[quote]In article <oMiSa.4274$ff.3523@fed1read01>, savov@hotmail.com says...
I was trying to understand what are the *fundamental* problems data
compression needs to solve, so I came up with these three and a little
synopsis about their solutions. Have I missed a fundamental problem that
cannot be categorized among these or combination thereof?

1) Create the best model for the data.

Intractable if you only know the data. However, in most cases, source of
the data is at least partially known and allows approximations.

2) Map data model to probability model.

May be considered part of (1), although purposes are different. Usually
probabilities are assigned
to model elements, based on their occurrence in some/all conforming data
with or withour regard of context. May be performed dynamically per input,
or statically -- on the model.

3) Create shortest average length encoding for conforming data.

Solutions here vary drastically. Examples are entropy coders, MTF,
positional/permutation order coding, etc.

#3 is trivial. If you know the probability of the message then you know
the optimal length encoding for it (that was of course what Shannon>s
work was all about). The issue is that we don>t know the probability #2,
mainly because we don>t know the model #1.
[/quote]
But "knowing" the optimal length is different form "creating" optimal
length bit stream...

Of course, the *real* work is in #1 and #2 (thats because we already
have #3)
but both #1 and #2 will be useless without #3.

So, I would rather vote for #3 to be an "important" fundamental
problem.

Sachin Garg [India]
http://sachingarg.go.to
http://sachingarg.cjb.net
Back to top
Dr Chaos
Guest






PostPosted: Sat Jul 26, 2003 10:07 pm    Post subject: Re: The fundamental problems in data compression Reply with quote

On 25 Jul 2003 16:07:09 -0700, Sachin Garg <schngrg@yahoo.com> wrote:
[quote]Dale King <kingd@tmicha.net> wrote in message news:<MPG.1987db91afe74e9c989702@netnews.insightBB.com>...
In article <oMiSa.4274$ff.3523@fed1read01>, savov@hotmail.com says...
I was trying to understand what are the *fundamental* problems data
compression needs to solve, so I came up with these three and a little
synopsis about their solutions. Have I missed a fundamental problem that
cannot be categorized among these or combination thereof?

1) Create the best model for the data.

Intractable if you only know the data. However, in most cases, source of
the data is at least partially known and allows approximations.

2) Map data model to probability model.

May be considered part of (1), although purposes are different. Usually
probabilities are assigned
to model elements, based on their occurrence in some/all conforming data
with or withour regard of context. May be performed dynamically per input,
or statically -- on the model.

3) Create shortest average length encoding for conforming data.

Solutions here vary drastically. Examples are entropy coders, MTF,
positional/permutation order coding, etc.

#3 is trivial. If you know the probability of the message then you know
the optimal length encoding for it (that was of course what Shannon>s
work was all about). The issue is that we don>t know the probability #2,
mainly because we don>t know the model #1.

But "knowing" the optimal length is different form "creating" optimal
length bit stream...
[/quote]
If you have the probability model, then in theory, arithmetic coding
comes almost close enough to count as a fully solved problem
for creating the shortest length.
Back to top
Andrey I. Savov
Guest






PostPosted: Thu Jul 31, 2003 7:35 am    Post subject: Re: The fundamental problems in data compression Reply with quote

"Dale King" <kingd@tmicha.net> wrote in message
news:MPG.1987db91afe74e9c989702@netnews.insightBB.com...
[quote]In article <oMiSa.4274$ff.3523@fed1read01>, savov@hotmail.com says...

3) Create shortest average length encoding for conforming data.

Solutions here vary drastically. Examples are entropy coders, MTF,
positional/permutation order coding, etc.

#3 is trivial. If you know the probability of the message then you know
the optimal length encoding for it (that was of course what Shannon>s
work was all about). The issue is that we don>t know the probability #2,
mainly because we don>t know the model #1.
[/quote]
I am pretty sure Shannon, if he was alive, would disagree with your
classification of his lifework as "trivial". :-) Largely solved, yes;
uninteresting, maybe, but trivial -- no. There are so many solutions to
that same problem, that it is quite clear to me that it is a fundamental
one. Also it is one of the first fundamental problems to be solved.

To account for your line of thinking, I could add "within certain
constraints (time, space, etc.)." to the formulation of that problem. That
ought to make it more non-"trivial", eh?

While on the subject of triviality, the thing that bothers me in this
classification is whether #2 would be trivial if you actually solved #1.
Anyone?
Back to top
Dale King
Guest






PostPosted: Mon Aug 04, 2003 9:39 am    Post subject: Re: The fundamental problems in data compression Reply with quote

In article <0g%Va.17176$ff.16943@fed1read01>, savov@hotmail.com says...
[quote]"Dale King" <kingd@tmicha.net> wrote in message
news:MPG.1987db91afe74e9c989702@netnews.insightBB.com...
In article <oMiSa.4274$ff.3523@fed1read01>, savov@hotmail.com says...

3) Create shortest average length encoding for conforming data.

Solutions here vary drastically. Examples are entropy coders, MTF,
positional/permutation order coding, etc.

#3 is trivial. If you know the probability of the message then you know
the optimal length encoding for it (that was of course what Shannon>s
work was all about). The issue is that we don>t know the probability #2,
mainly because we don>t know the model #1.

I am pretty sure Shannon, if he was alive, would disagree with your
classification of his lifework as "trivial". :-) Largely solved, yes;
uninteresting, maybe, but trivial -- no.
[/quote]
I certainly was not saying that his life work was trivial or that
compression is trivial. The way that the original poster posed the
question, his #3 was trivial. If you know the probability of each message
then you know the length for each output. That is what Shannon tells us.

[quote]There are so many solutions to
that same problem, that it is quite clear to me that it is a fundamental
one. Also it is one of the first fundamental problems to be solved.
[/quote]
The issue is with his #1 and #2, which are not trivial and not even
really solvable to any great degree of accuracy. But as he framed the
problem where #1 & #2 are the problems to be solved, #3 is already solved
if you solve #1 and #2.

I don>t agree with his approach to thinking of the problem. Things like
BWT certainly did not follow this approach.

--
Dale King
Back to top
Andrey I. Savov
Guest






PostPosted: Mon Aug 04, 2003 9:27 pm    Post subject: Re: The fundamental problems in data compression Reply with quote

"Dale King" <kingd@tmicha.net> wrote in message
news:MPG.199796b11dca770c989713@netnews.insightBB.com...

[quote]I am pretty sure Shannon, if he was alive, would disagree with your
classification of his lifework as "trivial". :-) Largely solved, yes;
uninteresting, maybe, but trivial -- no.

I certainly was not saying that his life work was trivial or that
compression is trivial. The way that the original poster posed the
question, his #3 was trivial. If you know the probability of each message
then you know the length for each output. That is what Shannon tells us.
[/quote]
The original poster was me. If you know the probability of each message,
you still have the option of encoding the message in quite a number of ways
depending on constraints, which does affect *average length*. Second, the
fact that there are many existing solutions to this problem, doesn>t make it
less fundamental or trivial. I really don>t understand why does it look
trivial to you.

[quote]The issue is with his #1 and #2, which are not trivial and not even
really solvable to any great degree of accuracy. But as he framed the
problem where #1 & #2 are the problems to be solved, #3 is already solved
if you solve #1 and #2.
[/quote]
Are you saying that if you solved #1 and #2, then #3 is no longer
fundamental (since its "trivial")? Just because you have a probability
model, doesn>t mean that selecting encoding is trivial, particularly because
execution time and average length are (greatly) affected by this selection.

Clearly, we must better define terms here. My definition of "fundamental"
is "applicable to all" and "trivial" meaning "of no importance". By that
definition, #3 is fundamental and certainly not trivial.

[quote]I don>t agree with his approach to thinking of the problem. Things like
BWT certainly did not follow this approach.
[/quote]
Perhabs then you could offer a better answer to my original question: What
are the fundamental problems all data compression algorithms need to solve,
both at design and run time? Please use the definition above or any
reasonable one you may suggest.

I also never meant to say that all algorithm designers thought of the
problem as composed of these 3 parts. I was merely trying to frame the
major obstacles to creating a new or even slightly different algorithm.
Also, when you have decomposition of problems then you can compare solutions
more effectively. BWT, the transform itself, is clearly a solution for #1
to me. MTF is a solution to #2.
Back to top
D.A.Kopf
Guest






PostPosted: Mon Aug 04, 2003 11:45 pm    Post subject: Re: The fundamental problems in data compression Reply with quote

"Andrea I. Savor" wrote:
[quote]
Perhabs then you could offer a better answer to my original question: What
are the fundamental problems all data compression algorithms need to solve,
both at design and run time? Please use the definition above or any
reasonable one you may suggest.
[/quote]
It seems to me that entropy coding is a side track to the main issue
of compression, that of modeling. I don>t know if there are papers
equivalent to Shannon>s on the subject of modeling, but I somehow
feel that the "perfect" model would produce "random" residuals for
which entropy coding would have no effect.

If correct, an important corollary would of course be that any genuine
bijective method of compression would need to take the content of each
message into account ;)
Back to top
Display posts from previous:   
   Science and Technology news... Forum Index -> Compression Forum  
Page 1 of 1
All times are GMT

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum