www.GetXFactor.com

Leading Technology, Science,
Agriculture News and information


Part of the Identityscape.com network...

getxfactor.com jmoodmusic.com smartbusinesschoices.com mintdepot.com lowfaresalways.com evangelicalview.com shoppingpodder.com soproudlywehail.com webnews.ws currenthumor.com

 

 

Trying to determine compression algorithm. Probably LZ varia
   Science and Technology news... Forum Index -> Compression Forum  
View previous topic :: View next topic  
Author Message
Venatyr
Guest






PostPosted: Wed Aug 27, 2003 3:00 am    Post subject: Trying to determine compression algorithm. Probably LZ varia Reply with quote

Hello,

This is my first foray into compression and I was wondering if somebody could provide some help.

I have two files. One file is the original and the other is compressed with an unknown algorithm
(Probably an LZ variant).

The files are available to be downloaded from this forum thread:

http://talus.hopto.org/bored/index.php?act=ST&f=13&t=106


Thanks,

Venatyr
Back to top
Stuart Caie
Guest






PostPosted: Wed Aug 27, 2003 4:53 am    Post subject: Re: Trying to determine compression algorithm. Probably LZ v Reply with quote

Venatyr wrote:
[quote]Hello,

This is my first foray into compression and I was wondering if somebody could provide some help.

I have two files. One file is the original and the other is compressed with an unknown algorithm
(Probably an LZ variant).

The files are available to be downloaded from this forum thread:

http://talus.hopto.org/bored/index.php?act=ST&f=13&t=106
[/quote]
It>s LZSS.

It>s a stream of blocks, where each block starts with a control byte. This
byte tells us what the next 8 elements will be, through whether its bits are
set or not. The order the bits are tested are from LSB to MSB. If a bit is
set, the element is a match (2 bytes). If not, it>s a literal (1 byte).

The literal is just that, a single literal byte.

The match is encoded like so:
- first byte is the distance to go back in the stream (in bytes)
to be at the start of the match.
- second byte is the length of the match, minus 2,
shifted left by 4 bits.

Distance can be 255 at most. Length can be 17 at most.

Here>s a hand decoding of the start of your japanese compressed file:

control=00 (LLLLLLLL: all literals)
83 65 83 43 83 8B 83 74
control=00 (LLLLLLLL: all literals)
83 42 83 93 83 4F 8E 63
control=00 (LLLLLLLL: all literals)
82 E8 82 51 89 F1 0D 0A
control=0B (MMLMLLLL: two matches, literal, match, 4 literals)
18 F0: match distance=24, length=17
18 00: match distance=24, length=2
50
18 F0: match distance=24, length=17
4F 8F C1 96
control=02 (LMLLLLLL)
C5
14 00: match distance=20, length=2
8D 55 8C 82 28 41
control=02 (LMLLLLLL)
29
1D 10: match distance=29, length=3
51 81 5B 83 80 82

Let us know once you have a decoder that gets this far and post it. You may
find the matches come out wrong once you reach a certain point in the
stream. I don>t know where this is, because it depends on what size the
sliding window is. I also may be wrong about match offsets and lengths, as I
only bothered decoding as far as you see above.

Regards
Stuart
Back to top
Venatyr
Guest






PostPosted: Wed Aug 27, 2003 9:04 pm    Post subject: Re: Trying to determine compression algorithm. Probably LZ v Reply with quote

Stuart Caie <kyzer@4u.net> wrote in
news:3f4bf2f7$0$251$cc9e4d1f@news.dial.pipex.com:

[quote]Venatyr wrote:
Hello,

This is my first foray into compression and I was wondering if
somebody could provide some help.

I have two files. One file is the original and the other is
compressed with an unknown algorithm (Probably an LZ variant).

The files are available to be downloaded from this forum thread:

http://talus.hopto.org/bored/index.php?act=ST&f=13&t=106

It>s LZSS.

It>s a stream of blocks, where each block starts with a control byte.
This byte tells us what the next 8 elements will be, through whether
its bits are set or not. The order the bits are tested are from LSB to
MSB. If a bit is set, the element is a match (2 bytes). If not, it>s a
literal (1 byte).

The literal is just that, a single literal byte.

The match is encoded like so:
- first byte is the distance to go back in the stream (in bytes)
to be at the start of the match.
- second byte is the length of the match, minus 2,
shifted left by 4 bits.

Distance can be 255 at most. Length can be 17 at most.

Here>s a hand decoding of the start of your japanese compressed file:

control=00 (LLLLLLLL: all literals)
83 65 83 43 83 8B 83 74
control=00 (LLLLLLLL: all literals)
83 42 83 93 83 4F 8E 63
control=00 (LLLLLLLL: all literals)
82 E8 82 51 89 F1 0D 0A
control=0B (MMLMLLLL: two matches, literal, match, 4 literals)
18 F0: match distance=24, length=17
18 00: match distance=24, length=2
50
18 F0: match distance=24, length=17
4F 8F C1 96
control=02 (LMLLLLLL)
C5
14 00: match distance=20, length=2
8D 55 8C 82 28 41
control=02 (LMLLLLLL)
29
1D 10: match distance=29, length=3
51 81 5B 83 80 82

Let us know once you have a decoder that gets this far and post it.
You may find the matches come out wrong once you reach a certain point
in the stream. I don>t know where this is, because it depends on what
size the sliding window is. I also may be wrong about match offsets
and lengths, as I only bothered decoding as far as you see above.

Regards
Stuart


[/quote]
Great. Thanks for you help.

After researching different LZ variants, I was thinking to myself that it
was LZSS.

--
Venatyr
Back to top
Venatyr
Guest






PostPosted: Fri Aug 29, 2003 8:32 pm    Post subject: Re: Trying to determine compression algorithm. Probably LZ v Reply with quote

Venatyr <darkfire7@yourmojo.com> wrote in
news:Xns93E46683465A1darkfire77starlok77@24.71.223.159:

[quote]Stuart Caie <kyzer@4u.net> wrote in
news:3f4bf2f7$0$251$cc9e4d1f@news.dial.pipex.com:

Venatyr wrote:
Hello,

This is my first foray into compression and I was wondering if
somebody could provide some help.

I have two files. One file is the original and the other is
compressed with an unknown algorithm (Probably an LZ variant).

The files are available to be downloaded from this forum thread:

http://talus.hopto.org/bored/index.php?act=ST&f=13&t=106

It>s LZSS.

It>s a stream of blocks, where each block starts with a control byte.
This byte tells us what the next 8 elements will be, through whether
its bits are set or not. The order the bits are tested are from LSB
to MSB. If a bit is set, the element is a match (2 bytes). If not,
it>s a literal (1 byte).

The literal is just that, a single literal byte.

The match is encoded like so:
- first byte is the distance to go back in the stream (in bytes)
to be at the start of the match.
- second byte is the length of the match, minus 2,
shifted left by 4 bits.

Distance can be 255 at most. Length can be 17 at most.

Here>s a hand decoding of the start of your japanese compressed file:

control=00 (LLLLLLLL: all literals)
83 65 83 43 83 8B 83 74
control=00 (LLLLLLLL: all literals)
83 42 83 93 83 4F 8E 63
control=00 (LLLLLLLL: all literals)
82 E8 82 51 89 F1 0D 0A
control=0B (MMLMLLLL: two matches, literal, match, 4 literals)
18 F0: match distance=24, length=17
18 00: match distance=24, length=2
50
18 F0: match distance=24, length=17
4F 8F C1 96
control=02 (LMLLLLLL)
C5
14 00: match distance=20, length=2
8D 55 8C 82 28 41
control=02 (LMLLLLLL)
29
1D 10: match distance=29, length=3
51 81 5B 83 80 82

Let us know once you have a decoder that gets this far and post it.
You may find the matches come out wrong once you reach a certain
point in the stream. I don>t know where this is, because it depends
on what size the sliding window is. I also may be wrong about match
offsets and lengths, as I only bothered decoding as far as you see
above.

Regards
Stuart



Great. Thanks for you help.

After researching different LZ variants, I was thinking to myself that
it was LZSS.

--
Venatyr

[/quote]
I managed to decompress it properly.

It is using 12bits for distance and 4bits for length.

Thanks for your help Stuart.

--
Venatyr
Back to top
Display posts from previous:   
   Science and Technology news... Forum Index -> Compression Forum  
Page 1 of 1
All times are GMT

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum