A few days ago, Hugh managed to recover a bunch of files from a ZIP disk. Among them was one special archive file: VAX.ARC. So close, yet so far. Despite all the effort Hugh put into tracking down, compiling and running the ARC 5.21p tool, the file stubbornly withstood every extraction attempt. After a closer look at the ARC sources, there was only one reasonable conclusion: the file is encrypted.

Considering the colourful past of the ARC compression tools, there was one other slim chance: a compatibility problem between ARC 5.21p and the version that was used to create the archive. Digging deep into my old archive CDs turned up a handful of tools with support for the ARC format: arca, arce, pkarc, arc - DOS programs from the late 80s or early 90s. Firing up DOSBOX and ... you guessed it. All of them show the same CRC errors as ARC 5.21p. A little disappointing, but certainly not unexpected.

Ok, the password then. A closer look at the ARC sources reveals that we are facing a repeating XOR encryption: the password is XOR'ed over the data again and again. Nothing fancy in itself, but the encryption is applied to the packed data. This makes everything a lot harder, because packed data looks nearly random, which largely rules out frequency analysis.
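To make "repeating XOR" concrete, here is a minimal sketch in Python. It is not taken from the ARC sources; the function name xor_crypt and the sample bytes are made up for illustration. It only shows the general technique of cycling a password over a byte stream.

```python
from itertools import cycle


def xor_crypt(data: bytes, password: bytes) -> bytes:
    """XOR every data byte with the password, repeating the password as needed."""
    if not password:
        raise ValueError("password must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(password)))


# Hypothetical sample: the same call encrypts and decrypts,
# because (b ^ k) ^ k == b for every byte.
packed = b"\x0c\x00\x00\x41\x00\x00"
secret = xor_crypt(packed, b"KEY")
assert xor_crypt(secret, b"KEY") == packed
```

Note that a run of zero bytes in the input turns into the password itself in the output. That is harmless for packed data, but it becomes interesting as soon as unpacked data is encrypted this way.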

The only obvious way open to us seems to be brute-forcing the password. But even with such a simple algorithm we are probably talking about a substantial time frame. Assuming a password length of at least 7 or 8 chars, and given that every candidate password has to be verified through the CRC calculation, we are talking about days if not weeks in the worst case. On top of that, Hugh found some serious inconsistencies, aka bugs, in ARC 5.21 when he added a brute-force option. We can probably reduce the key space a little: an early documentation of ARC suggests that all passwords are converted to uppercase. And since the user had to type the password on the command line, and we are talking about the late 80s, we can assume a password consisting of alphanumeric chars. Still, not exactly a giant leap forward.
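To get a feeling for the numbers, here is a quick back-of-the-envelope calculation. It assumes exactly the reduced alphabet described above (A-Z plus 0-9, so 36 symbols); the helper name keyspace is made up for this sketch.

```python
ALPHABET_SIZE = 26 + 10  # assumption: uppercase letters and digits only


def keyspace(length: int, alphabet: int = ALPHABET_SIZE) -> int:
    """Number of candidate passwords of exactly `length` chars."""
    return alphabet ** length


for n in (7, 8):
    print(n, keyspace(n))
# 7 78364164096
# 8 2821109907456
```

Even with the reduced alphabet, an 8 char password leaves roughly 2.8 trillion candidates, and each one needs a CRC check.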

Sigh, and it looked so promising:

Name            Method             Packed  Original  Date       Time   CRC
alice.tar       9 (Squashed)       474673   1421312  1991-06-23 16:01  4AD7
animator.tar    9 (Squashed)      1013114   2396160  1991-06-23 21:33  69CE
as68.tar        9 (Squashed)       226903    675840  1991-06-23 21:46  C5D8
asasin.tar      9 (Squashed)       506893   1157120  1991-06-23 18:38  F320
dvi2ps.gfx      9 (Squashed)       136691    266240  1991-06-23 21:51  60D5
emu.tar         9 (Squashed)       924620   2211840  1991-06-23 22:33  3CDA
fish.tar        9 (Squashed)       530010   1198080  1991-06-23 18:03  1198
fred23jr.tar    9 (Squashed)        42197    112640  1991-06-23 13:47  A1F6
gfi.tar         9 (Squashed)       149121    296960  1991-06-23 22:38  F339
kermit.tar      4 (Squeezed)       324504    358400  1991-06-23 22:45  45A7
lnk.tar         9 (Squashed)       262342    921600  1991-06-23 22:59  8B24
miscc.tar       9 (Squashed)        31894     81920  1991-06-23 23:01  2633
patch.tar       9 (Squashed)        75587    163840  1991-06-23 23:04  9AB0
rcs.tar         9 (Squashed)       216397    532480  1991-06-23 23:13  7FD8
sps2.tar        9 (Squashed)        76224    188416  1991-06-23 23:16  1585
undump.tar      9 (Squashed)        23259     61440  1991-06-23 23:17  B60C
windows.rcs     2 (Stored)        2593536   2593536  1991-05-13 20:07  DA28

But wait, there is one file in the archive that is particularly special:

windows.rcs     2 (Stored)        2593536   2593536  1991-05-13 20:07  DA28

It is stored, not packed. That means the original data is XOR'ed directly with the password, with no compression in between. We might gain a foothold! To begin with, we extract the raw, still encrypted data of this entry from the archive. Now we have a WINDOWS.RCS, although it is still encrypted.
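One possible way to pull the raw bytes of a single entry out of the archive is sketched below. It is not the tool we used. It assumes the commonly documented ARC entry layout: a 0x1A marker, a method byte, a 13 byte zero-terminated name, the packed size, date, time, CRC-16 and the original size, all little-endian. The old method 1 entries with a shorter header are not handled, and read_member is a hypothetical helper name.

```python
import struct

# Assumed header fields after marker and method byte:
# name[13], packed size, date, time, CRC-16, original size
HEADER = struct.Struct("<13sIHHHI")


def read_member(archive: bytes, wanted: str) -> bytes:
    """Return the raw (possibly still encrypted) data of one archive entry."""
    pos = 0
    while pos + 2 <= len(archive):
        marker, method = archive[pos], archive[pos + 1]
        if marker != 0x1A:
            raise ValueError(f"no entry marker at offset {pos}")
        if method == 0:  # method 0 marks the end of the archive
            break
        name_raw, packed_size, _date, _time, _crc, _orig = HEADER.unpack_from(archive, pos + 2)
        name = name_raw.split(b"\0", 1)[0].decode("ascii", "replace")
        start = pos + 2 + HEADER.size
        if name.lower() == wanted.lower():
            return archive[start:start + packed_size]
        pos = start + packed_size  # skip the data and go to the next header
    raise KeyError(wanted)
```

With the archive read into memory, something like read_member(open("VAX.ARC", "rb").read(), "windows.rcs") would hand back the 2593536 encrypted bytes, assuming the layout above holds for this file.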

Next we need to go for the password. That means we need the password length and the actual password chars. Time for a little statistical probing. For each possible password length we shift our raw data by that length and XOR the shifted copy with the original raw data.

49 D4 D4 6A A5 81 DE BD D4 84 DF 74
   49 D4 D4 6A A5 81 DE BD D4 84 DF 74
      49 D4 D4 6A A5 81 DE BD D4 84 DF 74

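Why would we do that? XOR'ing a value with itself results in zero: a xor a = 0. If the shift is a multiple of the password length, the two bytes we XOR were encrypted with the same password char, so the password cancels out and we are left with the XOR of the two original bytes. That is zero whenever the original bytes are equal, which happens much more often in real data than by pure chance. For each shift we count the number of resulting zeroes. An unusually high number of zeroes means we have hit a potential password length - or a multiple of it. Counting for passwords up to 16 chars, we get the following numbers: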

   1:   17972
   2:   17971
   3:   17981
   4:   18416
   5:   17968
   6:   18136
   7:   17966
   8:   24309
   9:   18019
  10:   18039
  11:   17994
  12:   18655
  13:   18023
  14:   18148
  15:   18023
  16:   24343

See what we got: the numbers for 8 chars and 16 chars are significantly higher than all the others. Chances are high that our password has a length of 8 or 16 chars. Since 16 is just a multiple of 8, and a true 16 char password would not explain the peak at 8, we assume an 8 char password.
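The counting itself is only a few lines. The following is a sketch of the idea, not the program that produced the numbers above; count_matches and probe_lengths are names chosen for this example. Counting positions where a byte equals the byte `shift` positions later is the same as counting zeroes after the XOR.

```python
def count_matches(data: bytes, shift: int) -> int:
    """Count positions i where data[i] ^ data[i + shift] == 0."""
    return sum(a == b for a, b in zip(data, data[shift:]))


def probe_lengths(data: bytes, max_len: int = 16) -> dict[int, int]:
    """Zero count for every candidate password length from 1 to max_len."""
    return {n: count_matches(data, n) for n in range(1, max_len + 1)}


# Hypothetical usage on the extracted, still encrypted file:
# with open("WINDOWS.RCS", "rb") as f:
#     for length, zeroes in probe_lengths(f.read()).items():
#         print(f"{length:4}: {zeroes:7}")
```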

Next we need to go for the actual password. If we assume an 8 char password, every 8th byte in the original data is encrypted with the same char. The first char of the password encrypts the bytes 0,8,16,...; the second char encrypts 1,9,17,...; and so on. Time to run a frequency analysis on each of the 8 groups.

But what are we looking for? ARC decided to store the file in the archive instead of packing it. This suggests that WINDOWS.RCS contains some kind of binary data rather than text. The most frequent byte in a typical binary file is likely 0x00. Let's count and see what we get:

In the first group we have a maximum of 4265 hits for char 86. Now here comes the beautiful part: XOR'ing with zero changes nothing, 0x00 xor a = a. Assuming that the most frequent original byte is 0x00, the most frequent encrypted byte is the password char itself. So the value 86 immediately gives us the first char of the password: 'V'.
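The per-group frequency analysis can be sketched like this. Again, this is an illustration under the assumption made above, namely that 0x00 is the most frequent byte of the original data; recover_key and its parameters are names invented for this example.

```python
from collections import Counter


def recover_key(data: bytes, key_len: int, common_plain: int = 0x00) -> bytes:
    """Guess each password char from the most frequent byte of its group."""
    key = bytearray()
    for offset in range(key_len):
        group = data[offset::key_len]  # bytes offset, offset + key_len, ...
        most_common_byte, _hits = Counter(group).most_common(1)[0]
        # If the plaintext byte was common_plain, then cipher = plain ^ key,
        # hence key = cipher ^ common_plain.
        key.append(most_common_byte ^ common_plain)
    return bytes(key)
```

If the guess about the most frequent byte is wrong for one group, only that one password char comes out wrong, so the result is easy to sanity-check by eye.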

What do we get from the other 7 groups?

73 with 1786 hits 
68 with 4123 hits 
69 with 1901 hits 
79 with 4292 hits 
77 with 1810 hits 
79 with 4266 hits 
78 with 1874 hits 
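Putting the eight values together is a one-liner. The list below is simply the winner of the first group followed by the seven values above, read as ASCII codes:

```python
codes = [86, 73, 68, 69, 79, 77, 79, 78]  # most frequent byte of groups 1 to 8
password = bytes(codes).decode("ascii")
print(password)  # VIDEOMON
```

All eight chars are uppercase letters, which fits nicely with the assumption that ARC converts passwords to uppercase.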


There it is: the moment of truth... Firing up DOSBOX again. We got it!
