Instruction Set



For a full description of the SCI PMachine instruction set, read one of the following:


The % Use column gives a rough idea of how common each instruction is. It is based on counts from the scripts of about nine SCI0 games. The most common instruction by a long way is pushi, which accounts for about 25% of all instructions by itself. Next come push1, bnt, push0, send, push, ldi, push2, jmp and lofsa, which completes the top ten. Half of the top ten are variations of push, and together the push instructions make up nearly 50% of all instructions encountered. The top 10 instructions account for just over 70% of all instructions encountered, the top 20 for about 85%, and the top 30 for nearly 95%. Some instructions appeared only once or twice across all of those games, and some didn't appear at all.
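The totals quoted above can be checked against the rounded figures in the % Use column. The following is only a tally of those rounded table values, so the results are approximate; the names TOP_TEN and PUSH_VARIANTS are made up for this sketch.

```python
# Rounded "% Use" figures for the ten most common opcodes, copied from the table.
TOP_TEN = {
    "pushi": 25, "push1": 7, "bnt": 6, "push0": 6, "send": 6,
    "push": 5, "ldi": 5, "push2": 4, "jmp": 4, "lofsa": 3,
}

# The push variants that appear in the top ten.
PUSH_VARIANTS = ("pushi", "push1", "push0", "push", "push2")

top_ten_share = sum(TOP_TEN.values())
push_share = sum(TOP_TEN[name] for name in PUSH_VARIANTS)

print(f"top ten: {top_ten_share}%")
print(f"push variants in the top ten: {push_share}%")
```

The rarer push-like opcodes (pushSelf, pTos, the ls* loads and so on) are not included in the second figure.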

 Opcode Mnemonic Description % Use Usage Analysis
 00
 01
 bnot binary not 1% (~ var)
 02
 03
 add addition < 1% (+ var 5)

 04
 05
 sub subtraction < 1% (- var 5)
 06
 07
 mul multiplication < 1% (* var 5)
 08
 09
 div division < 1% (/ var 3)
 0A
 0B
 mod modulo  
 0C
 0D
 shr shift right logical  (>> var 2)
 0E
 0F
 shl shift left logical  (<< var 2)
 10
 11
 xor exclusive or  (^ var 0xff00)
 12
 13
 and bitwise and  
 14
 15
 or bitwise or  
 16
 17
 neg sign negation  
 18
 19
 not boolean not  (! expr)

 1A
 1B
 eq? is equal to? 2% (== var 5)
 1C
 1D
 ne? is not equal to?  (!= var 5)
 1E
 1F
 gt? is greater than?  (> var 5)
 20
 21
 ge? is greater or equal to?  (>= var 5)
 22
 23
 lt? is less than?  (< var 5)
 24
 25
 le? is less than or equal to?  (<= var 5)
 26
 27
 ugt? unsigned greater than  
 28
 29
 uge? unsigned greater or equal  
 2A
 2B
 ult? unsigned less than  
 2C
 2D
 ule? unsigned less than or equal  
 2E
 2F
 bt branch if true A bt usually implies an 'or' within a test condition. An 'or' would skip the remaining tests within the 'or' if a true were encountered.

(or (== i 0) (< n 5))

The bt instruction almost always branches forward, but there are a few instances of it branching backwards. From the examples I've seen, this is possibly due to an optimization. In one example, the bt is right before the end of a loop and would normally branch to the end of the loop. Because the end of the loop then jumps back to the top of the loop, the bt seems to have been adjusted by the compiler to branch to the top of the loop instead, i.e. a branch-to-jump optimization.
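The branch-to-jump adjustment described above can be sketched as a small pass over an instruction list. This is a guess at what such an optimization might look like, not Sierra's compiler: the function name thread_branches, the tuple layout and the LOOP listing are all hypothetical, and branch operands are absolute indices here rather than the relative offsets the real opcodes use.

```python
def thread_branches(code):
    """Retarget conditional branches that land on an unconditional jmp.

    `code` is a list of (mnemonic, operand) pairs. For bt, bnt and jmp the
    operand is an absolute instruction index (a simplification).
    """
    optimized = []
    for op, arg in code:
        lands_on_jmp = (
            op in ("bt", "bnt")
            and 0 <= arg < len(code)
            and code[arg][0] == "jmp"
        )
        if lands_on_jmp:
            arg = code[arg][1]  # go straight to where the jmp would have gone
        optimized.append((op, arg))
    return optimized


# Hypothetical loop: index 0 is the top, index 5 jumps back to it.
LOOP = [
    ("ldi", 1),     # 0: top of loop (stand-in for the loop test)
    ("bnt", 6),     # 1: leave the loop when the test fails
    ("ldi", 0),     # 2: stand-in for a test inside the loop body
    ("bt", 5),      # 3: skip the rest of the body, i.e. go to the loop end
    ("ldi", 2),     # 4: rest of the body
    ("jmp", 0),     # 5: end of loop, back to the top
    ("ret", None),  # 6
]

for index, (op, arg) in enumerate(thread_branches(LOOP)):
    print(index, op, "" if arg is None else arg)
```

After the pass, the bt at index 3 targets index 0, which is a backwards branch of the kind seen in the game scripts. The pass only follows one jmp and does not chase chains of jumps.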

 30
 31
 bnt branch if not true 6% In a lot of cases bnt implies an if in the original language. A bnt will immediately follow the test for an if condition to essentially jump over the code that is within the if, i.e. if the condition is not true then jump over the block of code.

(if (< count 5) (= count 5))

A bnt could also imply a switch, and, for or while: basically anything that has a conditional test of some kind followed by some code that needs to be skipped if the test is false. An 'and' is listed here because it skips evaluating further tests within the 'and' if a false is encountered.

 32
 33
 jmp jump 4% In a lot of cases jmp implies an 'else'. A jmp will immediately follow the block of code that is executed for an if when there is an else attached to that if. It is there so that the else block is jumped over. A jmp that is jumping forward would imply an else.

(if (< count 5) (= count 5) else (= count 10))

A jmp can also mean a loop, e.g. a for or while. Usually a jmp that is jumping back would imply a loop of some kind.
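To make the bnt and jmp roles concrete, here is a toy interpreter for a handful of opcodes, run against a hand-written listing for the if/else example above. It is a simplified model under several assumptions: count is treated as a local variable, variables are looked up by name, branch operands are absolute indices, and the listing itself is a plausible guess rather than real compiler output.

```python
def run(code, variables):
    """Interpret a tiny subset of PMachine-style instructions."""
    acc, stack, pc = 0, [], 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "lsl":      # push a local variable on to the stack
            stack.append(variables[arg])
        elif op == "ldi":    # load an immediate value into the accumulator
            acc = arg
        elif op == "lt?":    # acc = (value popped from the stack < acc)
            acc = int(stack.pop() < acc)
        elif op == "sal":    # store the accumulator in a local variable
            variables[arg] = acc
        elif op == "bnt":    # test failed: skip the 'if' block
            if not acc:
                pc = arg
        elif op == "jmp":    # 'if' block done: skip the 'else' block
            pc = arg
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return variables


# (if (< count 5) (= count 5) else (= count 10))
IF_ELSE = [
    ("lsl", "count"),  # 0
    ("ldi", 5),        # 1
    ("lt?", None),     # 2
    ("bnt", 7),        # 3: false, go to the else block
    ("ldi", 5),        # 4
    ("sal", "count"),  # 5
    ("jmp", 9),        # 6: jump over the else block
    ("ldi", 10),       # 7
    ("sal", "count"),  # 8
]

print(run(IF_ELSE, {"count": 3}))
print(run(IF_ELSE, {"count": 8}))
```

Without the else, the jmp at index 6 and the two instructions after it would disappear and the bnt would target the instruction following the if block.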

 34
 35
 ldi load data immediate 5% Opcodes such as pushi, push0, push1 and push2 mean that we don't see this fairly general purpose opcode as much as we might expect to see it.

By far the most common usage appears to be immediately prior to one of the arithmetic, bitwise or logical opcodes. This makes sense given that those operations always use the accumulator and they quite often work with immediate values.

(< x 0)
(+ state 1)
(& signal $8000)

It is also used when assigning an immediate value:

(= cycler 0)

It is also used when returning an immediate value from a method or procedure:

(return 5)

And probably other scenarios requiring an immediate value.

The main point is that it implies that the original source has an immediate value at that point (or perhaps a constant that represents the immediate value).

 36
 37
 push push to stack 5% Usually used for pushing the result of an operation or the return value of a method call or procedure call on to the stack so that it becomes the value of a parameter to a subsequent operation, send or call.

There are usually more efficient ways to push values on to the stack for an operation, send or call that isn't nested within another operation, send or call. Opcodes such as pushi, push0, push1, push2 and pTos are usually used in those scenarios.

 38
 39
 pushi push immediate 25% Used primarily for passing immediate values as parameters to procedures and methods invoked by send, self, class, super, callk, calle, callb and call.

 3A
 3B
 toss top of stack subtract < 1% Usually appears at the very end of a switch, to remove from the stack what was originally pushed to it prior to testing each of the cases. Every case block ends with a jmp to the toss, unless it's the final case block and there is no else block (in which case the block simply falls through to the toss).

Note that sometimes a case block has a "ret", which means that it never does a jmp to the toss. This presumably doesn't matter since everything added to the stack by the routine (method or procedure) is cleared out when it leaves the routine.

 3C
 3D
 dup duplicate top of stack 2% Appears at the start of each case block within a switch.

Normally what happens in the byte code for a switch is that it puts something on the stack, being the value that it is using to switch on, then at the start of each case, it uses dup to make another copy of that on the stack. This second copy will get popped off the stack as part of the comparison against the case's value. If it didn't use dup in this way, then the next case wouldn't have access to the value to compare against.

In the case of the "else" (i.e. otherwise/default) part of the switch, there is no dup since it doesn't need to compare anything. It's the default case.
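The switch layout described here and under toss can be sketched as a small listing generator. This is an illustration of the pattern only: emit_switch is a hypothetical helper, labels stand in for real branch offsets, and it assumes the value being switched on is already in the accumulator so that a plain push puts it on the stack.

```python
def emit_switch(cases, else_body=None):
    """Return a listing for a switch on the value held in the accumulator.

    `cases` is a list of (value, body) pairs, where each body is a list of
    instruction strings.
    """
    lines = ["push"]  # the switch value goes on to the stack once
    for index, (value, body) in enumerate(cases):
        is_last = index == len(cases) - 1
        # dup makes the copy that eq? pops when it compares against the case value.
        lines += ["dup", f"ldi {value}", "eq?", f"bnt next{index}"]
        lines += body
        if not (is_last and else_body is None):
            lines.append("jmp end")  # a final case with no else falls through
        lines.append(f"next{index}:")
    if else_body is not None:
        lines += else_body  # no dup here: nothing to compare against
    lines += ["end:", "toss"]  # discard the original pushed value
    return lines


listing = emit_switch([(0, ["ldi 10"]), (1, ["ldi 20"])], else_body=["ldi 30"])
print("\n".join(listing))
```

Each case contributes exactly one dup, the else block contributes none, and every path ends at the single toss.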

Note that the structure of the switch statement in SCI is known due to the Feature Writer in later games, such as SQ5 and LSL6. 


The dup instruction is also used on the odd occasion when the same value is passed to a method or procedure more than once in a row, e.g.:

(self doSomething: what what what)

...would use two dup instructions in a row, for pushing the second and third "what".

In fact in general it seems that dup might be used in many situations where exactly the same value is being pushed on to the stack more than once in a row. This could mean that it appears immediately after something like an "lsp" when the same property is being pushed again, or it could be immediately after a "pushi" when the same immediate value is being pushed, etc., etc. When used like this, it seems to be an optimization.

 3E
 3F
 link allocate local stack space ? Allocates stack space for temporary variables. It serves a similar purpose to the 68000 LINK instruction. One big difference is that temporary variables in SCI routines are automatically discarded on exiting the routine, so there isn't an UNLK instruction.

It appears at the top of a routine (i.e. either a method or a procedure) and allocates stack space for the temporary variables used within that routine. If the routine doesn't use temporary variables, then "link" will not appear.

The stack space allocated is the given size multiplied by 2.

 40
 41
 call call inside script 1% 
 42
 43
 callk call kernel function 3% 
 44
 45
 callb call base script (0) 1% 
 46
 47
 calle call external script 1% 
 48
 49
 ret return from function/method 1% Returns from a method or procedure, e.g.:

(return -1)
(return n)

 4A
 4B
 send send messages to object 6% Used for sending keyword messages to objects. There are a few examples of this in the PQ SWAT code snippet, e.g.:

(qualProd setReal: qualprod 6)
(curroom newroom: ANGELES-TABLES)
(theGame handsOn:)

 4C
 4D
 - (not used)  
 4E
 4F
 - (not used)  
 50
 51
 class get class address 1% Often used as part of sending a message to a class. It loads the accumulator with a reference to the class. If the "send" opcode follows immediately afterwards then the message is sent to the class. The following are examples where this would happen:

(Event new:) 
(Sound pause: oldPause)
(User curEvent:)

Also used in other expressions where a class is used, such as in the following example:

(= window SysWindow)

 52
 53
 - (not used)  
 54
 55
 self send to self Used to send a message to the current object. Rather than specifying an object to send to, the keyword "self" is used, as seen in the PQ SWAT code snippet:

(self dispose:)

 56
 57
 super send to any class The opcode allows for sending to any class, but given the name, it is likely that it was intended for sending messages to a superclass of the class that is making the call, e.g.:

(super init: client)

 58
 59
 &rest push rest of params to stack  

An interesting point to note on this one is that LISP has a keyword in its high level language called &rest but it appears to be for a different purpose. However, given the similarity of the SCI language to LISP in appearance, it is possible that at least the name "&rest" came from there. Given the strangeness of this instruction's mnemonic (starting with an &), it seems quite possible that the high level SCI language used this same expression to "pass the rest of the parameters" to a method or procedure call. This would be similar to "self" above, where the mnemonic matches the high level language.

 5A
 5B
 lea load effective address Used when passing a pointer to a variable to a kernel call. It relates to the @ syntax shown in the SCI template game, e.g.:

@inputStr

 5C
 5D
 selfID get self address Used for returning the current object, e.g.:

(return self)

Also used when assigning the current object to a variable or property:

(= var self)

Could potentially be used in any situation where a reference to the current object makes sense.

Usually it wouldn't be used as part of passing the value of self to a method (selfID followed by push) since pushSelf is a more efficient way of doing that. 

 5E
 5F
 - (not used)  
 60
 61
 pprev push prev register to stack Pushes the value of the prev register, set by the last comparison bytecode (eq?, lt?, ule?, etc.), on to the stack. Those comparison operators set the prev register to what the accumulator was before the comparison. This is because the comparison operator overwrites the accumulator with the result of the comparison.

One place where the pprev instruction is used is for expressions such as the following, where a comparison operator is applied across more than two values:

(== -1 top bottom left right)
(< 150 dir 210)

Given that the specs refer to the prev register only in relation to the comparison operators, it seems likely that the above types of multi value comparisons are the pprev instruction's intended use, and it also seems likely that multi-value expressions like those shown above are possible for all of the comparison operators.
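The way prev and pprev might cooperate in a chained comparison such as (< 150 dir 210) can be modelled with a few registers. This is a sketch under assumptions: the Registers class and in_range function are invented for illustration, the early exit stands in for a bnt between the two comparisons (a guess at what the compiler emits), and only the behaviour described above is modelled.

```python
class Registers:
    """Accumulator, prev register and stack, reduced to what pprev needs."""

    def __init__(self):
        self.acc = 0
        self.prev = 0
        self.stack = []

    def pushi(self, value):
        self.stack.append(value)

    def load(self, value):
        # Stands in for ldi, lal, pToa and so on: anything that sets the accumulator.
        self.acc = value

    def lt(self):
        # lt?: remember the old accumulator in prev, then compare.
        self.prev = self.acc
        self.acc = int(self.stack.pop() < self.acc)

    def pprev(self):
        self.stack.append(self.prev)


def in_range(dir_value):
    """Evaluate (< 150 dir 210) one comparison at a time."""
    r = Registers()
    r.pushi(150)
    r.load(dir_value)
    r.lt()          # acc = (150 < dir), prev = dir
    if not r.acc:   # assumed bnt: first comparison failed, skip the rest
        return False
    r.pprev()       # push dir again without reloading it
    r.load(210)
    r.lt()          # acc = (dir < 210)
    return bool(r.acc)


print(in_range(180), in_range(100), in_range(250))
```

The point of pprev is visible in the middle of in_range: the middle operand is evaluated once, and its value is recovered from prev for the second comparison.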

 62
 63
 pToa property to accumulator If a property is an object, then pToa is used to load the accumulator before a send.

Also used when a property is the final operand for an operator.

 64
 65
 aTop accumulator to property Used for assignment to a property, e.g.:

(= count 1)

where count is a property. The above would be compiled to:

ldi 1
aTop count

 66
 67
 pTos property to stack Used when passing a property value as a parameter in a method call or procedure call.

Also used when a property value is being included in an expression, such as the following:

(* distance 2)

Note that if this had been the other way around, i.e. the 2 before distance, then distance is the final operand and pToa would have been used instead.
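The operand-order rule above (earlier operands go to the stack, the final operand goes to the accumulator) can be sketched as a small emitter. The helper emit_binary is hypothetical, and the choice of pushi and ldi for immediates is an assumption based on the notes for those opcodes.

```python
def emit_binary(opcode, left, right, properties):
    """Emit a listing for (op left right).

    Operands are either property names (members of `properties`) or integer
    immediates. The first operand goes to the stack, the final operand to
    the accumulator.
    """
    def to_stack(operand):
        return f"pTos {operand}" if operand in properties else f"pushi {operand}"

    def to_acc(operand):
        return f"pToa {operand}" if operand in properties else f"ldi {operand}"

    return [to_stack(left), to_acc(right), opcode]


props = {"distance"}
print(emit_binary("mul", "distance", 2, props))  # (* distance 2)
print(emit_binary("mul", 2, "distance", props))  # (* 2 distance)
```

A real compiler would presumably emit push2 rather than pushi 2 in the second case; the sketch ignores those shorthand opcodes.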

 68
 69
 sTop stack to property May not be used at all. Have yet to find an example and can't think of a scenario where something on the stack would need to be popped in to a property. 

 6A
 6B
 ipToa inc property to accumulator This is probably to handle an expression such as the following, where "size" is a property:

(++ size)

 6C
 6D
 dpToa dec property to accumulator Likewise, this would be for an expression like the following:

(-- seconds)

 6E
 6F
 ipTos inc property to stack ~ 0% Have yet to find an example of it being used.

 70
 71
 dpTos dec property to stack ~ 0% Have yet to find an example of it being used.
 72
 73
 lofsa load offset to accumulator 3% Used to reference an instance of a class, or a text string.
 74
 75
 lofss load offset to stack ~ 0%
 76
 77
 push0 push 0 6% Equivalent to pushi 0
 78
 79
 push1 push 1 7% Equivalent to pushi 1
 7A
 7B
 push2 push 2 4% Equivalent to pushi 2
 7C
 7D
 pushSelf push self < 1% Shorthand for selfID followed by push but presumably would bypass the accumulator.

Usually used for passing the current object (i.e. self) as the value part of a message in a send, e.g. as follows (where "add" is the method selector and self is the value to pass to it):

(cast add: self)

 7E
 7F
 - (not used) 
  lag load accumulator with global variable 3% 

  lal load accumulator with local variable 0.7% 

  lat load accumulator with temp variable 0.3% 

  lap load accumulator with parameter 0.8% 

  lsg load stack with global variable 0.6% 

  lsl load stack with local variable 0.4% 

  lst load stack with temp variable 0.2% 

  lsp load stack with parameter 0.4% 

  lagi  ~ 0% Very rare but some games appear to have used it. It would imply a global variable of type array.

  lali  Used when accessing an item within a local variable of type array (by index). The accumulator has the array index in it.

  lati  Used when accessing an item within a temporary variable of type array (by index). The accumulator has the array index in it.

  lapi  Used when accessing an item within a parameter variable of type array (by index). The accumulator has the array index in it. 

  lsgi  Very rare but some games appear to have used it. It would imply a global variable of type array.

  lsli  Push an item from a local array variable on to the stack.

  lsti  Push an item from a temporary array variable on to the stack. 

  lspi  Push an item from a parameter array variable on to the stack. 
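The indexed loads above (lali, lsli and their global, temp and parameter counterparts) can be modelled as follows. This assumes that the opcode's operand is the variable number where the array starts and that the accumulator is added to it to select the item; that matches the description above but is stated here as an assumption. The Python function names simply borrow the mnemonics.

```python
def lali(local_vars, base, acc):
    """Model of lali: return the new accumulator value."""
    return local_vars[base + acc]


def lsli(local_vars, base, acc, stack):
    """Model of lsli: push the selected item instead of loading it."""
    stack.append(local_vars[base + acc])


# Hypothetical local variable block: slot 0 is a plain variable and
# slots 1..3 hold a three-item array.
local_vars = [7, 10, 20, 30]
stack = []

print(lali(local_vars, 1, 2))  # third item of the array starting at slot 1
lsli(local_vars, 1, 0, stack)  # first item of the same array
print(stack)
```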

  sag  

  sal   
  sat   
  sap   
  ssg   
  ssl   
  sst   
  ssp   
  sagi   
  sali   
  sati   
  sapi   
  ssgi   
  ssli   
  ssti   
  sspi   
  +ag   
  +al   
  +at  Increments a temporary variable. For example, if "i" were a temp variable, then the "+at" instruction would be used as part of the following:

(++ i)

  +ap   

  +sg   
  +sl   
  +st   
  +sp   
  +agi   
  +ali   
  +ati   
  +api   
  +sgi   
  +sli   
  +sti   
  +spi   
  -ag   
  -al   
  -at   
  -ap   
  -sg   
  -sl   
  -st   
  -sp   
  -agi   
  -ali   
  -ati   
  -api   
  -sgi   
  -sli   
  -sti   
  -spi   



