Previous section   Next section

22.7 Defenses

Defending against malicious logic takes advantage of several different characteristics of malicious logic to detect, or to block, its execution. The defenses inhibit the suspect behavior. The mechanisms are imprecise. They may allow malicious logic that does not exhibit the given characteristic to proceed, and they may prevent programs that are not malicious but do exhibit the given characteristic from proceeding.

22.7.1 Malicious Logic Acting as Both Data and Instructions

Some malicious logic acts as both data and instructions. A computer virus inserts code into another program. During this writing, the object being written into the file (the set of virus instructions) is data. The virus then executes itself. The instructions it executes are the same as what it has just written. Here, the object is treated as an executable set of instructions. Protection mechanisms based on this property treat all programs as type "data" until some certifying authority changes the type to "executable" (instructions). Both new systems designed to meet strong security policies and enhancements of existing systems use these methods (see Section 15.3.1).

EXAMPLE: Boebert, Young, Kain, and Hansohn [127] propose labeling of subjects and objects in the Logical Coprocessor Kernel or LOCK (formerly the Secure Ada Target or SAT) [126, 434, 881, 882], a system designed to meet the highest level of security under the U.S. Department of Defense TCSEC (see Section 21.2). Once compiled, programs have the label "data" and cannot be executed until a sequence of specific, auditable events changes the label to "executable." After that, the program cannot be modified. This scheme recognizes that viruses treat programs as data (when they infect them by changing the file's contents) and as instructions (when the program executes and spreads the virus) and rigidly separates the two.

EXAMPLE: Duff [312] has suggested a variant for UNIX-based systems. Noting that users with execute permission for a file usually also have read permission, he proposes that files with execute permission be of type "executable" and that those without it be of type "data." Unlike the LOCK, "executable" files could be modified, but doing so would change those files' types to "data." If the certifying authority were the omnipotent user, the virus could spread only if run as that user. Libraries and other system components of programs must also be certified before use to prevent infection from nonexecutable files.

Both the LOCK scheme and Duff's proposal trust that the administrators will never certify a program containing malicious logic (either by accident or deliberately) and that the tools used in the certification process are not themselves corrupt.

22.7.2 Malicious Logic Assuming the Identity of a User

Because a user (unknowingly) executes malicious logic, that code can access and affect objects within the user's protection domain. So, limiting the objects accessible to a given process run by the user is an obvious protection technique. This draws on the mechanisms for confining information (see Chapter 17, "Confinement Problem").

22.7.2.1 Information Flow Metrics

Cohen suggests an approach [206]. This approach is to limit the distance a virus can spread.

Definition 22–17. Define the flow distance metric fd(x) for some information x as follows. Initially, all information has fd(x) = 0. Whenever x is shared, fd(x) increases by 1. Whenever x is used as input to a computation, the flow distance of the output is the maximum of the flow distance of the input.

Information is accessible only while its flow distance is less than some particular value.

EXAMPLE: Anne, Bill, and Cathy work on the same computer. The system uses the flow distance metric to limit the flow of information. Anne can access information with a flow distance less than 3, and Bill and Cathy can access information with a flow distance less than 2. Anne creates a program dovirus containing a computer virus. Bill executes it. Because the contents of the program have a flow distance of 0, when the virus infects Bill's file safefile, the flow distance of the virus is 1, and so Bill can access it. Hence, the copying succeeds. Now, if Cathy executes safefile, when the virus tries to spread to her files, its flow distance increases to 2. Hence, the infection is not permitted (because Cathy can only access information with a flow distance of 0 or 1).

This example also shows the problem with the flow distance policy (which constrains sharing based on the flow distance metric). Although Cathy cannot be infected by viruses that Bill has acquired, she can be infected by viruses that Bill has written. (For example, had Cathy run Anne's dovirus program, she would have had her files infected.) The bounding constant limits the transitivity of trust. This number should therefore be low. If it is 1, only the people from whom Cathy copies files are trusted. Cathy does not trust anyone that they trust.

This mechanism raises interesting implementation issues. The metric is associated with information and not objects. Rather than tagging specific information in files, systems implementing this policy would most likely tag objects, treating the composition of different information as having the maximum flow distance of the information. This will inhibit sharing.

Ultimately, the only way to use this policy is to make the bounding constant 0. This isolates each user into his or her own protection domain and allows no sharing. Cohen points out that this defeats the main purpose of scientific or development environments, in which users build on the work of others.

22.7.2.2 Reducing the Rights

The user can reduce her associated protection domain when running a suspect program. This follows from the principle of least privilege (see Section 13.2.1). Wiseman discusses one approach [1055], and Juni and Ponto present another idea in the context of a medical database [532].

EXAMPLE: Smith [939] combines ACLs and C-Lists to achieve this end. Suppose s1 owns a file o1 and s2 owns a program o2 and a file o3. The union of discretionary ACLs is

BACL = { (s1, o1, r), (s1, o1, w), (s1, o2, x), (s1, o3, w), (s2, o2, r), (s2, o2, w), (s2, o2, x), (s2, o3, r) }

Program o2 contains a Trojan horse. If s1 wants to execute o2, he must ensure that it does not write to o1. Ideally, s1's protection domain will be reduced to { (s1, o2, x )}.

Then if p12, the process (subject) created when s1 executes o2, tries to access o3, the access will be denied. In fact, p12 inherits the access rights of s1. So, the default protection domain for p12 will be

PD(p12) = PD(s1) = { (p12, o1, r), (p12, o1, w), (p12, o2, x), (p12, o3, w) }

Now, because s1 can write to o3, so can p12. Moreover, s1 cannot constrain this behavior because s1 does not own o3 and so cannot delete its access rights over o3.

Smith's solution is to require each user si to define an authorization denial subset R(si) to contain those ACL entries that it will not allow others to exercise over the objects that si owns. In this example, if R(s2) = { (s1, o3, w) }, then

PD(p12) = PD(s1) ¬ (j R(sj)) = { (p12, o1, r), (p12, o1, w), (p12, o2, x) }

where "¬" means set complement. Now p12 cannot write to o3.

Although effective, this approach begs the question of how to determine which entries should be in the authorization denial subsets. Karger suggests basing access on the program being executed and some characteristic of the file being accessed.

EXAMPLE: Karger proposes a knowledge-based subsystem to determine if a program makes reasonable file accesses [550]. The subsystem sits between the kernel open routine and the application. The subsystem contains information about the names of the files that each program is expected to access. For example, a UNIX C compiler reads from C source files (the names of which end in ".c" and ".h") and writes to temporary files (the names of which begin with "/tmp/ctm") and assembly files (whose names end in ".s"). It executes the assembler, which reads from assembly files and writes to object files (with names ending in ".o"). The compiler then invokes the linking loader, which reads from object files and library files (whose names end in ".a") and writes to executable files (with names ending in ".out" unless the user supplies an alternative name). So, Karger's subsystem has the following associations.

Program

Reads

Writes

Executes

Compiler

*.c, *.h

*.s, /tmp/ctm*

Assembler, loader

Assembler

*.s

*.o

 

(Linking) loader

*.o, *.a

*.out

 

(The "*" means zero or more characters.)

When the subsystem is invoked, it checks that the access is allowed. If not, it either denies the access or asks the user whether to permit the access.

A related approach is to base access to files on some characteristic of the command or program [206], possibly including subject authorizations as well [204].

EXAMPLE: Lai and Gray [603] have implemented a modified version of Karger's scheme on a UNIX system. Unlike Karger, they combine knowledge about each command with the command-line arguments of the current invocation. Their idea is to use this information to determine the user's intent to access files and the type of access. They do not protect these files, but instead prevent other files not named on the command line from being accessed (with two exceptions).

Processes are divided into two groups. File accesses by trusted processes are not checked. Associated with each untrusted process is a valid access list (VAL) consisting of the arguments of the process plus any temporary files created. When an untrusted process tries to access a file, the kernel executes the following sequence of steps.

  1. If the process is requesting access to a file on the VAL, the access is allowed if the effective UID and GID of the process allow the access.

  2. If the process is opening the file for reading and the file is world-readable, the open is allowed.

  3. If the process is creating a file, the creation is allowed if the effective UID and GID of the process allow the creation. The file is entered into the VAL of the process and is marked as a new nonargument (NNA) file. The file's protection modes are set so that no other user may access the file.

  4. Otherwise, an entry in the system log reflects the request, and the user is asked if the access is to be allowed. If the user agrees, the access is allowed if the effective UID and GID of the process allow it. Otherwise, the access is denied.

VALs are created whenever a trusted process spawns an untrusted process, and are inherited.

Files marked NNA have permissions such that only the creating user can access them. They are in the VAL of the creating process, and no others, so only that process and its descendents can access the NNA file. However, neither the creating process nor its descendants may change the protection modes of that file. When the file is deleted, its entry is removed from the VAL. When the process terminates, the user is notified of any existing NNA files.

The trusted processes in a 4.3BSD UNIX environment are UNIX command interpreters (csh and sh), the programs that spawn them on login (getty and login), programs that access the file system recursively (ar, chgrp, chown, diff, du, dump, find, ls, rcp, restore, and tar), programs that often access files not in their argument lists (binmail, cpp, dbx, mail, make, script, and vi), and various network daemons (fingerd, ftpd, ntalkd, rlogind, rshd, sendmail, talkd, telnetd, tftpd, and uucpd). Furthermore, a program called trust enables root to spawn trusted processes other than those listed above.

As an example, consider the assembler when invoked from the cc program. The assembler is called as

as x.s /tmp/cc2345

and the assembler creates the file /tmp/as1111 during the assembly. The VAL is

x.s /tmp/cc2345 /tmp/as1111

with the first file being read-only and the next two being readable and writable (the first because cc created it and the second because as created it). In cc's VAL, the temporary file /tmp/cc2345 is marked NNA; in as's VAL, it is not (because it is a command-line argument to as). The loader is invoked as

ld /lib/crt0.o /tmp/cc2345 -lc -o x

The loader's VAL is

/lib/crt0.o /tmp/cc2345 /lib/libc.a x

The first three files are read-only and the last file is readable and writable.

Now, suppose a Trojan horse assembler is to copy the program to another user's area. When it attempts to create the target file, rule 3 forces the target to be readable only by the originator. Hence, the attacker cannot read the newly created file. If the attacker creates the file with privileges to allow him to read it, the victim is asked if write access to the file should be allowed. This alerts the user to the presence of the Trojan horse.

An alternative mechanism is interception of requests to open files. The "watchdog" or "guardian" then performs a check to determine if the access is to be allowed. This effectively redefines the system calls involved. The issues of determining how to write watchdogs to meet the desired goals and allowing users to specify semantics for file accesses [88, 259] may prove useful in some contexts—for example, in protecting a limited set of files.

All such mechanisms (1) trust the users to take explicit actions to limit their protection domains sufficiently, (2) trust tables to describe the programs' expected actions sufficiently for the mechanisms to apply those descriptions and to handle commands with no corresponding table entries effectively, or (3) trust specific programs and the kernel when they would be the first programs malicious logic would attack.

22.7.2.3 Sandboxing

Sandboxes and virtual machines (see Section 17.2) implicitly restrict process rights. A common implementation of this approach is to restrict the program by modifying it. Usually, special instructions inserted into the object code cause traps whenever an instruction violates the security policy. If the executable dynamically loads libraries, special libraries with the desired restrictions replace the standard libraries.

EXAMPLE: Bishop and Dilger [117] propose a modification to UNIX system calls to detect race conditions in file accesses. A race condition occurs when successive system calls operate on an object identified by name, and the name can be rebounded to a different object between the first and second system calls. The augmentation involved would record the inode number (unique identifier) of the object identified in the first system call. When the object named in the second system call differed from the object named in the first system call, the mechanism would take appropriate action.

22.7.3 Malicious Logic Crossing Protection Domain Boundaries by Sharing

Inhibiting users in different protection domains from sharing programs or data will inhibit malicious logic from spreading among those domains. This takes advantage of the separation implicit in integrity policies (see Chapter 6).

EXAMPLE: When users share procedures, the LOCK system (see Section 22.7.1) keeps only one copy of the procedure in memory. A master directory, accessible only to a trusted hardware controller, associates with each procedure a unique owner and with each user a list of others whom that user trusts. Before executing any procedure, the dynamic linker checks that the user executing the procedure trusts the procedure's owner [125]. This scheme assumes that users' trust in one another is always well-placed.

A more general proposal [1066] suggests that programs to be protected be placed at the lowest possible level of an implementation of a multilevel security policy. Because the mandatory access controls will prevent those processes from writing to objects at lower levels, any process can read the programs but no process can write to them. Such a scheme would have to be combined with an integrity model to provide protection against viruses to prevent both disclosure and file corruption.

EXAMPLE: The Data General model (see Figure 5-3, on page 129) places the executables below the user region in the hierarchy of layers. The site-specific executables are highest, followed by the trusted data, and the Data General executables are at the lowest level. This prevents alteration of the Data General executables and trusted data by site executables and alteration of all executables and trusted data by user applications.

Carrying this idea to its extreme would result in isolation of each domain. Because sharing would not be possible, no viruses could propagate. Unfortunately, the usefulness of such systems would be minimal.

22.7.4 Malicious Logic Altering Files

Mechanisms using manipulation detection codes (or MDCs) apply some function to a file to obtain a set of bits called the signature block and then protect that block. If, after recomputing the signature block, the result differs from the stored signature block, the file has changed, possibly as a result of malicious logic altering the file. This mechanism relies on selection of good cryptographic checksums (see Section 9.4).

EXAMPLE: Tripwire [568, 569] is an integrity checker that targets the UNIX environment. This program computes a signature block for each file and stores it in a database. The signature of each file consists of file attributes (such as size, owner, protection mode, and inode number) and various cryptographic checksums (such as MD-4, MD-5, HAVAL, SHS, and various CRCs). The system administrator selects the components that make up the signature.

When Tripwire is executed, it recomputes each signature block and compares the recomputed blocks with those in the file. If any of them differ, the change is reported as indicating a possibly corrupted file.

An assumption is that the signed file does not contain malicious logic before it is signed. Page [793] has suggested expansion of Boebert and Kain's model [126] to include the software development process (in effect, limiting execution domains for each development tool and user) to ensure that software is not contaminated during development.

EXAMPLE: Pozzo and Grey [817, 818] have implemented Biba's integrity model on the distributed operating system LOCUS [811] to make the level of trust in the above-mentioned assumption explicit. They have different classes of signed executable programs. Credibility ratings (Biba's "integrity levels") assign a measure of trustworthiness on a scale of 0 (unsigned) to N (signed and formally verified), based on the origin of the software. Trusted file systems contain only signed executable files with the same credibility level. Associated with each user (subject) is a risk level that starts out as the highest credibility level. Users may execute programs with credibility levels no less than their risk levels.When the credibility level is lower than the risk level, a special "run-untrusted" command must be used.

All integrity-based schemes rely on software that if infected may fail to report tampering. Performance will be affected because encrypting the file or computing the signature block may take a significant amount of time. The encrypting key must also be secret because if it is not, then malicious logic can easily alter a signed file without the change being detected.

Antivirus scanners check files for specific viruses and, if a virus is present, either warn the user or attempt to "cure" the infection by removing the virus. Many such agents exist for personal computers, but because each agent must look for a particular virus or set of viruses, they are very specific tools and, because of the undecidability results stated earlier, cannot deal with viruses not yet analyzed.

22.7.5 Malicious Logic Performing Actions Beyond Specification

Fault-tolerant techniques keep systems functioning correctly when the software or hardware fails to perform to specifications. Joseph and Avizienis have suggested treating the infection and execution phases of a virus as errors. The first such proposal [529, 530] breaks programs into sequences of nonbranching instructions and checksums each sequence, storing the results in encrypted form. When the program is run, the processor recomputes checksums, and at each branch a coprocessor compares the computed checksum with the encrypted checksum; if they differ, an error (which may be an infection) has occurred. Later proposals advocate checking of each instruction [260]. These schemes raise issues of key management and protection as well as the degree to which the software managing keys, which transmit the control flow graph to the coprocessor and implement the recovery mechanism, can be trusted.

A proposal based on N-version programming [48] requires implementation of several different versions of an algorithm, running them concurrently and periodically checking their intermediate results against each other. If they disagree, the value assumed to be correct is the intermediate value that a majority of the programs have obtained, and the programs with different values are malfunctioning (possibly owing to malicious logic). This requires that a majority of the programs are not infected and that the underlying operating system is secure. Also, Knight and Leveson [574] question the efficacy of N-version programming. Detecting the spread of a virus would require voting on each file system access. To achieve this level of comparison, the programs would all have to implement the same algorithm, which would defeat the purpose of using N-version programming [575].

22.7.5.1 Proof-Carrying Code

Necula has proposed a technique that combines specification and integrity checking [762]. His method, called proof-carrying code (PCC), requires a "code consumer" (user) to specify a safety requirement. The "code producer" (author) generates a proof that the code meets the desired safety property and integrates that proof with the executable code. This produces a PCC binary. The binary is delivered (through the network or other means) to the consumer. The consumer then validates the safety proof and, if it is correct, can execute the code knowing that it honors that policy. The key idea is that the proof consists of elements drawn from the native code. If the native code is changed in a way that violates the safety policy, the proof is invalidated and will be rejected.

EXAMPLE: Necula and Lee [763] tested their method on UNIX-based network packet filters as supported by the Berkeley Packet Filter (BPF) [669, 724]. These filters were written in an interpreted language. The kernel performed the interpretations and prevented the filter from looping and from writing to any location except the packet's data or a small scratch memory. The filters were rewritten in assembly language and augmented with proofs that showed that they met the safety policy that the kernel enforced. The proofs ranged from 300 to 900 bytes, and the validation times ranged from 0.3 to 1.3 ms. As expected, the start-up cost was higher (because the proofs had to be validated before the filters were run), but the runtimes were considerably shorter. In their experiments, in which 1,000 packets were received per second (on the average), the total cost of using the BPF exceeded the PCC after 1,200 packets. The method also compared favorably with implementations using a restrictive subset of Modula-3 (after 10,500 packets) [89, 496] and software fault isolation (after 28,000 packets; see Section 17.2.2).

22.7.6 Malicious Logic Altering Statistical Characteristics

Like human languages, programs have specific statistical characteristics that malicious logic might alter. Detection of such changes may lead to detection of malicious logic.

EXAMPLE: Malicious logic might be present if a program appears to have more programmers than were known to have worked on it or if one particular programmer appears to have worked on many different and unrelated programs [1066]. Programmers have their own individual styles of writing programs. At the source code level, features such as language, formatting, and comment styles can distinguish coding styles. However, adherence to organizational coding standards obscures these features [598]. At the object code level, features such as choice of data structures and algorithms may distinguish programmers [957].

Comparison of object and source may reveal that the object file contains conditionals not corresponding to any in the source. In this case, the object may be infected [385]. Similar proposals suggest examination of the appearance of programs for identical sequences of instructions or byte patterns [516, 1066]. The disadvantage of such comparisons is that they require large numbers of comparisons and need to take into account the reuse of common library routines or of code [564].

Another proposal suggests that a filter be designed to detect, analyze, and classify all modifications that a program makes as ordinary or suspicious [247]. Along the same lines, Dorothy Denning suggests the use of an intrusion-detection expert system[6] to detect viruses by looking for increases in file size, increases in the frequency of writing to executable files, or alterations in the frequency of execution of a specific program in ways that do not match the profiles of users who are spreading the infection [270].

[6] Chapter 25, "Intrusion Detection," discusses this system in more detail.

22.7.7 The Notion of Trust

The effectiveness of any security mechanism depends on the security of the underlying base on which the mechanism is implemented and the correctness of the implementation. If the trust in the base or in the implementation is misplaced, the mechanism will not be secure. Thus, "secure," like "trust," is a relative notion, and the design of any mechanism for enhancing computer security must attempt to balance the cost of the mechanism against the level of security desired and the degree of trust in the base that the site accepts as reasonable. Research dealing with malicious logic assumes that the interface, software, and/or hardware used to implement the proposed scheme will perform exactly as desired, meaning that the trust is in the underlying computing base, the implementation, and (if done) the verification.


  Previous section   Next section
Top